---
license: gpl-3.0
datasets:
- nkp37/OpenVid-1M
- TempoFunk/webvid-10M
base_model:
- VideoCrafter/VideoCrafter2
pipeline_tag: text-to-video
---
# Advanced text-to-video Diffusion Models
⚡️ This repository provides training recipes for AMD's efficient text-to-video models, which are designed for high performance and efficiency. The training process has two key steps:
* Distillation and Pruning: We distill and prune the popular text-to-video model [VideoCrafter2](https://github.com/AILab-CVC/VideoCrafter), reducing the parameter count to a compact 945M while maintaining competitive performance.
* Optimization with T2V-Turbo: We apply the [T2V-Turbo](https://github.com/Ji4chenLi/t2v-turbo) method to the distilled model to reduce the number of inference steps and further enhance model quality.
This implementation, optimized for AMD Instinct accelerators, is released to promote further research and innovation in efficient text-to-video generation.
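
The distillation step above can be pictured with a toy example. The sketch below is only an illustration of a generic output-matching distillation objective, not the loss used in this repository: the function names `mse` and `distillation_loss`, the weighting parameter `alpha`, and the choice of mean squared error are all assumptions made for the example. It uses plain Python lists in place of model outputs so it runs without any dependencies.

```python
from typing import Sequence


def mse(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean squared error between two equal-length, non-empty vectors."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def distillation_loss(
    student_out: Sequence[float],
    teacher_out: Sequence[float],
    target: Sequence[float],
    alpha: float = 0.5,
) -> float:
    """Hypothetical blended objective (an assumption, not the repo's loss).

    alpha weights how strongly the pruned student imitates the teacher;
    (1 - alpha) weights the ordinary training target.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    imitation = mse(student_out, teacher_out)
    supervised = mse(student_out, target)
    return alpha * imitation + (1.0 - alpha) * supervised


if __name__ == "__main__":
    # Toy vectors standing in for the outputs of a student and a teacher model.
    student = [1.0, 2.0]
    teacher = [1.0, 4.0]
    target = [3.0, 2.0]
    print(distillation_loss(student, teacher, target, alpha=0.5))
```

Setting `alpha` to 1.0 in this sketch makes the student purely imitate the teacher, while 0.0 reduces it to ordinary supervised training. For the actual recipe, refer to the training code in the GitHub repository.
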
You can download the code from our [GitHub Repo](https://github.com/AMD-AIG-AIMA/AMD-Hummingbird-T2V).
<img src="GIFs/vbench.png" alt="Vbench performance" title="Vbench performance" class="vbench-img">
**8-Step Results**
<style>
table {
width: auto;
border-collapse: collapse;
}
th, td {
border: 1px solid #ddd;
text-align: center;
padding: 0px;
vertical-align: middle;
width: 256px; /* fixed width for each column */
}
tr.text-row {
height: 30px; /* text row height */
}
tr.image-row {
height: 160px; /* image row height */
}
/* default size for images in the table */
img {
width: 256px;
height: 160px;
object-fit: cover;
}
/* applies only to vbench.png */
.vbench-img {
width: 785px !important;
height: 698px !important;
object-fit: contain; /* show the whole image without cropping */
}
</style>
<table>
<tr class="text-row">
<th>A cute happy Corgi playing in park, sunset, pixel.</th>
<th>A cute happy Corgi playing in park, sunset, animated style.</th>
<th>A cute raccoon playing guitar in the beach.</th>
<th>A cute raccoon playing guitar in the forest.</th>
</tr>
<tr class="image-row">
<td><img src="GIFs/A_cute_happy_Corgi_playing_in_park,_sunset,_pixel_.gif"></td>
<td><img src="GIFs/A cute happy Corgi playing in park, sunset, animated style.gif"></td>
<td><img src="GIFs/A cute raccoon playing guitar in the beach.gif"></td>
<td><img src="GIFs/A cute raccoon playing guitar in the forest.gif"></td>
</tr>
<tr class="text-row">
<th>A quiet beach at dawn and the waves gently lapping.</th>
<th>A cute teddy bear, dressed in a red silk outfit, stands in a vibrant street, Chinese New Year.</th>
<th>A sandcastle being eroded by the incoming tide.</th>
<th>An astronaut flying in space, in cyberpunk style.</th>
</tr>
<tr class="image-row">
<td><img src="GIFs/A_quiet_beach_at_dawn_and_the_waves_gently_lapping.gif"></td>
<td><img src="GIFs/A cute teddy bear, dressed in a red silk outfit, stands in a vibrant street, chinese new year..gif"></td>
<td><img src="GIFs/A sandcastle being eroded by the incoming tide.gif"></td>
<td><img src="GIFs/An astronaut flying in space, in cyberpunk style.gif"></td>
</tr>
<tr class="text-row">
<th>A cat DJ at a party.</th>
<th>A 3D model of a 1800s victorian house.</th>
<th>A drone flying over a snowy forest.</th>
<th>A ghost ship navigating through a sea under a moon.</th>
</tr>
<tr class="image-row">
<td><img src="GIFs/A_cat_DJ_at_a_party.gif"></td>
<td><img src="GIFs/A 3D model of a 1800s victorian house..gif"></td>
<td><img src="GIFs/a_drone_flying_over_a_snowy_forest.gif"></td>
<td><img src="GIFs/A_ghost_ship_navigating_through_a_sea_under_a_moon.gif"></td>
</tr>
</table>
# License
Copyright (c) 2024 Advanced Micro Devices, Inc. All Rights Reserved.