|
--- |
|
license: gpl-3.0 |
|
datasets: |
|
- nkp37/OpenVid-1M |
|
- TempoFunk/webvid-10M |
|
base_model: |
|
- VideoCrafter/VideoCrafter2 |
|
pipeline_tag: text-to-video |
|
--- |
|
# Advanced Text-to-Video Diffusion Models |
|
|
|
|
|
⚡️ This repository provides training recipes for AMD's efficient text-to-video models, which are designed for high performance and efficiency. Training consists of two key steps: |
|
|
|
* Distillation and Pruning: We distill and prune the popular text-to-video model [VideoCrafter2](https://github.com/AILab-CVC/VideoCrafter), reducing the parameter count to a compact 945M while maintaining competitive performance. |
|
|
|
* Optimization with T2V-Turbo: We apply the [T2V-Turbo](https://github.com/Ji4chenLi/t2v-turbo) method to the distilled model to reduce the number of inference steps and further enhance model quality. |
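
As a rough illustration of the distillation idea in the first step, the sketch below computes a generic output-matching loss: the smaller student model is penalized for deviating from the teacher's prediction on the same input. This is an assumption-laden toy, not the training code from this repository; the function name `distillation_loss`, the use of mean squared error, and the sample values are all hypothetical and chosen only to show the shape of the objective.

```python
# Generic output-matching distillation sketch.
# Illustrative only: the loss choice (MSE) and all names here are assumptions,
# not the recipe used to train the models in this repository.
from typing import Sequence


def distillation_loss(student_out: Sequence[float], teacher_out: Sequence[float]) -> float:
    """Mean squared error between student and teacher predictions."""
    if len(student_out) != len(teacher_out):
        raise ValueError("student and teacher outputs must have the same length")
    squared_errors = ((s - t) ** 2 for s, t in zip(student_out, teacher_out))
    return sum(squared_errors) / len(student_out)


# Hypothetical predictions from the teacher and the pruned student on one input.
teacher_noise = [0.5, -1.0, 0.25, 2.0]
student_noise = [0.0, -1.0, 0.75, 1.0]

loss = distillation_loss(student_noise, teacher_noise)
print(f"distillation loss: {loss}")
```

In a real training loop the two sequences would be tensors produced by the teacher and student networks for the same noisy latent, timestep, and text prompt, and the loss would be backpropagated through the student only.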
|
|
|
This implementation, optimized for AMD Instinct accelerators, is released to promote further research and innovation in efficient text-to-video generation. |
|
|
|
You can download the code from our [GitHub Repo](https://github.com/AMD-AIG-AIMA/AMD-Hummingbird-T2V). |
|
|
|
<img src="GIFs/vbench.png" alt="VBench performance" title="VBench performance" class="vbench-img"> |
|
|
|
|
|
**8-Step Results** |
|
<style> |
|
table { |
|
width: auto; |
|
border-collapse: collapse; |
|
} |
|
th, td { |
|
border: 1px solid #ddd; |
|
text-align: center; |
|
padding: 0px; |
|
vertical-align: middle; |
|
width: 256px; /* fixed width for each column */ |
|
} |
|
tr.text-row { |
|
height: 30px; /* height of text rows */ |
|
} |
|
tr.image-row { |
|
height: 160px; /* height of image rows */ |
|
} |
|
/* default size for images in the table */ |
|
img { |
|
width: 256px; |
|
height: 160px; |
|
object-fit: cover; |
|
} |
|
/* applies only to vbench.png */ |
|
.vbench-img { |
|
width: 785px !important; |
|
height: 698px !important; |
|
object-fit: contain; /* show the whole image without cropping */ |
|
} |
|
</style> |
|
|
|
|
|
<table> |
|
<tr class="text-row"> |
|
<th>A cute happy Corgi playing in park, sunset, pixel.</th> |
|
<th>A cute happy Corgi playing in park, sunset, animated style.</th> |
|
<th>A cute raccoon playing guitar in the beach.</th> |
|
<th>A cute raccoon playing guitar in the forest.</th> |
|
</tr> |
|
<tr class="image-row"> |
|
<td><img src="GIFs/A_cute_happy_Corgi_playing_in_park,_sunset,_pixel_.gif"></td> |
|
<td><img src="GIFs/A cute happy Corgi playing in park, sunset, animated style.gif"></td> |
|
<td><img src="GIFs/A cute raccoon playing guitar in the beach.gif"></td> |
|
<td><img src="GIFs/A cute raccoon playing guitar in the forest.gif"></td> |
|
</tr> |
|
<tr class="text-row"> |
|
<th>A quiet beach at dawn and the waves gently lapping.</th> |
|
<th>A cute teddy bear, dressed in a red silk outfit, stands in a vibrant street, Chinese New Year.</th> |
|
<th>A sandcastle being eroded by the incoming tide.</th> |
|
<th>An astronaut flying in space, in cyberpunk style.</th> |
|
</tr> |
|
<tr class="image-row"> |
|
<td><img src="GIFs/A_quiet_beach_at_dawn_and_the_waves_gently_lapping.gif"></td> |
|
<td><img src="GIFs/A cute teddy bear, dressed in a red silk outfit, stands in a vibrant street, chinese new year..gif"></td> |
|
<td><img src="GIFs/A sandcastle being eroded by the incoming tide.gif"></td> |
|
<td><img src="GIFs/An astronaut flying in space, in cyberpunk style.gif"></td> |
|
</tr> |
|
<tr class="text-row"> |
|
<th>A cat DJ at a party.</th> |
|
<th>A 3D model of a 1800s victorian house.</th> |
|
<th>A drone flying over a snowy forest.</th> |
|
<th>A ghost ship navigating through a sea under a moon.</th> |
|
</tr> |
|
<tr class="image-row"> |
|
<td><img src="GIFs/A_cat_DJ_at_a_party.gif"></td> |
|
<td><img src="GIFs/A 3D model of a 1800s victorian house..gif"></td> |
|
<td><img src="GIFs/a_drone_flying_over_a_snowy_forest.gif"></td> |
|
<td><img src="GIFs/A_ghost_ship_navigating_through_a_sea_under_a_moon.gif"></td> |
|
</tr> |
|
</table> |
|
|
|
|
|
|
|
|
|
|
|
|
|
# License |
|
Copyright (c) 2024 Advanced Micro Devices, Inc. All Rights Reserved. |