---
pipeline_tag: text-to-video
license: other
license_name: tencent-hunyuan-community
license_link: LICENSE
---
# FastHunyuan Model Card
## Model Details
FastHunyuan is an accelerated [HunyuanVideo](https://huggingface.co/tencent/HunyuanVideo) model. It can sample high-quality videos in 6 diffusion steps, which is roughly an 8x speedup over the original HunyuanVideo at 50 steps.
- **Developed by**: [Hao AI Lab](https://hao-ai-lab.github.io/)
- **License**: tencent-hunyuan-community
- **Distilled from**: [HunyuanVideo](https://huggingface.co/tencent/HunyuanVideo)
- **GitHub Repository**: https://github.com/hao-ai-lab/FastVideo
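
The roughly 8x figure follows from the ratio of step counts. The sketch below shows that arithmetic. It assumes every diffusion step costs about the same and ignores fixed costs such as text encoding and VAE decoding, so measured wall-clock speedup may differ. The helper name `step_speedup` is hypothetical.

```python
def step_speedup(baseline_steps: int, distilled_steps: int) -> float:
    """Ratio of denoising steps, used as a rough proxy for sampling speedup."""
    if baseline_steps <= 0 or distilled_steps <= 0:
        raise ValueError("step counts must be positive")
    return baseline_steps / distilled_steps


# Step counts quoted in this card: 50 for HunyuanVideo, 6 for FastHunyuan.
speedup = step_speedup(50, 6)
print(f"{speedup:.2f}x fewer denoising steps")  # prints "8.33x fewer denoising steps"
```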
## Usage
- Clone the [FastVideo](https://github.com/hao-ai-lab/FastVideo) repository and follow the inference instructions in its README.
- Alternatively, run inference with FastHunyuan using the official [HunyuanVideo repository](https://github.com/Tencent/HunyuanVideo) by **setting the shift to 17, the number of steps to 6, the resolution to 720x1280 with 125 frames, and the CFG scale above 6**.
We find that a large CFG scale generally leads to faster videos.
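
The recommended settings can be collected in one place and checked before launching a run. This is a minimal sketch, not the HunyuanVideo or FastVideo API: the class `FastHunyuanSettings`, its field names, and `check_settings` are hypothetical, and the default CFG value of 6.5 is only an illustrative choice, since this card says just "bigger than 6".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FastHunyuanSettings:
    """Hypothetical container for the settings recommended in this card."""
    flow_shift: float = 17.0
    num_steps: int = 6
    resolution: tuple = (720, 1280)
    num_frames: int = 125
    cfg_scale: float = 6.5  # illustrative value; the card only requires > 6


def check_settings(s: FastHunyuanSettings) -> list:
    """Return a list of human-readable deviations from the recommendations."""
    problems = []
    if s.flow_shift != 17:
        problems.append(f"shift is {s.flow_shift}, recommended 17")
    if s.num_steps != 6:
        problems.append(f"steps is {s.num_steps}, recommended 6")
    if tuple(s.resolution) != (720, 1280):
        problems.append(f"resolution is {s.resolution}, recommended 720x1280")
    if s.num_frames != 125:
        problems.append(f"frames is {s.num_frames}, recommended 125")
    if s.cfg_scale <= 6:
        problems.append(f"cfg is {s.cfg_scale}, recommended above 6")
    return problems


print(check_settings(FastHunyuanSettings()))              # prints []
print(check_settings(FastHunyuanSettings(num_steps=50)))  # one deviation reported
```

Map each field to whatever option the inference script you use actually exposes; the option names differ between repositories.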
## Training details
FastHunyuan is consistency distilled on the [MixKit](https://huggingface.co/datasets/LanguageBind/Open-Sora-Plan-v1.1.0/tree/main) dataset with the following hyperparameters:
- Batch size: 16
- Resolution: 720x1280
- Number of frames: 125
- Train steps: 320
- GPUs: 32
- LR: 1e-6
- Loss: Huber
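
For readers unfamiliar with the Huber loss listed above, here is a minimal pure-Python sketch of its standard definition: quadratic for small residuals and linear for large ones. The threshold `delta` is an assumption, since this card does not state the value used in training, and the actual training code may use a different variant.

```python
def huber_loss(prediction, target, delta=1.0):
    """Mean Huber loss over two equal-length sequences of floats.

    delta is an assumed threshold; the model card does not specify it.
    """
    if len(prediction) != len(target) or len(prediction) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for p, t in zip(prediction, target):
        residual = p - t
        magnitude = abs(residual)
        if magnitude <= delta:
            # Quadratic region: behaves like half the squared error.
            total += 0.5 * residual * residual
        else:
            # Linear region: grows like absolute error, limiting outlier impact.
            total += delta * (magnitude - 0.5 * delta)
    return total / len(prediction)


# Residuals are -0.5 (quadratic: 0.125) and 2.0 (linear: 1.5), so the mean is 0.8125.
print(huber_loss([0.0, 2.0], [0.5, 0.0]))  # prints 0.8125
```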
## Evaluation
We provide qualitative comparisons between FastHunyuan with 6-step inference and the original HunyuanVideo with 6-step inference:
| FastHunyuan 6 step | Hunyuan 6 step |
| --- | --- |
| (image: FastHunyuan 6 step) | (image: Hunyuan 6 step) |
| (image: FastHunyuan 6 step) | (image: Hunyuan 6 step) |
| (image: FastHunyuan 6 step) | (image: Hunyuan 6 step) |
| (image: FastHunyuan 6 step) | (image: Hunyuan 6 step) |
## Memory requirements
Please check our [GitHub repository](https://github.com/hao-ai-lab/FastVideo) for details.
For inference, FastHunyuan supports NF4 and LLM-INT8 quantization using BitsAndBytes. With NF4 quantization, inference runs on a single RTX 4090 GPU and requires just 20 GB of VRAM.
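
To see why quantization matters on a 24 GB card, the sketch below estimates the memory taken by the transformer weights alone at different precisions. The 13-billion parameter count is an assumption taken from the public HunyuanVideo release, not from this card. The estimate ignores quantization constants, the text encoders, the VAE, and activations, which account for the gap between these numbers and the 20 GB figure above.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


# Assumption: roughly 13B transformer parameters, as reported for HunyuanVideo.
ASSUMED_PARAMS = 13e9

for name, bits in [("bf16", 16), ("LLM-INT8", 8), ("NF4", 4)]:
    print(f"{name}: {weight_memory_gb(ASSUMED_PARAMS, bits):.1f} GB")
# prints:
# bf16: 26.0 GB
# LLM-INT8: 13.0 GB
# NF4: 6.5 GB
```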
For LoRA fine-tuning, the minimum hardware requirements are:
- 40 GB of GPU memory on each of 2 GPUs with LoRA
- 30 GB of GPU memory on each of 2 GPUs with CPU offload and LoRA