Update README.md
README.md CHANGED
@@ -34,7 +34,8 @@ This model is jointly finetuned with [DMD](https://arxiv.org/pdf/2405.14867) and
 - 3-step inference is supported and achieves up to **20 FPS** on a single **H100** GPU.
 - Our model is trained at **61×448×832** resolution, but it supports generating videos at any resolution (quality may degrade).
 - Finetuning and inference scripts are available in the [FastVideo](https://github.com/hao-ai-lab/FastVideo) repository:
-- [
+- [1 Node/GPU debugging finetuning script](https://github.com/hao-ai-lab/FastVideo/blob/main/scripts/distill/v1_distill_dmd_wan_VSA.sh)
+- [Slurm training example script](https://github.com/hao-ai-lab/FastVideo/blob/main/examples/distill/Wan-Syn-480P/distill_dmd_VSA_t2v.slurm)
 - [Inference script](https://github.com/hao-ai-lab/FastVideo/blob/main/scripts/inference/v1_inference_wan_dmd.sh)
 - Try it out on **FastVideo** — we support a wide range of GPUs from **H100** to **4090**, and also support **Mac** users!

@@ -43,9 +44,6 @@ This model is jointly finetuned with [DMD](https://arxiv.org/pdf/2405.14867) and
 Training was conducted on **4 nodes with 32 H200 GPUs** in total, using a `global batch size = 64`.
 We enable `gradient checkpointing`, set `gradient_accumulation_steps=2`, and use `learning rate = 1e-5`.
 We set **VSA attention sparsity** to 0.8, and training runs for **4000 steps (~12 hours)**.
-The detailed **training example script** is available [here](https://github.com/hao-ai-lab/FastVideo/blob/main/examples/distill/Wan-Syn-480P/distill_dmd_VSA_t2v.slurm).
-
-

 If you use the FastWan2.1-T2V-1.3B-Diffusers model for your research, please cite our paper:
 ```
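Editor's note on the 3-step inference claim above: since this is a Diffusers-format checkpoint, the inference path can also be sketched through the standard `diffusers` `WanPipeline` API. The snippet below is a minimal illustration, not code from the repository: the Hub id `FastVideo/FastWan2.1-T2V-1.3B-Diffusers`, the prompt, and the `guidance_scale` value are assumptions inferred from the README (61×448×832, 3-step DMD inference); the linked inference script remains the authoritative recipe.

```python
# Hypothetical sketch: 3-step inference via diffusers' WanPipeline.
# Model id and call arguments are assumptions inferred from the README
# (61x448x832 training resolution, 3-step DMD inference), not verified.
import torch
from diffusers import WanPipeline
from diffusers.utils import export_to_video

pipe = WanPipeline.from_pretrained(
    "FastVideo/FastWan2.1-T2V-1.3B-Diffusers",  # assumed Hub id
    torch_dtype=torch.bfloat16,
)
pipe.to("cuda")

frames = pipe(
    prompt="A curious raccoon exploring a neon-lit city at night",
    height=448,
    width=832,
    num_frames=61,           # matches the stated 61x448x832 training shape
    num_inference_steps=3,   # the README's 3-step distilled inference
    guidance_scale=0.0,      # assumption: distilled models often skip CFG
).frames[0]

export_to_video(frames, "fastwan_demo.mp4", fps=16)
```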
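As a sanity check on the training configuration in the diff, the stated `global batch size = 64` is consistent with 32 GPUs and `gradient_accumulation_steps=2` if each GPU processes one sample per micro-step. The per-GPU micro-batch of 1 is our reading, not stated in the README:

```python
# Worked arithmetic for the training setup described above.
# Only the totals (32 GPUs, accumulation=2, global batch 64) are stated;
# the per-GPU micro-batch of 1 is an assumption.
num_nodes = 4
gpus_per_node = 8                  # 32 H200s total across 4 nodes
num_gpus = num_nodes * gpus_per_node
grad_accumulation_steps = 2
per_gpu_micro_batch = 1            # assumed, not stated in the README

global_batch = num_gpus * grad_accumulation_steps * per_gpu_micro_batch
assert global_batch == 64          # matches `global batch size = 64`
```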