Update README.md
README.md
@@ -43,7 +43,7 @@ This model is jointly finetuned with [DMD](https://arxiv.org/pdf/2405.14867) and
 Training was conducted on **4 nodes with 32 H200 GPUs** in total, using a `global batch size = 64`.
 We enable `gradient checkpointing`, set `gradient_accumulation_steps=2`, and use `learning rate = 1e-5`.
 We set **VSA attention sparsity** to 0.8, and training runs for **4000 steps (~12 hours)**.
-The detailed training example script is available [here](https://github.com/hao-ai-lab/FastVideo/blob/main/examples/distill/Wan-Syn-480P/distill_dmd_VSA_t2v.slurm).
+The detailed **training example script** is available [here](https://github.com/hao-ai-lab/FastVideo/blob/main/examples/distill/Wan-Syn-480P/distill_dmd_VSA_t2v.slurm).
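For orientation, below is a minimal sketch of what such a 4-node Slurm launch could look like. The node/GPU layout, learning rate, accumulation steps, sparsity, and step count come from the text above; the entry-point filename and all training flags are hypothetical placeholders, so consult the linked `distill_dmd_VSA_t2v.slurm` for the actual FastVideo invocation. Note that the stated `global batch size = 64` is consistent with a per-GPU micro-batch of 1 (an assumption here): 32 GPUs × 1 × `gradient_accumulation_steps=2` = 64.

```bash
#!/bin/bash
#SBATCH --nodes=4               # 4 nodes x 8 GPUs = 32 H200s in total
#SBATCH --gpus-per-node=8
#SBATCH --ntasks-per-node=1     # one torchrun launcher per node
#SBATCH --time=12:00:00         # ~12 h wall clock for the 4000 steps

# All ranks rendezvous on the first allocated node.
head_node=$(scontrol show hostnames "$SLURM_JOB_NODELIST" | head -n 1)

# The entry-point file and every flag below are hypothetical placeholders;
# see the linked distill_dmd_VSA_t2v.slurm for the real FastVideo invocation.
srun torchrun \
    --nnodes=4 \
    --nproc_per_node=8 \
    --rdzv_backend=c10d \
    --rdzv_endpoint="${head_node}:29500" \
    distill_dmd_vsa_t2v.py \
    --learning_rate 1e-5 \
    --gradient_accumulation_steps 2 \
    --gradient_checkpointing \
    --vsa_sparsity 0.8 \
    --max_steps 4000
```

In this pattern Slurm provides the allocation while `torchrun` spawns the 8 per-node worker processes; gradient checkpointing trades extra recompute for lower activation memory, and accumulating over two micro-steps reaches the global batch of 64 without needing a larger per-GPU batch.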