🎬 Wan-NVFP4-4Steps Models

NVFP4 Quantization-Aware Step Distillation for Blackwell Architecture

✨ Features

  • ⚡ 4-Step Inference: Generation runs in 4 denoising steps, bringing end-to-end time on a single RTX 5090 down to 17.65s for I2V-14B-480P and 6.54s for T2V-1.3B-480P
  • 🎯 NVFP4 Quantization: Reduced memory and bandwidth usage, optimized for Blackwell architecture
  • 🔧 LightX2V Integration: Optimal performance and stability on the official framework
  • 🚀 High-Quality Generation: Maintains Wan2.1's video quality while running 12.8x to 28x faster end to end
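
To make the NVFP4 feature above more concrete, here is a minimal pure-Python sketch of block-scaled 4-bit quantization. It assumes the commonly documented NVFP4 layout: FP4 (E2M1) values that share one scale per block of 16 elements. The real format stores that scale in FP8 (E4M3) and adds a per-tensor scale, both of which this sketch skips. The function names `quantize_block` and `fake_quantize` are hypothetical and are not part of LightX2V or its kernel.

```python
import math

# Magnitudes representable by FP4 E2M1 (1 sign, 2 exponent, 1 mantissa bit).
E2M1_GRID = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)
BLOCK_SIZE = 16  # assumption: one shared scale per 16 consecutive values


def quantize_block(block):
    """Return (scale, codes); codes are signed values from the E2M1 grid."""
    amax = max(abs(v) for v in block)
    # Map the largest magnitude in the block onto the largest E2M1 value (6.0).
    # Real NVFP4 stores this scale in FP8; it is kept as a Python float here.
    scale = amax / 6.0 if amax > 0 else 1.0
    codes = []
    for v in block:
        target = abs(v) / scale
        nearest = min(E2M1_GRID, key=lambda g: abs(g - target))
        codes.append(math.copysign(nearest, v))
    return scale, codes


def fake_quantize(values):
    """Quantize and immediately dequantize a flat list, block by block."""
    out = []
    for start in range(0, len(values), BLOCK_SIZE):
        block = values[start:start + BLOCK_SIZE]
        scale, codes = quantize_block(block)
        out.extend(c * scale for c in codes)
    return out


if __name__ == "__main__":
    weights = [math.sin(i) * 0.1 for i in range(32)]
    restored = fake_quantize(weights)
    max_err = max(abs(a - b) for a, b in zip(weights, restored))
    print(f"max abs error after the round trip: {max_err:.4f}")
```

The round trip shows where the quantization error comes from: each value snaps to one of eight magnitudes relative to its block's largest element. Quantization-aware distillation trains the model with this rounding in the loop so that it learns to tolerate it.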

🚀 Quick Start

# 1. Install LightX2V (uv and scikit_build_core are needed by the commands below)
pip install scikit_build_core uv
git clone https://github.com/ModelTC/LightX2V.git
cd LightX2V
uv pip install -v .

# 2. Build and install the NVFP4 kernel (CUTLASS is required at build time)
git clone https://github.com/NVIDIA/cutlass.git
cd lightx2v_kernel

MAX_JOBS=$(nproc) CMAKE_BUILD_PARALLEL_LEVEL=$(nproc) \
uv build --wheel \
  -Cbuild-dir=build . \
  -Ccmake.define.CUTLASS_PATH=/path/to/cutlass \
  --verbose --color=always --no-build-isolation

pip install dist/*whl --force-reinstall --no-deps

# 3. Run inference (examples/wan is relative to the LightX2V repository root)
cd ../examples/wan
python wan_i2v_nvfp4.py   # Image-to-Video
python wan_t2v_nvfp4.py   # Text-to-Video

🎬 Generation Results

"A cinematic, hyper-realistic 3D animation, in the somber and beautiful style of Sekiro: Shadows Die Twice. In a vast field of silvery-white pampas grass, under a luminous full moon, the shinobi Wolf stands ready for a final duel..."

Comparison (videos omitted): Input Image | Wan2.1-I2V-14B-480P | wan2.1_i2v_480p_nvfp4_lightx2v_4step

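"High contrast, high saturation, short-side composition, sunset, medium focal length, soft light, backlight, warm tones, rim light, medium close-up, daylight, sunny lighting, a close-up of a foreign white woman wearing a yellow plaid dress and earrings. As the low-angle shot rises, the woman lifts her head, tears in her eyes, looking ahead and speaking..."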

Comparison (videos omitted): Wan2.1-T2V-1.3B | wan2.1_t2v_1_3b_nvfp4_lightx2v_4step

⚡ Performance Comparison

Test Environment: RTX 5090 Single GPU | LightX2V Framework

📸 Image-to-Video (I2V-14B-480P)

| Metric | Original Model | Optimized Model | Speedup |
|---|---|---|---|
| Single-step Denoising | 12.10s | 3.40s | 3.6x |
| End-to-End | 498.90s | 17.65s | 28x |

🎬 Text-to-Video (T2V-1.3B-480P)

| Metric | Original Model | Optimized Model | Speedup |
|---|---|---|---|
| Single-step Denoising | 2.00s | 0.70s | 2.9x |
| End-to-End | 83.50s | 6.54s | 12.8x |
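
The speedup figures are the ratio of original to optimized time. The short sketch below recomputes them from the timings reported above; the dictionary layout and the `speedup` helper are illustrative only.

```python
# Timings in seconds, copied from the tables above (RTX 5090, LightX2V).
TIMINGS = {
    "I2V-14B-480P": {
        "Single-step Denoising": (12.10, 3.40),
        "End-to-End": (498.90, 17.65),
    },
    "T2V-1.3B-480P": {
        "Single-step Denoising": (2.00, 0.70),
        "End-to-End": (83.50, 6.54),
    },
}


def speedup(original_s, optimized_s):
    """Return how many times faster the optimized run is."""
    return original_s / optimized_s


if __name__ == "__main__":
    for model, metrics in TIMINGS.items():
        for metric, (original_s, optimized_s) in metrics.items():
            ratio = speedup(original_s, optimized_s)
            print(f"{model:14s} {metric:22s} {ratio:.2f}x")
```

The end-to-end gain is much larger than the per-step gain because the distilled model also needs far fewer denoising steps, so the two effects multiply.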

⚠️ Notes

System Requirements

  • Required Hardware: NVIDIA RTX 50-series GPUs (RTX 5090/5080/5070/5060) or other Blackwell architecture GPUs
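
A quick way to check the hardware requirement before building the kernel is to read the GPU's CUDA compute capability. The sketch below assumes that Blackwell-generation devices report a major version of 10 or higher (RTX 50-series cards report 12.0, while Ada reports 8.9 and Hopper 9.0). The helper names are hypothetical, and the check says nothing about whether a particular kernel build targets your exact device.

```python
def is_blackwell_or_newer(capability):
    """capability is a (major, minor) tuple such as (12, 0)."""
    major, _minor = capability
    # Assumption: compute capability major >= 10 means Blackwell or later.
    return major >= 10


def current_gpu_capability():
    """Return the first GPU's (major, minor), or None if it cannot be read."""
    try:
        import torch  # optional; only needed for the live check
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_capability(0)


if __name__ == "__main__":
    cap = current_gpu_capability()
    if cap is None:
        print("No CUDA GPU detected (or PyTorch is not installed).")
    elif is_blackwell_or_newer(cap):
        print(f"Compute capability {cap[0]}.{cap[1]}: meets the requirement.")
    else:
        print(f"Compute capability {cap[0]}.{cap[1]}: older than Blackwell.")
```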

Dependencies

  • Provide the T5, CLIP, and VAE components yourself; they follow the same structure as Wan2.x

Performance Tips

  • Use Blackwell + NVFP4 for best performance
  • Enable CPU offload for GPUs with limited memory
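
To judge whether CPU offload is likely to be needed, a rough estimate of the transformer weight footprint can help. The sketch below makes several simplifying assumptions: nominal parameter counts taken from the model names (14B and 1.3B), every weight stored in NVFP4 at 4 bits plus one 8-bit scale per 16 values, and no accounting for the T5, CLIP, and VAE components, activations, or any layers kept in higher precision. Real usage will be higher.

```python
BF16_BITS = 16
# Assumption: 4-bit value plus one 8-bit block scale shared by 16 values.
NVFP4_BITS = 4 + 8 / 16


def weight_gib(num_params, bits_per_param):
    """Approximate weight storage in GiB for a given bit width."""
    return num_params * bits_per_param / 8 / 2**30


if __name__ == "__main__":
    # Nominal sizes inferred from the model names, not measured values.
    for name, params in (("I2V-14B", 14e9), ("T2V-1.3B", 1.3e9)):
        bf16 = weight_gib(params, BF16_BITS)
        nvfp4 = weight_gib(params, NVFP4_BITS)
        print(f"{name}: BF16 ~{bf16:.1f} GiB, NVFP4 ~{nvfp4:.1f} GiB")
```

Under these assumptions the weights shrink by a factor of about 3.6 (16 / 4.5), which is the memory and bandwidth saving the NVFP4 format targets.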

🤝 Community


If you find this project helpful, please give us a ⭐ on GitHub
