---
library_name: diffusers
license: mit
pipeline_tag: text-to-video
tags:
  - video-generation
---

# DCM: Dual-Expert Consistency Model for Efficient and High-Quality Video Generation

This repository hosts the Dual-Expert Consistency Model (DCM) presented in the paper *Dual-Expert Consistency Model for Efficient and High-Quality Video Generation*. DCM addresses a key challenge in applying Consistency Models to video diffusion: distilled samplers often suffer from temporal inconsistency and loss of detail. With a dual-expert design, DCM achieves state-of-the-art visual quality with significantly fewer sampling steps.

For more information, please refer to the project's GitHub repository.

## Usage

You can use this model with the `diffusers` library. Make sure you have `diffusers`, `transformers`, `torch`, `accelerate`, and `imageio` installed (plus `imageio-ffmpeg` if you want MP4 output):

```bash
pip install diffusers transformers torch accelerate "imageio[ffmpeg]"
```

Here is a quick example to generate a video:

```python
import torch
import imageio
from diffusers import DiffusionPipeline

# Load the pipeline.
# custom_pipeline and trust_remote_code are required because the pipeline
# class (WanPipeline) is defined in this repository rather than in the
# standard diffusers library.
pipe = DiffusionPipeline.from_pretrained(
    "Vchitect/DCM",
    torch_dtype=torch.float16,
    custom_pipeline="Vchitect/DCM",
    trust_remote_code=True,
)
pipe.to("cuda")

# Define the prompt and generation parameters
prompt = "A futuristic car driving through a neon-lit city at night"
generator = torch.Generator(device="cuda").manual_seed(0)  # for reproducibility

# Generate video frames
video_frames = pipe(
    prompt=prompt,
    num_frames=16,           # number of frames to generate
    num_inference_steps=4,   # DCM excels at efficient generation in few steps
    guidance_scale=7.5,      # classifier-free guidance scale
    generator=generator,
).frames[0]  # the output batches videos; take the first (a list of frames)

# Save the generated video
output_path = "generated_video.gif"  # switch to .mp4 if imageio-ffmpeg is installed
imageio.mimsave(output_path, video_frames, fps=8)  # frames per second
print(f"Video saved to {output_path}")
```
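
If you prefer an MP4 over a GIF, `diffusers` also ships an `export_to_video` utility that writes a list of frames to a video file (it needs an ffmpeg backend such as `imageio-ffmpeg`). A minimal sketch, reusing the `video_frames` list from the example above; `fps=8` simply mirrors the GIF example:

```python
from diffusers.utils import export_to_video

# Write the frames generated above to an MP4 file (requires imageio-ffmpeg).
export_to_video(video_frames, "generated_video.mp4", fps=8)
```

If you run out of GPU memory, replacing `pipe.to("cuda")` with `pipe.enable_model_cpu_offload()` usually shrinks the VRAM footprint at some cost in speed; whether it works here depends on the custom pipeline, so treat it as an optional tweak to try.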