
Add `diffusers` sample usage to model card

#1 opened by nielsr (HF Staff)

Files changed (1)

  1. README.md +86 -6
README.md CHANGED
@@ -1,11 +1,11 @@
  ---
- license: mit
- language:
- - en
  base_model:
  - Wan-AI/Wan2.1-I2V-14B-480P
- pipeline_tag: image-to-video
+ language:
+ - en
  library_name: diffusers
+ license: mit
+ pipeline_tag: image-to-video
  ---

  <div align="center">
@@ -28,6 +28,87 @@ Traditional cartoon/anime production is time-consuming, requiring skilled artist
  ToonComposer streamlines this with generative AI, turning hours of manual inbetweening and colorization work into a single, seamless process. Visit our [project page](https://lg-li.github.io/project/tooncomposer) and read our [paper](https://arxiv.org/abs/2508.10881) for more details.
  This HF model repo provides the model weights of ToonComposer. Code is available at [our GitHub repo](https://github.com/TencentARC/ToonComposer).

+ ## Sample Usage
+
+ You can use the `ToonComposer` model with the `diffusers` library. Make sure your environment has the required dependencies installed, including `flash-attn`, as specified in the [official GitHub repository](https://github.com/TencentARC/ToonComposer), for optimal performance; a quick environment check is sketched below.
+
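+ This optional snippet is a minimal environment check. It assumes only the `torch` and `flash_attn` package names, not any ToonComposer-specific API:
+
+ ```python
+ import torch
+
+ # A CUDA GPU is strongly recommended; CPU inference will be extremely slow.
+ print("CUDA available:", torch.cuda.is_available())
+
+ # flash-attn is recommended by the authors for performance.
+ try:
+     import flash_attn  # noqa: F401
+     print("flash-attn is installed.")
+ except ImportError:
+     print("flash-attn not found; install it following the GitHub instructions.")
+ ```
+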
+ ```python
+ import os
+
+ import numpy as np  # for creating dummy images when the example paths are missing
+ import torch
+ from diffusers import DiffusionPipeline
+ from PIL import Image
+
+ # Load the ToonComposer pipeline from the Hugging Face Hub.
+ pipeline = DiffusionPipeline.from_pretrained(
+     "TencentARC/ToonComposer",
+     torch_dtype=torch.float16,  # use torch.bfloat16 on newer GPUs if preferred
+     trust_remote_code=True,     # required to load the custom pipeline code
+ )
+ pipeline.to("cuda")  # move the model to GPU; use "cpu" for (much slower) CPU inference
+
+ # --- Prepare your input data ---
+ # 1. Initial colored keyframe (reference image).
+ #    This image sets the base visual style and the initial frame.
+ #    Replace 'path/to/your/initial_colored_keyframe.png' with your actual image file.
+ try:
+     initial_colored_keyframe = Image.open("path/to/your/initial_colored_keyframe.png").convert("RGB")
+ except FileNotFoundError:
+     print("Warning: initial colored keyframe not found. Using a dummy white image for demonstration.")
+     # Create a dummy white image matching the target resolution (e.g., 1088x608 from the model config).
+     initial_colored_keyframe = Image.fromarray(np.full((608, 1088, 3), 255, dtype=np.uint8))
+
+ # 2. Sketch keyframe (for motion control).
+ #    This is typically a black-and-white line drawing that guides motion at a specific frame.
+ #    Replace 'path/to/your/sketch_at_frame_X.png' with your actual sketch image path.
+ #    ToonComposer supports multiple sketches at different time steps; this example uses one.
+ try:
+     sketch_keyframe = Image.open("path/to/your/sketch_at_frame_X.png").convert("RGB")
+ except FileNotFoundError:
+     print("Warning: sketch keyframe not found. Using a dummy black image for demonstration.")
+     # Create a dummy black image matching the target resolution.
+     sketch_keyframe = Image.fromarray(np.full((608, 1088, 3), 0, dtype=np.uint8))
+
+ # Text prompt describing the desired motion or scene.
+ prompt = "a joyful character bouncing a ball"
+
+ # Video generation parameters (adjust as needed; see the model's config.json
+ # and the official GitHub repo for recommended values).
+ num_frames = 33       # example value from the model's config.json
+ height = 608          # example resolution from the model's config.json
+ width = 1088          # example resolution from the model's config.json
+ guidance_scale = 7.5  # common value for text-to-image/video diffusion
+
+ # --- Generate the video frames ---
+ # The exact arguments of this custom pipeline may vary; this call is a plausible
+ # API inferred from common diffusion-pipeline practice and the paper's description.
+ # `image` takes the initial colored keyframe; `sketches` takes (frame index, sketch) pairs.
+ output = pipeline(
+     prompt=prompt,
+     image=initial_colored_keyframe,
+     sketches=[(5, sketch_keyframe)],  # apply `sketch_keyframe` at frame index 5
+     num_frames=num_frames,
+     height=height,
+     width=width,
+     guidance_scale=guidance_scale,
+ )
+ # The output is expected to be a list of PIL images; if the pipeline returns a
+ # batched result instead, take the first element (e.g., `output.frames[0]`).
+ video_frames = output.frames
+
+ # --- Save or display the generated video frames ---
+ output_dir = "./tooncomposer_output"
+ os.makedirs(output_dir, exist_ok=True)
+ for i, frame in enumerate(video_frames):
+     frame.save(f"{output_dir}/frame_{i:04d}.png")
+ print(f"Saved {len(video_frames)} frames to '{output_dir}'.")
+
+ # Optional: compile the frames into a GIF (requires `imageio` and `imageio-ffmpeg`).
+ # import imageio
+ # try:
+ #     imageio.mimsave(f"{output_dir}/output_video.gif", video_frames, fps=10)
+ #     print(f"Saved GIF to '{output_dir}/output_video.gif'.")
+ # except Exception as e:
+ #     print(f"Could not save GIF (ensure imageio and imageio-ffmpeg are installed): {e}")
+ ```
+
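+ As an alternative to per-frame PNGs, the frames can be written to an MP4 file with the `export_to_video` helper from `diffusers.utils`. This sketch assumes the pipeline returned plain PIL frames as above:
+
+ ```python
+ from diffusers.utils import export_to_video
+
+ # Compile the generated frames into an MP4 clip.
+ export_to_video(video_frames, "tooncomposer_output/output_video.mp4", fps=10)
+ ```
+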
  ## Citation

  If you find ToonComposer useful, please consider citing:

@@ -39,5 +120,4 @@ If you find ToonComposer useful, please consider citing:
  journal={arXiv preprint arXiv:2508.10881},
  year={2025}
  }
- ```
-
+ ```