
Text to Video

In DiffSynth Studio, we can use several video models, such as CogVideoX-5B and AnimateDiff, to generate videos from text prompts.

Example: Text-to-Video using CogVideoX-5B (Experimental)

See cogvideo_text_to_video.py.

First, we generate a video using the prompt "an astronaut riding a horse on Mars".

https://github.com/user-attachments/assets/4c91c1cd-e4a0-471a-bd8d-24d761262941
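
Under the hood, cogvideo_text_to_video.py builds a CogVideoX pipeline and calls it with the prompt. The sketch below is only an approximation of that script: the model paths, resolution, and sampling parameters are assumptions, so treat the shipped script as authoritative.

```python
import torch
from diffsynth import ModelManager, CogVideoPipeline, save_video

# Load the CogVideoX-5B components. The paths below are assumptions about
# where the downloaded weights live; adjust them to your local layout.
model_manager = ModelManager(torch_dtype=torch.bfloat16)
model_manager.load_models([
    "models/CogVideo/CogVideoX-5b/text_encoder",
    "models/CogVideo/CogVideoX-5b/transformer",
    "models/CogVideo/CogVideoX-5b/vae/diffusion_pytorch_model.safetensors",
])
pipe = CogVideoPipeline.from_model_manager(model_manager)

# Generate the video from the text prompt (resolution and step count are
# assumed values, not necessarily those used for the clip above).
video = pipe(
    prompt="an astronaut riding a horse on Mars",
    height=480, width=720,
    num_inference_steps=50,
    seed=0,
)
save_video(video, "video_1.mp4", fps=8, quality=5)
```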

Then, we convert the astronaut to a robot.

https://github.com/user-attachments/assets/225a00a4-2bc8-4740-8e86-a64b460a29ec
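
The edit is a video-to-video pass: we feed the generated frames back into the same pipeline with a new prompt and partial denoising, so the content changes while the overall motion is preserved. A minimal sketch, continuing from the one above (it reuses pipe); input_video and denoising_strength are assumed parameter names.

```python
from diffsynth import VideoData, save_video

# Reload the generated clip and re-run the pipeline in video-to-video
# mode with a new prompt. The denoising strength controls how much of
# the original frames survives the edit (value here is an assumption).
input_video = VideoData(video_file="video_1.mp4", height=480, width=720)
video = pipe(
    prompt="a robot riding a horse on Mars",
    input_video=[input_video[i] for i in range(len(input_video))],
    denoising_strength=0.7,
    height=480, width=720,
    seed=1,
)
save_video(video, "video_2.mp4", fps=8, quality=5)
```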

Next, we upscale the video using the model itself.

https://github.com/user-attachments/assets/c02cb30c-de60-473c-8242-32c67b3155ad
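
Self-upscaling works the same way: load the frames at the target resolution and re-run the pipeline with a low denoising strength, so the model refines detail rather than regenerating the content. Again a sketch with assumed resolutions and strength, reusing pipe from above.

```python
from diffsynth import VideoData, save_video

# Load the edited clip directly at the higher target resolution, then
# run a gentle video-to-video pass so CogVideoX sharpens the frames
# instead of replacing them.
low_res = VideoData(video_file="video_2.mp4", height=960, width=1440)
video = pipe(
    prompt="a robot riding a horse on Mars",
    input_video=[low_res[i] for i in range(len(low_res))],
    denoising_strength=0.4,
    height=960, width=1440,
    seed=2,
)
save_video(video, "video_3.mp4", fps=8, quality=5)
```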

Finally, we make the video smoother by interpolating frames.

https://github.com/user-attachments/assets/f0e465b4-45df-4435-ab10-7a084ca2b0a0
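
Frame interpolation is handled by a separate RIFE model rather than by CogVideoX itself. The sketch below assumes the RIFE extension shipped with DiffSynth Studio; the import path, checkpoint path, and interpolate call are all assumptions, so check the repository's RIFE usage for the exact API.

```python
import torch
from diffsynth import ModelManager, VideoData, save_video
from diffsynth.extensions.RIFE import RIFEInterpolater  # assumed import path

# Load a RIFE optical-flow model (checkpoint path is an assumption) and
# synthesize intermediate frames to double the effective frame rate.
model_manager = ModelManager(torch_dtype=torch.float32)
model_manager.load_models(["models/RIFE/flownet.pkl"])
interpolater = RIFEInterpolater.from_model_manager(model_manager)

video = VideoData(video_file="video_3.mp4", height=960, width=1440)
frames = interpolater.interpolate([video[i] for i in range(len(video))])
save_video(frames, "video_4.mp4", fps=16, quality=5)  # doubled fps
```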

Here is another example that follows the same four steps; the pipeline calls are identical, only the prompts change.

First, we generate a video using the prompt "a dog is running".

https://github.com/user-attachments/assets/e3696297-99f5-4d0c-a5ca-1d1566db85b4

Then, we add a blue collar to the dog.

https://github.com/user-attachments/assets/7ff22be7-4390-4d33-ae6c-53f6f056e18d

Next, we upscale the video using the model itself.

https://github.com/user-attachments/assets/a909c32c-0b7d-495c-a53c-d23a99a3d3e9

Finally, we make the video smoother by interpolating frames.

https://github.com/user-attachments/assets/ea37c150-97a0-4858-8003-0c2e5eef3331

Example: Text-to-Video using AnimateDiff

Generate a video using a Stable Diffusion model and an AnimateDiff motion module. We can break the limit on the number of frames! See sd_text_to_video.py.

https://github.com/Artiprocher/DiffSynth-Studio/assets/35051019/8f556355-4079-4445-9b48-e9da77699437
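
A rough sketch of what sd_text_to_video.py does: load a Stable Diffusion checkpoint together with an AnimateDiff motion module into a single video pipeline, then sample more frames than the motion module was trained on. The checkpoint names, prompt, and sampler settings below are placeholders, not the script's actual values.

```python
import torch
from diffsynth import ModelManager, SDVideoPipeline, save_video

# Load a Stable Diffusion checkpoint plus an AnimateDiff motion module;
# both file names below are assumptions about your local model layout.
model_manager = ModelManager(torch_dtype=torch.float16)
model_manager.load_models([
    "models/stable_diffusion/aingdiffusion_v12.safetensors",
    "models/AnimateDiff/mm_sd_v15_v2.ckpt",
])
pipe = SDVideoPipeline.from_model_manager(model_manager)

# Generate a clip longer than the motion module's native training window.
# The prompt, frame count, and step count are placeholder values.
video = pipe(
    prompt="a girl is dancing on the beach, highly detailed",
    negative_prompt="lowres, bad anatomy",
    num_frames=128,
    height=512, width=768,
    num_inference_steps=25,
    seed=0,
)
save_video(video, "video_animatediff.mp4", fps=16, quality=5)
```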