Text-to-Video with LTX-Video LoRA Model

This document provides a step-by-step guide to generating videos from text prompts with the LTX-Video model (Lightricks/LTX-Video), run through Hugging Face's diffusers library. The base model is combined with LoRA weights fine-tuned for a specific style, here the "Genshin Impact Env" style used throughout the examples.

Dataset

The LoRA weights were fine-tuned on the following dataset:

https://huggingface.co/datasets/svjack/video-dataset-genshin-impact-anime-organized

Installation

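First, ensure you have the necessary libraries installed. In addition to torch, diffusers, safetensors, and peft, the pipeline's text encoder and tokenizer rely on transformers and sentencepiece, and export_to_video relies on imageio with imageio-ffmpeg. You can install them using pip:

pip install torch diffusers transformers sentencepiece safetensors peft imageio imageio-ffmpeg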

Usage

Below is a complete example of how to generate a video from a text prompt using the LTX-Video model.

Step 1: Import Required Libraries

import torch
from diffusers import LTXPipeline
from diffusers.utils import export_to_video
from IPython import display

Step 2: Load the Model and LoRA Weights

# Load the LTX-Video model with bfloat16 precision
pipe = LTXPipeline.from_pretrained("Lightricks/LTX-Video", torch_dtype=torch.bfloat16)

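# Load LoRA weights for the "Genshin Impact Env" style and register them
# under the adapter name "genshin_impact_env".
# The .safetensors file is assumed to be in the current working directory.
pipe.load_lora_weights("ltx_pytorch_lora_weights.safetensors", adapter_name="genshin_impact_env")

# Activate the adapter with a strength (weight) of 2.0
pipe.set_adapters("genshin_impact_env", adapter_weights=2.0)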

# Move the model to the GPU for faster inference
pipe.to("cuda")

Step 3: Define the Prompt and Generate the Video

# Define the text prompt
prompt = "In the style of Genshin Impact Env, Golden light filters through the canopy, illuminating soft moss and fallen leaves. Wildflowers bloom nearby, and glowing fireflies hover in the air. A gentle stream flows in the background, its murmur blending with birdsong. The scene radiates tranquility and natural charm."

# Define the negative prompt to avoid undesirable qualities
negative_prompt = "worst quality, inconsistent motion, blurry, jittery, distorted"

# Generate the video
video = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    width=704,
    height=480,
    num_frames=161,
    num_inference_steps=50,
).frames[0]

# Export the video to a file
export_to_video(video, "output.mp4", fps=24)
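The call above uses a 704x480 frame and 161 frames. As far as the upstream LTX-Video documentation describes, the model expects a width and height divisible by 32 and a frame count of the form 8k + 1; treat those constraints as an assumption and confirm them against the version you have installed. The helper below, `check_ltx_video_args`, is a hypothetical name and not part of diffusers. It is a minimal sketch that checks the values before a long generation run and reports the clip length at a given frame rate.

```python
def check_ltx_video_args(width, height, num_frames, fps=24):
    """Check assumed LTX-Video size constraints; return clip length in seconds.

    Assumption (not enforced by this guide's source): width and height must be
    divisible by 32, and num_frames must equal 8 * k + 1 for some integer k.
    """
    problems = []
    if width % 32 != 0:
        problems.append(f"width {width} is not divisible by 32")
    if height % 32 != 0:
        problems.append(f"height {height} is not divisible by 32")
    if (num_frames - 1) % 8 != 0:
        problems.append(f"num_frames {num_frames} is not of the form 8k + 1")
    if problems:
        raise ValueError("; ".join(problems))
    # Duration is simply the frame count divided by the export frame rate.
    return num_frames / fps


# The values used in Step 3 pass the check.
duration = check_ltx_video_args(width=704, height=480, num_frames=161, fps=24)
print(f"Clip length: {duration:.2f} s")  # 161 / 24, about 6.71 s
```

Running the check first is cheap compared with a 50-step generation, so it can catch a mistyped resolution or frame count before any GPU time is spent.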

Step 4: Display the Generated Video

# Display the generated video in a Jupyter notebook
display.Video("output.mp4")

Example Prompts

LoRA Prefix

In the style of Genshin Impact Env,
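Since every prompt in this guide starts with the same prefix, it can be convenient to add it programmatically. The following is a minimal sketch; `LORA_PREFIX` and `with_lora_prefix` are hypothetical names introduced here for illustration and are not part of diffusers or of the LoRA itself.

```python
LORA_PREFIX = "In the style of Genshin Impact Env,"


def with_lora_prefix(scene: str, prefix: str = LORA_PREFIX) -> str:
    """Return `scene` with the LoRA prefix prepended, unless already present."""
    scene = scene.strip()
    # Avoid doubling the prefix when the caller already included it.
    if scene.lower().startswith(prefix.lower()):
        return scene
    return f"{prefix} {scene}"


prompt = with_lora_prefix("a quiet harbor at dawn, mist drifting over the water.")
print(prompt)
```

The resulting string can be passed as the `prompt` argument in Step 3.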

Here are three example prompts that you can use to generate different videos:

Forest Scene:

prompt = "In the style of Genshin Impact Env, Golden light filters through the canopy, illuminating soft moss and fallen leaves. Wildflowers bloom nearby, and glowing fireflies hover in the air. A gentle stream flows in the background, its murmur blending with birdsong. The scene radiates tranquility and natural charm."

Video without LoRA

Video with LoRA

Castle Scene:

prompt = "In the style of Genshin Impact Env, the video shifts to a majestic castle under a starry sky. Silvery moonlight bathes the ancient stone walls, casting soft shadows on the surrounding landscape. Towering spires rise into the night, their peaks adorned with glowing orbs that mimic the stars above. A tranquil moat reflects the shimmering heavens, its surface rippling gently in the cool breeze. Fireflies dance around the castle’s ivy-covered arches, adding a touch of magic to the scene. In the distance, a faint aurora paints the horizon with hues of green and purple, blending seamlessly with the celestial tapestry. The scene exudes an aura of timeless wonder and serene beauty."

Video without LoRA

Video with LoRA

Coastal Scene:

prompt = "In the style of Genshin Impact Env, the video shifts to a breathtaking coastal scene. The turquoise sea stretches endlessly, its waves gently lapping against the golden sandy shore. The sky is painted with hues of orange and pink as the sun dips toward the horizon, casting a warm glow over the water. Rocky cliffs rise along the coastline, their jagged edges softened by patches of green vegetation. Seafoam glistens as it washes ashore, and the air is filled with the soothing sound of the tide. The scene radiates a serene and timeless beauty, capturing the essence of the ocean’s tranquility."

Video without LoRA

Video with LoRA
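Each scene above is shown once without and once with the LoRA. One way such a pair could be produced from a single loaded pipeline is sketched below. It assumes `pipe` is the pipeline from Step 2 with the adapter already loaded, and it relies on the `disable_lora()` and `enable_lora()` methods that diffusers pipelines with LoRA support expose. The function name `render_with_and_without_lora` is hypothetical, and this is not necessarily how the original comparison videos were made.

```python
def render_with_and_without_lora(pipe, prompt, **gen_kwargs):
    """Run the same prompt twice: first with LoRA disabled, then enabled.

    `pipe` is assumed to be a diffusers pipeline with a LoRA adapter loaded.
    Extra keyword arguments (width, height, num_frames, ...) are forwarded
    unchanged to both calls.
    """
    results = {}
    pipe.disable_lora()
    try:
        results["without_lora"] = pipe(prompt=prompt, **gen_kwargs).frames[0]
    finally:
        # Re-enable the adapter even if the first generation fails.
        pipe.enable_lora()
    results["with_lora"] = pipe(prompt=prompt, **gen_kwargs).frames[0]
    return results
```

Each entry of the returned dictionary can then be written out with `export_to_video` as in Step 3. This sketch does not fix the random seed, so the two clips start from different noise; a like-for-like comparison would need a freshly seeded generator for each call.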

Conclusion

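This guide demonstrates how to generate videos from text prompts using the LTX-Video model with a style LoRA. By adjusting the prompt, the adapter strength, and generation parameters such as resolution, frame count, and number of inference steps, you can create a wide variety of video content tailored to your needs.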