tags:
- LoRA
- adapter
---

Please refer to our GitHub repository for more information: https://github.com/alibaba/wan-toy-transform

<div align="center">
<h2>Wan Toy Transform</h2>
<br>
Alibaba Research Intelligence Computing
<br>
<a href="https://github.com/alibaba/wan-toy-transform"><img src='https://img.shields.io/badge/Github-Link-black'></a>
<a href='https://modelscope.cn/models/Alibaba_Research_Intelligence_Computing/wan-toy-transform'><img src='https://img.shields.io/badge/🤖_ModelScope-weights-%23654dfc'></a>
<a href='https://huggingface.co/Alibaba-Research-Intelligence-Computing/wan-toy-transform'><img src='https://img.shields.io/badge/🤗_HuggingFace-weights-%23ff9e0e'></a>
<br>
</div>

This is a LoRA model fine-tuned on [Wan-I2V-14B-480P](https://github.com/Wan-Video/Wan2.1). It turns objects in the input image into fluffy toys.

## 🐍 Installation

```bash
# Python 3.12 and PyTorch 2.6.0 are tested.
pip install torch==2.6.0 torchvision==0.21.0 --index-url https://download.pytorch.org/whl/cu124
pip install -r requirements.txt
```

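A quick optional sanity check that the install can see your GPU:

```bash
# Prints the installed torch version and whether CUDA is visible.
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```
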
## 🔄 Inference

```bash
python generate.py --prompt "The video opens with a clear view of a $name. Then it transforms to a JellyCat-style $name. It has a face and a cute, fluffy and playful appearance." --image "$image_path" --save_file "output.mp4" --offload_type leaf_level
```

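For instance, a concrete invocation might look like this (the object name and image path below are hypothetical placeholders; see the notes that follow):

```bash
# Hypothetical example values; substitute your own object name and image path.
name="cat"
image_path="./examples/cat.png"
python generate.py --prompt "The video opens with a clear view of a $name. Then it transforms to a JellyCat-style $name. It has a face and a cute, fluffy and playful appearance." --image "$image_path" --save_file "output.mp4" --offload_type leaf_level
```
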
Note:

- Change `$name` to the name of the object you want to transform.
- `$image_path` is the path to the first-frame image.
- Choose `--offload_type` from ['leaf_level', 'block_level', 'none', 'model']. More details can be found [here](https://huggingface.co/docs/diffusers/optimization/memory#group-offloading).
- VRAM usage and generation time for each `--offload_type` are listed in the table below.

| `--offload_type` | VRAM Usage | Generation Time (NVIDIA A100) |
| ------------------------------------ | ---------- | ----------------------------- |
| leaf_level | 11.9 GB | 17m17s |
| block_level (num_blocks_per_group=1) | 20.5 GB | 16m48s |
| model | 39.4 GB | 16m24s |
| none | 55.9 GB | 16m08s |

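If you prefer to call the model through diffusers directly rather than `generate.py`, the sketch below shows one possible way. It is a minimal sketch under several assumptions: a recent diffusers release with Wan support, the `Wan-AI/Wan2.1-I2V-14B-480P-Diffusers` base checkpoint, and a LoRA file that `load_lora_weights` can read; the group-offloading calls approximate the `leaf_level` setting from the table above (per the linked memory docs), not the repository's exact implementation.

```python
# Minimal sketch, not the repository's official path; see assumptions above.
import torch
from diffusers import WanImageToVideoPipeline
from diffusers.hooks import apply_group_offloading
from diffusers.utils import export_to_video, load_image

pipe = WanImageToVideoPipeline.from_pretrained(
    "Wan-AI/Wan2.1-I2V-14B-480P-Diffusers", torch_dtype=torch.bfloat16
)
# Assumes this repo's LoRA is in a format diffusers' LoRA loader accepts.
pipe.load_lora_weights("Alibaba-Research-Intelligence-Computing/wan-toy-transform")

# Roughly the 'leaf_level' row of the table: offload every leaf module to CPU
# and stream it back on demand (lowest VRAM, slowest generation).
onload, offload = torch.device("cuda"), torch.device("cpu")
for module in (pipe.transformer, pipe.text_encoder, pipe.vae, pipe.image_encoder):
    apply_group_offloading(
        module, onload_device=onload, offload_device=offload, offload_type="leaf_level"
    )

# Hypothetical 480P first-frame image and the prompt template from above.
image = load_image("./examples/cat.png").resize((832, 480))
prompt = (
    "The video opens with a clear view of a cat. Then it transforms to a "
    "JellyCat-style cat. It has a face and a cute, fluffy and playful appearance."
)
frames = pipe(image=image, prompt=prompt, height=480, width=832).frames[0]
export_to_video(frames, "output.mp4", fps=16)
```
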
## 🤝 Acknowledgements

Special thanks to these projects for their contributions to the community!

- [Wan2.1](https://github.com/Wan-Video/Wan2.1)
- [diffusion-pipe](https://github.com/tdrussell/diffusion-pipe)
- [diffusers](https://github.com/huggingface/diffusers)

## 📄 Our previous work

- [Tora: Trajectory-oriented Diffusion Transformer for Video Generation](https://github.com/alibaba/Tora)
- [AnimateAnything: Fine Grained Open Domain Image Animation with Motion Guidance](https://github.com/alibaba/animate-anything)