ruslanmv committed on
Commit
966629c
·
1 Parent(s): b7afaa5

First commit

REAME.md ADDED
@@ -0,0 +1,95 @@
+ # Avatar‑Renderer Checkpoints
+
+ This repository bundles all pretrained model checkpoints required by the [Avatar Renderer MCP](https://github.com/ruslanmv/avatar-renderer-mcp) pipeline.
+
+ **VideoGenie Avatar Generator** is a single‑image → talking‑head engine that ships an MCP‑native stdio server (`render_avatar` tool) and a FastAPI REST façade in one CUDA container. Drop it into any GPU pool and your MCP Gateway auto‑discovers it on boot.
+
+ This model‑hub repo allows you to fetch **all** necessary checkpoints from a **single source** via Git LFS or the Hugging Face Hub API.
+
+ ---
+
+ ## Directory structure
+
+ ```
+ ├── diff2lip
+ │   └── Diff2Lip.pth                      # Audio‑to‑lip Diffusion model
+ ├── fomm
+ │   └── vox-cpk.pth                       # First‑Order‑Motion vox‑cpk checkpoint
+ ├── gfpgan
+ │   └── GFPGANv1.3.pth                    # GFPGAN v1.3 face enhancement model
+ ├── sadtalker
+ │   ├── SadTalker_V0.0.2_256.safetensors  # Safetensors release bundle
+ │   ├── epoch_20.pth                      # Training checkpoint (epoch 20)
+ │   └── sadtalker.pth                     # Legacy binary checkpoint
+ └── wav2lip
+     └── wav2lip_gan.pth                   # Wav2Lip GAN audio-to-lip model
+ ```
+
+ Each subfolder holds the checkpoints for one model. Only `sadtalker/` ships several files: a safetensors bundle plus `.pth` checkpoints, so that different inference pipelines can load the format they expect. `epoch_20.pth` and `sadtalker.pth` are byte‑identical, since their Git LFS pointers carry the same SHA‑256 and size.
+
+ ---
+
+ ## Usage
+
+ ### 1. Clone via Git LFS
+
+ ```bash
+ # Ensure Git LFS is installed and initialised (`git lfs install`):
+ # https://git-lfs.github.com/
+
+ git clone https://huggingface.co/ruslanmv/avatar-renderer
+ cd avatar-renderer
+ # The repository root now contains the checkpoint tree shown above.
+ ```
+
+ ### 2. Download via Python (Hugging Face Hub API)
+
+ ```python
+ from huggingface_hub import snapshot_download
+
+ # Download all files into the Hub cache at ./models-cache; the call returns the snapshot folder
+ models_dir = snapshot_download(
+     repo_id="ruslanmv/avatar-renderer",
+     cache_dir="./models-cache",
+ )
+ print("Checkpoints downloaded to:", models_dir)
+ ```
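
When only some of the models are needed, the full snapshot can be narrowed to selected folders. The following is a minimal sketch, assuming the folder names from the directory tree above. `patterns_for`, `download_subset`, and the `./checkpoints` target are hypothetical names chosen for illustration, while `allow_patterns` and `local_dir` are existing `snapshot_download` parameters.

```python
from typing import Iterable, List

REPO_ID = "ruslanmv/avatar-renderer"
# Folder names taken from the directory structure in this README.
KNOWN_MODELS = {"diff2lip", "fomm", "gfpgan", "sadtalker", "wav2lip"}


def patterns_for(models: Iterable[str]) -> List[str]:
    """Turn model folder names into glob patterns such as 'wav2lip/*'."""
    names = sorted(set(models))
    unknown = [name for name in names if name not in KNOWN_MODELS]
    if unknown:
        raise ValueError(f"unknown model folder(s): {unknown}")
    return [f"{name}/*" for name in names]


def download_subset(models: Iterable[str], local_dir: str = "./checkpoints") -> str:
    """Download only the selected model folders and return the local path."""
    # Imported lazily so patterns_for() stays usable without the package installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=REPO_ID,
        allow_patterns=patterns_for(models),
        local_dir=local_dir,  # hypothetical target directory
    )
```

Calling `download_subset(["wav2lip", "gfpgan"])` would fetch only those two folders. Using `local_dir` instead of `cache_dir` places the files directly under the target directory, which keeps the paths in line with the tree above.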
+
+ ### 3. Integrate with Avatar Renderer MCP
+
+ In your **Avatar Renderer MCP** project, set the checkpoint environment variables to point at your local copy of this repository:
+
+ ```bash
+ export FOMM_CKPT_DIR=/path/to/avatar-renderer/fomm
+ export DIFF2LIP_CKPT=/path/to/avatar-renderer/diff2lip/Diff2Lip.pth
+ export SADTALKER_CKPT_DIR=/path/to/avatar-renderer/sadtalker
+ export WAV2LIP_CKPT=/path/to/avatar-renderer/wav2lip/wav2lip_gan.pth
+ export GFPGAN_CKPT=/path/to/avatar-renderer/gfpgan/GFPGANv1.3.pth
+ ```
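
The same variables can be derived from a single root path in Python, which also catches a missing checkpoint before the pipeline starts. This is a sketch under stated assumptions: the variable names and relative paths are copied from the exports above, while `checkpoint_env` and `apply_checkpoint_env` are hypothetical helpers, not part of Avatar Renderer MCP.

```python
import os
from pathlib import Path
from typing import Dict

# Variable name -> path relative to the repository root (from the exports above).
CHECKPOINT_VARS = {
    "FOMM_CKPT_DIR": "fomm",
    "DIFF2LIP_CKPT": "diff2lip/Diff2Lip.pth",
    "SADTALKER_CKPT_DIR": "sadtalker",
    "WAV2LIP_CKPT": "wav2lip/wav2lip_gan.pth",
    "GFPGAN_CKPT": "gfpgan/GFPGANv1.3.pth",
}


def checkpoint_env(root: str) -> Dict[str, str]:
    """Build the variable mapping, failing early if any path is missing."""
    base = Path(root).expanduser().resolve()
    env: Dict[str, str] = {}
    for var, rel in CHECKPOINT_VARS.items():
        path = base / rel
        if not path.exists():
            raise FileNotFoundError(f"{var}: expected {path}")
        env[var] = str(path)
    return env


def apply_checkpoint_env(root: str) -> None:
    """Export the variables into the current process environment."""
    os.environ.update(checkpoint_env(root))
```

Here `root` would be the cloned repository, or the path returned by `snapshot_download`.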
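+
+ Alternatively, copy the checkpoints into `/models` when building a Docker image. `COPY --from` reads from a build stage or a container image, so the Dockerfile below assumes a container image named `ruslanmv/avatar-renderer` that holds the checkpoints under `/models`:
+
+ ```dockerfile
+ FROM ruslanmv/avatar-renderer-mcp:latest
+ COPY --from=ruslanmv/avatar-renderer /models /models
+ CMD ["uvicorn", "app.api:app", "--host", "0.0.0.0", "--port", "8000"]
+ ```
+
+ ---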
+
+ ## License
+
+ This repository collects checkpoints that were released under their respective open licenses:
+
+ * **FOMM**: [Apache‑2.0](https://github.com/AliaksandrSiarohin/first-order-model/blob/master/LICENSE)
+ * **Diff2Lip**: [MIT](https://github.com/YuanGary/DiffusionLi/blob/main/LICENSE)
+ * **SadTalker**: [Apache‑2.0](https://github.com/Winfredy/SadTalker/blob/main/LICENSE)
+ * **Wav2Lip**: [MIT](https://github.com/Rudrabha/Wav2Lip/blob/master/LICENSE)
+ * **GFPGAN**: [MIT](https://github.com/TencentARC/GFPGAN/blob/main/LICENSE)
+
+ Please refer to each upstream project for full license details.
+
+ ---
+
+ > Maintained by [ruslanmv](https://github.com/ruslanmv).
+ > Part of the [Avatar Renderer MCP](https://github.com/ruslanmv/avatar-renderer-mcp) ecosystem.
diff2lip/Diff2Lip.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:8c71166482d2b893f2f77450563a1bb31d805f3048c7213b974fd9201e9aa4b3
+ size 406815527
fomm/vox-cpk.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:abb41ab1f279f26326c0d6e4d20702de6658364665aa1313daa7a63e89ea2b23
+ size 728766691
gfpgan/GFPGANv1.3.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c953a88f2727c85c3d9ae72e2bd4846bbaf59fe6972ad94130e23e7017524a70
+ size 348632874
sadtalker/SadTalker_V0.0.2_256.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c211f5d6de003516bf1bbda9f47049a4c9c99133b1ab565c6961e5af16477bff
+ size 725066984
sadtalker/epoch_20.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:6d17a6b23457b521801baae583cb6a58f7238fe6721fc3d65d76407460e9149b
+ size 288860037
sadtalker/sadtalker.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:6d17a6b23457b521801baae583cb6a58f7238fe6721fc3d65d76407460e9149b
+ size 288860037
wav2lip/wav2lip_gan.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ca9ab7b7b812c0e80a6e70a5977c545a1e8a365a6c49d5e533023c034d7ac3d8
+ size 435801865
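
Each pointer file above records the SHA‑256 and byte size of the real checkpoint, so a downloaded file can be checked against it. The sketch below assumes the raw pointer text (the three `version` / `oid` / `size` lines, without the leading `+` of the diff) is available as a string. `LfsPointer`, `parse_lfs_pointer`, and `matches_pointer` are hypothetical names introduced for illustration.

```python
import hashlib
from pathlib import Path
from typing import NamedTuple


class LfsPointer(NamedTuple):
    sha256: str
    size: int


def parse_lfs_pointer(text: str) -> LfsPointer:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    oid = fields.get("oid", "")
    if not oid.startswith("sha256:") or "size" not in fields:
        raise ValueError("not a Git LFS pointer with a sha256 oid and a size")
    return LfsPointer(sha256=oid.split(":", 1)[1], size=int(fields["size"]))


def matches_pointer(path: str, pointer: LfsPointer, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file's size and SHA-256 both match the pointer."""
    file_path = Path(path)
    # Cheap size check first; hashing a multi-hundred-MB file is the slow part.
    if file_path.stat().st_size != pointer.size:
        return False
    digest = hashlib.sha256()
    with file_path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == pointer.sha256
```

Reading the file in chunks keeps memory use flat, which matters for checkpoints of the sizes listed in these pointers.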