somos99 committed · Commit 5738bbd · verified · 1 Parent(s): 6d1cefe

Update README.md

Files changed (1)
  1. README.md +26 -18
README.md CHANGED
@@ -5,10 +5,9 @@ license: other
5
  license_link: LICENSE
6
  ---
7
 
8
-
9
  <div align="center">
10
 
11
- <img src="./assets/logo.png" alt="HunyuanImage-3.0 Logo" width="400">
12
 
13
  # 🎨 HunyuanImage-3.0: A Powerful Native Multimodal Model for Image Generation
14
 
@@ -26,12 +25,12 @@ license_link: LICENSE
26
  <a href=https://hunyuan.tencent.com/image target="_blank"><img src=https://img.shields.io/badge/Official%20Site-333399.svg?logo=homepage height=22px></a>
27
  <a href=https://huggingface.co/tencent/HunyuanImage-3.0 target="_blank"><img src=https://img.shields.io/badge/%F0%9F%A4%97%20Models-d96902.svg height=22px></a>
28
  <a href=https://github.com/Tencent-Hunyuan/HunyuanImage-3.0 target="_blank"><img src= https://img.shields.io/badge/Page-bb8a2e.svg?logo=github height=22px></a>
29
- <a href=https://github.com/Tencent-Hunyuan/HunyuanImage-3.0/blob/main/assets/HunyuanImage_3_0.pdf target="_blank"><img src=https://img.shields.io/badge/Report-b5212f.svg?logo=arxiv height=22px></a>
30
  <a href=https://x.com/TencentHunyuan target="_blank"><img src=https://img.shields.io/badge/Hunyuan-black.svg?logo=x height=22px></a>
31
  </div>
32
 
33
  <p align="center">
34
- 👏 Join our <a href="https://github.com/Tencent-Hunyuan/HunyuanImage-3.0/blob/main/assets/WECHAT.md" target="_blank">WeChat</a> and <a href="https://discord.gg/ehjWMqF5wY">Discord</a> |
35
  💻 <a href="https://hunyuan.tencent.com/modelSquare/home/play?modelId=289&from=/visual">Official website(官网) Try our model!</a>&nbsp&nbsp
36
  </p>
37
 
@@ -125,7 +124,10 @@ If you develop/use HunyuanImage-3.0 in your projects, welcome to let us know.
125
  # 1. First install PyTorch (CUDA 12.8 Version)
126
  pip install torch==2.7.1 torchvision==0.22.1 torchaudio==2.7.1 --index-url https://download.pytorch.org/whl/cu128
127
 
128
- # 2. Then install other dependencies
 
 
 
129
  pip install -r requirements.txt
130
  ```
131
 
@@ -204,24 +206,31 @@ hf download tencent/HunyuanImage-3.0 --local-dir ./HunyuanImage-3
204
  ```
205
 
206
  #### 3️⃣ Run the Demo
 
207
 
208
  ```bash
209
- python3 run_image_gen.py --model-id ./HunyuanImage-3 --verbose 1 --prompt "A brown and white dog is running on the grass"
 
 
 
 
210
  ```
211
 
212
  #### 4️⃣ Command Line Arguments
213
 
214
- | Arguments | Description | Default |
215
- |----------------------|-----------------------------------------------------------------|-------------|
216
- | `--prompt` | Input prompt | (Required) |
217
- | `--model-id` | Model path | (Required) |
218
- | `--attn-impl` | Attention implementation. Either `sdpa` or `flash_attention_2`. | `sdpa` |
219
- | `--moe-impl` | MoE implementation. Either `eager` or `flashinfer` | `eager` |
220
- | `--seed` | Random seed for image generation | `None` |
221
- | `--diff-infer-steps` | Diffusion infer steps | `50` |
222
- | `--image-size` | Image resolution. Can be `auto`, like `1280x768` or `16:9` | `auto` |
223
- | `--save` | Image save path. | `image.png` |
224
- | `--verbose` | Verbose level. 0: No log; 1: log inference information. | `0` |
 
 
225
 
226
  ### 🎨 Interactive Gradio Demo
227
 
@@ -422,4 +431,3 @@ We extend our heartfelt gratitude to the following open-source projects and comm
422
  * 🌐 [HuggingFace](https://huggingface.co/) - AI model hub and community
423
  * ⚡ [FlashAttention](https://github.com/Dao-AILab/flash-attention) - Memory-efficient attention
424
  * 🚀 [FlashInfer](https://github.com/flashinfer-ai/flashinfer) - Optimized inference engine
425
-
 
5
  license_link: LICENSE
6
  ---
7
 
 
8
  <div align="center">
9
 
10
+ <img src="./assets/logo.png" alt="HunyuanImage-3.0 Logo" width="600">
11
 
12
  # 🎨 HunyuanImage-3.0: A Powerful Native Multimodal Model for Image Generation
13
 
 
25
  <a href=https://hunyuan.tencent.com/image target="_blank"><img src=https://img.shields.io/badge/Official%20Site-333399.svg?logo=homepage height=22px></a>
26
  <a href=https://huggingface.co/tencent/HunyuanImage-3.0 target="_blank"><img src=https://img.shields.io/badge/%F0%9F%A4%97%20Models-d96902.svg height=22px></a>
27
  <a href=https://github.com/Tencent-Hunyuan/HunyuanImage-3.0 target="_blank"><img src= https://img.shields.io/badge/Page-bb8a2e.svg?logo=github height=22px></a>
28
+ <a href=./assets/HunyuanImage_3_0.pdf target="_blank"><img src=https://img.shields.io/badge/Report-b5212f.svg?logo=arxiv height=22px></a>
29
  <a href=https://x.com/TencentHunyuan target="_blank"><img src=https://img.shields.io/badge/Hunyuan-black.svg?logo=x height=22px></a>
30
  </div>
31
 
32
  <p align="center">
33
+ 👏 Join our <a href="./assets/WECHAT.md" target="_blank">WeChat</a> and <a href="https://discord.gg/ehjWMqF5wY">Discord</a> |
34
  💻 <a href="https://hunyuan.tencent.com/modelSquare/home/play?modelId=289&from=/visual">Official website(官网) Try our model!</a>&nbsp&nbsp
35
  </p>
36
 
 
124
  # 1. First install PyTorch (CUDA 12.8 Version)
125
  pip install torch==2.7.1 torchvision==0.22.1 torchaudio==2.7.1 --index-url https://download.pytorch.org/whl/cu128
126
 
127
+ # 2. Then install tencentcloud-sdk
128
+ pip install -i https://mirrors.tencent.com/pypi/simple/ --upgrade tencentcloud-sdk-python
129
+
130
+ # 3. Then install other dependencies
131
  pip install -r requirements.txt
132
  ```
133
 
 
206
  ```
207
 
208
  #### 3️⃣ Run the Demo
209
+ The pretrain checkpoint does not automatically rewrite or enhance input prompts. For best results, we currently recommend using DeepSeek to rewrite prompts before generation.
210
 
211
  ```bash
212
+ # Set the DeepSeek credentials (used for prompt rewriting)
213
+ export DEEPSEEK_KEY_ID="your_deepseek_key_id"
214
+ export DEEPSEEK_KEY_SECRET="your_deepseek_key_secret"
215
+
216
+ python3 run_image_gen.py --model-id ./HunyuanImage-3 --verbose 1 --sys-deepseek-prompt "universal" --prompt "A brown and white dog is running on the grass"
217
  ```
218
 
219
  #### 4️⃣ Command Line Arguments
220
 
221
+ | Arguments | Description | Default |
222
+ |-------------------------|-----------------------------------------------------------------|-------------|
223
+ | `--prompt` | Input prompt | (Required) |
224
+ | `--model-id` | Model path | (Required) |
225
+ | `--attn-impl` | Attention implementation. Either `sdpa` or `flash_attention_2`. | `sdpa` |
226
+ | `--moe-impl` | MoE implementation. Either `eager` or `flashinfer` | `eager` |
227
+ | `--seed` | Random seed for image generation | `None` |
228
+ | `--diff-infer-steps` | Number of diffusion inference steps | `50` |
229
+ | `--image-size` | Image resolution: `auto`, an explicit size such as `1280x768`, or an aspect ratio such as `16:9` | `auto` |
230
+ | `--save` | Image save path. | `image.png` |
231
+ | `--verbose` | Verbose level. 0: No log; 1: log inference information. | `0` |
232
+ | `--rewrite` | Whether to enable prompt rewriting | `True` |
233
+ | `--sys-deepseek-prompt` | System prompt used for rewriting: `universal` or `text_rendering` | `universal` |
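
As a rough illustration of how the new arguments fit together, the sketch below checks that the two DeepSeek environment variables are set and assembles the same command line shown in the demo. The helper names `missing_deepseek_env` and `build_command` are hypothetical and not part of the repository. Passing `--seed` as a plain integer is an assumption based on the table above.

```python
import os
import shlex

# Environment variable names taken from the README snippet in this commit.
DEEPSEEK_ENV_VARS = ("DEEPSEEK_KEY_ID", "DEEPSEEK_KEY_SECRET")
# Values listed for --sys-deepseek-prompt in the arguments table.
SYS_PROMPT_CHOICES = ("universal", "text_rendering")


def missing_deepseek_env(env=None):
    """Return the names of the DeepSeek variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in DEEPSEEK_ENV_VARS if not env.get(name)]


def build_command(prompt, model_id, sys_prompt="universal", seed=None, save=None):
    """Assemble the run_image_gen.py argument list (hypothetical helper)."""
    if sys_prompt not in SYS_PROMPT_CHOICES:
        raise ValueError(f"unknown sys prompt: {sys_prompt!r}")
    cmd = [
        "python3", "run_image_gen.py",
        "--model-id", model_id,
        "--sys-deepseek-prompt", sys_prompt,
        "--prompt", prompt,
    ]
    # Optional flags are only added when a value is given (assumed formats).
    if seed is not None:
        cmd += ["--seed", str(seed)]
    if save is not None:
        cmd += ["--save", save]
    return cmd


if __name__ == "__main__":
    missing = missing_deepseek_env()
    if missing:
        print("Set these variables before running:", ", ".join(missing))
    cmd = build_command(
        "A brown and white dog is running on the grass",
        "./HunyuanImage-3",
        seed=42,
    )
    # Print a shell-quoted command line instead of executing it.
    print(shlex.join(cmd))
```

The sketch only prints the command, so it can be inspected or pasted into a shell. To run it directly, the list could be passed to `subprocess.run`.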
234
 
235
  ### 🎨 Interactive Gradio Demo
236
 
 
431
  * 🌐 [HuggingFace](https://huggingface.co/) - AI model hub and community
432
  * ⚡ [FlashAttention](https://github.com/Dao-AILab/flash-attention) - Memory-efficient attention
433
  * 🚀 [FlashInfer](https://github.com/flashinfer-ai/flashinfer) - Optimized inference engine