Jarvis73 committed on
Commit 72b08bc · verified · 1 Parent(s): 27f1778

Update README.md

Files changed (1):
  1. README.md +13 -3

README.md CHANGED
@@ -99,7 +99,7 @@ If you develop/use HunyuanImage-3.0 in your projects, welcome to let us know.
 
 * 🏆 **The Largest Image Generation MoE Model:** This is the largest open-source image generation Mixture of Experts (MoE) model to date. It features 64 experts and a total of 80 billion parameters, with 13 billion activated per token, significantly enhancing its capacity and performance.
 
-* 🎨 **Superior Image Generation Performance:**Through rigorous dataset curation and advanced reinforcement learning post-training, we've achieved an optimal balance between semantic accuracy and visual excellence. The model demonstrates exceptional prompt adherence while delivering photorealistic imagery with stunning aesthetic quality and fine-grained details.
+* 🎨 **Superior Image Generation Performance:** Through rigorous dataset curation and advanced reinforcement learning post-training, we've achieved an optimal balance between semantic accuracy and visual excellence. The model demonstrates exceptional prompt adherence while delivering photorealistic imagery with stunning aesthetic quality and fine-grained details.
 
 * 💭 **Intelligent World-Knowledge Reasoning:** The unified multimodal architecture endows HunyuanImage-3.0 with powerful reasoning capabilities. It leverages its extensive world knowledge to intelligently interpret user intent, automatically elaborating on sparse prompts with contextually appropriate details to produce superior, more complete visual outputs.
 
@@ -152,13 +152,23 @@ pip install flashinfer-python
 
 ### 🔥 Quick Start with Transformers
 
-The easiest way to get started with HunyuanImage-3.0:
+#### 1️⃣ Download model weights
+
+```bash
+# Download from HuggingFace and rename the directory.
+# Note that the directory name should not contain dots, which can cause issues when loading with Transformers.
+hf download tencent/HunyuanImage-3.0 --local-dir ./HunyuanImage-3
+```
+
+#### 2️⃣ Run with Transformers
 
 ```python
 from transformers import AutoModelForCausalLM
 
 # Load the model
-model_id = "tencent/HunyuanImage-3.0"
+model_id = "./HunyuanImage-3"
+# Currently we cannot load the model directly from the HF model id `tencent/HunyuanImage-3.0`
+# because of the dot in the name.
 
 kwargs = dict(
     attn_implementation="sdpa",  # Use "flash_attention_2" if FlashAttention is installed
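The diff's caveat about dots in the checkpoint directory name can be illustrated with a small helper. This is a hypothetical sketch, not part of the repo, and the exact failure mode inside Transformers' dynamic module loading is not specified in the diff; the commit only states that a dot in the local directory name breaks loading, hence the rename in the download step.

```python
from pathlib import Path

def dotless_local_dir(repo_id: str) -> str:
    """Hypothetical helper: derive a local download directory whose final
    path component contains no dots, since a dotted directory name
    (e.g. "HunyuanImage-3.0") can break loading with Transformers."""
    name = repo_id.split("/")[-1]        # "HunyuanImage-3.0"
    return "./" + name.replace(".", "")  # strip the problematic dots

print(dotless_local_dir("tencent/HunyuanImage-3.0"))  # → ./HunyuanImage-30
```

Note the README itself renames to `./HunyuanImage-3`; any dot-free name works equally well as the `--local-dir` target.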
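The Python snippet in the diff is truncated at the `kwargs` line. A minimal sketch of how it might continue is below: `trust_remote_code`, `torch_dtype`, and `device_map` are standard `from_pretrained` arguments, but their use here is an assumption, and the actual load call is shown commented out because it requires the weights downloaded in step 1.

```python
model_id = "./HunyuanImage-3"  # local directory from step 1; must not contain dots

kwargs = dict(
    attn_implementation="sdpa",  # use "flash_attention_2" if FlashAttention is installed
    trust_remote_code=True,      # assumed: the checkpoint ships custom modeling code
    torch_dtype="auto",          # assumed
    device_map="auto",           # assumed
)

# Requires the full downloaded weights, so shown commented out:
# from transformers import AutoModelForCausalLM
# model = AutoModelForCausalLM.from_pretrained(model_id, **kwargs)

print(sorted(kwargs))
```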