sxyao committed · Commit 9558325 (verified) · 1 Parent(s): b42671f

Update README.md

Files changed (1): README.md (+5 −4)
README.md CHANGED
@@ -92,7 +92,7 @@ We compared Moonlight with SOTA public models at similar scale:
 
 ### Inference with Hugging Face Transformers
 
-We introduce how to use our model at inference stage using transformers library. It is recommended to use python=3.10, torch>=2.1.0, and the latest version of transformers as the development environment.
+We introduce how to use our model at inference stage using transformers library. It is recommended to use python=3.10, torch>=2.1.0, and transformers=4.48.2 as the development environment.
 
 For our pretrained model (Moonlight-16B-A3B):
 ```python
@@ -111,6 +111,7 @@ prompt = "1+1=2, 1+2="
 inputs = tokenizer(prompt, return_tensors="pt", padding=True, truncation=True).to(model.device)
 generated_ids = model.generate(**inputs, max_new_tokens=100)
 response = tokenizer.batch_decode(generated_ids)[0]
+print(response)
 ```
 
 For our instruct model (Moonlight-16B-A3B-Instruct):
@@ -127,7 +128,6 @@ model = AutoModelForCausalLM.from_pretrained(
 )
 tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
 
-prompt = "Give me a short introduction to large language model."
 messages = [
     {"role": "system", "content": "You are a helpful assistant provided by Moonshot-AI."},
     {"role": "user", "content": "Is 123 a prime?"}
@@ -135,6 +135,7 @@ messages = [
 input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
 generated_ids = model.generate(inputs=input_ids, max_new_tokens=500)
 response = tokenizer.batch_decode(generated_ids)[0]
+print(response)
 ```
 
 Moonlight has the same architecture as DeepSeek-V3, which is supported by many popular inference engines, such as VLLM and SGLang. As a result, our model can also be easily deployed using these tools.
@@ -142,8 +143,8 @@ Moonlight has the same architecture as DeepSeek-V3, which is supported by many p
 ## Citation
 If you find Moonlight is useful or want to use in your projects, please kindly cite our paper:
 ```
-@article{MoonshotAI,
-  author = {Kimi Team},
+@article{MoonshotAIMuon,
+  author = {Jingyuan Liu and Jianlin Su and Xingcheng Yao and Zhejun Jiang and Guokun Lai and Yulun Du and Yidao Qin and Weixin Xu and Enzhe Lu and Junjie Yan and Yanru Chen and Huabin Zheng and Yibo Liu and Shaowei Liu and Bohong Yin and Weiran He and Han Zhu and Yuzhi Wang and Jianzhou Wang and Mengnan Dong and Zheng Zhang and Yongsheng Kang and Hao Zhang and Xinran Xu and Yutao Zhang and Yuxin Wu and Xinyu Zhou and Zhilin Yang},
   title = {Muon is Scalable For LLM Training},
   year = {2025},
 }
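For reference, the hunks above only show fragments of the pretrained-model snippet. Pieced together from the diff's context lines, the full example would look roughly like this; the Hub path `moonshotai/Moonlight-16B-A3B` and the `torch_dtype`/`device_map` settings are assumptions, not shown in the diff:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed Hub path; the diff only names the model "Moonlight-16B-A3B".
model_name = "moonshotai/Moonlight-16B-A3B"
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",      # assumption: use the checkpoint's native dtype
    device_map="auto",       # assumption: place layers on available devices
    trust_remote_code=True,  # Moonlight ships custom modeling code
)
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)

# Prompt and generation settings taken from the diff's context lines.
prompt = "1+1=2, 1+2="
inputs = tokenizer(prompt, return_tensors="pt", padding=True, truncation=True).to(model.device)
generated_ids = model.generate(**inputs, max_new_tokens=100)
response = tokenizer.batch_decode(generated_ids)[0]
print(response)  # the line this commit adds
```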
 
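The instruct-model hunks assemble the same way. A sketch of the complete chat snippet, with the same assumptions about the Hub path and loading options:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed Hub path for the instruct checkpoint.
model_name = "moonshotai/Moonlight-16B-A3B-Instruct"
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",      # assumption
    device_map="auto",       # assumption
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)

# Messages from the diff; the commit drops an unused `prompt` variable here.
messages = [
    {"role": "system", "content": "You are a helpful assistant provided by Moonshot-AI."},
    {"role": "user", "content": "Is 123 a prime?"}
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
generated_ids = model.generate(inputs=input_ids, max_new_tokens=500)
response = tokenizer.batch_decode(generated_ids)[0]
print(response)  # the line this commit adds
```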
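Since the README notes that Moonlight reuses the DeepSeek-V3 architecture supported by vLLM and SGLang, a minimal vLLM sketch might look like this; the Hub path is again an assumption, and the sampling settings are illustrative only:

```python
from vllm import LLM, SamplingParams

# trust_remote_code lets vLLM load the DeepSeek-V3-style model definition
# shipped with the checkpoint; the Hub path is assumed.
llm = LLM(model="moonshotai/Moonlight-16B-A3B-Instruct", trust_remote_code=True)
params = SamplingParams(temperature=0.7, max_tokens=256)  # illustrative values

outputs = llm.generate(["Is 123 a prime?"], params)
print(outputs[0].outputs[0].text)
```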
 