AndrewZeng committed on
Commit afa88cb · verified · 1 Parent(s): 3960f70

Update README.md

Files changed (1)
  1. README.md +15 -0
README.md CHANGED
@@ -12,6 +12,21 @@ base_model:
 
  This is the model checkpoint in Project SimpleRL. Qwen-2.5-Math-7B-SimpleRL is the result of simple RL training from the base model with an initial warmup stage.
 
+ ## Quick Start
+
+ Please generate content using the following template:
+
+ ```
+ "o1_cot": (
+ '[Round 0] USER:\\n{input}\\nPlease reason step by step, and put your final answer within \\\\boxed{{}}. ASSISTANT:\\n',
+ "{output}",
+ "\\n\\n"
+ )
+ ```
+
+ Alternatively, you can use our [evaluation code](https://github.com/hkust-nlp/simpleRL-reason/tree/main/eval) and set the prompt type to "o1_cot".
+
+
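As a rough illustration of how the `o1_cot` template might be filled in, here is a minimal Python sketch. It is not taken from the repository: the helper name `build_prompt` is hypothetical, and it assumes that the doubled backslashes shown in the template are escaping residue (so `\\n` is a newline and `\\\\boxed` is a literal `\boxed`). It also assumes the three tuple entries are an input template, an output template, and a separator.

```python
# Template copied from the Quick Start section.
# Assumption: backslashes are un-doubled relative to how the page shows them.
O1_COT = (
    "[Round 0] USER:\n{input}\nPlease reason step by step, "
    "and put your final answer within \\boxed{{}}. ASSISTANT:\n",  # input template (assumed role)
    "{output}",  # output template (assumed role)
    "\n\n",      # separator (assumed role)
)


def build_prompt(question: str) -> str:
    """Hypothetical helper: fill the input template with a question.

    str.format replaces {input} and collapses the escaped braces {{}} to {}.
    """
    input_template, _output_template, _separator = O1_COT
    return input_template.format(input=question)


if __name__ == "__main__":
    print(build_prompt("What is 1 + 1?"))
```

The resulting string can then be passed as the raw prompt to whichever generation interface you use; under the assumptions above, the model is expected to continue after `ASSISTANT:` and place its final answer inside `\boxed{}`.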
  ## Citation
 
  If you find this blog or our code useful, we would appreciate it if you could cite our work: