RDson committed (verified) · Commit 18bf4a8 · 1 Parent(s): 33690ea

Update README.md

Files changed (1): README.md (+55 -3)
---
license: mit
datasets:
- GAIR/LIMO
language:
- en
base_model:
- deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
tags:
- R1
- DeepSeek
- Distill
- Qwen
- 7B
- LIMO
---
# LIMO-R1-Distill-Qwen-7B

Using [deepseek-ai/DeepSeek-R1-Distill-Qwen-7B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B) as the base model.

Fine-tuned on [GAIR/LIMO](https://huggingface.co/GAIR/LIMO).
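The training config below points at a restructured copy of LIMO (`dataset="limo_restructured"`). The actual restructuring script is not included in this repo; the following is a minimal sketch of one way to convert GAIR/LIMO into LLaMA-Factory's alpaca-style format. The LIMO column names (`question`/`solution`/`answer`) and all file names here are assumptions, not taken from this repo:
```python
# Hypothetical sketch (not from this repo): convert GAIR/LIMO into an
# alpaca-style JSON that LLaMA-Factory can consume. The LIMO column names
# and the output file name below are assumptions.
import json

from datasets import load_dataset

limo = load_dataset("GAIR/LIMO", split="train")

records = [
    {
        "instruction": row["question"],
        "input": "",
        # Keep the full reasoning trace plus the final answer in the target,
        # so the model is trained to produce both.
        "output": f"{row['solution']}\n\nFinal answer: {row['answer']}",
    }
    for row in limo
]

with open("limo_restructured.json", "w", encoding="utf-8") as f:
    json.dump(records, f, ensure_ascii=False, indent=2)

# LLaMA-Factory additionally expects the dataset to be registered in
# data/dataset_info.json, e.g.:
#   "limo_restructured": {"file_name": "limo_restructured.json"}
```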
Trained using LLaMA-Factory with the config:
```python
max_seq_length = 6 * 1024

lora_rank = 32
lora_alpha = lora_rank * 2
lora_target = ["q_proj", "k_proj", "v_proj", "o_proj",
               "gate_proj", "up_proj", "down_proj"]

args = dict(
    stage="sft",
    do_train=True,
    # 4-bit (bitsandbytes) copy of the DeepSeek-R1-Distill-Qwen-7B base model
    model_name_or_path="unsloth/DeepSeek-R1-Distill-Qwen-7B-bnb-4bit",
    dataset="limo_restructured",
    template="custom_template",
    finetuning_type="lora",
    lora_target=lora_target,
    output_dir="qwen_distill_7b_lora",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=3,
    lr_scheduler_type="cosine",
    logging_steps=1,
    warmup_ratio=0.1,
    save_steps=100,
    learning_rate=1e-4,
    num_train_epochs=1.0,
    max_grad_norm=1.0,
    loraplus_lr_ratio=16.0,  # LoRA+: B matrices get 16x the base learning rate
    fp16=True,
    report_to="none",
    preprocessing_num_workers=16,
    cutoff_len=max_seq_length,
)
```
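For reference, a minimal launch sketch (not from this repo): LLaMA-Factory's CLI accepts a JSON or YAML config file, so the `args` dict above can be serialized and passed to `llamafactory-cli train`. The file name is a placeholder:
```python
# Hypothetical launch sketch: serialize the `args` dict above and hand it to
# LLaMA-Factory's CLI. The config file name is a placeholder.
import json

with open("train_limo.json", "w", encoding="utf-8") as f:
    json.dump(args, f, indent=2)

# Then, from a shell with LLaMA-Factory installed:
#   llamafactory-cli train train_limo.json
```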