---
license: mit
datasets:
- GAIR/LIMO
language:
- en
base_model:
- deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
tags:
- R1
- DeepSeek
- Distill
- Qwen
- 7B
- LIMO
---

# LIMO-R1-Distill-Qwen-7B

Uses [deepseek-ai/DeepSeek-R1-Distill-Qwen-7B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B) as the base model.

Fine-tuned on [GAIR/LIMO](https://huggingface.co/GAIR/LIMO).

Trained using LLaMA-Factory with the config:

```
max_seq_length = 6*1024

lora_rank = 32
lora_alpha = lora_rank * 2
lora_target = ["q_proj", "k_proj", "v_proj", "o_proj",
               "gate_proj", "up_proj", "down_proj"]

args = dict(
    stage="sft",
    do_train=True,
    model_name_or_path="unsloth/DeepSeek-R1-Distill-Qwen-7B-bnb-4bit",
    dataset="limo_restructured",
    template="custom_template",
    finetuning_type="lora",
    lora_target=lora_target,
    output_dir="qwen_distill_7b_lora",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=3,
    lr_scheduler_type="cosine",
    logging_steps=1,
    warmup_ratio=0.1,
    save_steps=100,
    learning_rate=1e-4,
    num_train_epochs=1.0,
    max_grad_norm=1.0,
    loraplus_lr_ratio=16.0,
    fp16=True,
    report_to="none",
    preprocessing_num_workers=16,
    cutoff_len=max_seq_length,
)
```

System prompt used:

```
'You are a helpful assistant. Please reason step by step inside the tags <think> and </think>. Conclude with **Answer** and put your final answer within \\boxed{}.'
```

Custom template used in training:

```
register_template(
    name="custom_template",
    format_user=StringFormatter(
        slots=["<|User|>{{content}}"]
    ),
    format_assistant=StringFormatter(
        slots=["<|Assistant|>{{content}}<|end▁of▁sentence|>"]
    ),
    format_system=StringFormatter(
        slots=["{{content}}"]
    ),
    format_function=FunctionFormatter(
        slots=[
            "<|Assistant|><|tool▁calls▁begin|><|tool▁call▁begin|>{{type}}<|tool▁sep|>{{name}}\n```json\n{{arguments}}\n```<|tool▁call▁end|><|tool▁calls▁end|><|end▁of▁sentence|>"
        ],
        tool_format="qwen"
    ),
    format_observation=StringFormatter(
        slots=[
            "<|tool▁outputs▁begin|><|tool▁output_begin|>{{content}}<|tool▁output▁end|><|tool▁outputs▁end|>"
        ]
    ),
    format_tools=ToolFormatter(tool_format="qwen"),
    default_system="",
    stop_words=["<|end▁of▁sentence|>"]
)
```

To add variation to the dataset, I randomly replaced the "Okay," at the start of a string with one of the following:

```
starts = [
    "Alright,",
    "Well,",
    "So,",
    "Hmm,",
    "Okay then,",
    "Right,",
    "Let's see,",
    "Now,",
    "Alrighty,",
    "Thinking about it,",
    "You know,",
    "Well then,",
    "Come to think of it,",
    "Actually,",
    "Now that I think about it,",
    "Good question,",
    "Let me think,",
    "Let's see now,",
    "Interesting,",
    "Now then,"
]
```
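The opener replacement described above could look roughly like the sketch below. It is an illustration, not the script that was actually used: the function name `vary_opening`, the fixed seed, and the choice to replace every leading "Okay," (rather than only some of them) are assumptions.

```python
import random

# Replacement openers, copied from the list above.
STARTS = [
    "Alright,", "Well,", "So,", "Hmm,", "Okay then,", "Right,",
    "Let's see,", "Now,", "Alrighty,", "Thinking about it,", "You know,",
    "Well then,", "Come to think of it,", "Actually,",
    "Now that I think about it,", "Good question,", "Let me think,",
    "Let's see now,", "Interesting,", "Now then,",
]


def vary_opening(text, rng, prefix="Okay,"):
    """Swap a leading `prefix` for a random opener (hypothetical helper).

    Text that does not start with `prefix` is returned unchanged, so an
    "Okay," in the middle of a string is never touched.
    """
    if not text.startswith(prefix):
        return text
    return rng.choice(STARTS) + text[len(prefix):]


# The seed is an assumption, used here only to make runs repeatable.
rng = random.Random(0)
print(vary_opening("Okay, so I need to find the sum.", rng))
```

Passing the random generator in explicitly keeps the replacement reproducible across a whole dataset pass.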
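For inference it can help to see what a prompt looks like once the user, assistant, and system slots of the custom template are filled in. The sketch below only mirrors those three slots at the string level. The helper name `build_prompt` is hypothetical, the marker strings are copied as they appear in the template above, and any BOS token the tokenizer may add is not modelled.

```python
# Slot strings copied from the custom template above.
USER_SLOT = "<|User|>{content}"
ASSISTANT_PREFIX = "<|Assistant|>"
EOS = "<|end▁of▁sentence|>"


def build_prompt(system, turns, user):
    """Join system text, finished (user, assistant) turns, and a new user message.

    Hypothetical helper: the system slot is plain "{{content}}", so the
    system text is emitted as-is with no extra markers.
    """
    parts = [system]
    for past_user, past_assistant in turns:
        parts.append(USER_SLOT.format(content=past_user))
        parts.append(ASSISTANT_PREFIX + past_assistant + EOS)
    parts.append(USER_SLOT.format(content=user))
    parts.append(ASSISTANT_PREFIX)  # generation would continue from here
    return "".join(parts)


# A shortened stand-in for the full system prompt shown earlier.
print(build_prompt("You are a helpful assistant.", [], "What is 2 + 2?"))
# prints: You are a helpful assistant.<|User|>What is 2 + 2?<|Assistant|>
```

In practice the tokenizer's own chat template may be the safer route. This version is meant only to make the training-time layout visible.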