---
license: mit
datasets:
  - ServiceNow-AI/R1-Distill-SFT
language:
  - en
base_model:
  - meta-llama/Llama-3.2-3B-Instruct
pipeline_tag: text-generation
library_name: transformers
tags:
  - reasoning
  - r1
  - axolotl
new_version: suayptalha/DeepSeek-R1-Distill-Llama-3B
---

# DeepSeek-R1-Distill-Llama-3B

This model is a distilled version of DeepSeek-R1, built on Llama-3.2-3B and fine-tuned on the R1-Distill-SFT dataset.

This model is 4-bit quantized! Load it in float16 if you want the full-precision model (a hedged loading sketch appears at the end of this card).

Example usage:

```py
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "suayptalha/DeepSeek-R1-Distill-Llama-3B-4bit",
    load_in_4bit=True,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("suayptalha/DeepSeek-R1-Distill-Llama-3B-4bit")

SYSTEM_PROMPT = """Respond in the following format:
<reasoning>
You should reason between these tags.
</reasoning>

<answer>
Answer goes here...
</answer>

Always use <reasoning> </reasoning> tags even if they are not necessary.
"""

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": "Continue the Fibonacci sequence: 1, 1, 2, 3, 5, 8,"},
]

inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt",
).to("cuda")

output = model.generate(
    input_ids=inputs,
    max_new_tokens=256,
    use_cache=True,
    do_sample=True,  # sampling must be enabled for temperature to take effect
    temperature=0.7,
)
decoded_output = tokenizer.decode(output[0], skip_special_tokens=False)
print(decoded_output)
```

Output:

```
To continue the Fibonacci sequence, we need to recall the pattern of adding the previous two numbers to get the next number. The next numbers in the sequence would be: 13, 21, 34, 55, 89, 144
```

Suggested system prompt:

```
Respond in the following format:
<reasoning>
You should reason between these tags.
</reasoning>

<answer>
Answer goes here...
</answer>

Always use <reasoning> </reasoning> tags even if they are not necessary.
```

## Parameters

- learning rate: 2e-5
- epochs: 1
- optimizer: paged_adamw_8bit

An illustrative mapping of these values to code appears at the end of this card.

## Support

Buy Me A Coffee
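
## Loading in float16

As mentioned above, you can load the weights in float16 instead of 4-bit. Below is a minimal sketch, assuming the same repository id as in the 4-bit example; exact `torch_dtype` handling can vary across `transformers` versions:

```py
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "suayptalha/DeepSeek-R1-Distill-Llama-3B-4bit"  # same repo as the 4-bit example above

# Omit load_in_4bit and request float16 weights instead.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
```

From here, generation works exactly as in the 4-bit example, at higher memory cost but without quantization error.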
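
## Training setup (illustrative sketch)

For reference, here is one way the values in the Parameters section would map onto a Hugging Face `TrainingArguments` object. This is a hypothetical sketch, not the actual training script: the card's tags indicate the model was trained with axolotl, and the `output_dir`, batch size, and precision values below are assumptions.

```py
from transformers import TrainingArguments

# Illustrative only: the real run used axolotl, so anything not listed
# in the Parameters section above is an assumption.
training_args = TrainingArguments(
    output_dir="r1-distill-llama-3b-sft",  # assumed output path
    learning_rate=2e-5,                    # lr (from the Parameters section)
    num_train_epochs=1,                    # epochs (from the Parameters section)
    optim="paged_adamw_8bit",              # optimizer (from the Parameters section)
    per_device_train_batch_size=2,         # assumption: not stated on the card
    bf16=True,                             # assumption: common for Llama-3.2 fine-tunes
)
```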