---
license: apache-2.0
tags:
- smollm
- python
- code-generation
- instruct
- qlora
- fine-tuned
- code
- nf4
datasets:
- flytech/python-codes-25k
model-index:
- name: HF-SmolLM-1.7B-0.5B-4bit-coder
  results: []
language:
- en
pipeline_tag: text-generation
---

# HF-SmolLM-1.7B-0.5B-4bit-coder

## Model Summary
**HF-SmolLM-1.7B-0.5B-4bit-coder** is a fine-tuned variant of [SmolLM-1.7B](https://huggingface.co/HuggingFaceTB/SmolLM-1.7B), optimized for **instruction-following in Python code generation tasks**.
It was trained on a **1,500-sample subset** of the [flytech/python-codes-25k](https://huggingface.co/datasets/flytech/python-codes-25k) dataset using **parameter-efficient fine-tuning (4-bit QLoRA)**.

The model is suitable for:
- Generating Python code snippets from natural language instructions
- Completing short code functions
- Educational prototyping of fine-tuned LMs

⚠️ This is **not a production-ready coding assistant**. Generated outputs must be manually reviewed before execution.

---

## Intended Uses & Limitations

### ✅ Intended
- Research on parameter-efficient fine-tuning
- Educational demos of instruction-tuning workflows
- Prototype code generation experiments

### ❌ Not Intended
- Deployment in production coding assistants
- Safety-critical applications
- Long-context, multi-file programming tasks

---

## Training Details

### Base Model
- **Name:** [HuggingFaceTB/SmolLM-1.7B](https://huggingface.co/HuggingFaceTB/SmolLM-1.7B)
- **Architecture:** Decoder-only causal LM
- **Total Parameters:** 1.72B
- **Trainable Parameters (fine-tuning):** ~9M (≈0.53% of the total)

### Dataset
- **Source:** [flytech/python-codes-25k](https://huggingface.co/datasets/flytech/python-codes-25k)
- **Subset Used:** 1,500 randomly sampled examples
- **Content:** Instruction + optional input → Python code output
- **Formatting:** Converted into a chat format with `user` / `assistant` roles (see the sketch after this list)

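A minimal sketch of this conversion is shown below. It assumes the dataset exposes `instruction`, `input`, and `output` columns and uses an arbitrary seed for the random subset; neither detail is taken from the actual training script.

```python
# Sketch: map each example into user/assistant chat messages.
# Column names and the shuffle seed are assumptions, not the exact training setup.
from datasets import load_dataset

dataset = load_dataset("flytech/python-codes-25k", split="train")
dataset = dataset.shuffle(seed=42).select(range(1500))  # 1,500-example subset

def to_chat(example):
    user_turn = example["instruction"]
    extra = example["input"] if "input" in example else ""
    if extra and extra.strip():
        user_turn += "\n\n" + extra  # append the optional input block
    return {
        "messages": [
            {"role": "user", "content": user_turn},
            {"role": "assistant", "content": example["output"]},
        ]
    }

dataset = dataset.map(to_chat, remove_columns=dataset.column_names)
```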
					
						

### Training Procedure
- **Framework:** Hugging Face Transformers + TRL (`SFTTrainer`)
- **Quantization:** 4-bit QLoRA (nf4) with bfloat16 compute when available (see the configuration sketch after this list)
- **Effective Batch Size:** 6 (with gradient accumulation)
- **Optimizer:** AdamW
- **Scheduler:** Cosine decay with warmup ratio 0.05
- **Epochs:** 3
- **Learning Rate:** 2e-4
- **Max Sequence Length:** 64 tokens (training)
- **Mixed Precision:** FP16
- **Gradient Checkpointing:** Enabled

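The hyperparameters above correspond to a fairly standard QLoRA + `SFTTrainer` recipe. The sketch below shows one way to wire them together; the LoRA rank, alpha, and target modules, as well as the per-device batch size / accumulation split, are assumptions not stated on this card, and exact TRL argument names vary slightly between versions.

```python
# Sketch of the QLoRA fine-tuning setup described above (values marked "assumed"
# are illustrative, not taken from the original training script).
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

base_id = "HuggingFaceTB/SmolLM-1.7B"

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",              # nf4 quantization
    bnb_4bit_compute_dtype=torch.bfloat16,  # bfloat16 compute when available
)
model = AutoModelForCausalLM.from_pretrained(
    base_id, quantization_config=bnb_config, device_map="auto"
)

peft_config = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,                   # assumed
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed
    task_type="CAUSAL_LM",
)

training_args = SFTConfig(
    output_dir="smollm-coder",
    num_train_epochs=3,
    per_device_train_batch_size=2,   # 2 x 3 accumulation steps = effective batch size 6 (split assumed)
    gradient_accumulation_steps=3,
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.05,
    fp16=True,
    gradient_checkpointing=True,
    max_seq_length=64,
)

trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset,   # chat-formatted subset from the dataset sketch above
    peft_config=peft_config,
)
trainer.train()
```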
					
						

---

## Evaluation
No formal benchmark evaluation has been conducted yet.
Empirically, the model:
- Produces syntactically valid Python code for simple tasks (a quick spot-check for this is sketched below)
- Adheres to the given instructions with reasonable accuracy
- Struggles with multi-step reasoning and long code outputs
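
One lightweight way to back up the first observation is to check whether generated snippets compile with Python's `ast` module. The sketch below is illustrative only; `generated_snippets` is a placeholder for outputs collected from the model, not an actual evaluation set.

```python
# Sketch: spot-check whether generated snippets are syntactically valid Python.
import ast

generated_snippets = [
    "def add(a, b):\n    return a + b",
    "for i in range(10) print(i)",  # deliberately invalid example
]

valid = 0
for code in generated_snippets:
    try:
        ast.parse(code)  # raises SyntaxError if the snippet is not valid Python
        valid += 1
    except SyntaxError:
        pass

print(f"{valid}/{len(generated_snippets)} snippets parse as valid Python")
```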
					
						

---

## Example Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "sweatSmile/HF-SmolLM-1.7B-0.5B-4bit-coder"
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo, device_map="auto")

prompt = "Write a Python function that checks if a number is prime."

# Build the chat-formatted prompt and move it to the model's device.
inputs = tokenizer.apply_chat_template(
    [{"role": "user", "content": prompt}],
    return_tensors="pt",
    add_generation_prompt=True,
).to(model.device)

# Generate up to 150 new tokens and decode the result.
outputs = model.generate(inputs, max_new_tokens=150)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
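
If GPU memory is tight, the checkpoint can also be loaded with the same nf4 quantization used during training. This is an optional variation (it assumes `bitsandbytes` is installed), not a requirement of the model:

```python
# Optional: load the model in 4-bit (nf4) to reduce GPU memory at inference time.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "sweatSmile/HF-SmolLM-1.7B-0.5B-4bit-coder",
    quantization_config=bnb_config,
    device_map="auto",
)
```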