---
license: apache-2.0
tags:
- smollm
- python
- code-generation
- instruct
- qlora
- fine-tuned
- code
- nf4
datasets:
- flytech/python-codes-25k
model-index:
- name: HF-SmolLM-1.7B-0.5B-4bit-coder
  results: []
language:
- en
pipeline_tag: text-generation
---

# HF-SmolLM-1.7B-0.5B-4bit-coder

## Model Summary
**HF-SmolLM-1.7B-0.5B-4bit-coder** is a fine-tuned variant of [SmolLM-1.7B](https://huggingface.co/HuggingFaceTB/SmolLM-1.7B), optimized for **instruction-following in Python code generation tasks**.
It was trained on a **1,500-sample subset** of the [flytech/python-codes-25k](https://huggingface.co/datasets/flytech/python-codes-25k) dataset using **parameter-efficient fine-tuning (4-bit QLoRA)**.

The model is suitable for:
- Generating Python code snippets from natural language instructions
- Completing short code functions
- Educational prototyping of fine-tuned LMs

⚠️ This is **not a production-ready coding assistant**. Generated outputs must be manually reviewed before execution.

---

## Intended Uses & Limitations

### ✅ Intended
- Research on parameter-efficient fine-tuning
- Educational demos of instruction-tuning workflows
- Prototype code generation experiments

### ❌ Not Intended
- Deployment in production coding assistants
- Safety-critical applications
- Long-context, multi-file programming tasks

---

## Training Details

### Base Model
- **Name:** [HuggingFaceTB/SmolLM-1.7B](https://huggingface.co/HuggingFaceTB/SmolLM-1.7B)
- **Architecture:** Decoder-only causal LM
- **Total Parameters:** 1.72B
- **Trainable Parameters (fine-tuning):** ~9M (≈0.53% of the total)

### Dataset
- **Source:** [flytech/python-codes-25k](https://huggingface.co/datasets/flytech/python-codes-25k)
- **Subset Used:** 1,500 randomly sampled examples
- **Content:** Instruction + optional input → Python code output
- **Formatting:** Converted into a chat format with `user` / `assistant` roles (see the sketch after this list)

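A minimal sketch of this conversion is shown below. It assumes the dataset exposes `instruction`, `input`, and `output` columns and uses an arbitrary seed for the random subset; neither detail is taken from the actual training script.

```python
# Sketch: map each example into user/assistant chat messages.
# Column names and the shuffle seed are assumptions, not the exact training setup.
from datasets import load_dataset

dataset = load_dataset("flytech/python-codes-25k", split="train")
dataset = dataset.shuffle(seed=42).select(range(1500))  # 1,500-example subset

def to_chat(example):
    user_turn = example["instruction"]
    extra = example["input"] if "input" in example else ""
    if extra and extra.strip():
        user_turn += "\n\n" + extra  # append the optional input block
    return {
        "messages": [
            {"role": "user", "content": user_turn},
            {"role": "assistant", "content": example["output"]},
        ]
    }

dataset = dataset.map(to_chat, remove_columns=dataset.column_names)
```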
					
						

### Training Procedure
- **Framework:** Hugging Face Transformers + TRL (`SFTTrainer`)
- **Quantization:** 4-bit QLoRA (nf4) with bfloat16 compute when available (see the configuration sketch after this list)
- **Effective Batch Size:** 6 (with gradient accumulation)
- **Optimizer:** AdamW
- **Scheduler:** Cosine decay with warmup ratio 0.05
- **Epochs:** 3
- **Learning Rate:** 2e-4
- **Max Sequence Length:** 64 tokens (training)
- **Mixed Precision:** FP16
- **Gradient Checkpointing:** Enabled

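The hyperparameters above correspond to a fairly standard QLoRA + `SFTTrainer` recipe. The sketch below shows one way to wire them together; the LoRA rank, alpha, and target modules, as well as the per-device batch size / accumulation split, are assumptions not stated on this card, and exact TRL argument names vary slightly between versions.

```python
# Sketch of the QLoRA fine-tuning setup described above (values marked "assumed"
# are illustrative, not taken from the original training script).
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

base_id = "HuggingFaceTB/SmolLM-1.7B"

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",              # nf4 quantization
    bnb_4bit_compute_dtype=torch.bfloat16,  # bfloat16 compute when available
)
model = AutoModelForCausalLM.from_pretrained(
    base_id, quantization_config=bnb_config, device_map="auto"
)

peft_config = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,                   # assumed
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed
    task_type="CAUSAL_LM",
)

training_args = SFTConfig(
    output_dir="smollm-coder",
    num_train_epochs=3,
    per_device_train_batch_size=2,   # 2 x 3 accumulation steps = effective batch size 6 (split assumed)
    gradient_accumulation_steps=3,
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.05,
    fp16=True,
    gradient_checkpointing=True,
    max_seq_length=64,
)

trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset,   # chat-formatted subset from the dataset sketch above
    peft_config=peft_config,
)
trainer.train()
```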
					
						

---

## Evaluation
No formal benchmark evaluation has been conducted yet.
Empirically, the model:
- Produces syntactically valid Python code for simple tasks (a quick spot-check for this is sketched below)
- Adheres to the given instructions with reasonable accuracy
- Struggles with multi-step reasoning and long code outputs
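
One lightweight way to back up the first observation is to check whether generated snippets compile with Python's `ast` module. The sketch below is illustrative only; `generated_snippets` is a placeholder for outputs collected from the model, not an actual evaluation set.

```python
# Sketch: spot-check whether generated snippets are syntactically valid Python.
import ast

generated_snippets = [
    "def add(a, b):\n    return a + b",
    "for i in range(10) print(i)",  # deliberately invalid example
]

valid = 0
for code in generated_snippets:
    try:
        ast.parse(code)  # raises SyntaxError if the snippet is not valid Python
        valid += 1
    except SyntaxError:
        pass

print(f"{valid}/{len(generated_snippets)} snippets parse as valid Python")
```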
					
						

---

## Example Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "sweatSmile/HF-SmolLM-1.7B-0.5B-4bit-coder"
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo, device_map="auto")

prompt = "Write a Python function that checks if a number is prime."

# Build the chat-formatted prompt and move it to the model's device.
inputs = tokenizer.apply_chat_template(
    [{"role": "user", "content": prompt}],
    return_tensors="pt",
    add_generation_prompt=True,
).to(model.device)

# Generate up to 150 new tokens and decode the result.
outputs = model.generate(inputs, max_new_tokens=150)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
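
If GPU memory is tight, the checkpoint can also be loaded with the same nf4 quantization used during training. This is an optional variation (it assumes `bitsandbytes` is installed), not a requirement of the model:

```python
# Optional: load the model in 4-bit (nf4) to reduce GPU memory at inference time.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "sweatSmile/HF-SmolLM-1.7B-0.5B-4bit-coder",
    quantization_config=bnb_config,
    device_map="auto",
)
```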