---
license: cc-by-nc-4.0
tags:
- text-generation
- llama-3.2-1b-instruct
- function-calling
- finetuned-model
- trl
- lora
- Salesforce/xlam-function-calling-60k
datasets:
- Salesforce/xlam-function-calling-60k
base_model: meta-llama/Llama-3.2-1B-Instruct
library_name: transformers
language:
- en
pipeline_tag: text-generation
---

# Llama-3.2-1B-Instruct Fine-tuned on xLAM

## Overview

This is a fine-tuned version of Llama-3.2-1B-Instruct, trained with Hugging Face's TRL library on the Salesforce/xlam-function-calling-60k dataset to improve its function-calling capabilities.

## Model Details

- **Developed by:** ermiaazarkhalili
- **License:** cc-by-nc-4.0
- **Language:** English (en)
- **Finetuned from model:** meta-llama/Llama-3.2-1B-Instruct
- **Model size:** ~1.24B parameters (inherited from the base model)
- **Vocab size:** 128,256 tokens
- **Max sequence length:** 2,048 tokens
- **Tensor type:** BF16
- **Pad token:** `<|eot_id|>` (ID: 128009)

## Training Information

The model was fine-tuned using the following configuration.

### Training Libraries

- **Hugging Face TRL** for supervised fine-tuning
- **LoRA (Low-Rank Adaptation)** for parameter-efficient training
- **4-bit quantization** for memory efficiency

### Training Parameters

- **Learning Rate:** 0.0001
- **Batch Size:** 16
- **Gradient Accumulation Steps:** 8
- **Max Training Steps:** 1,000
- **Warmup Ratio:** 0.1
- **Max Sequence Length:** 2,048
- **Output Directory:** ./Llama_3_2_1B_Instruct_xLAM

### LoRA Configuration

- **LoRA Rank (r):** 16
- **LoRA Alpha:** 32
- **Target Modules:** Query and Value projections (`q_proj`, `v_proj`)
- **LoRA Dropout:** 0.1

A code sketch showing how these settings fit together appears in the "Training Configuration Sketch" section below.

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained(
    "ermiaazarkhalili/Llama-3.2-1B-Instruct_function_calling_xLAM",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(
    "ermiaazarkhalili/Llama-3.2-1B-Instruct_function_calling_xLAM",
)

text = "Check if the numbers 8 and 1233 are powers of two.\n\n"

# Tokenize and generate
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    temperature=0.7,
    do_sample=True,
    pad_token_id=tokenizer.pad_token_id,
    eos_token_id=tokenizer.eos_token_id,
)

# Decode the full output, then strip the prompt to keep only the completion
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
generated_text = response[len(text):].strip()
print(generated_text)
```

## Dataset

The model was trained on **Salesforce/xlam-function-calling-60k**, a dataset of roughly 60,000 function-calling examples, each pairing a user query with the available tools and the expected function call(s).

## Model Performance

This fine-tuned model demonstrates improved capabilities in:

- **Function Detection:** Identifying when to call functions
- **Parameter Extraction:** Extracting correct parameters from user queries
- **Output Formatting:** Generating properly structured function calls
- **Tool Integration:** Working with external APIs and tools

## Credits

This model was developed by [ermiaazarkhalili](https://huggingface.co/ermiaazarkhalili) and builds on:

- the **Llama-3.2-1B-Instruct** base model
- **Hugging Face TRL** for supervised fine-tuning
- **LoRA** for parameter-efficient adaptation

## Contact

For inquiries or support, please reach out to [ermiaazarkhalili](https://huggingface.co/ermiaazarkhalili).
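## Training Configuration Sketch

The snippet below is a minimal, illustrative sketch of how the hyperparameters listed under "Training Parameters" and "LoRA Configuration" combine with TRL, PEFT, and bitsandbytes. It is not the exact training script used for this model: the NF4 quantization type, the text-formatting template for the dataset rows, and the newer-TRL argument names (`processing_class`, `max_seq_length`) are assumptions that may need adjusting for your library versions.

```python
# Illustrative sketch only; not the exact training script used for this model.
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from trl import SFTConfig, SFTTrainer

# 4-bit quantization for memory efficiency (NF4 is an assumption, not confirmed)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-1B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-1B-Instruct")
tokenizer.pad_token = "<|eot_id|>"  # pad token listed under Model Details

# xLAM rows have "query", "tools", and "answers" fields; this simple template
# (an assumption, not the format actually used in training) flattens each row
# into a single training string for SFTTrainer.
def to_text(example):
    return {
        "text": (
            f"Available tools: {example['tools']}\n"
            f"Query: {example['query']}\n"
            f"Answer: {example['answers']}"
        )
    }

# Loading may require accepting the dataset's terms on the Hub first
dataset = load_dataset("Salesforce/xlam-function-calling-60k", split="train")
dataset = dataset.map(to_text)

# LoRA settings from the "LoRA Configuration" section
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.1,
    target_modules=["q_proj", "v_proj"],  # query and value projections only
    task_type="CAUSAL_LM",
)

# Hyperparameters from the "Training Parameters" section
training_args = SFTConfig(
    output_dir="./Llama_3_2_1B_Instruct_xLAM",
    learning_rate=1e-4,
    per_device_train_batch_size=16,
    gradient_accumulation_steps=8,
    max_steps=1000,
    warmup_ratio=0.1,
    max_seq_length=2048,  # renamed to `max_length` in the newest TRL releases
    bf16=True,
)

trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset,
    peft_config=peft_config,
    processing_class=tokenizer,  # use `tokenizer=` on older TRL versions
)
trainer.train()
```

Restricting `target_modules` to the query and value projections keeps the trainable-parameter count small, which is the usual trade-off LoRA makes against full fine-tuning; widening the module list typically improves quality at the cost of memory.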
## Acknowledgments

We would like to thank the creators of:

- **Llama-3.2-1B-Instruct** for the excellent base model
- **Hugging Face** for the TRL library and infrastructure
- the **xLAM** dataset contributors
- the **LoRA** researchers for parameter-efficient fine-tuning methods

## Citation

If you use this model, please cite:

```bibtex
@misc{ermiaazarkhalili2025llama32xlam,
  author       = {ermiaazarkhalili},
  title        = {Fine-tuning Llama-3.2-1B-Instruct on xLAM for Function Calling},
  year         = {2025},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/ermiaazarkhalili/Llama-3.2-1B-Instruct_function_calling_xLAM}}
}
```