|
|
|
--- |
|
license: cc-by-nc-4.0 |
|
tags: |
|
- text-generation |
|
- qwen2.5-7b-instruct |
|
- function-calling |
|
- finetuned-model |
|
- trl |
|
- lora |
|
- Salesforce/xlam-function-calling-60k |
|
datasets: |
|
- Salesforce/xlam-function-calling-60k |
|
base_model: Qwen/Qwen2.5-7B-Instruct |
|
library_name: transformers |
|
language:
|
- en |
|
pipeline_tag: text-generation |
|
--- |
|
|
|
# Qwen2.5-7B-Instruct Fine-tuned on xLAM |
|
|
|
## Overview |
|
|
|
This is a fine-tuned version of the Qwen2.5-7B-Instruct model, trained with Hugging Face's TRL library and LoRA on the Salesforce/xlam-function-calling-60k dataset to improve its function-calling capabilities.
|
|
|
## Model Details |
|
|
|
- **Developed by:** ermiaazarkhalili |
|
- **License:** cc-by-nc-4.0 |
|
- **Language:** English
|
- **Finetuned from model:** Qwen/Qwen2.5-7B-Instruct |
|
- **Model size:** ~7.6B parameters
|
- **Vocab size:** 152,064 tokens |
|
- **Max sequence length:** 2,048 tokens |
|
- **Tensor type:** BF16 |
|
- **Pad token:** `<|im_end|>` (ID: 151645) |
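
The tokenizer and vocabulary figures above can be read back from the published checkpoint; a minimal check (the repository ID is taken from the Usage section below):

```python
from transformers import AutoConfig, AutoTokenizer

repo = "ermiaazarkhalili/Qwen2.5-7B-Instruct_Function_Calling_xLAM"

# Inspect the config and tokenizer without loading the full weights
config = AutoConfig.from_pretrained(repo, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)

print(config.vocab_size)                              # expected: 152064
print(tokenizer.pad_token, tokenizer.pad_token_id)    # expected: <|im_end|> 151645
```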
|
|
|
|
|
## Training Information |
|
|
|
The model was fine-tuned using the following configuration; a minimal reproduction sketch follows the LoRA settings below:
|
|
|
### Training Libraries |
|
- **Hugging Face TRL Library** for advanced training techniques |
|
- **LoRA (Low-Rank Adaptation)** for parameter-efficient training |
|
- **4-bit quantization** for memory efficiency |
|
|
|
### Training Parameters |
|
- **Learning Rate:** 0.0001 |
|
- **Batch Size:** 16 |
|
- **Gradient Accumulation Steps:** 8 |
|
- **Max Training Steps:** 1,000 |
|
- **Warmup Ratio:** 0.1 |
|
- **Max Sequence Length:** 2,048 |
|
- **Output Directory:** ./Qwen2_5_7B_Instruct_xLAM |
|
|
|
### LoRA Configuration |
|
- **LoRA Rank (r):** 16 |
|
- **LoRA Alpha:** 32 |
|
- **Target Modules:** Query and Value projections |
|
- **LoRA Dropout:** 0.1 |
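
A minimal sketch of how these settings map onto PEFT and TRL. This is not the exact training script: argument names vary slightly across TRL versions, the NF4 quantization details are assumptions (the card only states "4-bit quantization"), and the step that formats each xLAM record into a training prompt is omitted for brevity.

```python
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from trl import SFTConfig, SFTTrainer

base = "Qwen/Qwen2.5-7B-Instruct"

# 4-bit quantization for memory efficiency (NF4 settings assumed)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    base, quantization_config=bnb_config, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(base)

# LoRA on the query and value projections, matching the settings above
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.1,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)

# Training hyperparameters listed above
args = SFTConfig(
    output_dir="./Qwen2_5_7B_Instruct_xLAM",
    learning_rate=1e-4,
    per_device_train_batch_size=16,
    gradient_accumulation_steps=8,
    max_steps=1000,
    warmup_ratio=0.1,
    max_seq_length=2048,   # `max_length` in recent TRL releases
)

dataset = load_dataset("Salesforce/xlam-function-calling-60k", split="train")

trainer = SFTTrainer(
    model=model,
    args=args,
    train_dataset=dataset,       # in practice, format each record into prompt text first
    peft_config=peft_config,
    processing_class=tokenizer,  # `tokenizer=` in older TRL releases
)
trainer.train()
```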
|
|
|
## Usage |
|
|
|
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained(
    "ermiaazarkhalili/Qwen2.5-7B-Instruct_Function_Calling_xLAM",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
)

tokenizer = AutoTokenizer.from_pretrained(
    "ermiaazarkhalili/Qwen2.5-7B-Instruct_Function_Calling_xLAM",
    trust_remote_code=True
)

# Example function-calling prompt
text = "<user>Check if the numbers 8 and 1233 are powers of two.</user>\n\n<tools>"

# Tokenize and generate
inputs = tokenizer(text, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    temperature=0.7,
    do_sample=True,
    pad_token_id=tokenizer.pad_token_id,
    eos_token_id=tokenizer.eos_token_id
)

# Decode response and strip the prompt to keep only the completion
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
generated_text = response[len(text):].strip()
print(generated_text)
```
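
The xLAM training data stores its answers as a JSON list of `{"name": ..., "arguments": ...}` objects. Assuming the fine-tuned model follows that format (an assumption about the output, not a guarantee), the completion from the snippet above can be parsed into structured calls:

```python
import json

def parse_tool_calls(completion: str):
    """Best-effort parse of an xLAM-style JSON list of tool calls."""
    # Keep only the JSON list portion, in case the model emits extra text
    start, end = completion.find("["), completion.rfind("]")
    if start == -1 or end == -1:
        return []
    try:
        calls = json.loads(completion[start:end + 1])
    except json.JSONDecodeError:
        return []
    return [c for c in calls if isinstance(c, dict) and "name" in c]

for call in parse_tool_calls(generated_text):
    print(call["name"], call.get("arguments", {}))
```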
|
|
|
## Dataset |
|
|
|
The model was trained on the [Salesforce/xlam-function-calling-60k](https://huggingface.co/datasets/Salesforce/xlam-function-calling-60k) dataset, which contains 60,000 examples pairing user queries with available tool definitions and the corresponding function calls.
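
A quick way to inspect a record is shown below. The `query`/`tools`/`answers` field names reflect the dataset card and should be treated as assumptions if the dataset is updated; you may also need to accept the dataset's terms on the Hugging Face Hub first.

```python
from datasets import load_dataset

ds = load_dataset("Salesforce/xlam-function-calling-60k", split="train")

example = ds[0]
print(example["query"])    # natural-language request
print(example["tools"])    # JSON describing the available functions
print(example["answers"])  # JSON with the expected function call(s)
```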
|
|
|
## Model Performance |
|
|
|
This fine-tuned model demonstrates improved capabilities in: |
|
- **Function Detection:** Identifying when to call functions |
|
- **Parameter Extraction:** Extracting correct parameters from user queries |
|
- **Output Formatting:** Generating properly structured function calls |
|
- **Tool Integration:** Working with external APIs and tools |
|
|
|
|
|
## Credits |
|
|
|
This model was developed by [ermiaazarkhalili](https://huggingface.co/ermiaazarkhalili) and leverages the capabilities of: |
|
- **Qwen2.5-7B-Instruct** base model |
|
- **Hugging Face TRL** for advanced fine-tuning techniques |
|
- **LoRA** for parameter-efficient adaptation |
|
|
|
## Contact |
|
|
|
For any inquiries or support, please reach out to the developer at [ermiaazarkhalili](https://huggingface.co/ermiaazarkhalili). |
|
|
|
## Acknowledgments |
|
|
|
We would like to thank the creators of: |
|
- **Qwen2.5-7B-Instruct** for the excellent base model |
|
- **Hugging Face** for the TRL library and infrastructure |
|
- **xLAM** dataset contributors |
|
- **LoRA** researchers for parameter-efficient fine-tuning methods |
|
|
|
## Citation |
|
|
|
If you use this model, please cite: |
|
|
|
```bibtex
@misc{ermiaazarkhalili_Qwen2.5-7B-Instruct_Function_Calling_xLAM,
    author       = {ermiaazarkhalili},
    title        = {Fine-tuning Qwen2.5-7B-Instruct on xLAM for Function Calling},
    year         = {2025},
    publisher    = {Hugging Face},
    howpublished = {\url{https://huggingface.co/ermiaazarkhalili/Qwen2.5-7B-Instruct_Function_Calling_xLAM}}
}
```
|
|