|
|
|
--- |
|
license: cc-by-nc-4.0 |
|
tags: |
|
- text-generation |
|
- qwen2.5-7b-instruct |
|
- function-calling |
|
- finetuned-model |
|
- trl |
|
- lora |
|
- Salesforce/xlam-function-calling-60k |
|
datasets: |
|
- Salesforce/xlam-function-calling-60k |
|
base_model: Qwen/Qwen2.5-7B-Instruct |
|
library_name: transformers |
|
language:
|
- en |
|
pipeline_tag: text-generation |
|
--- |
|
|
|
# Qwen2.5-7B-Instruct Fine-tuned on xLAM |
|
|
|
## Overview |
|
|
|
This is a fine-tuned version of the Qwen2.5-7B-Instruct model, trained with Hugging Face's TRL library and LoRA on the Salesforce/xlam-function-calling-60k dataset to improve its function-calling capabilities.
|
|
|
## Model Details |
|
|
|
- **Developed by:** ermiaazarkhalili |
|
- **License:** cc-by-nc-4.0 |
|
- **Language:** English
|
- **Finetuned from model:** Qwen/Qwen2.5-7B-Instruct |
|
- **Model size:** ~7.6B parameters
|
- **Vocab size:** 152,064 tokens |
|
- **Max sequence length:** 2,048 tokens |
|
- **Tensor type:** BF16 |
|
- **Pad token:** `<|im_end|>` (ID: 151645) |
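
The tokenizer and vocabulary figures above can be read back from the published checkpoint; a minimal check (the repository ID is taken from the Usage section below):

```python
from transformers import AutoConfig, AutoTokenizer

repo = "ermiaazarkhalili/Qwen2.5-7B-Instruct_Function_Calling_xLAM"

# Inspect the config and tokenizer without loading the full weights
config = AutoConfig.from_pretrained(repo, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)

print(config.vocab_size)                              # expected: 152064
print(tokenizer.pad_token, tokenizer.pad_token_id)    # expected: <|im_end|> 151645
```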
|
|
|
|
|
## Training Information |
|
|
|
The model was fine-tuned using the following configuration; a minimal reproduction sketch follows the LoRA settings below:
|
|
|
### Training Libraries |
|
- **Hugging Face TRL Library** for advanced training techniques |
|
- **LoRA (Low-Rank Adaptation)** for parameter-efficient training |
|
- **4-bit quantization** for memory efficiency |
|
|
|
### Training Parameters |
|
- **Learning Rate:** 0.0001 |
|
- **Batch Size:** 16 |
|
- **Gradient Accumulation Steps:** 8 |
|
- **Max Training Steps:** 1,000 |
|
- **Warmup Ratio:** 0.1 |
|
- **Max Sequence Length:** 2,048 |
|
- **Output Directory:** ./Qwen2_5_7B_Instruct_xLAM |
|
|
|
### LoRA Configuration |
|
- **LoRA Rank (r):** 16 |
|
- **LoRA Alpha:** 32 |
|
- **Target Modules:** Query and Value projections |
|
- **LoRA Dropout:** 0.1 |
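
A minimal sketch of how these settings map onto PEFT and TRL. This is not the exact training script: argument names vary slightly across TRL versions, the NF4 quantization details are assumptions (the card only states "4-bit quantization"), and the step that formats each xLAM record into a training prompt is omitted for brevity.

```python
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from trl import SFTConfig, SFTTrainer

base = "Qwen/Qwen2.5-7B-Instruct"

# 4-bit quantization for memory efficiency (NF4 settings assumed)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    base, quantization_config=bnb_config, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(base)

# LoRA on the query and value projections, matching the settings above
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.1,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)

# Training hyperparameters listed above
args = SFTConfig(
    output_dir="./Qwen2_5_7B_Instruct_xLAM",
    learning_rate=1e-4,
    per_device_train_batch_size=16,
    gradient_accumulation_steps=8,
    max_steps=1000,
    warmup_ratio=0.1,
    max_seq_length=2048,   # `max_length` in recent TRL releases
)

dataset = load_dataset("Salesforce/xlam-function-calling-60k", split="train")

trainer = SFTTrainer(
    model=model,
    args=args,
    train_dataset=dataset,       # in practice, format each record into prompt text first
    peft_config=peft_config,
    processing_class=tokenizer,  # `tokenizer=` in older TRL releases
)
trainer.train()
```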
|
|
|
## Usage |
|
|
|
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained(
    "ermiaazarkhalili/Qwen2.5-7B-Instruct_Function_Calling_xLAM",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
)

tokenizer = AutoTokenizer.from_pretrained(
    "ermiaazarkhalili/Qwen2.5-7B-Instruct_Function_Calling_xLAM",
    trust_remote_code=True
)

# Example function-calling prompt
text = "<user>Check if the numbers 8 and 1233 are powers of two.</user>\n\n<tools>"

# Tokenize and generate
inputs = tokenizer(text, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    temperature=0.7,
    do_sample=True,
    pad_token_id=tokenizer.pad_token_id,
    eos_token_id=tokenizer.eos_token_id
)

# Decode response and strip the prompt to keep only the completion
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
generated_text = response[len(text):].strip()
print(generated_text)
```
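
The xLAM training data stores its answers as a JSON list of `{"name": ..., "arguments": ...}` objects. Assuming the fine-tuned model follows that format (an assumption about the output, not a guarantee), the completion from the snippet above can be parsed into structured calls:

```python
import json

def parse_tool_calls(completion: str):
    """Best-effort parse of an xLAM-style JSON list of tool calls."""
    # Keep only the JSON list portion, in case the model emits extra text
    start, end = completion.find("["), completion.rfind("]")
    if start == -1 or end == -1:
        return []
    try:
        calls = json.loads(completion[start:end + 1])
    except json.JSONDecodeError:
        return []
    return [c for c in calls if isinstance(c, dict) and "name" in c]

for call in parse_tool_calls(generated_text):
    print(call["name"], call.get("arguments", {}))
```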
|
|
|
## Dataset |
|
|
|
The model was trained on the [Salesforce/xlam-function-calling-60k](https://huggingface.co/datasets/Salesforce/xlam-function-calling-60k) dataset, which contains 60,000 examples pairing user queries with available tool definitions and the corresponding function calls.
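
A quick way to inspect a record is shown below. The `query`/`tools`/`answers` field names reflect the dataset card and should be treated as assumptions if the dataset is updated; you may also need to accept the dataset's terms on the Hugging Face Hub first.

```python
from datasets import load_dataset

ds = load_dataset("Salesforce/xlam-function-calling-60k", split="train")

example = ds[0]
print(example["query"])    # natural-language request
print(example["tools"])    # JSON describing the available functions
print(example["answers"])  # JSON with the expected function call(s)
```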
|
|
|
## Model Performance |
|
|
|
This fine-tuned model demonstrates improved capabilities in: |
|
- **Function Detection:** Identifying when to call functions |
|
- **Parameter Extraction:** Extracting correct parameters from user queries |
|
- **Output Formatting:** Generating properly structured function calls |
|
- **Tool Integration:** Working with external APIs and tools |
|
|
|
|
|
## Credits |
|
|
|
This model was developed by [ermiaazarkhalili](https://huggingface.co/ermiaazarkhalili) and leverages the capabilities of: |
|
- **Qwen2.5-7B-Instruct** base model |
|
- **Hugging Face TRL** for advanced fine-tuning techniques |
|
- **LoRA** for parameter-efficient adaptation |
|
|
|
## Contact |
|
|
|
For any inquiries or support, please reach out to the developer at [ermiaazarkhalili](https://huggingface.co/ermiaazarkhalili). |
|
|
|
## Acknowledgments |
|
|
|
We would like to thank the creators of: |
|
- **Qwen2.5-7B-Instruct** for the excellent base model |
|
- **Hugging Face** for the TRL library and infrastructure |
|
- **xLAM** dataset contributors |
|
- **LoRA** researchers for parameter-efficient fine-tuning methods |
|
|
|
## Citation |
|
|
|
If you use this model, please cite: |
|
|
|
```bibtex
@misc{ermiaazarkhalili_Qwen2.5-7B-Instruct_Function_Calling_xLAM,
    author       = {ermiaazarkhalili},
    title        = {Fine-tuning Qwen2.5-7B-Instruct on xLAM for Function Calling},
    year         = {2025},
    publisher    = {Hugging Face},
    howpublished = {\url{https://huggingface.co/ermiaazarkhalili/Qwen2.5-7B-Instruct_Function_Calling_xLAM}}
}
```
|
|