---
language: en
license: apache-2.0
library_name: transformers
tags:
- agentic-rl
- agent
- qwen
- lora
- file-operations
- api-calls
base_model: Qwen/Qwen2.5-0.5B-Instruct
---
# Realistic Agentic Qwen Model
This model is Qwen2.5-0.5B-Instruct fine-tuned on realistic agent tasks with agentic reinforcement learning (RL). It is trained to emit concrete actions such as file operations, API calls, and system commands.
## Model Description
- **Base Model**: Qwen/Qwen2.5-0.5B-Instruct
- **Fine-tuning Method**: LoRA (Low-Rank Adaptation)
- **Training Data**: 5 successful agent trajectories
- **Actions Learned**: file operations, API calls, bash commands, task completion
- **Reward System**: GRPO-style with trajectory-end rewards
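To make "GRPO-style with trajectory-end rewards" concrete, here is a minimal sketch of how group-relative advantages are commonly computed. It is not taken from this model's training code: the function names, the use of the population standard deviation, and the `eps` value are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize one reward per trajectory against the rest of its group.

    Assumption: each trajectory gets a single scalar reward at its end,
    and the group is every trajectory sampled for the same task.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; a sample std is an equally plausible choice
    return [(r - mu) / (sigma + eps) for r in rewards]


def broadcast_to_steps(advantage, num_steps):
    """Give every step of a trajectory the same trajectory-level advantage."""
    return [advantage] * num_steps


# Four hypothetical trajectories for one task: two succeeded, two failed.
rewards = [1.0, 0.0, 1.0, 0.0]
advantages = group_relative_advantages(rewards)
per_step = [broadcast_to_steps(a, num_steps=3) for a in advantages]
```

Under these assumptions, successful trajectories receive a positive advantage and failed ones a negative advantage, and a group in which every trajectory scores the same yields zero advantage for all of them.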
## Training Results
- **Loss Improvement**: 4.4078
- **Final Loss**: 5.9926
- **Training Samples**: 5
- **Training Date**: 2025-07-24
## Usage
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "allthingssecurity/realistic-agentic-qwen"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Example usage: describe the task in plain text and let the model propose actions
problem = "Create a configuration file with settings"
inputs = tokenizer(f"Task: {problem}", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100)

# The decoded text contains the prompt followed by the generated continuation
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
## Actions the Model Can Perform
- `create_file`: Create files with specific content
- `write_to_file`: Write data to existing files
- `api_call`: Make HTTP API requests
- `search_files`: Search for patterns in files
- `bash_command`: Execute safe system commands
- `complete_task`: Mark tasks as completed with validation
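This card does not specify how the model serializes an action, so the following is only a sketch of how a caller might validate model output against the action names listed above. The JSON shape (`{"action": ..., "args": {...}}`) and the `parse_action` helper are hypothetical.

```python
import json

# Action names copied from the list above; everything else is rejected.
ALLOWED_ACTIONS = {
    "create_file",
    "write_to_file",
    "api_call",
    "search_files",
    "bash_command",
    "complete_task",
}


def parse_action(text):
    """Parse one action from model output and enforce the allow-list.

    Assumption: the action is a JSON object with an "action" name and an
    optional "args" object. The real output format may differ.
    """
    try:
        payload = json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(payload, dict):
        raise ValueError("action must be a JSON object")
    name = payload.get("action")
    if not isinstance(name, str) or name not in ALLOWED_ACTIONS:
        raise ValueError(f"unknown action: {name!r}")
    args = payload.get("args", {})
    if not isinstance(args, dict):
        raise ValueError("args must be a JSON object")
    return name, args


name, args = parse_action(
    '{"action": "create_file", "args": {"path": "config.json", "content": "{}"}}'
)
```

Rejecting anything outside the allow-list before execution keeps the caller, rather than the model, in charge of which operations can run.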
## Training Examples
The model was trained on tasks like:
- Creating configuration files
- Making API calls and saving responses
- Searching files for specific patterns
- Generating reports and summaries
## Limitations
- Only supports safe, predefined actions
- Trained in a simulated environment rather than against real systems
- Best suited for file/API/system interaction tasks
## Citation
If you use this model, please cite:
```
@misc{realistic-agentic-qwen,
title={Realistic Agentic Qwen Model},
author={Smart RL Trainer},
year={2025},
url={https://huggingface.co/allthingssecurity/realistic-agentic-qwen}
}
```