|
--- |
|
license: apache-2.0 |
|
base_model: Qwen/Qwen3-0.6B-Base |
|
tags: |
|
- peft |
|
- lora |
|
- ai-detection |
|
- text-classification |
|
- raid-dataset |
|
- qwen |
|
- unsloth |
|
language: |
|
- en |
|
pipeline_tag: text-classification |
|
library_name: peft |
|
datasets: |
|
- liamdugan/raid |
|
metrics: |
|
- accuracy |
|
- precision |
|
- recall |
|
--- |
|
|
|
# Qwen3-0.6B AI Content Detector (LoRA) |
|
|
|
## Model Description |
|
|
|
This is a LoRA (Low-Rank Adaptation) fine-tuned version of Qwen/Qwen3-0.6B-Base for AI-generated content detection. The model was trained on the RAID dataset to classify text as either human-written (class 0) or AI-generated (class 1).
|
|
|
## Model Details |
|
|
|
- **Base Model**: Qwen/Qwen3-0.6B-Base |
|
- **Fine-tuning Method**: LoRA (Low-Rank Adaptation) |
|
- **Task**: Binary text classification (Human vs AI content detection) |
|
- **Dataset**: RAID Dataset (train_none.csv) |
|
- **Training Framework**: Unsloth + Transformers |
|
- **Model Type**: Parameter-efficient fine-tuning adapter |
|
|
|
## Training Details |
|
|
|
### Dataset |
|
- **Source**: RAID Dataset for AI content detection |
|
- **Training Samples**: 24,000 (balanced: 12,000 human + 12,000 AI) |
|
- **Validation Samples**: 2,000 (balanced: 1,000 human + 1,000 AI) |
|
- **Class Balance**: 50% Human (class 0) / 50% AI (class 1) |
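
A balanced subsample like the one above could be built roughly as follows. This is a sketch, not the original preparation script: the column names (`generation` for the text, `model` with the value `human` marking human-written rows) are assumptions about the layout of `train_none.csv` and should be checked against your copy of the file.

```python
import pandas as pd

# Column names below are assumptions about the RAID CSV layout -- verify before use.
df = pd.read_csv("train_none.csv")

human = df[df["model"] == "human"].sample(n=12_000, random_state=42)
ai = df[df["model"] != "human"].sample(n=12_000, random_state=42)

# Shuffle the combined set and derive binary labels: 0 = Human, 1 = AI
train_df = pd.concat([human, ai]).sample(frac=1, random_state=42)
train_df["label"] = (train_df["model"] != "human").astype(int)
```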
|
|
|
### Training Configuration |
|
- **LoRA Rank**: 16 |
|
- **LoRA Alpha**: 16 |
|
- **Target Modules**: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
|
- **Learning Rate**: 1e-4 |
|
- **Batch Size**: 2 per device |
|
- **Epochs**: 1 |
|
- **Optimizer**: AdamW 8-bit |
|
- **Max Sequence Length**: 2048 |
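
For reference, an Unsloth setup matching the hyperparameters above might look like the sketch below. Only the values listed in this card are taken from the actual run; `lora_dropout`, `bias`, and the random seed are assumptions.

```python
from unsloth import FastLanguageModel

# Load the base model in FP16 (no 4-bit quantization was used)
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Qwen/Qwen3-0.6B-Base",
    max_seq_length=2048,
    dtype=None,           # Unsloth picks FP16 on a Tesla T4
    load_in_4bit=False,
)

# Attach LoRA adapters with the configuration listed above
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    lora_dropout=0.0,                  # assumption
    bias="none",                       # assumption
    use_gradient_checkpointing=True,
    random_state=42,                   # assumption
)
```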
|
|
|
### Hardware |
|
- **GPU**: Tesla T4 (Google Colab) |
|
- **Precision**: FP16 |
|
- **Memory Optimization**: Gradient checkpointing enabled |
|
|
|
## Usage |
|
|
|
### Loading the Model |
|
|
|
```python
from unsloth import FastLanguageModel
import torch

# Load the merged detector model (base model with the LoRA weights already merged)
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="subhashbs36/qwen3-0.6-ai-detector-merged",
    max_seq_length=4096,
    dtype=torch.float16,
    load_in_4bit=False,
)

# Alternatively, load the base model and attach the standalone LoRA adapter:
# model.load_adapter("subhashbs36/qwen3-0.6-ai-detector-lora")

# Enable inference mode
FastLanguageModel.for_inference(model)
```
|
```python
import os

import torch
import torch.nn.functional as F

# Enable CUDA debugging for accurate stack traces (optional)
# os.environ['CUDA_LAUNCH_BLOCKING'] = '1'

# Token IDs for the class labels "0" and "1".
# Assumption: both labels are single tokens in the Qwen3 vocabulary and match the IDs
# used during training. If the model was saved with a reduced classification head,
# replace these with the mapped indices (e.g. [0, 1]).
number_token_ids = [
    tokenizer("0", add_special_tokens=False).input_ids[-1],
    tokenizer("1", add_special_tokens=False).input_ids[-1],
]


def classify_text_fixed(text_sample):
    prompt = f"""Here is a text sample:
{text_sample}

Classify this text into one of the following:
class 0: Human
class 1: AI

SOLUTION
The correct answer is: class """

    inputs = tokenizer(prompt, return_tensors="pt")
    device = next(model.parameters()).device
    inputs = {k: v.to(device) for k, v in inputs.items()}

    with torch.no_grad():
        outputs = model(**inputs)

    # Get the last token index as a scalar, not a tensor
    last_token_idx = (inputs['attention_mask'].sum(1) - 1).item()
    last_logits = outputs.logits[0, last_token_idx, :]

    # Debug information
    print(f"Logits shape: {last_logits.shape}")
    print(f"Number token ids: {number_token_ids}")
    print(f"Vocab size: {last_logits.shape[0]}")

    # Check that the label token IDs are within the vocabulary
    vocab_size = last_logits.shape[0]
    for i, idx in enumerate(number_token_ids):
        if idx >= vocab_size:
            print(f"ERROR: Index {idx} (class {i}) is out of bounds for vocab size {vocab_size}")
            return None, None

    # Compare the probabilities of the two label tokens at the next-token position
    probs_all = F.softmax(last_logits, dim=-1)
    probs = probs_all[number_token_ids]
    predicted_class = torch.argmax(probs).item()
    confidence = probs[predicted_class].item()

    return predicted_class, confidence
```
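
Example call (the sample text below is arbitrary):

```python
sample = "The results suggest that further experiments are needed to confirm the effect."
predicted_class, confidence = classify_text_fixed(sample)
label = "Human" if predicted_class == 0 else "AI"
print(f"Predicted: class {predicted_class} ({label}), confidence {confidence:.3f}")
```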
|
|
|
## Performance |
|
|
|
- **Task**: Binary classification (Human vs AI content detection) |
|
- **Classes**: |
|
- Class 0: Human-written content |
|
- Class 1: AI-generated content |
|
- **Evaluation**: Tested on a balanced validation set drawn from the RAID dataset
|
|
|
## Limitations |
|
|
|
- Trained specifically on the RAID dataset distribution
|
- Performance may vary on out-of-domain text |
|
- Designed for English text classification |
|
- Requires the specific prompt format shown above for optimal performance
|
|
|
## Technical Implementation |
|
|
|
This model uses a custom classification setup (a minimal sketch of the last-token collator follows the list below):
|
- **Reduced vocabulary**: Only uses token IDs for classes 0 and 1 |
|
- **Custom data collator**: Trains only on the last token of sequences |
|
- **Token mapping**: Maps original vocabulary to reduced classification head |
|
- **Parameter-efficient training**: Uses LoRA for efficient fine-tuning |
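
The last-token training idea can be sketched as follows. This is illustrative rather than the exact training code: it assumes the label token ("0" or "1") has already been appended to each tokenized example, and it masks every other position out of the loss.

```python
import torch

def mask_all_but_last_token(batch):
    """Supervise only the final (label) token of each sequence; ignore the rest."""
    labels = batch["input_ids"].clone()
    labels.fill_(-100)                                  # -100 = ignored by the loss
    last_idx = batch["attention_mask"].sum(dim=1) - 1   # index of the last real token
    rows = torch.arange(labels.size(0))
    labels[rows, last_idx] = batch["input_ids"][rows, last_idx]
    batch["labels"] = labels
    return batch
```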
|
|
|
## Citation |
|
|
|
If you use this model in your research, please cite: |
|
|
|
``` |
|
@misc{qwen3-ai-detector-2025, |
|
title={Qwen3-0.6B AI Content Detector}, |
|
author={subhashbs36}, |
|
year={2025}, |
|
howpublished={Hugging Face Model Hub}, |
|
url={https://huggingface.co/subhashbs36/qwen3-0.6-ai-detector-lora} |
|
} |
|
``` |
|
|
|
## License |
|
|
|
This model is released under the Apache 2.0 license, following the base model's licensing terms. |
|
|
|
## Acknowledgments |
|
|
|
- Built using [Unsloth](https://github.com/unslothai/unsloth) for efficient training |
|
- Based on Qwen3-0.6B-Base by Alibaba Cloud |
|
- Trained on RAID dataset for AI content detection research |
|
- Utilizes LoRA (Low-Rank Adaptation) for parameter-efficient fine-tuning |