---
license: apache-2.0
base_model: Qwen/Qwen3-0.6B-Base
tags:
- peft
- lora
- ai-detection
- text-classification
- raid-dataset
- qwen
- unsloth
language:
- en
pipeline_tag: text-classification
library_name: peft
datasets:
- liamdugan/raid
metrics:
- accuracy
- precision
- recall
---

# Qwen3-0.6B AI Content Detector (LoRA)

## Model Description

This is a LoRA (Low-Rank Adaptation) adapter for Qwen3-0.6B-Base, fine-tuned for AI-generated content detection. The model was trained on the RAID dataset to classify text as either human-written (class 0) or AI-generated (class 1).

## Model Details

- **Base Model**: Qwen/Qwen3-0.6B-Base
- **Fine-tuning Method**: LoRA (Low-Rank Adaptation)
- **Task**: Binary text classification (Human vs AI content detection)
- **Dataset**: RAID Dataset (train_none.csv)
- **Training Framework**: Unsloth + Transformers
- **Model Type**: Parameter-efficient fine-tuning adapter

## Training Details

### Dataset
- **Source**: RAID Dataset for AI content detection
- **Training Samples**: 24,000 (balanced: 12,000 human + 12,000 AI)
- **Validation Samples**: 2,000 (balanced: 1,000 human + 1,000 AI)
- **Class Balance**: 50% Human (class 0) / 50% AI (class 1)
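
For illustration, a balanced split like the one above could be built with pandas roughly as follows; the file path and the `model`/`generation` column names are assumptions based on the public RAID schema, not the exact preprocessing script used for this model.

```python
import pandas as pd

# Hypothetical preprocessing; column names and path are assumptions.
df = pd.read_csv("train_none.csv")

# Label rows: 0 = human-written, 1 = AI-generated.
df["label"] = (df["model"] != "human").astype(int)
df = df.rename(columns={"generation": "text"})[["text", "label"]]

# Draw 13,000 examples per class: 12,000 for training, 1,000 for validation.
human = df[df["label"] == 0].sample(13_000, random_state=42)
ai = df[df["label"] == 1].sample(13_000, random_state=42)

train_df = pd.concat([human[:12_000], ai[:12_000]]).sample(frac=1, random_state=42)
val_df = pd.concat([human[12_000:], ai[12_000:]]).sample(frac=1, random_state=42)
```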

### Training Configuration
- **LoRA Rank**: 16
- **LoRA Alpha**: 16
- **Target Modules**: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
- **Learning Rate**: 1e-4
- **Batch Size**: 2 per device
- **Epochs**: 1
- **Optimizer**: AdamW 8-bit
- **Max Sequence Length**: 2048
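
A minimal Unsloth sketch of how the LoRA settings above can be attached to the base model; `lora_dropout` and `bias` are not listed in the card and are shown with common defaults, so treat this as an illustration rather than the exact training script.

```python
from unsloth import FastLanguageModel

# Attach LoRA adapters matching the configuration listed above.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,                       # LoRA rank
    lora_alpha=16,              # LoRA alpha
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    lora_dropout=0.0,           # assumed default
    bias="none",                # assumed default
    use_gradient_checkpointing="unsloth",
)
```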

### Hardware
- **GPU**: Tesla T4 (Google Colab)
- **Precision**: FP16
- **Memory Optimization**: Gradient checkpointing enabled
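
Under these hardware settings, the trainer wiring could look roughly like the sketch below; argument names vary across `trl`/`transformers` versions, and `train_dataset`/`eval_dataset` stand in for the tokenized RAID splits, so this is a sketch under those assumptions.

```python
from transformers import TrainingArguments
from trl import SFTTrainer

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=train_dataset,     # tokenized balanced RAID training split
    eval_dataset=val_dataset,        # tokenized balanced validation split
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        learning_rate=1e-4,
        num_train_epochs=1,
        optim="adamw_8bit",
        fp16=True,                   # Tesla T4 does not support bf16
        gradient_checkpointing=True,
        output_dir="outputs",
    ),
)
trainer.train()
```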

## Usage

### Loading the Model

```python
from unsloth import FastLanguageModel
import torch

# Load the merged model (base model with the LoRA weights already merged in)
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="subhashbs36/qwen3-0.6-ai-detector-merged",
    max_seq_length=4096,
    dtype=torch.float16,
    load_in_4bit=False,
)

# Alternatively, load the base model and attach the standalone LoRA adapter:
# model.load_adapter("subhashbs36/qwen3-0.6-ai-detector-lora")

# Switch the model to inference mode
FastLanguageModel.for_inference(model)

```

### Classifying Text

```python
import os
import torch
import torch.nn.functional as F

# Optional: force synchronous CUDA launches for accurate stack traces
# os.environ['CUDA_LAUNCH_BLOCKING'] = '1'

# Token IDs corresponding to the class labels "0" and "1". The prompt below ends
# with "class ", so the next-token probabilities over these two IDs act as the
# class probabilities. This assumes each label tokenizes to a single token
# (true for the Qwen tokenizer) and matches the IDs used during training.
number_token_ids = [
    tokenizer(str(i), add_special_tokens=False)["input_ids"][0] for i in range(2)
]

def classify_text_fixed(text_sample):
    prompt = f"""Here is a text sample:
{text_sample}

Classify this text into one of the following:
class 0: Human
class 1: AI

SOLUTION
The correct answer is: class """
    
    inputs = tokenizer(prompt, return_tensors="pt")
    device = next(model.parameters()).device
    inputs = {k: v.to(device) for k, v in inputs.items()}
    
    with torch.no_grad():
        outputs = model(**inputs)
        
        # Fix: Get the last token index as a scalar, not tensor
        last_token_idx = (inputs['attention_mask'].sum(1) - 1).item()
        last_logits = outputs.logits[0, last_token_idx, :]
        
        # Debug information
        print(f"Logits shape: {last_logits.shape}")
        print(f"Number token ids: {number_token_ids}")
        print(f"Vocab size: {last_logits.shape[0]}")
        
        # Check if any index is out of bounds
        vocab_size = last_logits.shape[0]
        for i, idx in enumerate(number_token_ids):
            if idx >= vocab_size:
                print(f"ERROR: Index {idx} (class {i}) is out of bounds for vocab size {vocab_size}")
                return None, None
        
        probs_all = F.softmax(last_logits, dim=-1)
        probs = probs_all[number_token_ids]
        predicted_class = torch.argmax(probs).item()
        confidence = probs[predicted_class].item()
    
    return predicted_class, confidence

```
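
A short usage example (the sample text and printed output are illustrative):

```python
sample = "Photosynthesis converts light energy into chemical energy stored in glucose."
pred, conf = classify_text_fixed(sample)
label = "Human" if pred == 0 else "AI"
print(f"Predicted: class {pred} ({label}), confidence = {conf:.3f}")
```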

## Performance

- **Task**: Binary classification (Human vs AI content detection)
- **Classes**: 
  - Class 0: Human-written content
  - Class 1: AI-generated content
- **Evaluation**: Tested on a balanced validation set drawn from the RAID dataset

## Limitations

- Trained specifically on RAID dataset distribution
- Performance may vary on out-of-domain text
- Designed for English text classification
- Requires specific prompt format for optimal performance

## Technical Implementation

This model uses a custom approach with:
- **Reduced vocabulary**: Only uses token IDs for classes 0 and 1
- **Custom data collator**: Trains only on the last token of sequences
- **Token mapping**: Maps original vocabulary to reduced classification head
- **Parameter-efficient training**: Uses LoRA for efficient fine-tuning
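
As a hedged sketch of the "last token only" idea (not necessarily the exact collator used for this model), a collator can mask every label except the final non-padding position, so the loss is computed only where the class token appears:

```python
import torch

class LastTokenCollator:
    """Keep labels only at the last non-padding position of each sequence."""

    def __init__(self, tokenizer):
        self.tokenizer = tokenizer

    def __call__(self, features):
        # Pad the batch and start with all labels ignored (-100).
        batch = self.tokenizer.pad(features, return_tensors="pt")
        labels = torch.full_like(batch["input_ids"], -100)

        # Unmask only the final real token of each sequence (the class label).
        last_idx = batch["attention_mask"].sum(dim=1) - 1
        rows = torch.arange(labels.size(0))
        labels[rows, last_idx] = batch["input_ids"][rows, last_idx]

        batch["labels"] = labels
        return batch
```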

## Citation

If you use this model in your research, please cite:

```
@misc{qwen3-ai-detector-2025,
  title={Qwen3-0.6B AI Content Detector},
  author={subhashbs36},
  year={2025},
  howpublished={Hugging Face Model Hub},
  url={https://huggingface.co/subhashbs36/qwen3-0.6-ai-detector-lora}
}
```

## License

This model is released under the Apache 2.0 license, following the base model's licensing terms.

## Acknowledgments

- Built using [Unsloth](https://github.com/unslothai/unsloth) for efficient training
- Based on Qwen3-0.6B-Base by Alibaba Cloud
- Trained on the RAID dataset for AI content detection research
- Utilizes LoRA (Low-Rank Adaptation) for parameter-efficient fine-tuning