---
license: apache-2.0
base_model: Qwen/Qwen3-0.6B-Base
tags:
- peft
- lora
- ai-detection
- text-classification
- raid-dataset
- qwen
- unsloth
language:
- en
pipeline_tag: text-classification
library_name: peft
datasets:
- liamdugan/raid
metrics:
- accuracy
- precision
- recall
---
# Qwen3-0.6B AI Content Detector (LoRA)
## Model Description
This is a LoRA (Low-Rank Adaptation) fine-tuned version of Qwen3-0.6B-Base for AI-generated content detection. The model is trained to classify text as either human-written (class 0) or AI-generated (class 1) using the RAID dataset.
## Model Details
- **Base Model**: Qwen/Qwen3-0.6B-Base
- **Fine-tuning Method**: LoRA (Low-Rank Adaptation)
- **Task**: Binary text classification (Human vs AI content detection)
- **Dataset**: RAID Dataset (train_none.csv)
- **Training Framework**: Unsloth + Transformers
- **Model Type**: Parameter-efficient fine-tuning adapter
## Training Details
### Dataset
- **Source**: RAID Dataset for AI content detection
- **Training Samples**: 24,000 (balanced: 12,000 human + 12,000 AI)
- **Validation Samples**: 2,000 (balanced: 1,000 human + 1,000 AI)
- **Class Balance**: 50% Human (class 0) / 50% AI (class 1)
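A hedged sketch of how such a balanced split can be constructed (the exact preprocessing script is not published with this card, and the column names below are illustrative, not RAID's actual schema):
```python
import pandas as pd

# Assumption: train_none.csv carries a binary label column (0 = human, 1 = AI);
# the real RAID column names may differ.
df = pd.read_csv("train_none.csv")
human = df[df["label"] == 0].sample(12_000, random_state=42)
ai = df[df["label"] == 1].sample(12_000, random_state=42)
train_df = pd.concat([human, ai]).sample(frac=1, random_state=42)  # shuffle
```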
### Training Configuration
- **LoRA Rank**: 16
- **LoRA Alpha**: 16
- **Target Modules**: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
- **Learning Rate**: 1e-4
- **Batch Size**: 2 per device
- **Epochs**: 1
- **Optimizer**: AdamW 8-bit
- **Max Sequence Length**: 2048
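These settings correspond roughly to the following Unsloth setup (a minimal sketch; the actual training script is not published with this card):
```python
from unsloth import FastLanguageModel
from transformers import TrainingArguments
import torch

# Load the base model and attach LoRA adapters with the settings listed above
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Qwen/Qwen3-0.6B-Base",
    max_seq_length=2048,
    dtype=torch.float16,
)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,                                  # LoRA rank
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    use_gradient_checkpointing=True,       # memory optimization (see Hardware)
)

# Trainer arguments matching the listed hyperparameters
training_args = TrainingArguments(
    per_device_train_batch_size=2,
    num_train_epochs=1,
    learning_rate=1e-4,
    optim="adamw_8bit",                    # AdamW 8-bit via bitsandbytes
    fp16=True,
    output_dir="outputs",
)
```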
### Hardware
- **GPU**: Tesla T4 (Google Colab)
- **Precision**: FP16
- **Memory Optimization**: Gradient checkpointing enabled
## Usage
### Loading the Model
```python
from unsloth import FastLanguageModel
import torch

# Load the merged model (base weights with the LoRA adapter already applied)
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="subhashbs36/qwen3-0.6-ai-detector-merged",
    max_seq_length=4096,
    dtype=torch.float16,
    load_in_4bit=False,
)

# Alternatively, load the base model and attach the standalone LoRA adapter:
# model.load_adapter("subhashbs36/qwen3-0.6-ai-detector-lora")

# Switch Unsloth into optimized inference mode
FastLanguageModel.for_inference(model)
```
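### Running Classification
The detector compares the next-token probabilities of the label tokens `0` and `1` at the final prompt position. In the snippet below, `number_token_ids` holds the vocabulary IDs of those two label tokens (this assumes each label encodes to a single token, which holds for the Qwen tokenizer).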
```python
import torch
import torch.nn.functional as F

# Vocabulary IDs of the label tokens "0" and "1"
number_token_ids = [
    tokenizer("0", add_special_tokens=False)["input_ids"][0],
    tokenizer("1", add_special_tokens=False)["input_ids"][0],
]

def classify_text(text_sample):
    # The model was trained on this prompt format; keep it verbatim
    prompt = f"""Here is a text sample:
{text_sample}
Classify this text into one of the following:
class 0: Human
class 1: AI
SOLUTION
The correct answer is: class """
    inputs = tokenizer(prompt, return_tensors="pt")
    device = next(model.parameters()).device
    inputs = {k: v.to(device) for k, v in inputs.items()}

    with torch.no_grad():
        outputs = model(**inputs)

    # Index of the last non-padding token as a Python scalar
    last_token_idx = (inputs["attention_mask"].sum(1) - 1).item()
    last_logits = outputs.logits[0, last_token_idx, :]

    # Sanity check: label token IDs must fall inside the vocabulary
    vocab_size = last_logits.shape[0]
    for i, idx in enumerate(number_token_ids):
        if idx >= vocab_size:
            raise ValueError(
                f"Token id {idx} (class {i}) is out of bounds "
                f"for vocab size {vocab_size}"
            )

    # Softmax over the full vocabulary, then read off the two label tokens
    probs_all = F.softmax(last_logits, dim=-1)
    probs = probs_all[number_token_ids]
    predicted_class = torch.argmax(probs).item()
    confidence = probs[predicted_class].item()
    return predicted_class, confidence
```
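Example:
```python
label, confidence = classify_text("The mitochondria is the powerhouse of the cell.")
print(f"Predicted: {'AI' if label == 1 else 'Human'} ({confidence:.2%} confidence)")
```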
## Performance
- **Task**: Binary classification (Human vs AI content detection)
- **Classes**:
- Class 0: Human-written content
- Class 1: AI-generated content
- **Evaluation**: Evaluated on the balanced 2,000-sample validation split from the RAID dataset
## Limitations
- Trained specifically on RAID dataset distribution
- Performance may vary on out-of-domain text
- Designed for English text classification
- Requires specific prompt format for optimal performance
## Technical Implementation
This model uses a custom approach with:
- **Reduced vocabulary**: Only uses token IDs for classes 0 and 1
- **Custom data collator**: Computes the training loss only on the final token of each sequence (see the sketch after this list)
- **Token mapping**: Maps original vocabulary to reduced classification head
- **Parameter-efficient training**: Uses LoRA for efficient fine-tuning
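As a rough illustration of the last-token objective (a sketch assuming right-padding and standard `-100` label masking, not the card's actual training code):
```python
import torch

def mask_labels_to_last_token(input_ids, attention_mask):
    # Labels of -100 are ignored by PyTorch's cross-entropy loss
    labels = torch.full_like(input_ids, -100)
    batch_idx = torch.arange(input_ids.size(0))
    last_idx = attention_mask.sum(dim=1) - 1   # final non-pad position per row
    labels[batch_idx, last_idx] = input_ids[batch_idx, last_idx]
    return labels
```
In this setup, the final position is the answer token after `The correct answer is: class `, so only the class-label token contributes to the loss.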
## Citation
If you use this model in your research, please cite:
```
@misc{qwen3-ai-detector-2025,
  title={Qwen3-0.6B AI Content Detector},
  author={subhashbs36},
  year={2025},
  howpublished={Hugging Face Model Hub},
  url={https://huggingface.co/subhashbs36/qwen3-0.6-ai-detector-lora}
}
```
## License
This model is released under the Apache 2.0 license, following the base model's licensing terms.
## Acknowledgments
- Built using [Unsloth](https://github.com/unslothai/unsloth) for efficient training
- Based on Qwen3-0.6B-Base by Alibaba Cloud
- Trained on RAID dataset for AI content detection research
- Utilizes LoRA (Low-Rank Adaptation) for parameter-efficient fine-tuning