TinyLlama YouTube Replies (LoRA)

This model is a LoRA fine-tuned version of TinyLlama/TinyLlama-1.1B-Chat-v1.0 that generates concise, friendly, domain-specific replies to YouTube comments on AI/ML content. The project demonstrates how to fine-tune a lightweight language model for a conversational task with Low-Rank Adaptation (LoRA). The model occasionally produces out-of-context replies and would benefit from further tuning, but it serves as a working example of an end-to-end fine-tuning pipeline.

Model Details

Base Model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
Fine-Tuning Method: LoRA (Low-Rank Adaptation)
Task: Generating short, engaging replies to AI/ML YouTube comments
Language: English
License: Apache 2.0
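
To make the "lightweight" part of LoRA concrete, here is a rough sketch of the arithmetic for a single adapted linear layer. It uses the rank (16) and alpha (32) listed under Training Details. The 2048 x 2048 layer shape is an assumption chosen for illustration, since this card does not list which modules were adapted, and lora_stats is a hypothetical helper rather than part of peft.

```python
def lora_stats(d_in: int, d_out: int, r: int, alpha: int) -> dict:
    """Parameter counts for one linear layer adapted with LoRA."""
    full = d_in * d_out          # weights in the frozen base matrix W
    lora = r * (d_in + d_out)    # weights in A (r x d_in) plus B (d_out x r)
    return {
        "full_params": full,
        "lora_params": lora,
        "trainable_fraction": lora / full,
        "scaling": alpha / r,    # standard LoRA applies W + (alpha / r) * B @ A
    }

# Assumed layer shape for illustration; r and alpha come from Training Details.
stats = lora_stats(d_in=2048, d_out=2048, r=16, alpha=32)
print(stats)
```

Under these assumptions the adapter trains 65,536 weights for that layer, about 1.6% of the 4,194,304 weights in the frozen matrix, and the low-rank update is scaled by 2.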

Intended Use

This model is intended for:

Generating polite and engaging replies to AI/ML-related YouTube comments.
Demonstrating a fine-tuning project using LoRA for lightweight adaptation.
Research or educational purposes in conversational AI.

Not Intended For:

Production environments without further optimization.
Non-English text generation.
Applications requiring high contextual accuracy without human review.

Usage

To use this model, you need the transformers, peft, and torch libraries, plus accelerate because the example passes device_map="auto". The example below loads the adapter and generates a reply:

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load the base model, tokenizer, and LoRA adapters
base_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"
adapter_id = "AdamDE/tinyllama-custom-youtube-replies"
tokenizer = AutoTokenizer.from_pretrained(adapter_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.float16, device_map="auto")
model = PeftModel.from_pretrained(base_model, adapter_id)

# Prepare input
messages = [
    {"role": "system", "content": "You are an AI/ML tutorial creator replying to YouTube comments. "
                                  "Provide concise, friendly, and domain-specific help, encourage engagement, "
                                  "and keep a positive tone with occasional emojis when appropriate."},
    {"role": "user", "content": "Your enthusiasm is contagious!"}
]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)

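# Generate reply (do_sample=True is required for temperature and top_p to take effect)
with torch.no_grad():
    out = model.generate(inputs, max_new_tokens=128, do_sample=True, temperature=0.7, top_p=0.9, pad_token_id=tokenizer.eos_token_id)
# Decode only the newly generated tokens, skipping the prompt
reply = tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True)
print(reply)
# Example output (varies with sampling): "Haha, thanks! 😂 What's your favorite part?"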

Requirements

pip install transformers peft torch accelerate

Notes

Use a clear, comment-like prompt for best results.
Adjust max_new_tokens to control reply length, and temperature and top_p to control creativity (the latter two only take effect when sampling is enabled with do_sample=True).
The model may occasionally generate out-of-context replies, indicating room for further optimization.
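
To give a feel for what temperature and top_p do, the following is a simplified, pure-Python sketch of temperature scaling followed by top-p (nucleus) filtering on a toy next-token distribution. The function name temperature_top_p and the toy logits are made up for illustration; this is not the transformers implementation, only the same idea in miniature.

```python
import math

def temperature_top_p(logits: dict, temperature: float, top_p: float) -> dict:
    """Return the renormalized distribution left after temperature scaling and top-p filtering."""
    # Temperature scaling: lower values sharpen the distribution, higher values flatten it.
    scaled = {tok: logit / temperature for tok, logit in logits.items()}
    peak = max(scaled.values())
    exps = {tok: math.exp(v - peak) for tok, v in scaled.items()}  # numerically stable softmax
    total = sum(exps.values())
    probs = {tok: e / total for tok, e in exps.items()}

    # Top-p filtering: keep the most likely tokens until their combined mass reaches top_p.
    kept, mass = {}, 0.0
    for tok, p in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept[tok] = p
        mass += p
        if mass >= top_p:
            break
    return {tok: p / mass for tok, p in kept.items()}

# Toy logits for four candidate next tokens (invented for illustration).
toy_logits = {"Thanks": 2.0, "Great": 1.0, "Hmm": 0.0, "Banana": -2.0}
for t in (0.3, 0.7, 1.5):
    print(t, temperature_top_p(toy_logits, temperature=t, top_p=0.9))
```

With these toy values, raising the temperature from 0.3 to 1.5 widens the set of tokens that survive the top_p=0.9 cutoff from one to three, which is why higher settings tend to produce more varied replies.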

Training Details

Dataset: Custom JSON dataset of AI/ML YouTube comments and replies, split into train, validation, and test sets.
Training Procedure: LoRA fine-tuning with 4-bit quantization (NF4) and mixed precision (bf16/fp16).
Hyperparameters:
- LoRA Rank (r): 16
- LoRA Alpha: 32
- LoRA Dropout: 0.05
- Epochs: 5
- Learning Rate: 2e-4
- Optimizer: Paged AdamW 8-bit
- Scheduler: Cosine with 10% warmup
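
As a rough illustration of the scheduler entry above, this sketch computes the learning rate under linear warmup over the first 10% of steps followed by cosine decay to zero, peaking at 2e-4. The helper lr_at_step and the total of 1000 steps are assumptions for illustration; the card does not state the step count, and the training script may round the warmup length differently.

```python
import math

def lr_at_step(step: int, total_steps: int, peak_lr: float = 2e-4, warmup_ratio: float = 0.10) -> float:
    """Learning rate at an optimizer step: linear warmup, then cosine decay to zero."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

total = 1000  # hypothetical number of optimizer steps
for s in (0, 50, 100, 550, 1000):
    print(s, lr_at_step(s, total))
```

The rate climbs from 0 to the peak at step 100, is back to half the peak midway through the decay, and reaches 0 at the final step.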
Evaluation Metrics:
- BLEU and ROUGE scores computed on the test set (see training script for details).
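
For readers unfamiliar with these metrics, here is a simplified sketch of ROUGE-1 F1, which measures unigram overlap between a generated reply and a reference reply. The function rouge1_f1, the whitespace tokenization, and the example sentences are assumptions for illustration; the training script presumably relies on a library implementation with its own tokenization and stemming rules.

```python
from collections import Counter

def rouge1_f1(candidate: str, reference: str) -> float:
    """Simplified ROUGE-1 F1: clipped unigram overlap on lowercased, whitespace-split tokens."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # per-token minimum of the two counts
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

# Invented reply pair, for illustration only.
reference = "thanks for watching glad it helped"
candidate = "thanks so much glad it helped"
score = rouge1_f1(candidate, reference)
print(round(score, 3))
```

Four of the six tokens match on each side, so precision and recall are both 4/6 and the F1 score is about 0.667.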
Training Features:
- Gradient checkpointing for memory efficiency.
- Early stopping with patience of 2 epochs based on validation loss.
- Custom data collator for padding and label masking.
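
The collator itself is not shown in this card, so the following is only a minimal sketch of what "padding and label masking" typically means for chat fine-tuning: sequences are right-padded to a common length, and prompt and padding positions get the label -100 so that only reply tokens contribute to the loss. The collate function, the "prompt_len" key, and the token IDs are hypothetical, and plain lists stand in for tensors.

```python
IGNORE_INDEX = -100  # label value that PyTorch's cross-entropy loss ignores by default

def collate(batch: list, pad_token_id: int) -> dict:
    """Right-pad a batch and mask prompt and padding positions in the labels.

    Each example is a dict with hypothetical keys:
      "input_ids":  prompt tokens followed by reply tokens
      "prompt_len": number of prompt tokens at the start of input_ids
    """
    max_len = max(len(ex["input_ids"]) for ex in batch)
    input_ids, attention_mask, labels = [], [], []
    for ex in batch:
        ids = ex["input_ids"]
        n_pad = max_len - len(ids)
        input_ids.append(ids + [pad_token_id] * n_pad)
        attention_mask.append([1] * len(ids) + [0] * n_pad)
        # Only the reply tokens keep their IDs as labels.
        reply = ids[ex["prompt_len"]:]
        labels.append([IGNORE_INDEX] * ex["prompt_len"] + reply + [IGNORE_INDEX] * n_pad)
    return {"input_ids": input_ids, "attention_mask": attention_mask, "labels": labels}

# Made-up token IDs: the first example has a 3-token prompt, the second a 2-token prompt.
batch = [
    {"input_ids": [1, 5, 6, 7, 8, 2], "prompt_len": 3},
    {"input_ids": [1, 9, 4, 2], "prompt_len": 2},
]
out = collate(batch, pad_token_id=0)
print(out["labels"])
```

A real collator would convert these lists to tensors before returning them.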

Performance

BLEU and ROUGE scores on the test set indicate reasonable reply quality for a small fine-tuning project, though the exact values are not reported in this card (see the training script). Occasional out-of-context responses suggest room for improvement in dataset quality or hyperparameter tuning.

Limitations

May generate out-of-context or generic replies, requiring human review.
Optimized for AI/ML YouTube comments; performance may vary for other domains.
Limited to English-language inputs and outputs.

Ethical Considerations

Generated replies should be reviewed to ensure they are appropriate and constructive.
Use responsibly to foster positive community interactions.
