TinyLlama YouTube Replies (LoRA)
This model is a LoRA fine-tuned version of TinyLlama/TinyLlama-1.1B-Chat-v1.0 that generates concise, friendly, domain-specific replies to YouTube comments on AI/ML content. The project demonstrates fine-tuning a lightweight language model for a conversational task with Low-Rank Adaptation (LoRA). The model occasionally produces out-of-context replies and would benefit from further optimization, but it serves as a working example of an end-to-end fine-tuning pipeline.
Model Details
- Base Model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
- Fine-Tuning Method: LoRA (Low-Rank Adaptation)
- Task: Generating short, engaging replies to AI/ML YouTube comments
- Language: English
- License: Apache 2.0
Intended Use
This model is intended for:
- Generating polite and engaging replies to AI/ML-related YouTube comments.
- Demonstrating a fine-tuning project using LoRA for lightweight adaptation.
- Research or educational purposes in conversational AI.
Not Intended For:
- Production environments without further optimization.
- Non-English text generation.
- Applications requiring high contextual accuracy without human review.
Usage
To use this model, you need the transformers and peft libraries. Below is an example of how to load the model and generate a reply:
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load the base model, tokenizer, and LoRA adapters
base_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"
adapter_id = "AdamDE/tinyllama-custom-youtube-replies"
tokenizer = AutoTokenizer.from_pretrained(adapter_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.float16, device_map="auto")
model = PeftModel.from_pretrained(base_model, adapter_id)

# Prepare input
messages = [
    {"role": "system", "content": "You are an AI/ML tutorial creator replying to YouTube comments. "
                                  "Provide concise, friendly, and domain-specific help, encourage engagement, "
                                  "and keep a positive tone with occasional emojis when appropriate."},
    {"role": "user", "content": "Your enthusiasm is contagious!"},
]
# return_dict=True yields input_ids and attention_mask together
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_dict=True, return_tensors="pt").to(model.device)

# Generate reply (do_sample=True is needed for temperature and top_p to take effect)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7, top_p=0.9, pad_token_id=tokenizer.eos_token_id)

# Decode only the newly generated tokens, not the prompt
prompt_len = inputs["input_ids"].shape[-1]
reply = tokenizer.decode(out[0][prompt_len:], skip_special_tokens=True)
print(reply)
# Example output: "Haha, thanks! 😂 What's your favorite part?"
Requirements
pip install transformers peft torch accelerate
Notes
- Use a clear, comment-like prompt for best results.
- Adjust max_new_tokens, temperature, and top_p to control reply length and creativity.
- The model may occasionally generate out-of-context replies, indicating room for further optimization.
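As a rough illustration of the tuning note above, the sketch below groups the three settings into named presets. The preset names, the values, and the helper reply_generation_kwargs are hypothetical and not part of this model or of transformers; they only show the direction in which each parameter moves the output.

```python
# Hypothetical presets: the names and values are illustrative, not tuned for this model.
PRESETS = {
    "focused": {"max_new_tokens": 64, "temperature": 0.3, "top_p": 0.8},
    "balanced": {"max_new_tokens": 128, "temperature": 0.7, "top_p": 0.9},
    "creative": {"max_new_tokens": 192, "temperature": 1.0, "top_p": 0.95},
}


def reply_generation_kwargs(style="balanced"):
    """Return keyword arguments for model.generate() for the given preset name."""
    if style not in PRESETS:
        raise ValueError(f"unknown style {style!r}; choose from {sorted(PRESETS)}")
    kwargs = dict(PRESETS[style])  # copy so callers cannot mutate the preset
    # Sampling must be enabled, otherwise temperature and top_p are ignored.
    kwargs["do_sample"] = True
    return kwargs
```

The returned dictionary can be unpacked into the generate call, for example model.generate(..., **reply_generation_kwargs("focused")). Lower temperature and top_p make replies more predictable, and a smaller max_new_tokens caps their length.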
Training Details
- Dataset: Custom JSON dataset of AI/ML YouTube comments and replies, split into train, validation, and test sets.
- Training Procedure: LoRA fine-tuning with 4-bit quantization (NF4) and mixed precision (bf16/fp16).
- Hyperparameters:
  - LoRA Rank (r): 16
  - LoRA Alpha: 32
  - LoRA Dropout: 0.05
  - Epochs: 5
  - Learning Rate: 2e-4
  - Optimizer: Paged AdamW 8-bit
  - Scheduler: Cosine with 10% warmup
- Evaluation Metrics:
  - BLEU and ROUGE scores computed on the test set (see training script for details).
- Training Features:
  - Gradient checkpointing for memory efficiency.
  - Early stopping with patience of 2 epochs based on validation loss.
  - Custom data collator for padding and label masking.
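Two of the items above, the collator with label masking and the cosine schedule with 10% warmup, can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the project's training script: the field names prompt_ids and reply_ids, the right-padding, and the choice to mask prompt tokens are all assumptions, and the schedule follows the common linear-warmup-then-cosine-decay shape with the learning rate of 2e-4 listed above.

```python
import math

IGNORE_INDEX = -100  # label value that PyTorch's cross-entropy loss ignores by default


def collate(examples, pad_token_id):
    """Right-pad a batch and mask prompt and padding positions in the labels.

    Each example is a dict with "prompt_ids" and "reply_ids" (lists of token ids).
    Both key names are hypothetical.
    """
    max_len = max(len(ex["prompt_ids"]) + len(ex["reply_ids"]) for ex in examples)
    batch = {"input_ids": [], "attention_mask": [], "labels": []}
    for ex in examples:
        ids = ex["prompt_ids"] + ex["reply_ids"]
        pad = max_len - len(ids)
        batch["input_ids"].append(ids + [pad_token_id] * pad)
        batch["attention_mask"].append([1] * len(ids) + [0] * pad)
        # Only the reply tokens contribute to the loss.
        batch["labels"].append(
            [IGNORE_INDEX] * len(ex["prompt_ids"]) + ex["reply_ids"] + [IGNORE_INDEX] * pad
        )
    return batch


def cosine_lr(step, total_steps, peak_lr=2e-4, warmup_ratio=0.1):
    """Linear warmup to peak_lr over the first 10% of steps, then cosine decay to zero."""
    warmup_steps = max(1, int(total_steps * warmup_ratio))
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

In a real training loop the lists would be converted to tensors, and the scheduler would normally come from the training framework rather than being hand-written.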
Performance
For a small fine-tuning project, the model performs reasonably: BLEU and ROUGE scores on the test set indicate acceptable reply quality. Occasional out-of-context responses suggest that better dataset quality or further hyperparameter tuning could improve results.
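To give a sense of what an overlap metric such as ROUGE measures, here is a simplified ROUGE-1 F1 computation. It is a sketch only: it uses lowercased whitespace tokenization, and the function name rouge1_f1 is hypothetical. The project's actual evaluation may rely on a library implementation with different tokenization and stemming.

```python
from collections import Counter


def rouge1_f1(prediction, reference):
    """Simplified ROUGE-1 F1: unigram overlap between a prediction and a reference."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0
    # Counter intersection keeps the minimum count of each shared token.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

A score near 1 means the generated reply shares most of its words with the reference reply. A fluent but off-topic reply can still score low, which is one reason human review remains useful.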
Limitations
- May generate out-of-context or generic replies, requiring human review.
- Optimized for AI/ML YouTube comments; performance may vary for other domains.
- Limited to English-language inputs and outputs.
Ethical Considerations
- Generated replies should be reviewed to ensure they are appropriate and constructive.
- Use responsibly to foster positive community interactions.