---
library_name: peft
base_model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
pipeline_tag: text-generation
tags:
- lora
- adapters
- tinyllama
- youtube
- conversational
- text-generation
license: apache-2.0
---

# TinyLlama YouTube Replies (LoRA)

This model is a **LoRA fine-tuned** version of [TinyLlama/TinyLlama-1.1B-Chat-v1.0](https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0), designed to generate **concise, friendly, and domain-specific replies** to YouTube comments on AI/ML-related content.

Using Low-Rank Adaptation (LoRA), this project demonstrates how to fine-tune a lightweight language model for a conversational task. While the model may occasionally produce out-of-context replies and could benefit from further optimization, it showcases a complete, functional fine-tuning pipeline.

## Model Details

- **Base Model**: [TinyLlama/TinyLlama-1.1B-Chat-v1.0](https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0)
- **Fine-Tuning Method**: LoRA (Low-Rank Adaptation)
- **Task**: Generating short, engaging replies to AI/ML YouTube comments
- **Language**: English
- **License**: Apache 2.0

## Intended Use

This model is intended for:
- Generating polite and engaging replies to AI/ML-related YouTube comments.
- Demonstrating a fine-tuning project using LoRA for lightweight adaptation.
- Research or educational purposes in conversational AI.

**Not Intended For**:
- Production environments without further optimization.
- Non-English text generation.
- Applications requiring high contextual accuracy without human review.

## Usage

To use this model, you need the `transformers` and `peft` libraries. Below is an example of how to load the adapters and generate a reply:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load the base model, tokenizer, and LoRA adapters
base_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"
adapter_id = "AdamDE/tinyllama-custom-youtube-replies"
tokenizer = AutoTokenizer.from_pretrained(adapter_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.float16, device_map="auto"
)
model = PeftModel.from_pretrained(base_model, adapter_id)

# Prepare input
messages = [
    {
        "role": "system",
        "content": (
            "You are an AI/ML tutorial creator replying to YouTube comments. "
            "Provide concise, friendly, and domain-specific help, encourage engagement, "
            "and keep a positive tone with occasional emojis when appropriate."
        ),
    },
    {"role": "user", "content": "Your enthusiasm is contagious!"},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Generate a reply; do_sample=True is required for temperature/top_p to take effect
with torch.no_grad():
    out = model.generate(
        inputs,
        max_new_tokens=128,
        do_sample=True,
        temperature=0.7,
        top_p=0.9,
        pad_token_id=tokenizer.eos_token_id,
    )

# Decode only the newly generated tokens, skipping the echoed prompt
reply = tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True)
print(reply)
# Example output: "Haha, thanks! 😂 What's your favorite part?"
```

### Requirements

```bash
pip install transformers peft torch
```

### Notes

- Use a clear, comment-like prompt for best results.
- Adjust `max_new_tokens`, `temperature`, and `top_p` to control reply length and creativity.
- The model may occasionally generate out-of-context replies, indicating room for further optimization.

## Training Details

- **Dataset**: Custom JSON dataset of AI/ML YouTube comments and replies, split into train, validation, and test sets.
- **Training Procedure**: LoRA fine-tuning with 4-bit quantization (NF4) and mixed precision (bf16/fp16); see the configuration sketch after this list.
- **Hyperparameters**:
  - LoRA Rank (r): 16
  - LoRA Alpha: 32
  - LoRA Dropout: 0.05
  - Epochs: 5
  - Learning Rate: 2e-4
  - Optimizer: Paged AdamW 8-bit
  - Scheduler: Cosine with 10% warmup
- **Evaluation Metrics**:
  - BLEU and ROUGE scores computed on the test set (see the training script for details).
- **Training Features**:
  - Gradient checkpointing for memory efficiency.
  - Early stopping with a patience of 2 epochs based on validation loss.
  - Custom data collator for padding and label masking.
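For orientation, here is a minimal QLoRA-style sketch consistent with the hyperparameters above, assuming the Hugging Face `Trainer` stack. It is not the exact training script: the `target_modules` list, `output_dir`, and the `train_ds`/`val_ds`/`collator` variables are illustrative assumptions.

```python
import torch
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    TrainingArguments,
    Trainer,
    EarlyStoppingCallback,
)
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"

# 4-bit NF4 quantization with mixed-precision compute, as listed above
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,  # use torch.float16 on GPUs without bf16
)

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id, quantization_config=bnb_config, device_map="auto"
)
# Enables gradient checkpointing and prepares the quantized model for training
model = prepare_model_for_kbit_training(model, use_gradient_checkpointing=True)

# LoRA hyperparameters from the list above; target_modules is an assumption
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
model = get_peft_model(model, lora_config)

args = TrainingArguments(
    output_dir="tinyllama-youtube-replies",  # illustrative
    num_train_epochs=5,
    learning_rate=2e-4,
    optim="paged_adamw_8bit",
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    bf16=True,  # or fp16=True, matching the compute dtype above
    eval_strategy="epoch",  # `evaluation_strategy` on older transformers versions
    save_strategy="epoch",
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",
)

# train_ds / val_ds / collator are hypothetical: tokenized splits of the
# comment/reply dataset and the custom padding/label-masking collator
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_ds,
    eval_dataset=val_ds,
    data_collator=collator,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=2)],
)
trainer.train()
```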
## Performance

The model achieves reasonable quality for a small fine-tuning project, with BLEU and ROUGE scores on the held-out test set indicating decent reply quality. However, occasional out-of-context responses suggest room for improvement in dataset quality or hyperparameter tuning. A sketch of how such an evaluation might be run appears at the end of this card.

## Limitations

- May generate out-of-context or generic replies, requiring human review.
- Optimized for AI/ML YouTube comments; performance may vary in other domains.
- Limited to English-language inputs and outputs.

## Ethical Considerations

- Generated replies should be reviewed to ensure they are appropriate and constructive.
- Use responsibly to foster positive community interactions.
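## Evaluation Sketch

As referenced in the Performance section, here is a minimal sketch of how BLEU and ROUGE might be computed on the test split, assuming the Hugging Face `evaluate` library (ROUGE additionally needs `pip install rouge_score`). The `predictions` and `references` lists are hypothetical placeholders for model outputs and gold replies:

```python
import evaluate

# Hypothetical placeholders: model outputs and gold replies from the test split
predictions = ["Thanks so much! Glad the LoRA walkthrough helped 😊"]
references = ["Thank you! Happy the LoRA tutorial was useful."]

bleu = evaluate.load("bleu")
rouge = evaluate.load("rouge")

# BLEU accepts one or more references per prediction, hence the nested list
print(bleu.compute(predictions=predictions, references=[[r] for r in references]))
print(rouge.compute(predictions=predictions, references=references))
```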