Model Card: Llama-2-chat-finetuned

Model Details

  • Model Name: Llama-2-chat-finetuned
  • Base Model: NousResearch/Llama-2-7b-chat-hf
  • Fine-Tuned By: HiTruong
  • Fine-Tuning Method: LoRA (Low-Rank Adaptation)
  • Dataset: Movie-related dataset
  • Evaluation Metric: BLEU Score (see the scoring sketch after this list)
  • BLEU Score Before Fine-Tuning: 33.26
  • BLEU Score After Fine-Tuning: 77.53
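
The card does not state how BLEU was computed. A minimal sketch of one common approach, using the Hugging Face evaluate package; the example strings and the 0-100 scaling are assumptions, not the card's actual evaluation script:

import evaluate

bleu = evaluate.load("bleu")
predictions = ["The Matrix is a 1999 science-fiction film."]    # model outputs
references = [["The Matrix is a 1999 science fiction film."]]   # gold answers
score = bleu.compute(predictions=predictions, references=references)
print(score["bleu"] * 100)  # evaluate returns BLEU in [0, 1]; the card reports a 0-100 scale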

Model Description

This model is a fine-tuned version of NousResearch/Llama-2-7b-chat-hf, optimized for movie-related conversations. Fine-tuning used LoRA, which trains small low-rank adapter matrices on top of the frozen base weights, keeping compute and memory requirements manageable. The goal is improved conversational understanding and response generation for movie-related queries.

Training Details

  • Hardware Used: Kaggle GPU (T4x2)
  • Fine-Tuning Framework: Hugging Face Transformers + LoRA (see the configuration sketch after this list)
  • Output Folder: ./results
  • Number of Epochs: 2
  • Batch Size:
    • Per Device Train: 4
    • Per Device Eval: 4
  • Gradient Accumulation Steps: 1
  • Gradient Checkpointing: Enabled
  • Max Gradient Norm: 0.3
  • Mixed Precision: fp16=False, bf16=False
  • Optimizer: paged_adamw_32bit
  • Learning Rate: 2e-5
  • Weight Decay: 0.001
  • LR Scheduler Type: cosine
  • Warmup Ratio: 0.03
  • Max Steps: -1 (determined by epochs)
  • Quantization Settings:
    • use_4bit = True
    • bnb_4bit_compute_dtype = float16
    • bnb_4bit_quant_type = nf4
    • use_nested_quant = False
  • LoRA Hyperparameters:
    • lora_r = 64
    • lora_alpha = 16
    • lora_dropout = 0.05
  • Sequence Length: Dynamic (max_seq_length=None)
  • Packing: Disabled (packing=False)
  • Device Map: {"": 0}
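
The exact training script is not published, so the code below is a reconstruction of the configuration from the values listed above. A minimal sketch, assuming the common trl SFTTrainer recipe (pre-1.0 trl API); the train_dataset placeholder and the "text" column name are assumptions:

import torch
from peft import LoraConfig
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          BitsAndBytesConfig, TrainingArguments)
from trl import SFTTrainer

base_model = "NousResearch/Llama-2-7b-chat-hf"

# 4-bit NF4 quantization with float16 compute, no nested quantization.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=False,
)

model = AutoModelForCausalLM.from_pretrained(
    base_model, quantization_config=bnb_config, device_map={"": 0}
)
tokenizer = AutoTokenizer.from_pretrained(base_model)

# LoRA adapter hyperparameters from the list above.
peft_config = LoraConfig(
    r=64, lora_alpha=16, lora_dropout=0.05,
    bias="none", task_type="CAUSAL_LM",
)

# Optimizer and schedule settings from the list above.
training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=2,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=1,
    gradient_checkpointing=True,
    max_grad_norm=0.3,
    fp16=False,
    bf16=False,
    optim="paged_adamw_32bit",
    learning_rate=2e-5,
    weight_decay=0.001,
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    max_steps=-1,  # run for the configured number of epochs instead
)

train_dataset = ...  # the movie-related dataset; not published with the card

trainer = SFTTrainer(
    model=model,
    train_dataset=train_dataset,
    peft_config=peft_config,
    dataset_text_field="text",  # assumed column name
    max_seq_length=None,        # dynamic sequence length
    tokenizer=tokenizer,
    args=training_args,
    packing=False,
)
trainer.train()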

Capabilities

  • Answers movie-related questions more accurately than the base model (BLEU 33.26 → 77.53).
  • Understands movie genres, actors, directors, and plots.
  • Provides recommendations based on user preferences.

Limitations

  • May generate incorrect or biased information.
  • Movie knowledge is limited to the base model's pretraining data and the fine-tuning dataset.
  • Does not have real-time access to new movie releases.

Usage

You can load and use the model with the following code:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "HiTruong/Llama-2-chat-finetuned"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

def generate_answer(question):
    # Wrap the question in the Llama-2 chat instruction template.
    inputs = tokenizer(f"<s>[INST] {question} [/INST]", return_tensors="pt", truncation=True, max_length=100).to(model.device)
    with torch.no_grad():
        # max_new_tokens caps only the generated continuation, independent of prompt length.
        output = model.generate(**inputs, max_new_tokens=75, eos_token_id=tokenizer.eos_token_id)
    response = tokenizer.decode(output[0], skip_special_tokens=True, clean_up_tokenization_spaces=True)
    # Strip the echoed prompt and keep only the first sentence of the reply.
    return response.replace(f"[INST] {question} [/INST]", "").strip().split('.')[0]

input_text = "What are some great sci-fi movies?"
print(generate_answer(input_text))
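
On GPUs with limited memory, the checkpoint can also be loaded with the same 4-bit NF4 quantization used during training. This is an optional loading path, not something the card prescribes; it requires the bitsandbytes package:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_name = "HiTruong/Llama-2-chat-finetuned"
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
model = AutoModelForCausalLM.from_pretrained(
    model_name, quantization_config=bnb_config, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)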