These are LoRA weights only, trained for research purposes; they contain nothing from the foundation model. Trained on Anthropic's HH-RLHF dataset, available at https://huggingface.co/datasets/Anthropic/hh-rlhf

Sample usage

import torch
from peft import PeftModel
from transformers import LlamaTokenizer, LlamaForCausalLM

model_path = "decapoda-research/llama-30b-hf"
peft_path = "serpdotai/llama-hh-lora-30B"
tokenizer_path = "decapoda-research/llama-30b-hf"

# Load the base model in 8-bit (requires bitsandbytes), then apply the LoRA weights on top
model = LlamaForCausalLM.from_pretrained(model_path, load_in_8bit=True, device_map="auto")  # or something like {"": 0}
model = PeftModel.from_pretrained(model, peft_path, torch_dtype=torch.float16, device_map="auto")  # or something like {"": 0}
tokenizer = LlamaTokenizer.from_pretrained(tokenizer_path)

# The model expects a "\n\nUser: ...\n\nAssistant:" conversation format
batch = tokenizer("\n\nUser: Are you sentient?\n\nAssistant:", return_tensors="pt")

with torch.no_grad():
    out = model.generate(
        input_ids=batch["input_ids"].cuda(),
        attention_mask=batch["attention_mask"].cuda(),
        max_length=100,
        do_sample=True,
        top_k=50,
        top_p=1.0,
        temperature=1.0,
        use_cache=False,
    )
print(tokenizer.decode(out[0]))

The model will continue the conversation between the user and itself. If you want to use it as a chatbot, you can alter the generate call to include stop sequences for 'User:' and 'Assistant:', or strip off anything past the assistant's first response before returning, as sketched below.
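A minimal sketch of the stop-sequence approach using transformers' StoppingCriteria, assuming the model and tokenizer are loaded as above. StopOnStrings is a hypothetical helper written for this example, not part of the transformers library:

import torch
from transformers import StoppingCriteria, StoppingCriteriaList

class StopOnStrings(StoppingCriteria):
    """Halt generation as soon as any stop string appears in the newly generated tokens."""
    def __init__(self, tokenizer, stop_strings, prompt_length):
        self.tokenizer = tokenizer
        self.stop_strings = stop_strings
        self.prompt_length = prompt_length

    def __call__(self, input_ids, scores, **kwargs):
        # Decode only the continuation so stop strings in the prompt don't trigger early
        generated = self.tokenizer.decode(input_ids[0][self.prompt_length:])
        return any(s in generated for s in self.stop_strings)

prompt = "\n\nUser: Are you sentient?\n\nAssistant:"
batch = tokenizer(prompt, return_tensors="pt")
prompt_length = batch["input_ids"].shape[1]

stopping = StoppingCriteriaList([StopOnStrings(tokenizer, ["\n\nUser:"], prompt_length)])

with torch.no_grad():
    out = model.generate(
        input_ids=batch["input_ids"].cuda(),
        attention_mask=batch["attention_mask"].cuda(),
        max_length=100,
        do_sample=True,
        stopping_criteria=stopping,
    )

# Decode only the new tokens and trim any trailing stop sequence
reply = tokenizer.decode(out[0][prompt_length:], skip_special_tokens=True)
reply = reply.split("\n\nUser:")[0].strip()
print(reply)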

Trained for 2 epochs with a sequence length of 368, a mini-batch size of 1, and gradient accumulation of 15 on 8 A6000s, for an effective batch size of 120 (1 × 15 × 8).

Training settings (see the sketch after this list):

  • lr: 2.0e-04
  • lr_scheduler_type: linear
  • warmup_ratio: 0.06
  • weight_decay: 0.1
  • optimizer: adamw_torch_fused

LoRA config (see the sketch after this list):

  • target_modules: ['q_proj', 'k_proj', 'v_proj', 'o_proj']
  • r: 64
  • lora_alpha: 32
  • lora_dropout: 0.05
  • bias: "none"
  • task_type: "CAUSAL_LM"
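
For reference, the list above expressed as a peft LoraConfig (a sketch, assuming a recent peft version):

from peft import LoraConfig

lora_config = LoraConfig(
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # attention projections only
    r=64,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)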