Llama-3-13B-Instruct-v0.1

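This model is a self-merge of the meta-llama/Meta-Llama-3-8B-Instruct model: layers of the base model are stacked on top of each other to produce a larger network, with no other model involved.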
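
As a rough sanity check on what a self-merge does to model size, the sketch below counts parameters for a Llama-style decoder with a variable number of layers. It assumes the published Meta-Llama-3-8B configuration (hidden size 4096, MLP size 14336, 32 attention heads, 8 key/value heads, vocabulary of 128256, untied output head). The function name and the 56-layer figure are illustrative assumptions: the card does not state which layers were duplicated, and 56 is simply the layer count that lands near the 13.3B size reported for this model.

```python
def llama_param_count(num_layers, hidden=4096, intermediate=14336,
                      n_heads=32, n_kv_heads=8, vocab=128256):
    """Count parameters of a Llama-style decoder (assumed Llama-3-8B dimensions)."""
    head_dim = hidden // n_heads
    kv_dim = n_kv_heads * head_dim                     # grouped-query attention
    attn = 2 * hidden * hidden + 2 * hidden * kv_dim   # q_proj, o_proj + k_proj, v_proj
    mlp = 3 * hidden * intermediate                    # gate, up, and down projections
    norms = 2 * hidden                                 # two RMSNorm weights per layer
    per_layer = attn + mlp + norms
    embeddings = 2 * vocab * hidden                    # input embeddings + untied lm_head
    final_norm = hidden
    return num_layers * per_layer + embeddings + final_norm


print(f"{llama_param_count(32) / 1e9:.2f}B")  # base model, 32 layers -> 8.03B
print(f"{llama_param_count(56) / 1e9:.2f}B")  # hypothetical 56-layer stack -> 13.26B
```

Only the per-layer weights are duplicated in a stack like this; the embedding table and output head are counted once, which is why going from 32 to 56 layers does not scale the total by 56/32.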

How to use

To use this model, pass MaziyarPanahi/Llama-3-13B-Instruct-v0.1 as the model name to Hugging Face's transformers library.

from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer
from transformers import pipeline
import torch

model_id = "MaziyarPanahi/Llama-3-13B-Instruct-v0.1"

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True,
    # attn_implementation="flash_attention_2"
)

tokenizer = AutoTokenizer.from_pretrained(
    model_id,
    trust_remote_code=True
)

streamer = TextStreamer(tokenizer)

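# Use a distinct name so the imported pipeline() factory is not shadowed.
# The dtype is already fixed by the float16 load above, so it is not repeated here.
generator = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    streamer=streamer,  # prints tokens to stdout as they are generated
)

# Then you can use the pipeline to generate text.

messages = [
    {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
    {"role": "user", "content": "Who are you?"},
]

# Render the messages with the model's chat template, ending with an open assistant turn.
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

# Stop on either the regular EOS token or Llama 3's end-of-turn token.
terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>")
]

outputs = generator(
    prompt,
    max_new_tokens=256,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.95,
)
# Print only the completion, without the echoed prompt.
print(outputs[0]["generated_text"][len(prompt):])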

Prompt template

<|begin_of_text|><|start_header_id|>system<|end_header_id|>

You are a helpful assistant.<|eot_id|><|start_header_id|>user<|end_header_id|>

what's 25-4*2+3<|eot_id|><|start_header_id|>assistant<|end_header_id|>

To evaluate this expression, we need to follow the order of operations (PEMDAS):

1. First, multiply 4 and 2: 4*2 = 8
2. Then, subtract 8 from 25: 25 - 8 = 17
3. Finally, add 3: 17 + 3 = 20

So, 25-4*2+3 = 20!<|eot_id|>
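
The template above can also be assembled by hand, which is useful for checking what tokenizer.apply_chat_template is expected to produce. The following is a minimal sketch, not the tokenizer's actual implementation: build_llama3_prompt is a hypothetical helper, and it assumes every message has only a role and a text content field.

```python
def build_llama3_prompt(messages, add_generation_prompt=True):
    """Render chat messages in the Llama 3 format shown above (illustrative helper)."""
    prompt = "<|begin_of_text|>"
    for message in messages:
        # Each turn: a role header, a blank line, the content, then the end-of-turn token.
        prompt += (
            f"<|start_header_id|>{message['role']}<|end_header_id|>\n\n"
            f"{message['content'].strip()}<|eot_id|>"
        )
    if add_generation_prompt:
        # Leave an assistant header open so the model writes the next turn.
        prompt += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return prompt


messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "what's 25-4*2+3"},
]
print(build_llama3_prompt(messages))
```

In practice, prefer tokenizer.apply_chat_template, since it reads the template shipped with the model and stays correct if that template changes.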

Model size: 13.3B params (Safetensors, FP16)