[Llama-3 DPO logo]

Llama-3-8B-Instruct-v0.8

This model was developed from the MaziyarPanahi/Llama-3-8B-Instruct-v0.4 model.

⚡ Quantized GGUF

All GGUF models are available here: MaziyarPanahi/Llama-3-8B-Instruct-v0.8-GGUF
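
If you want to run one of the GGUF quants locally, the snippet below is a minimal sketch using llama-cpp-python; the Q4_K_M filename pattern is an assumption, so check the GGUF repo's file list for the quant you actually want.

```python
# pip install llama-cpp-python
from llama_cpp import Llama

# Downloads the matching GGUF file from the Hugging Face repo;
# the quant level (Q4_K_M) is an assumed example
llm = Llama.from_pretrained(
    repo_id="MaziyarPanahi/Llama-3-8B-Instruct-v0.8-GGUF",
    filename="*Q4_K_M.gguf",
    n_ctx=4096,
)

response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Who are you?"},
    ],
    max_tokens=256,
)
print(response["choices"][0]["message"]["content"])
```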

πŸ† Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric                            | Value |
|-----------------------------------|------:|
| Avg.                              | 73.20 |
| AI2 Reasoning Challenge (25-Shot) | 71.67 |
| HellaSwag (10-Shot)               | 87.77 |
| MMLU (5-Shot)                     | 68.30 |
| TruthfulQA (0-shot)               | 63.90 |
| Winogrande (5-shot)               | 79.08 |
| GSM8k (5-shot)                    | 68.46 |

As of 03/06/2024, MaziyarPanahi/Llama-3-8B-Instruct-v0.8 was the 5th best-performing 8B model on the Open LLM Leaderboard.


Leaderboard 2.0:

| Metric              | Value |
|---------------------|------:|
| Avg.                | 26.75 |
| IFEval (0-Shot)     | 75.12 |
| BBH (3-Shot)        | 28.27 |
| MATH Lvl 5 (4-Shot) |  7.10 |
| GPQA (0-shot)       |  7.38 |
| MuSR (0-shot)       | 10.92 |
| MMLU-PRO (5-shot)   | 31.68 |

Prompt Template

This model uses the Llama-3 prompt template:

```
<|begin_of_text|><|start_header_id|>system<|end_header_id|>

{system_prompt}<|eot_id|><|start_header_id|>user<|end_header_id|>

{prompt}<|eot_id|><|start_header_id|>assistant<|end_header_id|>
```
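
For illustration, a prompt with the system message "You are a helpful assistant." and the user message "Who are you?" renders as:

```
<|begin_of_text|><|start_header_id|>system<|end_header_id|>

You are a helpful assistant.<|eot_id|><|start_header_id|>user<|end_header_id|>

Who are you?<|eot_id|><|start_header_id|>assistant<|end_header_id|>
```

The model's reply is generated after the final assistant header and ends with <|eot_id|>.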

How to use

You can load this model with Hugging Face's transformers library by passing MaziyarPanahi/Llama-3-8B-Instruct-v0.8 as the model name:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer, pipeline
import torch

model_id = "MaziyarPanahi/Llama-3-8B-Instruct-v0.8"

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
    # attn_implementation="flash_attention_2"
)

tokenizer = AutoTokenizer.from_pretrained(
    model_id,
    trust_remote_code=True
)

# Print tokens to stdout as they are generated
streamer = TextStreamer(tokenizer)

# Name the object `pipe` so it does not shadow the imported `pipeline` factory;
# torch_dtype is already set on the loaded model, so no model_kwargs are needed
pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    streamer=streamer
)

# Then you can use the pipeline to generate text.

messages = [
    {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
    {"role": "user", "content": "Who are you?"},
]

# Render the chat messages into a single Llama-3-formatted prompt string
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

# Llama-3-Instruct marks the end of a turn with <|eot_id|> rather than the
# default EOS token, so stop generation on either
terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>")
]

outputs = pipe(
    prompt,
    max_new_tokens=512,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.95,
)
# Slice off the prompt so only the model's reply is printed
print(outputs[0]["generated_text"][len(prompt):])
```
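
If you prefer not to use the pipeline wrapper, an equivalent sketch that calls model.generate directly (reusing model, tokenizer, messages, and terminators from above, with the same sampling settings) looks like this:

```python
# Tokenize the chat template directly instead of rendering it to a string
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

output_ids = model.generate(
    input_ids,
    max_new_tokens=512,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.95,
)

# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```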