Update README.md
README.md CHANGED
@@ -133,8 +133,11 @@ model.save_pretrained("Meta-Llama-3.1-70B-Instruct-quantized.w4a16")
 ## Evaluation
 
 This model was evaluated on the well-known Arena-Hard, OpenLLM v1, OpenLLM v2, and HumanEval benchmarks.
-
+In all cases, model outputs were generated with the [vLLM](https://docs.vllm.ai/en/stable/) engine.
+Arena-Hard evaluations were conducted using the [Arena-Hard-Auto](https://github.com/lmarena/arena-hard-auto) repository.
+OpenLLM v1 and v2 evaluations were conducted using Neural Magic's fork of [lm-evaluation-harness](https://github.com/neuralmagic/lm-evaluation-harness/tree/llama_3.1_instruct) (branch llama_3.1_instruct).
 This version of the lm-evaluation-harness includes versions of MMLU, ARC-Challenge and GSM-8K that match the prompting style of [Meta-Llama-3.1-Instruct-evals](https://huggingface.co/datasets/meta-llama/Meta-Llama-3.1-70B-Instruct-evals) and a few fixes to OpenLLM v2 tasks.
+HumanEval and HumanEval+ evaluations were conducted using Neural Magic's fork of the [EvalPlus](https://github.com/neuralmagic/evalplus) repository.
 
 **Note:** Results have been updated after Meta modified the chat template.
 
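The added line states that all outputs were generated with the vLLM engine. As a minimal sketch only (not the README's own recipe), the snippet below shows one way such generations could be produced; the Hugging Face repo id, parallelism, and sampling settings are assumptions.

```python
# Illustrative sketch: repo id, tensor_parallel_size, and sampling settings are
# assumptions, not the exact configuration behind the reported results.
from transformers import AutoTokenizer
from vllm import LLM, SamplingParams

model_id = "neuralmagic/Meta-Llama-3.1-70B-Instruct-quantized.w4a16"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
llm = LLM(model=model_id, tensor_parallel_size=4, max_model_len=4096)

# Apply the model's chat template (the note in the diff mentions that results
# changed after Meta modified this template).
prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": "Give a one-sentence summary of weight quantization."}],
    tokenize=False,
    add_generation_prompt=True,
)

outputs = llm.generate([prompt], SamplingParams(temperature=0.0, max_tokens=256))
print(outputs[0].outputs[0].text)
```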
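For the OpenLLM v1/v2 numbers, the diff points to the llama_3.1_instruct branch of Neural Magic's lm-evaluation-harness fork. Below is a rough sketch of driving the harness through its Python API with the vLLM backend; the task list and model_args are placeholders, and the fork's Llama-3.1-style task variants and exact settings are not asserted here.

```python
# Sketch of an lm-evaluation-harness run with the vLLM backend.
# Task names and model_args are placeholders, not the settings used for the
# scores reported in the README.
import lm_eval

results = lm_eval.simple_evaluate(
    model="vllm",
    model_args=(
        "pretrained=neuralmagic/Meta-Llama-3.1-70B-Instruct-quantized.w4a16,"  # assumed repo id
        "dtype=auto,max_model_len=4096,tensor_parallel_size=4"
    ),
    tasks=["arc_challenge", "gsm8k"],  # standard names; the fork adds Llama-3.1-prompt variants
    batch_size="auto",
)
print(results["results"])
```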
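The HumanEval and HumanEval+ scores are attributed to Neural Magic's EvalPlus fork. The sketch below only illustrates the general pattern of generating completions for the HumanEval+ problem set and writing them out for EvalPlus scoring; it is not the fork's pipeline, and the generation settings and output key are assumptions.

```python
# Sketch of producing HumanEval+ completions for EvalPlus scoring.
# Scoring itself would be done separately with the EvalPlus tooling; this only
# shows writing samples.jsonl from greedy vLLM generations.
from evalplus.data import get_human_eval_plus, write_jsonl
from vllm import LLM, SamplingParams

model_id = "neuralmagic/Meta-Llama-3.1-70B-Instruct-quantized.w4a16"  # assumed repo id
llm = LLM(model=model_id, tensor_parallel_size=4)
params = SamplingParams(temperature=0.0, max_tokens=512)

samples = []
for task_id, problem in get_human_eval_plus().items():
    completion = llm.generate([problem["prompt"]], params)[0].outputs[0].text
    samples.append({"task_id": task_id, "completion": completion})

write_jsonl("samples.jsonl", samples)
```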