CapybaraHermes-2.5-Mistral-7B

Built with Distilabel

This model is the launching partner of the capybara-dpo dataset, built with ⚗️ distilabel. It is a preference-tuned version of OpenHermes-2.5-Mistral-7B.

CapybaraHermes has been preference-tuned with LoRA and TRL for 3 epochs using Argilla's dpo-mix-7k dataset.
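This setup roughly corresponds to a standard TRL DPO run with a LoRA adapter. Below is a minimal sketch, assuming a recent TRL release that exposes `DPOConfig`/`DPOTrainer` and that the preference pairs are available on the Hub as `argilla/dpo-mix-7k`; apart from the 3 epochs stated above, all hyperparameters are illustrative, not the exact recipe used for this model.

```python
# Minimal DPO + LoRA sketch with TRL (illustrative, not the authors' exact recipe).
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

base_model = "teknium/OpenHermes-2.5-Mistral-7B"
model = AutoModelForCausalLM.from_pretrained(base_model, torch_dtype=torch.bfloat16)
tokenizer = AutoTokenizer.from_pretrained(base_model)

# Preference pairs; repo id is an assumption, and columns must match what DPOTrainer
# expects (prompt / chosen / rejected, or a conversational preference format).
train_dataset = load_dataset("argilla/dpo-mix-7k", split="train")

peft_config = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM")

training_args = DPOConfig(
    output_dir="capybarahermes-dpo",
    num_train_epochs=3,            # 3 epochs, as stated above
    per_device_train_batch_size=2,  # illustrative
    learning_rate=5e-5,             # illustrative
    beta=0.1,                       # DPO temperature (assumed value)
)

trainer = DPOTrainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    tokenizer=tokenizer,       # newer TRL versions use processing_class= instead
    peft_config=peft_config,   # with LoRA, no separate frozen ref_model is needed
)
trainer.train()
```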

To test the impact on multi-turn performance, we used MTBench. We also include the Nous benchmark results and Mistral-7B-Instruct-v0.2 for reference, as it is a strong 7B model on MTBench:

| Model | AGIEval | GPT4All | TruthfulQA | Bigbench | MTBench First Turn | MTBench Second Turn | Nous avg. | MTBench avg. |
|---|---|---|---|---|---|---|---|---|
| argilla/CapybaraHermes-2.5-Mistral-7B | 43.8 | 73.35 | 57.07 | 42.44 | 8.24375 | 7.5625 | 54.16 | 7.903125 |
| teknium/OpenHermes-2.5-Mistral-7B | 42.75 | 72.99 | 52.99 | 40.94 | 8.25 | 7.2875 | 52.42 | 7.76875 |
| Mistral-7B-Instruct-v0.2 | 38.5 | 71.64 | 66.82 | 42.29 | 7.8375 | 7.1 | 54.81 | 7.46875 |
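Here "Nous avg." is the mean of the four Nous benchmark columns (AGIEval, GPT4All, TruthfulQA, Bigbench) and "MTBench avg." is the mean of the two turn scores; for CapybaraHermes, (8.24375 + 7.5625) / 2 = 7.903125.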

In the context of the capybara-dpo dataset, the most interesting result is the improvement in the MTBench Second Turn score.

For the merge lovers, we also preference-tuned Beagle14-7B with a mix of capybara-dpo and distilabel orca pairs using the same recipe as NeuralBeagle (see YALL - Yet Another LLM Leaderboard for reference):

| Model | AGIEval | GPT4All | TruthfulQA | Bigbench | Average |
|---|---|---|---|---|---|
| DistilabelBeagle14-7B | 45.29 | 76.92 | 71.66 | 48.78 | 60.66 |

Model Details

Model Description

This is the model card of a 🤗 transformers model that has been pushed to the Hub. This model card has been automatically generated.

  • Developed by: Argilla
  • Shared by: Argilla
  • Model type: 7B chat model
  • Language(s) (NLP): English
  • License: Same as OpenHermes
  • Finetuned from model: OpenHermes-2.5-Mistral-7B
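
For completeness, here is a minimal inference sketch with 🤗 transformers. It assumes the tokenizer ships a chat template (ChatML, inherited from OpenHermes-2.5); the prompt and sampling parameters are illustrative.

```python
# Minimal inference sketch (illustrative prompt and generation settings).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "argilla/CapybaraHermes-2.5-Mistral-7B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Explain DPO in two sentences."},
]
# apply_chat_template formats the conversation with the tokenizer's built-in chat template.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(
    inputs, max_new_tokens=256, do_sample=True, temperature=0.7, top_p=0.95
)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```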

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric | Value |
|---|---|
| Avg. | 68.14 |
| AI2 Reasoning Challenge (25-Shot) | 65.78 |
| HellaSwag (10-Shot) | 85.45 |
| MMLU (5-Shot) | 63.13 |
| TruthfulQA (0-shot) | 56.91 |
| Winogrande (5-shot) | 78.30 |
| GSM8k (5-shot) | 59.29 |