---
license: apache-2.0
language:
- en
tags:
- merge
base_model:
- Weyaxi/OpenHermes-2.5-neural-chat-v3-3-openchat-3.5-1210-Slerp
- EmbeddedLLM/Mistral-7B-Merge-14-v0.3
---
# Model Description

This model is an experiment in merging 14 models using DARE TIES 🦙
We first merged the 14 models to produce [EmbeddedLLM/Mistral-7B-Merge-14-v0.3](https://huggingface.co/EmbeddedLLM/Mistral-7B-Merge-14-v0.3),
which was then merged with [Weyaxi/OpenHermes-2.5-neural-chat-v3-3-openchat-3.5-1210-Slerp](https://huggingface.co/Weyaxi/OpenHermes-2.5-neural-chat-v3-3-openchat-3.5-1210-Slerp) using gradient SLERP.
The result is a model that performs quite well but may require further instruction fine-tuning.
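For reference, a minimal inference sketch with 🤗 Transformers is shown below. The repository ID is a hypothetical placeholder (this card does not state where the merged weights are hosted), and the prompt and generation settings are illustrative only.

```python
# Minimal inference sketch. MODEL_ID is a hypothetical placeholder:
# substitute the actual repository ID (or local path) of this merged model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "your-org/your-merged-model"  # placeholder, not a real repo

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,  # the merge itself was done in bfloat16
    device_map="auto",
)

# ChatML-style prompt (see the Chat Template section below)
prompt = "<|im_start|>user\nWhat is model merging?<|im_end|>\n<|im_start|>assistant\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```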
## Open LLM Leaderboard

| Metric     | Score |
|------------|-------|
| Average    | 71.19 |
| ARC        | 66.81 |
| HellaSwag  | 86.15 |
| MMLU       | 65.10 |
| TruthfulQA | 58.25 |
| Winogrande | 80.03 |
| GSM8K      | 70.81 |
## Chat Template

The model works with either the ChatML or the Llama-2 chat template.
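This card does not state whether the bundled tokenizer ships a built-in `chat_template`, so the sketch below assembles a ChatML prompt by hand; the `to_chatml` helper is purely illustrative.

```python
# Sketch: rendering a conversation in ChatML format by hand.
def to_chatml(messages):
    """Render a list of {role, content} dicts as a ChatML prompt string."""
    prompt = ""
    for m in messages:
        prompt += f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
    # Leave the assistant turn open so the model continues from here.
    prompt += "<|im_start|>assistant\n"
    return prompt

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize DARE TIES merging in two sentences."},
]
print(to_chatml(messages))
```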
## Merge Configuration

The merge configuration for this model is as follows:
```yaml
slices:
  - sources:
      - model: Weyaxi/OpenHermes-2.5-neural-chat-v3-3-openchat-3.5-1210-Slerp
        layer_range: [0, 32]
      - model: EmbeddedLLM/Mistral-7B-Merge-14-v0.3
        layer_range: [0, 32]
merge_method: slerp
base_model: Weyaxi/OpenHermes-2.5-neural-chat-v3-3-openchat-3.5-1210-Slerp
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5 # fallback for rest of tensors
tokenizer_source: base
embed_slerp: true
dtype: bfloat16
```
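The block above follows mergekit's configuration format, so the merge should be reproducible with mergekit (for example via its `mergekit-yaml` command-line tool or its Python API). The sketch below uses the Python API; the exact entry points and options can vary between mergekit versions, so treat it as an assumption and check the mergekit documentation.

```python
# Sketch of reproducing the merge with mergekit's Python API (assumed interface;
# verify names and options against your installed mergekit version).
import yaml
from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

CONFIG_YML = "config.yml"    # the YAML block above, saved to disk
OUTPUT_DIR = "./merged-model"

with open(CONFIG_YML, "r", encoding="utf-8") as fp:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(fp))

run_merge(
    merge_config,
    OUTPUT_DIR,
    options=MergeOptions(
        cuda=True,            # set False to merge on CPU
        copy_tokenizer=True,  # copy the base model's tokenizer into the output
    ),
)
```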