---
license: apache-2.0
language:
- en
tags:
- merge
---

# Model Description
This is an experiment to test merging 14 models using DARE TIES 🦙

We first merge 14 models to produce [EmbeddedLLM/Mistral-7B-Merge-14-v0.4](https://huggingface.co/EmbeddedLLM/Mistral-7B-Merge-14-v0.4),
which is then merged again with [Weyaxi/OpenHermes-2.5-neural-chat-v3-3-openchat-3.5-1210-Slerp](https://huggingface.co/Weyaxi/OpenHermes-2.5-neural-chat-v3-3-openchat-3.5-1210-Slerp) using Gradient SLERP.
The result is a model that performs quite well but may require further instruction fine-tuning.

## Chat Template

The model works with either the ChatML or the Llama-2 chat template.

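As an illustration, here is a minimal sketch of prompting the model with a ChatML-formatted prompt via 🤗 Transformers. The repository id and the example question are placeholders, not part of this model card.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder id: substitute the actual repository id of this model.
model_id = "<this-model-repo-id>"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

# Build a ChatML-style prompt: <|im_start|>role ... <|im_end|> blocks.
prompt = (
    "<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n"
    "<|im_start|>user\nExplain what SLERP means in model merging.<|im_end|>\n"
    "<|im_start|>assistant\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)

# Print only the newly generated tokens.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```
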
## Merge Configuration

The merge configuration used to produce this model is shown below:

```yaml
slices:
  - sources:
      - model: Weyaxi/OpenHermes-2.5-neural-chat-v3-3-openchat-3.5-1210-Slerp
        layer_range: [0, 32]
      - model: EmbeddedLLM/Mistral-7B-Merge-14-v0.3
        layer_range: [0, 32]

merge_method: slerp
base_model: Weyaxi/OpenHermes-2.5-neural-chat-v3-3-openchat-3.5-1210-Slerp

parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5 # fallback for rest of tensors
tokenizer_source: base
embed_slerp: true

dtype: bfloat16
```
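
To reproduce a merge like this, the configuration above can be fed to mergekit. The snippet below is only a sketch of mergekit's Python entry point, and option or function names may differ between mergekit versions; the `mergekit-yaml` command-line tool accepts the same YAML file directly.

```python
# A sketch of running the merge with mergekit's Python API (assumptions:
# mergekit is installed and the YAML above is saved as config.yaml; option
# and function names may vary slightly across mergekit versions).
import yaml

from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

with open("config.yaml", "r", encoding="utf-8") as fp:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(fp))

run_merge(
    merge_config,
    out_path="./merged-model",        # output directory (placeholder)
    options=MergeOptions(cuda=True),  # set cuda=False to merge on CPU
)
```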