Ttimofeyka committed
Commit 2940468 · verified · 1 Parent(s): c3a7728

Update README.md

Files changed (1)
  1. README.md +4 -40
README.md CHANGED
@@ -11,47 +11,11 @@ tags:
  - princeton-nlp/Llama-3-8B-ProLong-64k-Instruct
  ---

- # Llama-3-15B-Instruct-64k

- Llama-3-15B-Instruct-64k is a merge of the following models using [LazyMergekit](https://colab.research.google.com/drive/1obulZ1ROXHjYLn6PPZJwRR6GzgQogxxb?usp=sharing):
- * [princeton-nlp/Llama-3-8B-ProLong-64k-Instruct](https://huggingface.co/princeton-nlp/Llama-3-8B-ProLong-64k-Instruct)
- * [princeton-nlp/Llama-3-8B-ProLong-64k-Instruct](https://huggingface.co/princeton-nlp/Llama-3-8B-ProLong-64k-Instruct)
- * [princeton-nlp/Llama-3-8B-ProLong-64k-Instruct](https://huggingface.co/princeton-nlp/Llama-3-8B-ProLong-64k-Instruct)
- * [princeton-nlp/Llama-3-8B-ProLong-64k-Instruct](https://huggingface.co/princeton-nlp/Llama-3-8B-ProLong-64k-Instruct)

- ## 🧩 Configuration
-
- ```yaml
- dtype: bfloat16
- merge_method: passthrough
- slices:
- - sources:
-   - layer_range: [0, 24]
-     model: princeton-nlp/Llama-3-8B-ProLong-64k-Instruct
- - sources:
-   - layer_range: [8, 24]
-     model: princeton-nlp/Llama-3-8B-ProLong-64k-Instruct
-     parameters:
-       scale:
-       - filter: o_proj
-         value: 0.0
-       - filter: down_proj
-         value: 0.0
-       - value: 1.0
- - sources:
-   - layer_range: [8, 24]
-     model: princeton-nlp/Llama-3-8B-ProLong-64k-Instruct
-     parameters:
-       scale:
-       - filter: o_proj
-         value: 0.0
-       - filter: down_proj
-         value: 0.0
-       - value: 1.0
- - sources:
-   - layer_range: [24, 32]
-     model: princeton-nlp/Llama-3-8B-ProLong-64k-Instruct
- ```

  ## 💻 Usage

@@ -62,7 +26,7 @@ from transformers import AutoTokenizer
  import transformers
  import torch

- model = "Ttimofeyka/Llama-3-15B-Instruct-64k"
  messages = [{"role": "user", "content": "What is a large language model?"}]

  tokenizer = AutoTokenizer.from_pretrained(model)
 
  - princeton-nlp/Llama-3-8B-ProLong-64k-Instruct
  ---

+ # Llama-3-15B-64k-Instruct

+ I decided to repeat [this](https://huggingface.co/elinas/Llama-3-15B-Instruct-zeroed) merge, but using the [64K version of Llama 3 8B](https://huggingface.co/princeton-nlp/Llama-3-8B-ProLong-64k-Instruct).

+ This should work with a context of up to 64k, but I strongly recommend fine-tuning it first.
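As a quick sanity check (a sketch, not part of this commit; the repo id is the one used in the usage snippet below), the configured maximum context length can be read directly from the model config:

```python
# Sketch (assumed, not part of this commit): confirm the merged model's
# configured context window from its config.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("Ttimofeyka/Llama-3-15B-64k-Instruct")
print(config.max_position_embeddings)  # prints the model's maximum context length
```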
  ## 💻 Usage

  import transformers
  import torch

+ model = "Ttimofeyka/Llama-3-15B-64k-Instruct"
  messages = [{"role": "user", "content": "What is a large language model?"}]

  tokenizer = AutoTokenizer.from_pretrained(model)
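The usage snippet in this hunk ends at the tokenizer call; below is a minimal sketch (assumed, not part of this commit) of how such a transformers text-generation example typically continues:

```python
# Sketch (assumed continuation, not shown in this diff): build a chat prompt
# and generate with a transformers text-generation pipeline.
from transformers import AutoTokenizer
import transformers
import torch

model = "Ttimofeyka/Llama-3-15B-64k-Instruct"
messages = [{"role": "user", "content": "What is a large language model?"}]

tokenizer = AutoTokenizer.from_pretrained(model)
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
```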