Commit eadbb15
1 Parent(s): 732a0ac
Update README.md
README.md CHANGED
@@ -6,7 +6,11 @@ datasets:
 - yahma/alpaca-cleaned
 ---
 
-This repo contains a low-rank adapter for LLaMA-7b fit on
+This repo contains a low-rank adapter for **LLaMA-7b** fit on
+- `Nebulous/gpt4all_pruned`
+- `sahil2801/CodeAlpaca-20k`
+- `yahma/alpaca-cleaned`
+- datasets part of the OpenAssistant project.
 
 
 This version of the weights was trained with the following hyperparameters:
@@ -15,5 +19,8 @@ This version of the weights was trained with the following hyperparameters:
 - Batch size: 128
 - Max Length: 2048
 - Learning rate: 4e-6
-- Lora _r_:
-- Lora
+- Lora _r_: 8
+- Lora Alpha: 32
+- Lora target modules: q_proj, k_proj, v_proj, o_proj
+
+The model was trained with flash attention and gradient checkpointing.
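For context on how a low-rank adapter like the one in this repo is typically used, below is a minimal sketch of loading it on top of a LLaMA-7b base model with the Hugging Face `peft` library. This is not part of the commit: the base checkpoint path and adapter repo id are placeholders, and `transformers`/`peft` are assumed to be installed.

```python
# Minimal sketch (not from the commit): applying a LoRA adapter to a LLaMA-7b
# base model with peft. The paths/repo ids below are placeholders.
import torch
from transformers import LlamaForCausalLM
from peft import PeftModel

base = LlamaForCausalLM.from_pretrained(
    "path/to/llama-7b-hf",      # placeholder: any LLaMA-7b checkpoint in HF format
    torch_dtype=torch.float16,
)
model = PeftModel.from_pretrained(base, "path/to/this-adapter-repo")  # placeholder adapter id
model.eval()
```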
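The hyperparameters listed in the updated README map naturally onto a `peft` `LoraConfig`. The sketch below is an assumption about how such a configuration could be written, not the training script behind this commit: only `r`, `lora_alpha`, and `target_modules` come from the README, while the dropout, bias, and task-type settings are assumed defaults.

```python
# Sketch of a peft LoraConfig matching the README's stated hyperparameters.
# r, lora_alpha, and target_modules come from the README; the rest are assumptions.
from peft import LoraConfig

lora_config = LoraConfig(
    r=8,                                                       # "Lora _r_: 8"
    lora_alpha=32,                                             # "Lora Alpha: 32"
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],   # README's target modules
    lora_dropout=0.05,                                         # assumption: not stated in the README
    bias="none",                                               # assumption: peft default
    task_type="CAUSAL_LM",
)
```

Gradient checkpointing, mentioned in the README, can be enabled on the base model with `gradient_checkpointing_enable()`; how flash attention was wired into training is not specified in the README, so it is not shown here.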