Commit eadbb15
1 Parent(s): 732a0ac
Update README.md
README.md CHANGED
@@ -6,7 +6,11 @@ datasets:
 - yahma/alpaca-cleaned
 ---
 
-This repo contains a low-rank adapter for LLaMA-7b fit on
+This repo contains a low-rank adapter for **LLaMA-7b** fit on
+- `Nebulous/gpt4all_pruned`
+- `sahil2801/CodeAlpaca-20k`
+- `yahma/alpaca-cleaned`
+- datasets part of the OpenAssistant project.
 
 
 This version of the weights was trained with the following hyperparameters:
@@ -15,5 +19,8 @@ This version of the weights was trained with the following hyperparameters:
 - Batch size: 128
 - Max Length: 2048
 - Learning rate: 4e-6
-- Lora _r_:
-- Lora
+- Lora _r_: 8
+- Lora Alpha: 32
+- Lora target modules: q_proj, k_proj, v_proj, o_proj
+
+The model was trained with flash attention and gradient checkpointing.
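For context on how a low-rank adapter like the one in this repo is typically used, below is a minimal sketch of loading it on top of a LLaMA-7b base model with the Hugging Face `peft` library. This is not part of the commit: the base checkpoint path and adapter repo id are placeholders, and `transformers`/`peft` are assumed to be installed.

```python
# Minimal sketch (not from the commit): applying a LoRA adapter to a LLaMA-7b
# base model with peft. The paths/repo ids below are placeholders.
import torch
from transformers import LlamaForCausalLM
from peft import PeftModel

base = LlamaForCausalLM.from_pretrained(
    "path/to/llama-7b-hf",      # placeholder: any LLaMA-7b checkpoint in HF format
    torch_dtype=torch.float16,
)
model = PeftModel.from_pretrained(base, "path/to/this-adapter-repo")  # placeholder adapter id
model.eval()
```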
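The hyperparameters listed in the updated README map naturally onto a `peft` `LoraConfig`. The sketch below is an assumption about how such a configuration could be written, not the training script behind this commit: only `r`, `lora_alpha`, and `target_modules` come from the README, while the dropout, bias, and task-type settings are assumed defaults.

```python
# Sketch of a peft LoraConfig matching the README's stated hyperparameters.
# r, lora_alpha, and target_modules come from the README; the rest are assumptions.
from peft import LoraConfig

lora_config = LoraConfig(
    r=8,                                                       # "Lora _r_: 8"
    lora_alpha=32,                                             # "Lora Alpha: 32"
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],   # README's target modules
    lora_dropout=0.05,                                         # assumption: not stated in the README
    bias="none",                                               # assumption: peft default
    task_type="CAUSAL_LM",
)
```

Gradient checkpointing, mentioned in the README, can be enabled on the base model with `gradient_checkpointing_enable()`; how flash attention was wired into training is not specified in the README, so it is not shown here.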