Add comprehensive model card for Meta-Llama-3-8B-Instruct fine-tuned on xLAM
README.md CHANGED

```diff
@@ -1,3 +1,4 @@
+
 ---
 license: cc-by-nc-4.0
 tags:
```
```diff
@@ -31,7 +32,7 @@ This is a fine-tuned version of the Meta-Llama-3-8B-Instruct model. The model wa
 - **Finetuned from model:** meta-llama/Meta-Llama-3-8B-Instruct
 - **Model size:** Meta-Llama-3-8B-Instruct parameters
 - **Vocab size:** 128,256 tokens
-- **Max sequence length:**
+- **Max sequence length:** 2,048 tokens
 - **Tensor type:** BF16
 - **Pad token:** `<|eot_id|>` (ID: 128009)
 
```
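The pad token and sequence length filled in above translate directly into tokenizer settings. A minimal sketch, assuming the standard `transformers` API and access to the gated base checkpoint (the example input is illustrative, not from the card):

```python
# Sketch: mirror the card's pad token (`<|eot_id|>`, ID 128009) and its
# 2,048-token limit when preparing inputs. Assumes `transformers` is
# installed and access to the gated base model has been granted.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")

# Llama 3 ships without a pad token; the card designates `<|eot_id|>`.
tokenizer.pad_token = "<|eot_id|>"
assert tokenizer.pad_token_id == 128009

# Truncate and pad to the card's stated max sequence length.
batch = tokenizer(
    ["What is the weather in San Francisco?"],  # illustrative input
    padding="max_length",
    truncation=True,
    max_length=2048,
    return_tensors="pt",
)
print(batch["input_ids"].shape)  # torch.Size([1, 2048])
```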
```diff
@@ -47,11 +48,11 @@ The model was fine-tuned using the following configuration:
 
 ### Training Parameters
 - **Learning Rate:** 0.0001
-- **Batch Size:**
-- **Gradient Accumulation Steps:**
-- **Max Training Steps:**
+- **Batch Size:** 16
+- **Gradient Accumulation Steps:** 8
+- **Max Training Steps:** 1,000
 - **Warmup Ratio:** 0.1
-- **Max Sequence Length:**
+- **Max Sequence Length:** 2,048
 - **Output Directory:** ./Meta_Llama_3_8B_Instruct_xLAM
 
 ### LoRA Configuration
```
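The hyperparameters added in this hunk map one-to-one onto a PEFT/Transformers training setup. A hedged sketch, not the author's actual script: only the `TrainingArguments` values come from the diff, and the `LoraConfig` numbers are placeholders, since the card's LoRA Configuration section falls outside this hunk:

```python
# Sketch re-expressing the card's listed hyperparameters with peft/transformers.
from peft import LoraConfig
from transformers import TrainingArguments

lora_config = LoraConfig(
    r=16,               # placeholder, not from the card
    lora_alpha=32,      # placeholder
    lora_dropout=0.05,  # placeholder
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="./Meta_Llama_3_8B_Instruct_xLAM",  # Output Directory
    learning_rate=1e-4,              # Learning Rate: 0.0001
    per_device_train_batch_size=16,  # Batch Size
    gradient_accumulation_steps=8,   # Gradient Accumulation Steps
    max_steps=1000,                  # Max Training Steps
    warmup_ratio=0.1,                # Warmup Ratio
    bf16=True,                       # Tensor type: BF16
)
```

With a per-device batch size of 16 and 8 gradient accumulation steps, each optimizer update sees an effective batch of 128 sequences per device.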
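For completeness, a loading sketch under the same assumptions; the fine-tuned repository id below is a placeholder, since the commit page does not show it:

```python
# Sketch: load the base model in BF16 and attach the LoRA adapter.
# "your-org/Meta_Llama_3_8B_Instruct_xLAM" is a hypothetical repo id.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B-Instruct",
    torch_dtype=torch.bfloat16,  # matches the card's BF16 tensor type
)
model = PeftModel.from_pretrained(base, "your-org/Meta_Llama_3_8B_Instruct_xLAM")
model.eval()
```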