ermiaazarkhalili committed on
Commit f81f656 · verified · 1 Parent(s): f5b6d6a

Add comprehensive model card for Meta-Llama-3-8B-Instruct fine-tuned on xLAM

Files changed (1)
  1. README.md +6 -5
README.md CHANGED
@@ -1,3 +1,4 @@
+
 ---
 license: cc-by-nc-4.0
 tags:
@@ -31,7 +32,7 @@ This is a fine-tuned version of the Meta-Llama-3-8B-Instruct model. The model wa
 - **Finetuned from model:** meta-llama/Meta-Llama-3-8B-Instruct
 - **Model size:** Meta-Llama-3-8B-Instruct parameters
 - **Vocab size:** 128,256 tokens
-- **Max sequence length:** 512 tokens
+- **Max sequence length:** 2,048 tokens
 - **Tensor type:** BF16
 - **Pad token:** `<|eot_id|>` (ID: 128009)
 
@@ -47,11 +48,11 @@ The model was fine-tuned using the following configuration:
 
 ### Training Parameters
 - **Learning Rate:** 0.0001
-- **Batch Size:** 8
-- **Gradient Accumulation Steps:** 4
-- **Max Training Steps:** 10
+- **Batch Size:** 16
+- **Gradient Accumulation Steps:** 8
+- **Max Training Steps:** 1,000
 - **Warmup Ratio:** 0.1
-- **Max Sequence Length:** 512
+- **Max Sequence Length:** 2,048
 - **Output Directory:** ./Meta_Llama_3_8B_Instruct_xLAM
 
 ### LoRA Configuration
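
For context, the updated hyperparameters map fairly directly onto a Hugging Face `TrainingArguments` plus a PEFT `LoraConfig`. The sketch below is a reconstruction from the card values, not the training script from this commit, and the LoRA numbers are placeholders for the `### LoRA Configuration` section that this hunk only names.

```python
from transformers import TrainingArguments
from peft import LoraConfig

# Reconstructed from the updated model card values in this commit; the actual
# training code is not part of the commit.
training_args = TrainingArguments(
    output_dir="./Meta_Llama_3_8B_Instruct_xLAM",
    learning_rate=1e-4,                  # Learning Rate: 0.0001
    per_device_train_batch_size=16,      # Batch Size: 16
    gradient_accumulation_steps=8,       # Gradient Accumulation Steps: 8
    max_steps=1_000,                     # Max Training Steps: 1,000
    warmup_ratio=0.1,                    # Warmup Ratio: 0.1
    bf16=True,                           # Tensor type: BF16
)
# Max Sequence Length: 2,048 would be enforced at tokenization time or via the
# trainer's max-sequence-length option, depending on the training framework used.

# Placeholder LoRA hyperparameters: the real values live in the README's
# "### LoRA Configuration" section, which this diff does not show.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
```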
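As a usage note, the BF16 tensor type and the pinned pad token translate into a load step roughly like the following; the repository id here is only inferred from the output directory name and may differ from the actual one.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical repo id, inferred from the output directory name above.
model_id = "ermiaazarkhalili/Meta_Llama_3_8B_Instruct_xLAM"

tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token_id = 128009  # Pad token: <|eot_id|> (ID: 128009)

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # Tensor type: BF16
    device_map="auto",
)
```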