tripathiarpan20 committed
Commit 3f87604 · verified · 1 Parent(s): cbe8715

End of training

Files changed (2)
  1. README.md +139 -0
  2. adapter_model.bin +3 -0
README.md ADDED
---
library_name: peft
license: llama3.2
base_model: unsloth/Llama-3.2-1B-Instruct
tags:
- axolotl
- generated_from_trainer
model-index:
- name: tuning-383a850e-bb15-45a2-8f4b
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

[<img src="https://raw.githubusercontent.com/axolotl-ai-cloud/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/axolotl-ai-cloud/axolotl)
<details><summary>See axolotl config</summary>

axolotl version: `0.4.1`
```yaml
adapter: lora
base_model: unsloth/Llama-3.2-1B-Instruct
bf16: auto
chat_template: llama3
dataset_prepared_path: null
datasets:
- path: mhenrichsen/alpaca_2k_test
  type: alpaca
debug: null
deepspeed: null
early_stopping_patience: null
eval_max_new_tokens: 128
eval_table_size: null
evals_per_epoch: 4
flash_attention: true
fp16: null
fsdp: null
fsdp_config: null
gradient_accumulation_steps: 4
gradient_checkpointing: true
group_by_length: false
hub_model_id: tripathiarpan20/tuning-383a850e-bb15-45a2-8f4b
hub_strategy: checkpoint
hub_token: null
learning_rate: 0.0002
load_in_4bit: false
load_in_8bit: true
local_rank: null
logging_steps: 1
lora_alpha: 32
lora_dropout: 0.05
lora_fan_in_fan_out: null
lora_model_dir: null
lora_r: 16
lora_target_linear: true
lr_scheduler: cosine
max_steps: 10
micro_batch_size: 2
mlflow_experiment_name: mhenrichsen/alpaca_2k_test
model_type: LlamaForCausalLM
num_epochs: 1
optimizer: adamw_bnb_8bit
output_dir: miner_id_24
pad_to_sequence_len: true
resume_from_checkpoint: null
s2_attention: null
sample_packing: false
save_steps: 5
save_strategy: steps
sequence_len: 4096
strict: false
tf32: false
tokenizer_type: AutoTokenizer
train_on_inputs: false
val_set_size: 0.05
wandb_entity: tripathiarpan2000-corcel-io
wandb_mode: online
wandb_project: Public_TuningSN
wandb_run: miner_id_24
wandb_runid: 383a850e-bb15-45a2-8f4b
warmup_steps: 10
weight_decay: 0.0
xformers_attention: null

```

</details><br>

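For readers who think in PEFT terms rather than axolotl keys, the LoRA settings above translate roughly as follows. This is a sketch, not part of the committed card: `lora_target_linear: true` lets axolotl resolve the target modules at runtime, so the explicit `target_modules` list for Llama's linear projections is an assumption.

```python
from peft import LoraConfig

# Rough PEFT equivalent of the axolotl LoRA block above (lora_r / lora_alpha / lora_dropout).
# target_modules is an assumption: axolotl's `lora_target_linear: true` targets the
# linear projection layers of the Llama architecture, listed here explicitly.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
)
```
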
# tuning-383a850e-bb15-45a2-8f4b

This model is a fine-tuned version of [unsloth/Llama-3.2-1B-Instruct](https://huggingface.co/unsloth/Llama-3.2-1B-Instruct) on the mhenrichsen/alpaca_2k_test dataset.
It achieves the following results on the evaluation set:
- Loss: 1.2185

## Model description

A LoRA adapter (r=16, lora_alpha=32, dropout 0.05) for unsloth/Llama-3.2-1B-Instruct, trained with axolotl on the Alpaca-format mhenrichsen/alpaca_2k_test dataset (see the config above).

## Intended uses & limitations

More information needed

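Pending a fuller usage section, here is a minimal, untested loading sketch. It assumes the adapter in this repository applies cleanly on top of the base model and that the `llama3` chat template configured above is the intended prompt format; the example prompt is purely illustrative.

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "unsloth/Llama-3.2-1B-Instruct"
adapter_id = "tripathiarpan20/tuning-383a850e-bb15-45a2-8f4b"

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)
model = PeftModel.from_pretrained(model, adapter_id)  # attach the LoRA adapter

# The config above sets `chat_template: llama3`, so prompt via the chat template.
messages = [{"role": "user", "content": "Give three tips for staying healthy."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
output = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```
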
## Training and evaluation data

The training data is mhenrichsen/alpaca_2k_test in Alpaca format, with 5% held out for evaluation (`val_set_size: 0.05`).

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a quick consistency check on the effective batch size follows the list):
- learning_rate: 0.0002
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 8
- optimizer: AdamW (8-bit, bitsandbytes) with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 10
- training_steps: 10

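The effective batch size follows directly from the per-device batch size and gradient accumulation; a quick check, assuming a single training device:

```python
micro_batch_size = 2                # train_batch_size per device
gradient_accumulation_steps = 4
num_devices = 1                     # assumption: single-GPU run

total_train_batch_size = micro_batch_size * gradient_accumulation_steps * num_devices
assert total_train_batch_size == 8  # matches the value reported above
```
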
### Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 1.3218        | 0.0042 | 1    | 1.2625          |
| 1.3154        | 0.0126 | 3    | 1.2611          |
| 1.4776        | 0.0253 | 6    | 1.2118          |
| 1.283         | 0.0379 | 9    | 1.2185          |


### Framework versions

- PEFT 0.13.2
- Transformers 4.45.2
- Pytorch 2.4.1+cu124
- Datasets 3.0.1
- Tokenizers 0.20.1
adapter_model.bin ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:d5ae62449ce6458f434e6476cf11e97dfb19197d1d3f83c03f27b9a254e425a3
size 45169354
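adapter_model.bin is stored via Git LFS, and the pointer above records the blob's SHA-256. A small sketch for fetching the file and checking it against that digest (assuming the file is unchanged at the current revision):

```python
import hashlib
from huggingface_hub import hf_hub_download

expected = "d5ae62449ce6458f434e6476cf11e97dfb19197d1d3f83c03f27b9a254e425a3"

# Download the adapter weights from the Hub and verify the LFS pointer's checksum.
path = hf_hub_download(
    repo_id="tripathiarpan20/tuning-383a850e-bb15-45a2-8f4b",
    filename="adapter_model.bin",
)
with open(path, "rb") as f:
    digest = hashlib.sha256(f.read()).hexdigest()

print("checksum ok" if digest == expected else "checksum mismatch")
```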