End of training

Files changed (5)

README.md CHANGED

@@ -18,7 +18,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [TheBloke/zephyr-7B-beta-GPTQ](https://huggingface.co/TheBloke/zephyr-7B-beta-GPTQ) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.2466
 ## Model description
@@ -46,14 +46,23 @@ The following hyperparameters were used during training:
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 2
-- training_steps: 250
 - mixed_precision_training: Native AMP
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 0.2601        | 0.03  | 250  | 0.2466          |
 ### Framework versions

 This model is a fine-tuned version of [TheBloke/zephyr-7B-beta-GPTQ](https://huggingface.co/TheBloke/zephyr-7B-beta-GPTQ) on the None dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.2505
 ## Model description
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 2
+- training_steps: 500
 - mixed_precision_training: Native AMP
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
+| 0.3021        | 0.01  | 50   | 0.2990          |
+| 0.2768        | 0.01  | 100  | 0.2823          |
+| 0.2781        | 0.02  | 150  | 0.2751          |
+| 0.2701        | 0.02  | 200  | 0.2724          |
+| 0.2594        | 0.03  | 250  | 0.2634          |
+| 0.2496        | 0.03  | 300  | 0.2591          |
+| 0.2975        | 0.04  | 350  | 0.2560          |
+| 0.2443        | 0.04  | 400  | 0.2535          |
+| 0.277         | 0.05  | 450  | 0.2512          |
+| 0.2407        | 0.05  | 500  | 0.2505          |
 ### Framework versions
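
Doubling `training_steps` from 250 to 500 also stretches the linear learning-rate decay, which may be part of why the step-250 validation loss differs between the two runs (0.2466 before, 0.2634 after). The sketch below computes only the learning-rate multiplier, since the base learning rate is not visible in this hunk. It assumes the usual "linear warmup, then linear decay to zero" formula, and `linear_lr_multiplier` is a hypothetical helper name, not something from the repository.

```python
def linear_lr_multiplier(step: int, warmup_steps: int, total_steps: int) -> float:
    """Fraction of the base learning rate in effect at `step`.

    Assumption: linear warmup from 0 to 1 over `warmup_steps`, then linear
    decay from 1 down to 0 at `total_steps`.
    """
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    return max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))


# Values taken from the README diff: warmup is 2 steps in both runs.
old_run = linear_lr_multiplier(250, warmup_steps=2, total_steps=250)
new_run = linear_lr_multiplier(250, warmup_steps=2, total_steps=500)
print(f"step 250 in the 250-step run: {old_run:.3f}")
print(f"step 250 in the 500-step run: {new_run:.3f}")
```

Under these assumptions the 250-step run has fully decayed by step 250, while the 500-step run is still at roughly half of the base rate at the same step.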

adapter_config.json CHANGED

@@ -21,8 +21,8 @@
   "target_modules": [
     "k_proj",
     "o_proj",
-    "v_proj",
-    "q_proj"
   ],
   "task_type": "CAUSAL_LM",
   "use_rslora": false

   "target_modules": [
     "k_proj",
     "o_proj",
+    "q_proj",
+    "v_proj"
   ],
   "task_type": "CAUSAL_LM",
   "use_rslora": false
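
This hunk only reorders `q_proj` and `v_proj`. A quick way to confirm that is to compare the two lists as sets, as sketched below. The JSON strings are trimmed copies of the two sides of the hunk, and the sketch assumes that the order of `target_modules` is not significant to whatever loads the file.

```python
import json

# Trimmed copies of the old and new sides of the hunk above.
old_config = json.loads('{"target_modules": ["k_proj", "o_proj", "v_proj", "q_proj"]}')
new_config = json.loads('{"target_modules": ["k_proj", "o_proj", "q_proj", "v_proj"]}')

old_modules = set(old_config["target_modules"])
new_modules = set(new_config["target_modules"])

# Set comparison ignores ordering, so a pure reorder shows up as "no change".
same_modules = old_modules == new_modules
print("same modules:", same_modules)
print("added:", sorted(new_modules - old_modules))
print("removed:", sorted(old_modules - new_modules))
```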

adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:c256d3df717cda48fb83933d0b59364080dda84fc6561499d3c8ea03d8e37faa
 size 54560368

 version https://git-lfs.github.com/spec/v1
+oid sha256:b17cf761836b9715bee0515a3c20847cb238f8c36b452677307a8df25b08ba83
 size 54560368
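
The diff for this file shows only a Git LFS pointer: the `oid` changes while `size` stays at 54560368, which is consistent with same-shaped adapter weights holding different values. A minimal sketch of parsing such a pointer and checking a downloaded file against it follows. `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, and the sketch handles only the `sha256` oid form shown here.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Split each 'key value' line of a Git LFS pointer into a dict."""
    fields = dict(line.strip().split(" ", 1) for line in text.strip().splitlines())
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: Path, pointer: dict) -> bool:
    """Check a local file's size and SHA-256 digest against a parsed pointer."""
    if pointer["algo"] != "sha256":
        raise ValueError(f"unsupported oid algorithm: {pointer['algo']}")
    data = path.read_bytes()  # fine for a sketch; stream in chunks for large files
    return len(data) == pointer["size"] and hashlib.sha256(data).hexdigest() == pointer["digest"]


# The new side of the hunk above.
new_pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:b17cf761836b9715bee0515a3c20847cb238f8c36b452677307a8df25b08ba83\n"
    "size 54560368\n"
)
print(new_pointer["algo"], new_pointer["size"])
```

Calling `matches_pointer(Path("adapter_model.safetensors"), new_pointer)` on a local copy would then report whether that copy is the revision this commit points at.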

tokenizer.json CHANGED

@@ -1,14 +1,12 @@
 {
   "version": "1.0",
-  "truncation": null,
-  "padding": {
-    "strategy": "BatchLongest",
-    "direction": "Right",
-    "pad_to_multiple_of": null,
-    "pad_id": 2,
-    "pad_type_id": 0,
-    "pad_token": "</s>"
   },
   "added_tokens": [
     {
       "id": 0,

 {
   "version": "1.0",
+  "truncation": {
+    "direction": "Left",
+    "max_length": 512,
+    "strategy": "LongestFirst",
+    "stride": 0
   },
+  "padding": null,
   "added_tokens": [
     {
       "id": 0,
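
The new side of this hunk replaces the padding block with a truncation block (`direction: Left`, `max_length: 512`, `stride: 0`). To illustrate what left-direction truncation means for a single sequence, here is a plain-Python stand-in. It is not the tokenizers library's implementation, it ignores special tokens, and `truncate_ids` is a hypothetical helper name.

```python
def truncate_ids(ids: list[int], max_length: int = 512, direction: str = "left") -> list[int]:
    """Illustrative stand-in for single-sequence truncation.

    direction="left" drops tokens from the start and keeps the last `max_length`;
    direction="right" drops tokens from the end and keeps the first `max_length`.
    """
    if len(ids) <= max_length:
        return ids
    return ids[-max_length:] if direction == "left" else ids[:max_length]


ids = list(range(600))          # pretend these are 600 token ids
kept = truncate_ids(ids)        # defaults mirror the new tokenizer.json values
print(len(kept), kept[0], kept[-1])
```

With left truncation the most recent tokens survive, which is often what a chat-style prompt wants. Note that `"padding": null` on the new side means the saved tokenizer no longer pads batches by default.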

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:0b46faeb8c9b52ee0259c2d2afa8f09dd6ed0cd75ade83c2c77f5cd5356e44fa
 size 4728

 version https://git-lfs.github.com/spec/v1
+oid sha256:200a3f43fe4dac28371f3b43d0a3417321b723563359f579193a6bb320c9e279
 size 4728