---
license: apache-2.0
base_model: teknium/OpenHermes-2.5-Mistral-7B
tags:
- generated_from_trainer
model-index:
- name: openhermes-mistral-2.5-7b-dpo-test
results: []
---
<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->
# openhermes-mistral-2.5-7b-dpo-test
This model is a DPO fine-tuned version of [teknium/OpenHermes-2.5-Mistral-7B](https://huggingface.co/teknium/OpenHermes-2.5-Mistral-7B) on a preference dataset that was not recorded in the training metadata.
It achieves the following results on the evaluation set (the reward metrics are explained in the sketch after the list):
- Loss: 0.4487
- Rewards/chosen: -0.2951
- Rewards/rejected: -2.2421
- Rewards/accuracies: 0.875
- Rewards/margins: 1.9470
- Logps/rejected: -257.4751
- Logps/chosen: -204.3027
- Logits/rejected: -3.0752
- Logits/chosen: -3.0485
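
The `Rewards/*` values are DPO's implicit rewards: the policy-vs-reference log-probability gap on the chosen and rejected completions, scaled by `beta`. Below is a minimal sketch of how these metrics relate to one another; the `beta=0.1` value and the reference-model log-probs are illustrative assumptions, as neither is recorded in this card.

```python
import torch

def implicit_dpo_reward(policy_logps: torch.Tensor, ref_logps: torch.Tensor, beta: float = 0.1) -> torch.Tensor:
    """Implicit DPO reward: beta * (log pi_theta(y|x) - log pi_ref(y|x))."""
    return beta * (policy_logps - ref_logps)

# Toy example: policy log-probs roughly match the Logps/* values above,
# reference log-probs are made up for illustration (not recorded in this card).
policy_chosen, ref_chosen = torch.tensor([-204.3]), torch.tensor([-201.4])
policy_rejected, ref_rejected = torch.tensor([-257.5]), torch.tensor([-235.1])

chosen_rewards = implicit_dpo_reward(policy_chosen, ref_chosen)        # Rewards/chosen
rejected_rewards = implicit_dpo_reward(policy_rejected, ref_rejected)  # Rewards/rejected
margins = chosen_rewards - rejected_rewards                            # Rewards/margins
accuracy = (chosen_rewards > rejected_rewards).float().mean()          # Rewards/accuracies
```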
## Model description
More information needed
## Intended uses & limitations
More information needed
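
As a rough illustration only: the checkpoint should load like any Mistral-based causal LM and is assumed to inherit the base model's ChatML prompt format. The repository id below is a placeholder, not a confirmed path.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repo id; substitute the actual Hub path of this checkpoint.
model_id = "your-org/openhermes-mistral-2.5-7b-dpo-test"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype="auto")

# OpenHermes-2.5 uses the ChatML prompt format, which this DPO fine-tune is assumed to inherit.
prompt = (
    "<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n"
    "<|im_start|>user\nExplain DPO in one sentence.<|im_end|>\n"
    "<|im_start|>assistant\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```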
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (a configuration sketch follows the list):
- learning_rate: 0.0001
- train_batch_size: 1
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 2
- training_steps: 200
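
A hedged sketch of how these settings could map onto a TRL `DPOTrainer` run is shown below. The preference dataset, `beta`, and any adapter or quantization settings are not recorded in this card; `preference_dataset` and `beta=0.1` are placeholders, and the evaluation/logging intervals are inferred from the results table.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import DPOTrainer

base = "teknium/OpenHermes-2.5-Mistral-7B"
model = AutoModelForCausalLM.from_pretrained(base)
model_ref = AutoModelForCausalLM.from_pretrained(base)  # frozen reference policy
tokenizer = AutoTokenizer.from_pretrained(base)

# Placeholder: the actual preference dataset is not recorded in this card.
# It must provide `prompt`, `chosen` and `rejected` columns.
preference_dataset = ...

# Hyperparameters taken from the list above; output_dir is arbitrary.
training_args = TrainingArguments(
    output_dir="openhermes-mistral-2.5-7b-dpo-test",
    per_device_train_batch_size=1,
    per_device_eval_batch_size=8,
    learning_rate=1e-4,
    lr_scheduler_type="linear",
    warmup_steps=2,
    max_steps=200,
    seed=42,
    evaluation_strategy="steps",
    eval_steps=10,       # matches the 10-step evaluation interval in the table below
    logging_steps=10,
)

# beta=0.1 is TRL's default and an assumption, not a recorded value.
trainer = DPOTrainer(
    model,
    model_ref,
    args=training_args,
    beta=0.1,
    train_dataset=preference_dataset["train"],
    eval_dataset=preference_dataset["test"],
    tokenizer=tokenizer,
)
trainer.train()
```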
### Training results
| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
|:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
| 0.1645 | 0.01 | 10 | 0.5339 | 0.3993 | -0.1483 | 0.6875 | 0.5476 | -236.5374 | -197.3593 | -3.1575 | -3.1872 |
| 0.0519 | 0.01 | 20 | 0.5521 | 0.2239 | -0.4486 | 0.625 | 0.6725 | -239.5405 | -199.1127 | -3.1969 | -3.2456 |
| 0.1618 | 0.01 | 30 | 0.5866 | -0.0538 | -0.8893 | 0.5625 | 0.8355 | -243.9472 | -201.8902 | -3.2286 | -3.2525 |
| 0.1752 | 0.02 | 40 | 0.5943 | -0.2184 | -1.2057 | 0.5 | 0.9873 | -247.1112 | -203.5360 | -3.2201 | -3.2477 |
| 0.3811 | 0.03 | 50 | 0.6973 | -0.6180 | -1.8146 | 0.5 | 1.1966 | -253.2001 | -207.5316 | -3.1943 | -3.2034 |
| 1.158 | 0.03 | 60 | 0.6347 | -0.4710 | -1.7363 | 0.5625 | 1.2653 | -252.4173 | -206.0622 | -3.1655 | -3.1197 |
| 0.8751 | 0.04 | 70 | 0.6103 | -0.4061 | -1.5966 | 0.5625 | 1.1905 | -251.0201 | -205.4132 | -3.1360 | -3.0544 |
| 0.7811 | 0.04 | 80 | 0.6405 | -0.4774 | -1.6574 | 0.5625 | 1.1799 | -251.6278 | -206.1260 | -3.1337 | -3.0492 |
| 1.4305 | 0.04 | 90 | 0.6257 | -0.4784 | -1.6184 | 0.5625 | 1.1399 | -251.2379 | -206.1361 | -3.1251 | -3.0489 |
| 0.5478 | 0.05 | 100 | 0.6191 | -0.5317 | -1.7067 | 0.5625 | 1.1750 | -252.1214 | -206.6691 | -3.1207 | -3.0753 |
| 0.6344 | 0.06 | 110 | 0.5691 | -0.4827 | -1.7734 | 0.5625 | 1.2907 | -252.7882 | -206.1789 | -3.1075 | -3.0806 |
| 0.5405 | 0.06 | 120 | 0.5337 | -0.4681 | -2.1739 | 0.8125 | 1.7058 | -256.7935 | -206.0332 | -3.1124 | -3.0733 |
| 0.7848 | 0.07 | 130 | 0.5390 | -0.5288 | -2.3789 | 0.8125 | 1.8501 | -258.8436 | -206.6404 | -3.1019 | -3.0628 |
| 1.3119 | 0.07 | 140 | 0.4753 | -0.3276 | -2.0907 | 0.875 | 1.7631 | -255.9614 | -204.6279 | -3.0904 | -3.0648 |
| 0.3636 | 0.07 | 150 | 0.4555 | -0.2566 | -2.0064 | 0.625 | 1.7498 | -255.1179 | -203.9175 | -3.0804 | -3.0640 |
| 0.427 | 0.08 | 160 | 0.4614 | -0.2900 | -2.0804 | 0.625 | 1.7904 | -255.8585 | -204.2518 | -3.0721 | -3.0518 |
| 0.8971 | 0.09 | 170 | 0.4629 | -0.3117 | -2.1791 | 0.875 | 1.8673 | -256.8448 | -204.4694 | -3.0711 | -3.0468 |
| 0.6219 | 0.09 | 180 | 0.4560 | -0.3042 | -2.2114 | 0.875 | 1.9073 | -257.1686 | -204.3934 | -3.0743 | -3.0485 |
| 0.7551 | 0.1 | 190 | 0.4520 | -0.3007 | -2.2400 | 0.875 | 1.9392 | -257.4540 | -204.3593 | -3.0755 | -3.0481 |
| 1.0917 | 0.1 | 200 | 0.4487 | -0.2951 | -2.2421 | 0.875 | 1.9470 | -257.4751 | -204.3027 | -3.0752 | -3.0485 |
### Framework versions
- Transformers 4.34.1
- Pytorch 2.1.0+cu121
- Datasets 2.14.6
- Tokenizers 0.14.1