# qwen2.5-0.5b-expo-DPO-25-2
This model is a fine-tuned version of hZzy/qwen2.5-0.5b-sft-25-2 on the hZzy/train_pairwise_all_new dataset. It achieves the following results on the evaluation set:
- Loss: 0.6575
- Logps: -92.0019
- Logits: -1.5710
- Objective: 0.6607
- DPO Loss: 0.6607 (see the note on the DPO objective below)
- Ranking Simple: 0.5576
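The card does not state which DPO variant or β value was used; for reference, the standard DPO objective (Rafailov et al., 2023), which this loss presumably corresponds to, is:

$$
\mathcal{L}_{\mathrm{DPO}}(\pi_\theta;\pi_{\mathrm{ref}}) =
-\,\mathbb{E}_{(x,\,y_w,\,y_l)\sim\mathcal{D}}
\left[\log \sigma\!\left(\beta \log \frac{\pi_\theta(y_w\mid x)}{\pi_{\mathrm{ref}}(y_w\mid x)}
- \beta \log \frac{\pi_\theta(y_l\mid x)}{\pi_{\mathrm{ref}}(y_l\mid x)}\right)\right]
$$

where $y_w$ and $y_l$ are the preferred and rejected completions for prompt $x$, $\pi_{\mathrm{ref}}$ is the frozen SFT reference policy, and $\beta$ controls the strength of the implicit KL regularization. Ranking Simple is presumably the fraction of evaluation pairs on which the policy ranks the preferred completion above the rejected one.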
## Model description
More information needed
## Intended uses & limitations
More information needed
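In the absence of documented usage, here is a minimal inference sketch with `transformers`; the Hub repo id is an assumption inferred from the model name above.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumption: the model is hosted under this repo id
# (inferred from the model name above; adjust if it differs).
model_id = "hZzy/qwen2.5-0.5b-expo-DPO-25-2"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "Explain the difference between DPO and PPO in one sentence."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```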
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a sketch of how they might map onto a DPO training setup follows the list):
- learning_rate: 5e-06
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- distributed_type: multi-GPU
- num_devices: 3
- gradient_accumulation_steps: 12
- total_train_batch_size: 144
- total_eval_batch_size: 12
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 1
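The card does not name the training framework. As a minimal sketch, here is how the hyperparameters above would map onto TRL's `DPOTrainer`, assuming that (or an equivalent DPO implementation) was used; the β value and prompt formatting are not reported, so they are left at TRL defaults.

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

# Assumption: TRL-style DPO training; the card does not confirm the framework.
base_id = "hZzy/qwen2.5-0.5b-sft-25-2"
model = AutoModelForCausalLM.from_pretrained(base_id)
tokenizer = AutoTokenizer.from_pretrained(base_id)

# Assumption: the dataset exposes a "train" split with prompt/chosen/rejected columns.
dataset = load_dataset("hZzy/train_pairwise_all_new")

# Hyperparameters from the list above; effective train batch size is
# 4 per device x 3 GPUs x 12 accumulation steps = 144.
# Adam betas (0.9, 0.999) and epsilon 1e-08 are the TrainingArguments defaults.
args = DPOConfig(
    output_dir="qwen2.5-0.5b-expo-DPO-25-2",
    learning_rate=5e-6,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=12,
    num_train_epochs=1,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    seed=42,
)

trainer = DPOTrainer(
    model=model,          # ref_model=None: TRL clones the model as the frozen reference
    args=args,
    train_dataset=dataset["train"],
    tokenizer=tokenizer,  # newer TRL releases use processing_class= instead
)
trainer.train()
```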
### Training results
| Training Loss | Epoch | Step | Validation Loss | Logps | Logits | Objective | DPO Loss | Ranking Simple |
|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| 0.6788 | 0.0719 | 50 | 0.6864 | -101.0843 | -1.4425 | 0.6873 | 0.6873 | 0.5170 |
| 0.6396 | 0.1438 | 100 | 0.6796 | -96.6355 | -1.6394 | 0.6831 | 0.6831 | 0.5284 |
| 0.6428 | 0.2157 | 150 | 0.6731 | -90.4612 | -1.3249 | 0.6741 | 0.6741 | 0.5422 |
| 0.6108 | 0.2876 | 200 | 0.6792 | -85.5977 | -1.4063 | 0.6795 | 0.6795 | 0.5468 |
| 0.5856 | 0.3595 | 250 | 0.6623 | -87.8055 | -1.3858 | 0.6615 | 0.6615 | 0.5459 |
| 0.5701 | 0.4314 | 300 | 0.6633 | -88.7819 | -1.4784 | 0.6662 | 0.6662 | 0.5449 |
| 0.5296 | 0.5034 | 350 | 0.6642 | -89.5468 | -1.5418 | 0.6664 | 0.6664 | 0.5457 |
| 0.5055 | 0.5753 | 400 | 0.6579 | -89.5613 | -1.4127 | 0.6593 | 0.6593 | 0.5535 |
| 0.5493 | 0.6472 | 450 | 0.6599 | -90.8471 | -1.5076 | 0.6628 | 0.6628 | 0.5557 |
| 0.5056 | 0.7191 | 500 | 0.6561 | -91.5462 | -1.4783 | 0.6591 | 0.6591 | 0.5541 |
| 0.475 | 0.7910 | 550 | 0.6573 | -90.8545 | -1.5168 | 0.6599 | 0.6599 | 0.5581 |
| 0.4881 | 0.8629 | 600 | 0.6582 | -91.8983 | -1.5757 | 0.6610 | 0.6610 | 0.5565 |
| 0.4496 | 0.9348 | 650 | 0.6576 | -91.9025 | -1.5710 | 0.6608 | 0.6608 | 0.5573 |
### Framework versions
- Transformers 4.42.0
- Pytorch 2.3.0+cu121
- Datasets 3.2.0
- Tokenizers 0.19.1
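To approximate the training environment, the listed versions can be pinned; the CUDA 12.1 wheel index for PyTorch is an assumption based on the `+cu121` tag.

```sh
pip install transformers==4.42.0 datasets==3.2.0 tokenizers==0.19.1
pip install torch==2.3.0 --index-url https://download.pytorch.org/whl/cu121
```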