---
library_name: peft
license: gemma
base_model: google/gemma-2-2b-jpn-it
tags:
- generated_from_trainer
model-index:
- name: budget-model
results: []
---
# budget-model
This model is a fine-tuned version of [google/gemma-2-2b-jpn-it](https://huggingface.co/google/gemma-2-2b-jpn-it) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 0.1566
- MSE: 0.1566
- Helpfulness: 0.4101
- Correctness: 0.4033
- Coherence: 0.2173
- Complexity: 0.2256
- Verbosity: 0.2138
- MAE: 0.2940
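The five attribute scores average exactly to the reported MAE (0.2940), which suggests they are per-attribute mean absolute errors. A minimal sketch of that reading, with random placeholder tensors standing in for real predictions and annotations:

```python
import torch

# Illustrative placeholders: predicted vs. annotated scores for the five
# attributes over a small batch (shape: batch x attributes).
attributes = ["helpfulness", "correctness", "coherence", "complexity", "verbosity"]
preds = torch.randn(8, 5)
labels = torch.randn(8, 5)

mse = ((preds - labels) ** 2).mean()           # overall MSE (the reported Loss/MSE)
per_attr_mae = (preds - labels).abs().mean(0)  # one MAE per attribute column
mae = per_attr_mae.mean()                      # overall MAE = mean of the per-attribute values

for name, value in zip(attributes, per_attr_mae):
    print(f"{name}: {value.item():.4f}")
print(f"MAE: {mae.item():.4f}")
```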
## Model description
This repository appears to contain a PEFT (LoRA) adapter for [google/gemma-2-2b-jpn-it](https://huggingface.co/google/gemma-2-2b-jpn-it), trained as a multi-attribute regression (reward) model: the five tracked attributes (helpfulness, correctness, coherence, complexity, verbosity) match the attribute set used in the HelpSteer datasets, though the actual training data is not recorded here. Beyond that, more information is needed.
## Intended uses & limitations
More information needed
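No usage instructions were recorded. A minimal loading sketch, assuming the adapter is served from a repository id like `naive-puzzle/reward_output` (hypothetical), that the base model is wrapped in a five-label regression head, and that the adapter checkpoint also stores the trained head (e.g. via `modules_to_save`); none of this is confirmed by the card:

```python
import torch
from peft import PeftModel
from transformers import AutoModelForSequenceClassification, AutoTokenizer

base_id = "google/gemma-2-2b-jpn-it"
adapter_id = "naive-puzzle/reward_output"  # hypothetical repository id

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForSequenceClassification.from_pretrained(
    base_id,
    num_labels=5,                  # assumed: one regression score per attribute
    problem_type="regression",
    torch_dtype=torch.bfloat16,
)
model = PeftModel.from_pretrained(base, adapter_id)
model.eval()

inputs = tokenizer("こんにちは、調子はどうですか？", return_tensors="pt")
with torch.no_grad():
    scores = model(**inputs).logits[0]  # five attribute scores
print(scores)
```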
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (see the `TrainingArguments` sketch after this list):
- learning_rate: 2e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 4
- optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 2.0
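A minimal sketch of how these hyperparameters map onto `transformers.TrainingArguments`, assuming the standard `Trainer` API was used; model and dataset setup are omitted, and `output_dir` is a placeholder:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="reward_output",      # placeholder
    learning_rate=2e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=2,   # effective train batch size of 4
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=2.0,
)
```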
### Training results
| Training Loss | Epoch | Step | Validation Loss | MSE | Helpfulness | Correctness | Coherence | Complexity | Verbosity | MAE |
|:-------------:|:------:|:-----:|:---------------:|:------:|:-----------:|:-----------:|:---------:|:----------:|:---------:|:------:|
| 2.7412 | 0.0359 | 500 | 2.4270 | 2.4264 | 1.1650 | 1.1794 | 1.2669 | 1.1851 | 1.2484 | 1.2090 |
| 0.4979 | 0.0719 | 1000 | 0.4255 | 0.4255 | 0.5874 | 0.5992 | 0.4379 | 0.4541 | 0.4394 | 0.5036 |
| 0.2491 | 0.1078 | 1500 | 0.2678 | 0.2679 | 0.5257 | 0.4958 | 0.3171 | 0.3332 | 0.3079 | 0.3959 |
| 0.1797 | 0.1437 | 2000 | 0.2275 | 0.2274 | 0.4893 | 0.4812 | 0.2809 | 0.2982 | 0.2666 | 0.3632 |
| 0.1814 | 0.1797 | 2500 | 0.2081 | 0.2081 | 0.4918 | 0.4689 | 0.2816 | 0.2664 | 0.2434 | 0.3504 |
| 0.1888 | 0.2156 | 3000 | 0.2003 | 0.2003 | 0.4625 | 0.4527 | 0.2519 | 0.2701 | 0.2313 | 0.3337 |
| 0.1525 | 0.2515 | 3500 | 0.1925 | 0.1925 | 0.4575 | 0.4449 | 0.2414 | 0.2499 | 0.2320 | 0.3251 |
| 0.1584 | 0.2875 | 4000 | 0.1877 | 0.1877 | 0.4519 | 0.4427 | 0.2336 | 0.2524 | 0.2296 | 0.3220 |
| 0.1772 | 0.3234 | 4500 | 0.1964 | 0.1964 | 0.4683 | 0.4350 | 0.2331 | 0.3044 | 0.2240 | 0.3330 |
| 0.1722 | 0.3594 | 5000 | 0.1821 | 0.1821 | 0.4515 | 0.4424 | 0.2694 | 0.2389 | 0.2295 | 0.3263 |
| 0.1481 | 0.3953 | 5500 | 0.1793 | 0.1793 | 0.4472 | 0.4428 | 0.2428 | 0.2335 | 0.2454 | 0.3223 |
| 0.1391 | 0.4312 | 6000 | 0.1759 | 0.1758 | 0.4395 | 0.4441 | 0.2412 | 0.2333 | 0.2201 | 0.3156 |
| 0.1638 | 0.4672 | 6500 | 0.1775 | 0.1775 | 0.4499 | 0.4415 | 0.2445 | 0.2329 | 0.2281 | 0.3194 |
| 0.1579 | 0.5031 | 7000 | 0.1802 | 0.1802 | 0.4518 | 0.4499 | 0.2438 | 0.2625 | 0.2124 | 0.3241 |
| 0.1621 | 0.5390 | 7500 | 0.1742 | 0.1742 | 0.4463 | 0.4313 | 0.2370 | 0.2336 | 0.2283 | 0.3153 |
| 0.1522 | 0.5750 | 8000 | 0.1735 | 0.1735 | 0.4381 | 0.4257 | 0.2190 | 0.2275 | 0.2358 | 0.3092 |
| 0.1498 | 0.6109 | 8500 | 0.1736 | 0.1735 | 0.4451 | 0.4363 | 0.2105 | 0.2288 | 0.2185 | 0.3078 |
| 0.144 | 0.6468 | 9000 | 0.1731 | 0.1732 | 0.4343 | 0.4235 | 0.2481 | 0.2431 | 0.2246 | 0.3147 |
| 0.1536 | 0.6828 | 9500 | 0.1835 | 0.1835 | 0.4544 | 0.4514 | 0.3051 | 0.2394 | 0.2400 | 0.3381 |
| 0.1543 | 0.7187 | 10000 | 0.1673 | 0.1673 | 0.4353 | 0.4286 | 0.2422 | 0.2291 | 0.2063 | 0.3083 |
| 0.1238 | 0.7546 | 10500 | 0.1708 | 0.1708 | 0.4310 | 0.4410 | 0.2537 | 0.2508 | 0.2103 | 0.3174 |
| 0.1521 | 0.7906 | 11000 | 0.1634 | 0.1634 | 0.4256 | 0.4182 | 0.2383 | 0.2276 | 0.2172 | 0.3054 |
| 0.1661 | 0.8265 | 11500 | 0.1706 | 0.1706 | 0.4251 | 0.4114 | 0.2043 | 0.2384 | 0.2454 | 0.3049 |
| 0.1509 | 0.8624 | 12000 | 0.1656 | 0.1656 | 0.4279 | 0.4187 | 0.2097 | 0.2270 | 0.2162 | 0.2999 |
| 0.1634 | 0.8984 | 12500 | 0.1698 | 0.1698 | 0.4272 | 0.4205 | 0.2657 | 0.2277 | 0.2273 | 0.3137 |
| 0.1502 | 0.9343 | 13000 | 0.1627 | 0.1627 | 0.4192 | 0.4083 | 0.2015 | 0.2375 | 0.2165 | 0.2966 |
| 0.149 | 0.9702 | 13500 | 0.1664 | 0.1664 | 0.4234 | 0.4096 | 0.2412 | 0.2395 | 0.2281 | 0.3083 |
| 0.1444 | 1.0062 | 14000 | 0.1629 | 0.1629 | 0.4173 | 0.4016 | 0.2123 | 0.2260 | 0.2104 | 0.2935 |
| 0.1186 | 1.0421 | 14500 | 0.1635 | 0.1635 | 0.4218 | 0.4131 | 0.2515 | 0.2330 | 0.2053 | 0.3049 |
| 0.1569 | 1.0781 | 15000 | 0.1629 | 0.1629 | 0.4172 | 0.4096 | 0.2255 | 0.2302 | 0.2354 | 0.3036 |
| 0.1284 | 1.1140 | 15500 | 0.1634 | 0.1634 | 0.4138 | 0.3999 | 0.2133 | 0.2320 | 0.2044 | 0.2927 |
| 0.1415 | 1.1499 | 16000 | 0.1615 | 0.1615 | 0.4139 | 0.4012 | 0.2141 | 0.2277 | 0.2076 | 0.2929 |
| 0.1355 | 1.1859 | 16500 | 0.1619 | 0.1619 | 0.4165 | 0.4036 | 0.2286 | 0.2278 | 0.2202 | 0.2993 |
| 0.1277 | 1.2218 | 17000 | 0.1602 | 0.1601 | 0.4122 | 0.4027 | 0.2148 | 0.2298 | 0.2310 | 0.2981 |
| 0.1165 | 1.2577 | 17500 | 0.1589 | 0.1589 | 0.4173 | 0.4119 | 0.2200 | 0.2274 | 0.2220 | 0.2997 |
| 0.1367 | 1.2937 | 18000 | 0.1572 | 0.1571 | 0.4140 | 0.4063 | 0.2291 | 0.2302 | 0.2082 | 0.2976 |
| 0.1307 | 1.3296 | 18500 | 0.1612 | 0.1612 | 0.4155 | 0.4092 | 0.2496 | 0.2310 | 0.2085 | 0.3028 |
| 0.1469 | 1.3655 | 19000 | 0.1605 | 0.1605 | 0.4188 | 0.4168 | 0.2264 | 0.2279 | 0.2113 | 0.3002 |
| 0.1213 | 1.4015 | 19500 | 0.1579 | 0.1578 | 0.4134 | 0.4095 | 0.2285 | 0.2280 | 0.2187 | 0.2996 |
| 0.1299 | 1.4374 | 20000 | 0.1585 | 0.1585 | 0.4116 | 0.4063 | 0.2375 | 0.2269 | 0.2204 | 0.3006 |
| 0.1096 | 1.4733 | 20500 | 0.1581 | 0.1581 | 0.4108 | 0.4021 | 0.2301 | 0.2283 | 0.2119 | 0.2966 |
| 0.1445 | 1.5093 | 21000 | 0.1576 | 0.1576 | 0.4126 | 0.4028 | 0.2436 | 0.2279 | 0.2170 | 0.3008 |
| 0.1112 | 1.5452 | 21500 | 0.1595 | 0.1595 | 0.4103 | 0.3968 | 0.2047 | 0.2275 | 0.2207 | 0.2920 |
| 0.1487 | 1.5811 | 22000 | 0.1599 | 0.1599 | 0.4101 | 0.3981 | 0.2165 | 0.2283 | 0.2261 | 0.2958 |
| 0.0965 | 1.6171 | 22500 | 0.1572 | 0.1572 | 0.4103 | 0.4012 | 0.2205 | 0.2272 | 0.2094 | 0.2937 |
| 0.1394 | 1.6530 | 23000 | 0.1574 | 0.1574 | 0.4107 | 0.4021 | 0.2150 | 0.2264 | 0.2176 | 0.2944 |
| 0.1233 | 1.6889 | 23500 | 0.1568 | 0.1569 | 0.4110 | 0.4069 | 0.2302 | 0.2264 | 0.2103 | 0.2970 |
| 0.1584 | 1.7249 | 24000 | 0.1565 | 0.1565 | 0.4119 | 0.4023 | 0.2170 | 0.2272 | 0.2102 | 0.2937 |
| 0.1147 | 1.7608 | 24500 | 0.1567 | 0.1568 | 0.4099 | 0.4001 | 0.2170 | 0.2261 | 0.2125 | 0.2931 |
| 0.1392 | 1.7968 | 25000 | 0.1566 | 0.1566 | 0.4107 | 0.4039 | 0.2214 | 0.2253 | 0.2123 | 0.2947 |
| 0.1237 | 1.8327 | 25500 | 0.1565 | 0.1565 | 0.4107 | 0.4020 | 0.2196 | 0.2258 | 0.2135 | 0.2943 |
| 0.1284 | 1.8686 | 26000 | 0.1565 | 0.1565 | 0.4108 | 0.4024 | 0.2162 | 0.2261 | 0.2113 | 0.2933 |
| 0.1243 | 1.9046 | 26500 | 0.1567 | 0.1567 | 0.4106 | 0.4030 | 0.2145 | 0.2259 | 0.2142 | 0.2936 |
| 0.1271 | 1.9405 | 27000 | 0.1566 | 0.1566 | 0.4106 | 0.4028 | 0.2151 | 0.2264 | 0.2141 | 0.2938 |
| 0.1429 | 1.9764 | 27500 | 0.1566 | 0.1566 | 0.4101 | 0.4033 | 0.2173 | 0.2256 | 0.2138 | 0.2940 |
### Framework versions
- PEFT 0.14.0
- Transformers 4.48.3
- PyTorch 2.5.1+cu124
- Datasets 3.3.1
- Tokenizers 0.21.0