This model is a fine-tuned version of HuggingFaceTB/SmolLM2-360M on the kajuma/training_01-09_patch dataset. It achieves the following results on the evaluation set:
More information needed
The following hyperparameters were used during training:
Training results (loss logged every 500 steps):
Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
1.4162 | 0.0439 | 500 | 1.4265 |
1.3632 | 0.0878 | 1000 | 1.3825 |
1.3563 | 0.1317 | 1500 | 1.3339 |
1.2638 | 0.1755 | 2000 | 1.3033 |
1.2974 | 0.2194 | 2500 | 1.2802 |
1.3333 | 0.2633 | 3000 | 1.2623 |
1.254 | 0.3072 | 3500 | 1.2466 |
1.2591 | 0.3511 | 4000 | 1.2318 |
1.2091 | 0.3950 | 4500 | 1.2186 |
1.2803 | 0.4388 | 5000 | 1.2060 |
1.222 | 0.4827 | 5500 | 1.1942 |
1.2236 | 0.5266 | 6000 | 1.1826 |
1.1148 | 0.5705 | 6500 | 1.1723 |
1.2086 | 0.6144 | 7000 | 1.1626 |
1.1524 | 0.6583 | 7500 | 1.1542 |
1.1177 | 0.7022 | 8000 | 1.1471 |
1.1894 | 0.7460 | 8500 | 1.1417 |
1.1384 | 0.7899 | 9000 | 1.1379 |
1.1379 | 0.8338 | 9500 | 1.1350 |
1.1464 | 0.8777 | 10000 | 1.1333 |
1.1579 | 0.9216 | 10500 | 1.1322 |
1.144 | 0.9655 | 11000 | 1.1315 |
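
The losses in the table are easier to compare when converted to perplexity. The sketch below does that for the first and last logged rows, and also estimates the number of optimizer steps per epoch from the Epoch and Step columns. It assumes the reported validation loss is a mean per-token cross-entropy in nats (the usual convention for causal language model training, but not stated in this card); the helper names `perplexity` and `steps_per_epoch` are illustrative only.

```python
import math

# (step, epoch, validation_loss) copied from the first and last table rows.
FIRST_ROW = (500, 0.0439, 1.4265)
LAST_ROW = (11000, 0.9655, 1.1315)


def perplexity(loss: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) to perplexity."""
    return math.exp(loss)


def steps_per_epoch(step: int, epoch: float) -> float:
    """Estimate steps in one full epoch from a logged (step, epoch) pair."""
    return step / epoch


ppl_first = perplexity(FIRST_ROW[2])
ppl_last = perplexity(LAST_ROW[2])
est_steps = steps_per_epoch(LAST_ROW[0], LAST_ROW[1])

print(f"validation perplexity at step {FIRST_ROW[0]}: {ppl_first:.2f}")
print(f"validation perplexity at step {LAST_ROW[0]}: {ppl_last:.2f}")
print(f"estimated steps per epoch: {est_steps:.0f}")
```

Because the Epoch column is rounded to four decimals, the steps-per-epoch figure is only an approximation, and the perplexity values are meaningful only if the nats assumption holds.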