Terjman-Nano-v2.0-512
This model is a fine-tuned version of Helsinki-NLP/opus-mt-en-ar on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 4.5184
- Bleu: 2.2033
- Gen Len: 11.8452
Model description
More information needed
Intended uses & limitations
More information needed
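No usage snippet was provided with the card. As a minimal sketch, assuming the repository id BounharAbdelaziz/Terjman-Nano-v2.0-512 shown on the model page, English-to-Arabic translation with the standard Transformers translation pipeline would look like this:

```python
from transformers import pipeline

# Load the fine-tuned EN->AR model from the Hub
# (repository id assumed from the model page).
translator = pipeline(
    "translation",
    model="BounharAbdelaziz/Terjman-Nano-v2.0-512",
)

result = translator("Hello, how are you today?", max_length=512)
print(result[0]["translation_text"])
```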
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training (a code sketch of these settings follows the list):
- learning_rate: 3e-05
- train_batch_size: 64
- eval_batch_size: 64
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 128
- optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.05
- num_epochs: 40
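These values map onto the standard Seq2SeqTrainingArguments in Transformers. A minimal sketch of the corresponding configuration (the output directory is a placeholder, and this is a reconstruction rather than the exact training script):

```python
from transformers import Seq2SeqTrainingArguments

# Hedged reconstruction of the hyperparameters listed above;
# output_dir is a placeholder, not the original path.
training_args = Seq2SeqTrainingArguments(
    output_dir="terjman-nano-v2.0-512",
    learning_rate=3e-5,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    gradient_accumulation_steps=2,   # effective train batch size of 128
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_ratio=0.05,
    num_train_epochs=40,
    predict_with_generate=True,      # needed to report BLEU and generation length
)
```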
Training results
Training Loss | Epoch | Step | Validation Loss | Bleu | Gen Len |
---|---|---|---|---|---|
9.7501 | 0.5635 | 1000 | 6.1316 | 0.6938 | 11.4881 |
7.6553 | 1.1268 | 2000 | 5.4289 | 1.0007 | 10.4235 |
6.7699 | 1.6903 | 3000 | 5.0480 | 1.2106 | 10.835 |
6.1954 | 2.2536 | 4000 | 4.7737 | 1.7367 | 10.9592 |
6.1181 | 2.8171 | 5000 | 4.6615 | 2.0549 | 11.1088 |
5.9065 | 3.3804 | 6000 | 4.6103 | 2.0903 | 10.9507 |
5.9562 | 3.9439 | 7000 | 4.5762 | 2.1818 | 11.7874 |
5.8241 | 4.5072 | 8000 | 4.5597 | 2.2354 | 11.7704 |
5.9395 | 5.0704 | 9000 | 4.5460 | 2.2286 | 11.7874 |
5.7882 | 5.6340 | 10000 | 4.5379 | 2.2387 | 11.7925 |
5.6615 | 6.1972 | 11000 | 4.5317 | 2.2517 | 11.8214 |
5.7702 | 6.7608 | 12000 | 4.5299 | 2.2277 | 11.8095 |
5.7982 | 7.3240 | 13000 | 4.5274 | 2.248 | 11.8248 |
5.8247 | 7.8876 | 14000 | 4.5253 | 2.2272 | 11.7262 |
5.7944 | 8.4508 | 15000 | 4.5242 | 2.2102 | 11.7483 |
5.7937 | 9.0141 | 16000 | 4.5219 | 2.0453 | 11.8248 |
5.8086 | 9.5776 | 17000 | 4.5242 | 2.2002 | 11.7942 |
5.7118 | 10.1409 | 18000 | 4.5207 | 2.1983 | 11.8197 |
5.6221 | 10.7044 | 19000 | 4.5195 | 2.2475 | 11.8588 |
5.7131 | 11.2677 | 20000 | 4.5189 | 2.2372 | 11.767 |
5.6595 | 11.8312 | 21000 | 4.5183 | 2.2103 | 11.8214 |
5.7572 | 12.3945 | 22000 | 4.5188 | 2.1995 | 11.8827 |
5.7426 | 12.9580 | 23000 | 4.5173 | 2.0773 | 11.7738 |
5.7731 | 13.5213 | 24000 | 4.5184 | 2.2054 | 11.7823 |
5.6443 | 14.0845 | 25000 | 4.5184 | 2.214 | 11.8282 |
5.7615 | 14.6481 | 26000 | 4.5176 | 1.9705 | 11.8214 |
5.6754 | 15.2113 | 27000 | 4.5187 | 2.2401 | 11.8027 |
5.902 | 15.7749 | 28000 | 4.5182 | 2.2285 | 11.7891 |
5.8776 | 16.3381 | 29000 | 4.5175 | 2.1819 | 11.8265 |
5.7233 | 16.9017 | 30000 | 4.5182 | 2.1982 | 11.8061 |
5.732 | 17.4649 | 31000 | 4.5173 | 2.2053 | 11.7891 |
5.7165 | 18.0282 | 32000 | 4.5183 | 2.1991 | 11.8537 |
5.8338 | 18.5917 | 33000 | 4.5188 | 2.1873 | 11.8248 |
5.8152 | 19.1550 | 34000 | 4.5180 | 2.1978 | 11.7568 |
5.597 | 19.7185 | 35000 | 4.5182 | 2.2272 | 11.7976 |
5.7124 | 20.2818 | 36000 | 4.5181 | 2.1915 | 11.8997 |
5.8329 | 20.8453 | 37000 | 4.5184 | 1.9777 | 11.7653 |
5.7707 | 21.4086 | 38000 | 4.5170 | 2.2169 | 11.8418 |
5.8133 | 21.9721 | 39000 | 4.5177 | 2.1797 | 11.881 |
5.7323 | 22.5354 | 40000 | 4.5179 | 2.1909 | 11.8282 |
5.8272 | 23.0986 | 41000 | 4.5180 | 2.2036 | 11.8044 |
5.7333 | 23.6622 | 42000 | 4.5179 | 2.2158 | 11.7891 |
5.7345 | 24.2254 | 43000 | 4.5185 | 1.967 | 11.8112 |
5.7984 | 24.7890 | 44000 | 4.5184 | 2.2096 | 11.7296 |
5.7832 | 25.3522 | 45000 | 4.5179 | 2.1928 | 11.8844 |
5.7056 | 25.9158 | 46000 | 4.5179 | 2.2039 | 11.7908 |
5.6642 | 26.4790 | 47000 | 4.5188 | 2.1819 | 11.7721 |
5.8378 | 27.0423 | 48000 | 4.5175 | 2.172 | 11.8163 |
5.6316 | 27.6058 | 49000 | 4.5177 | 2.1752 | 11.8146 |
5.6802 | 28.1691 | 50000 | 4.5180 | 2.2163 | 11.818 |
5.7301 | 28.7326 | 51000 | 4.5175 | 2.0041 | 11.8163 |
5.7853 | 29.2959 | 52000 | 4.5184 | 2.2214 | 11.8401 |
5.9104 | 29.8594 | 53000 | 4.5183 | 2.1885 | 11.8316 |
5.7037 | 30.4227 | 54000 | 4.5178 | 2.1707 | 11.7925 |
5.6241 | 30.9862 | 55000 | 4.5179 | 2.2225 | 11.7993 |
5.744 | 31.5495 | 56000 | 4.5179 | 2.2003 | 11.8146 |
5.7843 | 32.1127 | 57000 | 4.5177 | 2.2002 | 11.869 |
5.8889 | 32.6762 | 58000 | 4.5180 | 2.2385 | 11.7806 |
5.7761 | 33.2395 | 59000 | 4.5178 | 2.1707 | 11.8027 |
5.8273 | 33.8030 | 60000 | 4.5178 | 2.2267 | 11.8469 |
5.6471 | 34.3663 | 61000 | 4.5186 | 2.1974 | 11.8061 |
5.7228 | 34.9298 | 62000 | 4.5179 | 2.2024 | 11.7721 |
5.6731 | 35.4931 | 63000 | 4.5186 | 2.0918 | 11.7993 |
5.5668 | 36.0564 | 64000 | 4.5177 | 2.1991 | 11.7466 |
5.6701 | 36.6199 | 65000 | 4.5181 | 2.193 | 11.7857 |
5.6992 | 37.1832 | 66000 | 4.5177 | 2.1926 | 11.8452 |
5.6918 | 37.7467 | 67000 | 4.5179 | 2.0086 | 11.7789 |
5.7022 | 38.3099 | 68000 | 4.5183 | 1.974 | 11.7976 |
5.8113 | 38.8735 | 69000 | 4.5175 | 1.9785 | 11.8486 |
5.865 | 39.4367 | 70000 | 4.5184 | 2.2033 | 11.8452 |
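The Bleu and Gen Len columns above are the usual translation metrics reported by Seq2SeqTrainer during evaluation. A hedged sketch of a typical compute_metrics function, using the evaluate library's sacrebleu metric (the `tokenizer` variable is assumed to be the model's tokenizer loaded elsewhere; this is not necessarily the exact function used for this run):

```python
import numpy as np
import evaluate

sacrebleu = evaluate.load("sacrebleu")

def compute_metrics(eval_preds):
    preds, labels = eval_preds
    # Replace label padding (-100) before decoding.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)

    # Corpus-level BLEU against single references.
    bleu = sacrebleu.compute(
        predictions=decoded_preds,
        references=[[label] for label in decoded_labels],
    )
    # Average generated length in tokens, ignoring padding.
    gen_len = np.mean(
        [np.count_nonzero(pred != tokenizer.pad_token_id) for pred in preds]
    )
    return {"bleu": bleu["score"], "gen_len": gen_len}
```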
Framework versions
- Transformers 4.48.0.dev0
- Pytorch 2.5.1+cu124
- Datasets 3.1.0
- Tokenizers 0.21.0
Model tree for BounharAbdelaziz/Terjman-Nano-v2.0-512
- Base model: Helsinki-NLP/opus-mt-en-ar