_base_nougat_JawiChar
This model is a fine-tuned version of facebook/nougat-base on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 0.7218
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 6
- total_train_batch_size: 48
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: linear
- num_epochs: 30
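As a rough illustration of how the values above fit together, the sketch below recomputes the total train batch size from the per-device batch size and the gradient accumulation steps, and evaluates a linear learning-rate decay over the 450 optimizer steps reported in the training results. The single-device setup and the absence of warmup are assumptions, since neither is stated in this card, and the helper names `effective_batch_size` and `linear_lr` are hypothetical.

```python
# Hyperparameters copied from the list above.
LEARNING_RATE = 1e-4
TRAIN_BATCH_SIZE = 8
GRADIENT_ACCUMULATION_STEPS = 6
TOTAL_STEPS = 450  # final step in the training results table

# Assumptions (not stated in this card): one device, no warmup steps.
NUM_DEVICES = 1
WARMUP_STEPS = 0


def effective_batch_size(per_device, accumulation, devices=NUM_DEVICES):
    """Number of samples that contribute to one optimizer update."""
    return per_device * accumulation * devices


def linear_lr(step, base_lr=LEARNING_RATE, total=TOTAL_STEPS, warmup=WARMUP_STEPS):
    """Linear warmup (if any) followed by linear decay to zero at `total`."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


if __name__ == "__main__":
    print("effective batch size:",
          effective_batch_size(TRAIN_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS))
    for step in (0, 225, 450):
        print(f"step {step}: lr = {linear_lr(step):.2e}")
```

Under these assumptions the effective batch size comes out to 8 × 6 = 48, matching the reported total_train_batch_size, and the learning rate falls from 1e-4 at step 0 to zero at step 450.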
Training results
Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
21.3619 | 0.9783 | 15 | 1.7367 |
8.1981 | 1.9783 | 30 | 1.1706 |
6.7061 | 2.9783 | 45 | 1.0059 |
5.8689 | 3.9783 | 60 | 0.9978 |
5.9197 | 4.9783 | 75 | 0.9428 |
5.5444 | 5.9783 | 90 | 0.9029 |
5.4086 | 6.9783 | 105 | 0.8840 |
5.1521 | 7.9783 | 120 | 0.8626 |
5.2231 | 8.9783 | 135 | 0.8425 |
4.9312 | 9.9783 | 150 | 0.8164 |
4.8809 | 10.9783 | 165 | 0.8104 |
4.8759 | 11.9783 | 180 | 0.7843 |
4.9104 | 12.9783 | 195 | 0.7834 |
4.5906 | 13.9783 | 210 | 0.7732 |
4.5919 | 14.9783 | 225 | 0.7699 |
4.4027 | 15.9783 | 240 | 0.7521 |
4.5193 | 16.9783 | 255 | 0.7466 |
4.3829 | 17.9783 | 270 | 0.7603 |
4.2354 | 18.9783 | 285 | 0.7374 |
4.1685 | 19.9783 | 300 | 0.7435 |
4.2185 | 20.9783 | 315 | 0.7329 |
4.0828 | 21.9783 | 330 | 0.7311 |
4.1832 | 22.9783 | 345 | 0.7295 |
4.0023 | 23.9783 | 360 | 0.7237 |
4.0266 | 24.9783 | 375 | 0.7211 |
4.0373 | 25.9783 | 390 | 0.7233 |
4.0861 | 26.9783 | 405 | 0.7207 |
3.9774 | 27.9783 | 420 | 0.7193 |
4.1174 | 28.9783 | 435 | 0.7199 |
3.9658 | 29.9783 | 450 | 0.7218 |
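The table can also be scanned for the evaluation step with the lowest validation loss, which need not be the last one. The sketch below does this over the final six rows of the table, copied verbatim; every earlier row has a validation loss of 0.7237 or higher, so the excerpt is enough to find the minimum. The helper name `best_checkpoint` is hypothetical, and whether a checkpoint was actually saved at that step is not stated in this card.

```python
# Last six rows of the training results table:
# (training loss, epoch, step, validation loss)
ROWS = [
    (4.0266, 24.9783, 375, 0.7211),
    (4.0373, 25.9783, 390, 0.7233),
    (4.0861, 26.9783, 405, 0.7207),
    (3.9774, 27.9783, 420, 0.7193),
    (4.1174, 28.9783, 435, 0.7199),
    (3.9658, 29.9783, 450, 0.7218),
]


def best_checkpoint(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row[3])


if __name__ == "__main__":
    _, epoch, step, val_loss = best_checkpoint(ROWS)
    final_val_loss = ROWS[-1][3]
    print(f"lowest validation loss: {val_loss} at step {step} (epoch {epoch})")
    print(f"final validation loss:  {final_val_loss} at step {ROWS[-1][2]}")
```

On these rows the minimum is 0.7193 at step 420, slightly below the 0.7218 recorded at the final step 450, which is the value reported at the top of this card.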
Framework versions
- Transformers 4.47.1
- Pytorch 2.5.1+cu121
- Datasets 3.2.0
- Tokenizers 0.21.0