_base_nougat_JawiChar

This model is a fine-tuned version of facebook/nougat-base on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.7218
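Since the card does not include usage instructions, here is a minimal inference sketch. It assumes the checkpoint is published as bustamiyusoef/_base_nougat_JawiChar and keeps the standard Nougat classes from the facebook/nougat-base base model; if the processor files are not bundled with this repository, load the processor from facebook/nougat-base instead.

```python
from PIL import Image
from transformers import NougatProcessor, VisionEncoderDecoderModel

model_id = "bustamiyusoef/_base_nougat_JawiChar"  # assumed repository id

# Load the processor and the fine-tuned vision encoder-decoder model.
processor = NougatProcessor.from_pretrained(model_id)
model = VisionEncoderDecoderModel.from_pretrained(model_id)

# Run OCR-style generation on a single page image.
image = Image.open("page.png").convert("RGB")
pixel_values = processor(images=image, return_tensors="pt").pixel_values

outputs = model.generate(pixel_values, max_new_tokens=512)
text = processor.batch_decode(outputs, skip_special_tokens=True)[0]
print(text)
```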

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a Trainer configuration sketch follows the list):

  • learning_rate: 0.0001
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 6
  • total_train_batch_size: 48
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 30
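The exact training script is not published with this card, so the following is only an illustration of how these hyperparameters map onto the Hugging Face Trainer API (argument names as of Transformers 4.47); the output directory and evaluation schedule are assumptions.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="_base_nougat_JawiChar",  # assumed output directory
    learning_rate=1e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=6,       # effective train batch size: 8 * 6 = 48
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=30,
    eval_strategy="epoch",               # the results table suggests roughly one evaluation per epoch
)
```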

Training results

| Training Loss | Epoch   | Step | Validation Loss |
|:-------------:|:-------:|:----:|:---------------:|
| 21.3619       | 0.9783  | 15   | 1.7367          |
| 8.1981        | 1.9783  | 30   | 1.1706          |
| 6.7061        | 2.9783  | 45   | 1.0059          |
| 5.8689        | 3.9783  | 60   | 0.9978          |
| 5.9197        | 4.9783  | 75   | 0.9428          |
| 5.5444        | 5.9783  | 90   | 0.9029          |
| 5.4086        | 6.9783  | 105  | 0.8840          |
| 5.1521        | 7.9783  | 120  | 0.8626          |
| 5.2231        | 8.9783  | 135  | 0.8425          |
| 4.9312        | 9.9783  | 150  | 0.8164          |
| 4.8809        | 10.9783 | 165  | 0.8104          |
| 4.8759        | 11.9783 | 180  | 0.7843          |
| 4.9104        | 12.9783 | 195  | 0.7834          |
| 4.5906        | 13.9783 | 210  | 0.7732          |
| 4.5919        | 14.9783 | 225  | 0.7699          |
| 4.4027        | 15.9783 | 240  | 0.7521          |
| 4.5193        | 16.9783 | 255  | 0.7466          |
| 4.3829        | 17.9783 | 270  | 0.7603          |
| 4.2354        | 18.9783 | 285  | 0.7374          |
| 4.1685        | 19.9783 | 300  | 0.7435          |
| 4.2185        | 20.9783 | 315  | 0.7329          |
| 4.0828        | 21.9783 | 330  | 0.7311          |
| 4.1832        | 22.9783 | 345  | 0.7295          |
| 4.0023        | 23.9783 | 360  | 0.7237          |
| 4.0266        | 24.9783 | 375  | 0.7211          |
| 4.0373        | 25.9783 | 390  | 0.7233          |
| 4.0861        | 26.9783 | 405  | 0.7207          |
| 3.9774        | 27.9783 | 420  | 0.7193          |
| 4.1174        | 28.9783 | 435  | 0.7199          |
| 3.9658        | 29.9783 | 450  | 0.7218          |

Framework versions

  • Transformers 4.47.1
  • Pytorch 2.5.1+cu121
  • Datasets 3.2.0
  • Tokenizers 0.21.0