_base_nougat_JawiChar

This model is a fine-tuned version of facebook/nougat-base; the training dataset is not specified. On the evaluation set it reaches a validation loss of 0.7218 at the final logged step (step 450, epoch 29.9783).

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 0.0001
train_batch_size: 8
eval_batch_size: 8
seed: 42
gradient_accumulation_steps: 6
total_train_batch_size: 48
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: linear
num_epochs: 30
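
The relationship between these values can be checked with a short sketch. It reproduces total_train_batch_size from the per-device batch size and the accumulation steps, and shows what a linear schedule implies for the learning rate over the run. Two things here are assumptions rather than facts from this card: a single training device (num_devices = 1) and no warmup, since neither a device count nor a warmup setting is listed. The helper name linear_lr is hypothetical, and the total of 450 steps is taken from the last row of the results table.

```python
import math

# Values copied from the hyperparameter list.
learning_rate = 1e-4
train_batch_size = 8
gradient_accumulation_steps = 6
num_epochs = 30

# Assumption: one training device (the card does not state a device count).
num_devices = 1

# Number of samples that contribute to each optimizer update.
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices

# The results table ends at step 450 after 30 epochs.
total_steps = 450
steps_per_epoch = total_steps // num_epochs


def linear_lr(step: int, warmup_steps: int = 0) -> float:
    """Learning rate under a linear decay to zero.

    Assumption: warmup_steps defaults to 0 because no warmup is listed.
    """
    if step < warmup_steps:
        return learning_rate * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return learning_rate * remaining / max(1, total_steps - warmup_steps)


print(total_train_batch_size)  # 48
print(steps_per_epoch)         # 15
for step in (0, 225, 450):
    print(step, linear_lr(step))
```

Under these assumptions the learning rate starts at 0.0001, is about half that value at step 225, and reaches zero at step 450.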

Training Loss	Epoch	Step	Validation Loss
21.3619	0.9783	15	1.7367
8.1981	1.9783	30	1.1706
6.7061	2.9783	45	1.0059
5.8689	3.9783	60	0.9978
5.9197	4.9783	75	0.9428
5.5444	5.9783	90	0.9029
5.4086	6.9783	105	0.8840
5.1521	7.9783	120	0.8626
5.2231	8.9783	135	0.8425
4.9312	9.9783	150	0.8164
4.8809	10.9783	165	0.8104
4.8759	11.9783	180	0.7843
4.9104	12.9783	195	0.7834
4.5906	13.9783	210	0.7732
4.5919	14.9783	225	0.7699
4.4027	15.9783	240	0.7521
4.5193	16.9783	255	0.7466
4.3829	17.9783	270	0.7603
4.2354	18.9783	285	0.7374
4.1685	19.9783	300	0.7435
4.2185	20.9783	315	0.7329
4.0828	21.9783	330	0.7311
4.1832	22.9783	345	0.7295
4.0023	23.9783	360	0.7237
4.0266	24.9783	375	0.7211
4.0373	25.9783	390	0.7233
4.0861	26.9783	405	0.7207
3.9774	27.9783	420	0.7193
4.1174	28.9783	435	0.7199
3.9658	29.9783	450	0.7218
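
As a quick reading aid for the table, the sketch below picks out the evaluation step with the lowest validation loss and compares it with the final step. The (step, validation loss) pairs are copied from the rows above; the variable names are illustrative only. The card does not say which checkpoint the published weights correspond to, so this is only a summary of the logged numbers.

```python
# (step, validation_loss) pairs copied from the results table.
validation_loss = [
    (15, 1.7367), (30, 1.1706), (45, 1.0059), (60, 0.9978), (75, 0.9428),
    (90, 0.9029), (105, 0.8840), (120, 0.8626), (135, 0.8425), (150, 0.8164),
    (165, 0.8104), (180, 0.7843), (195, 0.7834), (210, 0.7732), (225, 0.7699),
    (240, 0.7521), (255, 0.7466), (270, 0.7603), (285, 0.7374), (300, 0.7435),
    (315, 0.7329), (330, 0.7311), (345, 0.7295), (360, 0.7237), (375, 0.7211),
    (390, 0.7233), (405, 0.7207), (420, 0.7193), (435, 0.7199), (450, 0.7218),
]

# Step with the lowest logged validation loss.
best_step, best_loss = min(validation_loss, key=lambda row: row[1])

# Last logged evaluation.
final_step, final_loss = validation_loss[-1]

print(best_step, best_loss)    # 420 0.7193
print(final_step, final_loss)  # 450 0.7218
```

The lowest logged validation loss is 0.7193 at step 420; the last three evaluations stay within about 0.003 of it, so the loss has largely plateaued by the end of training.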