# flan_T5_MT
This model is a fine-tuned version of [google/flan-t5-large](https://huggingface.co/google/flan-t5-large) on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 1.8295
- Accuracy: 0.7971
- Precision: 0.7988
- Recall: 0.7941
- F1 score: 0.7965
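
The card does not document the task or prompt format, but the checkpoint can be loaded like any other seq2seq Transformers model. A minimal inference sketch (the input string below is a placeholder, not the expected prompt format):

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "13ari/flan_T5_MT"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Placeholder input; the actual prompt format used in training is undocumented.
inputs = tokenizer("your input text here", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```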
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a sketch reproducing them as `Seq2SeqTrainingArguments` follows the list):
- learning_rate: 0.0001
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- optimizer: AdamW (`adamw_torch`) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 5
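
A minimal sketch of a `Seq2SeqTrainingArguments` object matching the values above; the output directory and the rest of the `Trainer` setup (model, datasets, data collator) are not documented on this card and are assumptions:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="flan_T5_MT",       # hypothetical output path
    learning_rate=1e-4,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=5,
)
```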
### Training results
| Training Loss | Epoch  | Step  | Validation Loss | Accuracy | Precision | Recall | F1 score |
|:-------------:|:------:|:-----:|:---------------:|:--------:|:---------:|:------:|:--------:|
| 1.1041        | 0.2103 | 2500  | 0.9414          | 0.7371   | 0.7346    | 0.7424 | 0.7384   |
| 0.8837        | 0.4205 | 5000  | 1.1617          | 0.7524   | 0.8994    | 0.5682 | 0.6965   |
| 0.8923        | 0.6308 | 7500  | 0.9163          | 0.7871   | 0.8306    | 0.7212 | 0.7720   |
| 0.8658        | 0.8410 | 10000 | 1.0822          | 0.7947   | 0.8711    | 0.6918 | 0.7711   |
| 0.7201        | 1.0513 | 12500 | 1.0379          | 0.7794   | 0.7838    | 0.7718 | 0.7777   |
| 0.5277        | 1.2616 | 15000 | 1.1824          | 0.7994   | 0.8353    | 0.7459 | 0.7881   |
| 0.5173        | 1.4718 | 17500 | 1.0692          | 0.7806   | 0.7630    | 0.8141 | 0.7877   |
| 0.4500        | 1.6821 | 20000 | 0.9608          | 0.7906   | 0.7913    | 0.7894 | 0.7903   |
| 0.3901        | 1.8923 | 22500 | 1.2113          | 0.7994   | 0.8041    | 0.7918 | 0.7979   |
| 0.2902        | 2.1026 | 25000 | 1.2724          | 0.7912   | 0.8098    | 0.7612 | 0.7847   |
| 0.1808        | 2.3129 | 27500 | 1.4793          | 0.7994   | 0.7956    | 0.8059 | 0.8007   |
| 0.1787        | 2.5231 | 30000 | 1.3220          | 0.8053   | 0.8100    | 0.7976 | 0.8038   |
| 0.1908        | 2.7334 | 32500 | 1.3844          | 0.8012   | 0.8200    | 0.7718 | 0.7952   |
| 0.1709        | 2.9437 | 35000 | 1.6708          | 0.7924   | 0.7860    | 0.8035 | 0.7946   |
| 0.1183        | 3.1539 | 37500 | 1.5582          | 0.7912   | 0.7809    | 0.8094 | 0.7949   |
| 0.0880        | 3.3642 | 40000 | 1.6673          | 0.8059   | 0.8342    | 0.7635 | 0.7973   |
| 0.0948        | 3.5744 | 42500 | 1.4679          | 0.8065   | 0.8318    | 0.7682 | 0.7988   |
| 0.0863        | 3.7847 | 45000 | 1.5944          | 0.7935   | 0.7939    | 0.7929 | 0.7934   |
| 0.0576        | 3.9950 | 47500 | 1.9939          | 0.7971   | 0.8046    | 0.7847 | 0.7945   |
| 0.0454        | 4.2052 | 50000 | 1.9647          | 0.7924   | 0.7840    | 0.8071 | 0.7954   |
| 0.0407        | 4.4155 | 52500 | 1.9074          | 0.7959   | 0.7955    | 0.7965 | 0.7960   |
| 0.0404        | 4.6257 | 55000 | 1.9506          | 0.8076   | 0.8257    | 0.7800 | 0.8022   |
| 0.0494        | 4.8360 | 57500 | 1.8295          | 0.7971   | 0.7988    | 0.7941 | 0.7965   |
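
The card does not include the `compute_metrics` function that produced the four metric columns. A plausible sketch using scikit-learn, assuming the decoded predictions and labels reduce to a binary classification-style target (the `"binary"` averaging mode is an assumption):

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def compute_metrics(preds, labels):
    # Averaging mode is an assumption; the actual task is undocumented.
    precision, recall, f1, _ = precision_recall_fscore_support(
        labels, preds, average="binary"
    )
    return {
        "accuracy": accuracy_score(labels, preds),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```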
### Framework versions

- Transformers 4.48.3
- Pytorch 2.5.0+cu124
- Datasets 3.2.0
- Tokenizers 0.21.0
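
To reproduce this environment, the listed versions can be pinned; a sketch (using the CUDA 12.4 wheel index for the `+cu124` PyTorch build is an assumption about how it was installed):

```bash
pip install transformers==4.48.3 datasets==3.2.0 tokenizers==0.21.0
pip install torch==2.5.0 --index-url https://download.pytorch.org/whl/cu124
```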