flan-t5-small-squad-qag

This model is a fine-tuned version of google/flan-t5-small. The training dataset is not specified in this card, although the model name suggests SQuAD-style question-answer generation (QAG). It achieves the following results on the evaluation set:

  • Loss: 6.1573
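
A minimal inference sketch follows, assuming the standard transformers seq2seq API. The repository ID devagonal/flan-t5-small-squad-qag is taken from this card; the input format the model expects for question-answer generation is not documented here, so the example prompt is illustrative only.

```python
# Minimal inference sketch: the model ID comes from this card, but the QAG
# prompt format is an assumption, as the card does not document one.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "devagonal/flan-t5-small-squad-qag"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Hypothetical input: a context passage from which to generate a
# question-answer pair.
context = "The Eiffel Tower was completed in 1889 and is located in Paris."
inputs = tokenizer(context, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```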

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch reproducing them follows the list):

  • learning_rate: 3e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 32
  • optimizer: AdamW (torch) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 100
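
Below is a hedged sketch of how these settings map onto transformers' Seq2SeqTrainingArguments. The output directory is a placeholder, and the model, dataset, and data-collator wiring are omitted because the card does not document them.

```python
# Training-arguments sketch matching the hyperparameters listed above.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-small-squad-qag",  # hypothetical output path
    learning_rate=3e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=4,  # total train batch size: 8 * 4 = 32
    num_train_epochs=100,
    lr_scheduler_type="linear",
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    seed=42,
)
```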

Training results

| Training Loss | Epoch   | Step | Validation Loss |
|:-------------:|:-------:|:----:|:---------------:|
| 40.773        | 0.5714  | 1    | 41.7049         |
| 58.3411       | 1.5714  | 2    | 39.3183         |
| 54.8652       | 2.5714  | 3    | 37.3843         |
| 53.7579       | 3.5714  | 4    | 35.8088         |
| 52.5214       | 4.5714  | 5    | 34.5335         |
| 50.0236       | 5.5714  | 6    | 33.5388         |
| 49.5252       | 6.5714  | 7    | 32.7734         |
| 48.018        | 7.5714  | 8    | 32.1632         |
| 46.7346       | 8.5714  | 9    | 31.6080         |
| 45.4348       | 9.5714  | 10   | 31.0589         |
| 44.8246       | 10.5714 | 11   | 30.5032         |
| 44.1633       | 11.5714 | 12   | 29.9093         |
| 42.8213       | 12.5714 | 13   | 29.2965         |
| 43.2365       | 13.5714 | 14   | 28.6880         |
| 41.5266       | 14.5714 | 15   | 28.0847         |
| 40.6435       | 15.5714 | 16   | 27.4881         |
| 40.1899       | 16.5714 | 17   | 26.9148         |
| 39.3795       | 17.5714 | 18   | 26.3482         |
| 38.4061       | 18.5714 | 19   | 25.8042         |
| 38.4415       | 19.5714 | 20   | 25.2741         |
| 36.9642       | 20.5714 | 21   | 24.7624         |
| 36.3868       | 21.5714 | 22   | 24.2690         |
| 36.2422       | 22.5714 | 23   | 23.7877         |
| 35.3793       | 23.5714 | 24   | 23.3194         |
| 34.9853       | 24.5714 | 25   | 22.8591         |
| 34.0927       | 25.5714 | 26   | 22.4058         |
| 33.2451       | 26.5714 | 27   | 21.9624         |
| 32.8551       | 27.5714 | 28   | 21.5381         |
| 32.1326       | 28.5714 | 29   | 21.1176         |
| 31.84         | 29.5714 | 30   | 20.6980         |
| 31.2982       | 30.5714 | 31   | 20.2775         |
| 30.8415       | 31.5714 | 32   | 19.8578         |
| 30.073        | 32.5714 | 33   | 19.4395         |
| 29.8896       | 33.5714 | 34   | 19.0213         |
| 29.2583       | 34.5714 | 35   | 18.6041         |
| 28.5195       | 35.5714 | 36   | 18.1902         |
| 27.7352       | 36.5714 | 37   | 17.7715         |
| 28.0043       | 37.5714 | 38   | 17.3529         |
| 26.7202       | 38.5714 | 39   | 16.9311         |
| 26.8391       | 39.5714 | 40   | 16.5091         |
| 26.0355       | 40.5714 | 41   | 16.0881         |
| 25.5678       | 41.5714 | 42   | 15.6670         |
| 25.281        | 42.5714 | 43   | 15.2460         |
| 24.9389       | 43.5714 | 44   | 14.8265         |
| 24.2087       | 44.5714 | 45   | 14.4072         |
| 24.0442       | 45.5714 | 46   | 13.9871         |
| 23.5964       | 46.5714 | 47   | 13.5686         |
| 22.5465       | 47.5714 | 48   | 13.1483         |
| 22.0742       | 48.5714 | 49   | 12.7263         |
| 21.9666       | 49.5714 | 50   | 12.3055         |
| 21.1685       | 50.5714 | 51   | 11.8917         |
| 21.1257       | 51.5714 | 52   | 11.4814         |
| 20.2889       | 52.5714 | 53   | 11.0750         |
| 20.3047       | 53.5714 | 54   | 10.6724         |
| 19.8761       | 54.5714 | 55   | 10.2840         |
| 19.0577       | 55.5714 | 56   | 9.9060          |
| 18.6548       | 56.5714 | 57   | 9.5428          |
| 18.7313       | 57.5714 | 58   | 9.2004          |
| 18.247        | 58.5714 | 59   | 8.8795          |
| 17.7508       | 59.5714 | 60   | 8.5831          |
| 17.1485       | 60.5714 | 61   | 8.3108          |
| 16.8734       | 61.5714 | 62   | 8.0638          |
| 16.7851       | 62.5714 | 63   | 7.8416          |
| 16.2609       | 63.5714 | 64   | 7.6450          |
| 16.1574       | 64.5714 | 65   | 7.4740          |
| 15.8518       | 65.5714 | 66   | 7.3281          |
| 15.8425       | 66.5714 | 67   | 7.2009          |
| 15.3619       | 67.5714 | 68   | 7.0914          |
| 15.5268       | 68.5714 | 69   | 6.9991          |
| 15.3891       | 69.5714 | 70   | 6.9188          |
| 14.7154       | 70.5714 | 71   | 6.8483          |
| 14.5997       | 71.5714 | 72   | 6.7852          |
| 14.6067       | 72.5714 | 73   | 6.7290          |
| 14.4925       | 73.5714 | 74   | 6.6800          |
| 14.326        | 74.5714 | 75   | 6.6356          |
| 14.0346       | 75.5714 | 76   | 6.5929          |
| 13.9427       | 76.5714 | 77   | 6.5531          |
| 13.8931       | 77.5714 | 78   | 6.5155          |
| 13.6341       | 78.5714 | 79   | 6.4793          |
| 13.7549       | 79.5714 | 80   | 6.4462          |
| 13.4067       | 80.5714 | 81   | 6.4152          |
| 13.4218       | 81.5714 | 82   | 6.3872          |
| 13.1982       | 82.5714 | 83   | 6.3615          |
| 13.0855       | 83.5714 | 84   | 6.3381          |
| 12.9228       | 84.5714 | 85   | 6.3163          |
| 12.8098       | 85.5714 | 86   | 6.2966          |
| 12.9304       | 86.5714 | 87   | 6.2780          |
| 13.0          | 87.5714 | 88   | 6.2604          |
| 12.6473       | 88.5714 | 89   | 6.2440          |
| 12.4884       | 89.5714 | 90   | 6.2286          |
| 12.8845       | 90.5714 | 91   | 6.2152          |
| 12.3722       | 91.5714 | 92   | 6.2033          |
| 12.5444       | 92.5714 | 93   | 6.1931          |
| 12.3583       | 93.5714 | 94   | 6.1844          |
| 12.3182       | 94.5714 | 95   | 6.1766          |
| 12.345        | 95.5714 | 96   | 6.1702          |
| 12.3766       | 96.5714 | 97   | 6.1649          |
| 12.7799       | 97.5714 | 98   | 6.1610          |
| 12.505        | 98.5714 | 99   | 6.1586          |
| 12.2264       | 99.5714 | 100  | 6.1573          |

Framework versions

  • Transformers 4.48.3
  • PyTorch 2.5.1+cu124
  • Datasets 3.3.0
  • Tokenizers 0.21.0
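
As a quick environment sanity check (a sketch, assuming the same libraries are installed locally):

```python
# Print installed library versions to compare against the list above.
import transformers, torch, datasets, tokenizers

print("Transformers:", transformers.__version__)  # card: 4.48.3
print("PyTorch:", torch.__version__)              # card: 2.5.1+cu124
print("Datasets:", datasets.__version__)          # card: 3.3.0
print("Tokenizers:", tokenizers.__version__)      # card: 0.21.0
```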

Model size

  • 77M parameters (tensor type F32, Safetensors format)