wikipedia_42

This model is a fine-tuned version of an unspecified base model, trained on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 4.3376
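If this is the standard mean cross-entropy loss (in nats), as is typical for Trainer-based language-model runs, it corresponds to a perplexity of roughly exp(4.3376) ≈ 76.5. A minimal check, assuming exactly that:

```python
import math

# Assuming the reported evaluation loss is mean cross-entropy in nats,
# perplexity is its exponential.
eval_loss = 4.3376
perplexity = math.exp(eval_loss)
print(f"perplexity ≈ {perplexity:.1f}")  # ≈ 76.5
```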

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch follows the list):

  • learning_rate: 0.0001
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 40000
  • training_steps: 100000
  • mixed_precision_training: Native AMP
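These settings map directly onto transformers.TrainingArguments. The following is a minimal sketch, assuming a single-device run (so a per-device batch size of 16 with 2 accumulation steps yields the listed effective batch size of 32); output_dir is a placeholder, and the Adam settings are the library defaults, spelled out for completeness:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="wikipedia_42",      # placeholder
    learning_rate=1e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    gradient_accumulation_steps=2,  # 16 * 2 = total train batch size 32
    adam_beta1=0.9,                 # Adam defaults, listed for completeness
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=40_000,
    max_steps=100_000,
    fp16=True,                      # "Native AMP" mixed precision
)
```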

Training results

| Training Loss | Epoch   | Step   | Validation Loss |
|:-------------:|:-------:|:------:|:---------------:|
| No log        | 1.4657  | 2000   | 7.7675          |
| 7.8833        | 2.9315  | 4000   | 7.0282          |
| 7.8833        | 4.3972  | 6000   | 6.5853          |
| 6.6338        | 5.8630  | 8000   | 6.2212          |
| 6.6338        | 7.3287  | 10000  | 5.9097          |
| 5.9514        | 8.7944  | 12000  | 5.6314          |
| 5.9514        | 10.2602 | 14000  | 5.3996          |
| 5.4199        | 11.7259 | 16000  | 5.1990          |
| 5.4199        | 13.1916 | 18000  | 5.0405          |
| 5.0121        | 14.6574 | 20000  | 4.9063          |
| 5.0121        | 16.1231 | 22000  | 4.7988          |
| 4.7026        | 17.5889 | 24000  | 4.7078          |
| 4.7026        | 19.0546 | 26000  | 4.6257          |
| 4.457         | 20.5203 | 28000  | 4.5664          |
| 4.457         | 21.9861 | 30000  | 4.5069          |
| 4.2579        | 23.4518 | 32000  | 4.4624          |
| 4.2579        | 24.9176 | 34000  | 4.4199          |
| 4.0916        | 26.3833 | 36000  | 4.3874          |
| 4.0916        | 27.8490 | 38000  | 4.3623          |
| 3.9507        | 29.3148 | 40000  | 4.3443          |
| 3.9507        | 30.7805 | 42000  | 4.3140          |
| 3.821         | 32.2462 | 44000  | 4.3072          |
| 3.821         | 33.7120 | 46000  | 4.2900          |
| 3.7002        | 35.1777 | 48000  | 4.2812          |
| 3.7002        | 36.6435 | 50000  | 4.2770          |
| 3.6009        | 38.1092 | 52000  | 4.2762          |
| 3.6009        | 39.5749 | 54000  | 4.2695          |
| 3.5172        | 41.0407 | 56000  | 4.2709          |
| 3.5172        | 42.5064 | 58000  | 4.2759          |
| 3.4448        | 43.9722 | 60000  | 4.2693          |
| 3.4448        | 45.4379 | 62000  | 4.2815          |
| 3.3812        | 46.9036 | 64000  | 4.2788          |
| 3.3812        | 48.3694 | 66000  | 4.2915          |
| 3.3268        | 49.8351 | 68000  | 4.2839          |
| 3.3268        | 51.3008 | 70000  | 4.2940          |
| 3.2758        | 52.7666 | 72000  | 4.2919          |
| 3.2758        | 54.2323 | 74000  | 4.3084          |
| 3.2333        | 55.6981 | 76000  | 4.3099          |
| 3.2333        | 57.1638 | 78000  | 4.3111          |
| 3.1928        | 58.6295 | 80000  | 4.3121          |
| 3.1928        | 60.0953 | 82000  | 4.3197          |
| 3.1562        | 61.5610 | 84000  | 4.3232          |
| 3.1562        | 63.0267 | 86000  | 4.3240          |
| 3.1231        | 64.4925 | 88000  | 4.3278          |
| 3.1231        | 65.9582 | 90000  | 4.3292          |
| 3.0943        | 67.4240 | 92000  | 4.3349          |
| 3.0943        | 68.8897 | 94000  | 4.3352          |
| 3.0684        | 70.3554 | 96000  | 4.3382          |
| 3.0684        | 71.8212 | 98000  | 4.3372          |
| 3.0462        | 73.2869 | 100000 | 4.3376          |
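Note that the best validation loss (4.2693 at step 60,000) is reached well before training ends; validation loss then drifts back up while training loss keeps falling, which suggests some overfitting over the remaining steps. To read the table against the learning-rate schedule, here is a minimal sketch of the linear warmup/decay rule implied by the hyperparameters above (matching the behavior of transformers' get_linear_schedule_with_warmup):

```python
def linear_warmup_lr(step: int,
                     base_lr: float = 1e-4,
                     warmup_steps: int = 40_000,
                     total_steps: int = 100_000) -> float:
    """Learning rate at a given optimizer step: linear ramp up to base_lr
    over the warmup, then linear decay to zero at total_steps."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr * max(0.0, (total_steps - step) / (total_steps - warmup_steps))

# Peak LR at step 40,000; half the peak at step 70,000.
assert abs(linear_warmup_lr(40_000) - 1e-4) < 1e-12
assert abs(linear_warmup_lr(70_000) - 5e-5) < 1e-12
```

So the validation-loss minimum falls roughly a third of the way into the decay phase, while the learning rate is still above half its peak.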

Framework versions

  • Transformers 4.45.2
  • Pytorch 2.5.1+cu124
  • Datasets 3.0.1
  • Tokenizers 0.20.1