xlm-roberta-large-bs-16-lr-0.0001-ep-1-wp-0.1-gacc-8-gnm-1.0-FP16-mx-512-v0.1

This model is a fine-tuned version of FacebookAI/xlm-roberta-large on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 2.2438
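
The run name suggests masked-language-modeling fine-tuning with FP16 and a 512-token maximum sequence length; if the reported loss is the usual cross-entropy over masked tokens, it corresponds to a perplexity of roughly exp(2.2438) ≈ 9.4. A minimal usage sketch, assuming the checkpoint is published as BounharAbdelaziz/XLM-RoBERTa-Morocco and keeps a masked-LM head:

```python
from transformers import pipeline

# Assumed repo id for this checkpoint; adjust if the model lives elsewhere.
MODEL_ID = "BounharAbdelaziz/XLM-RoBERTa-Morocco"

# XLM-RoBERTa models use <mask> as the mask token.
fill_mask = pipeline("fill-mask", model=MODEL_ID)

for prediction in fill_mask("Rabat is the capital of <mask>."):
    print(f"{prediction['token_str']!r}: {prediction['score']:.3f}")
```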

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch mapping them onto TrainingArguments follows the list):

  • learning_rate: 0.0001
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 128
  • optimizer: AdamW (adamw_torch_fused) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 1
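
The hyperparameters above map onto transformers.TrainingArguments roughly as sketched below. The fp16, max_grad_norm, and 512-token maximum sequence length are inferred from the run name (FP16, gnm-1.0, mx-512) rather than documented here, so treat them as assumptions; note also that 16 per-device examples × 8 accumulation steps gives the effective batch size of 128 listed above, consistent with training on a single device.

```python
from transformers import TrainingArguments

# Values taken from the hyperparameter list above; fp16 and max_grad_norm are
# assumptions inferred from the run name, not documented settings.
training_args = TrainingArguments(
    output_dir="xlm-roberta-large-morocco",  # hypothetical output path
    learning_rate=1e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=8,   # 16 x 8 = 128 effective train batch size
    seed=42,
    optim="adamw_torch_fused",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=1,
    max_grad_norm=1.0,               # assumed from "gnm-1.0" in the run name
    fp16=True,                       # assumed from "FP16" in the run name
)
```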

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 17.501 | 0.0055 | 50 | 4.8573 |
| 16.4238 | 0.0109 | 100 | 4.2333 |
| 15.0223 | 0.0164 | 150 | 4.1599 |
| 14.6734 | 0.0219 | 200 | 4.0074 |
| 14.8891 | 0.0273 | 250 | nan |
| 14.0058 | 0.0328 | 300 | 3.5820 |
| 13.7471 | 0.0382 | 350 | 3.4834 |
| 14.0411 | 0.0437 | 400 | 3.4724 |
| 13.7614 | 0.0492 | 450 | 3.4450 |
| 13.728 | 0.0546 | 500 | 3.3631 |
| 13.6001 | 0.0601 | 550 | 3.3878 |
| 12.943 | 0.0656 | 600 | 3.3878 |
| 14.0021 | 0.0710 | 650 | 3.1696 |
| 13.4041 | 0.0765 | 700 | 3.2144 |
| 13.2302 | 0.0819 | 750 | 3.1456 |
| 13.3945 | 0.0874 | 800 | 3.1081 |
| 13.3763 | 0.0929 | 850 | 3.0475 |
| 13.3499 | 0.0983 | 900 | 3.2461 |
| 13.5559 | 0.1038 | 950 | 3.0163 |
| 13.839 | 0.1093 | 1000 | 3.0701 |
| 13.3534 | 0.1147 | 1050 | 2.9885 |
| 13.2552 | 0.1202 | 1100 | 3.0023 |
| 13.6676 | 0.1256 | 1150 | nan |
| 13.1216 | 0.1311 | 1200 | 3.0053 |
| 12.6853 | 0.1366 | 1250 | 2.8969 |
| 12.9434 | 0.1420 | 1300 | 2.9016 |
| 12.2164 | 0.1475 | 1350 | 2.8974 |
| 12.825 | 0.1530 | 1400 | 2.9705 |
| 12.7314 | 0.1584 | 1450 | 2.8804 |
| 12.7405 | 0.1639 | 1500 | 2.8514 |
| 12.5693 | 0.1694 | 1550 | 2.8858 |
| 12.2698 | 0.1748 | 1600 | 2.8437 |
| 12.19 | 0.1803 | 1650 | 2.9199 |
| 12.2267 | 0.1857 | 1700 | 2.7915 |
| 12.1787 | 0.1912 | 1750 | 2.9066 |
| 12.1286 | 0.1967 | 1800 | 2.8383 |
| 12.3344 | 0.2021 | 1850 | nan |
| 13.0251 | 0.2076 | 1900 | 2.8345 |
| 12.4427 | 0.2131 | 1950 | 2.7413 |
| 12.6127 | 0.2185 | 2000 | 2.7285 |
| 12.6358 | 0.2240 | 2050 | 2.7807 |
| 12.2132 | 0.2294 | 2100 | 2.7657 |
| 12.5298 | 0.2349 | 2150 | 2.7935 |
| 12.156 | 0.2404 | 2200 | 2.6942 |
| 12.2265 | 0.2458 | 2250 | 2.7374 |
| 12.0772 | 0.2513 | 2300 | 2.6400 |
| 11.7906 | 0.2568 | 2350 | 2.6862 |
| 11.5912 | 0.2622 | 2400 | 2.6664 |
| 12.242 | 0.2677 | 2450 | 2.7530 |
| 11.3089 | 0.2731 | 2500 | 2.7606 |
| 11.2301 | 0.2786 | 2550 | 2.6787 |
| 11.9706 | 0.2841 | 2600 | 2.7440 |
| 11.5268 | 0.2895 | 2650 | 2.6760 |
| 11.8031 | 0.2950 | 2700 | 2.6846 |
| 11.6836 | 0.3005 | 2750 | nan |
| 11.4748 | 0.3059 | 2800 | 2.6796 |
| 11.9102 | 0.3114 | 2850 | 2.7101 |
| 11.4223 | 0.3169 | 2900 | 2.7066 |
| 12.0939 | 0.3223 | 2950 | 2.5908 |
| 11.5229 | 0.3278 | 3000 | nan |
| 10.8909 | 0.3332 | 3050 | 2.5104 |
| 11.2679 | 0.3387 | 3100 | 2.6391 |
| 11.6102 | 0.3442 | 3150 | 2.6375 |
| 11.1783 | 0.3496 | 3200 | 2.5392 |
| 11.5862 | 0.3551 | 3250 | 2.6254 |
| 11.0802 | 0.3606 | 3300 | 2.4951 |
| 11.2194 | 0.3660 | 3350 | 2.5535 |
| 10.8891 | 0.3715 | 3400 | 2.4888 |
| 11.1372 | 0.3769 | 3450 | 2.6514 |
| 11.1702 | 0.3824 | 3500 | nan |
| 11.1283 | 0.3879 | 3550 | 2.4935 |
| 11.858 | 0.3933 | 3600 | 2.6377 |
| 10.6952 | 0.3988 | 3650 | 2.5486 |
| 11.1094 | 0.4043 | 3700 | 2.5827 |
| 10.5929 | 0.4097 | 3750 | 2.5155 |
| 10.9796 | 0.4152 | 3800 | 2.6333 |
| 11.4408 | 0.4207 | 3850 | 2.4885 |
| 11.3756 | 0.4261 | 3900 | 2.6248 |
| 10.6489 | 0.4316 | 3950 | 2.5080 |
| 11.2278 | 0.4370 | 4000 | 2.6829 |
| 10.9081 | 0.4425 | 4050 | nan |
| 10.3177 | 0.4480 | 4100 | 2.5467 |
| 11.1393 | 0.4534 | 4150 | 2.4981 |
| 11.109 | 0.4589 | 4200 | 2.5696 |
| 10.5874 | 0.4644 | 4250 | 2.5346 |
| 10.2922 | 0.4698 | 4300 | 2.5247 |
| 11.1379 | 0.4753 | 4350 | 2.5050 |
| 10.9258 | 0.4807 | 4400 | 2.4393 |
| 10.7622 | 0.4862 | 4450 | 2.5386 |
| 10.5537 | 0.4917 | 4500 | 2.4742 |
| 10.6157 | 0.4971 | 4550 | 2.5183 |
| 10.5721 | 0.5026 | 4600 | 2.4624 |
| 10.448 | 0.5081 | 4650 | nan |
| 10.9621 | 0.5135 | 4700 | 2.4363 |
| 10.5947 | 0.5190 | 4750 | 2.4489 |
| 10.4982 | 0.5244 | 4800 | nan |
| 10.241 | 0.5299 | 4850 | 2.4834 |
| 10.8498 | 0.5354 | 4900 | nan |
| 10.291 | 0.5408 | 4950 | 2.4880 |
| 10.032 | 0.5463 | 5000 | 2.4780 |
| 10.6992 | 0.5518 | 5050 | 2.4536 |
| 10.3189 | 0.5572 | 5100 | 2.5406 |
| 10.36 | 0.5627 | 5150 | 2.5421 |
| 10.1413 | 0.5682 | 5200 | 2.5299 |
| 10.4146 | 0.5736 | 5250 | 2.4525 |
| 10.0561 | 0.5791 | 5300 | 2.5126 |
| 10.3447 | 0.5845 | 5350 | 2.4347 |
| 10.2634 | 0.5900 | 5400 | 2.3891 |
| 10.067 | 0.5955 | 5450 | 2.4418 |
| 10.479 | 0.6009 | 5500 | 2.4801 |
| 9.8486 | 0.6064 | 5550 | 2.4651 |
| 10.2608 | 0.6119 | 5600 | 2.3497 |
| 10.0271 | 0.6173 | 5650 | 2.5478 |
| 9.8674 | 0.6228 | 5700 | 2.3528 |
| 10.1599 | 0.6282 | 5750 | 2.4087 |
| 9.9866 | 0.6337 | 5800 | 2.3972 |
| 10.5326 | 0.6392 | 5850 | 2.4910 |
| 10.2033 | 0.6446 | 5900 | 2.3823 |
| 9.8695 | 0.6501 | 5950 | 2.3799 |
| 10.0466 | 0.6556 | 6000 | 2.4245 |
| 9.5177 | 0.6610 | 6050 | 2.4596 |
| 10.4291 | 0.6665 | 6100 | 2.4178 |
| 10.0009 | 0.6719 | 6150 | 2.3328 |
| 10.0692 | 0.6774 | 6200 | 2.3533 |
| 9.6967 | 0.6829 | 6250 | 2.4248 |
| 9.9892 | 0.6883 | 6300 | 2.3493 |
| 10.1783 | 0.6938 | 6350 | 2.3389 |
| 10.019 | 0.6993 | 6400 | 2.4507 |
| 9.8618 | 0.7047 | 6450 | 2.2831 |
| 10.3984 | 0.7102 | 6500 | 2.3761 |
| 9.919 | 0.7157 | 6550 | 2.5036 |
| 9.2917 | 0.7211 | 6600 | 2.3926 |
| 9.6774 | 0.7266 | 6650 | 2.3494 |
| 10.0028 | 0.7320 | 6700 | 2.3653 |
| 9.6192 | 0.7375 | 6750 | 2.3574 |
| 9.9689 | 0.7430 | 6800 | 2.4544 |
| 10.0934 | 0.7484 | 6850 | 2.4070 |
| 10.0145 | 0.7539 | 6900 | 2.3699 |
| 9.559 | 0.7594 | 6950 | nan |
| 10.5713 | 0.7648 | 7000 | 2.3410 |
| 9.7507 | 0.7703 | 7050 | nan |
| 9.9102 | 0.7757 | 7100 | 2.4138 |
| 9.4241 | 0.7812 | 7150 | 2.2941 |
| 9.6202 | 0.7867 | 7200 | 2.3024 |
| 9.5112 | 0.7921 | 7250 | 2.3756 |
| 9.4726 | 0.7976 | 7300 | 2.3240 |
| 9.5841 | 0.8031 | 7350 | 2.4397 |
| 9.1056 | 0.8085 | 7400 | nan |
| 9.0733 | 0.8140 | 7450 | 2.3982 |
| 9.9461 | 0.8194 | 7500 | 2.3694 |
| 9.1871 | 0.8249 | 7550 | 2.3681 |
| 9.723 | 0.8304 | 7600 | 2.3977 |
| 9.7697 | 0.8358 | 7650 | 2.4167 |
| 9.2425 | 0.8413 | 7700 | 2.2994 |
| 9.5511 | 0.8468 | 7750 | 2.3465 |
| 9.8158 | 0.8522 | 7800 | 2.3081 |
| 9.4219 | 0.8577 | 7850 | 2.2640 |
| 9.4233 | 0.8632 | 7900 | 2.3290 |
| 9.3864 | 0.8686 | 7950 | 2.2964 |
| 9.4981 | 0.8741 | 8000 | 2.2984 |
| 9.1101 | 0.8795 | 8050 | 2.3284 |
| 9.1299 | 0.8850 | 8100 | 2.3426 |
| 8.9554 | 0.8905 | 8150 | 2.3206 |
| 9.5779 | 0.8959 | 8200 | 2.2987 |
| 9.1416 | 0.9014 | 8250 | 2.3276 |
| 9.4434 | 0.9069 | 8300 | 2.2201 |
| 9.1004 | 0.9123 | 8350 | 2.2855 |
| 9.3678 | 0.9178 | 8400 | 2.3188 |
| 9.2545 | 0.9232 | 8450 | 2.3988 |
| 9.3835 | 0.9287 | 8500 | 2.2233 |
| 9.7359 | 0.9342 | 8550 | 2.2780 |
| 9.2803 | 0.9396 | 8600 | 2.3142 |
| 8.9966 | 0.9451 | 8650 | 2.2083 |
| 9.2548 | 0.9506 | 8700 | 2.4125 |
| 10.0036 | 0.9560 | 8750 | 2.1931 |
| 9.4264 | 0.9615 | 8800 | 2.1629 |
| 9.102 | 0.9669 | 8850 | 2.3306 |
| 9.3087 | 0.9724 | 8900 | 2.2894 |
| 8.9155 | 0.9779 | 8950 | 2.2347 |
| 9.1586 | 0.9833 | 9000 | 2.3156 |
| 9.2523 | 0.9888 | 9050 | nan |
| 9.541 | 0.9943 | 9100 | 2.2957 |
| 9.4701 | 0.9997 | 9150 | 2.2438 |

Framework versions

  • Transformers 4.47.1
  • PyTorch 2.5.1+cu124
  • Datasets 3.1.0
  • Tokenizers 0.21.0
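
A quick sanity check that a local environment matches the versions listed above (a convenience sketch, not part of the original training setup):

```python
import datasets
import tokenizers
import torch
import transformers

# Versions reported in the "Framework versions" list above.
expected = {
    "transformers": "4.47.1",
    "torch": "2.5.1",        # trained with the CUDA 12.4 build (2.5.1+cu124)
    "datasets": "3.1.0",
    "tokenizers": "0.21.0",
}

installed = {
    "transformers": transformers.__version__,
    "torch": torch.__version__,
    "datasets": datasets.__version__,
    "tokenizers": tokenizers.__version__,
}

for name, want in expected.items():
    have = installed[name]
    status = "OK" if have.startswith(want) else "differs"
    print(f"{name}: installed {have}, card lists {want} ({status})")
```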