AllanK24 committed (verified)
Commit 76169b7 · 1 parent: a7e1588

📝 Added model card after best checkpoint

Files changed (3):
  1. README.md +19 -17
  2. model.safetensors +1 -1
  3. tokenizer.json +6 -1
README.md CHANGED
@@ -19,9 +19,9 @@ should probably proofread and complete it, then remove this comment. -->
 
  This model is a fine-tuned version of [answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base) on an unknown dataset.
  It achieves the following results on the evaluation set:
- - Loss: 0.4752
- - Accuracy: 0.8145
- - F1: 0.8531
+ - Loss: 0.5534
+ - Accuracy: 0.8388
+ - F1: 0.8635
 
  ## Model description
 
@@ -40,34 +40,36 @@ More information needed
  ### Training hyperparameters
 
  The following hyperparameters were used during training:
- - learning_rate: 0.001
- - train_batch_size: 16
- - eval_batch_size: 16
+ - learning_rate: 5e-05
+ - train_batch_size: 2
+ - eval_batch_size: 2
  - seed: 42
  - distributed_type: multi-GPU
  - num_devices: 2
- - total_train_batch_size: 32
- - total_eval_batch_size: 32
- - optimizer: Use adamw_torch_fused with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
+ - gradient_accumulation_steps: 4
+ - total_train_batch_size: 16
+ - total_eval_batch_size: 4
+ - optimizer: Use adamw_torch_fused with betas=(0.9,0.98) and epsilon=1e-06 and optimizer_args=No additional optimizer arguments
  - lr_scheduler_type: linear
+ - lr_scheduler_warmup_ratio: 0.1
  - num_epochs: 5
  - mixed_precision_training: Native AMP
  - label_smoothing_factor: 0.1
 
  ### Training results
 
- | Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 |
- |:-------------:|:-----:|:----:|:---------------:|:--------:|:------:|
- | 0.5192 | 1.0 | 938 | 0.4865 | 0.8090 | 0.8537 |
- | 0.4965 | 2.0 | 1876 | 0.4769 | 0.8180 | 0.8575 |
- | 0.4879 | 3.0 | 2814 | 0.4783 | 0.8069 | 0.8399 |
- | 0.4834 | 4.0 | 3752 | 0.4790 | 0.8131 | 0.8450 |
- | 0.479 | 5.0 | 4690 | 0.4752 | 0.8145 | 0.8531 |
+ | Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 |
+ |:-------------:|:------:|:----:|:---------------:|:--------:|:------:|
+ | 1.9225 | 1.0 | 1876 | 0.4365 | 0.8401 | 0.8716 |
+ | 1.4566 | 2.0 | 3752 | 0.4588 | 0.8318 | 0.8601 |
+ | 1.155 | 3.0 | 5628 | 0.5355 | 0.8304 | 0.8511 |
+ | 1.0333 | 4.0 | 7504 | 0.5601 | 0.8311 | 0.8501 |
+ | 0.9694 | 4.9976 | 9375 | 0.5534 | 0.8388 | 0.8635 |
 
 
  ### Framework versions
 
- - Transformers 4.50.1
+ - Transformers 4.50.2
  - Pytorch 2.6.0+cu126
  - Datasets 3.3.1
  - Tokenizers 0.21.0
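For reference, the updated hyperparameters correspond to a standard `transformers.Trainer` setup. Below is a minimal sketch of how they might be expressed with `TrainingArguments`; the `output_dir` is an illustrative placeholder, everything else mirrors the values in the card above:

```python
from transformers import TrainingArguments

# Sketch of a TrainingArguments config matching the hyperparameters listed in
# the updated card. output_dir is a placeholder, not taken from this commit.
training_args = TrainingArguments(
    output_dir="modernbert-finetune",   # placeholder
    learning_rate=5e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=4,
    num_train_epochs=5,
    optim="adamw_torch_fused",
    adam_beta1=0.9,
    adam_beta2=0.98,
    adam_epsilon=1e-6,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    label_smoothing_factor=0.1,
    fp16=True,                          # "Native AMP" mixed precision
    seed=42,
)

# With 2 GPUs, the effective train batch size is 2 * 2 * 4 = 16 and the
# effective eval batch size is 2 * 2 = 4, matching the totals in the card.
```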
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:e9a28c1836c74f139def7cab94788f2b7bd63832bf01c6e16a81d45081ad52cb
+ oid sha256:95ed7142123a657dde68253257d449487a62d4feef0a0ad819a6bdc99c1f5ef4
  size 598439784
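The updated LFS pointer references the weights of the new best checkpoint. A minimal sketch of loading them for inference, assuming a sequence-classification head (which the accuracy/F1 metrics suggest); the repository id below is a placeholder, not taken from this commit:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

repo_id = "AllanK24/modernbert-finetuned"  # placeholder repo id

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSequenceClassification.from_pretrained(repo_id)
model.eval()

# Tokenize a sample input and take the argmax over the classification logits.
inputs = tokenizer("An example input sentence.", return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())
```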
tokenizer.json CHANGED
@@ -1,6 +1,11 @@
  {
    "version": "1.0",
-   "truncation": null,
+   "truncation": {
+     "direction": "Right",
+     "max_length": 8192,
+     "strategy": "LongestFirst",
+     "stride": 0
+   },
    "padding": null,
    "added_tokens": [
      {
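The tokenizer.json change enables truncation at ModernBERT's 8192-token context length. A minimal sketch of applying the same settings programmatically with the `tokenizers` library; the file path is a placeholder:

```python
from tokenizers import Tokenizer

# Load the tokenizer definition (path is a placeholder).
tokenizer = Tokenizer.from_file("tokenizer.json")

# Mirror the truncation block added in this commit: right-side truncation,
# max_length 8192, longest-first strategy, no stride.
tokenizer.enable_truncation(
    max_length=8192,
    stride=0,
    strategy="longest_first",
    direction="right",
)

# Saving writes the "truncation" section back into tokenizer.json.
tokenizer.save("tokenizer.json")
```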