Akchunks/speaker-segmentation-fine-tuned-hindi

@@ -11,23 +11,23 @@ tags:
 datasets:
 - Akchunks/synthetic-speaker-diarization-dataset-hindi-short
 model-index:
-- name: speaker-segmentation-fine-tuned-hindi-v2
   results: []
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
-# speaker-segmentation-fine-tuned-hindi-v2
 This model is a fine-tuned version of [pyannote/speaker-diarization-3.1](https://huggingface.co/pyannote/speaker-diarization-3.1) on the Akchunks/synthetic-speaker-diarization-dataset-hindi-short dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.3801
-- Model Preparation Time: 0.0043
-- Der: 0.1089
-- False Alarm: 0.0383
-- Missed Detection: 0.0234
-- Confusion: 0.0472
 ## Model description
@@ -52,27 +52,32 @@ The following hyperparameters were used during training:
 - seed: 42
 - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: cosine
-- num_epochs: 15
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | Der    | False Alarm | Missed Detection | Confusion |
 |:-------------:|:-----:|:----:|:---------------:|:----------------------:|:------:|:-----------:|:----------------:|:---------:|
-| No log        | 1.0   | 24   | 0.4602          | 0.0043                 | 0.1444 | 0.0350      | 0.0256           | 0.0838    |
-| 0.5195        | 2.0   | 48   | 0.3554          | 0.0043                 | 0.1305 | 0.0323      | 0.0243           | 0.0739    |
-| 0.3063        | 3.0   | 72   | 0.3681          | 0.0043                 | 0.1243 | 0.0405      | 0.0252           | 0.0586    |
-| 0.2147        | 4.0   | 96   | 0.3841          | 0.0043                 | 0.1262 | 0.0370      | 0.0253           | 0.0639    |
-| 0.1941        | 5.0   | 120  | 0.3978          | 0.0043                 | 0.1259 | 0.0349      | 0.0245           | 0.0666    |
-| 0.1596        | 6.0   | 144  | 0.3656          | 0.0043                 | 0.1145 | 0.0393      | 0.0232           | 0.0520    |
-| 0.1435        | 7.0   | 168  | 0.3477          | 0.0043                 | 0.1114 | 0.0359      | 0.0248           | 0.0507    |
-| 0.1172        | 8.0   | 192  | 0.4034          | 0.0043                 | 0.1264 | 0.0393      | 0.0229           | 0.0643    |
-| 0.1253        | 9.0   | 216  | 0.3766          | 0.0043                 | 0.1169 | 0.0380      | 0.0240           | 0.0548    |
-| 0.0963        | 10.0  | 240  | 0.3694          | 0.0043                 | 0.1114 | 0.0399      | 0.0231           | 0.0483    |
-| 0.1055        | 11.0  | 264  | 0.3746          | 0.0043                 | 0.1101 | 0.0379      | 0.0238           | 0.0484    |
-| 0.0924        | 12.0  | 288  | 0.3782          | 0.0043                 | 0.1090 | 0.0379      | 0.0235           | 0.0476    |
-| 0.0975        | 13.0  | 312  | 0.3780          | 0.0043                 | 0.1082 | 0.0379      | 0.0236           | 0.0468    |
-| 0.082         | 14.0  | 336  | 0.3798          | 0.0043                 | 0.1087 | 0.0382      | 0.0234           | 0.0471    |
-| 0.0935        | 15.0  | 360  | 0.3801          | 0.0043                 | 0.1089 | 0.0383      | 0.0234           | 0.0472    |
 ### Framework versions

 datasets:
 - Akchunks/synthetic-speaker-diarization-dataset-hindi-short
 model-index:
+- name: speaker-segmentation-fine-tuned-hindi-v3
   results: []
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
+# speaker-segmentation-fine-tuned-hindi-v3
 This model is a fine-tuned version of [pyannote/speaker-diarization-3.1](https://huggingface.co/pyannote/speaker-diarization-3.1) on the Akchunks/synthetic-speaker-diarization-dataset-hindi-short dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.3447
+- Model Preparation Time: 0.007
+- Der: 0.0985
+- False Alarm: 0.0375
+- Missed Detection: 0.0235
+- Confusion: 0.0375
 ## Model description
 - seed: 42
 - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: cosine
+- num_epochs: 20
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | Der    | False Alarm | Missed Detection | Confusion |
 |:-------------:|:-----:|:----:|:---------------:|:----------------------:|:------:|:-----------:|:----------------:|:---------:|
+| No log        | 1.0   | 24   | 0.4600          | 0.007                  | 0.1443 | 0.0349      | 0.0256           | 0.0837    |
+| 0.5196        | 2.0   | 48   | 0.3562          | 0.007                  | 0.1304 | 0.0325      | 0.0242           | 0.0737    |
+| 0.306         | 3.0   | 72   | 0.3732          | 0.007                  | 0.1251 | 0.0402      | 0.0253           | 0.0596    |
+| 0.2116        | 4.0   | 96   | 0.3712          | 0.007                  | 0.1265 | 0.0408      | 0.0242           | 0.0615    |
+| 0.1944        | 5.0   | 120  | 0.3846          | 0.007                  | 0.1223 | 0.0337      | 0.0260           | 0.0627    |
+| 0.1538        | 6.0   | 144  | 0.3544          | 0.007                  | 0.1191 | 0.0375      | 0.0228           | 0.0587    |
+| 0.1417        | 7.0   | 168  | 0.4045          | 0.007                  | 0.1213 | 0.0358      | 0.0241           | 0.0614    |
+| 0.1122        | 8.0   | 192  | 0.4213          | 0.007                  | 0.1267 | 0.0438      | 0.0228           | 0.0601    |
+| 0.1053        | 9.0   | 216  | 0.4171          | 0.007                  | 0.1178 | 0.0368      | 0.0255           | 0.0555    |
+| 0.0897        | 10.0  | 240  | 0.3561          | 0.007                  | 0.1142 | 0.0409      | 0.0228           | 0.0505    |
+| 0.1043        | 11.0  | 264  | 0.3738          | 0.007                  | 0.1122 | 0.0380      | 0.0248           | 0.0495    |
+| 0.0825        | 12.0  | 288  | 0.3383          | 0.007                  | 0.1025 | 0.0377      | 0.0237           | 0.0411    |
+| 0.0894        | 13.0  | 312  | 0.3328          | 0.007                  | 0.0995 | 0.0388      | 0.0237           | 0.0370    |
+| 0.0699        | 14.0  | 336  | 0.3272          | 0.007                  | 0.0988 | 0.0376      | 0.0237           | 0.0375    |
+| 0.0785        | 15.0  | 360  | 0.3374          | 0.007                  | 0.0991 | 0.0378      | 0.0235           | 0.0378    |
+| 0.0759        | 16.0  | 384  | 0.3414          | 0.007                  | 0.0978 | 0.0383      | 0.0233           | 0.0362    |
+| 0.0653        | 17.0  | 408  | 0.3417          | 0.007                  | 0.0973 | 0.0375      | 0.0234           | 0.0364    |
+| 0.0726        | 18.0  | 432  | 0.3439          | 0.007                  | 0.0981 | 0.0374      | 0.0236           | 0.0370    |
+| 0.0684        | 19.0  | 456  | 0.3445          | 0.007                  | 0.0984 | 0.0374      | 0.0235           | 0.0375    |
+| 0.0731        | 20.0  | 480  | 0.3447          | 0.007                  | 0.0985 | 0.0375      | 0.0235           | 0.0375    |
 ### Framework versions

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:c098a3b7b7577e336ba8cb9fa63a96e5f95997cdbf75d3c6a7120892c20350a3
 size 5899124

 version https://git-lfs.github.com/spec/v1
+oid sha256:edb82510cc88d375c6a03a919b01c27c602cd7bf4eab2f140da89d3acd4e8b1d
 size 5899124
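
The evaluation numbers in the README diff above can be sanity-checked. The model card does not define its metrics, so the sketch below assumes the usual convention that the diarization error rate (`Der`) is the sum of the false alarm, missed detection, and confusion rates. The helper name `der_from_components` is hypothetical, and the values are copied from the v2 and v3 cards and from the last nine rows of the v3 training table.

```python
# Final evaluation metrics copied from the v2 and v3 model cards in the diff.
reported = {
    "v2": {"der": 0.1089, "false_alarm": 0.0383, "missed_detection": 0.0234, "confusion": 0.0472},
    "v3": {"der": 0.0985, "false_alarm": 0.0375, "missed_detection": 0.0235, "confusion": 0.0375},
}


def der_from_components(false_alarm, missed_detection, confusion):
    """Sum the three error components (assumed definition of DER)."""
    return false_alarm + missed_detection + confusion


for name, m in reported.items():
    total = der_from_components(m["false_alarm"], m["missed_detection"], m["confusion"])
    # The card rounds to four decimals, so allow a small tolerance.
    assert abs(total - m["der"]) < 5e-4, name
    print(f"{name}: components sum to {total:.4f}, reported DER {m['der']:.4f}")

# (epoch, DER) pairs copied from epochs 12-20 of the v3 training table.
v3_der_by_epoch = [
    (12, 0.1025), (13, 0.0995), (14, 0.0988), (15, 0.0991), (16, 0.0978),
    (17, 0.0973), (18, 0.0981), (19, 0.0984), (20, 0.0985),
]

# Pick the epoch with the lowest validation DER.
best_epoch, best_der = min(v3_der_by_epoch, key=lambda row: row[1])
print(f"lowest v3 DER: {best_der:.4f} at epoch {best_epoch}")
```

Under that assumption both cards are internally consistent. The headline v3 metrics match the final row of the table (epoch 20), while the lowest DER in the table occurs a few epochs earlier, so the reported figures appear to describe the last checkpoint rather than the best one.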