---
base_model:
- openai/whisper-small
---

Note: This classifier also contains fine-tuned `whisper-small` weights in its state dict. They will be loaded properly by my model wrapper.

Results of the classifier on Rob's human-annotated dataset (`data/voicemail_human_eval.csv`):

Results for chunk size 1 second:
- Accuracy: 0.8080
- Precision: 0.9353
- Recall: 0.7692
- F1 Score: 0.8442

Results for chunk size 2 seconds:
- Accuracy: 0.8560
- Precision: 0.9650
- Recall: 0.8166
- F1 Score: 0.8846

Results for chunk size 5 seconds:
- Accuracy: 0.8640
- Precision: 0.9856
- Recall: 0.8107
- F1 Score: 0.8896

Results for chunk size 10 seconds:
- Accuracy: 0.8760
- Precision: 1.0000
- Recall: 0.8166
- F1 Score: 0.8990

Results for full audio samples:
- Accuracy: 0.8760
- Precision: 1.0000
- Recall: 0.8166
- F1 Score: 0.8990
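
The four metrics reported above follow the standard binary-classification definitions. The sketch below shows how they can be computed from 0/1 labels. It is an illustration only, not the evaluation script used for this model: the function names `binary_metrics` and `f1_from_pr` are hypothetical, the toy labels are made up, and treating label 1 as the positive class is an assumption.

```python
def binary_metrics(y_true, y_pred):
    """Return accuracy, precision, recall and F1 for 0/1 labels.

    Assumption: label 1 is the positive class.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = f1_from_pr(precision, recall)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def f1_from_pr(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy labels, invented for illustration (not from the evaluation set).
    y_true = [1, 1, 1, 1, 0, 0]
    y_pred = [1, 1, 1, 0, 0, 0]
    for name, value in binary_metrics(y_true, y_pred).items():
        print(f"{name}: {value:.4f}")
```

As a consistency check, `f1_from_pr(1.0, 0.8166)` reproduces the reported F1 of 0.8990 for the 10-second and full-audio rows to four decimal places.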