Upload explainability.md with huggingface_hub
Field | Response
:---|:---
Intended Domain: | Voice Activity Detection (VAD)
Model Type: | Convolutional Neural Network (CNN)
Intended Users: | Developers, Speech Processing Engineers, AI Researchers
Output: | Sequence of speech probabilities, one for each 20 millisecond audio frame.
Describe how the model works: | The model extracts spectrogram features from the input audio and passes them through MarbleNet, a lightweight CNN-based model designed for VAD. The CNN learns to detect patterns associated with speech activity and outputs a probability score indicating the presence of speech in each 20 millisecond frame.
Name the adversely impacted groups this has been tested to deliver comparable outcomes regardless of: | Not Applicable
Technical Limitations: | The model operates on 20 millisecond frames. While it supports longer frames by breaking them into smaller segments, it does not support outputs with a finer granularity than 20 milliseconds.
Verified to have met prescribed NVIDIA quality standards: | Yes
Performance Metrics: | Accuracy (False Positive Rate, ROC-AUC score), Latency, Throughput
Potential Known Risks: | The model was trained on a limited number of languages (Chinese, English, French, Spanish, German, and Russian), so its quality may degrade for languages and accents that are not included in the training dataset.
Licensing: | [NVIDIA Open Model License Agreement](https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-open-model-license)
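
The Output row describes one speech probability per 20 millisecond frame. As an illustration only, the sketch below shows one way such a sequence could be turned into speech segments. The function name `probs_to_segments` and the 0.5 threshold are assumptions made for this example; the model card does not specify a threshold or any post-processing.

```python
from typing import List, Sequence, Tuple

FRAME_MS = 20  # frame length stated in the table above


def probs_to_segments(
    probs: Sequence[float], threshold: float = 0.5
) -> List[Tuple[int, int]]:
    """Group consecutive frames whose probability is >= threshold.

    Returns (start_ms, end_ms) pairs. The threshold is a hypothetical
    default, not a value taken from the model card.
    """
    segments: List[Tuple[int, int]] = []
    start = None  # index of the first frame of the current speech run
    for i, p in enumerate(probs):
        if p >= threshold and start is None:
            start = i  # a speech run begins at this frame
        elif p < threshold and start is not None:
            segments.append((start * FRAME_MS, i * FRAME_MS))
            start = None  # the speech run ended before this frame
    if start is not None:
        # Close a run that extends to the end of the input.
        segments.append((start * FRAME_MS, len(probs) * FRAME_MS))
    return segments


# Five frames (100 ms of audio): speech in frames 1-2 and frame 4.
print(probs_to_segments([0.1, 0.9, 0.8, 0.2, 0.7]))  # [(20, 60), (80, 100)]
```

Segment boundaries always fall on multiples of 20 milliseconds, which reflects the granularity limit noted under Technical Limitations.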