Taejin committed on
Commit 1be9bf9 · 1 Parent(s): d12f0f3

Adding subcards from nemo model cards

Signed-off-by: taejinp <[email protected]>

Files changed (4)
  1. bias.md +5 -0
  2. explainability.md +14 -0
  3. privacy.md +13 -0
  4. safety.md +6 -0
bias.md ADDED
@@ -0,0 +1,5 @@
+ Field | Response
+ :---------------------------------------------------------------------------------------------------|:---------------
+ Participation considerations from adversely impacted groups [protected classes](https://www.senate.ca.gov/content/protected-classes) in model design and testing: | None
+ Bias Metric (If Measured): | None
+ Measures taken to mitigate against unwanted bias: | None
explainability.md ADDED
@@ -0,0 +1,14 @@
+ Field | Response
+ :------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------
+ Intended Task/Domain: | Multi-Talker Automatic Speech Recognition
+ Model Type: | FastConformer Encoder, Transformer Encoder, and RNNT Decoder
+ Intended Users: | People working with conversational AI models that need to transcribe speech to text for multiple users.
+ Output: | Text with speaker tags
+ Describe how the model works: | MT-Parakeet is an online, multi-talker ASR model that takes audio streams as input and produces transcripts for multiple speakers. The model processes input audio in chunks and uses the output of an online diarization model as speaker labels to generate separate transcripts for each speaker. A speaker kernel is used to inject speaker information and produce a speaker-injected ASR embedding, enabling the model to transcribe each speaker even when speech overlaps.
+ Name the adversely impacted groups this has been tested to deliver comparable outcomes regardless of: | Not Applicable
+ Technical Limitations & Mitigation: | This model can detect up to four speakers; performance degrades in recordings with five or more speakers. The model was trained on publicly available English speech datasets. As a result, it is not suitable for non-English audio. Performance may also degrade on out-of-domain data, such as recordings in noisy conditions.
+ Verified to have met prescribed NVIDIA quality standards: | Yes
+ Performance Metrics: | Concatenated minimum-permutation word error rate (cpWER) and time-constrained minimum-permutation word error rate (tcpWER)
+ Potential Known Risks: | Transcripts may not be 100% accurate in instances with background noise. Punctuation/capitalization may not be 100% accurate.
+ Licensing: | GOVERNING TERMS: Use of this model is governed by the NVIDIA Open Model License Agreement (found [here](https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-open-model-license/)).
+
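The cpWER metric named in explainability.md can be illustrated with a minimal sketch. It is not taken from the commit or from NeMo: the function names `edit_distance` and `cpwer` and the dict-of-utterances input format are assumptions made for illustration. The idea is to concatenate each speaker's utterances, score every assignment of hypothesis speakers to reference speakers, and keep the assignment with the fewest word errors. The brute-force search over permutations is only practical for a handful of speakers, which matches the four-speaker limit stated in the card. tcpWER additionally constrains matches by word timing and is not covered here.

```python
from itertools import permutations


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion (reference word missing)
                curr[j - 1] + 1,         # insertion (extra hypothesis word)
                prev[j - 1] + (r != h),  # substitution, or match at no cost
            ))
        prev = curr
    return prev[-1]


def cpwer(ref_by_speaker, hyp_by_speaker):
    """Illustrative cpWER: each argument maps a speaker label to that
    speaker's utterances (strings) in time order."""
    # Concatenate every speaker's utterances into one word stream.
    refs = [" ".join(utts).split() for utts in ref_by_speaker.values()]
    hyps = [" ".join(utts).split() for utts in hyp_by_speaker.values()]

    # Pad the shorter side with empty streams so speaker counts match.
    n = max(len(refs), len(hyps))
    refs += [[] for _ in range(n - len(refs))]
    hyps += [[] for _ in range(n - len(hyps))]

    total_ref_words = sum(len(r) for r in refs)
    if total_ref_words == 0:
        raise ValueError("reference contains no words")

    # Try every speaker assignment and keep the lowest total error count.
    best_errors = min(
        sum(edit_distance(r, h) for r, h in zip(refs, perm))
        for perm in permutations(hyps)
    )
    return best_errors / total_ref_words


ref = {"A": ["hello there", "how are you"], "B": ["i am fine"]}
hyp = {"spk0": ["i am find"], "spk1": ["hello there how are you"]}
print(cpwer(ref, hyp))  # 0.125 (one substitution over eight reference words)
```

Because the score is minimized over speaker assignments, the hypothesis labels (`spk0`, `spk1`) do not need to match the reference labels; only the grouping of words by speaker matters.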
privacy.md ADDED
@@ -0,0 +1,13 @@
+ Field | Response
+ :----------------------------------------------------------------------------------------------------------------------------------|:-----------------------------------------------
+ Generatable or reverse engineerable personal data? | No
+ Personal data used to create this model? | Yes - Voice
+ Was consent obtained for any personal data used? | Yes
+ Is a mechanism in place to honor data subject right of access or deletion of personal data? | Yes
+ If personal data was collected for the development of the model, was it collected directly by NVIDIA? | Yes
+ If personal data was collected for the development of the model by NVIDIA, do you maintain or have access to disclosures made to data subjects? | Yes
+ If personal data was collected for the development of this AI model, was it minimized to only what was required? | Yes
+ Is there provenance for all datasets used in training? | Yes
+ Does data labeling (annotation, metadata) comply with privacy laws? | Yes
+ Is data compliant with data subject requests for data correction or removal, if such a request was made? | The data is compliant where applicable, but is not applicable for all data.
+ Applicable Privacy Policy | https://www.nvidia.com/en-us/about-nvidia/privacy-policy/
safety.md ADDED
@@ -0,0 +1,6 @@
+ Field | Response
+ :---------------------------------------------------|:----------------------------------
+ Model Application Field(s): | Multi-Talker Automatic Speech Recognition Systems
+ Describe the life critical impact (if present). | Not Applicable
+ Use Case Restrictions: | Abide by [NVIDIA Open Model License Agreement](https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-open-model-license/)
+ Model and dataset restrictions: | The principle of least privilege (PoLP) is applied, limiting access for dataset generation and model development. Restrictions enforce dataset access during training, and dataset license constraints are adhered to.