nithinraok committed · Commit 7938c10 · 1 Parent(s): 9fc6642

update card

Signed-off-by: nithinraok <[email protected]>

Files changed (1)
  1. README.md +50 -10
README.md CHANGED
@@ -805,6 +805,7 @@ img {
805
  **Supported Languages:**
806
  Bulgarian (**bg**), Croatian (**hr**), Czech (**cs**), Danish (**da**), Dutch (**nl**), English (**en**), Estonian (**et**), Finnish (**fi**), French (**fr**), German (**de**), Greek (**el**), Hungarian (**hu**), Italian (**it**), Latvian (**lv**), Lithuanian (**lt**), Maltese (**mt**), Polish (**pl**), Portuguese (**pt**), Romanian (**ro**), Slovak (**sk**), Slovenian (**sl**), Spanish (**es**), Swedish (**sv**), Russian (**ru**), Ukrainian (**uk**)
807
 
 
808
 
809
  ## <span style="color:#466f00;">Key Features:</span>
810
 
@@ -815,9 +816,9 @@ Bulgarian (**bg**), Croatian (**hr**), Czech (**cs**), Danish (**da**), Dutch (*
815
  * **Long audio** transcription, supporting audio **up to 24 minutes** long with full attention (on A100 80GB) or up to 3 hours with local attention.
816
  * Released under a **permissive CC BY 4.0 license**
817
 
818
- This model is ready for commercial/non-commercial use.
819
 
820
- ---
821
 
822
  ## Automatic Speech Recognition (ASR) Performance
823
 
@@ -833,11 +834,6 @@ This model is ready for commercial/non-commercial use.
833
 
834
  **Note 2:** Performance differences may be partly attributed to Portuguese variant differences - our training data uses European Portuguese while most benchmarks use Brazilian Portuguese.
835
 
836
- ## <span style="color:#466f00;">License/Terms of Use:</span>
837
-
838
- GOVERNING TERMS: Use of this model is governed by the [CC-BY-4.0](https://creativecommons.org/licenses/by/4.0/legalcode.en) license.
839
-
840
-
841
  ### <span style="color:#466f00;">Deployment Geography:</span>
842
  Global
843
 
@@ -849,7 +845,8 @@ This model serves developers, researchers, academics, and industries building ap
849
 
850
  ### <span style="color:#466f00;">Release Date:</span>
851
 
852
- 08/14/2025
 
853
 
854
  ### <span style="color:#466f00;">Model Architecture:</span>
855
 
@@ -936,7 +933,7 @@ print(output[0].text)
936
  ## <span style="color:#466f00;">Software Integration:</span>
937
 
938
  **Runtime Engine(s):**
939
- * NeMo 2.5
940
 
941
 
942
  **Supported Hardware Microarchitecture Compatibility:**
@@ -1136,4 +1133,47 @@ NVIDIA believes Trustworthy AI is a shared responsibility and we have establishe
1136
 
1137
  For more detailed information on ethical considerations for this model, please see the Model Card++ Explainability, Bias, Safety & Security, and Privacy Subcards [here](https://developer.nvidia.com/blog/enhancing-ai-transparency-and-ethical-considerations-with-model-card/).
1138
 
1139
- Please report security vulnerabilities or NVIDIA AI Concerns [here](https://www.nvidia.com/en-us/support/submit-security-vulnerability/).
 
805
  **Supported Languages:**
806
  Bulgarian (**bg**), Croatian (**hr**), Czech (**cs**), Danish (**da**), Dutch (**nl**), English (**en**), Estonian (**et**), Finnish (**fi**), French (**fr**), German (**de**), Greek (**el**), Hungarian (**hu**), Italian (**it**), Latvian (**lv**), Lithuanian (**lt**), Maltese (**mt**), Polish (**pl**), Portuguese (**pt**), Romanian (**ro**), Slovak (**sk**), Slovenian (**sl**), Spanish (**es**), Swedish (**sv**), Russian (**ru**), Ukrainian (**uk**)
807
 
808
+ This model is ready for commercial/non-commercial use.
809
 
810
  ## <span style="color:#466f00;">Key Features:</span>
811
 
 
816
  * **Long audio** transcription, supporting audio **up to 24 minutes** long with full attention (on A100 80GB) or up to 3 hours with local attention.
817
  * Released under a **permissive CC BY 4.0 license**
818
 
819
+ ## <span style="color:#466f00;">License/Terms of Use:</span>
820
 
821
+ GOVERNING TERMS: Use of this model is governed by the [CC-BY-4.0](https://creativecommons.org/licenses/by/4.0/legalcode.en) license.
822
 
823
  ## Automatic Speech Recognition (ASR) Performance
824
 
 
834
 
835
  **Note 2:** Performance differences may be partly attributed to Portuguese variant differences - our training data uses European Portuguese while most benchmarks use Brazilian Portuguese.
836
 
 
 
 
 
 
837
  ### <span style="color:#466f00;">Deployment Geography:</span>
838
  Global
839
 
 
845
 
846
  ### <span style="color:#466f00;">Release Date:</span>
847
 
848
+ Hugging Face [08/14/2025](https://huggingface.co/nvidia/parakeet-tdt-0.6b-v3)
849
+
850
 
851
  ### <span style="color:#466f00;">Model Architecture:</span>
852
 
 
933
  ## <span style="color:#466f00;">Software Integration:</span>
934
 
935
  **Runtime Engine(s):**
936
+ * NeMo 2.4
937
 
938
 
939
  **Supported Hardware Microarchitecture Compatibility:**
 
1133
 
1134
  For more detailed information on ethical considerations for this model, please see the Model Card++ Explainability, Bias, Safety & Security, and Privacy Subcards [here](https://developer.nvidia.com/blog/enhancing-ai-transparency-and-ethical-considerations-with-model-card/).
1135
 
1136
+ Please report security vulnerabilities or NVIDIA AI Concerns [here](https://www.nvidia.com/en-us/support/submit-security-vulnerability/).
1137
+
1138
+ ## <span style="color:#466f00;">Bias:</span>
1139
+
1140
+ Field | Response
1141
+ ---------------------------------------------------------------------------------------------------|---------------
1142
+ Participation considerations from adversely impacted groups [protected classes](https://www.senate.ca.gov/content/protected-classes) in model design and testing | None
1143
+ Measures taken to mitigate against unwanted bias | None
1144
+
1145
+ ## <span style="color:#466f00;">Explainability:</span>
1146
+
1147
+ Field | Response
1148
+ ------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------
1149
+ Intended Domain | Speech to Text Transcription
1150
+ Model Type | FastConformer
1151
+ Intended Users | This model is intended for developers, researchers, academics, and industries building conversation-based applications.
1152
+ Output | Text
1153
+ Describe how the model works | Speech input is encoded into embeddings and passed into a conformer-based model, which outputs a text response.
1154
+ Name the adversely impacted groups this has been tested to deliver comparable outcomes regardless of | Not Applicable
1155
+ Technical Limitations & Mitigation | Transcripts may not be 100% accurate. Accuracy varies based on language and characteristics of the input audio (domain, use case, accent, noise, speech type, context of speech, etc.).
1156
+ Verified to have met prescribed NVIDIA quality standards | Yes
1157
+ Performance Metrics | Word Error Rate
1158
+ Potential Known Risks | If a word was not seen by the language model during training and is not present in the vocabulary, it is unlikely to be recognized. Not recommended for word-for-word or incomplete-sentence input, as accuracy varies based on the context of the input.
1159
+ Licensing | GOVERNING TERMS: Use of this model is governed by the [CC-BY-4.0](https://creativecommons.org/licenses/by/4.0/legalcode.en) license.
1160
+
1161
+ ## <span style="color:#466f00;">Privacy:</span>
1162
+
1163
+ Field | Response
1164
+ ----------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------
1165
+ Generatable or reverse engineerable personal data? | None
1166
+ Personal data used to create this model? | None
1167
+ Is there provenance for all datasets used in training? | Yes
1168
+ Does data labeling (annotation, metadata) comply with privacy laws? | Yes
1169
+ Is data compliant with data subject requests for data correction or removal, if such a request was made? | No, not possible with externally-sourced data.
1170
+ Applicable Privacy Policy | https://www.nvidia.com/en-us/about-nvidia/privacy-policy/
1171
+
1172
+ ## <span style="color:#466f00;">Safety:</span>
1173
+
1174
+ Field | Response
1175
+ ---------------------------------------------------|----------------------------------
1176
+ Model Application(s) | Speech to Text Transcription
1177
+ Describe the life critical impact | None
1178
+ Use Case Restrictions | Abide by [CC-BY-4.0](https://creativecommons.org/licenses/by/4.0/legalcode.en) License
1179
+ Model and dataset restrictions | The principle of least privilege (PoLP) is applied, limiting access for dataset generation and model development. Restrictions are enforced on dataset access during training, and dataset license constraints are adhered to.