nithinraok
committed on
Commit
·
7938c10
1
Parent(s):
9fc6642
update card
Signed-off-by: nithinraok <[email protected]>
README.md
CHANGED
@@ -805,6 +805,7 @@ img {
|
|
805 |
**Supported Languages:**
|
806 |
Bulgarian (**bg**), Croatian (**hr**), Czech (**cs**), Danish (**da**), Dutch (**nl**), English (**en**), Estonian (**et**), Finnish (**fi**), French (**fr**), German (**de**), Greek (**el**), Hungarian (**hu**), Italian (**it**), Latvian (**lv**), Lithuanian (**lt**), Maltese (**mt**), Polish (**pl**), Portuguese (**pt**), Romanian (**ro**), Slovak (**sk**), Slovenian (**sl**), Spanish (**es**), Swedish (**sv**), Russian (**ru**), Ukrainian (**uk**)
|
807 |
|
|
|
808 |
|
809 |
## <span style="color:#466f00;">Key Features:</span>
|
810 |
|
@@ -815,9 +816,9 @@ Bulgarian (**bg**), Croatian (**hr**), Czech (**cs**), Danish (**da**), Dutch (*
|
|
815 |
* **Long audio** transcription, supporting audio **up to 24 minutes** long with full attention (on A100 80GB) or up to 3 hours with local attention.
|
816 |
* Released under a **permissive CC BY 4.0 license**
|
817 |
|
818 |
-
|
819 |
|
820 |
-
|
821 |
|
822 |
## Automatic Speech Recognition (ASR) Performance
|
823 |
|
@@ -833,11 +834,6 @@ This model is ready for commercial/non-commercial use.
|
|
833 |
|
834 |
**Note 2:** Performance differences may be partly attributed to Portuguese variant differences - our training data uses European Portuguese while most benchmarks use Brazilian Portuguese.
|
835 |
|
836 |
-
## <span style="color:#466f00;">License/Terms of Use:</span>
|
837 |
-
|
838 |
-
GOVERNING TERMS: Use of this model is governed by the [CC-BY-4.0](https://creativecommons.org/licenses/by/4.0/legalcode.en) license.
|
839 |
-
|
840 |
-
|
841 |
### <span style="color:#466f00;">Deployment Geography:</span>
|
842 |
Global
|
843 |
|
@@ -849,7 +845,8 @@ This model serves developers, researchers, academics, and industries building ap
|
|
849 |
|
850 |
### <span style="color:#466f00;">Release Date:</span>
|
851 |
|
852 |
-
08/14/2025
|
|
|
853 |
|
854 |
### <span style="color:#466f00;">Model Architecture:</span>
|
855 |
|
@@ -936,7 +933,7 @@ print(output[0].text)
|
|
936 |
## <span style="color:#466f00;">Software Integration:</span>
|
937 |
|
938 |
**Runtime Engine(s):**
|
939 |
-
* NeMo 2.
|
940 |
|
941 |
|
942 |
**Supported Hardware Microarchitecture Compatibility:**
|
@@ -1136,4 +1133,47 @@ NVIDIA believes Trustworthy AI is a shared responsibility and we have establishe
|
|
1136 |
|
1137 |
For more detailed information on ethical considerations for this model, please see the Model Card++ Explainability, Bias, Safety & Security, and Privacy Subcards [here](https://developer.nvidia.com/blog/enhancing-ai-transparency-and-ethical-considerations-with-model-card/).
|
1138 |
|
1139 |
-
Please report security vulnerabilities or NVIDIA AI Concerns [here](https://www.nvidia.com/en-us/support/submit-security-vulnerability/).
|
805 |
**Supported Languages:**
|
806 |
Bulgarian (**bg**), Croatian (**hr**), Czech (**cs**), Danish (**da**), Dutch (**nl**), English (**en**), Estonian (**et**), Finnish (**fi**), French (**fr**), German (**de**), Greek (**el**), Hungarian (**hu**), Italian (**it**), Latvian (**lv**), Lithuanian (**lt**), Maltese (**mt**), Polish (**pl**), Portuguese (**pt**), Romanian (**ro**), Slovak (**sk**), Slovenian (**sl**), Spanish (**es**), Swedish (**sv**), Russian (**ru**), Ukrainian (**uk**)
|
807 |
|
808 |
+
This model is ready for commercial/non-commercial use.
|
809 |
|
810 |
## <span style="color:#466f00;">Key Features:</span>
|
811 |
|
|
|
816 |
* **Long audio** transcription, supporting audio **up to 24 minutes** long with full attention (on A100 80GB) or up to 3 hours with local attention.
|
817 |
* Released under a **permissive CC BY 4.0 license**
|
818 |
|
819 |
+
## <span style="color:#466f00;">License/Terms of Use:</span>
|
820 |
|
821 |
+
GOVERNING TERMS: Use of this model is governed by the [CC-BY-4.0](https://creativecommons.org/licenses/by/4.0/legalcode.en) license.
|
822 |
|
823 |
## Automatic Speech Recognition (ASR) Performance
|
824 |
|
|
|
834 |
|
835 |
**Note 2:** Performance differences may be partly attributed to Portuguese variant differences - our training data uses European Portuguese while most benchmarks use Brazilian Portuguese.
|
836 |
|
|
|
|
|
|
|
|
|
|
|
837 |
### <span style="color:#466f00;">Deployment Geography:</span>
|
838 |
Global
|
839 |
|
|
|
845 |
|
846 |
### <span style="color:#466f00;">Release Date:</span>
|
847 |
|
848 |
+
Hugging Face [08/14/2025](https://huggingface.co/nvidia/parakeet-tdt-0.6b-v3)
|
849 |
+
|
850 |
|
851 |
### <span style="color:#466f00;">Model Architecture:</span>
|
852 |
|
|
|
933 |
## <span style="color:#466f00;">Software Integration:</span>
|
934 |
|
935 |
**Runtime Engine(s):**
|
936 |
+
* NeMo 2.4
|
937 |
|
938 |
|
939 |
**Supported Hardware Microarchitecture Compatibility:**
|
|
|
1133 |
|
1134 |
For more detailed information on ethical considerations for this model, please see the Model Card++ Explainability, Bias, Safety & Security, and Privacy Subcards [here](https://developer.nvidia.com/blog/enhancing-ai-transparency-and-ethical-considerations-with-model-card/).
|
1135 |
|
1136 |
+
Please report security vulnerabilities or NVIDIA AI Concerns [here](https://www.nvidia.com/en-us/support/submit-security-vulnerability/).
|
1137 |
+
|
1138 |
+
## <span style="color:#466f00;">Bias:</span>
|
1139 |
+
|
1140 |
+
Field | Response
|
1141 |
+
---------------------------------------------------------------------------------------------------|---------------
|
1142 |
+
Participation considerations from adversely impacted groups [protected classes](https://www.senate.ca.gov/content/protected-classes) in model design and testing | None
|
1143 |
+
Measures taken to mitigate against unwanted bias | None
|
1144 |
+
|
1145 |
+
## <span style="color:#466f00;">Explainability:</span>
|
1146 |
+
|
1147 |
+
Field | Response
|
1148 |
+
------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------
|
1149 |
+
Intended Domain | Speech to Text Transcription
|
1150 |
+
Model Type | FastConformer
|
1151 |
+
Intended Users | This model is intended for developers, researchers, academics, and industries building conversation-based applications.
|
1152 |
+
Output | Text
|
1153 |
+
Describe how the model works | Speech input is encoded into embeddings and passed into a Conformer-based model, which outputs a text response.
|
1154 |
+
Name the adversely impacted groups this has been tested to deliver comparable outcomes regardless of | Not Applicable
|
1155 |
+
Technical Limitations & Mitigation | Transcripts may not be 100% accurate. Accuracy varies based on language and characteristics of input audio (Domain, Use Case, Accent, Noise, Speech Type, Context of speech, etc.)
|
1156 |
+
Verified to have met prescribed NVIDIA quality standards | Yes
|
1157 |
+
Performance Metrics | Word Error Rate
|
1158 |
+
Potential Known Risks | If a word is not trained in the language model and is not present in the vocabulary, it is unlikely to be recognized. Not recommended for word-for-word or incomplete-sentence transcription, as accuracy varies based on the context of the input text.
|
1159 |
+
Licensing | GOVERNING TERMS: Use of this model is governed by the [CC-BY-4.0](https://creativecommons.org/licenses/by/4.0/legalcode.en) license.
|
1160 |
+
|
1161 |
+
## <span style="color:#466f00;">Privacy:</span>
|
1162 |
+
|
1163 |
+
Field | Response
|
1164 |
+
----------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------
|
1165 |
+
Generatable or reverse engineerable personal data? | None
|
1166 |
+
Personal data used to create this model? | None
|
1167 |
+
Is there provenance for all datasets used in training? | Yes
|
1168 |
+
Does data labeling (annotation, metadata) comply with privacy laws? | Yes
|
1169 |
+
Is data compliant with data subject requests for data correction or removal, if such a request was made? | No, not possible with externally-sourced data.
|
1170 |
+
Applicable Privacy Policy | https://www.nvidia.com/en-us/about-nvidia/privacy-policy/
|
1171 |
+
|
1172 |
+
## <span style="color:#466f00;">Safety:</span>
|
1173 |
+
|
1174 |
+
Field | Response
|
1175 |
+
---------------------------------------------------|----------------------------------
|
1176 |
+
Model Application(s) | Speech to Text Transcription
|
1177 |
+
Describe the life critical impact | None
|
1178 |
+
Use Case Restrictions | Abide by [CC-BY-4.0](https://creativecommons.org/licenses/by/4.0/legalcode.en) License
|
1179 |
+
Model and dataset restrictions | The principle of least privilege (PoLP) is applied, limiting access for dataset generation and model development. Restrictions enforce dataset access during training, and dataset license constraints are adhered to.
|
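The Explainability table above lists Word Error Rate as the performance metric. As a rough illustration of what that metric measures, here is a minimal sketch that computes WER as word-level edit distance divided by the number of reference words. The function name `word_error_rate` is hypothetical, and the sketch assumes plain whitespace tokenization with no text normalization; the scores reported in the model card may rely on normalization steps that are not shown here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Assumption: words are split on whitespace and compared exactly
    (no casing or punctuation normalization is applied).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution ("sat" -> "sit") and one deletion ("the") over 6 words.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.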