---
license: cc-by-4.0
thumbnail: null
tags:
- automatic-speech-recognition
- speech
- audio
- Transducer
- TDT
- FastConformer
- Conformer
- pytorch
- NeMo
- hf-asr-leaderboard
- coreml
- apple
language:
- en
pipeline_tag: automatic-speech-recognition
base_model:
- nvidia/parakeet-tdt-0.6b-v2
---
# Parakeet TDT 0.6B V2 - CoreML
This is a CoreML-optimized version of NVIDIA's Parakeet TDT 0.6B V2 model, designed for high-performance automatic speech recognition on Apple platforms.
## Model Description
This model has been converted to CoreML format for efficient on-device inference on Apple Silicon and iOS devices, enabling real-time speech recognition with a minimal memory footprint. The models will continue to evolve as we optimize performance and accuracy.
## Usage in Swift
See the [FluidAudio repository](https://github.com/FluidInference/FluidAudioSwift) for instructions.
## Performance
- Real-time factor (RTFx): ~110x on M4 Pro, i.e. audio is transcribed roughly 110 times faster than real time
- Memory usage: ~800 MB peak
- Supported platforms: macOS 14+, iOS 17+
- Optimized for: Apple Silicon
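
To make the real-time factor figure concrete, the arithmetic can be sketched in Swift. This is only an illustration: the function name `estimatedProcessingTime` is hypothetical, and the ~110x value is the M4 Pro figure quoted above, so throughput on other devices will likely differ.

```swift
import Foundation

/// Estimates how long transcription takes for a clip of the given length.
/// `speedFactor` is the number of seconds of audio processed per second of
/// compute (assumed to be ~110 on an M4 Pro, per the figure above).
func estimatedProcessingTime(audioDuration: TimeInterval, speedFactor: Double) -> TimeInterval {
    precondition(speedFactor > 0, "speedFactor must be positive")
    return audioDuration / speedFactor
}

// One hour of audio at ~110x real time: 3600 / 110 seconds.
let seconds = estimatedProcessingTime(audioDuration: 3600, speedFactor: 110)
print(String(format: "%.1f s", seconds)) // prints "32.7 s"
```

Under that assumption, an hour-long recording would take about half a minute to transcribe.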
## Model Details
- Architecture: FastConformer-TDT
- Parameters: 0.6B
- Sample rate: 16 kHz
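
Audio captured at other sample rates needs to be resampled before it reaches the model. The following is a minimal sketch using `AVAudioConverter` from AVFoundation. The helper `resampleTo16kHz`, the `ResampleError` type, and the choice of mono Float32 output are assumptions made for illustration; this card only states the sample rate, and the FluidAudio library may already perform this conversion for you.

```swift
import AVFoundation

// Hypothetical error type for this sketch.
enum ResampleError: Error {
    case formatUnavailable
    case bufferAllocationFailed
    case conversionFailed(Error?)
}

/// Converts a PCM buffer to 16 kHz mono Float32 (output layout is an assumption).
func resampleTo16kHz(_ input: AVAudioPCMBuffer) throws -> AVAudioPCMBuffer {
    let targetRate = 16_000.0
    guard let targetFormat = AVAudioFormat(commonFormat: .pcmFormatFloat32,
                                           sampleRate: targetRate,
                                           channels: 1,
                                           interleaved: false),
          let converter = AVAudioConverter(from: input.format, to: targetFormat) else {
        throw ResampleError.formatUnavailable
    }

    // Size the output for the resampled length, with a little headroom.
    let ratio = targetRate / input.format.sampleRate
    let capacity = AVAudioFrameCount((Double(input.frameLength) * ratio).rounded(.up)) + 32
    guard let output = AVAudioPCMBuffer(pcmFormat: targetFormat, frameCapacity: capacity) else {
        throw ResampleError.bufferAllocationFailed
    }

    // Hand the input buffer to the converter once, then signal end of stream.
    var delivered = false
    var conversionError: NSError?
    let status = converter.convert(to: output, error: &conversionError) { _, inputStatus in
        if delivered {
            inputStatus.pointee = .endOfStream
            return nil
        }
        delivered = true
        inputStatus.pointee = .haveData
        return input
    }
    guard status != .error else {
        throw ResampleError.conversionFailed(conversionError)
    }
    return output
}
```

The returned buffer exposes its samples through `floatChannelData`, which can then be passed to whatever inference wrapper you use.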
## License
This model is released under the CC-BY-4.0 license. See the LICENSE file for details.
## Acknowledgments
Based on NVIDIA's Parakeet TDT model. CoreML conversion and Swift integration by the FluidInference team.