---
license: cc-by-4.0
thumbnail: null
tags:
- automatic-speech-recognition
- speech
- audio
- Transducer
- TDT
- FastConformer
- Conformer
- pytorch
- NeMo
- hf-asr-leaderboard
- coreml
- apple
language:
- en
pipeline_tag: automatic-speech-recognition
base_model:
- nvidia/parakeet-tdt-0.6b-v2
---

# Parakeet TDT 0.6B V2 - CoreML

This is a CoreML-optimized version of NVIDIA's Parakeet TDT 0.6B V2 model, designed for high-performance automatic speech recognition on Apple platforms.

## Model Description

This model has been converted to CoreML format for efficient on-device inference on Apple Silicon and iOS devices, enabling real-time speech recognition with a minimal memory footprint. The models will continue to evolve as we optimize performance and accuracy.

## Usage in Swift

See the [FluidAudio repository](https://github.com/FluidInference/FluidAudioSwift) for instructions. 

## Performance

- Real-time factor (RTFx): ~110x on M4 Pro
- Memory usage: ~800MB peak
- Supported platforms: macOS 14+, iOS 17+
- Optimized for: Apple Silicon 
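
As a rough illustration of what the real-time factor figure implies, the sketch below converts a speed-up factor into an estimated processing time. It assumes that ~110x means audio is processed about 110 times faster than its own duration. The function name `estimated_processing_seconds` is hypothetical and not part of any library, and the result is an estimate, not a measurement.

```python
def estimated_processing_seconds(audio_seconds: float, speedup: float = 110.0) -> float:
    """Estimate wall-clock processing time for a clip, given a speed-up factor.

    Assumption: `speedup` is audio duration divided by processing time,
    so 110.0 means 110 seconds of audio per second of compute.
    """
    if audio_seconds < 0:
        raise ValueError("audio_seconds must be non-negative")
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return audio_seconds / speedup


if __name__ == "__main__":
    one_hour = 60 * 60  # seconds of audio
    print(f"{estimated_processing_seconds(one_hour):.1f} s")
```

Under this assumption, one hour of audio would take roughly 33 seconds to transcribe on the listed hardware. Actual throughput depends on the device and workload.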

## Model Details

- Architecture: FastConformer-TDT
- Parameters: 0.6B
- Sample rate: 16 kHz
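
Because the model works with 16 kHz audio, it can help to check a recording's sample rate before transcription. The following is a minimal sketch using only Python's standard `wave` module. It assumes the input is an uncompressed WAV file. The helper name `wav_matches_model_rate` is hypothetical, and whether the Swift integration resamples automatically is not covered here.

```python
import wave

EXPECTED_SAMPLE_RATE_HZ = 16_000  # sample rate listed under Model Details


def wav_matches_model_rate(path: str, expected_hz: int = EXPECTED_SAMPLE_RATE_HZ) -> bool:
    """Return True if the WAV file at `path` is sampled at `expected_hz`."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate() == expected_hz
```

A file that fails this check would need to be resampled to 16 kHz with an audio tool of your choice before being passed to the model.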

## License

This model is released under the CC-BY-4.0 license. See the LICENSE file for details.

## Acknowledgments

Based on NVIDIA's Parakeet TDT model. CoreML conversion and Swift integration by the FluidInference team.