Update GitHub repository link
This PR updates the GitHub repository link at the top of the model card from `[[Installation]](https://github.com/usefulsensors/moonshine/blob/main/README.md)` to `[[Code]](https://github.com/moonshine-ai/moonshine)`. This change provides a more direct link to the project's main code repository, improving discoverability and ease of access for users.
All other metadata and content sections remain unchanged as they are consistent with the paper [Flavors of Moonshine: Tiny Specialized ASR Models for Edge Devices](https://huggingface.co/papers/2509.02523) and the current information available.
README.md
CHANGED
@@ -1,14 +1,15 @@
 ---
-license: other
 language:
-- zh
+- zh
 library_name: transformers
+license: other
 pipeline_tag: automatic-speech-recognition
-arxiv: https://arxiv.org/abs/2509.02523
+arxiv: https://arxiv.org/abs/2509.02523
 ---
+
 # Moonshine
 
-[[Paper]](https://arxiv.org/abs/2509.02523) [[Installation]](https://github.com/usefulsensors/moonshine/blob/main/README.md)
+[[Paper]](https://arxiv.org/abs/2509.02523) [[Code]](https://github.com/moonshine-ai/moonshine)
 
 This is the model card for running the automatic speech recognition (ASR) models (Moonshine models) trained and released by Moonshine AI (f.k.a Useful Sensors.)
 
@@ -92,7 +93,7 @@ Our evaluations show that, the models exhibit greater accuracy on standard datas
 
 However, like any machine learning model, the predictions may include texts that are not actually spoken in the audio input (i.e. hallucination). We hypothesize that this happens because, given their general knowledge of language, the models combine trying to predict the next word in audio with trying to transcribe the audio itself.
 
-In addition, the sequence-to-sequence architecture of the model makes it prone to generating repetitive texts, which can be mitigated to some degree by beam search and temperature scheduling but not perfectly. It is likely that this behavior and hallucinations may be worse for short audio segments, or segments where parts of words are cut off at the beginning or the end of the segment.
+In addition, the sequence-to-sequence architecture of the model makes it prone to generating repetitive texts, which can be mitigated to some degree by beam search and temperature scheduling but not perfectly. It is likely that this behavior and hallucinations may be worse for short audio segments, or segments where parts of words are cut off at the beginning or at the end of the segment.
 
 ## Broader Implications
 
@@ -113,4 +114,4 @@ If you benefit from our work, please cite us:
 primaryClass={cs.CL},
 url={https://arxiv.org/abs/2509.02523},
 }
-```
+```