karolnowakowski committed on
Commit c04044e · 1 Parent(s): ff47b00

Update README.md

Files changed (1)
  1. README.md +15 -7
README.md CHANGED
@@ -8,15 +8,23 @@ license: apache-2.0
  ## Wav2Vec2-Large-XLSR-53 pretrained on Ainu language data

  This is a [wav2vec-large-xlsr-53](https://huggingface.co/facebook/wav2vec2-large-xlsr-53) model adapted for the Ainu language by performing continued pretraining for 100k steps on 234 hours of speech data in Hokkaido Ainu and Sakhalin Ainu.
- For details, please refer to the [paper]() (see below).
+ For details, please refer to the [paper](https://authors.elsevier.com/a/1g7%7Es15hYdqHX1) (see below).

  ## Citation
- When using the model please cite the following paper (in press):
+ When using the model please cite the following paper:
  ```bibtex
- @article{nowakowski2022,
- title={Adapting Multilingual Speech Representation Model for a New, Underresourced Language through Multilingual Fine-tuning and Continued Pretraining},
- author={Nowakowski, Karol and Ptaszynski, Michal and Murasaki, Kyoko and Nieuważny, Jagna},
- year={2022},
- journal={Information Processing & Management}
+ @article{NOWAKOWSKI2023103148,
+ title = {Adapting multilingual speech representation model for a new, underresourced language through multilingual fine-tuning and continued pretraining},
+ journal = {Information Processing & Management},
+ volume = {60},
+ number = {2},
+ pages = {103148},
+ year = {2023},
+ issn = {0306-4573},
+ doi = {https://doi.org/10.1016/j.ipm.2022.103148},
+ url = {https://www.sciencedirect.com/science/article/pii/S0306457322002497},
+ author = {Karol Nowakowski and Michal Ptaszynski and Kyoko Murasaki and Jagna Nieuważny},
+ keywords = {Automatic speech transcription, ASR, Wav2vec 2.0, Pretrained transformer models, Speech representation models, Cross-lingual transfer, Language documentation, Endangered languages, Underresourced languages, Sakhalin Ainu},
+ abstract = {In recent years, neural models learned through self-supervised pretraining on large scale multilingual text or speech data have exhibited promising results for underresourced languages, especially when a relatively large amount of data from related language(s) is available. While the technology has a potential for facilitating tasks carried out in language documentation projects, such as speech transcription, pretraining a multilingual model from scratch for every new language would be highly impractical. We investigate the possibility for adapting an existing multilingual wav2vec 2.0 model for a new language, focusing on actual fieldwork data from a critically endangered tongue: Ainu. Specifically, we (i) examine the feasibility of leveraging data from similar languages also in fine-tuning; (ii) verify whether the model’s performance can be improved by further pretraining on target language data. Our results show that continued pretraining is the most effective method to adapt a wav2vec 2.0 model for a new language and leads to considerable reduction in error rates. Furthermore, we find that if a model pretrained on a related speech variety or an unrelated language with similar phonological characteristics is available, multilingual fine-tuning using additional data from that language can have positive impact on speech recognition performance when there is very little labeled data in the target language.}
  }
  ```
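
As a side note, the sketch below shows one way a checkpoint like the one this README describes could be loaded for feature extraction with the Hugging Face transformers library. It is a minimal illustration, not part of the model card: the repository id and audio file name are placeholders, torchaudio is assumed for audio loading and resampling, and because this is a pretrained-only model (no CTC head) it produces frame-level hidden states rather than transcriptions.

```python
# Minimal sketch (assumptions: transformers and torchaudio installed; ids/paths are placeholders).
import torch
import torchaudio
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Model

model_id = "<this-model-repo-id>"  # placeholder: substitute the actual Hub repository id
model = Wav2Vec2Model.from_pretrained(model_id)
model.eval()

# Default wav2vec 2.0 preprocessing: 16 kHz mono input with zero-mean / unit-variance normalization.
feature_extractor = Wav2Vec2FeatureExtractor(
    feature_size=1, sampling_rate=16_000, padding_value=0.0, do_normalize=True
)

# Load a mono utterance and resample it to the 16 kHz rate expected by XLSR-53 models.
waveform, sample_rate = torchaudio.load("speech.wav")  # placeholder file name
waveform = torchaudio.functional.resample(waveform, sample_rate, 16_000)

inputs = feature_extractor(waveform.squeeze(0).numpy(), sampling_rate=16_000, return_tensors="pt")
with torch.no_grad():
    hidden_states = model(**inputs).last_hidden_state  # shape: (1, num_frames, 1024)
print(hidden_states.shape)
```

To obtain actual transcriptions, such a checkpoint would still need CTC fine-tuning on labeled speech (for example via transformers' Wav2Vec2ForCTC), which is broadly the setup the cited paper investigates.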