kerlos127 committed on
Commit 0778ec3 · verified · 1 Parent(s): 4d6fb0b

Update README.md

Files changed (1)
  1. README.md +2 -78
README.md CHANGED
@@ -30,82 +30,6 @@ model-index:
   name: Wer
  ---
 
- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->
 
- # Whisper Medium (Thai): Combined V3
-
- This model is a fine-tuned version of [openai/whisper-medium](https://huggingface.co/openai/whisper-medium) on augmented versions of the mozilla-foundation/common_voice_13_0 (th), google/fleurs, and curated datasets.
- It achieves the following results on the common-voice-13 test set:
- - WER: 7.42 (computed with the Deepcut tokenizer; a sketch of the metric follows)
-
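- Thai text is written without spaces between words, so WER is computed after word segmentation. A minimal sketch of that metric, assuming the `deepcut` and `jiwer` packages (the exact evaluation script is not part of this card):
-
- ```py
- import deepcut  # Thai word tokenizer
- from jiwer import wer
-
- def thai_wer(reference: str, hypothesis: str) -> float:
-     # Segment the Thai strings into words, then rejoin with spaces so a
-     # standard space-delimited WER can be computed.
-     ref = " ".join(deepcut.tokenize(reference))
-     hyp = " ".join(deepcut.tokenize(hypothesis))
-     return wer(ref, hyp)
- ```
-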
- ## Model description
-
- Use the model with Hugging Face's `transformers` as follows:
-
- ```py
- import torch
- from transformers import pipeline
-
- MODEL_NAME = "biodatlab/whisper-th-medium-combined"  # model name on the Hugging Face Hub
- lang = "th"  # Thai language code
-
- device = 0 if torch.cuda.is_available() else "cpu"
-
- pipe = pipeline(
-     task="automatic-speech-recognition",
-     model=MODEL_NAME,
-     chunk_length_s=30,
-     device=device,
- )
- pipe.model.config.forced_decoder_ids = pipe.tokenizer.get_decoder_prompt_ids(
-     language=lang,
-     task="transcribe"
- )
- text = pipe("audio.mp3")["text"]  # transcribe an audio file to text
- ```
-
- ## Intended uses & limitations
-
- More information needed
-
- ## Training and evaluation data
-
- More information needed
-
- ## Training procedure
-
- ### Training hyperparameters
-
- The following hyperparameters were used during training (a sketch mapping them to `Seq2SeqTrainingArguments` follows the list):
- - learning_rate: 1e-05
- - train_batch_size: 16
- - eval_batch_size: 16
- - seed: 42
- - optimizer: AdamW with betas=(0.9, 0.999) and epsilon=1e-08
- - lr_scheduler_type: linear
- - lr_scheduler_warmup_steps: 500
- - training_steps: 10000
- - mixed_precision_training: Native AMP
-
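- A minimal sketch of how these settings could be expressed with `transformers`' `Seq2SeqTrainingArguments`; the output directory is illustrative, and the original training script is not part of this card:
-
- ```py
- from transformers import Seq2SeqTrainingArguments
-
- training_args = Seq2SeqTrainingArguments(
-     output_dir="./whisper-th-medium-combined",  # illustrative path
-     learning_rate=1e-5,
-     per_device_train_batch_size=16,
-     per_device_eval_batch_size=16,
-     seed=42,
-     # AdamW with betas=(0.9, 0.999) and eps=1e-8 is the Trainer's default optimizer
-     lr_scheduler_type="linear",
-     warmup_steps=500,
-     max_steps=10000,
-     fp16=True,  # native AMP mixed-precision training
- )
- ```
-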
- ### Framework versions
-
- - Transformers 4.37.2
- - PyTorch 2.1.0
- - Datasets 2.16.1
- - Tokenizers 0.15.1
-
- ## Citation
-
- Cite using BibTeX:
-
- ```bibtex
- @misc{thonburian_whisper_med,
-   author    = { Atirut Boribalburephan, Zaw Htet Aung, Knot Pipatsrisawat, Titipat Achakulvisut },
-   title     = { Thonburian Whisper: A fine-tuned Whisper model for Thai automatic speech recognition },
-   year      = 2022,
-   url       = { https://huggingface.co/biodatlab/whisper-th-medium-combined },
-   doi       = { 10.57967/hf/0226 },
-   publisher = { Hugging Face }
- }
- ```
 
+ CTranslate2 (CT2) model, converted from
+ https://huggingface.co/biodatlab/whisper-th-medium-combined
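+
+ A minimal sketch of the conversion and of loading the result, assuming CTranslate2's `ct2-transformers-converter` CLI and the `faster-whisper` package (the output directory name is illustrative):
+
+ ```py
+ # Conversion (run once in a shell), e.g.:
+ #   ct2-transformers-converter --model biodatlab/whisper-th-medium-combined \
+ #       --output_dir whisper-th-medium-combined-ct2 --quantization float16
+
+ from faster_whisper import WhisperModel
+
+ # Load the converted weights from the illustrative local directory
+ model = WhisperModel(
+     "whisper-th-medium-combined-ct2",
+     device="cuda",
+     compute_type="float16",
+ )
+
+ segments, info = model.transcribe("audio.mp3", language="th")
+ for segment in segments:
+     print(segment.text)
+ ```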