How to use automatic language recognition?
#33, opened by owao
I couldn't find a way to set the language to auto or None. I'm always getting: Invalid language alpha2 code [type=language_alpha2, input_value='automatic', input_type=str]
And language is a required positional arg, so how is it meant to be used?
Thanks in advance
Same issue here.
Automatic language detection is mentioned, but all the libraries and example code expect a defined language.
I think I found a workaround.
# patching optional language field
from typing import Optional

import torch
from pydantic_extra_types.language_code import LanguageAlpha2
from mistral_common.protocol.transcription.request import TranscriptionRequest as _TR


class TranscriptionRequest(_TR):
    # make it optional
    language: Optional[LanguageAlpha2] = None


# for transcribing, use the mistral_common helper directly
# (processor, model and wav_buffer are assumed to be loaded already):
repo_id = "mistralai/Voxtral-Mini-3B-2507"
openai_req = {
    "model": repo_id,
    "file": wav_buffer,  # has to be a path or io.BytesIO
    # "language": "en"  # This is now optional. Leave out for auto detection.
}
tr = TranscriptionRequest.from_openai(openai_req)
tok = processor.tokenizer.tokenizer.encode_transcription(tr)
audio_feats = processor.feature_extractor(
    wav_buffer, sampling_rate=16000, return_tensors="pt"
).input_features.to(model.device)
with torch.no_grad():
    ids = model.generate(
        input_features=audio_feats,
        input_ids=torch.tensor([tok.tokens], device=model.device),
        max_new_tokens=500,
        num_beams=1,
    )
response = processor.batch_decode(ids, skip_special_tokens=True)[0]
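If you want callers to be able to pass "auto" or None without hitting the validation error, one option is a small wrapper that builds the request dict and simply omits the "language" key in those cases. This is only a sketch: build_openai_request and AUTO_VALUES are hypothetical names of my own, not part of mistral_common, and it assumes the patched TranscriptionRequest above is what consumes the dict.

```python
import io
from typing import Optional, Union

# Hypothetical sentinel values treated as "let the model detect the language".
AUTO_VALUES = {None, "", "auto", "automatic"}


def build_openai_request(
    model_id: str,
    audio: Union[str, io.BytesIO],
    language: Optional[str] = None,
) -> dict:
    """Build the request dict, adding "language" only when a real code is given."""
    request = {"model": model_id, "file": audio}
    # Normalize strings so that " EN " and "en" are treated the same.
    normalized = language.strip().lower() if isinstance(language, str) else language
    if normalized not in AUTO_VALUES:
        request["language"] = normalized
    return request


auto_req = build_openai_request("mistralai/Voxtral-Mini-3B-2507", io.BytesIO(b""), "automatic")
en_req = build_openai_request("mistralai/Voxtral-Mini-3B-2507", io.BytesIO(b""), "EN")
print("language" in auto_req)  # False
print(en_req["language"])      # en
```

The resulting dict can then be passed to TranscriptionRequest.from_openai as in the workaround. Any other string is passed through unchanged, so an invalid code will still be rejected by the alpha2 validation.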
This solution worked! In fact, I gave it to Claude.ai and "he" integrated it into the script.