How to use automatic language recognition?

#33
by owao - opened

I couldn't find a way to set the language to auto or None. I always get: Invalid language alpha2 code [type=language_alpha2, input_value='automatic', input_type=str]
Since language is also a required positional argument, how is automatic detection meant to be used?

Thanks in advance

Same issue here.
Automatic language detection is mentioned, but all the libraries and example code expect a defined language.

I think I found a workaround.

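# Patch the language field so it becomes optional.
# Assumes `processor`, `model`, and `wav_buffer` are already loaded/defined.
from typing import Optional

import torch
from pydantic_extra_types.language_code import LanguageAlpha2
from mistral_common.protocol.transcription.request import TranscriptionRequest as _TR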

class TranscriptionRequest(_TR):
    # make it optional
    language: Optional[LanguageAlpha2] = None



# for transcribing, use the mistral_common helper directly:
repo_id = "mistralai/Voxtral-Mini-3B-2507"

openai_req = {
    "model": repo_id,
    "file": wav_buffer,  # has to be a path or io.BytesIO
    # "language": "en",  # now optional; leave out for auto-detection
}
tr = TranscriptionRequest.from_openai(openai_req)

tok = processor.tokenizer.tokenizer.encode_transcription(tr)
audio_feats = processor.feature_extractor(
    wav_buffer, sampling_rate=16000, return_tensors="pt"
).input_features.to(model.device)

with torch.no_grad():
    ids = model.generate(
        input_features=audio_feats,
        input_ids=torch.tensor([tok.tokens], device=model.device),
        max_new_tokens=500,
        num_beams=1,
    )
response = processor.batch_decode(ids, skip_special_tokens=True)[0]
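
The snippet above does not show where wav_buffer comes from. As a minimal sketch, assuming 16 kHz mono 16-bit PCM is what you want (the 16000 mirrors sampling_rate=16000 above), an in-memory WAV can be built with the standard library only. The helper name make_wav_buffer and the sine-tone content are hypothetical placeholders, not part of mistral_common or transformers:

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 16000  # assumption: mirrors sampling_rate=16000 in the workaround


def make_wav_buffer(seconds: float = 1.0, freq_hz: float = 440.0) -> io.BytesIO:
    """Return an in-memory mono 16-bit PCM WAV holding a sine tone (placeholder audio)."""
    n_samples = int(SAMPLE_RATE * seconds)
    # Pack each sample as a little-endian signed 16-bit integer at 30% amplitude.
    frames = b"".join(
        struct.pack(
            "<h",
            int(0.3 * 32767 * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)),
        )
        for i in range(n_samples)
    )
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wf:
        wf.setnchannels(1)   # mono
        wf.setsampwidth(2)   # 2 bytes per sample = 16-bit
        wf.setframerate(SAMPLE_RATE)
        wf.writeframes(frames)
    buf.seek(0)  # rewind so the next reader starts at the WAV header
    return buf


wav_buffer = make_wav_buffer()
```

For real speech you would more likely wrap an existing file's bytes, e.g. io.BytesIO(open(path, "rb").read()), and rewind the buffer with seek(0) before each consumer reads it.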

Thanks for sharing your solution, @sharrnah! I'll give it a try soon and report back!

This solution worked! In fact, I gave it to Claude.ai and "he" integrated it into the script.
