---
license: mit
language:
- ar
metrics:
- cer
- wer
base_model:
- microsoft/wavlm-large
---

# Overview

This Tunisian Automatic Speech Recognition (ASR) project aims to accurately transcribe spoken Tunisian Arabic into text. The model is fine-tuned from WavLM (a Transformer-based speech model that extends the Wav2Vec 2.0 line of work) and is boosted with a KenLM language model located at `language_model/languageModel.arpa`.

## 📈 Performance

Tested on a private dataset:

| CER | WER |
| :------- | :------------------------- |
| `9.18%` | `24.78%` |

The private dataset contains 2.5 hours of Tunisian audio data.

## 🚀 How to run the web app locally?

### 1. Download the repo:

Make sure you have installed the Hugging Face client before cloning the repo.

```bash
git clone https://huggingface.co/brdhaker3/TunASR
```

### 2. Install the necessary dependencies:

```bash
pip install -r requirements.txt
```

### 3. Adjust the hyperparams.yaml file

Check the **hyperparameters** file `hyperparams.yaml` and verify the path of the **language model**.

### 4. 🌐 Run the web app:

To run the web app, you only have to execute:

```bash
python app.py
```

## ✉️ Contact:

If you have questions, you can send an email to: benrejebdhaker3@gmail.com

* [Dhaker Br](https://tn.linkedin.com/in/BrDhaker)
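
## 🧮 Note on CER and WER

For readers unfamiliar with the CER and WER figures reported in the performance table, the sketch below shows how these metrics are commonly defined: the Levenshtein edit distance between the reference transcript and the model output, divided by the reference length, counted over characters for CER and over whitespace-separated words for WER. The function names `edit_distance`, `cer`, and `wer` are illustrative and are not part of this repo. The evaluation script behind the reported numbers is not described here and may apply text normalization that this sketch omits.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of tokens)."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning ref[:i] into an empty hypothesis takes i deletions
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance divided by reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edit distance divided by reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Hypothetical transcript pair, for illustration only.
    reference = "the cat sat"
    hypothesis = "the cat sit"
    print(f"WER: {wer(reference, hypothesis):.2%}")
    print(f"CER: {cer(reference, hypothesis):.2%}")
```

In this hypothetical pair, one of three words differs and one of eleven characters differs, which is why WER is typically higher than CER on the same output: a single wrong character makes the whole word count as an error.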