---
license: mit
language:
- en
tags:
- RVC
- voice-conversion
- text-to-speech
- voice-cloning
- audio-generation
datasets:
- LJSpeech
- VCTK
metrics:
- MOS (Mean Opinion Score)
- PESQ
- STOI
base_model:
- MangioRVC/Mangio-RVC-Huggingface
pipeline_tag: audio-to-audio
---

# 🌙 LUNAR - High-Quality Female Voice RVC Model

LUNAR is a state-of-the-art **RVC (Retrieval-Based Voice Conversion) model** optimized for **female voice conversion** with studio-grade audio quality at a **48kHz sampling rate**. It delivers natural-sounding voice transformations with minimal artifacts.

## **Performance & Efficiency Metrics**

The charts below summarize the training and inference benchmarks of Lunar-RVC:

### **1. Training Loss Curve**
![Training Loss](./Metrics/1.jpg)

### **2. Validation Loss Curve**
![Validation Loss](./Metrics/2.jpg)

### **3. Training vs Validation Loss**
![Train vs Val Loss](./Metrics/3.jpg)

### **4. Inference Speed Comparison**
![Inference Speed](./Metrics/4.jpg)

### **5. Audio Quality Scores (MOS)**
![MOS Score](./Metrics/5.jpg)

### **6. GPU Memory Usage**
![GPU Memory](./Metrics/6.jpg)

### **7. Dataset Duration Distribution**
![Dataset Distribution](./Metrics/7.jpg)

### **8. Spectral Convergence**
![Spectral Convergence](./Metrics/8.jpg)

### **9. Model Size Comparison**
![Model Size](./Metrics/9.jpg)

### **10. Efficiency Radar Chart**
![Efficiency Radar](./Metrics/10.jpg)

---

## Key Features

- **High-Fidelity Conversion** - Produces natural, expressive female voices
- **Real-Time Ready** - Optimized for low-latency inference (<20ms/frame)
- **Pitch & Timbre Control** - Flexible voice modulation capabilities
- **48kHz Studio Quality** - Professional-grade audio output
- **Easy Integration** - Compatible with popular voice toolkits

## 📊 Model Specifications

| Parameter          | Value               |
|--------------------|---------------------|
| Framework          | RVC v2              |
| Sample Rate        | 48kHz               |
| Bit Depth          | 16-bit              |
| Model Size         | 1.8GB               |
| Training Duration  | 150 epochs (~10h)   |
| VRAM Requirements  | 4GB+ (inference)    |
| Supported Formats  | WAV, MP3, FLAC      |

---

## **Inference Guide**

To use Lunar-RVC for inference:

```bash
# Clone the repository
git clone https://huggingface.co/IssacMosesD/Lunar-RVC-Model
# git names the directory after the repository, so it is Lunar-RVC-Model
cd Lunar-RVC-Model

# Install dependencies
pip install -r requirements.txt

# Run inference
python infer.py --input input.wav --output output.wav --model Lunar-RVC.pth
```

## Use Cases

- Voice Cloning – Convert your voice into a professional singing voice.
- Streaming – Real-time voice conversion for content creators.
- Dubbing – High-quality voice conversion for movies & animations.
- Music Production – Transform any vocal track into a new singer’s voice.

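
For any of the use cases above, it may help to confirm that a WAV input matches the 48kHz, 16-bit format listed under Model Specifications before passing it to `infer.py`. Whether `infer.py` resamples its input automatically is not stated here, so the following is only a minimal sketch built on Python's standard `wave` module. The function name `check_wav` and the two constants are hypothetical and are not part of the Lunar-RVC repository.

```python
import wave

# Assumed targets, taken from the Model Specifications table.
EXPECTED_RATE = 48_000  # samples per second
EXPECTED_WIDTH = 2      # bytes per sample, i.e. 16-bit


def check_wav(path):
    """Return a list of mismatches between a WAV file and the expected format.

    An empty list means the file is 48kHz, 16-bit. Only uncompressed WAV
    files are supported, since that is all the `wave` module can read.
    """
    problems = []
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        width = wav.getsampwidth()
    if rate != EXPECTED_RATE:
        problems.append(f"sample rate is {rate} Hz, expected {EXPECTED_RATE} Hz")
    if width != EXPECTED_WIDTH:
        problems.append(
            f"bit depth is {width * 8}-bit, expected {EXPECTED_WIDTH * 8}-bit"
        )
    return problems
```

Calling `check_wav("input.wav")` returns an empty list when the file already matches; otherwise each entry describes one mismatch, which can be fixed by resampling the file in an audio editor before conversion.
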
## System Requirements

- OS: Windows / Linux
- Python: 3.8+
- GPU: NVIDIA (6GB+ VRAM recommended)
- CUDA: 11.7+
- Torch: 1.13.1+

## Contact

For support, queries, or collaboration:

- Name: [Issac Moses D](https://www.linkedin.com/in/issacmosesd/)
- Email: [issacmoses19082005@gmail.com](mailto:issacmoses19082005@gmail.com)
- Hugging Face: [IssacMosesD](https://huggingface.co/IssacMosesD)
- GitHub: [Issac Moses D](https://github.com/Issac-Moses)

## Collaborator

- Name: [Dharani Karnan](https://www.linkedin.com/in/dharani-karnan-060040320/)
- Email: [dharanikarnan18@gmail.com](mailto:dharanikarnan18@gmail.com)
- Hugging Face: [Dharani K](https://huggingface.co/dharzz188)
- GitHub: [Dharani K](https://github.com/Issac-Moses)