How can I use your model to continue my training process?
Is your model suitable for training on another Middle Eastern language? And if so, how can I use your model to continue the training process?
You can follow the instructions for the F5-TTS model (https://github.com/SWivid/F5-TTS):
1- Download the checkpoint – the best option is the 380000.pt checkpoint, along with the vocab.txt file.
2- Verify the vocabulary file – ensure that vocab.txt contains all the characters in your dataset (see the sketch after this list). If you are continuing fine-tuning in Arabic, you should be fine, as I have already included all Arabic characters and diacritics.
3- Set up the F5-TTS environment – once the environment is ready, choose the fine-tuning option. Then set the model path to the checkpoint file and the tokenizer to the vocab.txt file.
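To make step 2 concrete, here is a minimal coverage check you could run yourself. It assumes vocab.txt has one token per line and that your transcripts sit in a pipe-separated metadata.csv; both file names and the split format are placeholders to adapt to your dataset:

```python
# Hypothetical vocab coverage check: confirm every character that appears in
# the transcripts is also present in vocab.txt (assumed one token per line).
from pathlib import Path

vocab = set(Path("vocab.txt").read_text(encoding="utf-8").splitlines())

missing = set()
for line in Path("metadata.csv").read_text(encoding="utf-8").splitlines():
    if "|" not in line:
        continue  # skip headers or blank lines
    # assumes "audio_path|transcript" rows; adjust the split to your format
    _, text = line.split("|", 1)
    missing |= {ch for ch in text if ch not in vocab}

if missing:
    print("Characters missing from vocab.txt:", sorted(missing))
else:
    print("vocab.txt covers every character in the dataset.")
```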
Hello,
Thank you for sharing this file! What was the model configuration used for training?
It seems the default one doesn't work: dict(dim=1024, depth=22, heads=16, ff_mult=2, text_dim=512, conv_layers=4)
It is the default; I did not change it: model_cfg = dict(dim=1024, depth=22, heads=16, ff_mult=2, text_dim=512, conv_layers=4). I did change the vocab though, so make sure you are using the one here in the repo.
Thank you!! Now it works - the vocab file was exactly my problem.
Thanks for your help, but:
I can't use your vocab file. When I give the path to your vocab file in the Tokenizer File field, it gives this error:
Loading model cost 0.440 seconds.
Prefix dict has been built successfully.
vocab : 2580
vocoder : vocos
Using logger: None
Loading dataset ...
Traceback (most recent call last):
File "C:\pinokio\api\e2-f5-tts.git\app\src\f5_tts\model\dataset.py", line 259, in load_dataset
train_dataset = load_from_disk(f"{rel_data_path}/raw")
File "C:\pinokio\api\e2-f5-tts.git\app\env\lib\site-packages\datasets\load.py", line 2207, in load_from_disk
raise FileNotFoundError(f"Directory {dataset_path} not found")
FileNotFoundError: Directory C:\pinokio\api\e2-f5-tts.git\app\src\f5_tts....\data\my_Language_custom/raw not found
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\pinokio\api\e2-f5-tts.git\app\src\f5_tts\train\finetune_cli.py", line 182, in
main()
File "C:\pinokio\api\e2-f5-tts.git\app\src\f5_tts\train\finetune_cli.py", line 173, in main
train_dataset = load_dataset(args.dataset_name, tokenizer, mel_spec_kwargs=mel_spec_kwargs)
File "C:\pinokio\api\e2-f5-tts.git\app\src\f5_tts\model\dataset.py", line 261, in load_dataset
train_dataset = Dataset_.from_file(f"{rel_data_path}/raw.arrow")
File "C:\pinokio\api\e2-f5-tts.git\app\env\lib\site-packages\datasets\arrow_dataset.py", line 742, in from_file
table = ArrowReader.read_table(filename, in_memory=in_memory)
File "C:\pinokio\api\e2-f5-tts.git\app\env\lib\site-packages\datasets\arrow_reader.py", line 329, in read_table
return table_cls.from_file(filename)
File "C:\pinokio\api\e2-f5-tts.git\app\env\lib\site-packages\datasets\table.py", line 1017, in from_file
table = _memory_mapped_arrow_table_from_file(filename)
File "C:\pinokio\api\e2-f5-tts.git\app\env\lib\site-packages\datasets\table.py", line 63, in _memory_mapped_arrow_table_from_file
opened_stream = _memory_mapped_record_batch_reader_from_file(filename)
File "C:\pinokio\api\e2-f5-tts.git\app\env\lib\site-packages\datasets\table.py", line 48, in _memory_mapped_record_batch_reader_from_file
memory_mapped_stream = pa.memory_map(filename)
File "pyarrow\io.pxi", line 1147, in pyarrow.lib.memory_map
File "pyarrow\io.pxi", line 1094, in pyarrow.lib.MemoryMappedFile._open
File "pyarrow\error.pxi", line 155, in pyarrow.lib.pyarrow_internal_check_status
File "pyarrow\error.pxi", line 92, in pyarrow.lib.check_status
FileNotFoundError: [WinError 3] Failed to open local file 'C:/pinokio/api/e2-f5-tts.git/app/src/f5_tts/../../data/my_Language_custom/raw.arrow'. Detail: [Windows error 3] The system cannot find the path specified.
Traceback (most recent call last):
File "C:\pinokio\bin\miniconda\lib\runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "C:\pinokio\bin\miniconda\lib\runpy.py", line 86, in run_code
exec(code, run_globals)
File "C:\pinokio\api\e2-f5-tts.git\app\env\Scripts\accelerate.exe_main.py", line 7, in
File "C:\pinokio\api\e2-f5-tts.git\app\env\lib\site-packages\accelerate\commands\accelerate_cli.py", line 48, in main
args.func(args)
File "C:\pinokio\api\e2-f5-tts.git\app\env\lib\site-packages\accelerate\commands\launch.py", line 1172, in launch_command
simple_launcher(args)
File "C:\pinokio\api\e2-f5-tts.git\app\env\lib\site-packages\accelerate\commands\launch.py", line 762, in simple_launcher
raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['C:\pinokio\api\e2-f5-tts.git\app\env\Scripts\python.exe', 'C:\pinokio\api\e2-f5-tts.git\app\src\f5_tts\train\finetune_cli.py', '--exp_name', 'F5TTS_Base', '--learning_rate', '1e-05', '--batch_size_per_gpu', '800', '--batch_size_type', 'frame', '--max_samples', '64', '--grad_accumulation_steps', '1', '--max_grad_norm', '1', '--epochs', '14', '--num_warmup_updates', '66', '--save_per_updates', '500', '--keep_last_n_checkpoints', '1', '--last_per_updates', '200', '--dataset_name', 'my_Language', '--finetune', '--pretrain', 'C:\pinokio\api\e2-f5-tts.git\app\ckpts\model_380000.pt', '--tokenizer_path', 'C:\pinokio\api\e2-f5-tts.git\app\ckpts\vocab.txt', '--tokenizer', 'custom', '--log_samples', '--logger', 'wandb']' returned non-zero exit status 1.
When I remove the path to your vocab file from the Tokenizer File field, it gives this:
Creating dynamic batches with 800 audio frames per gpu: 0%| | 0/1335 [00:00<?, ?it/s]
Creating dynamic batches with 800 audio frames per gpu: 100%|##########| 1335/1335 [00:00<?, ?it/s]
Saved last checkpoint at update 380000
For this issue ("Saved last checkpoint at update 380000"): how many epochs are you using? I think you should increase the number of epochs.
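If I read your log correctly, the rough arithmetic looks like this (my own sketch of the resume logic, not the exact trainer code; the numbers come from the pasted log and command above):

```python
# Rough sketch (assumed, not the exact trainer logic): when resuming from a
# checkpoint saved at update 380,000, no new updates happen unless the
# planned total of updates exceeds that number.
resume_update = 380_000      # update counter stored in the resumed checkpoint
batches_per_epoch = 1_335    # from the "Creating dynamic batches ... 1335" log above
epochs = 14                  # from the --epochs flag in the pasted command

total_planned_updates = epochs * batches_per_epoch
print(total_planned_updates)                   # 18,690
print(total_planned_updates > resume_update)   # False -> run ends and just re-saves
```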
14 epochs
Does that mean I should set the number of epochs so that when multiplied by the samples, it exceeds 380,000?
I saved your file with two names in the project path:
model_last.pt
pretrained_model_380000.pt
Does the number of epochs matter? Because I was just testing.
And one more thing: when I reduce the size of your file in the Reduce Checkpoint tab, these problems do not occur. Can this method be used to continue training the model?
If you're just testing, use infer_cli.py instead of finetune_cli.py. Make sure the checkpoint and vocabulary paths are correct. Additionally, provide reference audio along with Arabic reference text for optimal results. Experiment with the hyperparameters to fine-tune performance, and it should work without issues.
For fine-tuning, I haven't tested the reduced checkpoints myself, but if it includes the model's state dict, it should work just fine.
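Reducing a checkpoint may also strip things like optimizer state and the saved update counter, which might be why the "Saved last checkpoint at update 380000" issue disappears, but I haven't verified that. A quick torch-only inspection like the one below should show you what is actually left in the file (the path is a placeholder, and the keys you see depend on how the checkpoint was saved):

```python
# Inspect a checkpoint before fine-tuning from it. This only uses torch.load,
# so it works on both full and reduced checkpoints.
import torch

ckpt = torch.load("model_last.pt", map_location="cpu")  # placeholder path

if isinstance(ckpt, dict):
    for k, v in ckpt.items():
        # print each top-level entry: its type and its length (or value for scalars)
        size = len(v) if hasattr(v, "__len__") else v
        print(f"{k}: {type(v).__name__} ({size})")
else:
    print(type(ckpt))
```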
What I mean by testing is that I was experimenting with a small dataset so that I could eventually use a larger one. Actually, I wanted to calculate how much time it takes to train the model for each hour of dataset audio.
I want to collect quality data every few days and keep going.
The main problem is with the vocab file: I can't use my vocab with your pretrained model.
size mismatch for ema_model.transformer.text_embed.text_embed.weight: copying a param with shape torch.Size([2581, 512]) from checkpoint, the shape in current model is torch.Size([2590, 512]).
Problem solved:
When we click the Extend option in the Vocab Check tab, it creates a pretrained_model_1200000 and a vocab.txt file inside our project and uses them.
But because it was using the model_1200000.pt file in the
C:\pinokio\api\e2-f5-tts.git\cache\HF_HOME\hub\models--SWivid--F5-TTS\snapshots\4dcc16f297f2ff98a17b3726b16f5de5a5e45672\F5TTS_Base path,
the new vocab file was not compatible with your pretrained model.
I replaced your pretrained model file (model_380000.pt) with the project's original file (model_1200000.pt), and the problem was solved.
Since you're using a new vocabulary with a different size, you need to expand the model's embeddings accordingly. When you expanded the vocabulary, the code automatically adjusted the model's original embeddings to match. If you want to use the 380000.pt checkpoint, you'll need to do the same: expand its embeddings to ensure compatibility.
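For reference, here is roughly what that expansion amounts to, based only on the parameter name and shapes in the size-mismatch error above. Treat the container key ("ema_model_state_dict"), the zero padding, and the file names as assumptions to verify against your own checkpoint, not as the official procedure:

```python
# Hypothetical sketch: pad the text-embedding matrix of a checkpoint so its
# row count matches a larger vocabulary. The key name comes from the
# "size mismatch for ema_model.transformer.text_embed.text_embed.weight"
# error earlier in this thread.
import torch

ckpt = torch.load("model_380000.pt", map_location="cpu")
# Assumed layout: EMA weights live under "ema_model_state_dict";
# fall back to the top-level dict if the checkpoint was already reduced.
state = ckpt.get("ema_model_state_dict", ckpt)

key = "ema_model.transformer.text_embed.text_embed.weight"
old = state[key]          # torch.Size([2581, 512]) in the error above
new_rows = 2590           # row count the new vocab expects (also from the error)

if new_rows > old.shape[0]:
    # pad the new rows with zeros; other init schemes are possible
    pad = torch.zeros(new_rows - old.shape[0], old.shape[1], dtype=old.dtype)
    state[key] = torch.cat([old, pad], dim=0)
    torch.save(ckpt, "model_380000_expanded_vocab.pt")
```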
Thanks for the help and quick response.