
[Help Wanted] Training from scratch on 1000 hours of Spanish does not work #565

Closed
rlenain opened this issue Dec 2, 2024 · 8 comments
Labels
help wanted Extra attention is needed

Comments

@rlenain

rlenain commented Dec 2, 2024

Checks

  • This template is only for usage issues encountered.
  • I have thoroughly reviewed the project documentation but couldn't find information to solve my problem.
  • I have searched for existing issues, including closed ones, and couldn't find a solution.
  • I confirm that I am using English to submit this report in order to facilitate communication.

Environment Details

Linux, Python=3.10

Steps to Reproduce

I ran finetune_cli.py with --finetune False (i.e. training from scratch) on 1000 hours of Spanish data, and even after 500k steps I am still not getting intelligible speech out. The output sometimes sounds like the original speaker from the prompt, but the words being uttered are complete gibberish.

Any help on this?

✔️ Expected Behavior

Intelligible Spanish speech.

❌ Actual Behavior

Gibberish

@rlenain rlenain added the help wanted Extra attention is needed label Dec 2, 2024
@SWivid
Owner

SWivid commented Dec 2, 2024

Need more info, e.g. the detailed configuration of the training setup.

@rlenain
Author

rlenain commented Dec 2, 2024

The command I run (I have made a few changes to the repo around how experiment names are passed, etc., nothing to do with the actual training) is:

CUDA_VISIBLE_DEVICES=4,5,6,7 accelerate launch --main_process_port 29501 finetune-cli.py \
    --model_name F5TTS_Base --exp_name F5TTS_Base-FromScratch_1khrs_esLA --learning_rate 1e-05 \
    --batch_size_per_gpu 20000 --batch_size_type frame --max_samples 64 --grad_accumulation_steps 1 \
    --max_grad_norm 1 --epochs 500 --num_warmup_updates 10000 --save_per_updates 100000 \
    --last_per_steps 5000 --dataset_name 1000hours_esLA_fromL --finetune False --tokenizer char

I run on 8*A100 GPUs

Audio sounds like this after 500k steps: https://whyp.it/tracks/231435/gibberish?token=TQ1fK

When I run the exact same setup but with --finetune True, it works fairly well and I get good Spanish speech out.

@rlenain rlenain changed the title [Help Wanted] [Help Wanted] Training from scratch on 1000 hours of Spanish does not work Dec 2, 2024
@SWivid
Owner

SWivid commented Dec 2, 2024

So it is actually 4*A100 (CUDA_VISIBLE_DEVICES=4,5,6,7).
Though I think the learning rate is too small for training from scratch.
Would recommend the same settings as in our paper or train.py (if you have pulled the latest repo, the config of the base model is under the config/ directory).

Also you may refer to #548, as you are using a 1000-hour dataset.
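For reference, a minimal sketch of the optimizer-related fields such a base-model config might contain. The field names and values below are illustrative assumptions, not copied from the repo; verify against the actual YAML under the config/ directory in a current checkout:

    # hypothetical config excerpt for a from-scratch run; values are assumptions
    optim:
      learning_rate: 7.5e-5      # far above the 1e-05 used in the command above
      num_warmup_updates: 20000  # longer warmup than the 10000 used above
      grad_accumulation_steps: 1
      max_grad_norm: 1.0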

@rlenain
Author

rlenain commented Dec 3, 2024

Thank you, changing the learning rate and increasing the number of warmup updates have helped.
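For readers landing here with the same symptom, a hedged sketch of what the adjusted invocation might look like: it reuses rlenain's command above with only the learning rate and warmup changed, and those two new values are illustrative assumptions, not the exact ones used.

    # same command as above; only --learning_rate and --num_warmup_updates differ (assumed values)
    CUDA_VISIBLE_DEVICES=4,5,6,7 accelerate launch --main_process_port 29501 finetune-cli.py \
        --model_name F5TTS_Base --exp_name F5TTS_Base-FromScratch_1khrs_esLA \
        --learning_rate 7.5e-5 --num_warmup_updates 20000 \
        --batch_size_per_gpu 20000 --batch_size_type frame --max_samples 64 \
        --grad_accumulation_steps 1 --max_grad_norm 1 --epochs 500 \
        --save_per_updates 100000 --last_per_steps 5000 \
        --dataset_name 1000hours_esLA_fromL --finetune False --tokenizer char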

@tuanh123789

Hi @rlenain, can you share the dataset?

@rlenain
Author

rlenain commented Dec 4, 2024

Unfortunately I cannot

@Federico1666

Unfortunately I cannot

Can you share the final model? I need a good Spanish model, and the one that is available was only trained on 250 hours.

@SWivid SWivid closed this as completed Jan 5, 2025
@ukemamaster

@rlenain Is your data in LJSpeech style? Which recipe exactly did you use for your training? Can you share your training configuration?
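For context, an LJSpeech-style dataset is conventionally a wavs/ folder plus a pipe-separated metadata.csv. A minimal sketch follows; the file names and transcripts are made up, and column conventions vary between recipes (LJSpeech itself uses id|raw_text|normalized_text):

    dataset_root/
        metadata.csv            # one line per clip: file_id|transcription
        wavs/
            clip_0001.wav
            clip_0002.wav

    # metadata.csv
    clip_0001|Hola, ¿cómo estás?
    clip_0002|Buenos días a todos.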
