Adjust default parameters to make even small samples do something #628

Open
joanise opened this issue Jan 27, 2025 · 0 comments
Labels
bug Something isn't working


joanise commented Jan 27, 2025

Bug description

For regression testing, the 15-minute test case yielded 150 utterances from LJ, which caused training to fail with this error:

2025-01-23 12:13:40.264 | ERROR    | everyvoice.utils:filter_dataset_based_on_target_text_representation_level:96 - Sorry you do not have enough characters data in your current validation filelist to run the model with a batch size of 16.

This appears to happen because the validation set has only 15 samples, one fewer than the batch size of 16.

We should adjust the wizard and training defaults so that training can still proceed when the data has fewer than 160 utterances.
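To make the arithmetic concrete, here is a minimal sketch of one possible adjustment: cap the batch size at the validation set size. It assumes a 10% validation split, which is only inferred from the numbers above (150 utterances giving 15 validation samples, and the 160 threshold). The function names are hypothetical and are not part of EveryVoice; whether capping the batch size is the right fix is still an open question.

```python
def validation_size(n_utterances: int, val_percent: int = 10) -> int:
    """Utterances held out for validation (assumed 10% split, rounded down)."""
    return n_utterances * val_percent // 100


def fitted_batch_size(
    n_utterances: int, requested: int = 16, val_percent: int = 10
) -> int:
    """Largest batch size <= requested that the validation set can fill."""
    n_val = validation_size(n_utterances, val_percent)
    if n_val < 1:
        raise ValueError("dataset too small to hold out any validation samples")
    return min(requested, n_val)


def min_utterances_for_batch(batch_size: int, val_percent: int = 10) -> int:
    """Smallest dataset whose validation split reaches batch_size (ceiling division)."""
    return -(-batch_size * 100 // val_percent)


if __name__ == "__main__":
    print(validation_size(150))          # validation samples for the failing case
    print(fitted_batch_size(150))        # batch size that would let training proceed
    print(min_utterances_for_batch(16))  # dataset size where the default of 16 works
```

Under these assumptions, the 150-utterance case would train with a batch size of 15 instead of failing, and datasets of 160 utterances or more keep the current default of 16.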

How to reproduce the bug

  • Create a dataset with 150 samples
  • Run the wizard
  • everyvoice preprocess config/everyvoice-text-to-spec.yaml
  • everyvoice train text-to-spec config/everyvoice-text-to-spec.yaml

Alternatively, from the branch for #616, run go.sh and inspect the logs in regress-lj-150/.
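For the first step, one way to build a 150-sample dataset is to keep the first 150 lines of an existing filelist. This is a rough sketch that assumes a metadata file with one utterance per line and no header row; the helper name and the file paths are hypothetical, and this is not necessarily how go.sh builds its test data.

```python
from pathlib import Path


def write_subset(src: Path, dst: Path, n: int = 150) -> int:
    """Copy the first n lines of src to dst and return how many were written."""
    lines = src.read_text(encoding="utf-8").splitlines(keepends=True)
    kept = lines[:n]
    dst.write_text("".join(kept), encoding="utf-8")
    return len(kept)


if __name__ == "__main__":
    # Hypothetical paths: point these at your own filelist.
    source = Path("metadata.csv")
    if source.exists():
        count = write_subset(source, Path("metadata-150.csv"))
        print(f"wrote {count} utterances")
```

The audio files themselves are left in place, so the wizard can be pointed at the original audio directory together with the shortened filelist.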

Error messages and logs

2025-01-23 12:13:40.264 | ERROR    | everyvoice.utils:filter_dataset_based_on_target_text_representation_level:96 - Sorry you do not have enough characters data in your current validation filelist to run the model with a batch size of 16.

joanise added the bug label on Jan 27, 2025