Error when starting model training from checkpoint in Coqui TTS
When a checkpoint is saved for later training, the last training and eval losses are stored in a dict. When training from scratch, the last training loss is stored as a float. Hence, when training is resumed from a checkpoint, the comparison between the current loss (a float) and the restored best loss (a dict) fails.
To Reproduce
Train a model in Coqui TTS using trainer
Once a checkpoint for best model is saved, stop the training
Set the checkpoint folder as continue path in the trainer class
Traceback (most recent call last):
  File "/mnt/Work/anaconda3/envs/tts-env/lib/python3.10/site-packages/trainer/trainer.py", line 1808, in fit
    self._fit()
  File "/mnt/Work/anaconda3/envs/tts-env/lib/python3.10/site-packages/trainer/trainer.py", line 1771, in _fit
    self.save_best_model()
  File "/mnt/Work/anaconda3/envs/tts-env/lib/python3.10/site-packages/trainer/utils/distributed.py", line 35, in wrapped_fn
    return fn(*args, **kwargs)
  File "/mnt/Work/anaconda3/envs/tts-env/lib/python3.10/site-packages/trainer/trainer.py", line 1893, in save_best_model
    self.best_loss = save_best_model(
  File "/mnt/Work/anaconda3/envs/tts-env/lib/python3.10/site-packages/trainer/io.py", line 183, in save_best_model
    if current_loss < best_loss:
TypeError: '<' not supported between instances of 'float' and 'dict'
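The failing comparison can be reproduced outside the trainer. Below is a minimal sketch, not the library's code: the helper names `as_scalar_loss` and `is_improvement`, the dict keys `train_loss` and `eval_loss`, and the loss values are all assumptions made for illustration. It shows the same TypeError and one possible way to guard the comparison by reducing both sides to a float first.

```python
from collections.abc import Mapping


def as_scalar_loss(loss, key="eval_loss"):
    """Return a float whether `loss` is a bare number or a dict of losses.

    `key` is an assumed dict key; adjust it to whatever the checkpoint stores.
    """
    if isinstance(loss, Mapping):
        return float(loss[key])
    return float(loss)


def is_improvement(current_loss, best_loss, key="eval_loss"):
    """Compare two losses after reducing both to floats."""
    return as_scalar_loss(current_loss, key) < as_scalar_loss(best_loss, key)


# Illustrative values only: a best loss restored from a checkpoint as a dict,
# and a current loss computed during training as a float.
restored_best = {"train_loss": 0.42, "eval_loss": 0.40}
current = 0.35

try:
    current < restored_best  # same comparison as in the traceback
except TypeError as err:
    print(err)  # '<' not supported between instances of 'float' and 'dict'

print(is_improvement(current, restored_best))  # True
```

Whether the eval loss or the train loss is the right value to compare depends on how the checkpoint was written, so the `key` argument is left configurable here.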
To Reproduce
https://colab.research.google.com/drive/1OwemROn306_JIYASjx39d52eXFHS1O_u
Expected behavior
Training should resume from the checkpoint without raising an error.
Logs
Environment
Additional context
No response