Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug] xttsv2 license acceptance process broken if calling function doesn't provide stdin/stdout access; then coqui fails on next run due to missing files #145

Open
ajkessel opened this issue Nov 8, 2024 · 6 comments
Labels
bug Something isn't working

Comments

@ajkessel
Copy link

ajkessel commented Nov 8, 2024

Describe the bug

I'm using coqui in ttspod, a tool that converts any content to a podcast feed. One problem I consistently encounter is the first time the application is run with xtts, the user is required to accept the license by keyboard. But often the first run in my case involves piping content into the application, which means it can't receive a y from stdin.

Is the xtts keyboard acceptance step really required by the copyright owner? Has anyone considered whether just displaying the license notice would be sufficient?

This also gives rise to a separate issue that could be easier to fix: if the license acceptance process fails on first run, the user's tts/tts_models--multilingual--multi-dataset--tts_models--multilingual--multi-dataset--xtts_v2/ folder is created but not populated. On the next run, coqui assumes the folder is populated and ends up raising a FileNotFoundError. I suggest a better workflow would be if it can't find the files it is looking for in the directory, it re-attempts the license acceptance process and downloads the model, rather than just throwing an exception.

I can work up a PR for either or both of these issues if people agree.

To Reproduce

Run coqui with xtts model

Expected behavior

Automatic license acceptance and/or re-download model if missing

Logs

No response

Environment

coqui-ai-tts 0.24.3

Additional context

No response

@ajkessel ajkessel added the bug Something isn't working label Nov 8, 2024
@eginhard
Copy link
Member

eginhard commented Nov 9, 2024

I'm using coqui in ttspod, a tool that converts any content to a podcast feed. One problem I consistently encounter is the first time the application is run with xtts, the user is required to accept the license by keyboard. But often the first run in my case involves piping content into the application, which means it can't receive a y from stdin.

Is the xtts keyboard acceptance step really required by the copyright owner? Has anyone considered whether just displaying the license notice would be sufficient?

I'm not a lawyer, but explicitly asking to accept the license should definitely be safer. You could do the license check on your side (using TTS.utils.manage.tos_agreed()) and if it hasn't been agreed yet, discard any piped input and then ask for confirmation.

This also gives rise to a separate issue that could be easier to fix: if the license acceptance process fails on first run, the user's tts/tts_models--multilingual--multi-dataset--tts_models--multilingual--multi-dataset--xtts_v2/ folder is created but not populated. On the next run, coqui assumes the folder is populated and ends up raising a FileNotFoundError.

Ok, I'll need to check what exactly is happening, can you share the full error log? There is a line that should remove the folder when the license is not accepted, but maybe it doesn't work as I'd expect:

os.rmdir(output_path)

@ajkessel
Copy link
Author

ajkessel commented Nov 9, 2024

I'm not a lawyer, but explicitly asking to accept the license should definitely be safer. You could do the license check on your side (using TTS.utils.manage.tos_agreed()) and if it hasn't been agreed yet, discard any piped input and then ask for confirmation.

It is much more the norm than the exception for github code libraries that a license file accompanies the code (and may be displayed on the screen) rather than something that requires keyboard input, especially for code where a common use case is non-interactive. If we were prompted to accept the license (open source or otherwise) for every component of the stack, the user experience would be radically different. Do we actually have a line of communication to whoever decided the license needed to be accepted by keyboard in the first place? I'm not suggesting overriding their preference, but I wonder if there is anyone to check with.

In the meantime, I'll try your alternate approach.

Ok, I'll need to check what exactly is happening, can you share the full error log? There is a line that should remove the folder when the license is not accepted, but maybe it doesn't work as I'd expect:

I think the reason this isn't working at least in my use case is you can't accept or reject the license. If stdin is a pipe, the task just hangs and the only way out is CTRL-C or to kill the process, so the folder removal never occurs.

Although I can fix some of this on my end, wouldn't it be a better practice for coqui-ai-tts to check for more than just the existence of the folder path, and if files in the folder are missing, attempt to download those? There are a variety of reasons why you might have the path but not the files (e.g. connection interruption on last run) and it shouldn't be hard for it to try to download rather than fail with file-not-found.

@eginhard
Copy link
Member

eginhard commented Nov 9, 2024

You don't have to use the keyboard, you could set the environment variable COQUI_TOS_AGREED=1 on your side and Coqui won't ask. It's up to you then how you ask/inform the user. I don't think I'll change how this is handled on the Coqui side.

Although I can fix some of this on my end, wouldn't it be a better practice for coqui-ai-tts to check for more than just the existence of the folder path, and if files in the folder are missing, attempt to download those? There are a variety of reasons why you might have the path but not the files (e.g. connection interruption on last run) and it shouldn't be hard for it to try to download rather than fail with file-not-found.

Yes, interrupted downloads aren't handled well, this can definitely be improved. There's some issues in the old repo as well where users had failures during the Bark download, but Coqui doesn't give a useful error message then. Open to PRs for that!

@ajkessel
Copy link
Author

ajkessel commented Nov 9, 2024

You don't have to use the keyboard, you could set the environment variable COQUI_TOS_AGREED=1 on your side and Coqui won't ask. It's up to you then how you ask/inform the user. I don't think I'll change how this is handled on the Coqui side.

Thanks, that's helpful.

Yes, interrupted downloads aren't handled well, this can definitely be improved. There's some issues in the old repo as well where users had failures during the Bark download, but Coqui doesn't give a useful error message then. Open to PRs for that!

I can take a stab at it. I'm not actually seeing where the logic is by which it doesn't try to re-download if the directory is empty. There's this line which reports a model as downloaded if the directory exists, but that's just for listing models, not for deciding whether to download/re-download.

@eginhard
Copy link
Member

The main checks are in here:

def download_model(self, model_name):

For XTTS it looks like a hash is actually checked, for other models it's just based on folder existence.

@ajkessel
Copy link
Author

For XTTS it looks like a hash is actually checked, for other models it's just based on folder existence.

So I saw that code and was confused because in the scenario mentioned at the start of this issue, I was getting a FileNotFound exception for XTTS when the folder existed but the files did not. I'll see if I can figure out what's going on there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants
@ajkessel @eginhard and others