Occasional encoding errors #483
Comments
This strips out unencodable characters and, if it does not fix #483, at least masks it by using `codecs.open` and telling it to ignore errors.
Using |
Needs to be re-reviewed in the context of Python3, where the issue may no longer arise. |
The exception seems to be caused by characters that occupy more than a single byte in UTF-8, i.e. characters with a Unicode code point > 127 (outside the ASCII range). For example also the German umlauts With |
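To illustrate the point about multi-byte characters, here is a minimal sketch in Python 3. The sample letters (ä, ö, ü, ß) and the helper name `is_ascii_encodable` are assumptions chosen for illustration; the comment above does not say exactly which letters were tested.

```python
# Hypothetical sample characters: one ASCII letter and four German letters.
# Escapes are used so the snippet does not depend on the source file encoding.
samples = ["a", "\u00e4", "\u00f6", "\u00fc", "\u00df"]  # a, ä, ö, ü, ß

# Number of bytes each character occupies once encoded as UTF-8.
byte_lengths = {ch: len(ch.encode("utf-8")) for ch in samples}


def is_ascii_encodable(text):
    """Return True if every character fits in ASCII (code point <= 127)."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


for ch in samples:
    print(repr(ch), ord(ch), byte_lengths[ch], is_ascii_encodable(ch))
```

Any character for which `is_ascii_encodable` returns `False` is a candidate for triggering the kind of error described here when text is implicitly encoded as ASCII.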
Sounds promising @spoeschel , does this mean you can generate a test case? That would be great because even if we fix it for Python2, we will also need to check it still works in Python3 when we migrate. |
@spoeschel I made a comment in #484 a long long time ago suggesting this was worth re-testing in Python3. I don't know if Python3 would work for you, but I've pushed a working Python3 build to the |
I haven't yet looked into the testing subsystem, but I will create a test case for this. Testing with the Python 3 branch, this issue indeed no longer occurs when using one of the German letters mentioned above. However, I get an exception when using the WebSocket output with the Python 3 branch (the WS input works), regardless of whether any of the problematic letters are used. The filesystem output works, though. I will have a look into that and probably open a new issue. |
Thank you @spoeschel ! |
It just turned out that this quick fix for the Python 2 branch only worked when I used the Resequencer. With the |
I think this is a strong argument for tying up the release/2.1.2 work, releasing it as our final Python2 release and moving all future work into release/3.0. |
I agree; this makes more sense than fixing a complex issue for a Python version that will be deprecated very soon anyway. |
Using the EBU-TT-D Encoder I'm occasionally getting Unicode errors like:
This is annoying. I don't know what's causing it, but there's probably an easy fix (though possibly a dangerous one): https://docs.python.org/2.7/howto/unicode.html#the-unicode-type suggests that using `codecs.open` and setting `errors='ignore'` will at least make the error go away...
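A minimal sketch of what the suggested workaround does, assuming an ASCII target encoding (the encoding actually used by the encoder is not stated in this issue). The function name `write_ignoring_errors` and the file name are hypothetical. Note that `errors='ignore'` silently drops characters, which is why the fix is possibly a dangerous one.

```python
import codecs
import os
import tempfile


def write_ignoring_errors(path, text, encoding="ascii"):
    """Write text to path, dropping characters the encoding cannot represent."""
    # errors="ignore" suppresses UnicodeEncodeError by discarding the
    # offending characters instead of raising.
    with codecs.open(path, "w", encoding=encoding, errors="ignore") as f:
        f.write(text)


tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "subtitle.txt")  # hypothetical output file

write_ignoring_errors(path, u"Gr\u00fc\u00dfe")  # "Grüße"

with codecs.open(path, "r", encoding="ascii") as f:
    result = f.read()

print(result)  # the ü and ß have been removed from the output
```

The error goes away, but so does part of the subtitle text, so this masks the problem rather than fixing it.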