-
Notifications
You must be signed in to change notification settings - Fork 17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Unicode character in README causes UnicodeEncodeError for 0.9.0b1 #149
Comments
The difference between |
Perhaps #142 ? |
Also happens for python 3.8.19, 3.9.19 and 3.10.14 so not caused by old
|
@pradyunsg or @FFY00, could you yank 0.9.0b1? I realize I'm not a maintainer on the PyPI package. We can use https://github.com/shacklettbp/madrona/blob/3fab172cb0ad41e0b035871885c2bc4c459c520d/py/madrona-py-build.py#L34C5-L35C38 instead of compat32, but then new lines aren't supported in headers... |
I thought pre-releases are meant for testing and that they are expected to contain bugs. If you want production quality releases, maybe do not use pre-releases? Otherwise you are kind of asking for known bugs... |
This wasn't a known bug. I would't make a prerelease with a known bug. But you are opting in to helping us test untested code. ;) |
I meant to say that it is known that very likely pre-releases contain bugs, not that the specific bugs are know |
It just sounded like you were discouraging reporting on bugs in beta releases, when this is exactly what I wanted from the prerelease, bug reports. :) |
I was objecting to requesting to yank the pre-release because it contains a bug. I think it works against the goal of having the pre-release in the first place. Of course the bug report is well accepted! Thank you for testing and reporting back @woutdenolf ! I'm sorry if I made it sound like your report was not appreciated. It was not my intention. |
Ah, yes. Didn't know it was about the yanking. I like yanking anything with a known bug of this size (no unicode); I agree it's not critically required, though. You can still install it after yanking, so backend authors could still play with it, though I plan to make a b2 with a fix soon. |
In the CI of all our projects we use
As you can see the last option would be easier for downstream projects. But I understand you want the buggy pre-release to stay so other projects can report other bugs until you deployed a fix, after which yanking or not does not matter anymore. So up to you. Thanks for the fast response! |
There are two types of projects:
In the former case, I would say that yanking a beta is very unproductive. It's perhaps unfortunate for the handful of projects with the issue, but a good balance here may be to have two CI jobs, one of which tests pre-releases, and then just... resign yourself to a red X-mark in your CI statuses for a short while? In the latter case, yanking usually makes sense because the bug is kind of a showstopper and projects which cannot build due to one bug aren't going to find a different bug at the same time; yanking the release keeps things less noisy for pre-release CI and means that those projects are more likely to notice it's still broken once you release another beta with a fix. The question is always, which one do you think your project is? :D My gut feeling is that in this case the answer is "probably" number 2. |
Can you verify if 0.9.0 beta 2 fixes this? |
I think it depends entirely on the breakage: if it's something that was supposed to break then you should not yank a beta. That's a perfect chance for people to adapt to the change, and it's great they found it early. But if it was not supposed to break, then you should yank it, because you don't want people to adapt to the change, it wasn't intentional! |
I should probably have also added "and depending on how easy it is to roll out a new beta". Depending on the project it could be something that is only practical to do once a week. I don't think people will "adapt" to a runtime crash either way. |
(Also, yes, it can depend on how likely projects using it are affected by the change, too; this was any project using unicode, which I think was large enough to be problematic). And yes, how easy it is to ship. A pure Python project is pretty easy. I'm not worrying about getting signoff from a majority of the maintainers for beta releases like I will for the full release. You could adapt by removing the unicode, I suppose. :P |
Testing it locally:
And... Traceback: Traceback (most recent call last):
File "/Users/henryschreiner/git/software/pybind11/tmp/.venv/lib/python3.12/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in <module>
main()
File "/Users/henryschreiner/git/software/pybind11/tmp/.venv/lib/python3.12/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 335, in main
json_out['return_val'] = hook(**hook_input['kwargs'])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/henryschreiner/git/software/pybind11/tmp/.venv/lib/python3.12/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 152, in prepare_metadata_for_build_wheel
whl_basename = backend.build_wheel(metadata_directory, config_settings)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/henryschreiner/git/software/pybind11/tmp/.venv/lib/python3.12/site-packages/mesonpy/__init__.py", line 1020, in wrapper
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/Users/henryschreiner/git/software/pybind11/tmp/.venv/lib/python3.12/site-packages/mesonpy/__init__.py", line 1074, in build_wheel
return project.wheel(out).name
^^^^^^^^^^^^^^^^^^
File "/Users/henryschreiner/git/software/pybind11/tmp/.venv/lib/python3.12/site-packages/mesonpy/__init__.py", line 925, in wheel
return builder.build(directory)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/henryschreiner/git/software/pybind11/tmp/.venv/lib/python3.12/site-packages/mesonpy/__init__.py", line 463, in build
self._wheel_write_metadata(whl)
File "/Users/henryschreiner/git/software/pybind11/tmp/.venv/lib/python3.12/site-packages/mesonpy/__init__.py", line 451, in _wheel_write_metadata
whl.writestr(f'{self._distinfo_dir}/METADATA', bytes(self._metadata.as_rfc822()))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/Cellar/[email protected]/3.12.6/Frameworks/Python.framework/Versions/3.12/lib/python3.12/email/message.py", line 194, in __bytes__
return self.as_bytes()
^^^^^^^^^^^^^^^
File "/usr/local/Cellar/[email protected]/3.12.6/Frameworks/Python.framework/Versions/3.12/lib/python3.12/email/message.py", line 208, in as_bytes
g.flatten(self, unixfrom=unixfrom)
File "/usr/local/Cellar/[email protected]/3.12.6/Frameworks/Python.framework/Versions/3.12/lib/python3.12/email/generator.py", line 117, in flatten
self._write(msg)
File "/usr/local/Cellar/[email protected]/3.12.6/Frameworks/Python.framework/Versions/3.12/lib/python3.12/email/generator.py", line 182, in _write
self._dispatch(msg)
File "/usr/local/Cellar/[email protected]/3.12.6/Frameworks/Python.framework/Versions/3.12/lib/python3.12/email/generator.py", line 219, in _dispatch
meth(msg)
File "/usr/local/Cellar/[email protected]/3.12.6/Frameworks/Python.framework/Versions/3.12/lib/python3.12/email/generator.py", line 446, in _handle_text
super(BytesGenerator,self)._handle_text(msg)
File "/usr/local/Cellar/[email protected]/3.12.6/Frameworks/Python.framework/Versions/3.12/lib/python3.12/email/generator.py", line 263, in _handle_text
self._write_lines(payload)
File "/usr/local/Cellar/[email protected]/3.12.6/Frameworks/Python.framework/Versions/3.12/lib/python3.12/email/generator.py", line 156, in _write_lines
self.write(line)
File "/usr/local/Cellar/[email protected]/3.12.6/Frameworks/Python.framework/Versions/3.12/lib/python3.12/email/generator.py", line 420, in write
self._fp.write(s.encode('ascii', 'surrogateescape'))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeEncodeError: 'ascii' codec can't encode character '\xf8' in position 94: ordinal not in range(128 |
Ahh, it's this character that's causing the issue: 🕵️. Ahh, it's only the payload that's causing issues (now). If there's a unicode character in the payload, then |
As you figured out it doesn't. I appreciate the swift attempt at a fix though!
Not 500 but a few dozen projects depend on
I thought it was this character but I might be mistaken: ø. |
In beta 1 is was ø. Now it's 🕵️, since I fixed unicode handling for bytes and str for header values, but only for str for payloads, and the readme is the payload. :) |
For what is worth, this https://github.com/pypa/setuptools/blob/56fc3111ba97470d81bf98b1be997dcf791d4a5a/setuptools/command/bdist_wheel.py#L596-L602 is what setuptools does. |
But elsewhere it does something completely different. But setuptools being the stratification of hacks it is, it is difficult to understand which code is actually executed... |
As far as I can tell, the email module doesn't support anything other than hardcoded ascii for its BinaryGenerator when it's dealing with payloads. I don't see how to get around https://github.com/python/cpython/blob/84e3f6078b5bab7d2d6413de6b089013d4aa4aa0/Lib/email/generator.py#L419-L420. The docs for It gets used here: https://github.com/python/cpython/blob/84e3f6078b5bab7d2d6413de6b089013d4aa4aa0/Lib/email/message.py#L196-L209 Otherwise, this almost works perfectly with EmailMessage: class MetadataPolicy(email.policy.EmailPolicy):
utf8 = True
mangle_from_ = False
max_line_length = 0
def header_store_parse(self, name: str, value: str) -> tuple[str, str]:
size = len(name) + 2
value = value.replace('\n', '\n' + ' ' * size)
return (name, value) |
Seems to be fine now with |
Please close if you can verify. But my little command sequence above passes, at least. :) |
I can confirm |
Thanks for testing pre-releases! |
Done. I've also invited you to the project @henryiii. |
Python 3.7.17 (still supported):
This installs
pyproject-metadata-0.9.0b1
which gives the following errorAn "ø" character in the README.rst file of the
fabio
project causes the UnicodeEncodeError error.This works fine:
The text was updated successfully, but these errors were encountered: