-
Notifications
You must be signed in to change notification settings - Fork 191
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Mystery solved! A fix for the infamous 5 year old \x bug, that is driving users away. #422
Comments
Code from the gists for inspection, so you don't have to download random untrusted .py files:
test_bug_148_fixed.py
|
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
The test function test_bug_148 in /tests/test_api.py was incorrectly coded five years ago [^1]. This has prevented /uiri/toml's automated testing and CI from showing that the original fix for bug 148 was also incorrect (a notoriously difficult to understand block of code in encoder._dump_str) .
Guys, it's about time we put that right!
Over the last 5 years, numerous frustrated users have raised multiple separate issues to no avail [^3], which unbeknownst to them, related to this unfixed bug, which this flawed test enabled to persist undetected. Furthermore, a couple of projects [^4] have migrated to tomli-w. And the core Python developers decided against /uiri/toml, to read TOML natively in Python 3.11.
[^1] This gist of mine contains two test files that prove this, test_bug_148_original.py and test_bug_148_fixed.py
Firstly, in test_bug_148_original.py, the original test function has been abstracted into a factory returning a function
that simply applies that same test to any module containing a dumps function. It also contains a second factory, that
uses exactly the same string literals (and dictionaries) as in the original test_bug_148, but applies toml.loads to the
LHSs instead of toml.dumps to the RHSs of the three equations. [^5]
test_bug_148_original.py enables a fair comparison between toml, tomli-w, tomlkit and toml_tools using
the original test function test_bug_148. Running test_bug_148_original.py with pytest in a venv (with all the deps installed), one test passes, but all 7 other tests fail (8 in >= Python 3.11). toml passes its own test as would be expected. But not only do Python 3.11's native tomllib, its precursor tomli, tomli-w, tomlkit, and toml_tools (and my own fork of toml) all fail it, toml.loads fails the inverse version of this same test, that only toml.dumps passes.
I assert test_bug_148 was incorrectly coded and hope you agree. But how should it be fixed, or what should replace it?
Inside the second file in the gist, test_bug_148_correctly.py the code is much the same, but contains what I believe to be the correct string literals, to achieve what /uiri originally intended to test (that a Python string literal r'\x' is encoded in a TOML
quoted string literal r'\x's, even when (and only when) an even number of backslashes precedes it in its repr string). This
not only allows a fair comparison between toml, tomli/tomllib, tomlkit and toml_tools as before, but between toml.dumps and toml.loads.
Running this we find that 7 tests pass (8 in >= Python 3.11), and only toml.dumps fails its test function from the new factory (for encoders).
Who doesn't love the colourised output from pytest?:
Same results in Ubuntu 22.04 WSL, Python 3.10:
I believe this was an honest mistake on the part of /uiri . Considering how they chose to write the original bug fix by analysing strings almost manually, instead of using a regular expression, and how they used normal string literals instead of r-string literals containing half the number of \s in test_bug_148, perhaps their test function was a victim of the backslash plague.
I will submit PRs shortly that correct these issues, one containing test functions from the factories in test_bug_148_correctly.py and one containing the fix I wrote for my fork (toml_tools -I have my own specific requirements (Iron Python) that mean I do need my own fork of toml).
[^3] These issues include:
#411
#404
#261
#201
and more
[^4] Projects that have migrated from toml to tomli-w et al:
davidfokkema/tailor@ece266a
gramineproject/gramine@5619486
remarshal-project/remarshal#19 (comment)
dictation-toolbox/Caster@0ba75e1
[^5] Coding note/ apology. I could've defined the string literals as constants, to guaranteed the second factory uses the same values as the first. However I wanted to demonstrate to anyone reading that it contains exactly the same code as in /toml/tests/tests/test_api.py:test_bug_148
FYI, in your own code, using pytest.mark.parametrize would achieve the same goal more concisely. But similarly, I wanted to be transparent, and to produce understandable test results in the command line output via the test functions' names.
The text was updated successfully, but these errors were encountered: