-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Simplify Python code with some ruff rules SIM #13485
Conversation
d627982
to
aa0a4ce
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most of these changes appear to be about inlining conditions onto a single line, which isn't immediately obvious is always superior. And it's not even mentioned in the PR description, let alone the commit message.
In general, the SIM rules are very opinionated stylistic things that have a good reason to not be enabled by default (or for flake8, even installed by default). I would rather have better justifications for using such rules...
I'm in favor of universally getting rid of bad uses of |
aa0a4ce
to
7ccadc6
Compare
Removed if-else-block-instead-of-if-exp (SIM108)Derived from the flake8-simplify linter. Fix is sometimes available. What it doesCheck for Why is this bad?
Exampleif foo:
bar = x
else:
bar = y Use instead: bar = x if foo else y References |
7ccadc6
to
39be303
Compare
The mods in this PR are code transformations that can happen once or they can be linted for on an ongoing basis. The latter is more complex. |
@@ -87,7 +87,7 @@ def __native_headers(self, state: ModuleState, args: T.Tuple[T.List[mesonlib.Fil | |||
'@INPUT@', | |||
]) | |||
|
|||
prefix = classes[0] if not package else package | |||
prefix = package if package else classes[0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
prefix = package if package else classes[0] | |
prefix = package or classes[0] |
Some of these rules like SIM118 (see commit message) are shorter and more efficient. Others are just transforms that make the intent easier to understand with fewer lines of code (-77 lines in this PR). The first line of the git commit contains a link to the list of all As we saw with rule SIM108, some folks like certain transforms while others do not. I have grown quite used to reading ternary-if and now just write them by habit. Turning rules on or off is a choice for each project.
pylint lints with ruff and says in https://pylint.readthedocs.io/en/stable/#advised-linters-alongside-pylint
|
Thanks for pointing all that out. All this information was also my starting premise. My question was not what does the PR do, but rather, whether each choice should be made... there's a reason we didn't simply install flake8-simplify. P.S. I also know what ruff is! I package it for a linux distro, submit bug reports for it, and use it with meson for a couple things already. I don't need copious pages of copy-pasting snippets of ruff documentation to tell me what it does. And until ruff supports all the flake8 rules I'm not eager to switch. (Ruff initially didn't implement many rules because they suggested using black instead. Meson does not, and will not, use black because black produces terrible code. Even the rules that ruff has slowly added support for are mostly in preview.) |
if os.path.isabs(libname) and os.path.exists(libname): | ||
return [libname] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make the intent easier to understand with fewer lines of code (-77 lines in this PR).
Simply reducing the number of lines is not the same as making the intent easier to understand. For an example, python supports rewriting this:
if os.path.isabs(libname) and os.path.exists(libname): | |
return [libname] | |
if os.path.isabs(libname) and os.path.exists(libname): return [libname] |
But most people wouldn't do that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's not swim upstream against PEP8.
referenced_vars = set() | ||
optnames = [x.name for x in BUILTIN_DIR_OPTIONS.keys()] | ||
optnames = [x.name for x in BUILTIN_DIR_OPTIONS] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
python also supports this:
referenced_vars = set() | |
optnames = [x.name for x in BUILTIN_DIR_OPTIONS.keys()] | |
optnames = [x.name for x in BUILTIN_DIR_OPTIONS] | |
referenced_vars = set(); optnames = [x.name for x in BUILTIN_DIR_OPTIONS] |
It, too, is fewer lines. It, too, is widely considered less readable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PEP8
I'm trying to point out a general principle here, that quoting dogma is not the same as stating a reason. But I suspect the disconnect here may be part and parcel of the principle that:
|
mesonbuild/utils/universal.py
Outdated
if substr in s: | ||
return True | ||
return False | ||
return any(substr in s for s in strlist) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC these any() approaches are much slower. It's an interesting quirk of a low-level function whose body is basically entirely defined by either a short-circuiting for loop or a short-circuiting any()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ruff Why is this bad?
section says any()
is faster but perhaps you have a benchmark that proves it wrong.
reimplemented-builtin (SIM110)
Why is this bad?
Using a builtin function is more concise and readable. Builtins are also
more efficient than for
loops.
I am OK to turn off ruff rule SIM110
if you think that is best. I find that when I read the code out loud, the use of any()
makes the code read the way I think. Others might disagree.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's really unfortunate because I think that any()
/all()
is MUCH more readable than the for loop, and the vertical space saving is very nice. However, any seems to be about 1/3 as fast (Python 3.11.9, NixOS):
loop = 'abcdef'.split()
found = 'f'
nfound = 'g'
def test1():
for x in loop:
if x == found:
return True
return False
def test2():
return any(x == found for x in loop)
def test3():
for x in loop:
if x == nfound:
return True
return False
def test4():
return any(x == nfound for x in loop)
if __name__ == "__main__":
import timeit
print('for loop (found) :', timeit.timeit(test1))
print('for loop (not found):', timeit.timeit(test3))
print('any() (found) :', timeit.timeit(test2))
print('any() (not found) :', timeit.timeit(test4))
for loop (found) : 0.051076093994197436
for loop (not found): 0.04388196699437685
any() (found) : 0.15422860698890872
any() (not found) : 0.15568504799739458
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On Python 3.13 release candidate 1 (in Docker) on M1 Mac:
for loop (found) : 0.09455837499990594
for loop (not found): 0.09435533400028362
any() (found) : 0.27777291600068565
any() (not found) : 0.27852720800001407
Should we drop SIM110 ?
I am not advocating that you switch. Ruff is not yet at v1. flake8 and pylint continue to be adequate. This PR is about looking at a few simplifying code transformations. If a rule does not improve the code then we should drop it. If none of the rules improve the code then we should close the PR.
I recognize that you wish to format code by hand so I am not suggesting that you drop flake8 or pylint. I am not aware of any flake8 rules that ruff does not support except the Exxx rules in preview that are better handled by a code formatter. I don't think that the Ruff team spends time convincing people to use psf/black anymore given that they now have
This is computer science, not religion or politics. The
I agree 100%. The initial commit |
39be303
to
96dbb2f
Compare
Removed all instances of SIM110 because it is less efficient than the code it releases. Added a pull request to amend the Ruff docs at https://github.com/astral-sh/docs/pulls |
https://docs.astral.sh/ruff/rules/#flake8-simplify-sim
Related to
%
ruff check --select=SIM --ignore=SIM105,SIM108,SIM110,SIM112,SIM114,SIM910 --fix --unsafe-fixes
%
ruff rule SIM118
in-dict-keys (SIM118)
Derived from the flake8-simplify linter.
Fix is always available.
What it does
Checks for key-existence checks against
dict.keys()
calls.Why is this bad?
When checking for the existence of a key in a given dictionary, using
key in dict
is more readable and efficient thankey in dict.keys()
,while having the same semantics.
Example
Use instead:
Fix safety
Given
key in obj.keys()
,obj
could be a dictionary, or it could beanother type that defines a
.keys()
method. In the latter case, removingthe
.keys()
attribute could lead to a runtime error. The fix is markedas safe when the type of
obj
is known to be a dictionary; otherwise, itis marked as unsafe.
References