Simplify Python code with some ruff rules SIM #13485

cclauss · 2024-07-30T08:34:44Z

https://docs.astral.sh/ruff/rules/#flake8-simplify-sim

Related to

Minor optimizations in Python code #13478

% ruff check --select=SIM --ignore=SIM105,SIM108,SIM110,SIM112,SIM114,SIM910 --fix --unsafe-fixes

Found 224 errors (139 fixed, 85 remaining).

% ruff rule SIM118

in-dict-keys (SIM118)

Derived from the flake8-simplify linter.

Fix is always available.

What it does

Checks for key-existence checks against dict.keys() calls.

Why is this bad?

When checking for the existence of a key in a given dictionary, using
key in dict is more readable and efficient than key in dict.keys(),
while having the same semantics.

Example

key in foo.keys()

Use instead:

key in foo

Fix safety

Given key in obj.keys(), obj could be a dictionary, or it could be
another type that defines a .keys() method. In the latter case, removing
the .keys() attribute could lead to a runtime error. The fix is marked
as safe when the type of obj is known to be a dictionary; otherwise, it
is marked as unsafe.

References

Python documentation: Mapping Types

eli-schwartz

Most of these changes appear to be about inlining conditions onto a single line, which isn't immediately obvious is always superior. And it's not even mentioned in the PR description, let alone the commit message.

In general, the SIM rules are very opinionated stylistic things that have a good reason to not be enabled by default (or for flake8, even installed by default). I would rather have better justifications for using such rules...

unittests/linuxliketests.py

mesonbuild/backend/ninjabackend.py

mesonbuild/compilers/mixins/clike.py

mesonbuild/compilers/mixins/visualstudio.py

run_project_tests.py

unittests/linuxliketests.py

unittests/machinefiletests.py

dcbaker · 2024-07-30T18:35:30Z

I'm in favor of universally getting rid of bad uses of .keys() as well as replacing loops with any/all, but I agree wtih @eli-schwartz that not all of these changes (especially always using ternaries) is a blanket improvement. I'd also really like to see any of these that we're going to turn on have a lint that is permanently on in Meson, either through pylint or flake8

cclauss · 2024-07-30T19:35:42Z

Removed ruff rule SIM108

if-else-block-instead-of-if-exp (SIM108)

Derived from the flake8-simplify linter.

Fix is sometimes available.

What it does

Check for if-else-blocks that can be replaced with a ternary operator.

Why is this bad?

if-else-blocks that assign a value to a variable in both branches can
be expressed more concisely by using a ternary operator.

Example

if foo:
    bar = x
else:
    bar = y

Use instead:

bar = x if foo else y

References

Python documentation: Conditional expressions

cclauss · 2024-07-30T22:30:22Z

I'd also really like to see any of these that we're going to turn on have a lint that is permanently on in Meson, either through pylint or flake8

The mods in this PR are code transformations that can happen once or they can be linted for on an ongoing basis. The latter is more complex. flake8 plus https://pypi.org/project/flake8_simplify could be configured to lint the changes in this PR. There are no automated fixers with flake8 as there are with ruff but flake can lint code to catch regressions.

cclauss · 2024-07-30T22:46:11Z

mesonbuild/modules/java.py

@@ -87,7 +87,7 @@ def __native_headers(self, state: ModuleState, args: T.Tuple[T.List[mesonlib.Fil
            '@INPUT@',
        ])

-        prefix = classes[0] if not package else package
+        prefix = package if package else classes[0]


Suggested change

prefix = package if package else classes[0]

prefix = package or classes[0]

cclauss · 2024-07-30T23:51:12Z

I would rather have better justifications for using such rules...

Some of these rules like SIM118 (see commit message) are shorter and more efficient. Others are just transforms that make the intent easier to understand with fewer lines of code (-77 lines in this PR). The first line of the git commit contains a link to the list of all SIM rules and each of those is a link to a full page description of that rule. The git commit also demonstrates ruff rule SIM118 command which can provide useful on-the-fly docs.

As we saw with rule SIM108, some folks like certain transforms while others do not. I have grown quite used to reading ternary-if and now just write them by habit. Turning rules on or off is a choice for each project.

https://docs.astral.sh/ruff

Ruff can be used to replace Flake8 (plus dozens of plugins), Black, isort, pydocstyle, pyupgrade, autoflake, and more, all while executing tens or hundreds of times faster than any individual tool.

pylint lints with ruff and says in https://pylint.readthedocs.io/en/stable/#advised-linters-alongside-pylint

Advised linters alongside pylint

Projects that you might want to use alongside pylint include ruff (really fast, with builtin auto-fix and a large number of checks taken from popular linters, but implemented in rust)

eli-schwartz · 2024-07-31T01:03:35Z

Thanks for pointing all that out. All this information was also my starting premise.

My question was not what does the PR do, but rather, whether each choice should be made... there's a reason we didn't simply install flake8-simplify.

P.S. I also know what ruff is! I package it for a linux distro, submit bug reports for it, and use it with meson for a couple things already. I don't need copious pages of copy-pasting snippets of ruff documentation to tell me what it does. And until ruff supports all the flake8 rules I'm not eager to switch. (Ruff initially didn't implement many rules because they suggested using black instead. Meson does not, and will not, use black because black produces terrible code. Even the rules that ruff has slowly added support for are mostly in preview.)

eli-schwartz · 2024-07-31T01:07:50Z

mesonbuild/compilers/mixins/emscripten.py

+        if os.path.isabs(libname) and os.path.exists(libname):
+            return [libname]


make the intent easier to understand with fewer lines of code (-77 lines in this PR).

Simply reducing the number of lines is not the same as making the intent easier to understand. For an example, python supports rewriting this:

Suggested change

if os.path.isabs(libname) and os.path.exists(libname):

return [libname]

if os.path.isabs(libname) and os.path.exists(libname): return [libname]

But most people wouldn't do that.

Let's not swim upstream against PEP8.

eli-schwartz · 2024-07-31T01:09:23Z

mesonbuild/modules/pkgconfig.py

        referenced_vars = set()
-        optnames = [x.name for x in BUILTIN_DIR_OPTIONS.keys()]
+        optnames = [x.name for x in BUILTIN_DIR_OPTIONS]


python also supports this:

Suggested change

referenced_vars = set()

optnames = [x.name for x in BUILTIN_DIR_OPTIONS.keys()]

optnames = [x.name for x in BUILTIN_DIR_OPTIONS]

referenced_vars = set(); optnames = [x.name for x in BUILTIN_DIR_OPTIONS]

It, too, is fewer lines. It, too, is widely considered less readable.

eli-schwartz · 2024-07-31T01:13:08Z

I'm trying to point out a general principle here, that quoting dogma is not the same as stating a reason. But I suspect the disconnect here may be part and parcel of the principle that:

because an automated linter includes the selectable option to say you CAN do it, therefore you SHOULD do it

eli-schwartz · 2024-07-31T02:06:18Z

mesonbuild/utils/universal.py

-        if substr in s:
-            return True
-    return False
+    return any(substr in s for s in strlist)


IIRC these any() approaches are much slower. It's an interesting quirk of a low-level function whose body is basically entirely defined by either a short-circuiting for loop or a short-circuiting any().

The ruff Why is this bad? section says any() is faster but perhaps you have a benchmark that proves it wrong.

% ruff rule SIM110

reimplemented-builtin (SIM110)

Why is this bad?

Using a builtin function is more concise and readable. Builtins are also
more efficient than for loops.

I am OK to turn off ruff rule SIM110 if you think that is best. I find that when I read the code out loud, the use of any() makes the code read the way I think. Others might disagree.

It's really unfortunate because I think that any()/all() is MUCH more readable than the for loop, and the vertical space saving is very nice. However, any seems to be about 1/3 as fast (Python 3.11.9, NixOS):

loop = 'abcdef'.split() found = 'f' nfound = 'g' def test1(): for x in loop: if x == found: return True return False def test2(): return any(x == found for x in loop) def test3(): for x in loop: if x == nfound: return True return False def test4(): return any(x == nfound for x in loop) if __name__ == "__main__": import timeit print('for loop (found) :', timeit.timeit(test1)) print('for loop (not found):', timeit.timeit(test3)) print('any() (found) :', timeit.timeit(test2)) print('any() (not found) :', timeit.timeit(test4))

for loop (found) : 0.051076093994197436 for loop (not found): 0.04388196699437685 any() (found) : 0.15422860698890872 any() (not found) : 0.15568504799739458

On Python 3.13 release candidate 1 (in Docker) on M1 Mac:

for loop (found) : 0.09455837499990594 for loop (not found): 0.09435533400028362 any() (found) : 0.27777291600068565 any() (not found) : 0.27852720800001407

Should we drop SIM110 ?

cclauss · 2024-07-31T11:22:01Z

until ruff supports all the flake8 rules I'm not eager to switch.

I am not advocating that you switch. Ruff is not yet at v1. flake8 and pylint continue to be adequate. This PR is about looking at a few simplifying code transformations. If a rule does not improve the code then we should drop it. If none of the rules improve the code then we should close the PR.

(Ruff initially didn't implement many rules because they suggested using black instead. Meson does not, and will not, use black because black produces terrible code. Even the rules that ruff has slowly added support for are mostly in preview.)

I recognize that you wish to format code by hand so I am not suggesting that you drop flake8 or pylint. I am not aware of any flake8 rules that ruff does not support except the Exxx rules in preview that are better handled by a code formatter. I don't think that the Ruff team spends time convincing people to use psf/black anymore given that they now have ruff format.

quoting dogma is not the same as stating a reason.

This is computer science, not religion or politics. The Why is this bad? section for each rule is a curated explanation. If the explanation does not justify the change then we should drop that rule.

because an automated linter includes the selectable option to say you CAN do it, therefore you SHOULD do it

I agree 100%. The initial commit --ignored five SIM rules and through our discussions, we have added a sixth.

cclauss · 2024-08-02T12:47:42Z

Removed all instances of SIM110 because it is less efficient than the code it releases.

Added a pull request to amend the Ruff docs at https://github.com/astral-sh/docs/pulls

cclauss requested review from dcbaker, mensinda, xclaesse and jpakkane as code owners July 30, 2024 08:34

cclauss force-pushed the simplify-with-ruff-rules-sim branch from d627982 to aa0a4ce Compare July 30, 2024 08:55

cclauss changed the title ~~Siimplify Python code with some ruff rules SIM~~ Simplify Python code with some ruff rules SIM Jul 30, 2024

eli-schwartz reviewed Jul 30, 2024

View reviewed changes

unittests/linuxliketests.py Show resolved Hide resolved

dcbaker requested changes Jul 30, 2024

View reviewed changes

cclauss force-pushed the simplify-with-ruff-rules-sim branch from aa0a4ce to 7ccadc6 Compare July 30, 2024 19:33

cclauss force-pushed the simplify-with-ruff-rules-sim branch from 7ccadc6 to 39be303 Compare July 30, 2024 22:11

cclauss commented Jul 30, 2024

View reviewed changes

eli-schwartz reviewed Jul 31, 2024

View reviewed changes

Siimplify Python code with some ruff rules SIM

96dbb2f

cclauss force-pushed the simplify-with-ruff-rules-sim branch from 39be303 to 96dbb2f Compare August 2, 2024 12:44

cclauss closed this Aug 16, 2024

cclauss deleted the simplify-with-ruff-rules-sim branch August 16, 2024 09:48

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Simplify Python code with some ruff rules SIM #13485

Simplify Python code with some ruff rules SIM #13485

cclauss commented Jul 30, 2024 •

edited

Loading

eli-schwartz left a comment

dcbaker commented Jul 30, 2024

cclauss commented Jul 30, 2024

cclauss commented Jul 30, 2024 •

edited

Loading

cclauss Jul 30, 2024

cclauss commented Jul 30, 2024

Advised linters alongside pylint

eli-schwartz commented Jul 31, 2024

eli-schwartz Jul 31, 2024

cclauss Jul 31, 2024

eli-schwartz Jul 31, 2024

cclauss Jul 31, 2024

eli-schwartz commented Jul 31, 2024

eli-schwartz Jul 31, 2024

cclauss Jul 31, 2024 •

edited

Loading

dcbaker Jul 31, 2024

cclauss Jul 31, 2024 •

edited

Loading

cclauss commented Jul 31, 2024 •

edited

Loading

cclauss commented Aug 2, 2024

	prefix = package if package else classes[0]
	prefix = package or classes[0]

		if os.path.isabs(libname) and os.path.exists(libname):
		return [libname]

Simplify Python code with some ruff rules SIM #13485

Simplify Python code with some ruff rules SIM #13485

Conversation

cclauss commented Jul 30, 2024 • edited Loading

in-dict-keys (SIM118)

What it does

Why is this bad?

Example

Fix safety

References

eli-schwartz left a comment

Choose a reason for hiding this comment

dcbaker commented Jul 30, 2024

cclauss commented Jul 30, 2024

if-else-block-instead-of-if-exp (SIM108)

What it does

Why is this bad?

Example

References

cclauss commented Jul 30, 2024 • edited Loading

cclauss Jul 30, 2024

Choose a reason for hiding this comment

cclauss commented Jul 30, 2024

Advised linters alongside pylint

eli-schwartz commented Jul 31, 2024

eli-schwartz Jul 31, 2024

Choose a reason for hiding this comment

cclauss Jul 31, 2024

Choose a reason for hiding this comment

eli-schwartz Jul 31, 2024

Choose a reason for hiding this comment

cclauss Jul 31, 2024

Choose a reason for hiding this comment

eli-schwartz commented Jul 31, 2024

eli-schwartz Jul 31, 2024

Choose a reason for hiding this comment

cclauss Jul 31, 2024 • edited Loading

Choose a reason for hiding this comment

reimplemented-builtin (SIM110)

Why is this bad?

dcbaker Jul 31, 2024

Choose a reason for hiding this comment

cclauss Jul 31, 2024 • edited Loading

Choose a reason for hiding this comment

cclauss commented Jul 31, 2024 • edited Loading

cclauss commented Aug 2, 2024

cclauss commented Jul 30, 2024 •

edited

Loading

cclauss commented Jul 30, 2024 •

edited

Loading

cclauss Jul 31, 2024 •

edited

Loading

cclauss Jul 31, 2024 •

edited

Loading

cclauss commented Jul 31, 2024 •

edited

Loading