Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Minor optimizations in Python code #13478

Closed
wants to merge 1 commit into from

Conversation

cclauss
Copy link
Contributor

@cclauss cclauss commented Jul 27, 2024

% ruff check --select=C4,PERF,PIE810 --statistics | grep "\[\*\]"

Edit as requested below:

Ruff has two main command verbs: check to lint Python code and format to format code in a way that is highly compatible with psf/black. The --select option allows us to choose which of the 800+ linting rules we wish to apply to the code. Here we focus on minor optimizations. The --statistics option provides only summary information.

The grep "\[\*\]" command says that we are only interested in rules that have automated fixers.

The detailed description of each rule including Why is this bad? can be obtained with a command of the form ruff rule RUF017 as demonstrated below.

24	PIE810 	[*] multiple-starts-ends-with
13	C408   	[*] unnecessary-collection-call
 8	C419   	[*] unnecessary-comprehension-in-call
 5	C416   	[*] unnecessary-comprehension
 2	C400   	[*] unnecessary-generator-list
 2	C413   	[*] unnecessary-call-around-sorted
 2	PERF102	[*] incorrect-dict-iterator
 1	C401   	[*] unnecessary-generator-set
 1	C405   	[*] unnecessary-literal-set
 1	C417   	[*] unnecessary-map

% ruff check --select=C4,PERF,PIE810 --fix --unsafe-fixes

Found 140 errors (58 fixed, 82 remaining).

% ruff rule RUF017

quadratic-list-summation (RUF017)

Derived from the Ruff-specific rules linter.

Fix is always available.

What it does

Checks for the use of sum() to flatten lists of lists, which has
quadratic complexity.

Why is this bad?

The use of sum() to flatten lists of lists is quadratic in the number of
lists, as sum() creates a new list for each element in the summation.

Instead, consider using another method of flattening lists to avoid
quadratic complexity. The following methods are all linear in the number of
lists:

  • functools.reduce(operator.iadd, lists, [])
  • list(itertools.chain.from_iterable(lists))
  • [item for sublist in lists for item in sublist]

When fixing relevant violations, Ruff defaults to the functools.reduce
form, which outperforms the other methods in microbenchmarks.

Example

lists = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
joined = sum(lists, [])

Use instead:

import functools
import operator


lists = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
functools.reduce(operator.iadd, lists, [])

References

Copy link
Member

@eli-schwartz eli-schwartz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the commit message could probably be a bit better (nothing huge, but I'd at least include in the extended description the command you used to reproduce the changes).

The other big thing to me is my comment below about the second commit.

mesonbuild/ast/interpreter.py Show resolved Hide resolved
mesonbuild/compilers/cuda.py Outdated Show resolved Hide resolved
@@ -233,7 +235,7 @@ def _shield_nvcc_list_arg(cls, arg: str, listmode: bool = True) -> str:
# There are single quotes. Double-quote them, and single-quote the
# strings between them.
l = [cls._shield_nvcc_list_arg(s) for s in arg.split(SQ)]
l = sum([[s, DQSQ] for s in l][:-1], []) # Interleave l with DQSQs
l = functools.reduce(operator.iadd, [[s, DQSQ] for s in l][:-1], []) # Interleave l with DQSQs
return ''.join(l)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This whole logic feels kinda hairy honestly. I wonder if there's a better way to do it that doesn't need additional imports...

Copy link
Contributor Author

@cclauss cclauss Jul 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hairy indeed. The commit message lists the three linear solutions and why ruff defaults to this one. Given its hairiness, let's revert this change and deal with it in a separate PR.

run_meson_command_tests.py Outdated Show resolved Hide resolved
test cases/unit/101 relative find program/foo.py Outdated Show resolved Hide resolved
mesonbuild/arglist.py Outdated Show resolved Hide resolved
mesonbuild/compilers/mixins/visualstudio.py Outdated Show resolved Hide resolved
@cclauss cclauss force-pushed the minor-optimizations branch 2 times, most recently from 039d1ec to d754adf Compare July 29, 2024 17:19
run_project_tests.py Outdated Show resolved Hide resolved
unittests/allplatformstests.py Outdated Show resolved Hide resolved
Copy link
Member

@eli-schwartz eli-schwartz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, please no "Apply suggestions from code review" commits or other forms of fixup for a previous commit in the same PR.

@cclauss
Copy link
Contributor Author

cclauss commented Jul 30, 2024

Can someone please stop and rerun the runaway GitHub Action? Five hours is a bit too long!

@eli-schwartz
Copy link
Member

$ git --no-pager range-diff origin/master 3f998e47b 6709c1800
1:  3f998e47b ! 1:  6709c1800 minor optimizations
    @@ Metadata
      ## Commit message ##
         minor optimizations
     
    - ## ci/ciimage/build.py ##
    -@@ ci/ciimage/build.py: class ImageDef:
    -         data = json.loads(path.read_text(encoding='utf-8'))
    - 
    -         assert isinstance(data, dict)
    --        assert all([x in data for x in ['base_image', 'env']])
    -+        assert all(x in data for x in ['base_image', 'env'])
    -         assert isinstance(data['base_image'], str)
    -         assert isinstance(data['env'],  dict)
    - 
    -
      ## docs/jsonvalidator.py ##
     @@ docs/jsonvalidator.py: root: dict

Any particular reason for this? Just curious...

Copy link
Member

@dcbaker dcbaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes here look good to me. Once Eli is happy with everything I'm fine to merge this.

@cclauss
Copy link
Contributor Author

cclauss commented Jul 30, 2024

I reverted ci/ciimage/build.py because the three build_images jobs were failing…. I could not see how this modification would cause tests to fail but out of an abundance of caution…

@eli-schwartz
Copy link
Member

They fail in git master. Those jobs are only run on a weekly schedule or whenever the job definition itself changes -- so avoiding to change the job definition, doesn't make any difference to the test other than preventing its status from showing up.

It is perfectly fine to include that change...

@cclauss cclauss closed this Aug 16, 2024
@cclauss cclauss deleted the minor-optimizations branch August 16, 2024 09:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants