Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[flake8-type-checking] Adds implementation for TC007 and TC008 #12927

Open
wants to merge 37 commits into
base: main
Choose a base branch
from

Conversation

Daverball
Copy link
Contributor

@Daverball Daverball commented Aug 16, 2024

This is part of a series of pull requests fulfilling my promise in #9573 to port some of the enhancements and new rules from flake8-type-checking to ruff.

Summary

This adds the missing flake8-type-checking rules TC007 and TC008 including auto fixes.

There is some overlap between TC007 and TC004, currently the existing rule takes precedence, although ideally we would still emit both violations and share a fix for overlaps based on the selected strategy (if it is possible for violations of two different rules to share the same fix). We could probably move the analysis for TC007 into the same function as TC004 in order to accomplish that. But we could defer that to a future pull request.

There is also some potential overlap between TC008 and TC010. Currently TC010 is prioritized, but since it currently has no fix it might make more sense to either prioritize TC008 or add a fix for TC010, although that could be covered by a future pull request.

Test Plan

cargo nextest run

The implementation for TCH007 is incomplete and also requires some
changes to TCH004 in order to avoid conflicts between TCH004 and TCH007
Copy link
Contributor

github-actions bot commented Aug 16, 2024

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

ℹ️ ecosystem check detected linter changes. (+6 -0 violations, +0 -0 fixes in 2 projects; 52 projects unchanged)

latchbio/latch (+1 -0 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview

+ src/latch_cli/snakemake/config/utils.py:13:33: TC008 [*] Remove quotes from type alias

zulip/zulip (+5 -0 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview --select ALL

+ analytics/views/stats.py:342:14: TC008 [*] Remove quotes from type alias
+ analytics/views/stats.py:344:16: TC008 [*] Remove quotes from type alias
+ confirmation/models.py:76:5: TC008 [*] Remove quotes from type alias
+ confirmation/models.py:77:5: TC008 [*] Remove quotes from type alias
+ zerver/lib/push_notifications.py:75:49: TC008 [*] Remove quotes from type alias

Changes by rule (1 rules affected)

code total + violation - violation + fix - fix
TC008 6 6 0 0 0

@Daverball

This comment was marked as resolved.

Copy link

codspeed-hq bot commented Aug 16, 2024

CodSpeed Performance Report

Merging #12927 will not alter performance

Comparing Daverball:feat/tch007-tch008 (c8e0d99) with main (e25e704)

Summary

✅ 32 untouched benchmarks

@Daverball

This comment was marked as resolved.

@Daverball

This comment was marked as resolved.

@Daverball

This comment was marked as resolved.

@Daverball
Copy link
Contributor Author

@charliermarsh It might be worth to move the TCH010 logic to where the TCH008 logic currently sits, so we can actually autofix TCH010 in the same cases where we would otherwise emit a TCH008 by removing the quotes when we know that the quoted expression should be entirely available at runtime.

Concretely we would pass it at the stage where we parse deferred string annotations and check if the current parent expression is a | BinOp. Rather than walk the AST of type definitions to collect the string literals (which doesn't appear to currently properly handle deeply nested cases like eg. list["Foo" | None]).

The drawback would be that it may be more difficult to create an autofix in the opposite case that expands the quotes to the entire type expression, since we would need to walk the parent nodes in order to determine the root node of the type expression.

@Daverball Daverball marked this pull request as ready for review August 20, 2024 16:01
@MichaReiser MichaReiser added the rule Implementing or modifying a lint rule label Sep 2, 2024
@charliermarsh
Copy link
Member

Thanks @Daverball, sorry that I haven't had a chance to review this yet.

@Daverball
Copy link
Contributor Author

@charliermarsh Just a friendly reminder in case this PR slipped through the cracks somehow

@dhruvmanila dhruvmanila added the preview Related to preview mode features label Nov 18, 2024
Copy link
Member

@MichaReiser MichaReiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice. This overall looks good to me, but I know very little about typing or that part of the code base. That's why it would be great to get a second opinion from either @carljm or @AlexWaygood

crates/ruff_linter/src/rules/flake8_type_checking/mod.rs Outdated Show resolved Hide resolved
/// from foo import Foo
/// OptFoo: TypeAlias = "Foo | None"
/// ```
///
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be great if we can fill the gap for new rules. But I agree, filling the gap for existing rules can be done in a separate PR

Comment on lines +380 to +381
// FIXME: This does not seem quite right, if only TC004 is enabled
// then we don't need to collect the runtime imports
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the reason we can't fix this today?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can, I just didn't want to introduce unrelated changes to this PR, so I opted for a comment as reminder to myself to clean this up in a follow-up pull request.

crates/ruff_linter/src/checkers/ast/mod.rs Outdated Show resolved Hide resolved
crates/ruff_python_semantic/src/binding.rs Outdated Show resolved Hide resolved
Comment on lines 665 to 666
// FIXME: Shouldn't this happen above where `class_variables_visible` is set?
seen_function |= scope.kind.is_function();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you expand on why you think this should happen above?

My assumption is that it's here because it updates the state for the next iteration. The only possible issue I see is that it fails updating the seen_function if the self.bindings[binding_id].kind is an Annotation. I, unfortunately, don't have a good enough understanding of this code to assess whether this is a) possible and b) a problem

Copy link
Contributor Author

@Daverball Daverball Nov 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fairly confident it's incorrect to skip updating seen_function if you see a BindingKind::Annotation.

That being said, situations where this would actually lead to the wrong result should be incredibly rare, since the corresponding Python code itself would be highly suspect and probably incorrect.

/// Simulates a runtime load of a given [`ast::ExprName`].
///
/// This should not be run until after all the bindings have been visited.
pub fn simulate_runtime_load(&self, name: &ast::ExprName) -> Option<BindingId> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think I'm the right person reviewing this. @AlexWaygood or @carljm could either of you take a look to see if that makes sense (overall)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took a look and left some comments, though I think familiarity with Ruff's semantic model would make @AlexWaygood a better reviewer here.

Ensures there's no circularity issues between `TCH007` and `TCH008` by
always running test cases with both rules enabled.
crates/ruff_linter/src/checkers/ast/mod.rs Outdated Show resolved Hide resolved
Comment on lines +672 to +673
/// Simulates a runtime load of a given [`ast::ExprName`].
///
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was very much unclear to me, until I spent quite a while reading through the code, what it meant to "simulate a runtime load" and why or how this is different from what lookup_symbol and resolve_load do.

It seems the differences are a) this method ignores typing-only bindings (this explains the name choice), and b) this method is supposed to run after all bindings have been collected, but still do a flow-sensitive lookup (i.e. find the applicable binding for a name at a particular point in control flow, which is not necessarily the end of the scope.) Or semi-flow-sensitive, anyway; flow-sensitive as approximated by "source order within the scope, with no understanding of branches."

Difference (a) makes sense, though taken alone it seems like it could be a flag rather than an entirely separate method.

I have a few questions/thoughts about difference (b), some of which may be due to lack of familiarity with Ruff and its semantic model.

It seems like this can be an improvement over Ruff's current model in some cases. But why is this PR the one introducing this capability? What is it about TCH007 and TCH008 that require this capability, when Ruff's many other rules involving looking up name bindings have so far gotten away without this?

My understanding is that largely Ruff handles this by doing name lookups while constructing the semantic model, so inapplicable bindings just don't exist yet. Is that approach not workable in this case? Why do we need a method that must instead be run after all bindings are collected, but then ignores the "late" ones, in inner scopes?

If this capability isn't critical to the functioning of the new rules specifically, but is really a more general improvement to Ruff's semantic model, perhaps it should be looked at more holistically in view of all Ruff rules, and not introduced as part of this PR?

If we do definitely need this new method for these new rules to work at all, then I think this also should be more clearly described in a longer doc comment on the method.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that largely Ruff handles this by doing name lookups while constructing the semantic model, so inapplicable bindings just don't exist yet. Is that approach not workable in this case? Why do we need a method that must instead be run after all bindings are collected, but then ignores the "late" ones, in inner scopes?

The main reason is that TCH007/TCH008 perform speculative lookups, rather than the ones that would actually happen at runtime, to ensure that we don't produce violations for the parts of the type alias value that need to available at runtime and vice-versa. (in fact they actually perform both the speculative and regular lookup to avoid violations/fixes going in circles)

The normal flow of the semantic model does not cover the speculative nature of questions like "what if this annotation wasn't a forward reference" or "what if this annotation was turned into a forward reference?".

There are certainly other ways to implement this check (i.e. performing the speculative lookup at the right time during semantic analysis), but this seemed like the least disruptive one (especially since at the time I wrote this there wasn't yet a cache for parsing string annotations) and matches my implementation in the flake8-plugin I wrote, so I can be more confident that it works and produces zero false positives in several large code-bases of mine.

I was also worried that speculatively walking the type definition earlier/later than it was supposed to could lead to other rules triggering or unwanted state leaking into the semantic model. It would be difficult to ensure that this excursion would have no unwanted side-effects.

Difference (a) makes sense, though taken alone it seems like it could be a flag rather than an entirely separate method.

And in fact in my own implementation in the flake8 plugin they are the same method with a flag. However the potentially incorrect placement of the seen_function update prompted me to keep this method separate for now, in order to avoid potentially subtle regressions in lookup_symbol.

We could either take the risk and merge these methods now or put it off until later, when we're more confident that the method is stable.

/// Simulates a runtime load of a given [`ast::ExprName`].
///
/// This should not be run until after all the bindings have been visited.
pub fn simulate_runtime_load(&self, name: &ast::ExprName) -> Option<BindingId> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took a look and left some comments, though I think familiarity with Ruff's semantic model would make @AlexWaygood a better reviewer here.

crates/ruff_python_semantic/src/model.rs Outdated Show resolved Hide resolved
crates/ruff_python_semantic/src/model.rs Outdated Show resolved Hide resolved
crates/ruff_python_semantic/src/model.rs Outdated Show resolved Hide resolved
@Daverball Daverball changed the title [flake8-type-checking] Adds implementation for TCH007 and TCH008 [flake8-type-checking] Adds implementation for TC007 and TC008 Nov 20, 2024
@Daverball Daverball changed the title [flake8-type-checking] Adds implementation for TC007 and TC008 [flake8-type-checking] Adds implementation for TC007 and TC008 Nov 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
preview Related to preview mode features rule Implementing or modifying a lint rule
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants