Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[flake8-simplify] Also simplify annotated assignments (SIM108) #15665

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

InSyncWithFoo
Copy link
Contributor

Summary

Resolves #15554.

Test Plan

cargo nextest run and cargo insta test.

@InSyncWithFoo
Copy link
Contributor Author

I also took the liberty to do some minor refactoring. Feel free to revert if they are undesired.

Copy link
Contributor

github-actions bot commented Jan 22, 2025

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

ℹ️ ecosystem check detected linter changes. (+12 -0 violations, +0 -0 fixes in 4 projects; 51 projects unchanged)

apache/airflow (+6 -0 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --no-fix --output-format concise --preview --select ALL

+ airflow/models/taskinstance.py:2174:9: SIM108 Use ternary operator `map_index: int | None = None if ti.map_index < 0 else ti.map_index` instead of `if`-`else`-block
+ airflow/utils/timeout.py:85:1: SIM108 Use ternary operator `timeout: type[TimeoutWindows | TimeoutPosix] = TimeoutWindows if IS_WINDOWS else TimeoutPosix` instead of `if`-`else`-block
+ providers/src/airflow/providers/amazon/aws/hooks/logs.py:192:13: SIM108 Use ternary operator `token_arg: dict[str, str] = {"nextToken": next_token} if next_token is not None else {}` instead of `if`-`else`-block
+ providers/src/airflow/providers/amazon/aws/triggers/ecs.py:203:13: SIM108 Use ternary operator `token_arg: dict[str, str] = {"nextToken": next_token} if next_token is not None else {}` instead of `if`-`else`-block
+ providers/src/airflow/providers/google/cloud/triggers/bigquery.py:438:21: SIM108 Use ternary operator `first_job_row: str | None = None if not first_records else first_records.pop(0)` instead of `if`-`else`-block
+ providers/src/airflow/providers/google/cloud/triggers/bigquery.py:445:21: SIM108 Use ternary operator `second_job_row: str | None = None if not second_records else second_records.pop(0)` instead of `if`-`else`-block

bokeh/bokeh (+1 -0 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --no-fix --output-format concise --preview --select ALL

+ src/bokeh/command/subcommands/file_output.py:148:9: SIM108 Use ternary operator `outputs: list[str] = [] if args.output is None else list(args.output)` instead of `if`-`else`-block

pypa/cibuildwheel (+1 -0 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --no-fix --output-format concise --preview

+ cibuildwheel/oci_container.py:443:9: SIM108 Use ternary operator `output_io: IO[bytes] = io.BytesIO() if capture_output else sys.stdout.buffer` instead of `if`-`else`-block

zulip/zulip (+4 -0 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --no-fix --output-format concise --preview --select ALL

+ zerver/actions/bots.py:145:9: SIM108 Use ternary operator `stream_name: str | None = stream.name if stream else None` instead of `if`-`else`-block
+ zerver/actions/bots.py:186:9: SIM108 Use ternary operator `stream_name: str | None = stream.name if stream else None` instead of `if`-`else`-block
+ zerver/lib/narrow.py:339:9: SIM108 Use ternary operator `maybe_negate: ConditionTransform = not_ if negated else lambda cond: cond` instead of `if`-`else`-block
+ zilencer/management/commands/calculate_first_visible_message_id.py:31:9: SIM108 Use ternary operator `realms: Iterable[Realm] = Realm.objects.all() if target_realm is None else [target_realm]` instead of `if`-`else`-block

Changes by rule (1 rules affected)

code total + violation - violation + fix - fix
SIM108 12 12 0 0 0

@MichaReiser
Copy link
Member

MichaReiser commented Jan 22, 2025

Thanks for working on this. As mentioned on the issue, I'm not sure if we should do this (as discussed in the issue) because it can lead to new or different typing errors.

@InSyncWithFoo
Copy link
Contributor Author

I made extra care only to rewrite when the body branch has an annotation but the else branch doesn't. This is perhaps safe enough, as the pattern is well-recognized by various type checkers (Mypy, Pyright, Pyre, Pytype, perhaps even PyCharm; Red Knot too will need to support it).

@Daverball
Copy link
Contributor

Daverball commented Jan 22, 2025

In fact annotating both branches is considered a redefinition error (even if both annotations are the same), so this should be fine.

What I do think is worth looking into however is if we can handle type declarations that have been moved outside the two branches. I find myself doing that quite often, since it makes it more clear that the variable is shared between the two branches and that the value will be used afterwards.

So concretely the following:

x: int
if foo:
    x = 5
else:
    x = 0

becomes:

x: int = 5 if foo else 0

rather than

x: int
x = 5 if foo else 0

@InSyncWithFoo
Copy link
Contributor Author

x: int
x = 5 if foo else 0

This should probably be handled by another rule, similar to the UP031/UP032 and FURB140/RUF058 pairs. Such a rule doesn't exist yet, however.

@MichaReiser
Copy link
Member

In fact annotating both branches is considered a redefinition error (even if both annotations are the same), so this should be fine.

I'd have to test if that's true for all existing type checkers or is this defined anywhere in the typing spec?

I'm still leaning towards leaving it as is because I currently can't prioritize doing the necessary research to see if this is safe for all type checker and while nice, it doesn't seem urgent. I can try to find time to review this change if someone invests the necessary time and documents the PR accordingly (where does it provide fixes, why, and why is it safe in those circumstnaces)

@InSyncWithFoo
Copy link
Contributor Author

It is rather hard to explain why the fix is safe (it is actually always marked as unsafe). Here's how I would explain it for someone who has basic knowledge of static typing in Python:

This pattern is recognized by major type checkers, which won't consider the second assignment as a redeclaration. Simplifying the if block to a single assignment thus won't cause a type checking error.

On the other hand, if both assignments have annotations or only the second have, some type checkers might report an error:

(playgrounds: Mypy, Pyright, Pyre)

if flag:
    a: int = 1
else:
    a = 2       # mypy, pyright, pyre, pytype => fine, not redeclaration

if flag:
    b: int = 1
else:
    b: int = 2  # pyright, pyre, pytype       => fine, not redeclaration
                # mypy                        => error, redeclaration

if flag:
    c = 1
else:
    c: int = 2  # pyright, pyre, pytype       => fine, not declaration
                # mypy                        => error, declaration

Is that easy to understand?

@Daverball
Copy link
Contributor

I think the other thing @MichaReiser is worried about is inference behavior if one of the branches doesn't match the annotated type of the other branch.

Unfortunately it seems like pytype is actually extremely forgiving here, so the following would have a different output for pytype:

if flag:
    a: int = 1
else:
    a = '2'
reveal_type(a)  # Union[int, str]

than the reformatted version:

a: int = 1 if flag else '2'  # error: annotation-type-mismatch
reveal_type(a)  # int

But other than that I don't think there's any type checker that would pass the above example. Even pyre disallows it, but it allows you to have different annotations for each branch and then infers the union outside the branch, just like pytype.

@InSyncWithFoo
Copy link
Contributor Author

Unfortunately it seems like pytype is actually extremely forgiving here [...]

I think this is not too big a problem, considering that the fix is marked as unsafe, but, yes, it is nevertheless a major drawback of the PR.

@MichaReiser MichaReiser added rule Implementing or modifying a lint rule needs-decision Awaiting a decision from a maintainer labels Jan 22, 2025
@MichaReiser
Copy link
Member

I think this airflow example is interesting because it changes the declared types:

https://github.com/apache/airflow/blob/66b1712acc548a582267b7fc5e87149ca9cbb59e/airflow/utils/timeout.py#L85-L88

@InSyncWithFoo
Copy link
Contributor Author

InSyncWithFoo commented Jan 22, 2025

@MichaReiser I'm not sure if I follow. That block would be simplified to:

timeout: type[TimeoutWindows | TimeoutPosix] = TimeoutWindows if IS_WINDOWS else TimeoutPosix

...which is perfectly fine type-checking-wise as far as I can tell.

@MichaReiser
Copy link
Member

It's fine for Windows but it results in a wider type for unix platforms where the type was simply:

timeout: type[TimeoutPosix] = TimeoutPosix

at least if the type checker understands the platform-specific branches.

@InSyncWithFoo
Copy link
Contributor Author

@MichaReiser Normally sys.platform is used for that purpose:

import sys

if sys.platform == 'win32':
	timeout = TimeoutWindows
else:
	timeout = TimeoutPosix

SIM108 does indeed recognize this as well as TYPE_CHECKING. As for custom variables, that should be the topic of another issue/PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs-decision Awaiting a decision from a maintainer rule Implementing or modifying a lint rule
Projects
None yet
Development

Successfully merging this pull request may close these issues.

SIM108 false negative when variable has type annotation
3 participants