Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[red-knot] Consider all definitions after terminal statements unreachable #15676

Draft
wants to merge 7 commits into
base: main
Choose a base branch
from
Draft
Show file tree
Hide file tree
Changes from 4 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
@@ -0,0 +1,134 @@
# Terminal statements

## Introduction

Terminal statements complicate a naive control-flow analysis.

As a simple example:

```py
def f(cond: bool) -> str:
if cond:
x = "test"
else:
raise ValueError
return x

def g(cond: bool):
if cond:
x = "test"
reveal_type(x) # revealed: Literal["test"]
else:
x = "unreachable"
reveal_type(x) # revealed: Literal["unreachable"]
raise ValueError
reveal_type(x) # revealed: Literal["test"]
```

In `f`, we should be able to determine that the `else` branch ends in a terminal statement, and that
the `return` statement can only be executed when the condition is true. We should therefore consider
the reference always bound, even though `x` is only bound in the true branch.

Similarly, in `g`, we should see that the assignment of the value `"unreachable"` can never be seen
by the final `reveal_type`.

## `return` is terminal

```py
def f(cond: bool) -> str:
if cond:
x = "test"
else:
return "early"
return x
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to clarify, since it took me a moment on first read, even though you just explained it above:

Suggested change
return x
return x # no possibly-unresolved-reference diagnostic!


def g(cond: bool):
if cond:
x = "test"
reveal_type(x) # revealed: Literal["test"]
else:
x = "unreachable"
reveal_type(x) # revealed: Literal["unreachable"]
return
reveal_type(x) # revealed: Literal["test"]
```

## `continue` is terminal within its loop scope

```py
def f(cond: bool) -> str:
while True:
if cond:
x = "test"
else:
continue
return x

def g(cond: bool, i: int):
x = "before"
while i < 5:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You may want to switch to for loop instead or increase i's value inside the loop, otherwise it's infinite loop when i<5.

if cond:
x = "loop"
reveal_type(x) # revealed: Literal["loop"]
else:
x = "continue"
reveal_type(x) # revealed: Literal["continue"]
continue
reveal_type(x) # revealed: Literal["loop"]
reveal_type(x) # revealed: Literal["before", "loop"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious in which combination of cond and i it'd be Literal["loop"] at here.

```

## `break` is terminal within its loop scope

```py
def f(cond: bool) -> str:
while True:
if cond:
x = "test"
else:
break
return x
return "late"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could return x here as well and expect the possibly-unresolved-reference error?


def g(cond: bool, i: int):
x = "before"
while i < 5:
if cond:
x = "loop"
reveal_type(x) # revealed: Literal["loop"]
else:
x = "break"
reveal_type(x) # revealed: Literal["break"]
break
reveal_type(x) # revealed: Literal["loop"]
reveal_type(x) # revealed: Literal["before", "loop", "break"]
```

## `return` is terminal in nested scopes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think "nested scopes" is the right term here; Python if statements don't establish nested scopes. This heading would suggest a test with e.g. a nested function.

Suggested change
## `return` is terminal in nested scopes
## `return` is terminal in nested conditionals


```py
def f(cond1: bool, cond2: bool) -> str:
if cond1:
if cond2:
x = "test1"
else:
return "early"
else:
x = "test2"
return x

def g(cond1: bool, cond2: bool):
if cond1:
if cond2:
x = "test1"
reveal_type(x) # revealed: Literal["test1"]
else:
x = "unreachable"
reveal_type(x) # revealed: Literal["unreachable"]
return
reveal_type(x) # revealed: Literal["test1"]
else:
x = "test2"
reveal_type(x) # revealed: Literal["test2"]
reveal_type(x) # revealed: Literal["test1", "test2"]
```
27 changes: 22 additions & 5 deletions crates/red_knot_python_semantic/src/semantic_index/builder.rs
Original file line number Diff line number Diff line change
Expand Up @@ -368,6 +368,13 @@ impl<'db> SemanticIndexBuilder<'db> {
.record_visibility_constraint(VisibilityConstraint::VisibleIf(constraint))
}

/// Records that all remaining statements in the current block are unreachable, and therefore
/// not visible.
fn record_unreachable_visibility(&mut self) -> ScopedVisibilityConstraintId {
let id = self.add_visibility_constraint(VisibilityConstraint::AlwaysTrue);
self.record_negated_visibility_constraint(id)
}

/// Records a [`VisibilityConstraint::Ambiguous`] constraint.
fn record_ambiguous_visibility(&mut self) -> ScopedVisibilityConstraintId {
self.current_use_def_map_mut()
Expand Down Expand Up @@ -1019,11 +1026,6 @@ where
}
self.visit_body(body);
}
ast::Stmt::Break(_) => {
if self.loop_state().is_inside() {
self.loop_break_states.push(self.flow_snapshot());
}
}

ast::Stmt::For(
for_stmt @ ast::StmtFor {
Expand Down Expand Up @@ -1270,6 +1272,21 @@ where
// - https://github.com/astral-sh/ruff/pull/13633#discussion_r1788626702
self.visit_body(finalbody);
}

ast::Stmt::Raise(_) | ast::Stmt::Return(_) | ast::Stmt::Continue(_) => {
walk_stmt(self, stmt);
// Everything in the current block after a terminal statement is unreachable.
self.record_unreachable_visibility();
}

ast::Stmt::Break(_) => {
if self.loop_state().is_inside() {
self.loop_break_states.push(self.flow_snapshot());
}
// Everything in the current block after a terminal statement is unreachable.
self.record_unreachable_visibility();
}

_ => {
walk_stmt(self, stmt);
}
Expand Down
16 changes: 0 additions & 16 deletions crates/ruff_benchmark/benches/red_knot.rs
Original file line number Diff line number Diff line change
Expand Up @@ -25,22 +25,6 @@ const TOMLLIB_312_URL: &str = "https://raw.githubusercontent.com/python/cpython/
static EXPECTED_DIAGNOSTICS: &[&str] = &[
// We don't support `*` imports yet:
"error[lint:unresolved-import] /src/tomllib/_parser.py:7:29 Module `collections.abc` has no member `Iterable`",
// We don't support terminal statements in control flow yet:
"warning[lint:possibly-unresolved-reference] /src/tomllib/_parser.py:66:18 Name `s` used when possibly not defined",
"warning[lint:possibly-unresolved-reference] /src/tomllib/_parser.py:98:12 Name `char` used when possibly not defined",
"warning[lint:possibly-unresolved-reference] /src/tomllib/_parser.py:101:12 Name `char` used when possibly not defined",
"warning[lint:possibly-unresolved-reference] /src/tomllib/_parser.py:104:14 Name `char` used when possibly not defined",
"warning[lint:possibly-unresolved-reference] /src/tomllib/_parser.py:115:14 Name `char` used when possibly not defined",
"warning[lint:possibly-unresolved-reference] /src/tomllib/_parser.py:126:12 Name `char` used when possibly not defined",
"warning[lint:possibly-unresolved-reference] /src/tomllib/_parser.py:348:20 Name `nest` used when possibly not defined",
"warning[lint:possibly-unresolved-reference] /src/tomllib/_parser.py:353:5 Name `nest` used when possibly not defined",
"warning[lint:possibly-unresolved-reference] /src/tomllib/_parser.py:453:24 Name `nest` used when possibly not defined",
"warning[lint:possibly-unresolved-reference] /src/tomllib/_parser.py:455:9 Name `nest` used when possibly not defined",
"warning[lint:possibly-unresolved-reference] /src/tomllib/_parser.py:482:16 Name `char` used when possibly not defined",
"warning[lint:possibly-unresolved-reference] /src/tomllib/_parser.py:566:12 Name `char` used when possibly not defined",
"warning[lint:possibly-unresolved-reference] /src/tomllib/_parser.py:573:12 Name `char` used when possibly not defined",
"warning[lint:possibly-unresolved-reference] /src/tomllib/_parser.py:579:12 Name `char` used when possibly not defined",
"warning[lint:possibly-unresolved-reference] /src/tomllib/_parser.py:580:63 Name `char` used when possibly not defined",
// We don't handle intersections in `is_assignable_to` yet
"error[lint:invalid-argument-type] /src/tomllib/_parser.py:626:46 Object of type `@Todo & ~AlwaysFalsy` cannot be assigned to parameter 1 (`match`) of function `match_to_datetime`; expected type `Match`",
"warning[lint:possibly-unresolved-reference] /src/tomllib/_parser.py:629:38 Name `datetime_obj` used when possibly not defined",
Expand Down
Loading