Fix #21619: Refactor NotNullInfo to record every reference which is retracted once. #21624

noti0na1 · 2024-09-21T17:58:09Z

This PR improves the flow typing for returning and exceptions.

The NotNullInfo is defined as following now:

case class NotNullInfo(asserted: Set[TermRef] | Null, retracted: Set[TermRef]):

retracted contains variable references that are ever assigned to null;
if asserted is not null, it contains val or var references that are known to be not null, after the tree finishes executing normally (non-exceptionally);
if asserted is null, the tree is know to terminate, by throwing, returning, or calling a function with Nothing type. Hence, it acts like a universal set.

alt is defined as <a1,r1>.alt(<a2,r2>) = <a1 intersect a2, r1 union r2>.

The difficult part is the try ... catch ... finally .... We don't know at which point an exception is thrown in the body, and the catch cases may be not exhaustive, we have to collect any reference that is once retracted.

Fix #21619

olhotak · 2024-09-22T01:49:00Z

I think this is the right idea but reasoning about three sets is hard and it's hard to convince oneself that the implementation is correct for all cases.

I wonder if a simpler implementation would be possible with only two sets. Let's still call them asserted and retracted. We lift the invariant that asserted & retracted is empty, so it's possible for one variable to be in both. asserted is the set that we know is non-null at the very end of evaluating the expression (if no exception happens). retracted is the set that is retracted anywhere within the expression, even if it is later asserted. When we assert a variable, we add it to asserted but do not remove it from retracted. When we retract a variable, we add it to retracted and remove it from asserted. If we evaluate the expression to the very end, if a variable is in both asserted and retracted, we consider it asserted. When we're at a control flow point where the expression may not have executed to the end (because it threw an exception), we just clear the asserted set and keep only the retracted set. At control flow merges, we intersect asserted and union retracted. For a sequential composition (a1,r1) ; (a2,r2), we get ((a1-r2) | a2, r1|r2).

Do you think this would work or do you see any flaw in it?

I think maybe what I've described above are the current sets asserted and onceRetracted in this pull request. I guess the question is, if we have asserted and onceRetracted, can we get rid of retracted? (Then eventually we could rename onceRetracted to just retracted, to make the name shorter and simpler.)

olhotak · 2024-09-22T01:53:31Z

In the above proposal:

retracted means that the variable is set to null somewhere, anywhere in the expression
asserted means that the variable is set to non-null after all program points that could set it to null (if any)

noti0na1 · 2024-09-22T15:09:29Z

retracted means that the variable is set to null somewhere, anywhere in the expression

asserted means that the variable is set to non-null after all program points that could set it to null (if any)

This rule looks reasonable to me. I will think about more examples to double-check this.

noti0na1 · 2024-09-22T15:13:11Z

TODO: some other improvements I want for flow typing but haven't been implemented yet:

For definition like var a: String | Null = "", we should add a to asserted info immediately .
We only considered Nothing body for if right now. We should consider this for other control expressions as well.

olhotak · 2024-10-11T14:19:33Z

compiler/src/dotty/tools/dotc/typer/Nullables.scala


    /** The alternative path combination with another not-null info. Used to merge
     *  the nullability info of the two branches of an if.
     */
    def alt(that: NotNullInfo): NotNullInfo =
      NotNullInfo(this.asserted.intersect(that.asserted), this.retracted.union(that.retracted))

+    def withRetracted(that: NotNullInfo): NotNullInfo =


Here it would be useful to have a comment explaining when this function is intended to be used. The name explains what the function does, but the idea is to say why one would want to do that. (Compare with the alt function, whose name says when you want it.) It looks like the use case is similar to alt, but in a case when one branch is known to not complete execution normally (e.g. an infinite loop), only exceptionally. We can say that in the comment and/or consider incorporating that into the name somehow.

This also raises the question of what to do when both alternatives cannot complete normally. Theoretically one would want to assert all variables, but we don't have a way to do that in the implementation. Perhaps that case is not common enough in practice to care about.

I don't understand why we want to assert all variables if none of the alternatives complete normally? Do you have an example?

In general the analysis proves the following condition: in every execution trace leading to a given program point $p$, variable x is not null. When the code before $p$ does not terminate, there are no traces leading to $p$ so the condition is true for every variable. If there were an instruction derefencing x at $p$, that instruction would never execute, so it would be safe to assume that x is not null.

The specific case of withRetracted handles something like alt, but where the that branch does not complete normally. It is defined as <a1,r1>.withRetracted(<a2,r2>) = <a1, r1 union r2>. alt is defined as <a1,r1>.alt(<a2,r2>) = <a1 intersect a2, r1 union r2>. If we could represent the universal set in a2, then a1 intersect a2 would be a1, so we could implement the equivalent of withRetracted using alt.

I don't know whether actual use cases in practice justify the code complexity of adding a representation of the universal set to the implementation, but it is cleaner in theory.

If we don't do that, then the comment/name of withRetracted should suggest that it is like alt, but for the case where that cannot complete normally.

compiler/src/dotty/tools/dotc/typer/Nullables.scala

olhotak · 2024-10-11T14:31:10Z

compiler/src/dotty/tools/dotc/typer/Typer.scala

+      nnInfo = nothingCases.foldLeft(nnInfo):
+        (nni, c) => nni.withRetracted(c.notNullInfo)
+      if normalCases.nonEmpty then
+        nnInfo = nnInfo.seq(normalCases.map(_.notNullInfo).reduce(_.alt(_)))


This seems an abuse of seq, since there is no sequencing with the nothingCases; conceptually it is alt.

A conceptually cleaner approach would be to handle the normalCases first, then apply the withRetracted for the nothingCases to the result of the normalCases.

compiler/src/dotty/tools/dotc/typer/Typer.scala

tests/explicit-nulls/neg/i21380c.scala

noti0na1 · 2024-11-04T16:25:03Z

I tried to implemented a TerminatedInfo, whose asserted set acts like a universal set, but I don't like the result.

Ideally with the TerminatedInfo, we can analyse more cases where statements are followed by a terminating expression (type Nothing), like return, throw, and other applications with result type Nothing. However, the implementation becomes much more complicated, without significant improvement. We still cannot analyse any deeply-nested exceptions. In addition, I don't like the behaviour of treating all refs non-nullable after a termination point, even though I understand it is theoretical ok to do so.

If we ignore these Nothing` expressions, we would only lose some specific tracking, and I think it is an acceptable tradeoff.

val s: String | Null
if ... then 
  s = null
  return
else
  s = ""

s // treating s as non-nullable or not?

olhotak · 2024-11-04T16:47:53Z

OK, I don't think an Info with an asserted set that acts like a universal set is necessary.

Co-authored-by: Ondřej Lhoták <[email protected]>

olhotak

Otherwise LGTM.

olhotak · 2024-12-09T15:32:29Z

compiler/src/dotty/tools/dotc/typer/Nullables.scala

+    val retractedVars = ctx.notNullInfos.flatMap(info =>
+      if info.asserted == null then Set.empty
+      else info.asserted.filter(isRetracted)
+    ).toSet
    ctx.addNotNullInfo(NotNullInfo(Set(), retractedVars))


Here we lose knowledge about the loop being unreachable. The result always has asserted == Set(), even if it was null before.

I think this is fine for the NotNullInfo in the ctx. Suppose there is a terminated info in the ctx, adding a new non-terminated info will not change its behaviour: still treating all symbols as non-nullable.

olhotak · 2024-12-09T15:35:22Z

compiler/src/dotty/tools/dotc/typer/Typer.scala

  }

+  private def notNullInfoFromCases(initInfo: NotNullInfo, cases: List[CaseDef])(using Context): NotNullInfo =
+    if cases.nonEmpty then


In what instances would the cases be empty? Depending on the answer, perhaps a comment would be useful.

I don't think the cases can be empty. Just want to avoid the exception when calling reduce on empty list.

Well, it seems the empty cases can happen after inlining.

OK. A question is, if empty cases does happen, what does it mean in terms of runtime semantics? If it definitely fails to match, it would be more precise to return terminatedInfo, but initInfo is sound and safer in case empty cases means something different in some special situations.

I think it's fine to leave it as it is, and you can merge this PR now.

noti0na1 added the area:nullability label Sep 21, 2024

noti0na1 marked this pull request as ready for review September 21, 2024 23:48

noti0na1 requested a review from olhotak September 21, 2024 23:48

noti0na1 assigned olhotak Sep 21, 2024

olhotak assigned noti0na1 and unassigned olhotak Sep 22, 2024

noti0na1 force-pushed the fix-21619 branch from 4ab4b0a to 0a319be Compare October 11, 2024 04:24

noti0na1 assigned olhotak and unassigned noti0na1 Oct 11, 2024

olhotak reviewed Oct 11, 2024

View reviewed changes

olhotak assigned noti0na1 and unassigned olhotak Oct 11, 2024

noti0na1 and others added 6 commits December 6, 2024 16:59

Refactor NotNullInfo to record every reference which is retracted once.

585dda9

Use a different rule for NotNullInfo

bcc9e68

Consider cases with Nothing type

05c630a

Apply suggestions from code review

d44147b

Co-authored-by: Ondřej Lhoták <[email protected]>

Add terminated info

f859afe

Fix deep NotNullInfo

158af7d

noti0na1 force-pushed the fix-21619 branch from ada9da3 to 158af7d Compare December 6, 2024 16:01

noti0na1 requested a review from olhotak December 6, 2024 18:54

noti0na1 assigned olhotak and unassigned noti0na1 Dec 6, 2024

Treat asserted set of terminated NotNullInfo as universal set; fix test

00430c0

olhotak approved these changes Dec 9, 2024

View reviewed changes

olhotak assigned noti0na1 and unassigned olhotak Dec 9, 2024

Comment on the empty cases in notNullInfoFromCases.

200c038

noti0na1 merged commit ee0dd7a into scala:main Dec 10, 2024
29 checks passed

noti0na1 deleted the fix-21619 branch December 10, 2024 15:20

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix #21619: Refactor NotNullInfo to record every reference which is retracted once. #21624

Fix #21619: Refactor NotNullInfo to record every reference which is retracted once. #21624

noti0na1 commented Sep 21, 2024 •

edited

Loading

olhotak commented Sep 22, 2024

olhotak commented Sep 22, 2024

noti0na1 commented Sep 22, 2024

noti0na1 commented Sep 22, 2024 •

edited

Loading

olhotak Oct 11, 2024

noti0na1 Oct 14, 2024

olhotak Oct 14, 2024

olhotak Oct 11, 2024

noti0na1 commented Nov 4, 2024 •

edited

Loading

olhotak commented Nov 4, 2024

olhotak left a comment

olhotak Dec 9, 2024

noti0na1 Dec 10, 2024

olhotak Dec 9, 2024

noti0na1 Dec 10, 2024

noti0na1 Dec 10, 2024

olhotak Dec 10, 2024

Fix #21619: Refactor NotNullInfo to record every reference which is retracted once. #21624

Fix #21619: Refactor NotNullInfo to record every reference which is retracted once. #21624

Conversation

noti0na1 commented Sep 21, 2024 • edited Loading

olhotak commented Sep 22, 2024

olhotak commented Sep 22, 2024

noti0na1 commented Sep 22, 2024

noti0na1 commented Sep 22, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

noti0na1 commented Nov 4, 2024 • edited Loading

olhotak commented Nov 4, 2024

olhotak left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

noti0na1 commented Sep 21, 2024 •

edited

Loading

noti0na1 commented Sep 22, 2024 •

edited

Loading

noti0na1 commented Nov 4, 2024 •

edited

Loading