Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor the abstract domain of global init checker to compile http4s #22179

Open
wants to merge 10 commits into
base: main
Choose a base branch
from

Conversation

EnzeXing
Copy link
Contributor

@EnzeXing EnzeXing commented Dec 10, 2024

This is a draft PR that contains fixes to the initialization checker, so that it can successfully check http4s in community-build

[test_scala2_library_tasty]

@EnzeXing EnzeXing changed the title Refactor pattern matching, skipping cases when safe to do so Refactor pattern matching in global init checker, skipping cases when safe to do so Dec 10, 2024
@EnzeXing EnzeXing force-pushed the fix-http4s branch 2 times, most recently from c64586b to 49b8d44 Compare December 14, 2024 21:13
@EnzeXing EnzeXing marked this pull request as ready for review January 13, 2025 13:29
@EnzeXing EnzeXing changed the title Refactor pattern matching in global init checker, skipping cases when safe to do so Refactor the abstract domain of global init checker to compile http4s Feb 5, 2025
@EnzeXing
Copy link
Contributor Author

@liufengyun @olhotak I think this PR is ready for review. Could you review it?

@liufengyun
Copy link
Contributor

@EnzeXing I enabled the test_scala2_library_tasty check by modifying the PR description. Could you please touch your last commit to trigger the tests?

@@ -443,7 +443,7 @@ private sealed trait YSettings:
val YnoKindPolymorphism: Setting[Boolean] = BooleanSetting(ForkSetting, "Yno-kind-polymorphism", "Disable kind polymorphism.")
val YexplicitNulls: Setting[Boolean] = BooleanSetting(ForkSetting, "Yexplicit-nulls", "Make reference types non-nullable. Nullable types can be expressed with unions: e.g. String|Null.")
val YnoFlexibleTypes: Setting[Boolean] = BooleanSetting(ForkSetting, "Yno-flexible-types", "Disable turning nullable Java return types and parameter types into flexible types, which behave like abstract types with a nullable lower bound and non-nullable upper bound.")
val YcheckInitGlobal: Setting[Boolean] = BooleanSetting(ForkSetting, "Ysafe-init-global", "Check safe initialization of global objects.")
val YcheckInitGlobal: Setting[String] = ChoiceSetting(ForkSetting, "Ysafe-init-global", "[report-unknown, ignore-unknown]", "Check safe initialization of global objects.", List("report-unknown", "ignore-unknown", "off"), "off")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seeing from future, anything defined here will become some constraints when we stablize the flag. I think It's a bit too early to decide on what options to put here. Therefore, I'd suggest not change this flag.

If I understand correctly, currently the main motivation for introducing the options is for debugging instead of for end users. For the purpose, maybe we can consider using a static field or an environment variable?

@@ -225,6 +227,20 @@ class Objects(using Context @constructorOnly):
case class Fun(code: Tree, thisV: ThisValue, klass: ClassSymbol, env: Env.Data) extends ValueElement:
def show(using Context) = "Fun(" + code.show + ", " + thisV.show + ", " + klass.show + ")"

/**
* Represents common base values like Int, String, etc.
* Assumption: all methods calls on such values should be pure (no side effects)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

be pure (no side effects)

It seems we can weaken the assumption to something as follows:

have no effect on initialization of global objects.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't trigger initialization of global objects or read/write mutable fields

@@ -92,11 +93,12 @@ class Objects(using Context @constructorOnly):
* | OfClass(class, vs[outer], ctor, args, env) // instance of a class
* | OfArray(object[owner], regions)
* | Fun(..., env) // value elements that can be contained in ValueSet
* | SafeValue // values on which method calls and fields won't cause warnings. Int, String, etc.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fields -> field accesses

case object Cold extends Value:
def show(using Context) = "Cold"
case object UnknownValue extends Value:
def show(using Context): String = "UnknownValue"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice documentation 👍

case Cold => Cold
case UnknownValue => UnknownValue
case Package(_) => a
case SafeValue(_) => a
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems we can do better for Package and SafeValue.

report.warning("Using unknown value. " + Trace.show, Trace.position)
Bottom
else
UnknownValue
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two kinds of unknowns: (1) due to widening; (2) missing source code.

Previously, we always report warning for (1), but suppress warnings for (2). Do we intend this behavior change or it's accidental due to the fact that we overload the meaning of UnknownValue for both cases?

// Assume such method is pure. Check return type, only try to analyze body if return type is not safe
val target = resolve(v.typeref.symbol.asClass, meth)
if !target.hasSource then
UnknownValue
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add a test to ensure we do not have warnings:

object A:
  val a = f(10)
  def f(x: Int) = x * 2 + 5


case Bottom =>
Bottom

// Bottom arguments mean unreachable call
case _ if args.map(_.value).contains(Bottom) =>
Bottom
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also use Bottom in the fixed-point computation. Theoretically, in the abstract domain, the bottom of the lattice can have a meaning specific to the analysis. In particular, it does not need to be the empty set of values as it's the case in type theory. It depends on the properties of programs the analysis wants to capture.

report.warning("Using unknown value. " + Trace.show, Trace.position)
Bottom
else
UnknownValue
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the comment on calling a method on an UnknownValue.

else if ref.hasVal(target) then
ref.valValue(target)
else if ref.isObjectRef && ref.klass.hasSource then
report.warning("Access uninitialized field " + field.show + ". " + Trace.show, Trace.position)
Bottom
else
// initialization error, reported by the initialization checker
Bottom
UnknownValue
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the two cases above, the usage of UnknownValue might lead to annoying error messages --- Bottom was used to suppress warnings.

if catchesAllOf(caseDef, tpe) then
remainingScrutinee = remainingScrutinee.remove(value)

caseResults.join
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dealing with pattern matches is a headache. The code above improves the situation.

Another possibility is to just print a warning and skip analyzing it. This is a thought after reading the rule 10 of Nasa's 10 rules: https://spinroot.com/gerard/pdf/P10.pdf

I think there is still a design spot where the analysis can give up for some complicated cases while remain useful and sound for most use cases. We just need to clearly define the boundary and have nice messages.

@EnzeXing EnzeXing force-pushed the fix-http4s branch 3 times, most recently from ce6ba43 to c38dcaf Compare February 26, 2025 04:36
@EnzeXing
Copy link
Contributor Author

@liufengyun @olhotak I think it's ready for a final review. Previously testing with Scala2CCLibrary failed, but it seems to use an older version of scala2 library source than Scala2Library, so it still gives some outdated warnings in scala2 library. Shall we suppress testing with Scala2CCLibrary for now?

@@ -1,9 +1,6 @@
## See #18882
patmat.scala
t9312.scala
unapplySeq-implicit-arg.scala
unapplySeq-implicit-arg2.scala
unapplySeq-implicit-arg3.scala
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Our goal is to move all tests shown here to warn-tasty since they rely on tasty, but they also trigger some warnings in scala2 library which is hard to document

@olhotak
Copy link
Contributor

olhotak commented Feb 26, 2025

@liufengyun @olhotak I think it's ready for a final review. Previously testing with Scala2CCLibrary failed, but it seems to use an older version of scala2 library source than Scala2Library, so it still gives some outdated warnings in scala2 library. Shall we suppress testing with Scala2CCLibrary for now?

I don't think we should test the initialization checker with the CC library at all. We should test the initialization checker with the Scala 2 Tasty (not CC) library.

@@ -18,6 +18,7 @@ import util.{ SourcePosition, NoSourcePosition }
import config.Printers.init as printer
import reporting.StoreReporter
import reporting.trace as log
import reporting.trace.force as forcelog
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove this before merging.

assert(SafeValue.safeTypeSymbols.contains(typeSymbol), "Invalid creation of SafeValue! Type = " + tpe)
def show(using Context): String = "SafeValue of " + typeSymbol.show
override def equals(that: Any): Boolean =
that.isInstanceOf[SafeValue] && that.asInstanceOf[SafeValue].typeSymbol == typeSymbol
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we every read tpe out of a SafeValue? Since the identity of a SafeValue is the typeSymbol, we should make SafeValue have a typeSymbol field as its initializer, and move this code that computes the typeSymbol from a Type out into a factory method.


val Bottom = ValueSet(ListSet.empty)

/** Possible types for 'this' */
type ThisValue = Ref | Cold.type
type ThisValue = Ref | TopWidenedValue.type
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is inconsistent with the big comment at the top.

case (_, Cold) => Cold
case (TopWidenedValue, _) => TopWidenedValue
case (_, TopWidenedValue) => TopWidenedValue
case (Package(_), _) => UnknownValue // should not happen
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it shouldn't happen, can we make it an assertion?

case (TopWidenedValue, _) => TopWidenedValue
case (_, TopWidenedValue) => TopWidenedValue
case (Package(_), _) => UnknownValue // should not happen
case (_, Package(_)) => UnknownValue
case (Bottom, b) => b
case (a, Bottom) => a
case (ValueSet(values1), ValueSet(values2)) => ValueSet(values1 ++ values2)
case (a : ValueElement, ValueSet(values)) => ValueSet(values + a)
case (ValueSet(values), b : ValueElement) => ValueSet(values + b)
case (a : ValueElement, b : ValueElement) => ValueSet(ListSet(a, b))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this ListSet and not just Set?

@@ -1022,10 +1154,10 @@ class Objects(using Context @constructorOnly):
case fun: Fun =>
given Env.Data = Env.ofByName(sym, fun.env)
eval(fun.code, fun.thisV, fun.klass)
case Cold =>
report.warning("Calling cold by-name alias. " + Trace.show, Trace.position)
case UnknownValue | TopWidenedValue =>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this warning also depend on reportUnknown?

Could we somehow factor out the multiple copies of the code that reports warnings in the case of UnknownValue or TopWidenedValue?

@@ -1036,7 +1168,7 @@ class Objects(using Context @constructorOnly):
report.warning("Calling cold by-name alias. " + Trace.show, Trace.position)
Bottom
else
Cold
UnknownValue
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this case ever happen? If not, it should be an assertion or internal error. If yes, add a comment to say when it can happen.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a tricky question. We discussed that resolveEnv should be refactored and should never fail, so the whole case None branch will become unreachable. Shall we put the refactor of resolveEnv in another PR?

@@ -1467,17 +1606,20 @@ class Objects(using Context @constructorOnly):
end if
end if
end if
scrutinee
// TODO: receiverType is the companion object type, not the class itself;
// cannot filter scritunee by this type
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

scrutinee

else if target.isStaticObject then
val res = ObjectRef(target.moduleClass.asClass)
if elideObjectAccess then res
else accessObject(target)
else
thisV match
case Bottom => Bottom
case Cold => Cold
case UnknownValue => UnknownValue
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can these ever happen? Can thisV be Bottom, UnknownValue, or TopWidenedValue?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Theoretically thisV should always have type Ref (or ThisValue), but call sites of resolveThis do not have such enforcement. Shall we also refactor resolveThis in another PR?

@@ -0,0 +1,3 @@
object O:
val a: Int = 5
val b = a.toDouble
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this test be able to work without Tasty?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've seen some cases where the call to toDouble is labeled as a Select tree, but this test case does not reproduce that

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants