Subtype checking of type lambdas with bounds is too restrictive #17919
Replies: 30 comments
-
Can you extend this to the case where the upper and lower bounds of L have different kinds? e.g.:

trait A {
  type L >: [Y <: Any] => Any <: [Y <: String] => Any
}
-
Yes, you can. But the two bounds need to have at least the same simple kind, i.e. the same number of arguments. It doesn't work if one bound is, say, In your example, the common kind could be

[Y <: Any] => Any :: (Y <: String) -> *
[Y <: String] => Any :: (Y <: String) -> *

But that's not the only option, e.g. Whether or not there's always a common kind depends on how liberal you allow bounds to be. For example, the type lambdas Does that answer your question?
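For illustration, here is a sketch of a refinement that would be legal at that common kind. The trait name AImpl is hypothetical (not from the discussion), and it uses the thread's [Y <: B] => T lambda notation. Since, at kind (Y <: String) -> *, both bounds map every Y <: String to Any, the only admissible definition is one that does the same:

```scala
trait AImpl extends A {
  // At the common kind (Y <: String) -> *, the lower bound
  // [Y <: Any] => Any and the upper bound [Y <: String] => Any
  // both send Y to Any, so L must equal Any on all Y <: String.
  type L = [Y <: String] => Any
}
```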
-
Yes, makes sense.
-
But in A, L takes
This comment has been hidden.
This comment has been hidden.
-
Wait, no: my example is wrong and does type-check. Never mind, it's too late for me to think about this properly :). But maybe you can tell me if the bodies of the lambdas are always guaranteed to typecheck with the widened kind.
-
Here's a better example of what I have in mind. Given:

trait A {
  type L >: [X] => Nothing <: [X] => Any
  def foo: L[Int]
}

I think your rules allow me to write:

trait B extends A {
  type L = [X <: String] => Any
}

But then the type of
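A sketch of where this presumably goes wrong (my reading of the truncated sentence above, using the same declarations):

```scala
trait B extends A {
  type L = [X <: String] => Any
  // B inherits `def foo: L[Int]` from A, but with this definition of L
  // the application L[Int] falls outside L's declared domain
  // (Int is not a subtype of String), so the inherited signature
  // is no longer well-kinded.
}
```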
-
That can't happen: a wider lambda-kind gives narrower bounds, which makes more things typecheck (by Liskov substitution principle). So your example would be a problem, but the rules should allow contravariant bound refinement (just like for methods). So, IIUC, this issue is about allowing the "reverse" of your example, as follows:
In Sandro's example, kind To be clear,
-
I don't get it. The original post says:

G, X >: (L1 | L2) <: (U1 & U2) |- T1 <: T2
-----------------------------------------------------
G |- [X >: L1 <: U1] => T1 <: [X >: L2 <: U2] => T2

From this I get
-
No, it's not a typo. But I had the same reaction as @Blaisorblade when copy-pasting the code from the old issue. What follows really only addresses the error reported by the compiler about incompatibility with At first I thought, once the error w.r.t. In fact, I think this is related to @smarter's comment:
This is very interesting! You may just have found a counterexample to the second approach. You're right, the bounds check out, but what is happening is that they only do so in the wider kind that In Now in So how come solution 2 did not show this? Well, because it automatically finds the narrowest kind that fits both lambdas being compared rather than taking an expected kind as an input. Here, this kind is I have no clue how this issue with solution 2 could be fixed without passing an expected kind. Any ideas?
-
After typechecking, we'll check that the type definition is within bounds of the definitions in each mixin, one by one.
Nope, I was hoping you'd have some magic solution :)
-
Going back to solution 1, I'm not sure how being kind-directed would work in practice. How do you decide which kind to use in the first place?
-
OK, so there is no linearization order of something? If these checks should be driven by some expected kind from the parent traits, then a common one would have to be found first. For example, one could pick the first one (here What would you do if
One possibility I see is to have a flag in the subtype checker to indicate whether you want to check from above or below. When checking
Well, it depends on where you are performing the subtype check, and it may not be obvious in all situations. But a good rule of thumb would be that you do something similar to what you do when you're using expected types (e.g. for checking definitions). For example, in the above case, you somehow manage to figure out the right type to use when you compare refinements of method signatures. If you have traits

trait A { def foo(x: Int): Any }
trait B { def foo(x: String): Any }
trait C extends A with B { def foo(x: Nothing): Any }

this somehow works out, so you have to find a common "expected type" for
-
Whoops, this does not do what I thought it would. I thought that
-
In the body of
I hope we can formalize this somehow :).
You can, in an override, refine the result type of a method covariantly, but you can't refine the parameter types contravariantly; that results in overloads indeed.
-
So do I :-) I have an analogy from mathematics that might help clarify the situation a bit and that also suggests a possible way forward. (It's based on a similar analogy I mention in #6369.)

Imagine we want to prove some abstract theorem about real numbers and real-valued functions. The theorem doesn't hold for all reals though, only for those that satisfy a set of inequalities. So in our theorem we might say things like "assume a positive real number x ..." which is just another way of saying "assume a real number x such that 0 ≤ x". When we define a trait with some bounded abstract type members, we are doing the same thing, but for types. The declaration type X >: Int just says "assume a type

Going back to our math analogy, assume we want to state a theorem about some abstract real-valued function f: ℝ → ℝ which is bounded by some other functions, e.g. g(x) = 0 and h(x) = 2x for every x ∈ ℝ. This is analogous to declaring an abstract type operator with a lower and upper bound, e.g.

type F[X] >: G[X] <: H[X]

So what about bounds on the arguments? Assume our theorem is about an abstract function f: ℝ → ℝ that is lower-bounded by g(x) = log(x). We know that the function g is not defined for all reals, only for the strictly positive ones. In other words, the argument of g is itself lower-bounded! If we were to use some sort of pseudo-Scala syntax, we could write its definition as g(x > 0) = log(x). With that in mind, what does it mean when we ask that f be lower-bounded by g? It means that f(x) ≥ g(x) for all positive reals. When we compare f and g it simply does not make sense to ask that f(-1) ≥ g(-1) because g is ill-defined at -1, even when f itself is defined for all the reals. So by saying that f ≥ g for this particular choice of g, we're implicitly constraining the domain where we can say meaningful things about f. Similarly, when we write a declaration like

type F >: [X <: String] => ...

we restrict the domain where we can say meaningful things about With that in mind, let's revisit your earlier example:

trait A {
  type L >: [X] => Nothing <: [X] => Any
  def foo: L[Int]
}
trait B extends A {
  type L = [X <: String] => Any
}

This is a bit like stating a theorem So how to solve this? I see two obvious approaches:

and enforce that kind, irrespective of how it came about, in refinements. The problem with (1) is that Scala doesn't have a kind language; the problem with (2) is that it will likely break a bunch of code where a type definition uses a bound, but its declaration in a super-class/-trait does not (like in the above example).
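In symbols, the lower-bound condition from the analogy only quantifies over the domain of g:

```latex
% f is lower-bounded by g only where g is defined:
\forall x \in \mathbb{R}.\; x > 0 \implies f(x) \geq g(x),
\qquad \text{where } g(x) = \log(x)
```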
-
This is a really useful analogy, thanks!
Right, so there's a trade-off here. If we were to go with (2), would the original example involving mixins from #6320 (comment) still work?
-
Good question. I think it should. By analogy, if Lemma A holds for functions on positive reals, and Lemma B holds for functions on odd numbers, then the conjunction (i.e. the intersection A&B) should hold for any function defined on positive reals and odd numbers. In particular, it should hold for functions on arbitrary reals as well. What's less obvious to me is whether the kind of
-
@sstucki On your last Q, I'd still go with (2). Except that, it seems to me, inferring the least interval kind means we can do away with expected kinds, and we should just check that the

Notation: following that Gist, I'll sometimes write interval kinds

Proposal

Translating the surface syntax from HK-DOT to Scala, we get the following variant of the OP (1), which combines HK-DOT (K-..-<::-..) and (K-→-<::-→):

where Do we have an example where this rule differs from what you propose? I don't have a mathematical metaphor, and I can't summarize yours, but here's the naive set-theoretic semantics I have in mind: for each arity, kinds map to sets of "types", and subkinding to a subset relationship.

Answers
I think it should be the more precise kind
[...]
Honestly, it seems to me that if you infer the least kind and compare that, it's enough to compare the bounds directly. Except that "compare the bounds" also includes the comparison of type argument bounds, which you move to the comparison of kinds.

Evaluation of the proposed rule:
EDITs: formatting, fixing typos.
-
@sstucki Comparing the rule I just proposed
under some (I think reasonable) assumptions.
-
@Blaisorblade, I'll start by summarizing your proposal as I understand it – correct me if I missed anything:
TLDR: I agree with (1) but not with (2) and (3). Explanations and counterexamples follow.

There's not much to be said about (1), that is essentially what I advocate. It is consistent with the subset of Scala that I have studied and proven safe.

The problem with (2) is the supposed equivalence between kinds and type intervals. It's very natural and very appealing: it suggests that we don't need to introduce a separate kind language because we can just keep track of lower and upper bounds, which are types. It also seems consistent with the trick Pierce uses in TAPL, where kinds are encoded through their extremal types (kind

The problem is that this only works if you keep track of both ends of the interval, the lower and the upper bound, at all times, and enforce their consistency. If you don't do that, the bound annotations in lambdas may get out of sync, and then you suddenly don't know which one you should enforce. In particular, when you compare two function intervals (intervals where both bounds are functions), there are a total of four pairs of bound annotations involved, and picking the right one will depend on whether you're comparing the upper or lower bounds of the two intervals.

This is also where (3) breaks down, because it is not the case that bound annotations in lambdas should always be compared contravariantly. Let me illustrate this once again with the maths analogy: let's say we have two abstract functions, each bounded by some concrete functions:
Note that there are no signatures, we're only characterizing the functions via their bounds. Now suppose we want to instantiate these two functions to
Note the change of bounds annotations (i.e. domain of definition). Which of these instantiations are OK? Intuitively, it's clear that (1) is problematic: we cannot use a function defined only on the positive reals where one defined on all reals is expected. The converse is possible though, so (2) is fine. But what does our bounds-only algorithm say? To check that, we have to compare the bounds of all the functions, including their bound annotations. We have to
where I have omitted upper bound annotations for readability. You can already see the problem: in each case, we need to compare lambdas with wider bound annotations to those with narrower bound annotations and vice versa. If we simply always pick the annotation from the LHS or RHS lambdas (i.e. check lambdas covariantly or contravariantly), then at least one check in (1) and (2), respectively, will always fail, and both instantiations are rejected. If you are sceptical of this type of analogy, here's a Scala example that exhibits the same issue:

trait B {
  type F[X <: Int] >: Int <: Any
  type G[X] >: Int <: Any
  type H1[X >: Int] >: Int <: Any
  type H2[X >: Int] >: Int <: Any
}
trait C extends B {
  type F[X] = Any          // should be OK
  type G[X <: Int] = Any   // should fail
  type H1[X >: Int] = X    // should be OK
  type H2[X] = X           // should be OK
}

Scala 2.12 seems to get So why does the kind-based approach (1) not suffer from this issue? In the kind language, there is only one domain kind (and therefore only one bounds annotation) for function kinds, and hence bound annotations cannot get "out of sync". The kind-based approach does all the same bound checks as the interval-based one you suggest, but there's no confusion about which bound annotations to use when doing so.
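To see why narrowing G's domain must be rejected while widening F's is fine, here is a sketch of a hypothetical client (the trait UseG is mine, not from the thread), written against B's declared kind of G:

```scala
// Per B's declaration `type G[X] >: Int <: Any`, G accepts any argument:
trait UseG { self: B =>
  type T = G[String] // well-kinded against B's declaration
}
// If C's definition `type G[X <: Int] = Any` were accepted, then viewing
// a C through UseG would make T = G[String] ill-kinded (String is not a
// subtype of Int). No such client can be broken by C's definition of F,
// which *widens* the domain from X <: Int to unbounded X.
```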
-
Indeed, some time ago Martin mentioned that this is not done and should ideally be changed, but it's somewhat invasive. And your examples of regressions from 2.12 probably raise the priority of this issue a bit.
Does (3) also break down if you keep the bounds in sync? It seems not...
-
@sstucki Here's an updated tentative proposal — as concrete as possible. Rationale coming separately.

Mismatched bounds in Scala 3 and Dotty, vs HK-DOT

Scala 2 forbids bound mismatch in upper and lower bounds: you can write

object Foo1 { type T[X] >: Int <: Any }

but not

object Foo2 { type T >: [X] => Int <: [X <: Int] => Any }

Scala 3 supports

but As

Proposal (fixed)

Based on Gist HK-DOT, so intervals can only be formed at kind *, which prevents applying

Simplest change

Concretely, the fix that Martin hinted at would represent internally That fix would be invasive, but would not require using an expected kind in the subtype checker, or the addition of a subkind checker. I am not sure which change is less invasive.
-
@sstucki It seems that's what Gist HK-DOT (tries to) do, and IIUC succeeds at?
I see where that goes wrong: that only works at "kind" *, not at higher kinds.
IIUC, to check that the "kind" of
I'm happy to use a set-theoretic semantics to justify these rules, but I struggle to compile back and forth from Scala to these examples, and to understand where exactly things go wrong.

Explanation of the fixed proposal

Last, let me explain the fixed proposal in your words.
Yes. The rule I wrote is for subtype checking, but here in fact we need a rule for subkind checking, that is, (K-..→-<::-..→').
This is rule
We should not use
-
@Blaisorblade the notation that you are using in (K-..→-<::-..→') is confusing to me. This is a kinding rule, right? But the notation looks a lot like you're comparing type lambdas. But I guess these are dependent arrow kinds? Maybe we shouldn't overload
Yes, (3) breaks down even when you keep the bounds in sync. It's important not to confuse two related but distinct issues here:
The first point is a job for kind checking (or rather, subkind checking), the second point is a question of subtyping. Unfortunately, these seem to be heavily entangled in Dotty at the moment, but they are really not the same thing. The confusion stems from the fact that kinds may be thought of as intervals and subkinding as checking that the bounds of intervals are compatible. I have already explained how this intuition can break down if the kinds of the two ends of an interval are not kept in sync. However, even if subtyping of lambdas is currently used to perform subkind checking, that does not mean fixing the latter somehow fixes the former. It does not help that my OP uses subkind checking (1) to trigger a bug in subtype checking for lambdas (2). But again, fixing (1) will not fix (2). Obviously both should be fixed.

That being said, it's really hard to come up with actual Scala snippets that trigger subtype checking of lambdas but not some form of kind/subkind checking. I'll have to think a bit more about that. For now, I'll make arguments based on the theory that I'm familiar with.

OK, so back to subtyping lambdas with different bound annotations. You may wonder, why not enforce equal bounds when subtyping lambdas? But that is too restrictive: if you do this, subsumption (for types) in combination with eta/beta-conversion doesn't work as you'd expect. You really want to be able to relate

Having well-behaved intervals also doesn't change the fact that a single subtyping rule for lambdas that always picks either the LHS or RHS bounds is too restrictive. E.g. in the above example

The only way I see, for comparing lambdas consistently, is to somehow take into account the common bounds of the lambdas as well, something like
where
which, IMO, just looks like a kinded subtyping judgment again. So, it's unclear to me what you'd gain by this. My impression is that this whole business of mixing up kinding and subtyping looks like a nice shortcut initially, but it is not really simplifying things. If anything, it's actually just making things more complicated in the end.
I'm not sure why you say the rule should be discarded. This part of the example (instantiation of g) should be accepted. It's the instantiation of f that is problematic. Rule (K-..→-<::-..→') should not be discarded.
The problem with the set-theoretic interpretation for kinds and types is that it's easy to confuse typing and kinding, subtyping and subkinding, because both types and kinds are – in some sense – sets of values. If we use numbers and sets of numbers instead, it's clear that they are very different objects, and there is no confusion between the order on the reals and the subset order. The fact that it's hard to translate back and forth between them is intended, to some degree, because it forces us to be careful about the assumptions and intuitions that hold for the subset order but not necessarily for other orders (in this case, subtyping).
-
Current summary

As @sstucki highlighted, we have two issues:

Since the discussion here is intertwined, we'll keep this one thread, but eventually this discussion should produce distinct issues for the two parts.

Comments

Yes,

Sure,

Subtyping of lambdas vs kind-checking

HK-DOT allows different bounds in subtyping of lambdas, thanks to (T-→-<:-→). I see you don't like that, but I don't yet see crystal-clear arguments.

Given

The subtlest part: subtyping of lambdas per se

I didn't get your argument here, but it's clearer in your thesis, and I am thinking about it.
-
So, I have found the one case where one would use
"Γ ⊢ K0 promotes to ∀ X :: K1. K2" means the same as in TAPL: find the least superkind of And in your thesis, that seems the only case where the expected kind K1 is not inferrable from either T1 or T2.

Concretely

In Scala, when checking that
-
@Blaisorblade, I'm not sure I fully understand your reasoning, but the idea of using type application to trigger subtype checking makes sense to me. Here's a variant (?) of your code that should trigger the same subtype checks as the OP as well as the example involving beta-eta-equality I mentioned earlier. object LambdaTests {
/**
* This abstract type function takes arguments that are themselves
* type functions with their first arguments bounded by `Int`. In
* other words, it "fixes" the expected kind of its arguments to be
* `(X <: Int) -> *`. Any subtype check of the form
*
* FixA[T1] <: FixA[T2]
*
* should therefore trigger
*
* 1. a kind check T :: (X <: Int) -> * ,
* 2. a subtype check T1 <: T2 at kind (X <: Int) -> * .
*
*/
type FixA[+X[Y <: Int]]
// Similar fixtures for (X <: String) -> * and (X <: Int | String) -> *
type FixB[+X[Y <: String]]
type FixAB[+X[Y <: Int | String]]
// Lower bounds of type declarations from OP
type LA[Y <: Int] = Int
type LB[Y <: String] = String
type LAB[Y <: Int | String] = Int | String // lower bound of L in intersection A & B
type LC[Y <: Any] = Any
// Duplicate definitions of the above for sanity checks.
type LA2[Y <: Int] = Int
type LB2[Y <: String] = String
type LAB2[Y <: Int | String] = Int | String
type LC2[Y <: Any] = Any
/***** TRUE POSITIVES: should pass and do pass *****/
// Sanity checks: types with the same definitions should be equal (all work).
def tARefl(x: FixA[LA2]): FixA[LA] = x
def tBRefl(x: FixB[LB2]): FixB[LB] = x
def tAABRefl(x: FixAB[LA2]): FixAB[LA] = x
def tBABRefl(x: FixAB[LB2]): FixAB[LB] = x
def tACRefl(x: FixA[LC2]): FixA[LC] = x
def tBCRefl(x: FixB[LC2]): FixB[LC] = x
def tABCRefl(x: FixAB[LC2]): FixAB[LC] = x
// Variations of test AL below
def tAL2(x: FixA[[Y <: Int] => Int]): FixA[[Y <: Int] => Int] = x
def tAL3(x: FixA[[Y <: Int] => Int]): FixA[[Y <: Int] => Any] = x
def tAL4(x: FixA[[Y <: Any] => Int]): FixA[[Y <: Int] => Int] = x
def tAL5(x: FixA[[Y <: Any] => Int]): FixA[[Y <: Any] => Int] = x
/***** FALSE NEGATIVES: should pass but don't *****/
// Subtype checks between lower bounds from the OP
def tA(x: FixA[LC]): FixA[LA] = x
def tB(x: FixB[LC]): FixB[LB] = x
def tAB1(x: FixAB[LC]): FixAB[LAB] = x
def tAB2(x: FixAB[LC]): FixAB[LA | LB] = x
// Variant of subtype checks from the OP using lambdas
def tAL(x: FixA[[Y] => Any]): FixA[[Y <: Int] => Int] = x
def tBL(x: FixB[[Y] => Any]): FixB[[Y <: String] => String] = x
def tABL(x: FixAB[[Y] => Any]): FixAB[[Y <: Int | String] => Int | String] = x
// Variations of test AL with contravariant bounds
def tAL6(x: FixA[[Y <: Int] => Int]): FixA[[Y] => Int] = x
def tAL7(x: FixA[[Y <: Int] => Int]): FixA[[Y] => Any] = x
/***** TRUE NEGATIVES: should not pass and don't *****/
// Subtype checks with incompatible fixtures
def tFAB(x: FixA[LB]): FixA[LB] = x
def tFBA(x: FixB[LA]): FixB[LA] = x
def tFABL(x: FixA[[Y <: String] => String]): FixA[[Y <: String] => String] = x
def tFBAL(x: FixB[[Y <: Int] => Int]): FixB[[Y <: Int] => Int] = x
} |
-
Interesting, if I may bring variance into the discussion (like things weren't complicated enough already!), this restriction seems to be why we need adaptHkVariances: https://github.com/lampepfl/dotty/blob/3ef991ddbe7db95de2863357f3cc1a34ca95fb6d/compiler/src/dotty/tools/dotc/core/TypeApplications.scala#L292-L323
-
Interesting! Though it doesn't seem to help when subtyping applications:

trait EtaExpandVariance {
  type FixVar[F[X]] // Expect operator with invariant argument

  type ListInv[ X] = List[X]
  type ListCov[+X] = List[X]

  // Both of these should work and do work.
  val li: FixVar[ListInv] // type checks
  val lc: FixVar[ListCov] // type checks

  // These should all work -- eta-expansion should not matter!
  val lc1: FixVar[ListCov] = li // fails
  val li1: FixVar[ListInv] = lc // fails
  val lc2: FixVar[[X] => ListCov[X]] = li // type checks

  // Uses X in an invariant position. Should probably not kind check...
  val li2: FixVar[[+X] => ListInv[X]] = lc // type checks
}

For completeness: an analogous example with bounds instead of variances.

trait EtaExpandBounds {
  type FixBnd[F[X <: Int]]

  type ListBounded[X <: Int] = List[X]
  type ListUnbounded[X] = List[X]

  // Both of these should work and do work.
  val lb: FixBnd[ListBounded] // type checks
  val lu: FixBnd[ListUnbounded] // type checks

  // These should all work -- eta-expansion should not matter!
  val lu1: FixBnd[ListUnbounded] = lb // fails
  val lb1: FixBnd[ListBounded] = lu // fails
  val lu2: FixBnd[[X <: Int] => ListUnbounded[X]] = lb // type checks

  // Uses unbounded X where a bound is expected. Should not kind check.
  val lb2: FixBnd[[X] => ListBounded[X]] = lu // fails
}
-
This issue was previously reported in a comment to #1252. I'm re-reporting this as requested by @smarter. (The original issue was fixed, but the issue reported in the comment was not).
Given the code
The compiler reports
The problem is that, when the compiler checks that the bounds of L declared in C conform to those declared in B, it compares the bound annotations of the type-lambdas that form those bounds. But when checking whether two lambdas are subtypes, their bounds need not be compared, as long as both lambdas have a common kind.

Concretely, here we want all the lambdas to have the (wider) kind

(Y <: String) -> *

which is the declared kind of L in the parent trait B:

so all the bounds are well-kinded and their kinds conform to the signature in the parent class. Then we need to check that the definitions of the lambdas agree in this kind, which they do:
With @smarter and @Blaisorblade, we have discussed two possible fixes:
Make subtype checking kind driven: when comparing two lambdas, use the expected bounds from the kind when comparing the bodies. This can be summed up in the following kinded subtyping rule.
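The rule itself appears to have been lost in the export. A reconstruction consistent with the surrounding description, where the expected kind (X >: L <: U) -> K supplies the bounds under which the bodies are compared (this is my reading, not a verbatim copy of the original):

```
              G, X >: L <: U  |-  T1 <: T2
---------------------------------------------------------------------------
G |- [X >: L1 <: U1] => T1  <:  [X >: L2 <: U2] => T2  ::  (X >: L <: U) -> K
```

with the side condition that both lambdas kind-check against the expected kind (X >: L <: U) -> K.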
The advantage of this solution is that it's known to be sound (it has been verified at least for a subset of the type system in my thesis). The drawback is that subtype checking would have to be rewritten to take an additional input: the expected kind of the types being compared. That would be a major change.
Find a compatible, common bound by computing the intersection/union of the bounds in the individual lambdas. As a subtyping rule, this would look as follows.
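The rule image is missing here, but it is quoted verbatim in a later comment in this thread, so it can be restored:

```
G, X >: (L1 | L2) <: (U1 & U2) |- T1 <: T2
-----------------------------------------------------
G |- [X >: L1 <: U1] => T1 <: [X >: L2 <: U2] => T2
```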
The advantage of this solution is that subtype checking does not have to be redesigned completely, while the drawback is that it requires computing the intersections/unions of higher-kinded types. That can probably be done point-wise (i.e. by pushing intersections/unions into lambdas) but I have not worked out the details nor have I got any sort of soundness proof.