Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP static dispatch for kwargs and structural fields #16580

Closed
wants to merge 4 commits into from
Closed

Conversation

carnaval
Copy link
Contributor

I tried a couple approaches for kwargs but this one I like.
It introduces a structural named-fields counterpart to tuple types (unimaginatively called Struct). You can make a Struct(x = 1.0, y = 2.0) and it has type Struct{(:x,:y),Tuple{Float,Float}} with fields x and y.

There is a couple functions to merge or diff them and it uses that as the basis for kwargs splatting and unpacking.

This PR is not ready yet and is up here mostly to discuss the implications. The problematic one are probably over-specialization and the use of staged functions in the implementation (as always).

Assuming we want this, here are a few points I'm wondering about :

  • Should we aim to merge those things with Tuple so that we have only one kind of structural type. We'd get dispatch-on-kwargs for "free" (== retrofit all code that touches tuple types ... hem). This is probably a long term question and keeping them separate is a decent first step in that direction anyway.
  • Should they be covariant ? I'm guessing yes.
  • Should they have vararg forms ? I'm guessing yes too, so you could match on "at least those fields with those types"
  • Should they have an optional-field form ? Like {x::Int, y?::Int} that has {x::Int,y::Int} and {x::Int} as subtypes. I tink this is necessary if we want to express our current kwarg semantics using only normal dispatch and avoid generating a lot of methods.
  • Should we always sort fields ? For now Struct types take ordering into account but the kwarg lowering/merging/diffing always keeps them sorted to avoid generating a specialization for every permutation.

Other than that, it currently seem to work when used as a replacement for our current kwarg lowering and we get specialization, inlining, inference etc.
Going forward with this we can probably remove a bit of the kwfunc business but for now I only did minimal integration.

I feel that the Struct type is interesting in itself anyway.

@JeffBezanson
Copy link
Member

Very interesting. Having a nicer approach to keyword args, with covariant records, definitely seems important for the language. (Prior art includes e.g. NamedTuples.jl)

@carnaval carnaval force-pushed the ob/stkw branch 2 times, most recently from 4a169cb to 2aec001 Compare May 28, 2016 13:30
@carnaval
Copy link
Contributor Author

carnaval commented Jun 6, 2016

This is passing tests.
It's technically breaking since the type of a splatted kwarg has changed from an array of tuples to a struct, but it was quite minor, at least in Base, as long as it supports the iteration protocol.
I think it's ready for an initial review. I'm just wondering what would be a good test case for realistic overspecialization issues. A package with heavy use of kwargs ?

@carnaval
Copy link
Contributor Author

carnaval commented Jun 6, 2016

As for all the typesystem issues, it'll probably need a healthy dose of discussion. In the meantime, if this approach is not too flawed it can probably just be used internally for keyword args.

@JeffBezanson JeffBezanson added this to the 0.6.0 milestone Jun 6, 2016
@nalimilan
Copy link
Member

How does this compare with ImmutableDict which is used by IOContext? I noticed @vtjnash avoided keyword arguments when calling IOContext (passing varargs of pairs instead), presumably to a avoid the overhead of collecting them before converting them to an ImmutableDict. Does that mean a Struct could directly be used there?

@carnaval
Copy link
Contributor Author

carnaval commented Jun 6, 2016

The main difference is that the ImmutableDict is not specialized on the set of keys it holds. I'm not sure but I'd assume that it's a little overkill for something like the printing system where the lookup of the printing options is probably negligible compared to the stuff you're doing there.

@nalimilan
Copy link
Member

The main difference is that the ImmutableDict is not specialized on the set of keys it holds. I'm not sure but I'd assume that it's a little overkill for something like the printing system where the lookup of the printing options is probably negligible compared to the stuff you're doing there.

Yeah, but if all keyword arguments are collected that way, there would be little point in opting out of that optimization, at the cost of a slightly non-standard syntax.

@vtjnash
Copy link
Member

vtjnash commented Jun 6, 2016

You could, but IOContext and Struct should have roughly the same runtime performance. But IOContext also should have no compile-time or dispatch costs since it doesn't require any code specialization to achieve that performance, whereas those costs for Structs could be quite high.

I agree this is very interesting, although it has some minor implementation items to fix (methods in Core shouldn't be extended, meaning this should be a TopModule feature, not a CoreModule). I would also like to see some examples where this hits worst-case behaviors and what metrics we can apply to limit those. Maybe that would be something like addprocs, where there are a lot of options that get combined with other options and forwarded to several different consumers.

@carnaval
Copy link
Contributor Author

carnaval commented Jun 6, 2016

You could, but IOContext and Struct should have roughly the same runtime performance. But IOContext also should have no compile-time or dispatch costs since it doesn't require any code specialization to achieve that performance, whereas those costs for Structs could be quite high.

I think the trade-off is more complex than that since for IOContext you don't get type information out of the dict right ? I agree that in a lot of cases the compilation overhead will trump whatever you're winning from that (and I think it's definitely the case for the IO stuff).
However in some places you'd want to use kw args out of convenience without loosing the properties you get from positional arg (ie inference and static dispatch).

I'm also quite curious on what a torture test will look like for that, although to be fair we already have similar problems with large tuple arg splatting in a couple places. I'll try to have a look this week (but anyway we can discuss that in person too since I'm landing @ stata friday afternoon :-) )

@JeffBezanson
Copy link
Member

JeffBezanson commented Jun 6, 2016

Worth pointing out that in the current scheme, kwargs are converted to positional args, after which we still specialize on the types of all of them. Eventually we might want to do stuff like using the kwarg type partially, to do static dispatch but not specialize on every case (static dispatch to partially-specialized code).

It will be much more work, but it would be great to eventually eliminate the separate "kwfunc" by putting tuple and struct signatures in the same method table.

I'm landing @ stata friday afternoon

Yay!

@ChrisRackauckas
Copy link
Member

ChrisRackauckas commented Sep 16, 2016

How is this going? I'm really interested in the improvements here, not just to be able to dispatch on kwargs, but the Struct() itself will be very useful. I think Structs can solve my problem of creating "anonymous types" for holding "named parameters" in functions for differential equations (mauro3/Parameters.jl#17 and SciML/Roadmap#2).

@kshyatt kshyatt added the types and dispatch Types, subtyping and method dispatch label Sep 21, 2016
@ChrisRackauckas
Copy link
Member

Is this still on track for v0.6?

I'm just wondering what would be a good test case for realistic overspecialization issues. A package with heavy use of kwargs ?

Plots.jl and DifferentialEquations.jl make heavy use of keyword arguments and I would be happy to do some tests. Gadfly is another case too. Optim.jl used to, but it did a re-design with an options type to get around this problem.

@JeffBezanson
Copy link
Member

JeffBezanson commented Dec 15, 2016

I had a slightly crazy idea along these lines the other day. We could add the ability to annotate elements of a Tuple type with names, like so:

Tuple{Int, Named{:a, Int}, Named{:b, String}}

syntax for an instance: (1, a=2, b="")

possible syntax for the type: {Int, a=Int, b=String} or {Int, a::Int, b::String}

A tuple with this type would behave exactly like a Tuple{Int,Int,String}, except you could also do x.a and x.b to get the second and third elements. The Named pseudo-type is troubling, and I'd rather avoid repeating the mistake of Vararg, but this type has some interesting properties:

  1. It's largely compatible with tuples and the changes for it would be pretty minimal.
  2. It provides all the features of NamedTuples.
  3. It allows both positional and named entries, potentially describing a full argument list including both positional and keyword arguments.

Unfortunately this doesn't seem to serve up a solution for keyword arguments on a silver platter; subsetting keyword args is still a problem, and the fact that this type is inherently ordered doesn't seem to help either. But it could be an interesting design to consider.

@davidanthoff
Copy link
Contributor

I really like the idea that both "normal" tuples and named tuples are the same type.

I still have the problem that I would very much want the named tuple syntax to infer names of members, if possible. E.g. if () was the syntax for named tuples, it would be great if (x.a, x.b) was equivalent to (a=x.a, b=x.b). But I don't see how that could be combined with the current syntax for "normal" (unnamed) tuple construction... So I think I would still favor {} as the syntax for named tuples, or rather the kind of types that @JeffBezanson describes in the previous comment.

Btw., have you seen the object spread feature in TypeScript 2.1? I think that would be another fantastic syntax to have with named tuples.

@andyferris
Copy link
Member

andyferris commented Jan 8, 2017

@JeffBezanson would there be any hope in allowing Named{} in user-defined structs? E.g. a way of injecting field names into a base type (for example, a strongly-typed table/dataframe and its rows?)

E.g. here's something that would be quite useful (combined with an invented use of Vararg in type definitions):

immutable Table{Cols <: Vararg{Named}} # Maybe `Cols...` would be short for this?
    Cols...
end

So long as one could iterate over the fields in some way (e.g. in a recursive, "lispy-like" way, like we do for tuple), then we would have a useful TypedTable-like structure without the generated function nonsense.


{i.LastName, i.FirstName} for i in source

vs
{LastName=i.LastName, FirstName=i.FirstName} for i in source

For the language, I kind-of prefer the latter (but with ()) (though I see why you would want the former in any SQL-like language).

@JeffBezanson
Copy link
Member

Looking this over again, it looks pretty good and I'll probably try to revive this for 1.0. The needed changes are:

  • Rebase (or more likely just rewrite)
  • Rename Struct to NamedTuple. I expect this to fully subsume that package, so this should be ok. Other name suggestions welcome.
  • Make them covariant.
  • Remove the sorting, and preserve order.

@davidanthoff
Copy link
Contributor

Very cool!

Is the plan to implement #1974 and then introduce a type like

struct NamedTuple{T,N}
    values::T
end

where N would be some type that holds the names, and then use the dot overloading machinery to implement field access, or are you planning to make named tuples some special type in some other way?

I also assume that as a first step this would just introduce the type, and any syntactic sugar for constructing named tuples would come later?

@JeffBezanson
Copy link
Member

This would have to include syntactic sugar, naturally! :) Actually I think it's pretty important to add the (a=1, b=2) syntax, since that makes it easier to change the implementation.

JeffBezanson added a commit that referenced this pull request Jun 1, 2017
@JeffBezanson JeffBezanson mentioned this pull request Jun 2, 2017
JeffBezanson added a commit that referenced this pull request Jun 4, 2017
JeffBezanson added a commit that referenced this pull request Jun 8, 2017
JeffBezanson added a commit that referenced this pull request Jun 12, 2017
JeffBezanson added a commit that referenced this pull request Jun 20, 2017
@StefanKarpinski
Copy link
Member

#22194 supercedes this, so we don't need to keep this open or on the 1.0 milestone anymore.

@StefanKarpinski StefanKarpinski removed this from the 1.0 milestone Jul 20, 2017
JeffBezanson added a commit that referenced this pull request Oct 17, 2017
Based on #16580, also much work done by quinnj.

`(a=1, ...)` syntax is implemented, and `(; ...)` syntax is
implemented but not yet enabled.
JeffBezanson added a commit that referenced this pull request Oct 25, 2017
Based on #16580, also much work done by quinnj.

`(a=1, ...)` syntax is implemented, and `(; ...)` syntax is
implemented but not yet enabled.
JeffBezanson added a commit that referenced this pull request Oct 25, 2017
Based on #16580, also much work done by quinnj.

`(a=1, ...)` syntax is implemented, and `(; ...)` syntax is
implemented but not yet enabled.
JeffBezanson added a commit that referenced this pull request Oct 25, 2017
Based on #16580, also much work done by quinnj.

`(a=1, ...)` syntax is implemented, and `(; ...)` syntax is
implemented but not yet enabled.
JeffBezanson added a commit that referenced this pull request Oct 26, 2017
Based on #16580, also much work done by quinnj.

`(a=1, ...)` syntax is implemented, and `(; ...)` syntax is
implemented but not yet enabled.
JeffBezanson added a commit that referenced this pull request Oct 26, 2017
Based on #16580, also much work done by quinnj.

`(a=1, ...)` syntax is implemented, and `(; ...)` syntax is
implemented but not yet enabled.
JeffBezanson added a commit that referenced this pull request Oct 26, 2017
Based on #16580, also much work done by quinnj.

`(a=1, ...)` syntax is implemented, and `(; ...)` syntax is
implemented but not yet enabled.
JeffBezanson added a commit that referenced this pull request Oct 26, 2017
Based on #16580, also much work done by quinnj.

`(a=1, ...)` syntax is implemented, and `(; ...)` syntax is
implemented but not yet enabled.
JeffBezanson added a commit that referenced this pull request Nov 1, 2017
Based on #16580, also much work done by quinnj.

`(a=1, ...)` syntax is implemented, and `(; ...)` syntax is
implemented but not yet enabled.
JeffBezanson added a commit that referenced this pull request Nov 2, 2017
Based on #16580, also much work done by quinnj.

`(a=1, ...)` syntax is implemented, and `(; ...)` syntax is
implemented but not yet enabled.
@DilumAluthge DilumAluthge deleted the ob/stkw branch March 25, 2021 22:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
types and dispatch Types, subtyping and method dispatch
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants