Enso offers one keyword for defining new types:
type Foo x y z
This creates a type Foo with three type arguments. For example:
main = 42:(Foo Float Integer Text)
fails with Type error: expected expression to be Foo, but got Integer, as 42 isn't a value of type Foo.
One can add any number of atom constructors to such a type. For example, the following code snippet
type Foo x y z
    Zero
    One u:x
    Two u:x v:y
    Three u:x v:y w:z
adds four atom constructors named Foo.Zero, Foo.One, etc., with fields named u, v and w initialized to the provided arguments when these constructors are used. The type Foo automatically offers getters for the fields u, v and w. However, some of these getters may yield an undefined-value error when a constructor other than Foo.Three (which assigns all the field values) was used. With the updated Foo type definition, the following two versions of the main function type-check fine:
main = Foo.Zero:(Foo Float Integer Text) # yields Zero
main = (Foo.One 3.14):(Foo Float Integer Text) # yields (One 3.14)
Arguments to atom constructors can have defaults and explicit type signatures.
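For illustration, a minimal sketch, assuming defaults combine with signatures as shown (the Point type and its fields are hypothetical):
type Point
    Value (x : Float = 0.0) (y : Float = 0.0)

main = Point.Value y=1.0 # x falls back to its default, yielding (Value 0.0 1.0)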
Warning
The following seems to be off a bit:
Due to two things that feel like 'primes' popping up in the conversation below, we've chosen to denote one with 'x.type' and the other with 'Code x'.
However, there are some limitations on type signatures that will be discussed later. There is a distinction between a type and its atoms. Types themselves are effectively represented as atoms, and type constructors are atom constructors as well.
List plays multiple roles here. It acts as a higher-kinded type constructor which can be fed type arguments to build a fully saturated type:
Int : Type
List : Type -> Type
List Int : Type
Given a type declaration, we desugar it into atoms and atom constructors and define subtyping relationships in a two-step process.
First, we gather all of the type variables used in its definition and build a new type with those type variables expressed in sequence. Then if certain conditions are met (e.g. the two have the exact same shape), we allow punning between the basic type constructor and this elaborated type constructor.
e.g.
type Book author title year
is equivalent to defining
type Book.type author_type title_type year_type
    type Book (author : author_type) (title : title_type) (year : year_type)
whereupon we notice that Book.type and Book are the same shape, and allow punning between Book.type and Book.
Book "Jane Austen" "Pride and Prejudice" 1832 : Book.type Text Text Int
Book "Jane Austen" "Pride and Prejudice" 1832 : Book Text Text Int
The eagle-eyed among you might note that author_type and author are different field names, so this would actually elaborate out to
type Book.type author title year
    type Book (author : author) (title : title) (year : year)
where we are careful with lexical scopes to distinguish between the term level author and the type level author, enabling this pun.
The List example
type List a
    Nil
    Cons (head : a) (tail : List a)

main =
    List.Cons 3
        List.Cons 14
            List.Nil
constructs a two-item list and yields (Cons 3 (Cons 14 Nil)).
Warning
This is not at all what we currently do: this elaborates each of these type declarations and finds a couple of puns:
type List.type a = List a
type Nil.type = Nil
and one definition that emerges as not a pun:
type Cons.type a
    type Cons (head : a) (tail : List a)
List acts as a container holding its static methods.
Warning
Type constructors aren't really implemented in Enso as of now:
List serves double duty in the above. It acts as the type constructor for lists, so that List Int is a well-formed Type, but it also ... List Int is currently erased to List, and the additional element-type information of Int is lost during execution.
Warning
Enso supports just static methods
There are two forms of static methods that we are interested in over the long term. We'll classify them as 'normal' static methods and 'explicit' static methods. For now, we're going to support 'normal' static methods, as they are needed for 'from' conversions. Still, there is an opportunity to allow more explicit statics later on for some uses involving meta-programming.
type Vector a
    MkVec (unsafe_to_array : Array a)

    from (that : List x) : Vector x = MkVec (Array.from that)
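A hypothetical use of this conversion, assuming the sketch above:
main = Vector.from (List.Cons 1 (List.Cons 2 List.Nil)) # a Vector built from the two-element list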
Warning
This is not how it works:
this internally elaborates to
type Vector a
    type MkVec.type a
    type MkVec (unsafe_to_array : Array a)
    ...
with the obvious subtyping relationship
MkVec.type a <: Vector a
Similarly to the pun story above, when you have a single type statement contained in a parent type statement, and the shape aligns, we should be willing to allow punning between them, eliminating the distinct MkVec.type here. This results in a nice "obvious" usage pattern, where we aren't distinguishing between several vector types. The type of the module lives in the type constructor, and so does the type of its individual constructor, which is limited to only holding arrays.
Static methods live in the type. Using Vector itself as the place where the statics live allows for a convenient usage pattern:
Vector.from_polyglot_array [1,2,3] : Vector Int
where 'Vector' is playing double duty as a type constructor (currently erased just to the raw type Vector) and as the bag of static methods.
Warning
Another example with .type that's unlikely to be valid:
type Book.type
    type Book (author : String) (title : String) (year : Int)
    static whatever = ...
where 'whatever' is a method on Book.type, not Book, which is not a pun.
As in Java, static methods don't appear as static methods inside subtypes nor, in our case, on the terms that inhabit our types. This is a deliberate design choice: by choosing not to have static methods push down into the atom constructors that build terms in a type for now, we retain the option to revisit this later.
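A hedged sketch of the consequence, reusing the hypothetical Vector from earlier:
Vector.from (List.Cons 1 List.Nil) # fine: from is a static on Vector
# Vector.MkVec.from (List.Cons 1 List.Nil)
# rejected: statics are not pushed down into the atom constructor MkVec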
There's a tension here between the 'ease of use' of getting list methods off
of a proxy list you happen to have in hand as a 'prototype', and the ability
to allow subtypes to have their own statics later on, which would be useful if
we want to meaningfully subtype non-empty lists and lists, but allow different
conversions 'to' non-empty lists. We don't have to choose now, so we prefer to
leave the option open, until more sophisticated usage scenarios and users
force a choice.
If we read the above rules carefully, they allow us to construct module-like objects via types:
type Bucket
    static wat : Foo -> Bar
which at first seems to give us no way to construct a term of type Bucket, but recall that it elaborates to
type Bucket.type
    type Bucket
    static wat : Foo -> Bar
and then puns Bucket.type = Bucket, so we can talk about Bucket.wat exactly as if Bucket were a qualified module import. This supplies us with an implementation semantics for what 'modules' mean. They are just types with no parameters, where all their top-level methods are lifted into statics, and where we cut the subtyping relationship between any types you define inside the module and the module itself.
The static methods get attached to the type definition. This means they do not have access to any of the type's atom arguments themselves.
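For instance, a minimal sketch (the Pair type and its duplicate method are hypothetical):
type Pair a
    MkPair (first : a) (second : a)

    # A static method may introduce its own type variables (here x), but the
    # 'a' of any particular Pair value is not available to it.
    duplicate (v : x) : Pair x = Pair.MkPair v v
so Pair.duplicate 3 yields (MkPair 3 3) without any Pair value, or its type argument, in hand.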
Warning
No plans for explicit static methods
A syntax for this would be something like
type Vector a
    type MkVec (unsafe_to_array : Array a)
    explicit static from (that : Json) = map (x -> x.to this.a) Json.expectArray
Notice how this "explicit" static method is accessing the type arguments of 'Vector' through this.
Note: This isn't something we should do in general, as it requires explicitly passing type arguments everywhere, and would encumber from conversions unduly if it were the only implementation technique available to us. This would be analogous to building a type system without any "erasure" if we were forced to pass explicit types everywhere, and would come at an unacceptable performance tax.
Returning to the view of case objects and modules above, ML-style functors would correspond to having 'explicit' statics rather than static methods in the case object, which carries with it the concerns of whether modules have applicative or generative semantics. By punting on this for now, we don't have to decide such things, and don't have to pay a universal performance tax, while still leaving the door open for this to become a feature we later build into the convenient semantic 'dead space' we leave undefined.
Future Direction: Normalization-by-evaluation for partial evaluation, behavioral reflection, and an improved editor experience
Pedantically, normalization-by-evaluation is an implementation technique commonly used in dependent type checkers or evaluators, where you have a starting surface syntax (terms) that you compile down into internal semantic-domain objects (values), but where you want the ability to reify all values back into terms that are in some sense canonical (e.g. have been reduced to a so-called "normal" form).
This is made complicated by the fact that normal forms walk underneath functions, so you need to be able to evaluate functions with arguments they 'don't know yet' to reach this normal form. We call such unknown arguments 'stuck' or 'neutral' terms. Any attempt to eliminate them (extract information, apply them, case on them, etc.) builds a bigger neutral term that captures what you attempted to do, and we then convert these stuck terms back into full expressions in the surface language.
The magic is in slightly extending the rules of the interpreter we are using to allow these stuck terms to be passed, and to build up these larger stuck terms when elimination forms are applied to them.
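To make this concrete, here is a minimal, hedged sketch in Enso-style pseudocode: the Expr type and add_expr function are hypothetical illustrations, and a real NBE implementation would evaluate into a proper semantic domain rather than plain syntax.
type Expr
    Lit (value : Integer)
    Var (name : Text)
    Add (lhs : Expr) (rhs : Expr)

# Addition reduces only when both operands are known literals; otherwise it
# builds a bigger stuck term that records the attempted elimination.
add_expr a b = case a of
    Expr.Lit x -> case b of
        Expr.Lit y -> Expr.Lit (x+y)
        _ -> Expr.Add a b
    _ -> Expr.Add a b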
One such usage scenario is when dealing with how to show the user the 'result' of a computation that has not yet been wired up to all of its inputs. Consider a world where we have a stuck or out-of-scope 'a', 'b' and 'c'. In this setting, when we go to evaluate
foldr (+) 0 [a,b,c]
we get the stuck term
a + (b + (c + 0))
and can show this to users even in the absence of a known a, b and c; then, as they wire those up to concrete inputs, it'll gradually resolve, level by level, until they have it fully wired up and can see the final answer.
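In terms of the hypothetical add_expr sketch above, the same gradual resolution looks like:
tree = add_expr (Expr.Var "a") (add_expr (Expr.Var "b") (add_expr (Expr.Var "c") (Expr.Lit 0)))
# tree is Add (Var "a") (Add (Var "b") (Add (Var "c") (Lit 0)));
# wiring "c" up to a literal collapses the innermost Add first, and so on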
Another direct analogue can be found in LINQ in the .NET ecosystem. There, Erik Meijer and Anders Hejlsberg allowed Visual Basic and C# lambdas to be inspected to see their contents, so that LINQ providers could transcode them, where possible, into things like the FROM or HAVING clauses of SQL SELECT statements, and the like.
This opens the door towards, say, an efficient library for Apache Spark calculations in Enso, where we can marshal lambdas out closer to the data, rather than having to bring the data to the lambda.
Morally this looks a lot like BER MetaOCaml, where we're deliberately being "evil" and allowing you to use naked terms in our language through NBE (like LINQ does above), instead of using an explicit syntax or a typed Template Haskell-style 'Code' type. This places a stronger burden on users of the advanced reflection behaviors to get things right, but allows them to express the full array of LINQ-like options in the language.
The LINQ scenarios are predicated on being able to feed stuck values into programs and reify the result as a syntax tree for the partially evaluated result. The meta-programming or "template Enso" scenario is predicated on being able to reflect that syntax tree back into an Enso term or function.
This isn't a corner of the design space we need to fully realize at this point, and there is more room for expansion. There are some open questions we'd like to answer.
How should we represent the creation and feeding of stuck terms?
One could envision supplying a syntax for
foo : A -> B
neutral "x" foo
which feeds foo a stuck term of type A, and lifts the resulting term of type B to Code B, where it is explicitly known that you can manipulate it as a syntax tree of the partially evaluated result. But this becomes a bit awkward to invoke.
More pragmatically, this is probably built on top of a richer programming interface like
type Code a
    # has a side-effect of producing a fresh variable
    # morally this lives in a variable supply monad like TH's 'Q'
    static fresh : String -> Name
    type Var (name : Name)
    static map : (a -> b) -> Code a -> Code b
    # other constructors for Code
    type ...
    lower : a
or
type Code a
    static fresh : String -> Name
    type Var (name : Name)
    static pure : a -> Code a # this is where the magic happens
    static apply : Code (a -> b) -> Code a -> Code b
    static map (f : a -> b) (self : Code a) : Code b = apply (pure f) self
    static join (x : Code (Code a)) : Code a = x.lower
    # other constructors for Code
    type ...
    lower : a
where we can start to observe that the behavior of this LINQ-like Code construct looks like a monad/comonad in Enso, thanks to all the things NBE lets us identify here.
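For instance, under the second interface, the monadic and comonadic readings line up as follows (hypothetical usage; pure/join give the monad, while lower plays the comonadic 'extract'):
inc_code (c : Code Integer) : Code Integer = Code.map (x -> x + 1) c

main =
    forty_two = Code.pure 42
    (inc_code forty_two).lower # evaluates to 43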
There remain other design options to explore here. Do we limit ourselves to terms, like BER MetaOCaml? Do we allow elaboration to statements and other complex expressions, like Template Haskell? None of that needs to be resolved today.