Skip to content

Language Internal Representation (LIR)

Stephen Zhao edited this page Sep 18, 2017 · 5 revisions

This package contains all of the elements of the CaSCAS Language Internal Representation (LIR).

The LIR is a set of classes and objects that provide the final internal representation of the structures and behaviours that make up an executable piece of CaSCAS code. After passing through the lexer and parser, a translator will transform the syntax tree, made of ParseNodes, into LIR Objects (LIRO). This note details the components of LIR and their purposes.

From a high level, LIR can somewhat be split into two domains: the program structure data, in the form of a LIR tree, made up of structural LIRO, and the runtime data, in the form of Contexts. This allows the LIR to dynamically evaluate programmed expressions, which makes it very versatile with respect to manipulations and performing operations on the code itself.

The LIRO Sub-Package

LIRO are how the original code's structure is encoded. They also contain the behaviour definitions for certain operations that may be performed on them in a general sense.

There are currently four behaviours that must be implemented for all LIRO, defined in the root trait Object:

  • a type checking method, to produce a Boolean indicating whether the given type and the type of the object match or not, with respect to a Context:
    • checkType(ctx: Context, tpe: TypeIdentifier): Boolean
  • a type inference method, to produce the type of the object with respect to a Context:
    • inferType(ctx: Context): Option[TypeIdentifier]
  • an evaluation method, to produce the result of executing the object, as well as any changes to the context as a result of reassignments during the evaluation:
    • eval(ctx: Context): Evaluation
  • a to-representation method, to produce a human-readable, intuitive form of the object (like a toString, but not bloated with information, and more useful as a overview of the structure of a LIRO tree)
    • toRepr: String

NOTE: The evaluation of a compound LIRO may simply result in a slightly more simplified LIRO if the values needed to compute a literal-valued result are not yet present. e.g. eval(x + x + 2) => 2*x + 2

From Object stem three branches of object kinds: Identifiers, Literals, and Exprs.

The generated LIR tree from the parser looks very similar to the original syntax tree because it needs to store structural information on the original code. From this perspective, LIR Literals are analogous to parsed TerminalNodes, and LIR Exprs are analogous to parsed NonTerminalNodes. Identifiers are a bit quirkier because they directly involve a Context, the other main aspect of LIR. The main difference between the syntax tree and LIR is that LIROs build on top of ParseNodes' representation of syntax to give further semantic meaning.

Literals

Literals can be thought of as the leaves of the LIR tree. Literals are the structural elements that have terminal type checking, type inference, and evaluation behaviours--as in the concrete implementations of these behaviours for each Literal kind do not make further recursive dispatching calls to these behaviours.

RationalNumber

RationalNumber is currently the only somewhat implemented Literal. It represents a rational number with an integer numerator, and a non-negative, non-zero denominator.

checkType(ctx: Context, tpe: TypeIdentifier) => if tpe == Number

inferType(ctx: Context) => Some(Number)

eval(ctx: Context) => a reduced self, no context changes

toRepr => numerator/denominator

Exprs

Exprs can be thought of as the internal nodes of the LIR tree. Type checking, type inference, and evaluation implementations for Exprs involve making recursive dispatching calls to these behaviours in child LIROs.

OperatorExpr

The most fundamental Expr kind is the OperatorExpr. An OperatorExpr is the internal representation for named operators (e.g. +, isPolynomial, differentiate) that have a defined body and lambda operators (e.g. lambda(params)(body)). OperatorExprs store a list of formal parameters (FormalParameter) and a body LIRO. FormalParameters are in the form of "Identifier : TypeIdentifier".

When an OperatorExpr is evaluated, the body is simplified as much as possible. For example an OperatorExpr's body might contain a while loop that can be executed to produce an explicit list without needing to know the values of the formal parameters that are referenced within it. In this case, the body will be simplified to an explicit list. When the Operator is actually applied, a simple substitution will be done instead of evaluating a while loop.

NOTE: Evaluating an OperatorExpr is NOT the same as applying the operation that it describes; that is is called Application of the OperatorExpr!

OperatorExprs type to specific OperatorType instances, detailed in the Type System section below.

checkType(ctx: Context, tpe: TypeIdentifier) => {
    tpe is an OperatorType,
    the tpe.args length matches self's formal parameter list length,
    each of the tpe.args checkType to the corresponding self's formal parameter,
    tpe.ret checkTypes to self's body's type
}
inferType(ctx: Context) => {
    //TODO: is it possible to infer the type of an Operator without explicitly annotating the formal parameters?
}
eval(ctx: Context) => {
    let evaldBody = evaluate the body w.r.t. the context + introductions from the formal parameters
    return simplified OperatorExpr(self.args, evaldBody), keep only reassignments from context delta
}
toRepr => lambda(args)(body)

BuiltInExpr

This Expr envelopes the same idea as an OperatorExpr except that its body is implemented in Scala, within the builtin package. There is a special interface (BuiltInDefinition) which can be used by developers to facilitate the implementation of a new builtin operator.

BuiltInExprs also type to specific OperatorType instances, detailed in the Type System section below.

checkType(ctx: Context, tpe: TypeIdentifier) => {
    tpe is an OperatorType,
    the tpe.args length matches self's formal parameter list length,
    each of the tpe.args checkType to the corresponding self's formal parameter,
    tpe.ret checkTypes to self.ret //This is different from OperatorExpr
}
inferType(ctx: Context) => {
    Some(OperatorType(self.formalParams, self.ret))
}
eval(ctx: Context) => {
    use the passed in onEval function to evaluate this if onEval exists,
    else just return self because you can't simplify a non-existent body.
}
toRepr => self.name

ApplyExpr

An ApplyExpr represents the application of a LIRO that types to an OperatorType on a list of actual parameters (also LIROs). The evaluation of an ApplyExpr will produce the result of applying the Operator to the parameters.

ApplyExprs type to the return type of the OperatorType type of the applied Operator.

ListExpr

WhileExpr

IfExpr

ForExpr

//TODO further sections