Skip to content

Normalized AST

github-actions[bot] edited this page Nov 23, 2024 · 1 revision

This document was generated from 'src/documentation/print-normalized-ast-wiki.ts' on 2024-11-23, 17:55:16 UTC presenting an overview of flowR's normalized ast (v2.1.7, using R v4.4.0).

flowR produces a normalized version of R's abstract syntax tree (AST), offering the following benefits:

  1. abstract away from intricacies of the R parser
  2. provide a version-independent representation of the program
  3. decorate the AST with additional information, e.g., parent relations and nesting information

In general, the mapping should be rather intuitive and focused primarily on the syntactic structure of the program. Consider the following example which shows the normalized AST of the code

x <- 2 * 3 + 1

Each node in the AST contains the type, the id, and the lexeme (if applicable). Each edge is labeled with the type of the parent-child relationship (the "role").

flowchart LR
    n7(["RExpressionList (7)
 "])
    n6(["RBinaryOp (6)
#60;#45;"])
    n7 -->|"expr-list-child-0"| n6
    n0(["RSymbol (0)
x"])
    n6 -->|"binop-lhs"| n0
    n5(["RBinaryOp (5)
#43;"])
    n6 -->|"binop-rhs"| n5
    n3(["RBinaryOp (3)
#42;"])
    n5 -->|"binop-lhs"| n3
    n1(["RNumber (1)
2"])
    n3 -->|"binop-lhs"| n1
    n2(["RNumber (2)
3"])
    n3 -->|"binop-rhs"| n2
    n4(["RNumber (4)
1"])
    n5 -->|"binop-rhs"| n4

Loading

(The analysis required 11.78 ms (including parsing with the R shell) within the generation environment.)

 

Tip

If you want to investigate the normalized AST, you can either use the Visual Studio Code extension or the :normalize* command in the REPL (see the Interface wiki page for more information).

Indicative of the normalization is the root expression list node, which is present in every normalized AST. In general, we provide node types for:

  1. literals (e.g., numbers and strings)
  2. references (e.g., symbols, parameters and function calls)
  3. constructs (e.g., loops and function definitions)
  4. branches (e.g., next and break)
  5. operators (e.g. +, -, and *)
Complete Class Diagram

Every node is a link, which directly refers to the implementation in the source code. Grayed-out parts are used for structuring the AST, grouping together related nodes.

classDiagram
direction RL
class RNode~Info = NoInfo~
    <<type>> RNode
style RNode opacity:.35,fill:#FAFAFA
click RNode href "https://github.com/flowr-analysis/flowr/tree/main//src/r-bridge/lang-4.x/ast/model/model.ts#L163" "The #96;RNode#96; type is the union of all possible nodes in the R#45;ast. It should be used whenever you either not care what kind of node you are dealing with or if you want to handle all possible nodes. #60;p#62;  All other subtypes (like; #60;code#62;RLoopConstructs#60;/code#62;; ) listed above can be used to restrict the kind of node. They do not have to be exclusive, some nodes can appear in multiple subtypes."
class RExpressionList~Info = NoInfo~
    <<interface>> RExpressionList
    RExpressionList : type#58; RType.ExpressionList
    RExpressionList : grouping#58; #91;start#58; RSymbol#60;Info, string#62;, end#58; RSymbol#60;Info, string#62;#93;
click RExpressionList href "https://github.com/flowr-analysis/flowr/tree/main//src/r-bridge/lang-4.x/ast/model/nodes/r-expression-list.ts#L9" "Holds a list of expressions (and hence may be the root of an AST, summarizing all expressions in a file). The #96;grouping#96; property holds information on if the expression list is structural or created by a wrapper like #96;#123;#125;#96; or #96;()#96;."
    RExpressionList : children#58; readonly Children#91;#93; [from WithChildren]
    RExpressionList : type#58; RType [from Base]
    RExpressionList : lexeme#58; LexemeType [from Base]
    RExpressionList : info#58; Info #38; Source [from Base]
class RFunctions~Info~
    <<type>> RFunctions
style RFunctions opacity:.35,fill:#FAFAFA
click RFunctions href "https://github.com/flowr-analysis/flowr/tree/main//src/r-bridge/lang-4.x/ast/model/model.ts#L146" "This subtype of; #60;code#62;RNode#60;/code#62;; represents all types related to functions (calls and definitions) in the normalized AST."
class RFunctionDefinition~Info = NoInfo~
    <<interface>> RFunctionDefinition
    RFunctionDefinition : type#58; RType.FunctionDefinition
    RFunctionDefinition : parameters#58; RParameter#60;Info#62;#91;#93;
    RFunctionDefinition : body#58; RNode#60;Info#62;
click RFunctionDefinition href "https://github.com/flowr-analysis/flowr/tree/main//src/r-bridge/lang-4.x/ast/model/nodes/r-function-definition.ts#L14" "#96;#96;#96;r function(#60;parameters#62;) #60;body#62; #96;#96;#96; or#58; #96;#96;#96;r #92;(#60;parameters#62;) #60;body#62; #96;#96;#96;"
    RFunctionDefinition : type#58; RType [from Base]
    RFunctionDefinition : lexeme#58; LexemeType [from Base]
    RFunctionDefinition : info#58; Info #38; Source [from Base]
    RFunctionDefinition : location#58; SourceRange [from Location]
class RFunctionCall~Info = NoInfo~
    <<type>> RFunctionCall
style RFunctionCall opacity:.35,fill:#FAFAFA
click RFunctionCall href "https://github.com/flowr-analysis/flowr/tree/main//src/r-bridge/lang-4.x/ast/model/nodes/r-function-call.ts#L39" ""
class RNamedFunctionCall~Info = NoInfo~
    <<interface>> RNamedFunctionCall
    RNamedFunctionCall : type#58; RType.FunctionCall
    RNamedFunctionCall : named#58; true
    RNamedFunctionCall : functionName#58; RSymbol#60;Info, string#62;
    RNamedFunctionCall : arguments#58; readonly RFunctionArgument#60;Info#62;#91;#93;
click RNamedFunctionCall href "https://github.com/flowr-analysis/flowr/tree/main//src/r-bridge/lang-4.x/ast/model/nodes/r-function-call.ts#L15" "Calls of functions like #96;a()#96; and #96;foo(42, #34;hello#34;)#96;."
    RNamedFunctionCall : type#58; RType [from Base]
    RNamedFunctionCall : lexeme#58; LexemeType [from Base]
    RNamedFunctionCall : info#58; Info #38; Source [from Base]
    RNamedFunctionCall : location#58; SourceRange [from Location]
class RUnnamedFunctionCall~Info = NoInfo~
    <<interface>> RUnnamedFunctionCall
    RUnnamedFunctionCall : type#58; RType.FunctionCall
    RUnnamedFunctionCall : named#58; false
    RUnnamedFunctionCall : calledFunction#58; RNode#60;Info#62;
    RUnnamedFunctionCall : infixSpecial#58; boolean
    RUnnamedFunctionCall : arguments#58; readonly RFunctionArgument#60;Info#62;#91;#93;
click RUnnamedFunctionCall href "https://github.com/flowr-analysis/flowr/tree/main//src/r-bridge/lang-4.x/ast/model/nodes/r-function-call.ts#L29" "Direct calls of functions like #96;(function(x) #123; x #125;)(3)#96;."
    RUnnamedFunctionCall : type#58; RType [from Base]
    RUnnamedFunctionCall : lexeme#58; LexemeType [from Base]
    RUnnamedFunctionCall : info#58; Info #38; Source [from Base]
    RUnnamedFunctionCall : location#58; SourceRange [from Location]
class RParameter~Info = NoInfo~
    <<interface>> RParameter
    RParameter : type#58; RType.Parameter
    RParameter : name#58; RSymbol#60;Info, string#62;
    RParameter : special#58; boolean
    RParameter : defaultValue#58; RNode#60;Info#62;
click RParameter href "https://github.com/flowr-analysis/flowr/tree/main//src/r-bridge/lang-4.x/ast/model/nodes/r-parameter.ts#L8" "Represents a parameter of a function definition in R."
    RParameter : type#58; RType [from Base]
    RParameter : lexeme#58; LexemeType [from Base]
    RParameter : info#58; Info #38; Source [from Base]
    RParameter : location#58; SourceRange [from Location]
class RArgument~Info = NoInfo~
    <<interface>> RArgument
    RArgument : type#58; RType.Argument
    RArgument : name#58; RSymbol#60;Info, string#62;
    RArgument : value#58; RNode#60;Info#62;
click RArgument href "https://github.com/flowr-analysis/flowr/tree/main//src/r-bridge/lang-4.x/ast/model/nodes/r-argument.ts#L8" "Represents a named or unnamed argument of a function definition in R."
    RArgument : type#58; RType [from Base]
    RArgument : lexeme#58; LexemeType [from Base]
    RArgument : info#58; Info #38; Source [from Base]
    RArgument : location#58; SourceRange [from Location]
class ROther~Info~
    <<type>> ROther
style ROther opacity:.35,fill:#FAFAFA
click ROther href "https://github.com/flowr-analysis/flowr/tree/main//src/r-bridge/lang-4.x/ast/model/model.ts#L151" "This subtype of; #60;code#62;RNode#60;/code#62;; represents all types of otherwise hard to categorize nodes in the normalized AST. At the moment these are the comment#45;like nodes."
class RComment~Info = NoInfo~
    <<interface>> RComment
    RComment : type#58; RType.Comment
    RComment : content#58; string
click RComment href "https://github.com/flowr-analysis/flowr/tree/main//src/r-bridge/lang-4.x/ast/model/nodes/r-comment.ts#L9" "#96;#96;#96;r # I am a line comment #96;#96;#96;"
    RComment : location#58; SourceRange [from Location]
class RLineDirective~Info = NoInfo~
    <<interface>> RLineDirective
    RLineDirective : type#58; RType.LineDirective
    RLineDirective : line#58; number
    RLineDirective : file#58; string
click RLineDirective href "https://github.com/flowr-analysis/flowr/tree/main//src/r-bridge/lang-4.x/ast/model/nodes/r-line-directive.ts#L7" "Special comment to signal line mappings (e.g., in generated code) to the interpreter."
    RLineDirective : location#58; SourceRange [from Location]
class RConstructs~Info~
    <<type>> RConstructs
style RConstructs opacity:.35,fill:#FAFAFA
click RConstructs href "https://github.com/flowr-analysis/flowr/tree/main//src/r-bridge/lang-4.x/ast/model/model.ts#L141" "As an extension to; #60;code#62;RLoopConstructs#60;/code#62;; , this subtype of; #60;code#62;RNode#60;/code#62;; includes the; #60;code#62;RIfThenElse#60;/code#62;; construct as well."
class RLoopConstructs~Info~
    <<type>> RLoopConstructs
style RLoopConstructs opacity:.35,fill:#FAFAFA
click RLoopConstructs href "https://github.com/flowr-analysis/flowr/tree/main//src/r-bridge/lang-4.x/ast/model/model.ts#L136" "This subtype of; #60;code#62;RNode#60;/code#62;; represents all looping constructs in the normalized AST."
class RForLoop~Info = NoInfo~
    <<interface>> RForLoop
    RForLoop : type#58; RType.ForLoop
    RForLoop : variable#58; RSymbol#60;Info, string#62;
    RForLoop : vector#58; RNode#60;Info#62;
    RForLoop : body#58; RExpressionList#60;Info#62;
click RForLoop href "https://github.com/flowr-analysis/flowr/tree/main//src/r-bridge/lang-4.x/ast/model/nodes/r-for-loop.ts#L11" "#96;#96;#96;r for(#60;variable#62; in #60;vector#62;) #60;body#62; #96;#96;#96;"
    RForLoop : type#58; RType [from Base]
    RForLoop : lexeme#58; LexemeType [from Base]
    RForLoop : info#58; Info #38; Source [from Base]
    RForLoop : location#58; SourceRange [from Location]
class RRepeatLoop~Info = NoInfo~
    <<interface>> RRepeatLoop
    RRepeatLoop : type#58; RType.RepeatLoop
    RRepeatLoop : body#58; RExpressionList#60;Info#62;
click RRepeatLoop href "https://github.com/flowr-analysis/flowr/tree/main//src/r-bridge/lang-4.x/ast/model/nodes/r-repeat-loop.ts#L10" "#96;#96;#96;r repeat #60;body#62; #96;#96;#96;"
    RRepeatLoop : type#58; RType [from Base]
    RRepeatLoop : lexeme#58; LexemeType [from Base]
    RRepeatLoop : info#58; Info #38; Source [from Base]
    RRepeatLoop : location#58; SourceRange [from Location]
class RWhileLoop~Info = NoInfo~
    <<interface>> RWhileLoop
    RWhileLoop : type#58; RType.WhileLoop
    RWhileLoop : condition#58; RNode#60;Info#62;
    RWhileLoop : body#58; RExpressionList#60;Info#62;
click RWhileLoop href "https://github.com/flowr-analysis/flowr/tree/main//src/r-bridge/lang-4.x/ast/model/nodes/r-while-loop.ts#L10" "#96;#96;#96;r while(#60;condition#62;) #60;body#62; #96;#96;#96;"
    RWhileLoop : type#58; RType [from Base]
    RWhileLoop : lexeme#58; LexemeType [from Base]
    RWhileLoop : info#58; Info #38; Source [from Base]
    RWhileLoop : location#58; SourceRange [from Location]
class RIfThenElse~Info = NoInfo~
    <<interface>> RIfThenElse
    RIfThenElse : type#58; RType.IfThenElse
    RIfThenElse : condition#58; RNode#60;Info#62;
    RIfThenElse : then#58; RExpressionList#60;Info#62;
    RIfThenElse : otherwise#58; RExpressionList#60;Info#62;
click RIfThenElse href "https://github.com/flowr-analysis/flowr/tree/main//src/r-bridge/lang-4.x/ast/model/nodes/r-if-then-else.ts#L10" "#96;#96;#96;r if(#60;condition#62;) #60;then#62; #91;else #60;otherwise#62;#93; #96;#96;#96;"
    RIfThenElse : type#58; RType [from Base]
    RIfThenElse : lexeme#58; LexemeType [from Base]
    RIfThenElse : info#58; Info #38; Source [from Base]
    RIfThenElse : location#58; SourceRange [from Location]
class RNamedAccess~Info = NoInfo~
    <<interface>> RNamedAccess
    RNamedAccess : operator#58; #34;$#34; | #34;@#34;
    RNamedAccess : access#58; #91;RUnnamedArgument#60;Info#62;#93;
click RNamedAccess href "https://github.com/flowr-analysis/flowr/tree/main//src/r-bridge/lang-4.x/ast/model/nodes/r-access.ts#L19" "Represents an R named access operation with #96;$#96; or #96;@#96;, the field is a string."
    RNamedAccess : type#58; RType.Access [from RAccessBase]
    RNamedAccess : accessed#58; RNode#60;Info#62; [from RAccessBase]
    RNamedAccess : operator#58; #34;#91;#34; | #34;#91;#91;#34; | #34;$#34; | #34;@#34; [from RAccessBase]
class RIndexAccess~Info = NoInfo~
    <<interface>> RIndexAccess
    RIndexAccess : operator#58; #34;#91;#34; | #34;#91;#91;#34;
    RIndexAccess : access#58; readonly (#34;#60;#62;#34; | RArgument#60;Info#62;)#91;#93;
click RIndexAccess href "https://github.com/flowr-analysis/flowr/tree/main//src/r-bridge/lang-4.x/ast/model/nodes/r-access.ts#L25" "access can be a number, a variable or an expression that resolves to one, a filter etc."
    RIndexAccess : type#58; RType.Access [from RAccessBase]
    RIndexAccess : accessed#58; RNode#60;Info#62; [from RAccessBase]
    RIndexAccess : operator#58; #34;#91;#34; | #34;#91;#91;#34; | #34;$#34; | #34;@#34; [from RAccessBase]
class RUnaryOp~Info = NoInfo~
    <<interface>> RUnaryOp
    RUnaryOp : type#58; RType.UnaryOp
    RUnaryOp : operator#58; string
    RUnaryOp : operand#58; RNode#60;Info#62;
click RUnaryOp href "https://github.com/flowr-analysis/flowr/tree/main//src/r-bridge/lang-4.x/ast/model/nodes/r-unary-op.ts#L7" "Unary operations like #96;#43;#96; and #96;#45;#96;"
    RUnaryOp : type#58; RType [from Base]
    RUnaryOp : lexeme#58; LexemeType [from Base]
    RUnaryOp : info#58; Info #38; Source [from Base]
    RUnaryOp : location#58; SourceRange [from Location]
class RBinaryOp~Info = NoInfo~
    <<interface>> RBinaryOp
    RBinaryOp : type#58; RType.BinaryOp
    RBinaryOp : operator#58; string
    RBinaryOp : lhs#58; RNode#60;Info#62;
    RBinaryOp : rhs#58; RNode#60;Info#62;
click RBinaryOp href "https://github.com/flowr-analysis/flowr/tree/main//src/r-bridge/lang-4.x/ast/model/nodes/r-binary-op.ts#L7" "Operators like #96;#43;#96;, #96;==#96;, #96;#38;#38;#96;, etc."
    RBinaryOp : type#58; RType [from Base]
    RBinaryOp : lexeme#58; LexemeType [from Base]
    RBinaryOp : info#58; Info #38; Source [from Base]
    RBinaryOp : location#58; SourceRange [from Location]
class RSingleNode~Info~
    <<type>> RSingleNode
style RSingleNode opacity:.35,fill:#FAFAFA
click RSingleNode href "https://github.com/flowr-analysis/flowr/tree/main//src/r-bridge/lang-4.x/ast/model/model.ts#L132" "This subtype of; #60;code#62;RNode#60;/code#62;; represents all types of; #60;code#62;Leaf#60;/code#62;; nodes in the normalized AST."
class RSymbol~Info = NoInfo, T extends string = string~
    <<interface>> RSymbol
    RSymbol : type#58; RType.Symbol
    RSymbol : content#58; T
click RSymbol href "https://github.com/flowr-analysis/flowr/tree/main//src/r-bridge/lang-4.x/ast/model/nodes/r-symbol.ts#L12" "Represents identifiers (variables)."
    RSymbol : namespace#58; string [from Namespace]
    RSymbol : location#58; SourceRange [from Location]
class RConstant~Info~
    <<type>> RConstant
style RConstant opacity:.35,fill:#FAFAFA
click RConstant href "https://github.com/flowr-analysis/flowr/tree/main//src/r-bridge/lang-4.x/ast/model/model.ts#L127" "This subtype of; #60;code#62;RNode#60;/code#62;; represents all types of constants represented in the normalized AST."
class RNumber~Info = NoInfo~
    <<interface>> RNumber
    RNumber : type#58; RType.Number
    RNumber : content#58; RNumberValue
click RNumber href "https://github.com/flowr-analysis/flowr/tree/main//src/r-bridge/lang-4.x/ast/model/nodes/r-number.ts#L6" "includes numeric, integer, and complex"
    RNumber : location#58; SourceRange [from Location]
class RString~Info = NoInfo~
    <<interface>> RString
    RString : type#58; RType.String
    RString : content#58; RStringValue
click RString href "https://github.com/flowr-analysis/flowr/tree/main//src/r-bridge/lang-4.x/ast/model/nodes/r-string.ts#L8" "Represents a string like #96;#34;hello#34;#96;, including raw strings like #96;r#34;(hello)#34;#96;."
    RString : location#58; SourceRange [from Location]
class RLogical~Info = NoInfo~
    <<interface>> RLogical
    RLogical : type#58; RType.Logical
    RLogical : content#58; boolean
click RLogical href "https://github.com/flowr-analysis/flowr/tree/main//src/r-bridge/lang-4.x/ast/model/nodes/r-logical.ts#L9" "Represents logical values (#96;TRUE#96; or #96;FALSE#96;)."
    RLogical : location#58; SourceRange [from Location]
class RBreak~Info = NoInfo~
    <<interface>> RBreak
    RBreak : type#58; RType.Break
click RBreak href "https://github.com/flowr-analysis/flowr/tree/main//src/r-bridge/lang-4.x/ast/model/nodes/r-break.ts#L7" "A #96;break#96; statement."
    RBreak : location#58; SourceRange [from Location]
class RNext~Info = NoInfo~
    <<interface>> RNext
    RNext : type#58; RType.Next
click RNext href "https://github.com/flowr-analysis/flowr/tree/main//src/r-bridge/lang-4.x/ast/model/nodes/r-next.ts#L7" "A #96;next#96; statement."
    RNext : location#58; SourceRange [from Location]
class RPipe~Info = NoInfo~
    <<interface>> RPipe
    RPipe : type#58; RType.Pipe
    RPipe : lhs#58; RNode#60;Info#62;
    RPipe : rhs#58; RNode#60;Info#62;
click RPipe href "https://github.com/flowr-analysis/flowr/tree/main//src/r-bridge/lang-4.x/ast/model/nodes/r-pipe.ts#L7" "Variant of the binary operator, specifically for the new, built#45;in pipe operator."
    RPipe : type#58; RType [from Base]
    RPipe : lexeme#58; LexemeType [from Base]
    RPipe : info#58; Info #38; Source [from Base]
    RPipe : location#58; SourceRange [from Location]
RExpressionList .. RNode
Info .. RNode
RFunctions .. RNode
RFunctionDefinition .. RFunctions
Info .. RFunctions
RFunctionCall .. RFunctions
RNamedFunctionCall .. RFunctionCall
Info .. RFunctionCall
RUnnamedFunctionCall .. RFunctionCall
Info .. RFunctionCall
Info .. RFunctions
RParameter .. RFunctions
Info .. RFunctions
RArgument .. RFunctions
Info .. RFunctions
Info .. RNode
ROther .. RNode
RComment .. ROther
Info .. ROther
RLineDirective .. ROther
Info .. ROther
Info .. RNode
RConstructs .. RNode
RLoopConstructs .. RConstructs
RForLoop .. RLoopConstructs
Info .. RLoopConstructs
RRepeatLoop .. RLoopConstructs
Info .. RLoopConstructs
RWhileLoop .. RLoopConstructs
Info .. RLoopConstructs
Info .. RConstructs
RIfThenElse .. RConstructs
Info .. RConstructs
Info .. RNode
RNamedAccess .. RNode
Info .. RNode
RIndexAccess .. RNode
Info .. RNode
RUnaryOp .. RNode
Info .. RNode
RBinaryOp .. RNode
Info .. RNode
RSingleNode .. RNode
RComment .. RSingleNode
Info .. RSingleNode
RSymbol .. RSingleNode
Info .. RSingleNode
RConstant .. RSingleNode
RNumber .. RConstant
Info .. RConstant
RString .. RConstant
Info .. RConstant
RLogical .. RConstant
Info .. RConstant
Info .. RSingleNode
RBreak .. RSingleNode
Info .. RSingleNode
RNext .. RSingleNode
Info .. RSingleNode
RLineDirective .. RSingleNode
Info .. RSingleNode
Info .. RNode
RPipe .. RNode
Info .. RNode
Loading

The generation of the class diagram required 833.17 ms.

Node types are controlled by the RType enum (see ./src/r-bridge/lang-4.x/ast/model/type.ts), which is used to distinguish between different types of nodes. Additionally, every AST node is generic with respect to the Info type which allows for arbitrary decorations (e.g., parent inforamtion or dataflow constraints). Most notably, the info field holds the id of the node, which is used to reference the node in the dataflow graph.

In summary, we have the following types:

Normalized AST Node Types
  • RNode
    The RNode type is the union of all possible nodes in the R-ast. It should be used whenever you either not care what kind of node you are dealing with or if you want to handle all possible nodes.

    All other subtypes (like RLoopConstructs ) listed above can be used to restrict the kind of node. They do not have to be exclusive, some nodes can appear in multiple subtypes.

    Defined at ./src/r-bridge/lang-4.x/ast/model/model.ts#L163
    /**
     * The `RNode` type is the union of all possible nodes in the R-ast.
     * It should be used whenever you either not care what kind of
     * node you are dealing with or if you want to handle all possible nodes.
     * <p>
     *
     * All other subtypes (like {@link RLoopConstructs}) listed above
     * can be used to restrict the kind of node. They do not have to be
     * exclusive, some nodes can appear in multiple subtypes.
     */
    export type RNode<Info = NoInfo>  = RExpressionList<Info> | RFunctions<Info>
        | ROther<Info> | RConstructs<Info> | RNamedAccess<Info> | RIndexAccess<Info>
        | RUnaryOp<Info> | RBinaryOp<Info> | RSingleNode<Info>  | RPipe<Info>
    • RExpressionList
      Holds a list of expressions (and hence may be the root of an AST, summarizing all expressions in a file). The grouping property holds information on if the expression list is structural or created by a wrapper like {} or ().

      Defined at ./src/r-bridge/lang-4.x/ast/model/nodes/r-expression-list.ts#L9
      /**
       * Holds a list of expressions (and hence may be the root of an AST, summarizing all expressions in a file).
       * The `grouping` property holds information on if the expression list is structural or created by a wrapper like `{}` or `()`.
       */
      export interface RExpressionList<Info = NoInfo> extends WithChildren<Info, RNode<Info>>, Base<Info, string | undefined>, Partial<Location> {
          readonly type:     RType.ExpressionList;
          /** encodes wrappers like `{}` or `()` */
          readonly grouping: undefined | [start: RSymbol<Info>, end: RSymbol<Info>]
      }
    • RFunctions
      This subtype of RNode represents all types related to functions (calls and definitions) in the normalized AST.

      Defined at ./src/r-bridge/lang-4.x/ast/model/model.ts#L146
      /**
       * This subtype of {@link RNode} represents all types related to functions
       * (calls and definitions) in the normalized AST.
       */
      export type RFunctions<Info>      = RFunctionDefinition<Info> | RFunctionCall<Info> | RParameter<Info> | RArgument<Info>
      • RFunctionDefinition

        function(<parameters>) <body>

        or:

        \(<parameters>) <body>
        Defined at ./src/r-bridge/lang-4.x/ast/model/nodes/r-function-definition.ts#L14
        /**
         * ```r
         * function(<parameters>) <body>
         * ```
         * or:
         * ```r
         * \(<parameters>) <body>
         * ```
         */
        export interface RFunctionDefinition<Info = NoInfo> extends Base<Info>, Location {
            readonly type: RType.FunctionDefinition;
            /** the R formals, to our knowledge, they must be unique */
            parameters:    RParameter<Info>[];
            body:          RNode<Info>;
        }
      • RFunctionCall

        Defined at ./src/r-bridge/lang-4.x/ast/model/nodes/r-function-call.ts#L39
        export type RFunctionCall<Info = NoInfo> = RNamedFunctionCall<Info> | RUnnamedFunctionCall<Info>;
        • RNamedFunctionCall
          Calls of functions like a() and foo(42, "hello").

          Defined at ./src/r-bridge/lang-4.x/ast/model/nodes/r-function-call.ts#L15
          /**
           * Calls of functions like `a()` and `foo(42, "hello")`.
           *
           * @see RUnnamedFunctionCall
           */
          export interface RNamedFunctionCall<Info = NoInfo> extends Base<Info>, Location {
              readonly type:      RType.FunctionCall;
              readonly named:     true;
              functionName:       RSymbol<Info>;
              /** arguments can be empty, for example when calling as `a(1, ,3)` */
              readonly arguments: readonly RFunctionArgument<Info>[];
          }
        • RUnnamedFunctionCall
          Direct calls of functions like (function(x) { x })(3).

          Defined at ./src/r-bridge/lang-4.x/ast/model/nodes/r-function-call.ts#L29
          /**
           * Direct calls of functions like `(function(x) { x })(3)`.
           *
           * @see RNamedFunctionCall
           */
          export interface RUnnamedFunctionCall<Info = NoInfo> extends Base<Info>, Location {
              readonly type:      RType.FunctionCall;
              readonly named:     false | undefined;
              calledFunction:     RNode<Info>; /* can be either a function definition or another call that returns a function etc. */
              /** marks function calls like `3 %xx% 4` which have been written in special infix notation; deprecated in v2 */
              infixSpecial?:      boolean;
              /** arguments can be undefined, for example when calling as `a(1, ,3)` */
              readonly arguments: readonly RFunctionArgument<Info>[];
          }
      • RParameter
        Represents a parameter of a function definition in R.

        Defined at ./src/r-bridge/lang-4.x/ast/model/nodes/r-parameter.ts#L8
        /**
         * Represents a parameter of a function definition in R.
         */
        export interface RParameter<Info = NoInfo> extends Base<Info>, Location {
            readonly type: RType.Parameter;
            /* the name is represented as a symbol to additionally get location information */
            name:          RSymbol<Info>;
            /** is it the special ... parameter? */
            special:       boolean;
            defaultValue:  RNode<Info> | undefined;
        }
      • RArgument
        Represents a named or unnamed argument of a function definition in R.

        Defined at ./src/r-bridge/lang-4.x/ast/model/nodes/r-argument.ts#L8
        /**
         * Represents a named or unnamed argument of a function definition in R.
         */
        export interface RArgument<Info = NoInfo> extends Base<Info>, Location {
            readonly type: RType.Argument;
            /* the name is represented as a symbol to additionally get location information */
            name:          RSymbol<Info> | undefined;
            value:         RNode<Info> | undefined;
        }
    • ROther
      This subtype of RNode represents all types of otherwise hard to categorize nodes in the normalized AST. At the moment these are the comment-like nodes.

      Defined at ./src/r-bridge/lang-4.x/ast/model/model.ts#L151
      /**
       * This subtype of {@link RNode} represents all types of otherwise hard to categorize
       * nodes in the normalized AST. At the moment these are the comment-like nodes.
       */
      export type ROther<Info>          = RComment<Info> | RLineDirective<Info>
    • RConstructs
      As an extension to RLoopConstructs , this subtype of RNode includes the RIfThenElse construct as well.

      Defined at ./src/r-bridge/lang-4.x/ast/model/model.ts#L141
      /**
       * As an extension to {@link RLoopConstructs}, this subtype of {@link RNode} includes
       * the {@link RIfThenElse} construct as well.
       */
      export type RConstructs<Info>     = RLoopConstructs<Info> | RIfThenElse<Info>
      • RLoopConstructs
        This subtype of RNode represents all looping constructs in the normalized AST.

        Defined at ./src/r-bridge/lang-4.x/ast/model/model.ts#L136
        /**
         * This subtype of {@link RNode} represents all looping constructs in the normalized AST.
         */
        export type RLoopConstructs<Info> = RForLoop<Info> | RRepeatLoop<Info> | RWhileLoop<Info>
        • RForLoop

          for(<variable> in <vector>) <body>
          Defined at ./src/r-bridge/lang-4.x/ast/model/nodes/r-for-loop.ts#L11
          /**
           * ```r
           * for(<variable> in <vector>) <body>
           * ```
           */
          export interface RForLoop<Info = NoInfo> extends Base<Info>, Location {
              readonly type: RType.ForLoop
              /** variable used in for-loop: <p> `for(<variable> in ...) ...`*/
              variable:      RSymbol<Info>
              /** vector used in for-loop: <p> `for(... in <vector>) ...`*/
              vector:        RNode<Info>
              /** body used in for-loop: <p> `for(... in ...) <body>`*/
              body:          RExpressionList<Info>
          }
        • RRepeatLoop

          repeat <body>
          Defined at ./src/r-bridge/lang-4.x/ast/model/nodes/r-repeat-loop.ts#L10
          /**
           * ```r
           * repeat <body>
           * ```
           */
          export interface RRepeatLoop<Info = NoInfo> extends Base<Info>, Location {
              readonly type: RType.RepeatLoop
              body:          RExpressionList<Info>
          }
        • RWhileLoop

          while(<condition>) <body>
          Defined at ./src/r-bridge/lang-4.x/ast/model/nodes/r-while-loop.ts#L10
          /**
           * ```r
           * while(<condition>) <body>
           * ```
           */
          export interface RWhileLoop<Info = NoInfo> extends Base<Info>, Location {
              readonly type: RType.WhileLoop
              condition:     RNode<Info>
              body:          RExpressionList<Info>
          }
      • RIfThenElse

        if(<condition>) <then> [else <otherwise>]
        Defined at ./src/r-bridge/lang-4.x/ast/model/nodes/r-if-then-else.ts#L10
        /**
         * ```r
         * if(<condition>) <then> [else <otherwise>]
         * ```
         */
        export interface RIfThenElse<Info = NoInfo> extends Base<Info>, Location {
            readonly type: RType.IfThenElse;
            condition:     RNode<Info>;
            then:          RExpressionList<Info>;
            otherwise?:    RExpressionList<Info>;
        }
    • RNamedAccess
      Represents an R named access operation with $ or @, the field is a string.

      Defined at ./src/r-bridge/lang-4.x/ast/model/nodes/r-access.ts#L19
      /**
       * Represents an R named access operation with `$` or `@`, the field is a string.
       */
      export interface RNamedAccess<Info = NoInfo> extends RAccessBase<Info> {
          operator: '$' | '@';
          access:   [RUnnamedArgument<Info>];
      }
    • RIndexAccess
      access can be a number, a variable or an expression that resolves to one, a filter etc.

      Defined at ./src/r-bridge/lang-4.x/ast/model/nodes/r-access.ts#L25
      /** access can be a number, a variable or an expression that resolves to one, a filter etc. */
      export interface RIndexAccess<Info = NoInfo> extends RAccessBase<Info> {
          operator: '[' | '[[';
          /** is null if the access is empty, e.g. `a[,3]` */
          access:   readonly (RArgument<Info> | typeof EmptyArgument)[]
      }
    • RUnaryOp
      Unary operations like + and -

      Defined at ./src/r-bridge/lang-4.x/ast/model/nodes/r-unary-op.ts#L7
      /**
       * Unary operations like `+` and `-`
       */
      export interface RUnaryOp<Info = NoInfo> extends Base<Info>, Location {
          readonly type: RType.UnaryOp;
          operator:      string;
          operand:       RNode<Info>;
      }
    • RBinaryOp
      Operators like +, ==, &&, etc.

      Defined at ./src/r-bridge/lang-4.x/ast/model/nodes/r-binary-op.ts#L7
      /**
       * Operators like `+`, `==`, `&&`, etc.
       */
      export interface RBinaryOp<Info = NoInfo> extends Base<Info>, Location {
          readonly type: RType.BinaryOp;
          operator:      string;
          lhs:           RNode<Info>;
          rhs:           RNode<Info>;
      }
    • RSingleNode
      This subtype of RNode represents all types of Leaf nodes in the normalized AST.

      Defined at ./src/r-bridge/lang-4.x/ast/model/model.ts#L132
      /**
       * This subtype of {@link RNode} represents all types of {@link Leaf} nodes in the
       * normalized AST.
       */
      export type RSingleNode<Info>     = RComment<Info> | RSymbol<Info> | RConstant<Info> | RBreak<Info> | RNext<Info> | RLineDirective<Info>
    • RPipe
      Variant of the binary operator, specifically for the new, built-in pipe operator.

      Defined at ./src/r-bridge/lang-4.x/ast/model/nodes/r-pipe.ts#L7
      /**
       * Variant of the binary operator, specifically for the new, built-in pipe operator.
       */
      export interface RPipe<Info = NoInfo> extends Base<Info>, Location {
          readonly type: RType.Pipe;
          readonly lhs:  RNode<Info>;
          readonly rhs:  RNode<Info>;
      }

The following segments intend to give you an overview of how to work with the normalized AST:

How Get a Normalized AST

As explained alongside the Interface wiki page, you can use the PipelineExecutor to get the normalized AST. If you are only interested in the normalization, a pipeline like the DEFAULT_NORMALIZE_PIPELINE suffices:

async function getAst(code: string): Promise<RNode> {
    const result = await new PipelineExecutor(DEFAULT_NORMALIZE_PIPELINE, {
        shell: new RShell(),
        request: requestFromInput(code.trim())
    }).allRemainingSteps();
    return result.normalize.ast;
}

From the REPL, you can use the :normalize command.

Traversing the Normalized AST

We provide two ways to traverse the normalized AST: Visitors and Folds.

Visitors

If you want a simple visitor which traverses the AST, the visitAst function from ./src/r-bridge/lang-4.x/ast/model/processing/visitor.ts is a good starting point. You may specify functions to be called whenever you enter and exit a node during the traversal, and any computation is to be done by side effects. For example, if you want to collect all the ids present within a normalized (sub-)ast, as it is done by the collectAllIds function, you can use the following visitor:

const ids = new Set<NodeId>();
visitAst(nodes, node => {
    ids.add(node.info.id);
});
return ids;

Folds

We formulate a fold with the base class DefaultNormalizedAstFold in ./src/abstract-interpretation/normalized-ast-fold.ts. Using this class, you can create your own fold behavior by overwriting the default methods. By default, the class provides a monoid abstraction using the empty from the constructor and the concat method.

  • DefaultNormalizedAstFold
    Default implementation of a fold over the normalized AST (using the classic fold traversal). To modify the behavior, please extend this class and overwrite the methods of interest. You can control the value passing (Returns generic) by providing sensible Monoid behavior overwriting the concat method and supplying the empty value in the constructor.

    Defined at ./src/abstract-interpretation/normalized-ast-fold.ts#L82
    /**
     * Default implementation of a fold over the normalized AST (using the classic fold traversal).
     * To modify the behavior, please extend this class and overwrite the methods of interest.
     * You can control the value passing (`Returns` generic)
     * by providing sensible Monoid behavior overwriting the {@link DefaultNormalizedAstFold#concat|concat} method
     * and supplying the empty value in the constructor.
     *
     * @note By providing `entry` and `exit` you can use this as an extension to the simpler {@link visitAst} function but without
     *       the early termination within the visitors (for this, you can overwrite the respective `fold*` methods).
     *
     * @example First you want to create your own fold:
     *
     * ```ts
     * let marker = false;
     * class MyNumberFold<Info> extends DefaultNormalizedAstFold<void, Info> {
     *     override foldRNumber(node: RNumber<Info>) {
     *         super.foldRNumber(node);
     *         marker = true;
     *     }
     * }
     * ```
     * This one does explicitly not use the return functionality (and hence acts more as a conventional visitor).
     * Now let us suppose we have a normalized AST as an {@link RNode} in the variable `ast`
     * and want to check if the AST contains a number:
     *
     * ```ts
     * const result = new MyNumberFold().fold(ast);
     * ```
     *
     * Please take a look at the corresponding tests or the wiki pages for more information on how to use this fold.
     */
    export class DefaultNormalizedAstFold<Returns = void, Info = NoInfo> implements NormalizedAstFold<Returns, Info> {
        protected readonly enter: EntryExitVisitor<Info>;
        protected readonly exit:  EntryExitVisitor<Info>;
        protected readonly empty: Returns;
    
        /**
         * Empty must provide a sensible default whenever you want to have `Returns` as non-`void`
         * (e.g., whenever you want your visitors to be able to return a value).
         */
        constructor(empty: Returns, enter?: EntryExitVisitor<Info>, exit?: EntryExitVisitor<Info>) {
            this.empty = empty;
            this.enter = enter;
            this.exit = exit;
        }
    
        /**
         * Monoid::concat
         *
         *
         * @see {@link https://en.wikipedia.org/wiki/Monoid}
         * @see {@link DefaultNormalizedAstFold#concatAll|concatAll}
         */
        protected concat(_a: Returns, _b: Returns): Returns {
            return this.empty;
        }
    
        /**
         * overwrite this method, if you have a faster way to concat multiple nodes
         *
         * @see {@link DefaultNormalizedAstFold#concatAll|concatAll}
         */
        protected concatAll(nodes: readonly Returns[]): Returns {
            return nodes.reduce((acc, n) => this.concat(acc, n), this.empty);
        }
    
        public fold(nodes: SingleOrArrayOrNothing<RNode<Info> | typeof EmptyArgument>): Returns {
            if(Array.isArray(nodes)) {
                const n = nodes as readonly (RNode<Info> | null | undefined | typeof EmptyArgument)[];
                return this.concatAll(n.filter(n => n && n !== EmptyArgument).map(node => this.foldSingle(node as RNode<Info>)));
            } else if(nodes) {
                return this.foldSingle(nodes as RNode<Info>);
            }
            return this.empty;
        }
    
        protected foldSingle(node: RNode<Info>): Returns {
            this.enter?.(node);
            const type = node.type;
            // @ts-expect-error -- ts may be unable to infer that the type is correct
            const result = this.folds[type]?.(node);
            this.exit?.(node);
            return result;
        }
    
        foldRAccess(access: RAccess<Info>) {
            let accessed = this.foldSingle(access.accessed);
            if(access.operator === '[' || access.operator === '[[') {
                accessed = this.concat(accessed, this.fold(access.access));
            }
            return accessed;
        }
        foldRArgument(argument: RArgument<Info>) {
            return this.concat(this.fold(argument.name), this.fold(argument.value));
        }
        foldRBinaryOp(binaryOp: RBinaryOp<Info>) {
            return this.concat(this.foldSingle(binaryOp.lhs), this.foldSingle(binaryOp.rhs));
        }
        foldRExpressionList(exprList: RExpressionList<Info>) {
            return this.concat(this.fold(exprList.grouping), this.fold(exprList.children));
        }
        foldRForLoop(loop: RForLoop<Info>) {
            return this.concatAll([this.foldSingle(loop.variable), this.foldSingle(loop.vector), this.foldSingle(loop.body)]);
        }
        foldRFunctionCall(call: RFunctionCall<Info>) {
            return this.concat(this.foldSingle(call.named ? call.functionName : call.calledFunction), this.fold(call.arguments));
        }
        foldRFunctionDefinition(definition: RFunctionDefinition<Info>) {
            return this.concat(this.fold(definition.parameters), this.foldSingle(definition.body));
        }
        foldRIfThenElse(ite: RIfThenElse<Info>) {
            return this.concatAll([this.foldSingle(ite.condition), this.foldSingle(ite.then), this.fold(ite.otherwise)]);
        }
        foldRParameter(parameter: RParameter<Info>) {
            return this.concat(this.foldSingle(parameter.name), this.fold(parameter.defaultValue));
        }
        foldRPipe(pipe: RPipe<Info>) {
            return this.concat(this.foldSingle(pipe.lhs), this.foldSingle(pipe.rhs));
        }
        foldRRepeatLoop(loop: RRepeatLoop<Info>) {
            return this.foldSingle(loop.body);
        }
        foldRUnaryOp(unaryOp: RUnaryOp<Info>) {
            return this.foldSingle(unaryOp.operand);
        }
        foldRWhileLoop(loop: RWhileLoop<Info>) {
            return this.concat(this.foldSingle(loop.condition), this.foldSingle(loop.body));
        }
        foldRBreak(_node: RBreak<Info>) {
            return this.empty;
        }
        foldRComment(_node: RComment<Info>) {
            return this.empty;
        }
        foldRLineDirective(_node: RLineDirective<Info>) {
            return this.empty;
        }
        foldRLogical(_node: RLogical<Info>) {
            return this.empty;
        }
        foldRNext(_node: RNext<Info>) {
            return this.empty;
        }
        foldRNumber(_node: RNumber<Info>) {
            return this.empty;
        }
        foldRString(_node: RString<Info>) {
            return this.empty;
        }
        foldRSymbol(_node: RSymbol<Info>) {
            return this.empty;
        }
    
        protected readonly folds: FittingNormalizedAstFold<Returns, Info> = {
            [RType.Access]:             n => this.foldRAccess(n),
            [RType.Argument]:           n => this.foldRArgument(n),
            [RType.BinaryOp]:           n => this.foldRBinaryOp(n),
            [RType.Break]:              n => this.foldRBreak(n),
            [RType.Comment]:            n => this.foldRComment(n),
            [RType.ExpressionList]:     n => this.foldRExpressionList(n),
            [RType.ForLoop]:            n => this.foldRForLoop(n),
            [RType.FunctionCall]:       n => this.foldRFunctionCall(n),
            [RType.FunctionDefinition]: n => this.foldRFunctionDefinition(n),
            [RType.IfThenElse]:         n => this.foldRIfThenElse(n),
            [RType.LineDirective]:      n => this.foldRLineDirective(n),
            [RType.Logical]:            n => this.foldRLogical(n),
            [RType.Next]:               n => this.foldRNext(n),
            [RType.Number]:             n => this.foldRNumber(n),
            [RType.Parameter]:          n => this.foldRParameter(n),
            [RType.Pipe]:               n => this.foldRPipe(n),
            [RType.RepeatLoop]:         n => this.foldRRepeatLoop(n),
            [RType.String]:             n => this.foldRString(n),
            [RType.Symbol]:             n => this.foldRSymbol(n),
            [RType.UnaryOp]:            n => this.foldRUnaryOp(n),
            [RType.WhileLoop]:          n => this.foldRWhileLoop(n),
        };
    }
    View more (NormalizedAstFold)

Now, of course, we could provide hundreds of examples here, but we use tests to verify that the fold behaves as expected and happily point to them at ./test/functionality/r-bridge/normalize-ast-fold.test.ts.

As a simple showcase, we want to use the fold to evaluate numeric expressions containing numbers, +, and * operators.

class MyMathFold<Info> extends DefaultNormalizedAstFold<number, Info> {
    constructor() {
    	/* use `0` as a placeholder empty for the monoid */
        super(0);
    }

    protected override concat(a: number, b: number): number {
    	/* for this example, we ignore cases that we cannot handle */ 
        return b;
    }

    override foldRNumber(node: RNumber<Info>) {
    	/* return the value of the number */ 
        return node.content.num;
    }

    override foldRBinaryOp(node: RBinaryOp<Info>) {
        if(node.operator === '+') {
            return this.fold(node.lhs) + this.fold(node.rhs);
        } else if(node.operator === '*') {
            return this.fold(node.lhs) * this.fold(node.rhs);
        } else {
        	/* in case we cannot handle the operator we could throw an error, or just use the default behavior: */
            return super.foldRBinaryOp(node);
        }
    }
}

Now, we can use the PipelineExecutor to get the normalized AST and apply the fold:

const shell = new RShell();
const ast = (await new PipelineExecutor(DEFAULT_NORMALIZE_PIPELINE, {
	shell, request: retrieveNormalizedAst(RShell, '1 + 3 * 2')
}).allRemainingSteps()).normalize.ast;

const result = new MyMathFold().fold(ast);
console.log(result); // -> 7