
Overview of the compiler

P-E-P edited this page Oct 4, 2024 · 4 revisions
---
title: GCCRS Graph
---
stateDiagram-v2
    Rust --> Tokens: Lexer
    Tokens --> AST: Parser
    state AST {
        [*] --> attr_checking
        attr_checking: Attribute Checker
        cfg_stripping: Cfg stripping
        toplevel_nr: Top level name resolution
        early_nr: Early name resolution
        validation: Validation
        late: Late name resolution
        feature_gating: Feature gating
        attr_checking --> cfg_stripping
        cfg_stripping --> toplevel_nr
        toplevel_nr --> early_nr
        early_nr --> expansion
        expansion --> cfg_stripping
        expansion --> validation
        validation --> feature_gating
        feature_gating --> late
        late --> [*]
    }
    state HIR {
        type_resolve: Type resolver
        privacy: Privacy checker
        safety: Safety checker
        const: Const checker
        dead: Dead code scanner
        [*] --> type_resolve
        type_resolve --> privacy
        privacy --> safety
        safety --> const
        const --> dead
        dead --> [*]
    }
    AST --> HIR: Lowering
    metadata: Crate metadata
    HIR --> metadata
    HIR --> Generic
    state Generic {
        unused: Unused variables checker
        readonly: Read only checker
        [*] --> unused
        unused --> readonly
        readonly --> [*]
    }
    state BIR {
        wrapper: Polonius wrapper
        polonius: Polonius
        wrapper --> polonius
    }
    HIR --> BIR
    Generic --> Gimple
    state RTL {
        CFGC: Control flow graph cleanup
        fwd_prop: Forward propagation
        loop: Loop optimisation
        if: If conversion
        combination: Instruction combination
        register: Register allocation
        bb_reordering: Basic block reordering
        [*] --> CFGC
        CFGC --> fwd_prop
        fwd_prop --> CSE
        CSE --> GCSE
        GCSE --> loop
        loop --> if
        if --> combination
        combination --> scheduling
        scheduling --> register
        register --> bb_reordering
    }
    Gimple --> RTL
    RTL --> Assembly

Compiler steps breakdown

The lexer takes Rust code and recognizes the different token patterns. The resulting tokens are placed in a buffered queue (rust-lex).
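To make the "buffered queue" idea concrete, here is a minimal sketch in Rust. It is not the gccrs lexer (which is written in C++): the `Token` kinds, the `Lexer` type and its `peek`/`next_token` methods are all hypothetical names chosen for illustration. Tokens are only built when the parser asks for them, and any token that has been looked at but not yet consumed waits in the queue.

```rust
use std::collections::VecDeque;
use std::iter::Peekable;
use std::str::Chars;

// Hypothetical token kinds; a real Rust lexer needs many more.
#[derive(Debug, Clone, PartialEq)]
enum Token {
    Identifier(String),
    Integer(String),
    Punctuation(char),
    EndOfFile,
}

// Hypothetical lexer: builds tokens lazily and buffers them for lookahead.
struct Lexer<'a> {
    chars: Peekable<Chars<'a>>,
    buffer: VecDeque<Token>,
}

impl<'a> Lexer<'a> {
    fn new(source: &'a str) -> Self {
        Lexer { chars: source.chars().peekable(), buffer: VecDeque::new() }
    }

    // Recognize the next token pattern directly from the character stream.
    fn build_token(&mut self) -> Token {
        // Skip whitespace between tokens.
        while self.chars.next_if(|c| c.is_whitespace()).is_some() {}
        let Some(first) = self.chars.next() else {
            return Token::EndOfFile;
        };
        if first.is_alphabetic() || first == '_' {
            let mut text = String::from(first);
            while let Some(c) = self.chars.next_if(|c| c.is_alphanumeric() || *c == '_') {
                text.push(c);
            }
            Token::Identifier(text)
        } else if first.is_ascii_digit() {
            let mut text = String::from(first);
            while let Some(c) = self.chars.next_if(|c| c.is_ascii_digit()) {
                text.push(c);
            }
            Token::Integer(text)
        } else {
            Token::Punctuation(first)
        }
    }

    // Look `n` tokens ahead without consuming anything.
    fn peek(&mut self, n: usize) -> &Token {
        while self.buffer.len() <= n {
            let token = self.build_token();
            self.buffer.push_back(token);
        }
        &self.buffer[n]
    }

    // Consume and return the next token.
    fn next_token(&mut self) -> Token {
        self.peek(0);
        self.buffer.pop_front().expect("peek(0) always fills the buffer")
    }
}
```

With this sketch, a parser can call `peek(1)` to decide between two grammar rules and then call `next_token()` to consume; once the input is exhausted every further call yields `Token::EndOfFile`.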

Tokens are defined in a macro.
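Defining tokens through a macro keeps the list of tokens in one place and lets several pieces of code be generated from it. The sketch below shows the same idea with a Rust `macro_rules!` macro; it is only an analogue of the approach, and `define_tokens!`, `TokenId` and the four entries are hypothetical.

```rust
// Hypothetical token list macro: each entry pairs a variant with its text.
macro_rules! define_tokens {
    ($( $variant:ident => $text:expr ),* $(,)?) => {
        #[derive(Debug, Clone, Copy, PartialEq, Eq)]
        enum TokenId {
            $( $variant ),*
        }

        impl TokenId {
            // Every token id, in declaration order.
            const ALL: &'static [TokenId] = &[ $( TokenId::$variant ),* ];

            // Human readable description, generated from the same list.
            fn description(self) -> &'static str {
                match self {
                    $( TokenId::$variant => $text ),*
                }
            }
        }
    };
}

// The single source of truth: adding a line here updates the enum,
// the `ALL` table and `description` together.
define_tokens! {
    EndOfFile => "end of file",
    Plus => "+",
    LeftParen => "(",
    Fn => "fn",
}
```

The design choice is that nothing else in the code base has to repeat the list, so the enum and its lookup tables cannot drift apart.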

Parser implementation can be found here.

The syntax accepted at this stage may be more relaxed than the final language. This is fine because the code might later be modified by macros or removed by conditional compilation attributes.

For example, at this stage we might accept functions without a body. This is fine as long as a macro later expands such a function into a proper one or the function is stripped out; otherwise we error out at a later stage.
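A small Rust program can illustrate the bodiless function case. The snippet assumes a compiler that parses free functions without a body as described above and only rejects them after conditional compilation has run; the function name `placeholder` is made up for the example. `cfg(any())` is used because it is a predicate that is always false.

```rust
// No body: this is only acceptable syntactically. Because `cfg(any())`
// never holds, the item is stripped before any later check can reject it.
#[cfg(any())]
fn placeholder();

// The definition that actually survives into the rest of the pipeline.
// It does not clash with the one above, which no longer exists by then.
fn placeholder() -> u32 {
    7
}
```

Removing the `#[cfg(any())]` attribute leaves the bodiless function in the AST, and the program is then expected to be rejected at a later stage rather than by the parser.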

AST

Attribute checker gcc/rust/util/rust-attribute

Check the validity of attributes: whether each attribute is applied to the correct kind of item and within an allowed context.

Expansion stage gcc/rust/expand

Fixed point expansion loop. If the AST has changed during one iteration, we loop again until it stays the same or the maximum allowed number of iterations has been reached.
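The shape of such a fixed point loop can be sketched as follows. This is a toy model, not the gccrs implementation: the `Item` enum stands in for the AST, and the two expansion rules (`wrap` expands to a `make_fn` invocation, `make_fn` expands to a function) are invented so that more than one iteration is needed.

```rust
// Hypothetical stand-in for the AST: a flat list of items.
#[derive(Debug, Clone, PartialEq)]
enum Item {
    Function(String),
    MacroInvocation(String),
}

// One expansion pass. Returns the new item list and whether it changed.
fn expand_once(items: Vec<Item>) -> (Vec<Item>, bool) {
    let mut changed = false;
    let mut out = Vec::new();
    for item in items {
        match item {
            // Toy rule: `make_fn` expands to a function.
            Item::MacroInvocation(name) if name == "make_fn" => {
                out.push(Item::Function("generated".to_string()));
                changed = true;
            }
            // Toy rule: `wrap` expands to another macro invocation.
            Item::MacroInvocation(name) if name == "wrap" => {
                out.push(Item::MacroInvocation("make_fn".to_string()));
                changed = true;
            }
            // Everything else is left untouched.
            other => out.push(other),
        }
    }
    (out, changed)
}

// Repeat passes until nothing changes or the iteration limit is hit.
// Returns the final items and the number of passes that were run.
fn expand_to_fixed_point(mut items: Vec<Item>, max_iterations: usize) -> (Vec<Item>, usize) {
    let mut iterations = 0;
    while iterations < max_iterations {
        let (next, changed) = expand_once(items);
        items = next;
        iterations += 1;
        if !changed {
            break;
        }
    }
    (items, iterations)
}
```

Starting from a `wrap` invocation, the loop needs two passes that change the list and a third pass that confirms nothing is left to do. The iteration limit is what stops a macro that keeps expanding to itself from looping forever.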

Cfg stripping: remove disabled items (e.g. #[cfg(false)]). See conditional compilation.
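As an illustration of what stripping means for user code, the following Rust snippet defines the same function twice and a struct with a disabled field. The names `backend` and `Config` are made up. The predicates are spelled `cfg(any())` (always false) and `cfg(all())` (always true) so that the example does not depend on boolean literals being supported inside `cfg`.

```rust
// `cfg(any())` never holds, so this definition is removed from the AST.
#[cfg(any())]
fn backend() -> &'static str {
    "disabled"
}

// `cfg(all())` always holds, so this is the definition that survives.
#[cfg(all())]
fn backend() -> &'static str {
    "enabled"
}

struct Config {
    name: &'static str,
    // Stripping is not limited to top level items: this field is removed too.
    #[cfg(any())]
    debug_only: u32,
}
```

Because the first `backend` and the `debug_only` field are gone before name resolution, the two `backend` definitions do not conflict and `Config { name: "x" }` is a complete initializer.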

Top level name resolution: collect all definitions (structs, functions, modules, traits).

Early name resolution: resolve all macro invocations (macros by example, procedural macros, ...).

Expansion gcc/rust/expand

Expand all macro invocations found. Do not error out if something is missing, as it might be produced by an expansion during a later iteration.
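A macro that defines another macro shows why a missing definition cannot be an immediate error. In the Rust sketch below (`define_helper`, `helper` and `value` are hypothetical names), the `helper!` macro does not exist until `define_helper!` has been expanded, so the invocation inside `value` can only be resolved on a later pass.

```rust
// Expanding this macro is what brings the macro named `$name` into existence.
macro_rules! define_helper {
    ($name:ident) => {
        macro_rules! $name {
            () => {
                40 + 2
            };
        }
    };
}

// First expansion: produces the definition of `helper!`.
define_helper!(helper);

fn value() -> i32 {
    // Resolvable only after the invocation above has been expanded.
    helper!()
}
```

A compiler that reported `helper!` as unknown on the first pass would reject a valid program, which is why unresolved invocations are simply retried on the next iteration.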

Validation: ensure the AST is valid and properly constructed. We error out here on some syntax that was accepted earlier, during parsing.

Feature gating: some features should only be accepted if they are enabled (nightly features).
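In outline, a feature gate check compares the features a crate has enabled with the gated constructs actually found in the AST. The sketch below is a simplified model under that assumption; `GatedUse`, `check_feature_gates`, the error wording and the locations are hypothetical, and the feature names only serve as sample data.

```rust
use std::collections::HashSet;

// Hypothetical record of a gated construct found while walking the AST.
struct GatedUse {
    feature: &'static str,
    location: &'static str,
}

// Returns one error message per use whose feature has not been enabled.
fn check_feature_gates(enabled: &HashSet<&str>, uses: &[GatedUse]) -> Vec<String> {
    uses.iter()
        .filter(|u| !enabled.contains(u.feature))
        .map(|u| format!("{}: use of unstable feature `{}`", u.location, u.feature))
        .collect()
}
```

Running the check after expansion means that gated constructs removed by cfg stripping, or never produced by a macro, are not reported.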

Late name resolution: resolve everything else.

HIR

Lowering

BIR

Helpful bits

  • gcc/rust/util/rust-hir-map - Retrieve an item/AST node from its node id.
  • gcc/rust/lex/token.h - Retrieve a macro to build a list of tokens.
  • gcc/rust/util/lang-items.h - Retrieve existing lang items.
  • gcc/rust/check/errors/rust-feature-gate.cc - Retrieve the list of existing nightly features.
  • gcc/rust/util/rust-keyword-values.h - Retrieve the list of existing keywords and weak keywords.