-
Notifications
You must be signed in to change notification settings - Fork 13
Qi Compiler Sync Aug 26 2022
We spent our time (about 4 hours!) almost entirely on pair programming to incorporate the bindingspec meta-DSL into the Qi source (WIP Pull Request). Along the way, we made the distinction between core and non-core forms of the language more clear, introduced some new ways to handle certain core forms, and uncovered some issues that we had to work around which may warrant additional consideration down the line.
As a hosted DSL, Qi consists of a single Racket macro that expands Qi syntax to Racket syntax. This macro contains about 60 forms that together comprise the Qi language. That is, every one of those forms is a rule in a single macro (the flow
macro).
With the recent work to add macro-extensibility to the language, it became possible to write Qi macros that extend the syntax of the language in much the same way that Racket macros extend Racket -- that is, the syntactic extensions have the same status as "built-in" forms. Side by side with the ability to allow users to extend the language, this development also allows Qi itself to be reorganized internally in terms of "core forms" and "extensions" (built-in macros). The benefit of doing this is that it allows us to decompose the expansion phase of (the Racket macro) flow
into two components -- expansion of Qi macros (not Racket macros) to "core" Qi forms, followed by compilation of the core language to Racket, bringing many benefits including the ability to improve performance.
In the last meeting, we had separated the ~60 forms of the language into about 30 core forms and 30 macros. This laid the foundation to start exploring using the bindingspec meta-DSL which models these two layers concretely and thus provides many benefits including automatically generating a "good" expander.
Since this was almost entirely a pair programming session, these notes will attempt to describe what that involved, including any useful notes for future hacking sessions.
In order to incorporate bindingspec, the Qi core needed to be properly stratified in terms of core and non-core forms.
The code was refactored before the meeting to explicitly distinguish the core language from the extensions. This is the current layout of the code.
├── flow
│ ├── aux-syntax.rkt
│ ├── core
│ │ ├── compiler.rkt
│ │ ├── impl.rkt
│ │ └── syntax.rkt
│ └── extended
│ ├── expander.rkt
│ ├── forms.rkt
│ ├── impl.rkt
│ └── syntax.rkt
├── flow.rkt
├── info.rkt
├── macro.rkt
├── main.rkt
├── on.rkt
├── private
│ └── util.rkt
├── switch.rkt
└── threading.rkt
The main thing to note here is that core/compiler.rkt
expands the core Qi language to Racket, while extended/forms.rkt
contains syntactic forms that were formerly lumped together with the core forms in the original flow
macro, but now implemented as Qi macros (for reference: A more complete description of these modules).
We worked on adding bindingspec to the formerly empty flow/extended/expander.rkt
module. As part of that, we needed to make some modifications to the core forms. Those are described below.
Some Core Qi forms (i.e. in core/compiler.rkt
) facilitate partial application of Racket functions. To do this in a tractable way that could be properly modeled as a core form, we chose to adopt Racket's approach for this kind of thing -- interposition points. So now:
- Blanket template patterns
(a ... __ b ...)
are tagged by the expander as(#%blanket-template (a ... __ b ...)
. - Fine-grained template patterns
(a ... _ b ...)
are tagged as(#%fine-template (a ... _ b ...)
. - Partial application patterns
(a ...)
are tagged as(#%partial-application (a ...))
.
This allows the compiler to define #%blanket-template
, #%fine-template
and #%partial-application
as core forms.
- The rule to match function identifiers (and interpret them directly as Racket functions) was demoted to a non-core expansion rule by wrapping the expansion using
esc
(which is a core form), so, a rulef -> (esc f)
. - The rule to match literals in the compiler was moved up to the expander as a rule
v -> (gen v)
. - The Qi macro-expansion rule which was formerly in the core was removed since it is now generated by the (bindingspec-produced) expander.
We spent a lot of time initially on declaring a grammar for the core language in the bindingspec meta-DSL, declaring the syntax of core Qi forms one at a time.
Once we were done with that, we were able to generate the required Qi expander simply as:
(define (expand-flow stx)
((nonterminal-expander floe) stx))
... where floe
was defined as a nonterminal in the grammar specification and entailed the core language.
The rest of the time was spent debugging various issues. Some common issues encountered were:
In the core (compiler) level, only core forms may be used. With the additional delineation of core forms above, there were many core form productions that needed to be modified as they were no longer exclusively in terms of core forms. In particular, any function-valued identifier needed to be changed from func
to (esc func)
since identifiers when used on their own are now expanded so that they are wrapped by esc
in a production rule in the expander.
Likewise, core forms could no longer use literals directly, since these, too, were now expanded to (gen ...)
at the expansion stage. All instances of direct use of literals had to be modified from v
to (gen v)
.
Similarly, use of partial application templates had to be wrapped in the relevant "interposition" core form, e.g. (#%fine-template ...)
.
Bindingspec takes a strict position on such collisions and assumes that the user meant to use the DSL form, raising an error if it doesn't match the expected syntax. In Qi, since syntax that doesn't match one of the forms of the language may still match a partial application of a Racket function which may happen to share the same name, in some cases we may wish to explicitly signal this possibility and allow the use of partial application syntax for these Racket functions.
We wrote a hack to achieve this, by naming the actual core form something different (e.g. appleye
instead of apply
), and then handling the original name of the form as a production at the expander level, by expanding the Qi syntax (apply
used as an identifier) to a use of the core form (appleye
). This allows the use of it as partial application to be handled correctly since it is no longer considered a Qi form.
Some strategies employed were:
"Bisect" to a problematic expansion rule in the bindingspec specification by adding a rule to match anything and raise an error (e.g. (~> v:expr (begin (displayln "hello!") (error 'bye)))
), and moving this rule around until the source of the original error is identified.
(define (expand-flow stx)
(displayln (~a "input: " stx))
(let ([result ((nonterminal-expander floe) stx)])
(displayln (~a "output: " result))
result))
(define-syntax (qi0->racket stx)
(let ([result (syntax-parse (cadr (syntax->list stx))
... compiler rules (i.e. core form expansions) ...
(displayln (~a "qi0->racket output" result))
result))
... by adding this to parse the input to expand-flow
:
(define (expand-flow stx)
...
(syntax-parse stx
[(a:id . _) (displayln (~a "syntax info: "
(syntax-debug-info
((make-interned-syntax-introducer 'qi) #'a))))]
[_ (void)])
...)
(raise-syntax-error #f "Error!" this-syntax)
instead of using Qi's report-syntax-error
utility. TODO: Modify report-syntax-error
to show source location information.
We ran into an issue where the threading interface macro ~>
was in some way shadowing the ~>
defined in the qi
binding space. This has something to do with the way binding spaces use the "set of scopes" model, that is, by simply augmenting the existing set of scopes with a new scope. Thus, in some sense they are not totally independent, and that could lead to problems like this one.
Workaround: We used different names for the Racket-level threading forms R~>
and R~>>
instead of ~>
and ~>>
but use rename-out
at the module level to restore the correct names.
Qi uses a chiral
syntax property to map a syntax object to a version that is aware of its "chirality", or the side on which it accepts arguments. This is modulated by the right-threading form (~>>
) which makes component forms right-chiral, where otherwise, by default, forms are left-chiral. Bindingspec doesn't currently copy over syntax properties, and adding that would be necessary to support this case.
But a potentially better implementation of the chirality behavior is to use syntax parameters instead, so that, within a threading form, all forms are parameterized as left- or right-chiral, depending on the threading form, and this would be overridden by nested threading forms that would modulate the chirality in either direction. Bindingspec doesn't support syntax parameters yet, either, and it might be a more involved effort. This support may also be added.
Danny shared this paper as a good introduction to writing compilers:
An Incremental Approach to Compiler Construction
He's also using bindingspec's predecessor ee-lib
in the typed-nanopass
project.
- Support syntax properties and/or syntax parameters in bindingspec (see above), and resume the integration.
- Review what's necessary to be able to support a
define-qi-alias
(i.e. binding-space specific) form similar todefine-alias
(which usesmake-rename-transformer
). - Write down some example optimizations (pseudocode) that we'd like to see in the Qi compiler, without worrying about the implementation yet.
- Review CI failures resulting from exported Qi macros (e.g. this and this).
Benjamin, Danny, Figgy, Michael, Sid, Super
Home | Developer's Guide | Calendar | Events | Projects | Meeting Notes