Qi Meeting Nov 29 2024

Playing Like a Grandmaster

Qi Meeting Nov 29 2024

Adjacent meetings: Previous | Up | Next [None]

Summary

We reviewed the in-progress PR integrating the proof-of-concept "deep macro" approach to extending deforestation. We discussed some concerns that we were combining two alternative, yet incompatible, approaches to the design, and settled on moving towards one of them. We also reviewed the proposed stream abstraction that would encapsulate deforestable operations, and how it provides the basis for extension of deforestation, and discussed a few other minor issues.

Background

The Qi 4 release added deforestation to the Qi compiler, greatly increasing the performance of standard list-oriented operations like map, filter, and foldl. While this achieved the performance we sought in the short term, we were uneasy about some aspects of the implementation as a long term investment, as it operated by matching literal uses of host language forms like map, filter and the folds. We wanted a clean separation between Qi and the host language to simplify future designs and trivially avoid conflating the respective invariants of the two languages.

We addressed this by adding a new core language form, #%deforestable, and using this as the target for a suite of list-oriented Qi macros modeled after racket/list, with the main difference being that the higher-order positions would be Qi flows. This allowed deforestation to be done purely within the Qi core language, formally identifying it as the scope of compiler optimizations, hopefully making our lives easier in the long run, but it required an up-front investment to set up.

Alongside this effort, Dominik started to modify the compiler architecture, and specifically the deforestation architecture, so that it would abstract the incidental aspects of stream components into the broad categories of producers, transformers, and consumers, laying the groundwork for future extension. He also started implementing deforested runtimes for individual operations modeled after those in racket/list to help reveal the right extension interface for this core runtime.

But we still didn't have a proper way to use the new #%deforestable form to extend these abstracted deforested operations.

A few weeks ago, Michael provided a proof of concept to achieve this – that is, a means of extending deforestation to custom list operations via the newly introduced #%deforestable core language form. We iterated on this until we got it to work with syntax and semantics that we wanted, and Sid recently started a PR to integrate the standalone POC into the actual compiler.

Incompatible Designs

Last week, Dominik and Sid reviewed some failing compiler tests and resolved them by adding an identifier to the syntax of the define-deforestable form that would allow the compiler to know which deforestable form it was seeing, e.g. map or filter. This fit into the existing pattern matching machinery in the compiler so that the new map and filter forms would be recognized and deforested, as before.

But upon reviewing the change, Michael felt that we were now using two alternative approaches that could make sense on their own, but were mutually incompatible at least in the sense that there's no good reason to have them both present.

For example, we propose to define filter this way:

(define-deforestable
  (filter [f f])
  #'(λ (vs)
      (r:filter f vs)))

And that "deep macro" definition expands to something like:

(begin
  (define-syntax filter
    (lambda (stx)
      #'(#%deforestable ...)))

  (define-for-syntax
    filter-codegen
    (lambda (stx)
      (with-syntax ([f ...])
        #'(λ (vs)
            (r:filter f vs)))))

  (define-syntax filter-info
    (deforestable-info
      filter-codegen)))

… which encodes both the expansion to core Qi (i.e. to #%deforestable) as well as the Racket runtime that encodes the semantics of filter. It is in this sense that it is "deep."

For now, the exact expansion to #%deforestable is left unspecified above. But first, to reiterate the requirements, there are two independent semantics that need to be encoded in this core form:

The Racket runtime or "codegen", that is, the basic semantics of the form as defined in terms of Racket that will be used if the form is not optimized in any way. For instance, for the filter form, a simple Racket implementation of the filter function accepting a list argument.
The "deforested" runtime. This is the implementation that will be used if this flow can be composed with adjacent flows to produce a single fused stream that computes the entire result instead of doing it in several complete steps.

There are (at least) two ways we could have the #%deforestable core form express these distinct semantics.

Encode the runtimes in the IR
Encode the runtimes in a compile-time datatype

Encoding the Runtimes in the IR

With just the codegen, the expansion of (filter odd?) might look like this:

(#%deforestable [f (esc (#%host-expression odd?))]
                (λ (f)
                  (λ (vs)
                    (r:filter f vs))))

In order to add the deforested runtime to it as well, the outline proposal from last week was to expand the definition to include the necessary additional information:

(define-transformer
  (filter [f f])
  #'(λ (vs)
      (r:filter f vs))
  ; provide f, next, and state:
  #'(f filter-cstream-next ()))

… which might expand to:

(#%deforestable [f (esc (#%host-expression odd?))]
                (transformer f filter-cstream-next ())
                (λ (f)
                  (λ (vs)
                    (r:filter f vs))))

… where transformer is a new datatype exposed by the compiler for the purposes of extending deforestation, and filter-cstream-next would need to be defined by the macro author.

This approach has the drawback that it would bloat the size of the expanded code.

Encoding the Runtimes in a Compile-Time Datatype

This is of course the approach we have been developing of late.

The expansion of (filter odd?) looks like this (already):

(#%deforestable #'filter-info [f (esc (#%host-expression odd?))])

Since we already have a compile-time datatype attached to the #'filter-info syntax object that conveys the codegen, we might as well just reuse that by adding an extra field to hold the deforested runtime, so that the above expansion would remain unchanged after this addition.

Mixing It Up

But our modification from last week meant that the form expands to:

(#%deforestable filter #'filter-info [f (esc (#%host-expression odd?))])

Although the only difference is the addition of the filter identifier, conceptually, what we're doing here is denoting, by this identifier, the parametrized deforested runtime that already happens to be present in the compiler. The way in which it is present in the compiler is in terms of syntax classes that match the specific names of these forms in the input syntax and themselves contain the necessary attributes that parametrize the underlying stream implementation in the compiler. For instance, here is the part of the fst (stands for "fusable stream transformer") class that encodes the parameters for the built-in filter stream implementation:

  (define-syntax-class fst
    #:attributes (next f state)
    (pattern filter:fst-filter
             #:attr f #'(filter.f)
             #:attr next #'filter-cstream-next
             #:attr state #'())
    …

This sets the attributes f, next, and state that parametrize the underlying filter-cstream-next stream implementation, which is also defined internally in the compiler. In other words, the built-in deforested runtime is correctly used by virtue of simply matching the name filter provided by the macro author. If we really wanted to go down this road, we might as well match the same filter identifier to use an r:filter runtime included in the compiler for the codegen part, too.

Effectively, we are using both of the above approaches – we're using the compile-time datatype to encode the codegen, and the IR syntax to encode the deforested runtime. When we consider that the purpose of the (any) runtime is to define the semantics of the form, it seems questionable to have two different and uncoupled ways of doing it.

Since the compile-time datatype approach has the advantage of generating less code, and of keeping implementation details neatly encapsulated and not conflated with other more relevant aspects of core language syntax (such as floe and expr positions that will need to be handled distinctly by the expander), we reaffirmed that we should get fully behind that approach and use the same mechanism to encode both runtimes.

At the same time, we also agreed that the addition of the identifier to fit into the existing pattern matching infrastructure was a reasonable first step.

When we fully transition to the compile-time datatype approach to defining filter, it would look something like:

;; The runtime piece
(define-inline ((filter-cstream-next f) next ctx src)
  (λ (done skip yield)
    (next done
          skip
          (λ (value state)
            (if (f value)
                (yield value state)
                (skip state))))))

(define-transformer (filter [f f])
  #'(λ (vs)
      (r:filter f vs))
  ;; The code generator for calling the CPS function
  #'(filter-cstream-next f)
  ;; The intial state of the transformer
  #'())

… which expands to:

(begin
  (define-for-syntax filter-codegen ...)
  (define-for-syntax transformer-codegen
    (lambda (stx)
      (with-syntax ([f ...])
        #'(filter-cstream-next f))))

  (define-syntax filter-info
    (deforestable-info
      filter-codegen
      (transformer-info
       transformer-codegen
       #'())))

  (#%deforestable #'filter-info [f (esc (#%host-expression odd?))]))

The Stream Abstraction

Our working assumption is that the stream abstraction, together with a small number of categories of stream components, including:

producers
transformers
and consumers

… is capable of expressing all list operations of interest. Thus, by having a core stream runtime in terms of these categories, parameterized by certain attributes (e.g. the operating function f, next, and state), we can support extension to custom operations just by allowing the user to specify a stream component and a parametrization.

Dominik abstracted the core deforestation implementation some time ago to operate in terms of these categories and parametrizations, and to apply optimizations based on sequences of these abstract components (e.g. producer → transformer* → consumer), rather than anything more specific such as the names of the forms (e.g. range → filter → foldl). So the groundwork for extension is already in place, and though our recent discussions have tended to elaborate more long term considerations, we seem to be connecting the dots now between what we've already built and what we hope to build.

Our existing implementation handles many special cases, including designating stream transformers that cannot be in the leading position due to their handling of multiple values (which is currently not supported by the stream runtime). Since we're assuming exactly one list input in the initial implementation of qi/list, we can simplify some of these patterns. We should also be able to have fewer layers of syntax classes for producers, transformers, and consumers (maybe just one layer?), as some of the initial layers of matching are intended to sort syntax into the stream categories, but we would now be able to get those categories explicitly from the macro author.

A Surprise in Migrating to `qi/list`

While we were trying things out, we noticed that currently, this code:

#lang racket

(require qi)

(~> ('(1 2 3)) (filter odd?))

… results in the error:

filter: contract violation
  expected: (any/c . -> . any/c)
  given: '(1 2 3)

Formerly, this would not have been surprising as (filter odd?) is partial application of the Racket function filter, and so, of course, we should be using ~>> instead of ~> to pass the list argument in the right position. But with the introduction of qi/list, the expectation may be reversed, where if we usually use qi/list but forget to require it, we might be surprised by this error as Qi's filter is a syntactic form and invariant to threading direction.

To address this, we should do one or both of:

Include filter in (require qi) (analogous to racket/base) so that the above is not a syntax error.
Clearly signpost the difference between Racket filter and Qi filter and when each is in scope.

Where We're Going and How to Get There

At this point, we are all in agreement about where we're going and what we hope or wish for with deforestation in Qi, and have (more or less) realistic expectations of the challenges that lie ahead. It's impressive how much progress we've made in recent weeks on discerning – chess grandmaster style! - some aspects of the ideal extension approach and implementation that we would like to see in the long term, though we may still be weeks or months away from that.

At the same time, we felt we should be careful to balance ambition with firmly planting one foot in front of the other and just making moves, accepting short term kludges to help us "continuously deform" our current implementation to the ideal one while iteratively discovering aspects of that ideal implementation not foreseen in our calculations. Given that we are using an integration branch for this work and are not merging directly to the main branch in this highly experimental and exploratory work, we should feel free to take the most expedient approach to test our assumptions and try whatever we need to in order to get to our goals without necessarily having all the answers already.

The Return of Prof. Michael

Prof. Michael will be teaching a course on DSLs next semester, starting with functional DSLs like pict and Haskell-derived DSLs, moving on to macros and shallow embeddings and exploring changing representations supporting the same syntax, followed by holistic treatment of hosted DSLs including optimizing compilation. It promises to be a very interesting and modern programming languages class, but as this is his first time teaching a brand new course from scratch, Michael is adamant on not livestreaming or otherwise recording the lectures so that he can make his mistakes in peace. Hopefully, we will at least have some weekly updates that we can share here!

Dangerous Rabbit Holes

Dominik is in the orbit of many rabbit holes, careening dangerously from writing S-expression parsers to make working on weird data formats more palatable to "flow-oriented programming" with actual water and channels, to implementing continuations instead of a call stack in assembly instruction sets, and others too terrible to list (none of them involve deforestation). Will our hero emerge from the rabbit holes as he always has before, or is he doomed to fall in and emerge in another dimension? Find out next week, same bat time, same bat channel!

Next Steps

(Some of these are carried over from last time)

Address remaining code review comments and merge the PR.
Implement more fusable stream components like drop, append, and member.
Define the define-producer, define-transformer, and define-consumer interface for extending deforestation, and re-implement existing operations using it.
Ensure (eventually) that both codegen and deforested runtimes are included in the info struct and not in the IR.
Investigate why make build is compiling a lot of packages instead of just qi-lib
Update the benchmarking infrastructure (vlibench and also the basic ones in qi-sdk)
Resolve the issue with bindings being prematurely evaluated.
Fix the bug in using bindings in deforestable forms like range
Finalize and document guiding principles for Git commits in the developer wiki
Revisit extending deforestation to multi-valued settings.
Write a proof-of-concept compiling Qi to another backend (such as threads or futures), and document the recipe for doing this.
Ask Ben to review tests from the perspective of Frosthaven Manager.
Review Cover's methodology for checking coverage.
Document the release model in the user docs and announce the performance regression (and the remedy) to users.
Improve unit testing infrastructure for deforestation.
Discuss and work out Qi's theory of effects and merge the corresponding PR.
Decide on appropriate reference implementations to use for comparison in the new benchmarks report and add them.
Decide on whether there will be any deforestation in the Qi core, upon (require qi) (without (require qi/list))
Continue investigating options to preserve or synthesize the appropriate source syntax through expansion for blame purposes.

Attendees

Dominik, Michael, Sid

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Qi Meeting Nov 29 2024

Playing Like a Grandmaster

Summary

Background

Incompatible Designs

Encoding the Runtimes in the IR

Encoding the Runtimes in a Compile-Time Datatype

Mixing It Up

The Stream Abstraction

A Surprise in Migrating to `qi/list`

Where We're Going and How to Get There

The Return of Prof. Michael

Dangerous Rabbit Holes

Next Steps

Attendees

Clone this wiki locally

Qi Meeting Nov 29 2024

Playing Like a Grandmaster

Summary

Background

Incompatible Designs

Encoding the Runtimes in the IR

Encoding the Runtimes in a Compile-Time Datatype

Mixing It Up

The Stream Abstraction

A Surprise in Migrating to qi/list

Where We're Going and How to Get There

The Return of Prof. Michael

Dangerous Rabbit Holes

Next Steps

Attendees

Clone this wiki locally

A Surprise in Migrating to `qi/list`