Qi Meeting Dec 6 2024

Siddhartha Kasivajhula edited this page Dec 13, 2024 · 2 revisions

Pulling a Rabbit Out of a Hat

Summary

We merged the PR that completes integration of the "deep macro" mechanism for extending deforestation, and realized that we're, unexpectedly, just about ready to do a new release. We also discussed slow build times for Qi, next steps on deforestation, rabbit holes, and some ideas on making Racket small enough to "fit in your head," especially whether having just one kind of binding used at different phases in Racket (rather than two kinds of bindings) is feasible.

Background

Last week, we reviewed the PR integrating the "deep macro" extension approach into the compiler. We addressed some issues that we discovered during the meeting, and there were a few comments to be addressed afterwards.

Slow Build Times for qi-lib

Last time we'd noted that make build (our usual command to compile qi-lib) was recompiling all kinds of packages that we weren't expecting, including qi-doc, qi-sdk, and more. As a result, it was taking a long time (almost a minute?) -- much longer than it used to. make build invokes a Makefile target that simply runs raco setup --pkgs qi-lib, so we really should expect only qi-lib to be recompiled.

Sam TH pointed out on Discord that the cause was that the --pkgs flag causes raco to rebuild all packages in any collection contained in the provided packages, effectively doing a package → collection → file mapping and thus building any packages providing modules in the Qi collection. As there are more such packages nowadays than when we first started using make build, that would explain why it's taking longer.

Sid created an issue to record this for future improvement, but in the meantime, Bogdan suggested a way to do what we want for now: raco make -l qi. This is equivalent to raco make -l qi/main and ends up compiling Qi's main.rkt module and, as usual for raco make, all of its dependencies, transitively. We modified the make build target to use this command, and it now takes < 10 seconds!

We did notice, however, that tests now take longer to run, since formerly, make build would implicitly compile qi-test, but now, test modules are being recompiled on the fly before the tests can be run. We'll need to update the make test-* targets in a similar way.

And Now We ... Release?

Merging the PR

Having addressed the remaining code review comments from last time, we merged the PR into the integration branch.

We had been anticipating that, at this point, we would begin implementing deforested runtimes for the remaining standard list operations (specifically, those modeled on racket/list), and also start to define and expose an extension interface to users for custom list operations. But we realized that we are now unexpectedly in a position to release a new version of Qi, since the numerous behind-the-scenes changes to the architecture of deforestation are now complete, and there are no "real" backwards incompatibilities at this stage.

Major or Minor Version?

Although there are technically no backwards incompatibilities (except in terms of performance of host language functions in racket/list), we felt that a major release would be most appropriate, as this introduces Qi higher-order expressions in list operations and includes a fair bit of core rearchitecture. So we agreed that this release will be Qi 5.

Next Steps on Deforestation

Based on our discussion last time, we felt that the next step would be to eliminate the explicit syntax classes in the compiler that match the syntax by name (e.g. range, map or filter), and instead perhaps extend define-deforestable to allow specifying the parameterization (like f, state, and so on) to be used in the corresponding stream components in the deforested runtime. That is, operators like map and filter could be defined using define-deforestable by providing both (1) the naive Racket runtime or "codegen" defining the operator's semantics in case it does not get deforested (which define-deforestable already supports), and (2) a stream component to use, along with a set of parameters for it that yield the analogous semantics using the deforested runtime. Then, we can further abstract these into define-stream-producer, define-stream-transformer, etc., which would expand to define-deforestable.
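
To make the two halves concrete, here is a minimal sketch of the same pipeline written both ways: the naive runtime, and a fused pipeline built from a producer, two transformers, and a consumer. All of the names below (strm, range-producer, map-transformer, and so on) are hypothetical and chosen for illustration; this is not Qi's actual runtime or the define-deforestable interface, and in Qi the compiler would generate the fused form at compile time rather than having it assembled by hand at run time.

```racket
#lang racket/base
(require racket/list) ; only for `range` in the naive version

;; Hypothetical names throughout; this is not Qi's actual runtime.
;; A stream is a step function plus an initial state. Calling the step
;; function on a state returns #f (done), a `skipped` (no element this
;; step), or a `yielded` (one element), each carrying the next state.
(struct strm (next state))
(struct skipped (state))
(struct yielded (value state))

;; Producer: the integers lo, lo+1, ..., hi-1.
(define (range-producer lo hi)
  (strm (lambda (i) (and (< i hi) (yielded i (add1 i)))) lo))

;; Transformer: apply f to each yielded element.
(define (map-transformer f s)
  (define next (strm-next s))
  (strm (lambda (st)
          (define r (next st))
          (if (yielded? r)
              (yielded (f (yielded-value r)) (yielded-state r))
              r))
        (strm-state s)))

;; Transformer: turn elements that fail pred into skips.
(define (filter-transformer pred s)
  (define next (strm-next s))
  (strm (lambda (st)
          (define r (next st))
          (if (and (yielded? r) (not (pred (yielded-value r))))
              (skipped (yielded-state r))
              r))
        (strm-state s)))

;; Consumer: a left fold that drives the whole pipeline in one loop.
(define (foldl-consumer f init s)
  (define next (strm-next s))
  (let loop ([acc init] [st (strm-state s)])
    (define r (next st))
    (cond [(not r) acc]
          [(skipped? r) (loop acc (skipped-state r))]
          [else (loop (f (yielded-value r) acc) (yielded-state r))])))

(define (square x) (* x x))

;; (1) Naive runtime: builds an intermediate list at every stage.
(define naive
  (foldl + 0 (map square (filter odd? (range 0 10)))))

;; (2) Fused runtime: same result, no intermediate lists.
(define fused
  (foldl-consumer + 0
    (map-transformer square
      (filter-transformer odd? (range-producer 0 10)))))
```

Both naive and fused compute the sum of the squares of the odd numbers below 10. The point of pairing them per operator is that the compiler can fall back to (1) whenever a sequence isn't fusable, and use the parameters (here f, pred, and the state) to instantiate (2) when it is.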

Looks like we'll tackle this after the release!

Trajectories in Pantůček Space

A few years ago, Bogdan Popa introduced a very cool declarative GUI framework for Racket called gui-easy. It's so handy that it's even the default GUI framework in Rhombus today. There's really, really nothing wrong with it, and no reason at all to mess with it.

But just think, what if you could have all the same awesome features you're used to, but in an ASCII, VT-compatible terminal environment, so that you can relive the glory days of Borland C++? Wouldn't that be cool? No. No, it would not. Who would even imagine such a thing?

Yes, amazingly enough, there's only one person capable of this, and, you guessed it, it's Dominik. Having ventured down this very deep rabbit hole (called tui-easy, and fine, it can't be denied that it is really cool and retro!), he has his hands full and it looks like he won't be deforesting anything anytime soon. 😭

But despite the attractive pull of this new lagomorphic singularity, the prospect of a solstice Qi release has created enough anti-rabbity that he's going to re-run the suite of benchmarks on the release branch so we can verify that we don't have any performance regressions.

Maybe we also need to add a new benchmark for take, since that is a newly deforested operator that is going to be part of this release. 🤔

Bindings and Phases

As an aside, we also discussed Sid's crazy idea to do away with transformer bindings and replace them with ordinary bindings declared as required for-lang at the module level. (Sid is preparing a blog post on "small languages that fit in your head" for the Racket Advent calendar and was wondering whether to mention it.) We felt that the idea leaves a lot of questions unanswered, and it doesn't feel especially viable at this time.

One aspect is that even if we eliminate defining different kinds of bindings, it would seem that they are, nevertheless, present in modules when we require them, as macros may be used in a module alongside ordinary variables, and they are to be treated distinctly by the expander.

Michael pointed out how Racket's "phases" are best thought of as a self-reflective tower of languages (a fascinating and insightful way of looking at it!), so that phase 1 entails things that are useful for compiling the phase 0 language. Likewise, phase 2 contains resources for compiling the phase 1 language. Although these higher phases typically use something like racket/base, there's no reason they couldn't, in principle, use entirely different languages like Scala or Haskell, if we happen to use those languages in compiling the phase 1 or phase 0 languages. These languages themselves may likewise be treated as phase 0, and may themselves involve phases definable in relation to them. How would the proposal model these?

Potentially: in the modules that are required for-lang, all the syntax → syntax functions would be treated as macros to be used in compiling the original module. On the other hand, all the definitions within the required module would be available as "compile time" or "phase 1" bindings, that is, for use by the "macros" (i.e. syntax → syntax functions) defined in that required module at their phase 0. This could potentially be well modeled by using submodules required at phase 0 within the macro module, but which are not provided outside the module, so that the original module requiring the macro module would only get the syntax → syntax functions, which would be applied to the module source at compile time by virtue of the semantics of for-lang. Said another way, whether something is a "macro" is a property of expansion, not of the definitions themselves.

The proposal doesn't eliminate phases, but rather, suggests a reorganization of modules along lines that more precisely reflect their use. Arguably, the fact that phase 1 and phase 0 languages are evaluated – by definition – at different times, and even could be totally different languages, is a sign that there really shouldn't be any overlap in their dependencies, or rather, that modules could be profitably organized so as to avoid such overlap.

In the common case where a module provides more than one kind of binding (for example, both functions and macros), the proposal would favor splitting the module into two modules, one of which would be required in the ordinary way and the other required for-lang. Both modules would define only ordinary functions, except that one of them would consist solely of syntax → syntax functions.

Likewise, in the module that requires other modules both ordinarily as well as for-lang, we could consider that there is only one kind of binding here, but existing at two different phases, and with for-lang entailing a directive that certain bindings are to be used at phase 1 in compiling the source module.
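
The for-lang form is hypothetical and doesn't exist in Racket, but the proposed module split can be roughly approximated today to get a feel for it. In the sketch below, both submodules contain only ordinary define forms; one holds plain functions and the other holds a syntax → syntax function. The client requires the first ordinarily and the second for-syntax, and still needs a define-syntax to turn the function into a macro, which is the step that for-lang would presumably make implicit. The module and function names are made up for illustration.

```racket
#lang racket/base
(require (for-syntax racket/base))

;; Module 1: ordinary functions, to be required in the ordinary way.
(module functions racket/base
  (provide subtract)
  (define (subtract a b) (- a b)))

;; Module 2: consists solely of syntax -> syntax functions, also
;; defined with plain `define`.
(module transformers racket/base
  (provide flip-args)
  ;; Rewrites (name f a b) to (f b a), reusing only the caller's syntax.
  (define (flip-args stx)
    (define parts (syntax->list stx))
    (datum->syntax stx (list (cadr parts) (cadddr parts) (caddr parts)))))

(require (submod "." functions))                 ; used at phase 0
(require (for-syntax (submod "." transformers))) ; stand-in for `for-lang`

;; Today this extra step is needed to use the function as a macro.
(define-syntax flip flip-args)

(define result (flip subtract 1 10)) ; expands to (subtract 10 1)
```

Here result is 9. Note that flip-args only rearranges syntax supplied by the caller; a transformer module that introduces its own identifiers in templates would additionally need them bound at the right relative phase (e.g. via for-template), which is one of the details the proposal would have to account for.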

There are also cases where we may seek to export materials that are relevant to different phases in connection with the same language. For example, Syntax Spec provides identifiers at multiple phases. For instance, the floe nonterminal might be exported as a phase 1 identifier, since it's information that's relevant for compiling the phase 0 language, Qi, but isn't itself part of Qi or relevant for writing Qi.

The proposal would imply in this case that we should not export bindings at multiple phases, and instead, once again, decompose our modules so that they are each actually used by other modules only at specific phases.

Some time ago when we first discussed this, Ben raised a number of questions about it, including:

  1. How does it model hygiene?
  2. How does it achieve delayed evaluation in cases like if and other conditionals?
  3. Why disallow define and define-syntax (i.e. the two kinds of bindings) in the same module?

It's likely that the first two are achieved by virtue of the fact that the syntax → syntax functions are applied at phase 1 by the for-lang directive, and the expander should be able to implement hygiene and delayed evaluation in the same manner it does today (however that may be).
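
For reference, this is what the two properties in question look like with macros as they exist in Racket today; any unified-binding design would need to preserve this behavior. The helper names (note!, if-fn, if-macro, swap!) are just for illustration.

```racket
#lang racket/base

;; Record which expressions actually get evaluated.
(define evaluated '())
(define (note! tag)
  (set! evaluated (cons tag evaluated))
  tag)

;; Delayed evaluation: a function evaluates all of its arguments before
;; it is called, so both branches run...
(define (if-fn c t e) (if c t e))
;; ...whereas a macro places the branch expressions inside a real `if`,
;; so only the chosen branch runs.
(define-syntax-rule (if-macro c t e) (if c t e))

(define fn-result (if-fn #t (note! 'fn-then) (note! 'fn-else)))
(define macro-result (if-macro #t (note! 'macro-then) (note! 'macro-else)))
;; `evaluated` now contains fn-then, fn-else and macro-then,
;; but not macro-else.

;; Hygiene: the macro's internal `tmp` does not capture the user's `tmp`.
(define-syntax-rule (swap! a b)
  (let ([tmp a])
    (set! a b)
    (set! b tmp)))

(define tmp 1)
(define other 2)
(swap! tmp other) ; tmp is now 2 and other is now 1
```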

Re: the third, Sid feels that having one kind of thing used in two different ways is clearer than having two different kinds of things. This seems especially so in cases where we use define-syntax to bind a value that isn't meant to be used as a macro, and is then retrieved during expansion (phase 1) using a special syntax-local-value mechanism. Couldn't we find a way to do this by just using ordinary bindings in the appropriate phases?
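
As a small illustration of the pattern being referred to, the sketch below uses define-syntax to bind plain compile-time data rather than a macro, and then retrieves it during expansion with syntax-local-value. The names settings and lookup are hypothetical.

```racket
#lang racket/base
(require (for-syntax racket/base))

;; A transformer binding whose value is plain data, not a procedure.
;; Using `settings` as an expression at phase 0 would be a syntax error.
(define-syntax settings (hash 'greeting "hello"))

;; A macro that retrieves that data at expansion time (phase 1).
(define-syntax (lookup stx)
  (syntax-case stx ()
    [(_ table key)
     (let ([tbl (syntax-local-value #'table)])
       (datum->syntax stx (hash-ref tbl (syntax-e #'key))))]))

(define greeting (lookup settings greeting)) ; expands to the literal "hello"
```

Under the proposal, the question is whether settings could instead be an ordinary definition in a module that is only ever used at phase 1, with lookup referring to it directly.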

In any case, these many considerations will need to be satisfactorily treated before the proposal to unify bindings could be viable.

Incidentally, Sid is interested in this in pursuit of "languages that fit in your head": any useful steps in that direction could make the incredible power of the Racket platform more broadly accessible.

One article that came up as possibly being related to this discussion is Matt Might's article on First Class Runtime Macros.

Next Steps

(Some of these are carried over from last time)

  • Update make test-* Makefile targets.
  • Update documentation for the Qi 5 release.
  • Write any new benchmarks needed, e.g. for take.
  • Mark the release branch as "ready for review" and tag reviewers.
  • Release Qi 5!
  • Announce the release.
  • Implement more fusable stream components like drop, append, and member.
  • Define the define-producer, define-transformer, and define-consumer interface for extending deforestation, and re-implement existing operations using it.
  • Ensure (eventually) that both codegen and deforested runtimes are included in the info struct and not in the IR.
  • Investigate why make build is compiling a lot of packages instead of just qi-lib.
  • Update the benchmarking infrastructure (vlibench and also the basic ones in qi-sdk).
  • Resolve the issue with bindings being prematurely evaluated.
  • Fix the bug in using bindings in deforestable forms like range.
  • Finalize and document guiding principles for Git commits in the developer wiki.
  • Revisit extending deforestation to multi-valued settings.
  • Write a proof-of-concept compiling Qi to another backend (such as threads or futures), and document the recipe for doing this.
  • Ask Ben to review tests from the perspective of Frosthaven Manager.
  • Review Cover's methodology for checking coverage.
  • Document the release model in the user docs and announce the performance regression (and the remedy) to users.
  • Improve unit testing infrastructure for deforestation.
  • Discuss and work out Qi's theory of effects and merge the corresponding PR.
  • Decide on appropriate reference implementations to use for comparison in the new benchmarks report and add them.
  • Decide on whether there will be any deforestation in the Qi core, upon (require qi) (without (require qi/list)).
  • Continue investigating options to preserve or synthesize the appropriate source syntax through expansion for blame purposes.

Attendees

Dominik, Michael, Sid
