Skip to content

Qi Meeting July 12 2024

Siddhartha Kasivajhula edited this page Jul 24, 2024 · 6 revisions

Commit Issues

Qi Meeting July 12 2024

Adjacent meetings: Previous | Up | Next [None]

Summary

We discussed what good Git practices might look like for the Qi repo, and also discussed plans for organizing development around the deforestation work.

Background

We're still working on providing deforested list APIs. Ben posted a truly gargantuan code review on Dominik's PR including many substantive comments but also meta comments regarding development and specifically Git practices. It brought up the question of what good commit practices look like, and which practices would be best for us to follow.

Sid started refactoring deforestation to operate via the intermediary of an explicit core language form, #%deforestable.

The Space of Git Practices

Different projects and organizations use different development practices and have different cultures and attitudes towards commits, commit messages, branching, pull requests, merges and rebases, and so on. Which conventions are best for us? What are the tradeoffs?

The goal of this meeting was to discuss these, and the goal of these notes is to start to synthesize the discussions into guiding principles -- not hard rules. If we know what we are going for and why, it would help us make informed choices in this regard during development.

The Parameters of the Space

Git practices could be quantified as a space defined by the following parameters, some of which are continuous and others of which are discrete or binary. These parameters can serve as the basis for guiding principles.

Small vs Large

Just as modules in a codebase could be large or small -- even very small -- and still be cohesive and sensible, so too can Git commits be small or large, and still be cohesive. So this parameter is perhaps irrelevant to determining whether a commit is good or bad, aside from a practical consideration we'll soon get to.

Consistent vs Inconsistent

Consistency is a semantic property of the code, which we could reify as, "do the tests pass?"

If a commit causes the tests to break, then that commit is inconsistent, otherwise, if the tests are passing, the commit is consistent.

Consistency of every commit is a high bar, and it may not always be convenient to achieve, but it does provide value:

  • forensic utility in debugging issues -- e.g. supporting the ability to "bisect" to identify commits that introduced bugs.
  • granular flexibility in depending on literally any version of the code. For instance, if a feature is introduced in commit X and improved until the introduction of another feature Y, and if Y has a bug, then we could reliably rely on commit Y-1 instead of searching for a version between X and Y that is consistent.

These are worth considering as we continue to discuss the tradeoffs.

Principle: Commits should be consistent.

Strategy: Before committing, run tests and ensure they pass.

Though useful in principle, in practice, consistent commits seem to be much harder to achieve than "cohesive" ones (discussed below). It would be useful to see if there are any strategies we could adopt to make achieving consistency more convenient.

Cohesive vs Hodgepodge

A cohesive commit is one where all of the changes are guided by the same goal -- the question "Why?" has a single answer. A commit including unrelated changesets that happen to have been done concurrently is a hodgepodge.

Principle: Commits should be cohesive.

Strategy: Use a tool like Magit to compose your commit by staging individual changes that are related to one another.

A cohesive changeset is not necessarily consistent, nor is the reverse true.

For example, a commit could include a bugfix as well as broad formatting improvements in unrelated modules. This is not a cohesive commit, but if all the tests pass then it is consistent. Conversely, a commit could refactor a module into submodules - a cohesive change. But it may still exhibit a compilation error or cause tests to fail and in that case would not be consistent.

Returning to "small vs large" for a moment, consider a set of linting changes that simply improve formatting, variable naming, etc. There could be a dozen individual changes involved. Each of these changes on its own would constitute a "cohesive" commit. In such cases, though, it would be better to commit them all at once since in this example, the composed set of changes itself is cohesive.

Principle: Small, cohesive changesets should be combined into one commit if their union is cohesive.

Example: Lint.

Complete vs Partial

A complete commit is not only cohesive but includes everything entailed in the concept being introduced into the codebase. For instance, adding a new feature and adding tests for that feature could be two separate cohesive commits, but neither of these commits on their own would be "complete." A commit combining these two changesets would be.

Principle: Commits should be complete.

In practice, we do not always write tests at the same time as the code we change, for many reasons. Once again, completeness seems much harder to achieve in practice than cohesiveness. It's impossible to always anticipate all the changes that are implied by a changeset.

Normalized vs Denormalized

A modern convention for commit messages employs the following schema:

  • The first line is a short summary, ≤ 50 characters

Sometimes, this summary is sufficient to describe the changes clearly and effectively. But if not, then for the rest of the commit message:

  • There must be an empty line after the summary line
  • The remaining message should describe the changes and considerations in more detail, and consisting of paragraphs with each line up to 72 characters long

The advantage of following this convention is that Git tools and web UIs recognize it and will format it in the appropriate way to improve readability.

But the argument could be made that such detailed descriptions should be included in the documentation, or in comments in the source code, and that commit messages need only summarize the change very briefly. In addition, the pull request description could include more of the rationale. Thus, the full context on the set of changes is "normalized" in the sense of databases -- that is, each piece of information is contained in just one place, with there being a paper trail that could be followed to fully reconstruct it.

Yet, commit messages, source comments, PR descriptions, and documentation are used in different ways and in different settings. When we read documentation, we are trying to understand the behavior as it is now. When we read source comments, we are in the middle of modifying that code or are trying to learn more about it. When we look at commit messages, we are trying to gain narrative context of how things got to be the way they are, or are trying to debug a problem that was introduced at some point in the past. We typically only look at PR descriptions during the process of code review -- additionally, they are specific to the repo hosting provider (e.g. GitHub) and are not portable and self-contained within the repo.

As these are all fairly different use cases, for the same reason as with databases, it wouldn't hurt to "denormalize" the context across all of these different categories of things. At the same time, we really don't want to spend too much time on any particular thing.

Principle: Commit messages should be brief but self-contained (denormalized).

Strategy: Instead of treating the commit message as a quick label, just spend a moment thinking about why you're making this commit, and write that down -- your in-the-moment thoughts may be interesting to read, and are not captured anywhere else!

This ensures that commit messages tell a complete, if abbreviated, story, without being too onerous to write. The 80-20 rule applies here -- we can get a commit message that's 80% as good as it could be with only 20% of the effort it would take to get a perfect message. We can certainly strive for "good enough" here. Following the schema above is becoming widely accepted, and it doesn't take any additional time once we are used to it, so we might as well adhere to it.

Short timeframe vs long timeframe

Looking at any codebase, we could start afresh in a new repo and commit the entire original codebase as a single commit. This commit would meet the requirements of cohesiveness, consistency and completeness described above.

But a single commit capturing the entire history of changes is not useful, and defeats the purpose of having version control in the first place.

On the other extreme, we could have many commits over a small timespan of changes. There could be a dozen consistent and complete commits in just an hour of work. We could likewise squash these commits -- in other words compose them -- into a single commit, and in this case it seems more acceptable than to do it for years of work. Similarly, in the pursuit of consistency and completeness we may be tempted to reorder and "fixup" commits that are non-contiguous. Once again, it seems more acceptable to do this for commits made over the span of an hour than over the span of years.

Yet, the timescale is in some sense arbitrary. Where do we draw the line for when it is appropriate to squash and reorder commits? One position could be that it is acceptable to do so within a pull request that has a single author (i.e. individual PRs, but not integration branches, nor the main branch). It could be argued that squash-merging is just a convenient -- and arguably overly simplistic -- boundary point in the spectrum of possibilities for representing the PR's changes on the main branch -- thus, fixing up commits within the PR is just a more nuanced version of this. On the other hand, retroactively curating commits is extra work that may not feel worthwhile. It's easier to squash-merge and just be done with it.

Principle: Reorder and fixup commits within PRs if it is trivial to do.

Example: Renaming a variable in one commit, followed by a second commit much later doing the same thing in a function that was left out of the change initially. Here, the commits could be reordered and combined.

"Death of the Author"

Like the "death of the author" theory used in literary analysis, sometimes, even a good commit message may not express everything that was really going on and all of the real rationale guiding the author in making changes. It's easy to forget the complexity of the journey in retrospect, and there may be valuable forensic data that is lost in a simplistic squash-merge, even if the commit message satisfies review standards of being "good." So there is value in preserving the causal sequence of events, however messy it may be.

Of course, this complexity can be modeled by Git branches. The full set of commits remains on the feature branch, while squashed commits go on the main branch. But even this is inadequate as it means that the forensic data is not self-contained in the project repo, and even if it were, it would be awkward and inconvenient to reconstruct the causal sequence from these myriad branches.

That is, preserving the causal sequence is valuable regardless of the quality of commit messages. But messiness of the commit history can still incur a greater cost in developer time despite this value, so there is a balance to be struck.

Principle: Avoid squashing large sets of changes.

As we have also the principle of reordering commits only when it is trivial to do, with a little care it would be best to avoid such issues arising in the first place, by making a best effort being proactive about representing the causal sequence in commits that strive to meet the principles outlined here, at the same time without being overly pedantic.

In Sum: Getting Commits Right

Yes, paying attention to commits and commit messages is extra work, but it doesn't have to be a rabbit hole. By careful application of the 80-20 rule and guided by the above principles (and any others we discern), we could gain worthwhile improvements in code and repo quality (and the benefits that brings, such as a better debugging experience, more structure and help for understanding code changes, greater likelihood of community participation, etc.) for a modest price.

Qi/List

Case Study: The Need for an Integration Branch

We agreed that the deforestation work needs an integration branch as merging it in its current form would introduce backwards incompatibility when we introduce the #%deforestable core form and, along with it, Qi (rather than Racket) syntax for list operations.

This is unwittingly a good case study for some of our Git discussions, since in principle, we could have introduced the #%deforestable core form prior to rearchitecting the deforestation pipeline and introducing deforestation of take and other list operations, and this would have steered clear of introducing backwards incompatibility and wouldn't have needed an integration branch.

Yet, in terms of how things actually happened, the deforestation work happened first, and the #%deforestable form is being introduced as a refactor of the modified architecture, indeed, taking advantage of the improved architecture. Sid could rewrite his changes against the main branch, and then Dominik could rewrite his changes against the new changeset merged into main. It would be a lot of work to get to the same place, but we would have a less messy version of history as a result. Would it be worth the effort?

With regard to rewriting history, a line needs to be drawn somewhere. But as we saw earlier, there are cases where it is worth doing.

Core Deforestation

We discussed that core list operations like filter and map should be in qi, not in qi/list. This resolves some awkwardness with racket/base vs racket/list.

Benchmarking with and without deforestation

We also discussed that we will need to use something other than the require-time mechanism to turn deforestation on and off in benchmarking, as the deforestation pass is being moved into the core (and will not be enabled as a require-time side effect of (require qi/list)).

Do we need a Racket/List Clone with Deforestation?

After discussion, we agreed this would be too many options. There's never a case where we would want to use both racket/list and qi/list. Therefore, we should only provide one.

Next Steps

(Some of these are carried over from last time)

  • Finalize and document guiding principles for Git commits in the developer wiki
  • Start an integration branch for the deforestation work
  • Revisit extending deforestation to multi-valued settings.
  • Write a proof-of-concept compiling Qi to another backend (such as threads or futures), and document the recipe for doing this.
  • Ask Ben to review tests from the perspective of Frosthaven Manager.
  • Review Cover's methodology for checking coverage.
  • Document the release model in the user docs and announce the performance regression (and the remedy) to users.
  • Improve unit testing infrastructure for deforestation.
  • Discuss and work out Qi's theory of effects and merge the corresponding PR.
  • Decide on appropriate reference implementations to use for comparison in the new benchmarks report and add them.
  • Deforest other racket/list APIs via qi/list
  • Decide on whether there will be any deforestation in the Qi core, upon (require qi) (without (require qi/list))
  • Continue investigating options to preserve or synthesize the appropriate source syntax through expansion for blame purposes.

Attendees

Dominik, Michael, Sid

Clone this wiki locally