How to Create and Review PRs

Overview

Blockchain software development requires a much higher degree of rigor than most other kinds of software. This is because with blockchains, there is no roll-back. There is essentially zero room for consensus bugs. If you ship a consensus bug, that bug could not only have catastrophic consequences for users (i.e. they lose all their money), but also be intractable to fix, mitigate, or remove. This is because unlike nearly every other kind of networked software, the state of the blockchain is what the users' computers say it is. If you want to make changes, you must get user buy-in, and this is necessarily time-consuming and not at all guaranteed to succeed.

Therefore, making changes to the codebase is necessarily a review-intensive process. No one wants bugs, but no one can afford consensus bugs. This page describes how to make and review non-consensus changes. The process for consensus changes includes not only the entirety of this document, but also the SIP process.

A good PR review sets both the submitter and reviewers up for success. It minimizes the time required by both parties to get the code into an acceptable state, without sacrificing quality or safety. Unlike most other software development practices, safety is the primary concern. A PR can and will be delayed or closed if there is any concern that it will lead to unintended consensus-breaking changes.

This document describes some best practices on how to create and review PRs in this context. The target audience is people who have commit access to this repository (reviewers), and people who open PRs (submitters). This is a living document -- developers can and should document their own additional guidelines here.

This document is formatted like a checklist. Each paragraph is one goal or action item that the reviewer and/or submitter must complete. The key take-away from each paragraph is bolded.

Reviewer Expectations

The overall task of a reviewer is to create an acceptance plan for the submitter. This is simply the list of things that the submitter must do in order for the PR to be merged. The acceptance plan should be coherent, cohesive, succinct, and complete enough that the reviewer will understand exactly what they need to do to make the PR worthy of merging, without further reviews. The lack of ambiguity is the most important trait of an acceptance plan.

Reviewers should strive to complete the review in one round. The reviewer should provide enough detail to the submitter that the submitter can make all of the requested changes without further supervision. Whenever possible, the reviewer should provide all of these details publicly as comments, so that other reviewers can vet them as well.

Reviewers should aim to perform a reviewer in one sitting whenever possible. This enables a reviewer to time-box their review, and ensures that by the time they finish studying the patch, they have a complete understanding of what the PR does in their head. This, in turn, sets them up for success when writing up the acceptance plan. It also enables reviewers to mark time for it on their calendars, which helps everyone else develop reasonable expectations as to when things will be done.

Code reviews should be timely. A PR review should begin no more than 2 business days after the PR is submitted. The develop and next branches in particular often change quickly, so letting a PR languish only creates more merge work for the submitter. If a review cannot be begun within 2 business days, then the reviewers should tell the submitter when they can begin. This gives the reviewer the opportunity to keep working on the PR (if needed) or even withdraw and resubmit it.

Submitter Expectations

Everyone is busy all the time with a host of different tasks. Consequently, a PR's size and scope should be constrained so that a review can be written for it no more than 2 hours. This time block starts when the reviewer opens the patch, and ends when the reviewer hits the "submit review" button. If it takes more than 2 hours, then the PR should be broken into multiple PRs unless the reviewers agree to spend more time on it. A PR can be rejected if the reviewers believe they will need longer than this.

The size and scale of a PR depend on the reviewers' abilities to process the change. Different reviewers and submitters have different levels of familiarity with the codebase. Moreover, everyone has a different schedule -- sometimes, some people are more busy than others. A successful PR submitter takes the reviewers' familiarity and availability into account when crafting the PR, even going to far as to ask in advance if a particular person could be available for review.

Submission Checklist

A PR submission's text should answer the following questions for reviewers:

What problem is being solved by this PR?
What does the solution do to address them?
Why is this the best solution? What alternatives were considered, and why are they worse?
What do reviewers need to be familiar with in order to provide useful feedback?
What issue(s) are addressed by this PR?
What are some hints to understanding some of the more intricate or clever parts of the PR?

In addition, the PR submission should answer the prompts of the Github template we use for PRs.

The code itself should adhere to the following guidelines, which both submitters and reviewers should check:

Documentation

Each file must have a copyright statement.
Any new non-test modules should have module-level documentation explaining what the module does, and how it fits into the blockchain as a whole.
Any new files must have some top-of-file documentation that describes what the contained code does, and how it fits into the overall module.

Within the source files, the following code documentation standards are expected:

Each public function, struct, enum, and trait should have a Rustdoc comment block describing the API contract it offers. This goes for private structs and traits as well.
Each non-trivial private function should likewise have a Rustdoc comment block. Trivial ones that are self-explanatory, like getters and setters, do not need documentation. If you are unsure if your function needs a docstring, err on the side of documenting it.
Each struct and enum member must have a Rustdoc comment string indicating what it does, and how it is used. This can be as little as a one-liner, as long as the relevant information is communicated.

Factoring

Public or exported struct, enum, and trait definitions go into the mod.rs file. Private structs, enums, and traits can go anywhere.
Each non-mod.rs file implements at most one subsystem. It may include multiple struct implementations and trait implementations. The filename should succinctly identify the subsystem, and the file-level documentation must succinctly describe it and how it relates to other subsystems it interacts with.
Directories represent collections of related but distinct subsystems.
To the greatest extent possible, business logic and I/O should be separated. A common pattern used in the codebase is to place the business logic into an "inner" function that does not do I/O, and handle I/O reads and writes in an "outer" function. The "outer" function only does the needful I/O and passes the data into the "inner" function. The "inner" function is often private, whereas the "outer" function is often public.

Refactoring

Any PR that does a large-scale refactoring must be in its own PR. This includes PRs that touch multiple subsystems. Refactoring often adds line noise that obscures the new functional changes that the PR proposes. Small-scale refactorings are permitted to ship with functional changes.
Refactoring PRs can generally be bigger, because they are easier to review. However, large refactorings that affect functional behavior of the system should be discussed first before carried out. This is because it is imperative that they do not stay open for very long (to keep the submitter's maintenance burden low), but nevertheless reviewing them must still take at most 2 hours. Discussing them first front-loads part of the review process.

Databases

If at all possible, the database schema should be preserved. Exceptions can be made on a case-by-case basis. The reason for this is that it's a big ask for people to re-sync nodes from genesis when they upgrade to a new point release.
Any changes to a database schema must also ship with a new schema version and new schema migration logic, as well as test coverage for it.
The submitter must verify that any new database columns are indexed, as relevant to the queries performed on them. Table scans are not permitted if they can be avoided (and they almost always can be). You can find table scans manually by setting the environment variable BLOCKSTACK_DB_TRACE when running your tests (this will cause every query executed to be preceded by the output of EXPLAIN QUERY PLAN on it).
Database changes cannot be consensus-critical unless part of a hard fork (see below).

Data Input

Data from the network and from Bitcoin is untrusted. Code that ingests such data cannot assume anything about its structure, and must handle any possible byte sequence that can be submitted to the Stacks node.
All inputs are space-bound. Every piece of code that ingests data must impose a maximum size on its byte representation. Any inputs that exceed this size must be discarded with as little processing as possible.
All inputs' deserializations are resource-bound. Every piece of code that ingests data must impose a maximum amount of RAM and CPU required to decode it into a structured representation. If the data does not decode with the allotted resources, then no further processing may be done and the data is discarded.
All inputs are time-bound. Every piece of code that ingests data from the network must impose a maximum amount of time that ingestion can take. If the data takes too long to arrive, then it must be discarded without any further processing.
Data ingestion must not panic. Every piece of code that ingests data must gracefully handle errors. Panicking failures are forbidden.

Block Processing

This section is pertinent of flow control can reach any block-processing logic.

Block-processing must be deterministic and well-ordered, regardless of the order in which blocks or microblocks arrive. In general, every node must be able to calculate the same Stacks chain tip, regardless of the order in which they receive the block data.
Blocks and microblocks with a sortition should be stored, even if they are invalid. This is a safety measure to ensure that we never lose data.

Error Handling

Each subsystem must have its own Error type. Error types of aggregate subsystems are encouraged to both wrap their constituent subsystems' Error types in their own Error types, as well as provide conversions from them via a From trait implementation.
Functions that act on externally-submitted data must never panic. This includes code that acts on incoming network messages, blockchain data, and burnchain (Bitcoin) data.
Runtime panics should be used sparingly. Generally speaking, a runtime panic is only appropriate if there is no reasonable way to recover from the error condition. For example, this includes (but is not limited to) disk I/O errors, database corruption, and unreachable code.
If a runtime panic is desired, it must have an appropriate error message.

Logging

Log messages should be informative and context-free as possible. They are used mainly to help us identify and diagnose problems. They are not used to help you verify that your code works; that's the job of a unit test.
DO NOT USE println!() OR eprintln!(). Instead, use the logging macros (test_debug!(), trace!(), debug!(), info!(), warn!(), error!()).
Use structured logging whenever you find yourself logging multiple data with a format string.
Use trace!() and test_debug!() liberally. It only runs in tests.
Use debug!() for information that is relevant for diagnosing problems at runtime. This is off by default, but can be turned on with the BLOCKSTACK_DEBUG environment variable.
Use info!() sparingly.
Use warn!() or error!() only when there really is a problem.

Consensus-Critical Code

A consensus-critical change is a change that affects how the Stacks blockchain processes blocks, microblocks, or transactions, such that a node with the patch could produce a different state root hash than a node without the patch. If this is even possible, then the PR is automatically treated as a consensus-critical change and must ship as part of a hard fork. It must also be described in a SIP.

All changes to consensus-critical code must be opened against next. It is never acceptable to open them against develop or master.
All consensus-critical changes must be gated on the Stacks epoch. They may only take effect once the system enters a specific epoch (and this must be documented).

A non-exhaustive list of examples of consensus-critical changes include:

Adding or changing block, microblock, or transaction wire formats
Changing the criteria under which a burnchain operation will be accepted by the node
Changing the data that gets stored to a MARF key/value pair in the Clarity or Stacks chainstate MARFs
Changing the order in which data gets stored in the above
Adding, changing, or removing Clarity functions
Changing the cost of a Clarity function
Adding new kinds of transactions, or enabling certain transaction data field values that were previously forbidden.

Testing

Unit tests should focus on the business logic with mocked data. To the greatest extent possible, each error path should be tested in addition to the success path. A submitter should expect to spend most of their test-writing time focusing on error paths; getting the success path to work is often much easier than the error paths.
Unit tests should verify that the I/O code paths work, but do so in a way that does not "clobber" other tests or prevent other tests from running in parallel (if it can be avoided). This means that unit tests should use their own directories for storing transient state (in /tmp), and should bind on ports that are not used anywhere else.
If randomness is needed, tests should use a seeded random number generator if possible. This ensures that they will reliably pass in CI.
When testing a consensus-critical code path, the test coverage should verify that the new behavior is only possible within the epoch(s) in which the behavior is slated to activate. Above all else, backwards-compatibility is a hard requirement.
Integration tests are necessary when the PR has a consumer-visible effect. For example, changes to the RESTful API, event stream, and mining behavior all require integration tests.
Every consensus-critical change needs an integration test to verify that the feature activates only when the hard fork activates.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly