Welcome to the documentation for the System Initiative software. This document not only contains information about SI itself, but also about running and developing SI in general.
The sections within range from guides to knowledge-base-esque information. The goal for the document is to value longer living documentation and notes over precise documentation that reflects a point in time feature of the system. If you just want to get SI up and running, see the README.
Above all, for any and all questions related to either using or developing the software, never hesitate to reach out to us on our Discord server.
- How to Read This Document
- Development Environment
- Troubleshooting
- Running the Stack Locally
- Using LocalStack for Secrets and Credentials
- Using buck2
- Reading and Writing Rust-based Code Documentation
- Learning About SI Concepts
- Running Rust Tests
- Preparing Your Pull Request
- Core Services
- Repository Structure
- Adding a New Rust Library
This file is designed to be rendered on GitHub. As a result, some sections will not look as intended in all markdown readers. To navigate the sections on GitHub, do the following:
- Navigate to this file on GitHub.
- Find the section picker in the file header.
- Upon clicking it, a sidebar will pop out with generated, nested sections that you can click through. GitHub automatically generates these based on the headings for each section and sub-section in this file.
For more information, see the GitHub guide.
Developing SI locally can be done in a variety of ways, but the officially supported method is to use the `nix` flake at the root of the repository.
Using the flake requires using one of the below platforms.
It is possible that the System Initiative software can be developed on even more platforms, but these platforms have been validated to work with `nix` and the corresponding flake.
macOS (Darwin) is officially supported on both x86_64 (amd64) (also known as "Intel") and aarch64 (arm64) (also known as "Apple Silicon") architectures. We do not specify the minimum version of macOS that must be used, so we recommend looking at the "Dependencies" section for more information.
For aarch64 (arm64) users, Rosetta 2 must be installed.
You can either install it via directions from the official support page or by running `softwareupdate --install-rosetta`.
On macOS, you will likely hit the file descriptor limit, which requires user intervention. We have a section further down that describes how to intervene.
Linux (GNU) is officially supported on both x86_64 (amd64) and aarch64 (arm64) architectures. Linux with MUSL instead of GNU is untested.
In general, GNU-based distros will work. Those include, but are not limited to the following: Ubuntu, Fedora, Debian, Arch Linux, and openSUSE.
If using NixOS, you need Docker to be installed and Flakes to be enabled.
If not using `direnv`, you can use `nix develop` or Nix command.
Using native Windows is not supported at this time, but may be desired in the future.
However, WSL2 on Windows 10 and Windows 11 is officially supported on both x86_64 (amd64) and aarch64 (arm64) architectures.
In order to work with `nix`, `systemd` may need to be enabled in your WSL2 distro of choice.
On WSL2, you will likely hit the file descriptor limit, which requires user intervention. We have a section further down that describes how to intervene.
On some systems, you may need to significantly increase the file descriptor limit for building and running our services with `buck2`.
This is because `buck2` opens many more files than either `cargo` or `pnpm` do.
Not only that, but when using Tilt to build and run concurrent services, even more files are opened than they would be for sequential builds.
Increasing the file descriptor limit is possible via the `ulimit` command.
To see all limits, execute the following command:
ulimit -a
Here is an example of a significant limit increase, where the argument provided after the flag represents the new desired number of file descriptors:
ulimit -n <file-descriptor-count>
To find an acceptable limit, run the health check command.
buck2 run dev:healthcheck
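As a rough sketch of checking the limit before starting a build, the snippet below compares the current limit with a target value. The value `10240` and the helper name `fd_limit_sufficient` are illustrative assumptions, not numbers or tooling from this repository; the health check command remains the authority on what is acceptable.

```shell
# Hypothetical helper: succeeds when the current limit ($1) is at least
# the desired limit ($2). "unlimited" always counts as sufficient.
fd_limit_sufficient() {
  [ "$1" = "unlimited" ] || [ "$1" -ge "$2" ]
}

# Assumed target for illustration only; use the health check to pick a real one.
desired=10240
current="$(ulimit -n)"

if fd_limit_sufficient "$current" "$desired"; then
  echo "file descriptor limit is $current (at least $desired)"
else
  echo "file descriptor limit is $current; consider: ulimit -n $desired"
fi
```

Note that `ulimit -n` only affects the current shell and the processes it starts, so the increase has to be repeated in each new shell unless it is added to your shell configuration.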
The local development environment sets up a high number of file watches. On Linux, this uses the `inotify` kernel subsystem. The `fs.inotify.max_user_watches` setting may need to be increased in order to allow all of the `inotify` watches to be configured.
Warning
Tuning this value requires sudo access.
sudo sysctl fs.inotify.max_user_watches=<new-max-user-watches-value>
To find an acceptable value, run the health check command.
buck2 run dev:healthcheck
To make this setting persistent, consult your distribution's documentation.
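Before changing anything, it can help to see the current value, which does not require sudo. The sketch below reads it from procfs on Linux; the helper name `current_inotify_watches` is hypothetical and only exists for this example.

```shell
# Print the current inotify watch limit, or "unknown" where the Linux
# procfs entry is unavailable (e.g. on macOS).
current_inotify_watches() {
  if [ -r /proc/sys/fs/inotify/max_user_watches ]; then
    cat /proc/sys/fs/inotify/max_user_watches
  else
    echo "unknown"
  fi
}

echo "fs.inotify.max_user_watches = $(current_inotify_watches)"
```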
For all supported platforms, there are two dependencies that must be installed: `nix` (preferably via the Determinate Nix Installer) and `docker`.
We use `nix` as our package manager for the repository.
It ensures that our developers are all using the same versions of all packages and libraries for developing SI.
Regardless of how `nix` is installed, it must have the flakes feature enabled.
We highly recommend using the Determinate Nix Installer over the official installer; one reason being that the former will enable flakes by default.
Tip
You can use `direnv` (version >= 2.30) with our `nix` flake for both ease of running commands and for editor integration.
For more information, see the Direnv section.
We use `docker` to run our dependent services for the SI stack.
It can either be installed via Docker Desktop or directly via Docker Engine.
For Docker Desktop, the version corresponding to your native architecture should be used (e.g. install the aarch64 (arm64) version on an Apple-Silicon-equipped MacBook Pro).
WSL2 users should be able to use either Docker Desktop for WSL2 or Docker Engine (i.e. installing and using `docker` within the distro and not interacting with the host).
Regardless of platform, you may need to configure credentials in `~/.local/share`.
Since Rancher Desktop provides the ability to use moby, you can use it to run and develop the System Initiative software. However, it is untested, and you may need further configuration depending on your platform.
Direnv (version >= 2.30) with nix-direnv can automatically set up your shell, which means you don't need to enter a subshell with `nix develop`, or prefix all commands with `nix develop --command`.
You can install it with your package manager of choice, but if you're unsure which installation method to use or your package manager does not provide a compatible version, you can use `nix` itself (e.g. `nix profile install nixpkgs#direnv`).
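To confirm that an installed `direnv` meets the minimum version, something like the following sketch could be used. The helper `version_ge` is hypothetical and relies on `sort -V` (version sort), which is available in GNU coreutils and may be missing from more minimal systems.

```shell
# Hypothetical helper: succeeds when version $1 is greater than or equal to
# version $2. Relies on `sort -V` (version sort) being available.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

if command -v direnv >/dev/null 2>&1; then
  installed="$(direnv version)"
  if version_ge "$installed" "2.30"; then
    echo "direnv $installed is new enough"
  else
    echo "direnv $installed is older than 2.30"
  fi
else
  echo "direnv is not installed"
fi
```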
We recommend using the upstream docs for hooking `direnv` into your shell, but here is an example of how to do it on a system where `zsh` is the default shell.
In this example, the following is added to the end of `~/.zshrc`.
if command -v direnv >/dev/null 2>&1; then
  eval "$(direnv hook zsh)"
fi
There are also plugins to integrate `direnv` with common editors.
- Emacs: emacs-direnv
- IntelliJ-based IDEs: Direnv integration, Better Direnv
- Neovim and Vim: direnv.vim
- Visual Studio Code: direnv
All commands need to be run from the `nix` environment.
There are two primary options to do so:
- If `direnv` is installed and hooked into your shell, you can `cd` into the repository and `nix` will bootstrap the environment for you using the flake.
- Otherwise, you can execute `nix develop` to enter the environment, `nix develop --command <command>` to execute a command, or use the environment in whatever way you prefer.
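The two options above can be folded into a small wrapper, sketched below. The function name `in_nix_env` is hypothetical. It assumes that `IN_NIX_SHELL` is set when the environment is already active, which is what `nix develop` does; whether your `direnv` setup exports the same variable is worth verifying locally.

```shell
# Hypothetical wrapper: run a command inside the flake environment.
# Assumption: IN_NIX_SHELL is set when the environment is already active;
# otherwise fall back to `nix develop --command`.
in_nix_env() {
  if [ -n "${IN_NIX_SHELL:-}" ]; then
    "$@"
  else
    nix develop --command "$@"
  fi
}

# Example usage (not executed here):
#   in_nix_env buck2 run dev:healthcheck
```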
For backend services like `veritech` and `sdf`, there will usually be an `INFO`-level log indicating that the webserver has bound to a port and is ready to receive messages.
This may be subject to change (e.g. underlying library is upgraded to a new major version and the startup sequence changes) and will vary from component to component.
This section contains common troubleshooting scenarios when working on the System Initiative software.
Since we switched to `buck2` for our build infrastructure in mid-2023, you may experience issues when running services reliant on `node_modules` within older cloned instances of the repository.
To solve these build errors, run the following in the root of your repository:
Warning
This command deletes files. Ensure your current working directory is the root of the repository and understand what the command does before executing. Please reach out to us on our Discord server if you have any questions.
find app bin lib third-party/node -type d -name node_modules -prune -exec rm -rf {} \;; rm -rf node_modules
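To understand what the deletion would touch before running it, a dry run that only prints the matching directories can be useful. The sketch below uses a hypothetical helper, `list_node_modules`, with the same search roots as the deletion command; it deletes nothing.

```shell
# Hypothetical helper: print every node_modules directory under the given
# search roots without deleting anything. Roots that do not exist are skipped.
list_node_modules() {
  for root in "$@"; do
    if [ -d "$root" ]; then
      find "$root" -type d -name node_modules -prune -print
    fi
  done
}

# Same search roots as the deletion command; run from the repository root.
list_node_modules app bin lib third-party/node
```

The `-prune` keeps `find` from descending into a matched `node_modules` directory, so nested copies inside it are not listed separately.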
If you see an error related to NATS Jetstream not being enabled when running the stack or tests, your local `systeminit/nats` image is likely out of date.
To get the most up-to-date images, including the aforementioned image, run the following command:
buck2 run //dev:pull
SI uses external services in conjunction with its native components.
These external services are deployed via `docker compose` and are configured to stick to their default settings as closely as possible, including port settings.
Thus, it is worth checking if you are running these services to avoid conflicts when running SI.
Potentially conflicting services include, but are not limited to, the following:
- Jaeger
- NATS and NATS Jetstream
- OpenTelemetry
- PostgreSQL
In the case of a port conflict, a good strategy is to temporarily disable the host service until SI is no longer being run.
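One way to spot a conflict ahead of time is to probe the ports these services usually listen on. The sketch below uses the well-known upstream default ports (PostgreSQL 5432, NATS 4222, Jaeger UI 16686, OTLP over gRPC 4317); these are assumptions, not values read from this repository's compose configuration. The helper name `default_port` is hypothetical, and the probe needs `nc` to be installed.

```shell
# Upstream default ports for the services listed above. Assumption: these are
# the well-known defaults, not values read from this repository's compose file.
default_port() {
  case "$1" in
    postgres) echo 5432 ;;
    nats) echo 4222 ;;
    jaeger) echo 16686 ;;
    otlp-grpc) echo 4317 ;;
    *) return 1 ;;
  esac
}

# Report whether something is listening on each port (requires `nc`).
for service in postgres nats jaeger otlp-grpc; do
  port="$(default_port "$service")"
  if ! command -v nc >/dev/null 2>&1; then
    echo "$service: nc not installed, cannot probe port $port"
  elif nc -z localhost "$port" >/dev/null 2>&1; then
    echo "$service: port $port is already in use"
  else
    echo "$service: port $port is free"
  fi
done
```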
In your editor, you may see errors like "YourEnum does not implement Display" if you are using `Display` from the `strum` crate.
This is because your editor may not have "proc macros" ("procedural macros") enabled.
Check out your editor or relevant plugin docs for more information.
This section provides additional information on running the System Initiative software stack locally. Readers should first refer to the README and the development environment section higher up before reading this document.
While the README covers using `buck2 run dev:up`, there are three other ways to run the full stack locally:
- `buck2 run dev:up-standard`: run with `rustc` default build optimizations
- `buck2 run dev:up-debug`: run with `rustc` debug build optimizations for all services except for the `rebaser`
- `buck2 run dev:up-debug-all`: run with `rustc` debug build optimizations for all services
By default, the stack will run with `rustc` release build optimizations, which is what users and testers of the System Initiative software will want to use.
It runs the software in its intended state.
However, if you are a contributor seeking build times suitable for rapid iteration, you may want to use one of the aforementioned options.
Warning
Contributors should test their changes with integration tests and with release optimizations when running the full stack. The aforementioned options are solely recommended for rapid iteration.
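To reach the stack from another machine, you run the full stack the same way, but with additional environment variables that set the host address to 0.0.0.0 (all network interfaces).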
TILT_HOST=0.0.0.0 DEV_HOST=0.0.0.0 buck2 run dev:up
Caution
The user is responsible for maintaining secure access (local, remote, etc.) to their local instance.
This section contains information related to using LocalStack when working on the System Initiative software.
You can use the "AWS Credential" builtin `SchemaVariant` with LocalStack when running the System Initiative software with the following command:
buck2 run //dev:up
To use LocalStack with "AWS Credential", do the following:
- Create a `Component` using the `SchemaVariant`.
- Create a `Secret` and use it in the property editor.
- Populate `http://localhost:4566` (or `http://0.0.0.0:4566`, depending on your system) in the "Endpoint" field for the `Secret`.
Now, you can use LocalStack in your development setup.
This section contains information on using `buck2` within this repository.
We recommend using the `buck2` binary provided by our `nix` flake to ensure compatible versioning.
For information on what `buck2` is, please refer to the upstream repository.
- A "target" is an instantiation of a rule
- A "rule" is a library-esque function that can be buildable, runnable and/or testable
- A "buildable" rule (`buck2 build`) only runs when affected sources are changed, and ignores environment variables and passed down command-line arguments
- A "runnable" rule (`buck2 run`) runs upon every invocation, and accepts environment variables and passed down command-line arguments
- A "testable" rule (`buck2 test`) runs upon every invocation and is similar to a runnable rule, but collects test metadata and is intended for sandboxed environments (e.g. CI)
All `buck2` commands follow similar syntax.
<ENV> buck2 <CMD> <PATH/TO/DIRECTORY/WITH/BUCK>:<TARGET> -- <ARGS>
You can use pseudo-relative pathing to access targets.
You cannot use relative parent directories (e.g. `../../path/to/directory/with/BUCK`), but you can use child relative directories, like in the example below.
buck2 run lib/dal:test-integration
However, you can always use the `//` prefix to start from the root, regardless of your current working directory in the repository.
# Let's change our current working directory to somewhere in the repository.
cd bin/lang-js
# Now, let's build using a BUCK file somewhere else in the repository.
buck2 build //lib/dal
You may have noticed in the example above, we could build `lib/dal` without writing `lib/dal:dal`.
If the target shares the same name as the directory, you do not have to write the name.
# This...
buck2 build lib/dal
# is the same as this...
buck2 build lib/dal:dal
# and this...
buck2 build //lib/dal
# is the same as this.
buck2 build //lib/dal:dal
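The shorthand rules above can be expressed as a small string transformation. The sketch below is only an illustration of the naming convention, not part of `buck2`: the helper `qualify_target` is hypothetical and assumes the path is written relative to the repository root.

```shell
# Hypothetical helper: expand a target written from the repository root into
# its fully qualified form, e.g. "lib/dal" becomes "//lib/dal:dal".
qualify_target() {
  target="${1#//}"                          # drop a leading "//" if present
  case "$target" in
    *:*) ;;                                 # explicit target name, keep as-is
    *) target="$target:${target##*/}" ;;    # default to the directory name
  esac
  echo "//$target"
}

qualify_target lib/dal                   # prints //lib/dal:dal
qualify_target //lib/dal                 # prints //lib/dal:dal
qualify_target lib/dal:test-integration  # prints //lib/dal:test-integration
```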
Both `BUCK` files and the rules they use are written in Starlark, which is a dialect of Python.
There is an upstream VS Code extension that you can build locally using `npm`.
After building it, you can install the `vsix` file as an extension in VS Code.
One great thing about the extension is that you can use "Go To Definition" in VS Code to follow the path of where a rule comes from and how it's written.
There are two directories where the rules come from:
- prelude: the vendored, upstream standard library with rules for common use cases and programming languages
  - Common use case example: `sh_binary` is provided as a way to run shell scripts
  - Programming language example: `rust_library` is provided as a way to add buildable Rust library targets
  - Side note: this must be kept up to date in conjunction with the `buck2` binary
- prelude-si: our custom library with custom rules and wrappers around existing rules
  - Example: `rust_library` provides more than the standard `rust_library` rule and includes additional targets
Run the following command, but do not forget the `:` at the end.
buck2 targets <PATH/TO/DIRECTORY/WITH/BUCK>:
Here is an example:
buck2 targets //lib/dal:
Expanding on the terminology section, "buildable" targets only use files that are explicitly provided. If your new file isn't available during builds, you likely need to do one of two things:
- Use the `export_file` rule to ensure the file is available for builds
- Check the `srcs` attribute of a rule, if applicable, to ensure the file is in the sources tree
Sharing the same extension as Bazel files, `bzl` files contain rules written in Starlark for `buck2`.
For complex introspection and interaction with graphs, `bxl` files are Starlark scripts that extend `buck2` functionality.
Check out the section on running Rust tests farther down in this file. Short version: you'll use the following pattern:
# Pattern for unit tests
<ENV> buck2 run <PATH/TO/RUST/LIBRARY/DIRECTORY/WITH/BUCK>:test-unit -- <ARGS>
# Pattern for integration tests
<ENV> buck2 run <PATH/TO/RUST/LIBRARY/DIRECTORY/WITH/BUCK>:test-integration -- <ARGS>
With `xargs` installed, run the following command:
buck2 uquery 'kind("rust_(binary|library|test)", set("//bin/..." "//lib/..."))' | xargs buck2 build
This command queries for all `rust_binary`, `rust_library` and `rust_test` targets found in the `bin/**` and `lib/**` directories with `BUCK` files.
Then, it pipes that output into `buck2 build`.
If you intend to run these services with optimal performance, you need to build with release mode.
Here is an example building `sdf`, `pinga`, `veritech`, `rebaser` and `forklift` from the repository root:
buck2 build @//mode/release bin/sdf bin/pinga bin/veritech bin/rebaser bin/forklift
Specifying modes (i.e. `@//mode/<mode>`) from the mode directory changes the build optimizations for buildable targets.
The build cache is not shared between modes and changing modes does not invalidate the cache for other modes.
In other words, you can build the same buildable target with two different modes and retain both caches.
If you are looking to find other `buck2` users and/or ask questions, share ideas and share experiences related to `buck2`, check out the unofficial "Buck2 Fans" Discord server.
This section contains all information related to developer documentation for this repository's source code.
Let's say you want to learn what a `Component` or an `AttributeValue` is.
Where do you look?
You can generate and open the docs in your browser to find out!
buck2 run //lib/dal:doc -- --document-private-items --open
Our Rust crates contain module and documentation comments that can be generated into static webpages by `rustdoc`.
When in doubt, see if the `doc` target for a Rust-based library has what you are looking for.
As previously mentioned, for our Rust crates, we leverage `rustdoc` for seamless integration with IntelliJ Rust, rust-analyzer, and more.
We try to follow the official "How to write documentation" guide from `rustdoc` as closely as possible.
Older areas of the codebase may not follow the guide and conventions derived from it.
We encourage updating older documentation while navigating through SI crates.
- RFC-1574: more API documentation conventions for `rust-lang`
- "Making Useful Documentation Comments" from "The Book": a section of "The Book" covering useful documentation in the context of crate publishing
As referenced in our code documentation section, the `rustdoc` static webpages are an entrypoint into learning about the Rust modules and structs that back many SI concepts.
Let's say you want to learn about what a `Component` is.
You may want to open the docs for the dal as it is the center of many SI concepts.
You can generate and open the Rust documentation locally via the following command:
buck2 run lib/dal:doc -- --document-private-items --open
This section contains information related to running Rust tests.
Before running Rust-based tests, we should ensure that dependent services are running.
buck2 run dev:platform
Tip
Not all tests require dependent services (e.g. unit tests for a small library), but if you are unsure, we recommend running dependent services in advance.
For a given crate, you can use `buck2 test`.
buck2 test <crate>:test
Tip
The `<crate>` has the structure `//{directory}/{package}` (e.g. `//lib/sdf-server`).
To see all options, use `buck2 targets` with a `:` character at the end.
buck2 targets <crate>:
To list all targets, run:
buck2 targets //bin/...
buck2 targets //lib/...
Tip
Ellipses also work for other directories. This works for the root as well, but the list at root will contain a lot of noise.
buck2 targets ...
You can also run individual tests, as needed.
Let's start with running all tests for a given package.
cd lib/dal
buck2 test :test
What if you want to run an individual test for `lib/dal`?
Instead of using `buck2 test`, we will use `buck2 run`.
Here is an example with an individual dal integration test:
buck2 run //lib/dal:test-integration -- edge::new
Here is the same test, but with a precise pattern using the `--exact` flag.
buck2 run //lib/dal:test-integration -- --test integration integration_test::internal::edge::new -- --exact
Let's say the test has been ignored with `#[ignore]` and we would like to run it.
We will use the `--ignored` flag.
buck2 run //lib/dal:test-integration -- edge::new -- --ignored
If you'd like to see STDOUT rather than it being captured by the test executor, use the `--nocapture` flag.
buck2 run //lib/dal:test-integration -- edge::new -- --nocapture
You can prefix your `buck2 run` executions with environment variables of choice.
Here is an example for running one test within a crate with backtrace enabled:
RUST_BACKTRACE=1 buck2 run <crate>:test-integration -- <pattern>
Let's say you wanted to run all tests, but set backtrace to full. You can do that too.
RUST_BACKTRACE=full buck2 run <crate>:test-integration
If you'd like to see a live log stream during test execution, use the `SI_TEST_LOG` variable in conjunction with the `--nocapture` flag.
SI_TEST_LOG=info buck2 run <crate>:test-integration -- <pattern> -- --nocapture
You can combine this with the `DEBUG` environment variable for `lang-js` for even more information.
DEBUG=* SI_TEST_LOG=info buck2 run <crate>:test-integration -- <pattern> -- --nocapture
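If you find yourself retyping these invocations, a shell function can assemble them. The sketch below only prints the command so that it can be reviewed first; the function name `integration_test_cmd` and its argument order are hypothetical, while the command shape mirrors the live-logging example above.

```shell
# Hypothetical helper: print (not run) the integration test command for a
# crate and test pattern, with live logging enabled.
# $1 = crate (e.g. //lib/dal), $2 = test pattern, $3 = log level (default: info)
integration_test_cmd() {
  printf 'SI_TEST_LOG=%s buck2 run %s:test-integration -- %s -- --nocapture\n' \
    "${3:-info}" "$1" "$2"
}

integration_test_cmd //lib/dal edge::new
# prints: SI_TEST_LOG=info buck2 run //lib/dal:test-integration -- edge::new -- --nocapture
```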
This section contains information related to preparing changes for a pull request.
We do not require a particular commit message format of any kind, but we do require that individual commits be descriptive, relative to size and impact. For example, if a descriptive title covers what the commit does in practice, then an additional description below the title is not required. However, if the commit has an out-sized impact relative to other commits, its description will need to reflect that.
Reviewers may ask you to amend your commits if they are not descriptive enough. Since the descriptiveness of a commit is subjective, please feel free to talk to us on our Discord server if you have any questions.
If you would like an optional commit template, see the following:
<present-tense-verb-with-capitalized-first-letter> <everything-else-without-punctuation-at-the-end>
<sentences-in-paragraph-format-or-bullet-points>
Here is an example with a paragraph in the description:
Reduce idle memory utilization in starfield-server
With this change, starfield-server no longer waits for acknowledgement
from the BGS API. As soon as the request is successful, the green
thread is dropped, which frees up memory since the task is usually idle
for ~10 seconds or more.
Here is another example, but with bullet points in the description:
Replace fallout queue with TES queue
- Replace fallout queue with TES queue for its durability benefits
- Refactor the core test harness to use TES queue
- Build and publish new TES queue Docker images on commits to "main"
Finally, here is an example with a more complex description:
Use multi-threaded work queue operations in starfield-server
Iterating through work queue items has historically been sequential in
starfield-server. With this change, rayon is leveraged to boost overall
performance within green threads.
starfield-server changes:
- Replace sequential work queue with rayon parallel iterator
Test harness changes:
- Refactor the core test harness to create an in-line work queue
This is an opinionated guide for rebasing your local branch with the latest changes from `main`.
It does not necessarily reflect present-day best practices and is designed for those who would like to perform the aforementioned action(s) without spending too much time thinking about them.
- Ensure you have VS Code installed.
- Ensure your local tree is clean and everything is pushed up to the corresponding remote branch.
  - This will make it easier if we want to see the diff on GitHub later.
- Open VS Code and create a new terminal within the application.
  - We will execute this guide's commands in this terminal in order to get `CMD + LEFT CLICK` functionality for files with conflicts.
- Run `git pull --rebase origin main` to start the process.
  - If there is at least one “conflict area” for a commit that git cannot figure out, it’ll drop you into interactive rebase mode.
  - It will keep you in interactive rebase until you have finished “continuing” through all the commits.
- Run `git status` to see what is wrong and where there are conflicts.
- Open all files with conflicts by clicking `CMD + LEFT CLICK` on each one.
- In each “conflict area” in a given file, you’ll have options (font size is small) at the top to help resolve the conflict(s).
  - Affected files are marked with a red exclamation point in the VS Code file picker.
  - In those options, “Current” refers to `HEAD`, which is `main` in our case.
  - In those same options, “Incoming” refers to changes on our branch.
  - You can use the options or manually intervene to make changes. Sometimes, you may want to accept everything on HEAD or your local branch and just triage manually. Sometimes, you’ll want to not accept anything and manually triage the whole thing. Sometimes you’ll want to do both. It depends!
  - Finally, it can be useful to have your branch diff open on GitHub to see what you changed before the rebase: `https://github.com/systeminit/si/compare/main...<your-branch>`.
- Once all conflict areas for “unmerged paths” (files with conflicts) have been resolved, run `git add` with either the entire current working directory and below (`.`) or specific files/directories (e.g. `lib/dal/src lib/sdf-server/src/`) as the next argument(s).
- Now run `git status` again. The output should indicate that conflicts have been resolved and that we can continue rebasing.
- If everything looks good in the output, run `git rebase --continue`. You will have an opportunity to amend your commit message here, if desired.
  - You will not necessarily have to go through the “human fix this conflict area” process for every commit.
  - It will only happen for commits with conflict areas.
- Once the interactive rebase ends (or never even started if there were no conflicts), you should be good to push! Now, run `git push`.
  - You will likely have to add the `-f/--force` flag since we are overwriting history (technically?) on the remote.
  - Be careful when using the force flag! Try to push without using the force flag first if you are unsure.
- You are done! Congratulations!
This is an opinionated guide for squashing the commits on your local branch and pushing them to your corresponding remote branch. It does not necessarily reflect present-day best practices and is designed for those who would like to perform the aforementioned action(s) without spending too much time thinking about them.
- Ensure your local tree is clean and everything is pushed up to the corresponding remote branch.
  - This will make it easier if we want to see the diff on GitHub later.
- Count the number of commits that you'd like to squash.
  - Navigating to your branch diff on GitHub can be helpful here: `https://github.com/systeminit/si/compare/main...<your-branch-name>`
- Run `git reset --soft HEAD~N` where `N` is the number of commits (example: `git reset --soft HEAD~2` where you'd like to squash two commits into one).
- Run `git status` to see all staged changes from the commits that were soft reset.
- Now, commit your changes (e.g. `git commit -s`).
- Finally, run `git push`.
  - You will likely have to add the `-f/--force` flag since we are overwriting history (technically?) on the remote.
  - Be careful when using the force flag! Try to push without using the force flag first if you are unsure.
- You are done! Congratulations!
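To try the soft-reset squash without touching a real branch, the steps can be rehearsed in a throwaway repository. The sketch below is a self-contained rehearsal under stated assumptions: the file name, author details and commit messages are made up for the example, and nothing is pushed anywhere.

```shell
# Throwaway repository so the real working tree is never touched.
demo_repo="$(mktemp -d)"
cd "$demo_repo" || exit 1
git init -q .
git config user.name "Example Author"
git config user.email "author@example.com"
git config commit.gpgsign false

# One base commit plus two commits that we want to squash.
echo "base" > notes.txt
git add notes.txt
git commit -q -m "Add notes"
echo "first" >> notes.txt
git commit -q -a -m "Add first note"
echo "second" >> notes.txt
git commit -q -a -m "Add second note"

# Soft reset keeps the changes staged; a new commit replaces the two old ones.
git reset -q --soft HEAD~2
git commit -q -m "Add first and second notes"

commit_count="$(git rev-list --count HEAD)"
echo "commits on branch: $commit_count"
```

The file contents are unchanged by the squash; only the history is rewritten, which is why the subsequent push to an existing remote branch usually needs the force flag.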
This section contains the paths and brief definitions of the services that run in the System Initiative software stack.
- forklift: the service that forklifts data from SI to a data warehouse (or performs an "ack and no-op")
- pinga: the job queueing and execution service used to execute non-trivial jobs
- rebaser: where all workspace-level changes are persisted and conflicts are detected based on proposed changes
- sdf: the backend webserver for communicating with `web` that contains the majority of the "business logic"
- veritech: a backend webserver for dispatching functions in secure runtime environments
- web: the primary frontend web application for using the System Initiative software
While there are other directories in the project, these are where most of the interesting source code lives and how it is generally organized:
| Directory | Description |
|---|---|
| `app/` | Web front ends, GUIs, or other desktop applications |
| `bin/` | Backend programs, CLIs, servers, etc. |
| `component/` | Docker container images and other ancillary tooling |
| `lib/` | Supporting libraries and packages for services or applications |
To add a new Rust library, there are a few steps:
- Create `lib/MY-LIBRARY` and put a `Cargo.toml` there like this:
[package]
name = "MY-LIBRARY"
edition = "2021"
version.workspace = true
authors.workspace = true
license.workspace = true
repository.workspace = true
rust-version.workspace = true
publish.workspace = true
- Add `src/lib.rs` with your code.
- Add your library to the top-level `Cargo.toml` inside `[workspace] members = ...`.
- Run `cargo check --all-targets --all-features` to get your library added to the top-level `Cargo.lock`.
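The first two steps above can be scripted. The sketch below is a hypothetical helper, `scaffold_rust_lib`, that writes the `Cargo.toml` shown above and an empty `src/lib.rs`; it does not edit the workspace `Cargo.toml` or run `cargo`, so the remaining steps still need to be done by hand. The example runs against a scratch directory rather than a real checkout.

```shell
# Hypothetical helper: create lib/<name> under the given repository root with
# the Cargo.toml and src/lib.rs described above. Does not edit the workspace.
scaffold_rust_lib() {
  repo_root="$1"
  name="$2"
  mkdir -p "$repo_root/lib/$name/src" || return 1
  cat > "$repo_root/lib/$name/Cargo.toml" <<EOF
[package]
name = "$name"
edition = "2021"
version.workspace = true
authors.workspace = true
license.workspace = true
repository.workspace = true
rust-version.workspace = true
publish.workspace = true
EOF
  : > "$repo_root/lib/$name/src/lib.rs"
}

# Example against a scratch directory rather than a real checkout.
scratch="$(mktemp -d)"
scaffold_rust_lib "$scratch" my-library
ls "$scratch/lib/my-library"
```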
Note
If your library adds, removes or modifies a third party crate, you may need to sync `buck2` deps.
See the support/buck2 directory for more information.