Tooling to manage private clusters of Solana nodes.
Requirements: Go 1.18 & build essentials.
go mod download
go build -o ./solana-cluster .
Find amd64 and arm64 Docker images for all releases on GitHub Container Registry.
docker pull ghcr.io/blockdaemon/solana-cluster-manager
docker run \
--network=host \
-v /solana/ledger:/ledger:ro \
ghcr.io/blockdaemon/solana-cluster-manager \
sidecar --ledger /ledger
$ solana-cluster sidecar --help
Runs on a Solana node and serves available snapshot archives.
Do not expose this API publicly.
Usage:
solana-snapshots sidecar [flags]
Flags:
--interface string Only accept connections from this interface
--ledger string Path to ledger dir
--port uint16 Listen port (default 13080)
$ solana-cluster tracker --help
Connects to sidecars on nodes and scrapes the available snapshot versions.
Provides an API allowing fetch jobs to find the latest snapshots.
Do not expose this API publicly.
Usage:
solana-snapshots tracker [flags]
Flags:
--config string Path to config file
--internal-listen string Internal listen URL (default ":8457")
--listen string Listen URL (default ":8458")
$ solana-cluster fetch --help
Fetches a snapshot from another node using the tracker API.
Usage:
solana-snapshots fetch [flags]
Flags:
--download-timeout duration Max time to try downloading in total (default 10m0s)
--ledger string Path to ledger dir
--max-slots uint Refuse to download <n> slots older than the newest (default 10000)
--min-slots uint Download only snapshots <n> slots newer than local (default 500)
--request-timeout duration Max time to wait for headers (excluding download) (default 3s)
--tracker string Download as instructed by given tracker URL
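The --min-slots and --max-slots flags together define a slot window for candidate snapshots. A minimal sketch of that filter, assuming hypothetical names (eligible, the slot parameters) that are not taken from the real implementation:

```go
package main

import "fmt"

// eligible reports whether a remote snapshot passes the slot window
// implied by --min-slots and --max-slots. Function and parameter names
// are assumptions for this sketch, not the project's actual code.
func eligible(remoteSlot, localSlot, newestSlot, minSlots, maxSlots uint64) bool {
	// --max-slots: refuse snapshots more than maxSlots older than the newest.
	if remoteSlot+maxSlots < newestSlot {
		return false
	}
	// --min-slots: only download snapshots at least minSlots newer than local.
	if remoteSlot < localSlot+minSlots {
		return false
	}
	return true
}

func main() {
	fmt.Println(eligible(20000, 10000, 20000, 500, 10000)) // true: newest, well ahead of local
	fmt.Println(eligible(9000, 10000, 20000, 500, 10000))  // false: not newer than local
}
```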
$ solana-cluster mirror --help
Periodically mirrors snapshots from nodes to an S3-compatible data store.
Specify credentials via env $AWS_ACCESS_KEY_ID and $AWS_SECRET_ACCESS_KEY
Usage:
solana-snapshots mirror [flags]
Flags:
--refresh duration Refresh interval to discover new snapshots (default 30s)
--s3-bucket string Bucket name
--s3-prefix string Prefix for S3 object names (optional)
--s3-region string S3 region (optional)
--s3-secure Use secure S3 transport (default true)
--s3-url string URL to S3 API
--tracker string URL to tracker API
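The refresh cycle can be pictured as a small de-duplicating loop: on every interval, list the snapshots the tracker reports and upload only those not yet mirrored. A sketch under assumed names (mirrorOnce, upload); the real implementation may differ:

```go
package main

import "fmt"

// mirrorOnce uploads every snapshot the tracker reports that has not
// been mirrored yet, and returns how many uploads it performed.
// All names here are assumptions for this sketch.
func mirrorOnce(seen map[string]bool, available []string, upload func(string)) int {
	uploads := 0
	for _, name := range available {
		if seen[name] {
			continue // already mirrored in an earlier cycle
		}
		upload(name)
		seen[name] = true
		uploads++
	}
	return uploads
}

func main() {
	seen := map[string]bool{}
	up := func(name string) { fmt.Println("upload", name) }
	fmt.Println(mirrorOnce(seen, []string{"snapshot-100.tar.zst"}, up)) // 1
	fmt.Println(mirrorOnce(seen, []string{"snapshot-100.tar.zst"}, up)) // 0: nothing new
}
```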
Fetch the latest full snapshot, unless a full snapshot with the same hash is already present in the local ledger directory.
Fetch the latest incremental snapshot available for the locally present full snapshot, unless the local ledger already has it. If that full snapshot is no longer the latest one available, log a warning, but do not fetch a newer full snapshot.
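The rules above can be condensed into one decision function. A sketch with assumed types and field names (snapshot, hash, slot), not the project's actual code:

```go
package main

import "fmt"

// snapshot is a minimal stand-in for snapshot metadata.
type snapshot struct {
	slot uint64
	hash string
}

// decide applies the fetch rules: pull the full snapshot only when the
// local one differs by hash, and pull the incremental snapshot only
// when the remote has a newer one for the same full snapshot.
func decide(localFull, remoteFull, localInc, remoteInc snapshot) (fetchFull, fetchInc bool) {
	fetchFull = localFull.hash != remoteFull.hash
	fetchInc = remoteInc.slot > localInc.slot
	return
}

func main() {
	full := snapshot{slot: 1000, hash: "abc"}
	// Same full snapshot locally, but a newer incremental is available.
	fmt.Println(decide(full, full, snapshot{slot: 1200}, snapshot{slot: 1300})) // false true
}
```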
Snapshot management tooling enables efficient peer-to-peer transfers of accounts database archives.
Scraping (Flow A)
Snapshot metadata collection runs periodically, similar to Prometheus scraping.
Each cluster-aware node runs a lightweight solana-cluster sidecar agent that provides telemetry about its snapshots.
The solana-cluster tracker then connects to all sidecars to assemble a complete list of snapshot metadata.
The tracker is stateless so it can be replicated.
Service discovery is available through HTTP and JSON files. Consul SD support is planned.
Side note: Snapshot sources are configurable in stock Solana software, but only via static lists. This does not scale well with large fleets because every cluster change requires updating the lists on all nodes.
Downloading (Flow B)
When a Solana node needs to fetch a snapshot remotely, the tracker helps it find the best snapshot source. Nodes will download snapshots directly from the sidecars of other nodes.
Not yet public. 🚜 Subscribe to releases! ✨
Blockdaemon manages one of the largest Solana validator and RPC infrastructure deployments to date, backed by a custom peer-to-peer backbone. This repository shares our performance and sustainability optimizations.
When Solana validators first start, they have to retrieve and validate hundreds of gigabytes of state data from a remote node. During normal operation, validators stream at least 500 Mbps of traffic in either direction.
For Solana infra operators that manage more than one node (not to mention hundreds), this cost currently scales linearly as well. Unmodified Solana deployments treat their cluster peers the same as any other peer. This can result in a large number of streams between globally dispersed validators.
This is obviously inefficient. 10 Gbps connectivity is cheap and abundant locally within data centers. In contrast, major public clouds (who shall not be named) charge egregious premiums on Internet traffic.
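To put the 500 Mbps figure in perspective, sustaining that rate over a 30-day month works out to roughly 160 TB of traffic per node, which is the volume that cloud egress pricing is applied to:

```go
package main

import "fmt"

func main() {
	const mbps = 500.0              // sustained per-validator traffic
	bytesPerSec := mbps * 1e6 / 8   // megabits/s -> bytes/s
	secPerMonth := 30.0 * 24 * 3600 // seconds in a 30-day month
	tb := bytesPerSec * secPerMonth / 1e12
	fmt.Printf("%.0f TB per month\n", tb) // 162 TB per month
}
```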
The solution: Co-located Solana validators that are controlled by the same entity should also behave as one entity.
Leveraging internal connectivity to distribute blockchain data can reduce public network leeching and increase total cluster bandwidth.
Authenticated internal connectivity allows delegation of expensive calculations and reuse of their results. Concretely, the number of write-heavy snapshot creation & verification procedures per node can decrease as the cluster scales out.