From 5b8e984883928e9a5fb225e4114bc04f3d9a3834 Mon Sep 17 00:00:00 2001 From: Urvi Date: Thu, 20 Jun 2024 13:56:29 -0700 Subject: [PATCH] exp/services/ledgerexporter: Updated README with step by step guide to installing and running ledger exporter --- exp/services/ledgerexporter/README.md | 221 ++++++++++++------ .../ledgerexporter/config.example.toml | 44 ++++ exp/services/ledgerexporter/config.toml | 1 - exp/services/ledgerexporter/internal/main.go | 8 +- .../ledgerexporter/internal/main_test.go | 14 +- 5 files changed, 207 insertions(+), 81 deletions(-) create mode 100644 exp/services/ledgerexporter/config.example.toml diff --git a/exp/services/ledgerexporter/README.md b/exp/services/ledgerexporter/README.md index 57757508e1..b964921bae 100644 --- a/exp/services/ledgerexporter/README.md +++ b/exp/services/ledgerexporter/README.md @@ -1,101 +1,184 @@ -# Ledger Exporter (Work in Progress) +## Ledger Exporter: Installation and Usage Guide -The Ledger Exporter is a tool designed to export ledger data from a Stellar network and upload it to a specified destination. It supports both bounded and unbounded modes, allowing users to export a specific range of ledgers or continuously export new ledgers as they arrive on the network. +This guide provides step-by-step instructions on installing and using the Ledger Exporter, a tool that helps you export Stellar network ledger data to a Google Cloud Storage (GCS) bucket for efficient analysis and storage. -Ledger Exporter currently uses captive-core as the ledger backend and GCS as the destination data store. -# Exported Data Format -The tool allows for the export of multiple ledgers in a single exported file. The exported data is in XDR format and is compressed using zstd before being uploaded. 
+**Table of Contents** -```go -type LedgerCloseMetaBatch struct { - StartSequence uint32 - EndSequence uint32 - LedgerCloseMetas []LedgerCloseMeta -} -``` +* [Prerequisites](#prerequisites) +* [Installation Steps](#installation-steps) + * [Set Up GCP Credentials](#set-up-gcp-credentials) + * [Create a GCS Bucket](#create-a-gcs-bucket) +* [Configuration](#configuration) + * [Create a Configuration File (`config.toml`)](#create-a-configuration-file-configtoml) +* [Running the Ledger Exporter](#running-the-ledger-exporter) + * [Pull the Docker Image](#pull-the-docker-image) + * [Run the Ledger Exporter](#run-the-ledger-exporter) +* [CLI Commands](#cli-commands) + * [scan-and-fill](#1-scan-and-fill) + * [append](#2-append) -## Getting Started +## Prerequisites -### Installation (coming soon) +* **Google Cloud Platform (GCP) Account:** You'll need a GCP account to create a GCS bucket for storing the exported data. +* **Docker:** Allows you to run the Ledger Exporter in a self-contained environment. The official installation guide: [https://docs.docker.com/engine/install/](https://docs.docker.com/engine/install/) -### Command Line Options +## Installation Steps -#### Scan and Fill Mode: -Exports a specific range of ledgers, defined by --start and --end. Will only export to remote datastore if data is absent. -```bash -ledgerexporter scan-and-fill --start --end --config-file -``` +### Set Up GCP Credentials -#### Append Mode: -Exports ledgers initially searching from --start, looking for the next absent ledger sequence number proceeding --start on the data store. If abscence is detected, the export range is narrowed to `--start `. -This feature requires ledgers to be present on the remote data store for some (possibly empty) prefix of the requested range and then absent for the (possibly empty) remainder. +Create application default credentials for your Google Cloud Platform (GCP) project by following these steps: +1. Download the [SDK](https://cloud.google.com/sdk/docs/install). +2. 
Install and initialize the [gcloud CLI](https://cloud.google.com/sdk/docs/initializing). +3. Create [application authentication credentials](https://cloud.google.com/docs/authentication/provide-credentials-adc#google-idp) and store them in a secure location on your system, such as `$HOME/.config/gcloud/application_default_credentials.json`. -In this mode, the --end ledger can be provided to stop the process once export has reached that ledger, or if absent or 0 it will result in continous exporting of new ledgers emitted from the network. +For detailed instructions, refer to the [Providing Credentials for Application Default Credentials (ADC) guide](https://cloud.google.com/docs/authentication/provide-credentials-adc). - It’s guaranteed that ledgers exported during `append` mode from `start` and up to the last logged ledger file `Uploaded {ledger file name}` were contiguous, meaning all ledgers within that range were exported to the data lake with no gaps or missing ledgers in between. -```bash -ledgerexporter append --start --config-file -``` +### Create a GCS Bucket -### Configuration (toml): -The `stellar_core_config` supports two ways for configuring captive core: - - use prebuilt captive core config toml, archive urls, and passphrase based on `stellar_core_config.network = testnet|pubnet`. - - manually set the the captive core confg by supplying these core parameters which will override any defaults when `stellar_core_config.network` is present also: - `stellar_core_config.captive_core_toml_path` - `stellar_core_config.history_archive_urls` - `stellar_core_config.network_passphrase` +1. Go to the GCP Console's Storage section ([https://console.cloud.google.com/storage](https://console.cloud.google.com/storage)) and create a new bucket. +2. Choose a descriptive name for the bucket, such as `stellar-ledger-data`. +3. **Note down the bucket name** as you'll need it later in the configuration process. 
-Ensure you have stellar-core installed and set `stellar_core_config.stellar_core_binary_path` to it's path on o/s. +## Configuration -Enable web service that will be bound to localhost post and publishes metrics by including `admin_port = {port}` +### Create a Configuration File (`config.toml`) + +The configuration file specifies details about your GCS bucket, the Stellar network, and other settings. + +Replace the placeholder values in the sample file with your specific information: + +
+ Sample TOML Configuration (config.toml) -An example config, demonstrating preconfigured captive core settings and gcs data store config. ```toml +# Admin port configuration +# Specifies the port number for hosting the web service locally to publish metrics. admin_port = 6061 -[datastore_config] +# Datastore Configuration +[datastore] +# Specifies the type of datastore. Currently, only Google Cloud Storage (GCS) is supported. type = "GCS" -[datastore_config.params] -destination_bucket_path = "your-bucket-name///" +[datastore.parameters] +# The Google Cloud Storage bucket path for storing data, with optional subpaths for organization. +bucket_path = "your-bucket-name///" + +[datastore.schema] +# Configuration for ledger and file storage. +ledgers_per_file = 64 # Number of ledgers stored in each file. +files_per_partition = 10 # Number of files per partition directory. -[datastore_config.schema] -ledgers_per_file = 64 -files_per_partition = 10 +# Stellar-core Configuration +[stellar_core] +# Use default captive-core config based on network +# Options are "testnet" for the test network or "pubnet" for the public network. +network = "testnet" -[stellar_core_config] - network = "testnet" - stellar_core_binary_path = "/my/path/to/stellar-core" - captive_core_toml_path = "my-captive-core.cfg" - history_archive_urls = ["http://testarchiveurl1", "http://testarchiveurl2"] - network_passphrase = "test" +# Alternatively, you can manually configure captive-core parameters (overrides defaults if 'network' is set). + +# Path to the captive-core configuration file. +#captive_core_config_path = "my-captive-core.cfg" + +# URLs for Stellar history archives, with multiple URLs allowed. +#history_archive_urls = ["http://testarchiveurl1", "http://testarchiveurl2"] + +# Network passphrase for the Stellar network. 
+#network_passphrase = "Test SDF Network ; September 2015" + +# Path to stellar-core binary +# Not required when running in a Docker container as it has the stellar-core installed and path is set. +# When running outside of Docker, it will look for stellar-core in the OS path if it exists. +#stellar_core_binary_path = "/my/path/to/stellar-core" +``` +
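To make the `ledgers_per_file` and `files_per_partition` settings in the sample configuration more concrete, here is a minimal Go sketch of how a ledger sequence number could map to a partition directory and file name. It follows the naming convention described in the README text that this patch removes (for example `/0-639/0-63.xdr.zstd` with 64 ledgers per file and 10 files per partition). The `schema` type and `objectKey` function are hypothetical names for illustration only. They are not the exporter's actual `GetObjectKeyFromSequenceNumber` implementation, and the special cases where either setting equals 1 are not handled.

```go
package main

import "fmt"

// schema mirrors the two [datastore.schema] settings from the sample config.
// Hypothetical type for illustration; not part of the exporter's code.
type schema struct {
	LedgersPerFile    uint32
	FilesPerPartition uint32
}

// objectKey returns "<partition range>/<file range>.xdr.zstd" for a ledger
// sequence. Assumes both schema fields are greater than 1.
func objectKey(s schema, seq uint32) string {
	// Round the sequence down to the first ledger of its file.
	fileStart := seq / s.LedgersPerFile * s.LedgersPerFile
	fileEnd := fileStart + s.LedgersPerFile - 1

	// A partition spans LedgersPerFile * FilesPerPartition ledgers.
	perPartition := s.LedgersPerFile * s.FilesPerPartition
	partStart := seq / perPartition * perPartition
	partEnd := partStart + perPartition - 1

	return fmt.Sprintf("%d-%d/%d-%d.xdr.zstd", partStart, partEnd, fileStart, fileEnd)
}

func main() {
	s := schema{LedgersPerFile: 64, FilesPerPartition: 10}
	for _, seq := range []uint32{0, 63, 64, 700} {
		fmt.Printf("ledger %d -> %s\n", seq, objectKey(s, seq))
	}
}
```

Under these assumptions ledger 700 lands in `640-1279/640-703.xdr.zstd`. Because the mapping depends on both settings, changing either one after data has been exported would scatter ledgers across incompatible file names.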
-### Exported Files +## Running the Ledger Exporter -#### File Organization: -- Ledgers are grouped into files, with the number of ledgers per file set by `ledgers_per_file`. -- Files are further organized into partitions, with the number of files per partition set by `files_per_partition`. +### Pull the Docker Image -### Filename Structure: -- Filenames indicate the ledger range they contain, e.g., `0-63.xdr.zstd` holds ledgers 0 to 63. -- Partition directories group files, e.g., `/0-639/` holds files for ledgers 0 to 639. +Open a terminal window and run the following command to download the Stellar Ledger Exporter Docker image: -#### Example: -with `ledgers_per_file = 64` and `files_per_partition = 10`: -- Partition names: `/0-639`, `/640-1279`, ... -- Filenames: `/0-639/0-63.xdr.zstd`, `/0-639/64-127.xdr.zstd`, ... +```bash +docker pull stellar/ledger-exporter +``` + +### Run the Ledger Exporter -#### Special Cases: +The following command demonstrates how to run the Ledger Exporter: -- If `ledgers_per_file` is set to 1, filenames will only contain the ledger number. -- If `files_per_partition` is set to 1, filenames will not contain the partition. +```bash +docker run --platform linux/amd64 -d \ + -v "$HOME/.config/gcloud/application_default_credentials.json":/.config/gcp/credentials.json:ro \ + -e GOOGLE_APPLICATION_CREDENTIALS=/.config/gcp/credentials.json \ + -v ${PWD}/config.toml:/config.toml \ + stellar/ledger-exporter [options] +``` -#### Note: -- Avoid changing `ledgers_per_file` and `files_per_partition` after configuration for consistency. +**Explanation:** -#### Retrieving Data: -- To locate a specific ledger sequence, calculate the partition name and ledger file name using `files_per_partition` and `ledgers_per_file`. -- The `GetObjectKeyFromSequenceNumber` function automates this calculation. +* `--platform linux/amd64`: Specifies the platform architecture (adjust if needed for your system). 
+* `-d`: Runs the container in detached mode (background process). +* `-v`: Mounts volumes to map your local GCP credentials and config.toml file to the container: + * `$HOME/.config/gcloud/application_default_credentials.json`: Your local GCP credentials file. + * `${PWD}/config.toml`: Your local configuration file. +* `-e GOOGLE_APPLICATION_CREDENTIALS=/.config/gcp/credentials.json`: Sets the environment variable for credentials within the container. +* `stellar/ledger-exporter`: The Docker image name. +* ``: The Stellar Ledger Exporter command (e.g., [scan-and-fill](#1-scan-and-fill), [append](#2-append)) + +## CLI Commands + +The Stellar Ledger Exporter offers two primary commands to manage ledger data export: + +### 1. scan-and-fill + +**Purpose:** +Exports a specific range of Stellar ledgers, defined by the `--start` and `--end` options. + +**Behavior:** +- Scans the specified ledger sequence range. +- Exports only missing ledgers to the remote datastore (GCS bucket). +- Avoids unnecessary exports if data is already present. + +**Usage:** + +```bash +docker run --platform linux/amd64 -d \ + -v "$HOME/.config/gcloud/application_default_credentials.json":/.config/gcp/credentials.json:ro \ + -e GOOGLE_APPLICATION_CREDENTIALS=/.config/gcp/credentials.json \ + -v ${PWD}/config.toml:/config.toml \ + stellar/ledger-exporter \ + scan-and-fill --start --end [--config ] +``` + +Arguments: +- `--start ` (required): The starting ledger sequence number in the range to export. +- `--end ` (required): The ending ledger sequence number in the range. +- `--config ` (optional): The path to your configuration file, containing details like GCS bucket information. Defaults to `config.toml` in the runtime working directory. + +### 2. append + +**Purpose:** +Exports ledgers starting from `--start`, searching for the next missing ledger sequence number in the datastore. If a missing ledger is found, the export begins from that missing ledger. 
+ +**Behavior:** +- Starts searching from the provided `--start` ledger and identifies the first missing ledger sequence number after `--start` in the remote datastore (GCS bucket). +- Narrows the export range to include only missing ledgers from that point onwards. +- If the `--end` ledger is provided, it will stop the process once export has reached that ledger. If the `--end` ledger is absent or set to 0, the exporter will continuously export new ledgers as they appear on the network. + +**Usage:** + +```bash +docker run --platform linux/amd64 -d \ + -v "$HOME/.config/gcloud/application_default_credentials.json":/.config/gcp/credentials.json:ro \ + -e GOOGLE_APPLICATION_CREDENTIALS=/.config/gcp/credentials.json \ + -v ${PWD}/config.toml:/config.toml \ + stellar/ledger-exporter \ + append --start [--end ] [--config ] +``` +Arguments: +- `--start ` (required): The starting ledger sequence number for the export process. +- `--end ` (optional): The ending ledger sequence number. If omitted or set to 0, the exporter will continuously export new ledgers as they appear on the network. +- `--config ` (optional): The path to your configuration file, containing details like GCS bucket information. Defaults to `config.toml` in the runtime working directory. \ No newline at end of file diff --git a/exp/services/ledgerexporter/config.example.toml b/exp/services/ledgerexporter/config.example.toml new file mode 100644 index 0000000000..84e65144bd --- /dev/null +++ b/exp/services/ledgerexporter/config.example.toml @@ -0,0 +1,44 @@ + +# Sample TOML Configuration + +# Admin port configuration +# Specifies the port number for hosting the web service locally to publish metrics. +admin_port = 6061 + +# Datastore Configuration +[datastore] +# Specifies the type of datastore. Currently, only Google Cloud Storage (GCS) is supported. +type = "GCS" + +[datastore.parameters] +# The Google Cloud Storage bucket path for storing data, with optional subpaths for organization. 
+bucket_path = "your-bucket-name///" + +[datastore.schema] +# Configuration for ledger and file storage. +ledgers_per_file = 64 # Number of ledgers stored in each file. +files_per_partition = 10 # Number of files per partition directory. + +# Stellar-core Configuration +[stellar_core] +# Use default captive-core config based on network +# Options are "testnet" for the test network or "pubnet" for the public network. +network = "testnet" + +# Alternatively, you can manually configure captive-core parameters (overrides defaults if 'network' is set). + +# Path to the captive-core configuration file. +#captive_core_config_path = "my-captive-core.cfg" + +# URLs for Stellar history archives, with multiple URLs allowed. +#history_archive_urls = ["http://testarchiveurl1", "http://testarchiveurl2"] + +# Network passphrase for the Stellar network. +#network_passphrase = "Test SDF Network ; September 2015" + +# Path to stellar-core binary +# Not required when running in a Docker container as it has the stellar-core installed and path is set. +# When running outside of Docker, it will look for stellar-core in the OS path if it exists. +# If you want to override the path, you can do so here. 
+#stellar_core_binary_path = "/my/path/to/stellar-core" + diff --git a/exp/services/ledgerexporter/config.toml b/exp/services/ledgerexporter/config.toml index c41d9376ac..c5c4519f0b 100644 --- a/exp/services/ledgerexporter/config.toml +++ b/exp/services/ledgerexporter/config.toml @@ -10,5 +10,4 @@ files_per_partition = 64000 [stellar_core_config] network = "testnet" - stellar_core_binary_path = "/usr/local/bin/stellar-core" diff --git a/exp/services/ledgerexporter/internal/main.go b/exp/services/ledgerexporter/internal/main.go index d1409eb89c..425ca5ac6e 100644 --- a/exp/services/ledgerexporter/internal/main.go +++ b/exp/services/ledgerexporter/internal/main.go @@ -39,7 +39,7 @@ func defineCommands() { RunE: func(cmd *cobra.Command, args []string) error { settings := bindCliParameters(cmd.PersistentFlags().Lookup("start"), cmd.PersistentFlags().Lookup("end"), - cmd.PersistentFlags().Lookup("config-file"), + cmd.PersistentFlags().Lookup("config"), ) settings.Mode = ScanFill return ledgerExporterCmdRunner(settings) @@ -52,7 +52,7 @@ func defineCommands() { RunE: func(cmd *cobra.Command, args []string) error { settings := bindCliParameters(cmd.PersistentFlags().Lookup("start"), cmd.PersistentFlags().Lookup("end"), - cmd.PersistentFlags().Lookup("config-file"), + cmd.PersistentFlags().Lookup("config"), ) settings.Mode = Append return ledgerExporterCmdRunner(settings) @@ -64,14 +64,14 @@ func defineCommands() { scanAndFillCmd.PersistentFlags().Uint32P("start", "s", 0, "Starting ledger (inclusive), must be set to a value greater than 1") scanAndFillCmd.PersistentFlags().Uint32P("end", "e", 0, "Ending ledger (inclusive), must be set to value greater than 'start' and less than the network's current ledger") - scanAndFillCmd.PersistentFlags().String("config-file", "config.toml", "Path to the TOML config file. Defaults to 'config.toml' on runtime working directory path.") + scanAndFillCmd.PersistentFlags().String("config", "config.toml", "Path to the TOML config file. 
Defaults to 'config.toml' on runtime working directory path.") viper.BindPFlags(scanAndFillCmd.PersistentFlags()) appendCmd.PersistentFlags().Uint32P("start", "s", 0, "Starting ledger (inclusive), must be set to a value greater than 1") appendCmd.PersistentFlags().Uint32P("end", "e", 0, "Ending ledger (inclusive), optional, setting to non-zero means bounded mode, "+ "only export ledgers from 'start' up to 'end' value which must be greater than 'start' and less than the network's current ledger. "+ "If 'end' is absent or '0' means unbounded mode, exporter will continue to run indefintely and export the latest closed ledgers from network as they are generated in real time.") - appendCmd.PersistentFlags().String("config-file", "config.toml", "Path to the TOML config file. Defaults to 'config.toml' on runtime working directory path.") + appendCmd.PersistentFlags().String("config", "config.toml", "Path to the TOML config file. Defaults to 'config.toml' on runtime working directory path.") viper.BindPFlags(appendCmd.PersistentFlags()) } diff --git a/exp/services/ledgerexporter/internal/main_test.go b/exp/services/ledgerexporter/internal/main_test.go index 4c9e5412f3..340ead2d03 100644 --- a/exp/services/ledgerexporter/internal/main_test.go +++ b/exp/services/ledgerexporter/internal/main_test.go @@ -29,12 +29,12 @@ func TestFlagsOutput(t *testing.T) { }{ { name: "no sub-command", - commandArgs: []string{"--start", "4", "--end", "5", "--config-file", "myfile"}, + commandArgs: []string{"--start", "4", "--end", "5", "--config", "myfile"}, expectedErrOutput: "Error: ", }, { name: "append sub-command with start and end present", - commandArgs: []string{"append", "--start", "4", "--end", "5", "--config-file", "myfile"}, + commandArgs: []string{"append", "--start", "4", "--end", "5", "--config", "myfile"}, expectedErrOutput: "", appRunner: appRunnerSuccess, expectedSettings: RuntimeSettings{ @@ -46,7 +46,7 @@ func TestFlagsOutput(t *testing.T) { }, { name: "append sub-command 
with start and end absent", - commandArgs: []string{"append", "--config-file", "myfile"}, + commandArgs: []string{"append", "--config", "myfile"}, expectedErrOutput: "", appRunner: appRunnerSuccess, expectedSettings: RuntimeSettings{ @@ -58,13 +58,13 @@ func TestFlagsOutput(t *testing.T) { }, { name: "append sub-command prints app error", - commandArgs: []string{"append", "--start", "4", "--end", "5", "--config-file", "myfile"}, + commandArgs: []string{"append", "--start", "4", "--end", "5", "--config", "myfile"}, expectedErrOutput: "test error", appRunner: appRunnerError, }, { name: "scanfill sub-command with start and end present", - commandArgs: []string{"scan-and-fill", "--start", "4", "--end", "5", "--config-file", "myfile"}, + commandArgs: []string{"scan-and-fill", "--start", "4", "--end", "5", "--config", "myfile"}, expectedErrOutput: "", appRunner: appRunnerSuccess, expectedSettings: RuntimeSettings{ @@ -76,7 +76,7 @@ func TestFlagsOutput(t *testing.T) { }, { name: "scanfill sub-command with start and end absent", - commandArgs: []string{"scan-and-fill", "--config-file", "myfile"}, + commandArgs: []string{"scan-and-fill", "--config", "myfile"}, expectedErrOutput: "", appRunner: appRunnerSuccess, expectedSettings: RuntimeSettings{ @@ -88,7 +88,7 @@ func TestFlagsOutput(t *testing.T) { }, { name: "scanfill sub-command prints app error", - commandArgs: []string{"scan-and-fill", "--start", "4", "--end", "5", "--config-file", "myfile"}, + commandArgs: []string{"scan-and-fill", "--start", "4", "--end", "5", "--config", "myfile"}, expectedErrOutput: "test error", appRunner: appRunnerError, },
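The flag rename exercised by these test cases (`--config-file` becomes `--config`, still defaulting to `config.toml`) can be sketched in isolation. The example below is a simplified illustration that uses Go's standard `flag` package instead of the cobra/pflag setup in `main.go`, so it is not the exporter's code. The `settings` type and `parseAppend` function are hypothetical, and `uint` stands in for the `uint32` flags because the standard library has no `Uint32` flag type.

```go
package main

import (
	"flag"
	"fmt"
	"io"
)

// settings is a hypothetical stand-in for the exporter's RuntimeSettings.
type settings struct {
	Start      uint
	End        uint
	ConfigPath string
	Unbounded  bool // true when --end is omitted or 0
}

// parseAppend parses the arguments that follow the "append" sub-command.
func parseAppend(args []string) (settings, error) {
	fs := flag.NewFlagSet("append", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep usage text out of the example output

	start := fs.Uint("start", 0, "starting ledger (inclusive)")
	end := fs.Uint("end", 0, "ending ledger (inclusive); 0 means unbounded")
	config := fs.String("config", "config.toml", "path to the TOML config file")

	if err := fs.Parse(args); err != nil {
		return settings{}, err
	}
	return settings{
		Start:      *start,
		End:        *end,
		ConfigPath: *config,
		Unbounded:  *end == 0,
	}, nil
}

func main() {
	// New flag name: accepted.
	s, err := parseAppend([]string{"--start", "4", "--config", "myfile"})
	fmt.Printf("%+v err=%v\n", s, err)

	// Old flag name: rejected, because only "config" is registered.
	_, err = parseAppend([]string{"--start", "4", "--config-file", "myfile"})
	fmt.Println("old flag error:", err)
}
```

In this sketch the old `--config-file` spelling fails to parse rather than being silently ignored, so any script still using the old flag name would need updating after a rename like this one.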