ingest/ledgerbackend: add trusted hash to captive core catchup #5431

Closed
wants to merge 23 commits into from
Changes from 21 commits
61a5f80
#4538: obtain trusted hash for captive core catchup command
sreuland Aug 19, 2024
1a50ed7
Merge remote-tracking branch 'upstream/master' into trusted_catchup
sreuland Aug 19, 2024
3b706e4
#4538: updated changelog notes
sreuland Aug 19, 2024
2937c18
#4538: use LedgerBackend interface for NewCaptive, fix unit tests tha…
sreuland Aug 20, 2024
116f3b1
Merge remote-tracking branch 'upstream/master' into trusted_catchup
sreuland Aug 20, 2024
8da847f
#4538: fix govet warn
sreuland Aug 20, 2024
f91fa55
#4538: fixed TestTxSubLimitsBodySize to not depend on soroban for tx …
sreuland Aug 20, 2024
d049e73
#4538: added core container log output on integrationt test wait loop
sreuland Aug 20, 2024
805950a
#4538: pin stellar-core image in CI to last stable 21.2.1 image
sreuland Aug 20, 2024
3b335b7
#4538: fix captive listen port conflicts on reingest integration tests
sreuland Aug 21, 2024
c1295eb
message
sreuland Aug 26, 2024
be4df19
Merge remote-tracking branch 'upstream/master' into trusted_catchup
sreuland Aug 26, 2024
e56e346
#4538: updated CHANGELOGs
sreuland Aug 26, 2024
2270bba
#4538: fixed verify-range ci test
sreuland Aug 26, 2024
7315813
#4538: fixed verify-range ci test, again
sreuland Aug 26, 2024
d236308
#4538: fixed shell script syntax on verify range
sreuland Aug 26, 2024
32bb6b9
#4538: create free space on gh runner for verify-range due to out of…
sreuland Aug 27, 2024
0ee83dc
#4538: use next larger gh runner for verify-range due to out of disk …
sreuland Aug 27, 2024
659d0e0
#4538: try older range on pubnet for verify-range, see how long it ta…
sreuland Aug 27, 2024
6e61306
#4538: use testnet for verify-range, shorter duration than pubnet
sreuland Aug 27, 2024
3d23723
Merge remote-tracking branch 'upstream/master' into trusted_catchup
sreuland Aug 27, 2024
d58b15a
#4538: enabled captive core full config option from reingest cmd flag…
sreuland Aug 28, 2024
41eeb8f
#4538: included new file for db command tests
sreuland Aug 29, 2024
8 changes: 6 additions & 2 deletions .github/workflows/horizon.yml
@@ -126,6 +126,10 @@ jobs:
STELLAR_CORE_VERSION: 21.3.1-2007.4ede19620.focal
CAPTIVE_CORE_STORAGE_PATH: /tmp
steps:
- name: Free Disk Space (Ubuntu)
uses: jlumbroso/free-disk-space@main
with:
tool-cache: true
- uses: actions/checkout@v3
with:
# For pull requests, build and test the PR head not a merge of the PR with the destination.
@@ -134,8 +138,8 @@ jobs:
- name: Build and test the Verify Range Docker image
run: |
docker build --build-arg="GO_VERSION=$(sed -En 's/^toolchain[[:space:]]+go([[:digit:].]+)$/\1/p' go.mod)" -f services/horizon/docker/verify-range/Dockerfile -t stellar/horizon-verify-range services/horizon/docker/verify-range/
# Any range should do for basic testing, this range was chosen pretty early in history so that it only takes a few mins to run
docker run -e BRANCH=$(git rev-parse HEAD) -e FROM=10000063 -e TO=10000127 stellar/horizon-verify-range
# Use small default range of two most recent checkpoints back from latest archived checkpoint.
docker run -e TESTNET=true -e BRANCH=$(git rev-parse HEAD) -e FROM=0 -e TO=0 stellar/horizon-verify-range
Contributor Author
Using the latest checkpoint on testnet for verify-range. I tried a similar range on pubnet, but it takes 45+ minutes, mostly in change processor ledger entry processing.
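
For reference, a minimal sketch of how a "two most recent checkpoints" range could be derived, assuming the usual Stellar checkpoint frequency of 64 ledgers (checkpoint ledgers satisfy `(seq+1) % 64 == 0`). The helper names `checkpointAtOrBefore` and `defaultVerifyRange` are hypothetical and are not taken from the verify-range script, which may pick the range differently.

```go
package main

import "fmt"

// checkpointFrequency is the assumed number of ledgers per checkpoint.
const checkpointFrequency uint32 = 64

// checkpointAtOrBefore returns the highest checkpoint ledger that is <= seq.
// Hypothetical helper: ok is false when no checkpoint exists at or before seq.
func checkpointAtOrBefore(seq uint32) (checkpoint uint32, ok bool) {
	if seq < checkpointFrequency-1 {
		return 0, false
	}
	return (seq+1)/checkpointFrequency*checkpointFrequency - 1, true
}

// defaultVerifyRange picks a small range ending at the newest checkpoint at or
// before latest and starting one checkpoint earlier. Hypothetical helper.
func defaultVerifyRange(latest uint32) (from, to uint32, ok bool) {
	to, ok = checkpointAtOrBefore(latest)
	if !ok || to < 2*checkpointFrequency-1 {
		// Not enough history for two checkpoints.
		return 0, 0, false
	}
	return to - checkpointFrequency, to, true
}

func main() {
	from, to, ok := defaultVerifyRange(10000130)
	fmt.Println(from, to, ok)
}
```

With this reading, a latest ledger just past 10000127 yields the same kind of checkpoint-to-checkpoint range as the old hard-coded `FROM`/`TO` values.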


# Push image
- if: github.ref == 'refs/heads/master'
8 changes: 8 additions & 0 deletions ingest/CHANGELOG.md
@@ -1,5 +1,13 @@
# Changelog

## Pending

### Fixed
* The Captive Core backend now runs stellar-core `run` ('online' mode) for bounded ranges of tx-meta retrieval. Refer to [runFrom.go](./ledgerbackend/run_from.go). This lets core build, validate, and emit trusted ledger hashes in the tx-meta stream, starting from the latest ledger on the network, for a bounded ledger range. Bounded mode no longer runs core `catchup` ('offline' mode) to get tx-meta from history archives alone, which does not guarantee that ledger hashes are verified against the live network. ([#4538](https://github.com/stellar/go/pull/4538)).
* Note - because `run` is now used with LCL set to `from`, execution may take longer: core has to perform online replay from the network's latest ledger back to `from`. The older the `from` ledger, the longer the run.



All notable changes to this project will be documented in this file. This project adheres to [Semantic Versioning](http://semver.org/).

### Stellar Core Protocol 21 Configuration Update:
10 changes: 5 additions & 5 deletions ingest/ledger_change_reader_test.go
@@ -23,7 +23,7 @@ const (

func TestNewLedgerChangeReaderFails(t *testing.T) {
ctx := context.Background()
mock := &ledgerbackend.MockDatabaseBackend{}
mock := &ledgerbackend.MockLedgerBackend{}
seq := uint32(123)
mock.On("GetLedger", ctx, seq).Return(
xdr.LedgerCloseMeta{},
@@ -39,7 +39,7 @@ func TestNewLedgerChangeReaderFails(t *testing.T) {

func TestNewLedgerChangeReaderSucceeds(t *testing.T) {
ctx := context.Background()
mock := &ledgerbackend.MockDatabaseBackend{}
mock := &ledgerbackend.MockLedgerBackend{}
seq := uint32(123)

header := xdr.LedgerHeaderHistoryEntry{
@@ -146,7 +146,7 @@ func assertChangesEqual(

func TestLedgerChangeReaderOrder(t *testing.T) {
ctx := context.Background()
mock := &ledgerbackend.MockDatabaseBackend{}
mock := &ledgerbackend.MockLedgerBackend{}
seq := uint32(123)

src := xdr.MustAddress("GBXGQJWVLWOYHFLVTKWV5FGHA3LNYY2JQKM7OAJAUEQFU6LPCSEFVXON")
@@ -353,7 +353,7 @@ func TestLedgerChangeReaderOrder(t *testing.T) {

func TestLedgerChangeLedgerCloseMetaV2(t *testing.T) {
ctx := context.Background()
mock := &ledgerbackend.MockDatabaseBackend{}
mock := &ledgerbackend.MockLedgerBackend{}
seq := uint32(123)

src := xdr.MustAddress("GBXGQJWVLWOYHFLVTKWV5FGHA3LNYY2JQKM7OAJAUEQFU6LPCSEFVXON")
@@ -600,7 +600,7 @@ func TestLedgerChangeLedgerCloseMetaV2(t *testing.T) {

func TestLedgerChangeLedgerCloseMetaV2Empty(t *testing.T) {
ctx := context.Background()
mock := &ledgerbackend.MockDatabaseBackend{}
mock := &ledgerbackend.MockLedgerBackend{}
seq := uint32(123)

baseFee := xdr.Int64(100)
90 changes: 20 additions & 70 deletions ingest/ledgerbackend/captive_core_backend.go
@@ -28,16 +28,6 @@ var _ LedgerBackend = (*CaptiveStellarCore)(nil)
// ErrCannotStartFromGenesis is returned when attempting to prepare a range from ledger 1
var ErrCannotStartFromGenesis = errors.New("CaptiveCore is unable to start from ledger 1, start from ledger 2")

func (c *CaptiveStellarCore) roundDownToFirstReplayAfterCheckpointStart(ledger uint32) uint32 {
r := c.checkpointManager.GetCheckpointRange(ledger)
if r.Low <= 1 {
// Stellar-Core doesn't stream ledger 1
return 2
}
// All other checkpoints start at the next multiple of 64
return r.Low
}

// CaptiveStellarCore is a ledger backend that starts internal Stellar-Core
// subprocess responsible for streaming ledger data. It provides better decoupling
// than DatabaseBackend but requires some extra init time.
@@ -163,7 +153,7 @@ type CaptiveCoreConfig struct {
}

// NewCaptive returns a new CaptiveStellarCore instance.
func NewCaptive(config CaptiveCoreConfig) (*CaptiveStellarCore, error) {
func NewCaptive(config CaptiveCoreConfig) (LedgerBackend, error) {
// Here we set defaults in the config. Because config is not a pointer this code should
// not mutate the original CaptiveCoreConfig instance which was passed into NewCaptive()

@@ -327,56 +317,18 @@ func (c *CaptiveStellarCore) getLatestCheckpointSequence() (uint32, error) {
return has.CurrentLedger, nil
}

func (c *CaptiveStellarCore) openOfflineReplaySubprocess(from, to uint32) error {
latestCheckpointSequence, err := c.getLatestCheckpointSequence()
if err != nil {
return errors.Wrap(err, "error getting latest checkpoint sequence")
}

if from > latestCheckpointSequence {
return errors.Errorf(
"from sequence: %d is greater than max available in history archives: %d",
from,
latestCheckpointSequence,
)
}

if to > latestCheckpointSequence {
return errors.Errorf(
"to sequence: %d is greater than max available in history archives: %d",
to,
latestCheckpointSequence,
)
}

stellarCoreRunner := c.stellarCoreRunnerFactory()
if err = stellarCoreRunner.catchup(from, to); err != nil {
return errors.Wrap(err, "error running stellar-core")
}
c.stellarCoreRunner = stellarCoreRunner

// The next ledger should be the first ledger of the checkpoint containing
// the requested ledger
ran := BoundedRange(from, to)
c.ledgerSequenceLock.Lock()
defer c.ledgerSequenceLock.Unlock()

c.prepared = &ran
c.nextLedger = c.roundDownToFirstReplayAfterCheckpointStart(from)
c.lastLedger = &to
c.previousLedgerHash = nil

return nil
}

func (c *CaptiveStellarCore) openOnlineReplaySubprocess(ctx context.Context, from uint32) error {
runFrom, ledgerHash, err := c.runFromParams(ctx, from)
func (c *CaptiveStellarCore) openOnlineReplaySubprocess(ctx context.Context, ledgerRange Range) error {
runFrom, ledgerHash, err := c.runFromParams(ctx, ledgerRange.from)
if err != nil {
return errors.Wrap(err, "error calculating ledger and hash for stellar-core run")
}

stellarCoreRunner := c.stellarCoreRunnerFactory()
if err = stellarCoreRunner.runFrom(runFrom, ledgerHash); err != nil {
runnerMode := stellarCoreRunnerModeActive
if ledgerRange.bounded {
runnerMode = stellarCoreRunnerModePassive
}
if err = stellarCoreRunner.runFrom(runFrom, ledgerHash, runnerMode); err != nil {
return errors.Wrap(err, "error running stellar-core")
}
c.stellarCoreRunner = stellarCoreRunner
@@ -388,9 +340,15 @@ func (c *CaptiveStellarCore) openOnlineReplaySubprocess(ctx context.Context, fro
defer c.ledgerSequenceLock.Unlock()

c.nextLedger = 0
ran := UnboundedRange(from)
ran := ledgerRange
var last *uint32
if ledgerRange.bounded {
boundedTo := ledgerRange.to
last = &boundedTo
}

c.lastLedger = last
Contributor Author
This is the key to being able to (re)use the live runFrom with a bounded range: it triggers captive core to be stopped once it emits a ledger on the pipe that matches the bounded `to`, if one exists. That is equivalent to the core catchup process terminating after emitting the same `to` ledger.
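
A rough sketch of that stop condition, not the actual captive core code: the consumer keeps reading ledger sequences from the stream and stops as soon as it sees the bounded `to`. The `streamUntil` function and the channel-based stream are assumptions made for illustration; the real backend reads ledgers from the core meta pipe.

```go
package main

import "fmt"

// streamUntil reads ledger sequences from the stream and returns the ones it
// consumed. When last is non-nil (bounded range), it stops right after the
// ledger equal to *last; when last is nil (unbounded range), it reads until
// the stream is closed. Hypothetical helper for illustration only.
func streamUntil(ledgers <-chan uint32, last *uint32) []uint32 {
	var consumed []uint32
	for seq := range ledgers {
		consumed = append(consumed, seq)
		if last != nil && seq == *last {
			// Bounded range satisfied: the caller can now stop the subprocess.
			break
		}
	}
	return consumed
}

func main() {
	// Simulate core emitting ledgers 100..109 on its pipe.
	stream := make(chan uint32, 10)
	for seq := uint32(100); seq < 110; seq++ {
		stream <- seq
	}
	close(stream)

	boundedTo := uint32(105)
	fmt.Println(streamUntil(stream, &boundedTo))
}
```

Copying `ledgerRange.to` into a local before taking its address, as the diff does with `boundedTo`, keeps the pointer independent of the range value itself.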

c.prepared = &ran
c.lastLedger = nil
c.previousLedgerHash = nil

return nil
@@ -497,12 +455,8 @@ func (c *CaptiveStellarCore) startPreparingRange(ctx context.Context, ledgerRang
}
}

var err error
if ledgerRange.bounded {
err = c.openOfflineReplaySubprocess(ledgerRange.from, ledgerRange.to)
} else {
err = c.openOnlineReplaySubprocess(ctx, ledgerRange.from)
}
err := c.openOnlineReplaySubprocess(ctx, ledgerRange)

if err != nil {
return false, errors.Wrap(err, "opening subprocess")
}
@@ -513,13 +467,9 @@ func (c *CaptiveStellarCore) startPreparingRange(ctx context.Context, ledgerRang
// PrepareRange prepares the given range (including from and to) to be loaded.
// Captive stellar-core backend needs to initialize Stellar-Core state to be
// able to stream ledgers.
// Stellar-Core mode depends on the provided ledgerRange:
// - For BoundedRange it will start Stellar-Core in catchup mode.
// - For UnboundedRange it will first catchup to starting ledger and then run
// it normally (including connecting to the Stellar network).
//
// Please note that using a BoundedRange, currently, requires a full-trust on
// history archive. This issue is being fixed in Stellar-Core.
// ctx - caller context
// ledgerRange - specify the range info
func (c *CaptiveStellarCore) PrepareRange(ctx context.Context, ledgerRange Range) error {
if alreadyPrepared, err := c.startPreparingRange(ctx, ledgerRange); err != nil {
return errors.Wrap(err, "error starting prepare range")