Ingestion fails with ledgers not contiguous #166

Closed
2opremio opened this issue May 9, 2024 · 1 comment · Fixed by #168
Labels
bug Something isn't working

2opremio commented May 9, 2024

What version are you using?

21.0.1

What did you do?

Restarted RPC (as part of a K8s cluster upgrade).

What did you expect to see?

Ingestion working

What did you see instead?

time="2024-05-09T18:24:36.715Z" level=error msg="could not run ingestion. Retrying" error="error appending ledgers: ledgers not contiguous: expected ledger sequence 51590893 but received 51594383" pid=1

The error is raised inside eventStore.IngestEvents().

It occurs after the in-memory data structures have been loaded and while Core is running.

Root cause theories:

  • Disconnect between the ledger range in the DB and the one obtained from captive core
  • DB Corruption
  • Events in-memory data-structure corruption or malfunctioning
@2opremio 2opremio added the bug Something isn't working label May 9, 2024

2opremio commented May 10, 2024

Root cause theories:

It was none of the above.

It turns out that loading the in-memory data structures timed out, but the error was ignored (PR coming up).

As a result, the in-memory data structure didn't get to load the full ledger range from the DB (in the occurrence above it only loaded up to ledger 51590893 instead of 51594383).

Later, we attempted to continue loading the in-memory data structure from captive core (which we told to start at the DB boundary: 51594383).

Since the in-memory data structure was only loaded up to 51590893 (due to the timeout), appending the first ledger from captive core failed the contiguity check.

We should also extend the default ingestion timeout.
