Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ingest events into DB #192

Merged
merged 5 commits into from
Jun 18, 2024
Merged

Ingest events into DB #192

merged 5 commits into from
Jun 18, 2024

Conversation

psheth9
Copy link
Contributor

@psheth9 psheth9 commented May 31, 2024

Fixes: #189

This diff mainly covers the write path (ingestion) of events, follow up PR refactors getEvents in order to fetch it from DB

Changes:

  • Introduce new event table
  • Parse events from LCM and ingest in DB
  • Tests

Design Doc here
Parent Epic here

@psheth9 psheth9 self-assigned this May 31, 2024
@psheth9 psheth9 changed the base branch from main to events-db-backend May 31, 2024 22:03
@psheth9 psheth9 changed the title [Draft] Ingest events into DB Ingest events into DB Jun 4, 2024
@psheth9 psheth9 marked this pull request as ready for review June 4, 2024 16:02
@2opremio
Copy link
Contributor

2opremio commented Jun 4, 2024

Please do not merge until there is a solution for data migration

@psheth9 psheth9 requested review from Shaptic and aditya1702 June 4, 2024 20:42
@psheth9
Copy link
Contributor Author

psheth9 commented Jun 4, 2024

Please do not merge until there is a solution for data migration

Not going to merge to main but to events-db-backend !! @2opremio

@2opremio
Copy link
Contributor

2opremio commented Jun 6, 2024

Did you write migration code for the new schema? Running instances with the in-memory datasctructure should work when running this PR.

@2opremio
Copy link
Contributor

You also need to remove the in-memory prefill from daemon.go

@psheth9
Copy link
Contributor Author

psheth9 commented Jun 10, 2024

@2opremio Yes I will be adding the migration code and going to remove the entire in-memory storage of events in follow up PR. Not planning to do that in this PR bc need to refactor getEvents [to fetch it from DB instead of in-memory]

Could you approve this PR if DB schema and ingestion looks good :) ? Updated the details of follow up refactoring task here

Note: This won't merge changes to main but only to events-db-backend

const eventTableName = "events"

type EventWriter interface {
InsertEvents(lcm xdr.LedgerCloseMeta) error
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the interface should not be defined in the same file as its implementation: https://go.dev/wiki/CodeReviewComments#interfaces

Would like to get a recommendation from @tamirms about this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, but that's how it's done everywhere else in the db package

return err
}

if !tx.Result.Successful() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we skipping if the transaction is not successful? There can be diagnostic events for both successful and failed txns right?

@2opremio @tamirms

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this is the existing behavior. Can update it if it is not true anymore.

Copy link
Contributor

@aditya1702 aditya1702 Jun 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I saw it is in current getEvents too. I wanted to confirm this is true since I saw diagnostic events in both types of txns. @tamirms @2opremio

err = eventW.InsertEvents(ledgerCloseMeta)
assert.NoError(t, err)

//TODO: Call getEvents and validate events data.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please resolve this before merging?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, actually ignore this TODO as getEvents is backed by memory in this diff, this will be resolved in this follow up PR #215 (WIP) @2opremio

@2opremio
Copy link
Contributor

Can you please fix the tests and the TODO?

@psheth9
Copy link
Contributor Author

psheth9 commented Jun 13, 2024

@2opremio Fixed the tests PTAL :) Next is coming up soon

@2opremio
Copy link
Contributor

2opremio commented Jun 13, 2024

Ah, the in-memory store still exists? I would much rather review finished, self-contained code with proper tests.

Can you remove the in-memory store and add end-to-end tests (i.e. ingesting and checking that ingestion worked through querying) here?

@2opremio
Copy link
Contributor

I have finished what I was going , so I will get to review this on Monday. Sorry for the delay.

defer func() {
closeErr := txReader.Close()
if err == nil {
err = closeErr
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't overwrite the error, just use errors.Join()

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, note that err is shadowed down the line, so this may not be the err you want

continue
}

txEvents, err := tx.GetDiagnosticEvents()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shadows the errabove, so the error handling in defer above won't do what you expect. You may want to use a named return parameter to make it cleaner.

@2opremio
Copy link
Contributor

The approach seems fine, but ... it's hard to tell without seeing the complexity of the querying side.

I would like to see a benchmark of querying and see how it compares with the in-memory store.

@psheth9 psheth9 merged commit 3121178 into events-db-backend Jun 18, 2024
20 checks passed
@psheth9 psheth9 deleted the ingest-events-in-db branch June 18, 2024 04:42
if e.Event.ContractId != nil {
contractId = e.Event.ContractId[:]
}
id := events.Cursor{Ledger: lcm.LedgerSequence(), Tx: tx.Index, Op: 0, Event: uint32(index)}.String()
Copy link
Contributor

@2opremio 2opremio Jun 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@psheth9 I have just realized that the table already contains a ledger row, so the ledger info here is redundant.

Also:

  1. if the op is always 0 (and we don't plan that to change) that info shouldn't go into the DB.
  2. the tx index isn't the same as the application order field?

Thus, unless there is a good reason for the opposite, I would suggest storing the tx index and event index instead of the id field.

@psheth9 psheth9 mentioned this pull request Aug 20, 2024
psheth9 added a commit that referenced this pull request Aug 24, 2024
* Ingest events into DB (#192)

* Ingest events into DB

* Update tests

* Update tests

* Fix tests

* Ignore ingestion when events are empty

* Add getEvents backed by DB  (#215)

* Ingest events into DB

* Update tests

* Update tests

* Fix tests

* Ignore ingestion when events are empty

* Add getEvents backed by DB

* Refactor getEvents to call fetch events from db

* Remove in-memory events store

* Make use of cursor in pagination and combine filters

* Update scan function logic in order to test

* Move NewTestDb util

* Trim events db and remove eventTypes in SELECT query as event types are not indexed

* Introduce cursor range and add logs for latency

* remove event memory store

* remove event memory store from latest merge

* Fix lint issues part 1

* Fix lint issues part 2

* Fix more lint and add ledger range code for events

* Update eventHandler mock

* Fix 2 major errors in tests: nil pointer reference and unknown hash

* Fix contract Id filter logic and add cursor set to avoid duplicates

* Validate requested start ledger with stored ledger range

* Fix lint error part 4

* Add migration for events table

* Fix lint error part 5

* Fix lint error part 6

* Address review comments pt1

* Make contract id a blob type

* Remove events package and move cursor.go to db package

* Fix lint error part 6

* Fix lint error part 7

* Optimize migration code

* Introduce endLedger and remove Cursor id from schema

* Use LedgerReader to get Ledger Range in events

* Fix lint errors

* Add benchmark for testing various load parameter

* update benchmark

* Comment benchmark events

* Reduce allocs pt1

* Benchmark with 30 million events

* Refactor getEvents to backed only by Events table

* change test db path

* Use Binary encoding for saving events into DB

* Correct number of topics

* reduce events in benchmark so that tests run

* update events schema

* Fix topic count

* Fix fetch query to not stop if null topic

* Fix lint pt1

* Fix trimEvents and lint errors

* update log info

* Fix more lint errors

* Fix more lint errors pt 11

* Fix format in error

* Fix linter error pt 12

* Add nolint for GetEvents as a temp fix.

* Add events table migration (#262)

* Fix migrations - 1

* Make migrations sequential - 1

* Make migrations sequential - 2

* Fix failing unittest - 1

* Fix linting errors - 1

* Fix failing integration test - 1

* Remove %w from Fatal strings

* refactor migrationApplierFactoryF

* Add ledger seq to fatal error string

* Add comments - 1

* Fix - 1

* Optimise migrations - 1

* Optimise migrations - 2

* Optimise migrations - 3

* Fix linting - 1

* Fix linting - 2

* Remove dupicate latest ledger fetch code

* Rollback db in daemon instead of migration

* Remove unused constant

* Remove unused constant - 2

* Add rollback() statement

* Small change

* Abstract transaction and rollback management inside migration code

* Fix failing unittest

* Address review comments

* Store binary of topics instead of string

* Unify min/max topic count in event.go

* Address review comments for event Types and fix unit tests

* cleanup

* Fix linter errors for one last time

---------

Co-authored-by: Aditya Vyas <[email protected]>

* Fix linter errors pt 1

* Add nolint for InsertEvents

* Add nolint for InsertEvents again

* Address review comments

* Fix lint

* Fix lint pt1

* Fix lint pt2

* Fix index out of bound error and fix logging

* Add changelog entry

---------

Co-authored-by: Aditya Vyas <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Introduce events table and ingest events metadata in DB
3 participants