Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Spike: Create a new raw table for all ledger entry changes #309

Open
chowbao opened this issue Jan 31, 2025 · 6 comments
Open

Spike: Create a new raw table for all ledger entry changes #309

chowbao opened this issue Jan 31, 2025 · 6 comments
Assignees

Comments

@chowbao
Copy link
Contributor

chowbao commented Jan 31, 2025

Currently we separate all ledger entry changes into their own separate tables (trustlines, liquiditypools, contract_data, etc...). Does it makes sense for us to have an all combined ledger entry changes table similar to how history_operations combines all operation types and stores data in a details JSON.

This would have some advantages that would be worth looking into

  • Our state/change tables currently use the change compactor which means we don't a full view of all changes in any of those tables. It is compacted to change per ledger. Creating this new table would allow us to use the normal changeReader instead of the compactor to get every ledger change entry
  • Does having all changes in a single table make it easier to get things like balance changes? This would be beneficial in any cases where you'd need to join between different state tables
  • Is it easier to interact with changes like how we interact with operations? Is it better to have changes with a details JSON instead of the separate tables?

all_changes_table

  • -- high level information
  • op_id
  • transaction_id/hash
  • ledger_sequence
  • closed_at
  • change_type
  • ledger_entry_type
  • details -- JSON with all unique state table columns for each ledger entry type (contract_data, trustlines, etc...)
    • account
    • balance
    • to
    • from
    • asset
    • etc...
@chowbao chowbao self-assigned this Jan 31, 2025
@chowbao chowbao added this to the Data Sprint 56 milestone Jan 31, 2025
@amishas157
Copy link
Contributor

all_changes_table
-- high level information
op_id
transaction_id/hash
ledger_sequence
closed_at
change_type
ledger_entry_type
details -- JSON with all unique state table columns for each ledger entry type (contract_data, trustlines, etc...)
account
balance
to
from
asset
etc...

Wonder would it be more helpful to store most raw information , example: pre and post state? And then we can derive individual tables from that?

@chowbao
Copy link
Contributor Author

chowbao commented Jan 31, 2025

Wonder would it be more helpful to store most raw information , example: pre and post state? And then we can derive individual tables from that?

I think it would be the most raw information. Technically the pre would just be the previous row in the table right?

@amishas157
Copy link
Contributor

Technically the pre would just be the previous row in the table right?

Not sure. Let's say there happens an operations where:
Account A sends payment to account B

Ledger change entries:

  1. pre: account_a_balance: 100 , post: account_a_balance: 80
  2. pre: account_b_balance: 60, post: account_b_balance: 80

What I am understanding is that all_changes_table will capture:
Record 1: ledger m, transaction n, operation o, account: a_balance: 100
Record 2: ledger m+x, transaction n+y, operation o+z, account a_balance: 80

It is not necessary that Record 1 and record 2 are continuous since there might be more ledger entry changes in between.

I think what I am proposing is changelog table, whereas your current proposal will create historical table(each state captured across different rows). My thought is that with changelog table, it could be easier to see one level up history and not lose any information by parsing records in this table. With historical table, you would need to do joins for seeing any history

@chowbao
Copy link
Contributor Author

chowbao commented Jan 31, 2025

Ah okay yeah I think we are trying to get the same thing but wording it differently.

Record 1: ledger m, transaction n, operation o, account: a_balance: 100
Record 2: ledger m+x, transaction n+y, operation o+z, account a_balance: 80

This IS what the table would probably look like. And yes the record 1 and 2 don't have to be continuous

Edit: Oh I misread your comment. I guess I'm not sure what the difference between the changelog table and historical table would be

@amishas157
Copy link
Contributor

So my suggestion is that we store information as following:

Record 1: ledger m, transaction n, operation o, pre account: a_balance: 200, post account: a_balance: 100,
Record 2: ledger m+x, transaction n+y, operation o+z, pre account: a_balance: 100, post: account_a_balance: 80

@chowbao
Copy link
Contributor Author

chowbao commented Jan 31, 2025

Ah I see. Yeah that would work too

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants