Create an initial proposal, or set of options, for what we want to provide.
Complete a research pass of other chains.
What is a reasonable thing to provide? At one extreme: nothing. At the other: immediate API responses to any question of interest.
Pursue the idea of storing TxMeta in raw form (could we use an unstructured DB?)
Prototype a data lake and a Dataflow pipeline that moves data into the data warehouse. Can we build a framework that makes our prototype repeatable? Make it config-based.
Consumption/usage model - we want to make this easy for people in the community to run on their own (sub-question: based on usage, what would the pricing model look like?)
Research dependencies: our solution would want stellar-core as a dependency, but potentially no other code (Horizon not needed?)
Come up with a list of modules/frameworks/tools that SDF or the community could contribute back to open source
Pub/Sub model? What would guarantees look like?
Data validation - what does this look like? This is the more difficult part of the problem. Follow up with Graydon about the idea of downloading hashes and comparing them with stellar-core. Come up with options for a solution; it does not have to be completely solved for the prototype.
File format? JSON, Avro. Partitioning method?
File size tradeoffs? Write small files frequently, or wait and write larger files?
Processing times for querying and running Dataflow jobs? Experiment: answer one point-in-time question and one analysis/aggregation question.
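To make the file format, partitioning, and file size questions concrete, here is a minimal sketch of one possible layout. Everything in it is an assumption for discussion, not a decision: the `ledgers/` prefix, the two constants, the newline-delimited JSON encoding, and the function names are all hypothetical.

```python
import json
from typing import Iterable, Iterator, List

# Assumptions for illustration only; both values are tunable knobs.
LEDGERS_PER_FILE = 64          # smaller = fresher data, more small files
LEDGERS_PER_PARTITION = 6400   # ledgers grouped under one directory prefix


def object_path(first_ledger: int, last_ledger: int, fmt: str = "json") -> str:
    """Build a hypothetical lake path for a batch covering first..last ledger."""
    start = (first_ledger // LEDGERS_PER_PARTITION) * LEDGERS_PER_PARTITION
    end = start + LEDGERS_PER_PARTITION - 1
    return f"ledgers/{start}-{end}/{first_ledger}-{last_ledger}.{fmt}"


def batch_ledgers(records: Iterable[dict], size: int = LEDGERS_PER_FILE) -> Iterator[List[dict]]:
    """Group records into batches of `size`; the final batch may be shorter."""
    batch: List[dict] = []
    for record in records:
        batch.append(record)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def serialize_batch(batch: List[dict]) -> str:
    """Encode a batch as newline-delimited JSON, one record per line."""
    return "\n".join(json.dumps(record, sort_keys=True) for record in batch)
```

Raising `LEDGERS_PER_FILE` trades freshness for fewer, larger objects, which is the tradeoff the file size question asks about. Swapping `serialize_batch` for an Avro writer would leave the path scheme unchanged.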
Timeline:
Week 1:
Finish research on other chains - Tamir to research Polygon; Syd to research Polkadot, Filecoin
Proposal written for design of the prototype - asynchronously work together on doc
Include file details (saving in XDR? JSON? size of files saved?)
Week 2:
Build a prototype for storing raw data in lake
One pipeline
Figure out point-in-time question (What are all the transactions that occurred at a given sequence number?)
Week 3:
Figure out aggregation question (Fee stats, Trading volume for USDC)
Define assumptions, pros/cons, and risks of the prototype as it stands today
Recommendation for how to proceed
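The two experiment questions in the timeline could take roughly the following shape once records are out of the lake. This is a sketch under stated assumptions: `TxRecord` and its fields are hypothetical, "sequence number" is read as the ledger sequence, and the data sits in memory rather than in the warehouse.

```python
from dataclasses import dataclass
from statistics import median
from typing import Dict, List, Optional


@dataclass(frozen=True)
class TxRecord:
    """Hypothetical flattened transaction row; field names are assumptions."""
    ledger_sequence: int
    tx_hash: str
    fee_charged: int  # assumed to be in stroops


def transactions_at(records: List[TxRecord], ledger_sequence: int) -> List[TxRecord]:
    """Point-in-time question: all transactions in one ledger."""
    return [r for r in records if r.ledger_sequence == ledger_sequence]


def fee_stats(records: List[TxRecord]) -> Dict[str, Optional[float]]:
    """Aggregation question: simple fee statistics over a set of transactions."""
    fees = [r.fee_charged for r in records]
    if not fees:
        return {"count": 0, "min": None, "max": None, "median": None, "total": 0}
    return {
        "count": len(fees),
        "min": min(fees),
        "max": max(fees),
        "median": median(fees),
        "total": sum(fees),
    }
```

The point-in-time query touches a single partition if files are keyed by ledger range, while the aggregation scans many, so the two are likely to stress the storage layout differently.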
Out of Scope, but cool
Write a library like blockchain-etl that is available in commonly used ETL languages
Research Doc for Off Chain Analysis