[Feature] Effect functions #810
Comments
Why can't you just call the effect on the indexer function? Would there be problems with block reorgs? |
A few reasons:
With these requirements in mind, it becomes pretty clear that we need a job queue pattern for side effects (with some EVM/Ponder sauce to make it really intuitive). |
@typedarray, I'm keen to support the Ponder team in getting the Effects feature built 🙃 I appreciate all the thinking going into making things work correctly. The nature of blockchain networks requires a lot of intellectual gymnastics to get the right outcomes. Running effects when the current state can soon be invalidated is a challenge: a challenge of applying probabilities.

Use case

Just the other day, I proposed this change:
In my use case, I need my web backend service to calculate some rewards for users depending on their on-chain activity and staked token amounts.

Context

Ponder indexing

I understand that indexing functions are currently triggered for unfinalized blocks. This means that some of the invocations are going to be discarded if a blockchain re-org is detected. A re-org means that the hash of the last locally processed block is not equal to the parent hash of a just-fetched remote block. Re-orgs defined this way are driven by an RPC that cannot keep up with the fastest RPCs on the network and, when asked for blocks, returns stale data.

RPC quality vs re-orgs count

For many months, I've been maintaining four Ponder indexers across multiple mainnets (Ethereum, Arbitrum). Each of them prints the following metric about reorgs:
I read this metrics output as saying that no reorgs have ever happened. Is that a correct thing to say? I was also running indexing on a couple of EVM testnets, and one of them gave me re-orgs for almost every block fetched. I switched to another RPC and the re-orgs were gone.

Re-orgs

vs indexed data quality

From my observations, reorgs happen more often on low-quality RPCs, and might not happen for a very long time when using quality RPCs. That's a challenge while indexing the data, as the data source might keep changing. Ponder does not support effects because the quality of the data source (the latest fetched block is always ahead of the latest processed block) cannot be guaranteed. However, the Ponder data API (GQL/REST API) allows fetching data about the current state of the indexed blockchain view, which might also be incorrect at a given moment. There's a very small probability that some client would ask the Ponder data API for data that later gets changed by a re-org.

vs effects

The effects are only useful if there are clients subscribing to them. So either the client or the indexer needs to work around the possibility of an effect being invalidated. In short, someone needs to keep track of the confirmed effects. We cannot say for certain that the next fetched block is not going to trigger a re-org. But we can describe how likely that is, for example by counting how many re-orgs actually happened out of the total number of realtime-synced blocks so far. For quality data sources (fast RPCs), the re-org likelihood should simply be zero. Sometimes it's going to be slightly above zero. And sometimes it's going to be much greater. Using this predictive metric, we could delay fetching blocks to give the RPC a chance to catch up. We could also try to delay triggering the effect function for a certain range of blocks, depending on that predictive metric. |
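The parent-hash check and the "re-orgs per realtime-synced block" metric described in the comment above could be sketched roughly like this. This is only an illustration, not Ponder's implementation: `BlockHeader`, `isReorg`, and `ReorgStats` are hypothetical names, and the check assumes the fetched block is the direct successor of the last processed one.

```typescript
// Hypothetical minimal block shape: only the fields the check needs.
interface BlockHeader {
  number: number;
  hash: string;
  parentHash: string;
}

// A fetched block extends the local chain only if its parentHash matches
// the hash of the last locally processed block. Assumes `fetched` is the
// direct successor of `lastProcessed`.
function isReorg(lastProcessed: BlockHeader, fetched: BlockHeader): boolean {
  return fetched.parentHash !== lastProcessed.hash;
}

// Running counters behind the "observed re-org rate" idea.
class ReorgStats {
  private blocksSeen = 0;
  private reorgsSeen = 0;

  record(lastProcessed: BlockHeader, fetched: BlockHeader): void {
    this.blocksSeen += 1;
    if (isReorg(lastProcessed, fetched)) {
      this.reorgsSeen += 1;
    }
  }

  // Fraction of realtime-synced blocks that triggered a re-org
  // (0 when no block has been recorded yet).
  rate(): number {
    return this.blocksSeen === 0 ? 0 : this.reorgsSeen / this.blocksSeen;
  }
}
```

A rate that stays at zero over a long window would match the "quality RPC" case from the comment, while a rate close to one would match the testnet RPC that re-orged on almost every block.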
On the other hand, we could also make Ponder only fetch blocks with block tags ( From what I see, the Ponder syncing engine allows fetching a block either by a number or by the For example, we could keep track of the last safe/finalized block number, and only trigger effects for earlier blocks. Any thoughts on that, @typedarray? |
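A rough sketch of the "track the last safe/finalized block and only trigger effects up to it" idea. It relies on the standard Ethereum JSON-RPC method `eth_getBlockByNumber`, which accepts the `safe` and `finalized` block tags. The helper names (`blockByTagRequest`, `parseBlockNumber`, `canRunEffect`) are made up for illustration and are not part of Ponder; sending the request to an RPC is left out on purpose.

```typescript
type BlockTag = "latest" | "safe" | "finalized";

// Builds a standard JSON-RPC request body for the block behind a tag.
function blockByTagRequest(tag: BlockTag, id: number = 1) {
  return {
    jsonrpc: "2.0",
    id: id,
    method: "eth_getBlockByNumber",
    params: [tag, false], // false = return transaction hashes only
  };
}

// JSON-RPC encodes quantities as hex strings, e.g. "0x10".
function parseBlockNumber(hexQuantity: string): number {
  return parseInt(hexQuantity, 16);
}

// An effect may run once its event's block is at or below the last
// safe/finalized block. Assumption: the finalized block itself counts;
// use `<` instead to allow strictly earlier blocks only.
function canRunEffect(eventBlockNumber: number, finalizedBlockNumber: number): boolean {
  return eventBlockNumber <= finalizedBlockNumber;
}
```

The trade-off is latency: effects wait until the chain marks the block as safe/finalized, but they never need to be rolled back.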
Addressing the performance issue: perhaps running side effects from inside a worker thread would be a good choice. It won't block the indexing, and it will also allow queuing the side-effect callbacks within the same thread. Have a look at: https://piscinajs.github.io/piscina/ |
I created a module handling the side effects with re-orgs in mind. Here's the demo:

The module requires two block data providers: one fetching the latest block, and one fetching the finalized block. Ponder indexing functions run each time the latest block arrives. At that point we don't yet know whether the block is final. But we can register effects for the given event (the event id being: block number, block hash, log index). The effect callback functions might or might not be called. If a re-org occurs, the effects queue is rebuilt without the effects that were registered for the just-discarded blocks (block number, block hash).

Finally, the second block data provider kicks in: the safe/finalized block provider. It calls the effects queue and processes all remaining effects for the blocks before the current safe/finalized block.

Demo: https://stackblitz.com/edit/evm-event-side-effects?file=lib%2Fevm-side-effects%2Fqueue.spec.ts |
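For readers who don't open the demo, the queue described above might look roughly like the following. This is a simplified sketch written from the description in the comment, not the code from the linked StackBlitz project: `EventId`, `EffectsQueue`, and the method names are hypothetical, and effects are synchronous to keep the example short.

```typescript
// Event id as described in the comment: block number, block hash, log index.
interface EventId {
  blockNumber: number;
  blockHash: string;
  logIndex: number;
}

interface QueuedEffect {
  id: EventId;
  run: () => void;
}

class EffectsQueue {
  private pending: QueuedEffect[] = [];

  // Called from an indexing function when the latest block arrives.
  register(id: EventId, run: () => void): void {
    this.pending.push({ id: id, run: run });
  }

  // Re-org: rebuild the queue without effects of the discarded block.
  discardBlock(blockNumber: number, blockHash: string): void {
    this.pending = this.pending.filter(
      (e) => !(e.id.blockNumber === blockNumber && e.id.blockHash === blockHash)
    );
  }

  // Called when the safe/finalized provider reports a new block. Runs, in
  // (blockNumber, logIndex) order, every effect at or below that block.
  // Assumption: the finalized block itself is eligible.
  processUpTo(finalizedBlockNumber: number): number {
    const ready = this.pending
      .filter((e) => e.id.blockNumber <= finalizedBlockNumber)
      .sort(
        (a, b) =>
          a.id.blockNumber - b.id.blockNumber || a.id.logIndex - b.id.logIndex
      );
    this.pending = this.pending.filter(
      (e) => e.id.blockNumber > finalizedBlockNumber
    );
    for (let i = 0; i < ready.length; i++) {
      ready[i].run();
    }
    return ready.length;
  }

  size(): number {
    return this.pending.length;
  }
}
```

Effects registered for a block that gets re-orged out are dropped before they ever run, which is why no "undo" step is needed in this design.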
Cool design, I can see this working for some side effect use cases.

But what if the user suspects that reorgs are very rare, and latency is critical? They would rather run the side effect as soon as possible, tolerating the ~1 in a million case where the event ends up getting reorg'd out. Or maybe they'd like to register an "undo" function which runs in that 1 in a million scenario.

What if the effect functions must run in order, similar to indexing functions? Should we allow this as an option, or not offer any ordering guarantees?

What should the coding experience look like, particularly when running the dev server?

Effect functions will probably duplicate a lot of the logic/work of indexing functions. Can you kick off a side effect job from within an indexing function? Or only register them at the top level as "siblings" to indexing functions, e.g.

These are the types of design questions that we need to answer. Honestly, the implementation will probably be easy once we have clear answers to these (and more). |
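To make the "run immediately, undo on re-org" option from the comment above concrete, here is one possible shape. It is purely a design sketch and not a proposed Ponder API: `EagerEffect`, `EagerEffects`, `trigger`, `onReorg`, and `onFinalized` are all invented names.

```typescript
// An effect that runs right away and may carry an optional compensating action.
interface EagerEffect {
  blockNumber: number;
  blockHash: string;
  run: () => void;
  undo?: () => void;
}

class EagerEffects {
  private applied: EagerEffect[] = [];

  // Low-latency path: run now, remember the effect until its block is final.
  trigger(effect: EagerEffect): void {
    effect.run();
    this.applied.push(effect);
  }

  // Re-org: call `undo` (when provided) for every effect of the discarded
  // block, most recent first, and forget those effects.
  onReorg(blockNumber: number, blockHash: string): void {
    const kept: EagerEffect[] = [];
    const reverted: EagerEffect[] = [];
    for (let i = 0; i < this.applied.length; i++) {
      const e = this.applied[i];
      if (e.blockNumber === blockNumber && e.blockHash === blockHash) {
        reverted.push(e);
      } else {
        kept.push(e);
      }
    }
    this.applied = kept;
    for (let i = reverted.length - 1; i >= 0; i--) {
      const undo = reverted[i].undo;
      if (undo) {
        undo();
      }
    }
  }

  // Finalized blocks can no longer be re-orged, so their effects are dropped
  // from the undo bookkeeping.
  onFinalized(finalizedBlockNumber: number): void {
    this.applied = this.applied.filter(
      (e) => e.blockNumber > finalizedBlockNumber
    );
  }

  pendingCount(): number {
    return this.applied.length;
  }
}
```

Compared with a finalized-only queue, this trades correctness guarantees for latency: the effect is visible immediately, and the rare re-org is handled by compensation rather than prevention.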
Ok, so design wise, I'd go with:
|
Hey @typedarray, I believe the Ponder team has been working on this issue recently. Are there any updates on side-effects management? |
@tk-o no near-term update. We're still committed to solving the core indexing use case before expanding to effects. At the moment, this means improving the client/query story, direct SQL, working with offchain data, and performance for huge apps. |
Fair enough, sir. I will definitely keep my own solution running, but I'll also think about changing it to take the questions you've listed into account, with solutions in place as the answers. |
We should support an alternative kind of function that uses a similar registration / context API, but supports side effects.