
Gateway caching proposal #1773

Merged: 2 commits merged into main from ziga/gateway_caching_proposal on Feb 16, 2024
Conversation

@zkokelj (Contributor) commented Feb 2, 2024

Why this change is needed

To discuss if and how to best implement caching on Ten Gateway.

https://github.com/ten-protocol/ten-internal/issues/2851

What changes were made as part of this PR

Please provide a high-level list of the changes made

PR checks pre-merging

Please indicate below by ticking the checkbox that you have read and performed the required
PR checks

  • PR checks reviewed and performed

@coderabbitai bot commented Feb 2, 2024

Auto Review Skipped: draft detected. Check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.


@BedrockSquirrel (Collaborator):

Nice, it'd be interesting to think about which of these we'd actually see any hits on. Distinguishing between encrypted (sensitive) and non-encrypted methods might be interesting too: caching getBlockByHash is probably more useful than caching getTransactionByHash, because the latter is only useful to one user.

@tudor-malene (Collaborator) left a comment

I think all calls are cache-able; the TTL differs.

Also, it would be good to touch on eth_call: for example, what would the key for caching be?
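A minimal sketch of the "everything is cache-able, only the TTL differs" idea in Go; the method names are standard RPC methods, but the package name and durations are illustrative assumptions, not proposed values:

```go
package cache // hypothetical package name

import "time"

// Illustrative per-method TTLs: every call is cache-able, only the TTL differs.
var methodTTL = map[string]time.Duration{
	"eth_chainId":          time.Hour,       // effectively immutable
	"eth_getBlockByHash":   time.Hour,       // content-addressed, safe to keep for long
	"eth_getBlockByNumber": 5 * time.Second, // "latest" can move with every batch
	"eth_call":             time.Second,     // short TTL when the block param is "latest"
}
```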


## 1. Why cache requests on Gateway?

Currently, all `eth_` requests that hit the gateway are forwarded to our Sequencer.
@tudor-malene (Collaborator) commented on the diff:

The sequencer is temporary. Let's say they go to the Ten nodes and are executed by the Ten enclave.

- `eth_sign`
- `eth_signTransaction`

Non-cacheable request methods are:
@tudor-malene (Collaborator) commented on the diff:

They are also cache-able, but with a TTL until the next batch.

@zkokelj (Contributor, Author):

Fixed. What do you think about eth_sendRawTransaction? We could probably also cache it with a TTL until the next batch (assuming the key is the whole transaction, so submitting the same tx twice within a short time period should give the same response).
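A sketch of that keying scheme in Go, hashing the raw transaction bytes; the function name and the choice of hash are illustrative assumptions:

```go
package cache

import (
	"crypto/sha256"
	"encoding/hex"
)

// sendRawTxCacheKey (hypothetical) keys eth_sendRawTransaction responses on a
// hash of the raw transaction bytes, so a duplicate submission within one
// batch period can be answered from the cache.
func sendRawTxCacheKey(rawTx []byte) string {
	sum := sha256.Sum256(rawTx) // keccak256 would match Ethereum convention; sha256 keeps the sketch stdlib-only
	return "eth_sendRawTransaction:" + hex.EncodeToString(sum[:])
}
```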

@badgersrus (Contributor) left a comment

Given we're expecting all user traffic to come through this gateway, we might want to consider a standalone cache for testnet/mainnet. This would reduce the memory overhead at the cost of a bit of additional infrastructure complexity. We can add specific rate limiting for the non-cacheable methods to prevent a targeted DDoS via those.

If this is something we want to consider, we would define a cache interface with an in-memory implementation for dev and a Redis/memcached implementation for production, selected via the application configuration.
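A possible shape for that interface, sketched in Go with entirely hypothetical names; the in-memory implementation is a bare-bones dev version, and a Redis-backed type would satisfy the same interface for production:

```go
package cache

import (
	"sync"
	"time"
)

// Cache is what the gateway would code against; the backing store
// (in-memory for dev, Redis/memcached for production) is chosen in config.
type Cache interface {
	Get(key string) ([]byte, bool)
	Set(key string, value []byte, ttl time.Duration)
}

type entry struct {
	value   []byte
	expires time.Time
}

// inMemoryCache is a minimal dev implementation with lazy expiry.
type inMemoryCache struct {
	mu    sync.RWMutex
	items map[string]entry
}

func NewInMemoryCache() *inMemoryCache {
	return &inMemoryCache{items: make(map[string]entry)}
}

func (c *inMemoryCache) Get(key string) ([]byte, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	e, ok := c.items[key]
	if !ok || time.Now().After(e.expires) {
		return nil, false // missing or expired
	}
	return e.value, true
}

func (c *inMemoryCache) Set(key string, value []byte, ttl time.Duration) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.items[key] = entry{value: value, expires: time.Now().Add(ttl)}
}
```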

@BedrockSquirrel (Collaborator) commented Feb 2, 2024

It would be nice to have some indication of the value of a cache before investing too much time and resources. Intuitively, it feels like we're going to have a tiny cache hit ratio overall (I'd guess < 1%).

Maybe I'm wrong, or it might be that even the savings from occasional cache hits for certain methods make it worthwhile. But I'd be interested to hear any hunches, or to see some back-of-an-envelope maths or finger-in-the-air stats, to motivate a cache.

Feels a bit like premature optimisation to me.

@badgersrus (Contributor):

> It would be nice to have some indication of the value of a cache before investing too much time and resources. [...] Feels a bit like premature optimisation to me.

Are there any Prometheus/Grafana dashboards we can use to track the requests being made? That way we could quantify the percentage of hits and extrapolate to anticipated traffic.

@anixon604 (Contributor):

> Are there any Prometheus/Grafana dashboards we can use to track the requests being made? That way we could quantify the percentage of hits and extrapolate to anticipated traffic.

I think it might be viable to just try an off-the-shelf caching package before building a custom one, to see what sane defaults accomplish and to get quick functionality.

This one, https://github.com/dgraph-io/ristretto, even exports metrics, it seems. So it can just bolt on, run for a bit, and we can see how the metrics play out.

cc @zkokelj @badgersrus
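A sketch of bolting ristretto on with metrics enabled, following the configuration pattern from its README; the cache sizes, sample key, and TTL here are placeholder assumptions:

```go
package main

import (
	"fmt"
	"time"

	"github.com/dgraph-io/ristretto"
)

func main() {
	cache, err := ristretto.NewCache(&ristretto.Config{
		NumCounters: 1e7,     // number of keys to track frequency for
		MaxCost:     1 << 28, // cap the cache at ~256 MB of response bytes
		BufferItems: 64,      // recommended default
		Metrics:     true,    // collect hit/miss counters
	})
	if err != nil {
		panic(err)
	}

	// Placeholder entry: an RPC response cached for roughly one batch period.
	cache.SetWithTTL("eth_getBlockByNumber:0x10", []byte(`{"number":"0x10"}`), 1, 5*time.Second)
	cache.Wait() // writes are applied asynchronously; flush before reading here

	if v, ok := cache.Get("eth_getBlockByNumber:0x10"); ok {
		fmt.Printf("cached: %s\n", v)
	}
	fmt.Printf("hit ratio so far: %.2f\n", cache.Metrics.Ratio())
}
```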

@zkokelj (Contributor, Author) commented Feb 5, 2024

Here are the stats from the Sepolia Ten Gateway over the past ~6 days:

eth_accounts: 0
eth_chainID: 105110
eth_coinbase: 0
eth_getBlockByHash: 17241
eth_getBlockByNumber: 500670
eth_getBlockRecipients: 0
eth_getBlockTransactionCountByHash: 0
eth_getBlockTransactionCountByNumber: 0
eth_getCode: 22086
eth_getStorageAt: 49096
eth_getTransactionByBlockHashAndIndex: 0
eth_getTransactionByBlockNumberAndIndex: 0
eth_getTransactionByHash: 2913
eth_getTransactionReceipt: 454676
eth_getUncleCountByBlockHash: 0
eth_getUncleCountByBlockNumber: 0
eth_maxPriorityFeePerGas: 0
eth_sign: 0
eth_signTransaction: 0
eth_getBalance: 561268
eth_blockNumber: 1760227
eth_call: 402467
eth_createAccessList: 0
eth_estimateGas: 24998
eth_feeHistory: 3070
eth_getProof: 0
eth_gasPrice: 280329
eth_getFilterChanges: 0
eth_getFilterLogs: 0
eth_getTransactionCount: 114349
eth_newBlockFilter: 0
eth_newFilter: 0
eth_newPendingTransactionFilter: 0
eth_sendRawTransaction: 88151
eth_sendTransaction: 0
eth_syncing: 0
eth_uninstallFilter: 0

A few methods get called a lot, while more than half of the methods were never called in the past week. I think it's worth implementing a simple cache for those hot methods. eth_getBlockByNumber looks like the method that caches best (and for a longer time), and some of the others can use a shorter TTL.

@zkokelj (Contributor, Author) commented Feb 5, 2024

> I think all calls are cache-able; the TTL differs. Also, it would be good to touch on eth_call: for example, what would the key for caching be?

I looked into eth_call: for the key we can use a combination of from, to, data, and block. Because block can also be latest, we should not use a very long TTL for that cache.
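A sketch of building that key in Go; the function name and separator scheme are illustrative assumptions:

```go
package cache

import (
	"encoding/hex"
	"strings"
)

// ethCallCacheKey (hypothetical) combines from, to, data, and block into one
// key. Entries keyed on "latest" need a short TTL, since the underlying state
// can change with every batch.
func ethCallCacheKey(from, to string, data []byte, block string) string {
	return strings.Join(
		[]string{"eth_call", from, to, hex.EncodeToString(data), block},
		"|", // separator avoids ambiguity between adjacent fields
	)
}
```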

@BedrockSquirrel (Collaborator):

Thanks @zkokelj, those numbers are really useful; it's much easier to reason about this with some data.

Caching those more public values seems worthwhile for sure, but it's worth noting that the methods we're likely to cache only hit the host and not the enclave, so this might not help much with the main bottlenecks.

@otherview (Contributor):

Good stuff, and an important proposal.

  • I believe that caching is different for the methods below, as they can use latest, earliest, pending, or a block number (see the sketch after this list). The following methods have an extra default block parameter:
    • eth_getBalance
    • eth_getCode
    • eth_getTransactionCount
    • eth_getStorageAt
    • eth_call

  • It's likely that caching can be done by a different app rather than in memory. The typical eth response is a few kB; a first iteration done in memory is good, but it might fill up fast. It's also likely to take some effort to scale easily, or to be as resilient, compared with a solution like Redis.
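One way to handle that default block parameter, sketched in Go: resolve the symbolic values to a concrete block number before building the cache key, so entries naturally stop matching once a new batch arrives. The function and its parameters are hypothetical:

```go
package cache

import "fmt"

// normalizeBlockParam (hypothetical) maps "latest"/"pending" to the current
// head block number so cache keys are pinned to concrete state.
func normalizeBlockParam(param string, head uint64) string {
	switch param {
	case "latest", "pending":
		return fmt.Sprintf("0x%x", head)
	case "earliest":
		return "0x0"
	default:
		return param // already a concrete hex block number
	}
}
```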

@moraygrieve (Contributor) left a comment

LGTM

@zkokelj zkokelj marked this pull request as ready for review February 16, 2024 10:31
@zkokelj zkokelj merged commit a2a080a into main Feb 16, 2024
1 check passed
@zkokelj zkokelj deleted the ziga/gateway_caching_proposal branch February 16, 2024 10:36