Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upkeep States: optimize calls to DB to scan for workIDs #11496

Conversation

amirylm
Copy link
Collaborator

@amirylm amirylm commented Dec 5, 2023

AUTO-7399

Description

As part of load testing, it was found that while scanning the DB for state of workIDs we hit the DB too much which is causing pressure and multiple failures, which leads to a worst case where we lookup a massive amount of workIDs multiple times.

Changes

  • Added rate limiting for workIDs, where we avoid checking a single workID more than a rate of 3 times / 1 min
  • specialized implementation of token buckets with low (as possible) memory footprint

Testing

  • unit tests for ensuring functionality of the token buckets

Copy link
Contributor

github-actions bot commented Dec 5, 2023

I see that you haven't updated any README files. Would it make sense to do so?

@amirylm amirylm changed the title Upkeep States: optimize calls to db to scan for workIDs Upkeep States: optimize calls to DB to scan for workIDs Dec 5, 2023
@ferglor
Copy link
Collaborator

ferglor commented Dec 5, 2023

Not really coding related, but what do the DB queries that we're executing look like? Do we have indexes created against workID, or whatever we're querying by? Or would that be too expensive in terms of storage?

@amirylm
Copy link
Collaborator Author

amirylm commented Dec 5, 2023

@ferglor There are 2 tables that gets queried:

  1. evm logs (log poller) where we call poller.IndexedLogs of DedupKeyAdded event with the given workIDs (should be indexed). called from scanner.go
  2. evm upkeep states where we query by workID (indexed). called from orm.go

@ferglor
Copy link
Collaborator

ferglor commented Dec 5, 2023

@amirylm looking at the migrations, I can only see two indexes for evm.logs, one for tx_hash, and then another for evm_chain_id, block_timestamp. SelectIndexedLogs seems to query based on evm_chain_id, address, event_sig, topics, and block_number. So it looks like we could improve query performance but I'm not sure at what cost/what the process is for introducing new indexes like this 🤷

Copy link
Contributor

@infiloop2 infiloop2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

have we dug into the reasons for trying to fetch workID state so frequently. Can the caller be optimised?

Also, what's the effect of these rate limits on call sites?

@cl-sonarqube-production
Copy link

@reductionista
Copy link
Contributor

@amirylm looking at the migrations, I can only see two indexes for evm.logs, one for tx_hash, and then another for evm_chain_id, block_timestamp. SelectIndexedLogs seems to query based on evm_chain_id, address, event_sig, topics, and block_number. So it looks like we could improve query performance but I'm not sure at what cost/what the process is for introducing new indexes like this 🤷

We have a lot of indexes on that table, maybe you're not seeing them because it was recently renamed from evm_logs to evm.logs?

 evm    | evm_logs_idx                 | index | dominovaldano | logs
 evm    | evm_logs_idx_created_at      | index | dominovaldano | logs
 evm    | evm_logs_idx_data_word_four  | index | dominovaldano | logs
 evm    | evm_logs_idx_data_word_one   | index | dominovaldano | logs
 evm    | evm_logs_idx_data_word_three | index | dominovaldano | logs
 evm    | evm_logs_idx_data_word_two   | index | dominovaldano | logs
 evm    | evm_logs_idx_topic_four      | index | dominovaldano | logs
 evm    | evm_logs_idx_topic_three     | index | dominovaldano | logs
 evm    | evm_logs_idx_topic_two       | index | dominovaldano | logs
 evm    | evm_logs_idx_tx_hash         | index | dominovaldano | logs 
 evm    | idx_evm_log_poller_blocks_order_by_block     | index | dominovaldano | log_poller_blocks
 evm    | idx_evm_logs_ordered_by_block_and_created_at | index | dominovaldano | logs

Copy link
Contributor

This PR is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 7 days.

@github-actions github-actions bot added the Stale label Feb 11, 2024
@github-actions github-actions bot closed this Feb 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants