Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CCIP-1496 Delete expired logs by the block_timestamp #12040

Merged
merged 2 commits into from
Feb 28, 2024

Conversation

mateusz-sekara
Copy link
Collaborator

@mateusz-sekara mateusz-sekara commented Feb 15, 2024

Motivation

Delete expired logs should use block_timestamp instead of created_at.
created_at is generated when inserting a block into the database. These values can be completely different for different nodes because of the

  • different clocks
  • inserting logs to DB at different times due to outage, replay etc
    In this case, block_timestamp is the real value on "when" the block was produced and should be the same for all nodes.

This is a very similar case to what was fixed some time ago for the read layer #10743

Additionally, we can get rid of the created_at index, because after this change filtering by created_at is not used at all. One index less should save some time during insert/delete operations on evm.logs table. The only scenario that might be a problem here is running replay from block older than the defined retention. Prune will start removing logs that are just replayed, this won't be a problem when we keep it filtering by created_at

No big difference between deleting by block_timestamp and created_at. Removing around ~90k logs

-- created_at
 Delete on logs l  (cost=0.56..201494.41 rows=92352 width=12) (actual time=6069.856..6069.857 rows=0 loops=1)
   ->  Nested Loop  (cost=0.56..201494.41 rows=92352 width=12) (actual time=0.067..2268.151 rows=1309224 loops=1)
         ->  Seq Scan on log_poller_filters  (cost=0.00..54.99 rows=277 width=60) (actual time=0.013..0.620 rows=271 loops=1)
               Filter: (evm_chain_id = '11155111'::numeric)
               Rows Removed by Filter: 1021
         ->  Index Scan using idx_evm_logs_ordered_by_block_and_created_at on logs l  (cost=0.56..725.26 rows=196 width=60) (actual time=1.103..7.819 rows=4831 loops=271)
               Index Cond: ((evm_chain_id = '11155111'::numeric) AND (address = log_poller_filters.address) AND (event_sig = log_poller_filters.event) AND (created_at <= (statement_timestamp() - '6415:33:20'::interval)))
 Planning Time: 0.422 ms
 Execution Time: 6069.896 ms
(9 rows)
-- block_timestamp

 Delete on logs l  (cost=0.56..193585.58 rows=92282 width=12) (actual time=5568.214..5568.215 rows=0 loops=1)
   ->  Nested Loop  (cost=0.56..193585.58 rows=92282 width=12) (actual time=0.048..1760.326 rows=1309284 loops=1)
         ->  Seq Scan on log_poller_filters  (cost=0.00..54.99 rows=277 width=60) (actual time=0.013..0.480 rows=271 loops=1)
               Filter: (evm_chain_id = '11155111'::numeric)
               Rows Removed by Filter: 1021
         ->  Index Scan using evm_logs_by_timestamp on logs l  (cost=0.56..696.71 rows=196 width=60) (actual time=0.011..5.941 rows=4831 loops=271)
               Index Cond: ((evm_chain_id = '11155111'::numeric) AND (address = log_poller_filters.address) AND (event_sig = log_poller_filters.event) AND (block_timestamp <= (statement_timestamp() - '6415:33:20'::interval)))
 Planning Time: 0.416 ms
 Execution Time: 5568.253 ms
(9 rows)

Copy link
Contributor

I see that you haven't updated any README files. Would it make sense to do so?

@mateusz-sekara mateusz-sekara changed the title Removing created_at index Delete expired logs by the block_timestamp Feb 15, 2024
@@ -338,7 +338,7 @@ func (o *DbORM) SelectLogsByBlockRange(start, end int64) ([]Log, error) {
WHERE evm_chain_id = :evm_chain_id
AND block_number >= :start_block
AND block_number <= :end_block
ORDER BY (block_number, log_index, created_at)`, args)
ORDER BY (block_number, log_index)`, args)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should not be 2 logs with the same log_index, block_number and evm_chain_id

@@ -1565,3 +1573,55 @@ func Benchmark_LogsDataWordBetween(b *testing.B) {
assert.Len(b, logs, 1)
}
}

func Benchmark_DeleteExpiredLogs(b *testing.B) {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No big diff

-- block_timestamp
Benchmark_DeleteExpiredLogs
Benchmark_DeleteExpiredLogs-12    	      13	  89827359 ns/op
Benchmark_DeleteExpiredLogs-12    	      13	  96382282 ns/op
Benchmark_DeleteExpiredLogs-12    	      12	  99016438 ns/op
Benchmark_DeleteExpiredLogs-12    	      12	  94013479 ns/op
Benchmark_DeleteExpiredLogs-12    	      12	  93778076 ns/op
-- created_at
    pkg: github.com/smartcontractkit/chainlink/v2/core/chains/evm/logpoller
Benchmark_DeleteExpiredLogs
Benchmark_DeleteExpiredLogs-12    	      12	  89441781 ns/op
Benchmark_DeleteExpiredLogs-12    	      12	  91320372 ns/op
Benchmark_DeleteExpiredLogs-12    	      12	  92141535 ns/op
Benchmark_DeleteExpiredLogs-12    	      12	  93036705 ns/op
Benchmark_DeleteExpiredLogs-12    	      12	 108362344 ns/op

@mateusz-sekara mateusz-sekara changed the title Delete expired logs by the block_timestamp CCIP-1496 Delete expired logs by the block_timestamp Feb 23, 2024
@mateusz-sekara mateusz-sekara force-pushed the logpoller-created-at-index branch from 53a5371 to 5c670ed Compare February 23, 2024 07:55
@mateusz-sekara mateusz-sekara marked this pull request as ready for review February 23, 2024 07:55
@mateusz-sekara mateusz-sekara requested review from a team as code owners February 23, 2024 07:55
@@ -0,0 +1,5 @@
-- +goose Up
drop index evm.evm_logs_idx_created_at;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This index shouldn't be used anywhere, because there are no more queries that rely on that field. I've also checked index utilization for CCIP to double check

 evm.evm_logs_by_timestamp                        |        21272200 |  3524320643 |     3522526611
 evm.logs_pkey                                    |        12636814 |    11596801 |       11596715
 evm.idx_evm_logs_ordered_by_block_and_created_at |         4240547 | 20127987563 |    15762720980
 evm.evm_logs_idx                                 |         7208398 |    24196816 |       22603336
 evm.evm_logs_idx_data_word_one                   |               0 |           0 |              0
 evm.evm_logs_idx_data_word_three                 |           27315 |   390229380 |              0
 evm.evm_logs_idx_data_word_two                   |               0 |           0 |              0
 evm.evm_logs_idx_topic_four                      |               0 |           0 |              0
 evm.evm_logs_idx_topic_three                     |               0 |           0 |              0
 evm.evm_logs_idx_topic_two                       |          311176 |   104034706 |          17779
 evm.evm_logs_idx_created_at                      |               0 |           0 |              0
 evm.evm_logs_idx_tx_hash                         |             745 |         932 |            922
 evm.evm_logs_idx_data_word_four                  |               0 |           0 |              0

Copy link
Contributor

@reductionista reductionista Feb 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SonarQube is complaining about the lack of capitalization here, will be cleaner to look at for the future if we address it:

--- drop index evm.evm_logs_idx_created_at;
+++ DROP INDEX evm.evm_logs_idx_created_at;

Post rebase fix
Added debug log
@reductionista reductionista added this pull request to the merge queue Feb 27, 2024
@reductionista reductionista removed this pull request from the merge queue due to a manual request Feb 27, 2024
@mateusz-sekara mateusz-sekara added this pull request to the merge queue Feb 28, 2024
Merged via the queue into develop with commit aa22ad5 Feb 28, 2024
97 checks passed
@mateusz-sekara mateusz-sekara deleted the logpoller-created-at-index branch February 28, 2024 08:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants