Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf(Postgres): Remove unused indexes #10590

Open
wants to merge 3 commits into
base: main
Choose a base branch
from
Open

Conversation

Betree
Copy link
Member

@Betree Betree commented Dec 30, 2024

Considering that indexes add overhead to SQL write queries and can make the query planner slower, this migration drops indexes that are seemingly no longer used in the codebase. They were identified by running the following
query in the database:

SELECT
    relname AS table_name,
    indexrelname AS index_name,
    idx_scan AS index_scans,
    pg_size_pretty(pg_relation_size(indexrelid)) AS index_size,
    indexdef
FROM
    pg_stat_user_indexes
JOIN
    pg_indexes ON pg_indexes.indexname = pg_stat_user_indexes.indexrelname
WHERE
    -- Unused indexes
    idx_scan = 0
    -- That are not unique (we want to keep the constraints)
   AND indexdef NOT ILIKE 'CREATE UNIQUE INDEX%'
ORDER BY
    pg_relation_size(indexrelid) DESC;

Methodology:

  1. On a Postgres server that has been running for some time (Heroku doesn't let us check the SELECT stats_reset FROM pg_stat_database WHERE datname = current_database())
  2. Have a tour in the interface to trigger many different queries
  3. Have a tour in metabase dashboard to trigger many different queries
  4. Run the above query to check unused indexes

@Betree Betree self-assigned this Dec 30, 2024
Comment on lines 9 to 25
'transactions__collective_id_createdAt', // Table: Transactions. Size: 351 MB
'transactions__host_collective_id_createdAt', // Table: Transactions. Size: 246 MB
'transactions__hostCollective_clearedAt', // Table: Transactions. Size: 127 MB
'activities__from_collective_id', // Table: Activities. Size: 100 MB. Replaced by activities__from_collective_id_simple, which skips the transactions activities.
'activities__host_collective_id', // Table: Activities. Size: 88 MB. Replaced by activities__host_collective_id_simple, which skips the transactions activities.
'orders__req_ip', // Table: Orders. Size: 21 MB. The fraud detection uses redis, not the DB.
'payment_methods__fingerprint', // Table: PaymentMethods. Size: 16 MB
// 'UploadedFiles_s3_hash', // Table: UploadedFiles. Size: 16 MB. This one is unused because we've disabled Klippa, but we want to keep it in case we re-enable the feature.
'transaction_wise_transfer_id', // Table: Transactions. Size: 8144 kB
'expense_tag_stats_tag', // View: ExpenseTagStats. Size: 6864 kB. We're seemingly never fetching BY tag.
// 'uploaded_files__created_by_user_id', // Table: UploadedFiles. Size: 5792 kB. For Klippa rate-limiting, keeping in case we re-enable an AI feature.
'expenses_data_stripe_virtual_card_transaction_id', // Table: Expenses. Size: 1104 kB
'expenses_data_paypal_transaction_id', // Table: Expenses. Size: 1048 kB
'wise_batchgroup_status', // Table: Expenses. Size: 648 kB
'privacy_transfer_id', // Table: Transactions. Size: 16 kB
'Subscriptions_lastChargedAt', // Table: Subscriptions. Size: 16 kB
'virtual_card_requests__host_collective_id__collective_id', // Table: VirtualCardRequests. Size: 16 kB
Copy link
Member Author

@Betree Betree Dec 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @znarf for the following indexes:

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That looks pretty dangerous to me, I think this has the potential to break production. transactions__collective_id_createdAt and transactions__host_collective_id_createdAt were carefully created for the generic GraphQL transactions resolver.

Comment on lines 9 to 25
'transactions__collective_id_createdAt', // Table: Transactions. Size: 351 MB
'transactions__host_collective_id_createdAt', // Table: Transactions. Size: 246 MB
'transactions__hostCollective_clearedAt', // Table: Transactions. Size: 127 MB
'activities__from_collective_id', // Table: Activities. Size: 100 MB. Replaced by activities__from_collective_id_simple, which skips the transactions activities.
'activities__host_collective_id', // Table: Activities. Size: 88 MB. Replaced by activities__host_collective_id_simple, which skips the transactions activities.
'orders__req_ip', // Table: Orders. Size: 21 MB. The fraud detection uses redis, not the DB.
'payment_methods__fingerprint', // Table: PaymentMethods. Size: 16 MB
// 'UploadedFiles_s3_hash', // Table: UploadedFiles. Size: 16 MB. This one is unused because we've disabled Klippa, but we want to keep it in case we re-enable the feature.
'transaction_wise_transfer_id', // Table: Transactions. Size: 8144 kB
'expense_tag_stats_tag', // View: ExpenseTagStats. Size: 6864 kB. We're seemingly never fetching BY tag.
// 'uploaded_files__created_by_user_id', // Table: UploadedFiles. Size: 5792 kB. For Klippa rate-limiting, keeping in case we re-enable an AI feature.
'expenses_data_stripe_virtual_card_transaction_id', // Table: Expenses. Size: 1104 kB
'expenses_data_paypal_transaction_id', // Table: Expenses. Size: 1048 kB
'wise_batchgroup_status', // Table: Expenses. Size: 648 kB
'privacy_transfer_id', // Table: Transactions. Size: 16 kB
'Subscriptions_lastChargedAt', // Table: Subscriptions. Size: 16 kB
'virtual_card_requests__host_collective_id__collective_id', // Table: VirtualCardRequests. Size: 16 kB
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @kewitz for the following indexes:

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything seems safe to delete, but I'm surprised payment_methods__fingerprint is not being used. I'll investigate this further.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything is safe to delete.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The transferwise id was introduced for webhooks, I doubt we moved away from that. #6002 #7379

Comment on lines 9 to 25
'transactions__collective_id_createdAt', // Table: Transactions. Size: 351 MB
'transactions__host_collective_id_createdAt', // Table: Transactions. Size: 246 MB
'transactions__hostCollective_clearedAt', // Table: Transactions. Size: 127 MB
'activities__from_collective_id', // Table: Activities. Size: 100 MB. Replaced by activities__from_collective_id_simple, which skips the transactions activities.
'activities__host_collective_id', // Table: Activities. Size: 88 MB. Replaced by activities__host_collective_id_simple, which skips the transactions activities.
'orders__req_ip', // Table: Orders. Size: 21 MB. The fraud detection uses redis, not the DB.
'payment_methods__fingerprint', // Table: PaymentMethods. Size: 16 MB
// 'UploadedFiles_s3_hash', // Table: UploadedFiles. Size: 16 MB. This one is unused because we've disabled Klippa, but we want to keep it in case we re-enable the feature.
'transaction_wise_transfer_id', // Table: Transactions. Size: 8144 kB
'expense_tag_stats_tag', // View: ExpenseTagStats. Size: 6864 kB. We're seemingly never fetching BY tag.
// 'uploaded_files__created_by_user_id', // Table: UploadedFiles. Size: 5792 kB. For Klippa rate-limiting, keeping in case we re-enable an AI feature.
'expenses_data_stripe_virtual_card_transaction_id', // Table: Expenses. Size: 1104 kB
'expenses_data_paypal_transaction_id', // Table: Expenses. Size: 1048 kB
'wise_batchgroup_status', // Table: Expenses. Size: 648 kB
'privacy_transfer_id', // Table: Transactions. Size: 16 kB
'Subscriptions_lastChargedAt', // Table: Subscriptions. Size: 16 kB
'virtual_card_requests__host_collective_id__collective_id', // Table: VirtualCardRequests. Size: 16 kB
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @gustavlrsn for the following indexes:

  • expense_tag_stats_tag

@Betree Betree force-pushed the perf/remove-unused-indexes branch from d82e407 to f4c5087 Compare January 2, 2025 10:18
@Betree Betree marked this pull request as draft January 9, 2025 09:39
@Betree
Copy link
Member Author

Betree commented Jan 9, 2025

Moving this back to draft as:

  1. We're going to fix some of the transaction indexes rather than removing
  2. We're going to split the PR into smaller, safer parts

@Betree Betree force-pushed the perf/remove-unused-indexes branch from 7df3778 to 73f46c5 Compare January 21, 2025 14:21
@Betree Betree marked this pull request as ready for review January 21, 2025 14:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants