我每天都会发布一篇新的 PostgreSQL "howto" 文章。加入我的旅程吧 — 订阅、提供反馈、分享!
今天让我们讨论如何查找和清理冗余索引。
冗余索引是指一个表上存在多个索引,它们的作用相同,或者其中一个索引可以满足所有其他索引的查询需求。两个基本的例子:
- 重复索引:两个或多个具有完全相同的列且顺序相同的索引显然是冗余的。
- 重叠索引:当你有一个多列索引,例如在列
(a, b)
上创建了索引,然后又在(a)
上创建了另一个索引。对于仅在列a
上进行过滤的查询,(a, b)
上的索引就足够了,因此在(a)
上的索引可能是不必要的。
注意,(a)
对于 (a, b)
是冗余的,但对于 (b, a)
则不是。同样,(b)
对于 (a, b)
也不是冗余的。
此外,我们还需要假设:
- 唯一索引在此类分析中不应被考虑,因为它们有特殊用途。
- 表达式索引遵循相同的规则 — 我们只需将每个表达式视为列,表达式的值应完全匹配。
- 在部分索引的情况下,条件应完全匹配 (此规则可以根据某些假设进行调整,但我们不会这样做)。
- 我们不考虑覆盖索引 (🎯 **TODO:**也将其包括在内)。
我们之前讨论未使用索引的 6 个原因同样适用于此处。
-
使用下面提供的查询,识别冗余索引集。只需要分析集群中的一个节点即可,例如主节点。因为此分析仅基于静态信息 (数据库结构),所以统计信息是何时重置的并不重要。
-
对于每个被认为是冗余的索引,进行手动分析以避免错误。如果不完全确定,请将该索引从考虑中移除。
译者注:即,将该索引从考虑移除的列表中删除
-
可选地,考虑在实际删除索引之前使用"软删除"技术,如 Day 53: Index maintenance 中所述。
-
经过适当的分析和测试后,一次删除一个索引,同时密切关注数据库性能指标,确保没有预期之外的执行计划变化。要删除索引,请使用
DROP INDEX CONCURRENTLY
。
这是来自 postgres-checkup 的一个查询,它以 JSON 的形式生成结果,其中包括使用 DROP INDEX CONCURRENTLY
删除索引的命令。
with const as (
select 0 as min_relpages -- on very large DBs, increase it to, say, 100
), fk_indexes as (
select
n.nspname as schema_name,
ci.relname as index_name,
cr.relname as table_name,
(confrelid::regclass)::text as fk_table_ref,
array_to_string(indclass, ', ') as opclasses
from pg_index i
join pg_class ci on ci.oid = i.indexrelid and ci.relkind = 'i'
join pg_class cr on cr.oid = i.indrelid and cr.relkind = 'r'
join pg_namespace n on n.oid = ci.relnamespace
join pg_constraint cn on cn.conrelid = cr.oid
left join pg_stat_user_indexes si on si.indexrelid = i.indexrelid
where
contype = 'f'
and not i.indisunique
and conkey is not null
and ci.relpages > (select min_relpages from const)
and si.idx_scan < 10
), index_data as ( -- Redundant indexes
select
*,
indkey::text as columns,
array_to_string(indclass, ', ') as opclasses
from pg_index i
join pg_class ci on ci.oid = i.indexrelid and ci.relkind = 'i'
where indisvalid = true and ci.relpages > (select min_relpages from const)
), redundant_indexes as (
select
i2.indexrelid as index_id,
tnsp.nspname AS schema_name,
trel.relname AS table_name,
pg_relation_size(trel.oid) as table_size_bytes,
irel.relname AS index_name,
am1.amname as access_method,
(i1.indexrelid::regclass)::text as reason,
i1.indexrelid as reason_index_id,
pg_get_indexdef(i1.indexrelid) main_index_def,
pg_size_pretty(pg_relation_size(i1.indexrelid)) main_index_size,
pg_get_indexdef(i2.indexrelid) index_def,
pg_relation_size(i2.indexrelid) index_size_bytes,
s.idx_scan as index_usage,
quote_ident(tnsp.nspname) as formated_schema_name,
coalesce(nullif(quote_ident(tnsp.nspname), 'public') || '.', '') || quote_ident(irel.relname) as formated_index_name,
quote_ident(trel.relname) AS formated_table_name,
coalesce(nullif(quote_ident(tnsp.nspname), 'public') || '.', '') || quote_ident(trel.relname) as formated_relation_name,
i2.opclasses
from index_data as i1
join index_data as i2 on
i1.indrelid = i2.indrelid -- the same table
and i1.indexrelid <> i2.indexrelid -- NOT the same index
join pg_opclass op1 on i1.indclass[0] = op1.oid
join pg_opclass op2 on i2.indclass[0] = op2.oid
join pg_am am1 on op1.opcmethod = am1.oid
join pg_am am2 on op2.opcmethod = am2.oid
join pg_stat_user_indexes as s on s.indexrelid = i2.indexrelid
join pg_class as trel on trel.oid = i2.indrelid
join pg_namespace as tnsp on trel.relnamespace = tnsp.oid
join pg_class as irel on irel.oid = i2.indexrelid
where
not i2.indisprimary -- index 1 is not PK
and not ( -- skip if index1 is (primary or uniq) and is NOT (primary and uniq)
i2.indisunique and not i1.indisprimary
)
and am1.amname = am2.amname -- same access type
and i1.columns like (i2.columns || '%') -- index 2 includes all columns from index 1
and i1.opclasses like (i2.opclasses || '%')
-- index expressions is same
and pg_get_expr(i1.indexprs, i1.indrelid) is not distinct from pg_get_expr(i2.indexprs, i2.indrelid)
-- index predicates is same
and pg_get_expr(i1.indpred, i1.indrelid) is not distinct from pg_get_expr(i2.indpred, i2.indrelid)
), redundant_indexes_fk as (
select
ri.*,
(
select count(1)
from fk_indexes fi
where
fi.fk_table_ref = ri.table_name
and fi.opclasses like (ri.opclasses || '%')
) > 0 as supports_fk
from redundant_indexes ri
), redundant_indexes_tmp_num as ( -- Cut recursive links
select row_number() over () num, rig.*
from redundant_indexes_fk rig
), redundant_indexes_tmp_links as (
select
ri1.*,
ri2.num as r_num
from redundant_indexes_tmp_num ri1
left join redundant_indexes_tmp_num ri2 on
ri2.reason_index_id = ri1.index_id
and ri1.reason_index_id = ri2.index_id
), redundant_indexes_tmp_cut as (
select *
from redundant_indexes_tmp_links
where num < r_num or r_num is null
), redundant_indexes_cut_grouped as (
select
distinct(num),
*
from redundant_indexes_tmp_cut
order by index_size_bytes desc
), redundant_indexes_grouped as (
select
index_id,
schema_name,
table_name,
table_size_bytes,
index_name,
access_method,
string_agg(distinct reason, ', ') as reason,
string_agg(main_index_def, ', ') as main_index_def,
string_agg(main_index_size, ', ') as main_index_size,
index_def,
index_size_bytes,
index_usage,
formated_index_name,
formated_schema_name,
formated_table_name,
formated_relation_name,
supports_fk
from redundant_indexes_cut_grouped
group by
index_id,
table_size_bytes,
schema_name,
table_name,
index_name,
access_method,
index_def,
index_size_bytes,
index_usage,
formated_index_name,
formated_schema_name,
formated_table_name,
formated_relation_name,
supports_fk
order by index_size_bytes desc
), redundant_indexes_num as (
select row_number() over () num, rig.*
from redundant_indexes_grouped rig
), redundant_indexes_json as (
select
json_object_agg(coalesce(rin.schema_name, 'public') || '.' || rin.index_name, rin) as json
from redundant_indexes_num rin
), redundant_indexes_total as (
select
sum(index_size_bytes) as index_size_bytes_sum,
sum(table_size_bytes) as table_size_bytes_sum
from redundant_indexes_grouped
), do_lines as (
select
format(
'DROP INDEX CONCURRENTLY %s; -- %s, %s, table %s',
formated_index_name,
pg_size_pretty(index_size_bytes)::text,
reason,
formated_relation_name
) as line
from redundant_indexes_grouped
order by table_name, index_name
), undo_lines as (
select
replace(
format('%s; -- table %s', index_def, formated_relation_name),
'CREATE INDEX',
'CREATE INDEX CONCURRENTLY'
) as line
from redundant_indexes_grouped
order by table_name, index_name
), database_stat as (
select row_to_json(dbstat)
from (
select
sd.stats_reset::timestamptz(0),
age(
date_trunc('minute',now()),
date_trunc('minute',sd.stats_reset)
) as stats_age,
((extract(epoch from now()) - extract(epoch from sd.stats_reset))/86400)::int as days,
(select pg_database_size(current_database())) as database_size_bytes
from pg_stat_database sd
where datname = current_database()
) as dbstat
) -- final result
select
jsonb_pretty(jsonb_build_object(
'redundant_indexes',
(select * from redundant_indexes_json),
'redundant_indexes_total',
(select row_to_json(rit) from redundant_indexes_total as rit),
'do',
(select json_agg(dl.line) from do_lines as dl),
'undo',
(select json_agg(ul.line) from undo_lines as ul),
'database_stat',
(select * from database_stat),
'min_index_size_bytes',
(select min_relpages * current_setting('block_size')::numeric from const)
));