文章目錄
- 環境
- 文檔用途
- 詳細信息
環境
系統平臺:Linux x86-64 Red Hat Enterprise Linux 7
版本:4.5.8
文檔用途
本文檔主要介紹HGDB索引膨脹的定義、產生的原因、如何檢查以及遇到索引膨脹如何處理(包括預防和解決)
詳細信息
索引膨脹的定義
假設對一個索引進行順序的數據插入,那么索引分裂應該只會發生在最右邊的葉子結點;若對索引進行無序的插入,那么中間的葉子結點會進行了分裂,多出了很多空閑空間,索引掃描的時候需要掃描更多的頁,造成了io和存儲空間上的浪費
產生膨脹索引的原因
1、表中的每個行版本(“元組”)都有一個未死亡的索引條目。當 VACUUM 刪除死元組時,它還必須刪除相應的索引條目,這會在索引頁中創建空白空間。此類空間可以重復使用,但如果沒有新條目添加到頁面,則該空間仍為空
2、頻繁更新相同的行,在VACUUM清理老元組前,表和索引會維護相同行的很多版本。若索引頁填滿,HGDB會將索引頁分裂成2個,在VACUUM執行完清理之后,最終會得到2個臃腫的頁面而不是1個
索引膨脹的檢查
提前安裝好pgstattuple插件,用于返回一個關系的物理長度、"死亡"元組的百分比以及其他信息
create extension pgstattuple;
如下查詢用于查看free_space占用前5的索引,空閑空間越大說明索引使用率越低
select oid::regclass,(pgstattuple(oid)).* from pg_class where relkind='i' order by free_space desc limit 5 offset 0;
如下查詢查看具體表中的各個索引大小
select relname, pg_relation_size(oid)/1024 || 'K' AS size from pg_class where relkind='i' and relname='stu_dex';
除了擴展也可以通過如下的sql查看索引膨脹的相關信息(膨脹大小、膨脹率等),便于對膨脹的索引做優化
SELECT current_database(), nspname AS schemaname, tblname, idxname, bs*(relpages)::bigint AS real_size,bs*(relpages-est_pages)::bigint AS extra_size,100 * (relpages-est_pages)::float / relpages AS extra_pct,fillfactor,CASE WHEN relpages > est_pages_ffTHEN bs*(relpages-est_pages_ff)ELSE 0END AS bloat_size,100 * (relpages-est_pages_ff)::float / relpages AS bloat_pct,FROM (SELECT coalesce(1 +ceil(reltuples/floor((bs-pageopqdata-pagehdr)/(4+nulldatahdrwidth)::float)), 0 -- ItemIdData size + computed avg size of a tuple (nulldatahdrwidth)) AS est_pages,coalesce(1 +ceil(reltuples/floor((bs-pageopqdata-pagehdr)*fillfactor/(100*(4+nulldatahdrwidth)::float))), 0) AS est_pages_ff,bs, nspname, tblname, idxname, relpages, fillfactor, is_na-- , pgstatindex(idxoid) AS pst, index_tuple_hdr_bm, maxalign, pagehdr, nulldatawidth, nulldatahdrwidth, reltuples -- (DEBUG INFO)FROM (SELECT maxalign, bs, nspname, tblname, idxname, reltuples, relpages, idxoid, fillfactor,( index_tuple_hdr_bm +maxalign - CASE -- Add padding to the index tuple header to align on MAXALIGNWHEN index_tuple_hdr_bm%maxalign = 0 THEN maxalignELSE index_tuple_hdr_bm%maxalignEND+ nulldatawidth + maxalign - CASE -- Add padding to the data to align on MAXALIGNWHEN nulldatawidth = 0 THEN 0WHEN nulldatawidth::integer%maxalign = 0 THEN maxalignELSE nulldatawidth::integer%maxalignEND)::numeric AS nulldatahdrwidth, pagehdr, pageopqdata, is_na-- , index_tuple_hdr_bm, nulldatawidth -- (DEBUG INFO)FROM (SELECT n.nspname, i.tblname, i.idxname, i.reltuples, i.relpages,i.idxoid, i.fillfactor, current_setting('block_size')::numeric AS bs,CASE -- MAXALIGN: 4 on 32bits, 8 on 64bits (and mingw32 ?)WHEN version() ~ 'mingw32' OR version() ~ '64-bit|x86_64|ppc64|ia64|amd64' THEN 8ELSE 4END AS maxalign,/* per page header, fixed size: 20 for 7.X, 24 for others */24 AS pagehdr,/* per page btree opaque data */16 AS pageopqdata,/* per tuple header: add IndexAttributeBitMapData if some cols are null-able */CASE WHEN max(coalesce(s.null_frac,0)) = 0THEN 8 -- IndexTupleData sizeELSE 8 + (( 32 + 8 - 1 ) / 8) -- IndexTupleData size + IndexAttributeBitMapData size ( max num filed per index + 8 - 1 /8)END AS index_tuple_hdr_bm,/* data len: we remove null values save space using it fractionnal part from stats */sum( (1-coalesce(s.null_frac, 0)) * coalesce(s.avg_width, 1024)) AS nulldatawidth,max( CASE WHEN i.atttypid = 'pg_catalog.name'::regtype THEN 1 ELSE 0 END ) > 0 AS is_naFROM (SELECT ct.relname AS tblname, ct.relnamespace, ic.idxname, ic.attpos, ic.indkey, ic.indkey[ic.attpos], ic.reltuples, ic.relpages, ic.tbloid, ic.idxoid, ic.fillfactor,coalesce(a1.attnum, a2.attnum) AS attnum, coalesce(a1.attname, a2.attname) AS attname, coalesce(a1.atttypid, a2.atttypid) AS atttypid,CASE WHEN a1.attnum IS NULLTHEN ic.idxnameELSE ct.relnameEND AS attrelnameFROM (SELECT idxname, reltuples, relpages, tbloid, idxoid, fillfactor, indkey,pg_catalog.generate_series(1,indnatts) AS attposFROM (SELECT ci.relname AS idxname, ci.reltuples, ci.relpages, i.indrelid AS tbloid,i.indexrelid AS idxoid,coalesce(substring(array_to_string(ci.reloptions, ' ')from 'fillfactor=([0-9]+)')::smallint, 90) AS fillfactor,i.indnatts,pg_catalog.string_to_array(pg_catalog.textin(pg_catalog.int2vectorout(i.indkey)),' ')::int[] AS indkeyFROM pg_catalog.pg_index iJOIN pg_catalog.pg_class ci ON ci.oid = i.indexrelidWHERE ci.relam=(SELECT oid FROM pg_am WHERE amname = 'btree')AND ci.relpages > 0) AS idx_data) AS icJOIN pg_catalog.pg_class ct ON ct.oid = ic.tbloidLEFT JOIN pg_catalog.pg_attribute a1 ONic.indkey[ic.attpos] <> 0AND a1.attrelid = ic.tbloidAND a1.attnum = ic.indkey[ic.attpos]LEFT JOIN pg_catalog.pg_attribute a2 ONic.indkey[ic.attpos] = 0AND a2.attrelid = ic.idxoidAND a2.attnum = ic.attpos) iJOIN pg_catalog.pg_namespace n ON n.oid = i.relnamespaceJOIN pg_catalog.pg_stats s ON s.schemaname = n.nspnameAND s.tablename = i.attrelnameAND s.attname = i.attnameGROUP BY 1,2,3,4,5,6,7,8,9,10,11) AS rows_data_stats) AS rows_hdr_pdg_stats) AS relation_statsORDER BY nspname, tblname, idxname;
預防索引膨脹
實例級
vacuum命令運行的最小延遲:
alter system set autovacuum_naptime=15s;
在一個表上觸發vacuum的被插入、被更新或被刪除元組的最小數量:
alter system set autovacuum_vacuum_threshold=25;
在一個表上觸發analyze的被插入、被更新或被刪除元組的最小數量:
alter system set autovacuum_analyze_threshold=10;
決定是否觸發vaccum時作為一個分數將它加到autovacuum_vacuum_threshold上:
alter system set autovacuum_vacuum_scale_factor=0.01;
決定是否觸發analyze時作為一個分數將它加到autovacuum_vacuum_threshold上:
alter system set autovacuum_analyze_scale_factor=0.05;
autovacuum觸發條件:
pg_stat_all_tables.n_dead_tup大于 autovacuum_vacuum_threshold + pg_class.reltuples * autovacuum_vacuum_scale_factor
autoananlyze觸發條件:
pg_stat_all_tables.n_mod_since_analyze大于 autovacuum_analyze_threshold + pg_class.reltuples * autovacuum_analyze_scale_factor
表級
1、設置合適的autovacuum_vacuum_scale_factor,大表如果頻繁的有更新或刪除和插入操作, 建議設置較小的autovacuum_vacuum_scale_factor來降低空間的浪費,加快對表的vacuum操作頻率
對更新頻繁的表,單獨調整
alter table tablename set (autovacuum_vacuum_scale_factor=0.05);
2、設置表的fillfactor,對頻繁更新的表,調低fillfactor參數:
alter table tablename set (fillfactor=85);
解決索引膨脹
1、重建索引
創建新索引 create index CONCURRENTLY new_index ;
刪除舊索引 drop index old_index ;
或者
重建索引 reindex index 索引名稱 CONCURRENTLY ;
analyze tablename;
2、執行vacuum full
"完全"清理,這樣可以恢復更多的空間,但是花的時間更多并且在表上施加了排它鎖