標簽
PostgreSQL , dump_stat , 統計信息 , 導出導入
背景
《PostgreSQL 規格評估 - 微觀、宏觀、精準 多視角估算數據庫性能(選型、做預算不求人)》
EXPLAIN是PG數據庫用于輸出SQL執行計劃的語法,
1、生成的執行計劃中包含COST一項。
如果校準了成本因子,COST可以和SQL實際執行時間對其。因子校對的方法如下,實際上每一種硬件,我們只需要校對一遍即可。
《優化器成本因子校對(disk,ssd,memory IO開銷精算) - PostgreSQL real seq_page_cost & random_page_cost in disks,ssd,memory》
《優化器成本因子校對 - PostgreSQL explain cost constants alignment to timestamp》
校對因子如下:
#seq_page_cost = 1.0 # measured on an arbitrary scale
random_page_cost = 1.2 # same scale as above
#cpu_tuple_cost = 0.01 # same scale as above
#cpu_index_tuple_cost = 0.005 # same scale as above
#cpu_operator_cost = 0.0025 # same scale as above # 以下不需要校對, 不考慮并行計算SQL
parallel_tuple_cost = 0.1 # same scale as above
parallel_setup_cost = 1000.0 # same scale as above
effective_cache_size = 10GB
2、評估COST還需要依賴統計信息柱狀圖:
涉及reltuples, relpages. 表示評估的記錄數以及占用多少個數據塊。注意源頭的block_size可能和PG的不一致,占用多少個塊需要轉換一下。(show block_size可以查看數據塊大小。) postgres=# \d pg_class Table "pg_catalog.pg_class" Column | Type | Collation | Nullable | Default
---------------------+--------------+-----------+----------+--------- relname | name | | not null | -- 對象名 relnamespace | oid | | not null | -- 對象所屬的schema, 對應pg_namespace.oid reltype | oid | | not null | reloftype | oid | | not null | relowner | oid | | not null | relam | oid | | not null | relfilenode | oid | | not null | reltablespace | oid | | not null | relpages | integer | | not null | -- 評估的頁數(單位為block_size) reltuples | real | | not null | -- 評估的記錄數 relallvisible | integer | | not null | reltoastrelid | oid | | not null | relhasindex | boolean | | not null | relisshared | boolean | | not null | relpersistence | "char" | | not null | relkind | "char" | | not null | relnatts | smallint | | not null | relchecks | smallint | | not null | relhasoids | boolean | | not null | relhaspkey | boolean | | not null | relhasrules | boolean | | not null | relhastriggers | boolean | | not null | relhassubclass | boolean | | not null | relrowsecurity | boolean | | not null | relforcerowsecurity | boolean | | not null | relispopulated | boolean | | not null | relreplident | "char" | | not null | relispartition | boolean | | not null | relfrozenxid | xid | | not null | relminmxid | xid | | not null | relacl | aclitem[] | | | reloptions | text[] | | | relpartbound | pg_node_tree | | |
Indexes: "pg_class_oid_index" UNIQUE, btree (oid) "pg_class_relname_nsp_index" UNIQUE, btree (relname, relnamespace) "pg_class_tblspc_relfilenode_index" btree (reltablespace, relfilenode) 涉及 空值比例、平均列寬、唯一值比例或個數、高頻值以及頻率、柱狀圖分布、存儲相關性、多值列(高頻元素及比例、元素柱狀圖分布)。 -- 這個是視圖: postgres=# \d pg_stats View "pg_catalog.pg_stats" Column | Type | Collation | Nullable | Default
------------------------+----------+-----------+----------+--------- schemaname | name | | | -- 對象所屬的schema tablename | name | | | -- 對象名 attname | name | | | -- 列名 inherited | boolean | | | -- 是否為繼承表的統計信息(false時表示當前表的統計信息,true時表示包含所有繼承表的統計信息) null_frac | real | | | -- 該列空值比例 avg_width | integer | | | -- 該列平均長度 n_distinct | real | | | -- 該列唯一值個數(-1表示唯一,小于1表示占比,大于等于1表示實際的唯一值個數) most_common_vals | anyarray | | | -- 該列高頻詞 most_common_freqs | real[] | | | -- 該列高頻詞對應的出現頻率 histogram_bounds | anyarray | | | -- 該列柱狀圖(表示隔出的每個BUCKET的記錄數均等) correlation | real | | | -- 該列存儲相關性(-1到1的區間),絕對值越小,存儲越離散。小于0表示反向相關,大于0表示正向相關 most_common_elems | anyarray | | | -- 該列為多值類型(數組)時,多值元素的高頻詞 most_common_elem_freqs | real[] | | | -- 多值元素高頻詞的出現頻率 elem_count_histogram | real[] | | | -- 多值元素的柱狀圖中,每個區間的非空唯一元素個數 -- 這個是實際存儲的數據(也就是要導入的部分):
-- https://www.postgresql.org/docs/10/static/catalog-pg-statistic.html postgres=# \d pg_statistic Table "pg_catalog.pg_statistic" Column | Type | Collation | Nullable | Default
-------------+----------+-----------+----------+--------- starelid | oid | | not null | -- 對象OID,對應pg_class.oid staattnum | smallint | | not null | -- 該列在表中的位置序號,對應pg_attribute.attnum stainherit | boolean | | not null | -- 是否為繼承表的統計信息(false時表示當前表的統計信息,true時表示包含所有繼承表的統計信息) stanullfrac | real | | not null | -- 空值比例 stawidth | integer | | not null | -- 平均長度 stadistinct | real | | not null | -- 唯一值個數、比例 stakind1 | smallint | | not null | -- 表示第1個SLOT的統計信息分類編號 stakind2 | smallint | | not null | -- 表示第2個SLOT的統計信息分類編號 stakind3 | smallint | | not null | -- 表示第3個SLOT的統計信息分類編號 stakind4 | smallint | | not null | -- 表示第4個SLOT的統計信息分類編號 stakind5 | smallint | | not null | -- 表示第5個SLOT的統計信息分類編號 staop1 | oid | | not null | -- 表示第1個SLOT的統計信息是用哪個operator生成的(例如統計柱狀圖邊界,需要用到 "<" 這個操作符) staop2 | oid | | not null | -- 表示第2個SLOT的統計信息是用哪個operator生成的(例如統計柱狀圖邊界,需要用到 "<" 這個操作符) staop3 | oid | | not null | -- 表示第3個SLOT的統計信息是用哪個operator生成的(例如統計柱狀圖邊界,需要用到 "<" 這個操作符) staop4 | oid | | not null | -- 表示第4個SLOT的統計信息是用哪個operator生成的(例如統計柱狀圖邊界,需要用到 "<" 這個操作符) staop5 | oid | | not null | -- 表示第5個SLOT的統計信息是用哪個operator生成的(例如統計柱狀圖邊界,需要用到 "<" 這個操作符) stanumbers1 | real[] | | | -- 表示第1個SLOT的以numeric[]為結果的統計信息,NULL說明這個SLOT分類沒有numeric的統計信息。 stanumbers2 | real[] | | | -- 表示第2個SLOT的以numeric[]為結果的統計信息,NULL說明這個SLOT分類沒有numeric的統計信息。 stanumbers3 | real[] | | | -- 表示第3個SLOT的以numeric[]為結果的統計信息,NULL說明這個SLOT分類沒有numeric的統計信息。 stanumbers4 | real[] | | | -- 表示第4個SLOT的以numeric[]為結果的統計信息,NULL說明這個SLOT分類沒有numeric的統計信息。 stanumbers5 | real[] | | | -- 表示第5個SLOT的以numeric[]為結果的統計信息,NULL說明這個SLOT分類沒有numeric的統計信息。 stavalues1 | anyarray | | | -- 表示第1個SLOT的以anyarray[]為結果的統計信息,NULL說明這個SLOT分類沒有anyarray[]統計信息。數組類型為列的元素類型,或者列本身的類型。 stavalues2 | anyarray | | | -- 表示第2個SLOT的以anyarray[]為結果的統計信息,NULL說明這個SLOT分類沒有anyarray[]統計信息。數組類型為列的元素類型,或者列本身的類型。 stavalues3 | anyarray | | | -- 表示第3個SLOT的以anyarray[]為結果的統計信息,NULL說明這個SLOT分類沒有anyarray[]統計信息。數組類型為列的元素類型,或者列本身的類型。 stavalues4 | anyarray | | | -- 表示第4個SLOT的以anyarray[]為結果的統計信息,NULL說明這個SLOT分類沒有anyarray[]統計信息。數組類型為列的元素類型,或者列本身的類型。 stavalues5 | anyarray | | | -- 表示第5個SLOT的以anyarray[]為結果的統計信息,NULL說明這個SLOT分類沒有anyarray[]統計信息。數組類型為列的元素類型,或者列本身的類型。
Indexes: "pg_statistic_relid_att_inh_index" UNIQUE, btree (starelid, staattnum, stainherit)
statkind的定義
src/include/catalog/pg_statistic.h
/* * Currently, five statistical slot "kinds" are defined by core PostgreSQL, * as documented below. Additional "kinds" will probably appear in * future to help cope with non-scalar datatypes. Also, custom data types * can define their own "kind" codes by mutual agreement between a custom * typanalyze routine and the selectivity estimation functions of the type's * operators. * * Code reading the pg_statistic relation should not assume that a particular * data "kind" will appear in any particular slot. Instead, search the * stakind fields to see if the desired data is available. (The standard * function get_attstatsslot() may be used for this.) */ /* * The present allocation of "kind" codes is: * * 1-99: reserved for assignment by the core PostgreSQL project * (values in this range will be documented in this file) * 100-199: reserved for assignment by the PostGIS project * (values to be documented in PostGIS documentation) * 200-299: reserved for assignment by the ESRI ST_Geometry project * (values to be documented in ESRI ST_Geometry documentation) * 300-9999: reserved for future public assignments * * For private use you may choose a "kind" code at random in the range * 10000-30000. However, for code that is to be widely disseminated it is * better to obtain a publicly defined "kind" code by request from the * PostgreSQL Global Development Group. */ /* * In a "most common values" slot, staop is the OID of the "=" operator * used to decide whether values are the same or not. stavalues contains * the K most common non-null values appearing in the column, and stanumbers * contains their frequencies (fractions of total row count). The values * shall be ordered in decreasing frequency. Note that since the arrays are * variable-size, K may be chosen by the statistics collector. Values should * not appear in MCV unless they have been observed to occur more than once; * a unique column will have no MCV slot. */
#define STATISTIC_KIND_MCV 1 /* * A "histogram" slot describes the distribution of scalar data. staop is * the OID of the "<" operator that describes the sort ordering. (In theory, * more than one histogram could appear, if a datatype has more than one * useful sort operator.) stavalues contains M (>=2) non-null values that * divide the non-null column data values into M-1 bins of approximately equal * population. The first stavalues item is the MIN and the last is the MAX. * stanumbers is not used and should be NULL. IMPORTANT POINT: if an MCV * slot is also provided, then the histogram describes the data distribution * *after removing the values listed in MCV* (thus, it's a "compressed * histogram" in the technical parlance). This allows a more accurate * representation of the distribution of a column with some very-common * values. In a column with only a few distinct values, it's possible that * the MCV list describes the entire data population; in this case the * histogram reduces to empty and should be omitted. */
#define STATISTIC_KIND_HISTOGRAM 2 /* * A "correlation" slot describes the correlation between the physical order * of table tuples and the ordering of data values of this column, as seen * by the "<" operator identified by staop. (As with the histogram, more * than one entry could theoretically appear.) stavalues is not used and * should be NULL. stanumbers contains a single entry, the correlation * coefficient between the sequence of data values and the sequence of * their actual tuple positions. The coefficient ranges from +1 to -1. */
#define STATISTIC_KIND_CORRELATION 3 /* * A "most common elements" slot is similar to a "most common values" slot, * except that it stores the most common non-null *elements* of the column * values. This is useful when the column datatype is an array or some other * type with identifiable elements (for instance, tsvector). staop contains * the equality operator appropriate to the element type. stavalues contains * the most common element values, and stanumbers their frequencies. Unlike * MCV slots, frequencies are measured as the fraction of non-null rows the * element value appears in, not the frequency of all rows. Also unlike * MCV slots, the values are sorted into the element type's default order * (to support binary search for a particular value). Since this puts the * minimum and maximum frequencies at unpredictable spots in stanumbers, * there are two extra members of stanumbers, holding copies of the minimum * and maximum frequencies. Optionally, there can be a third extra member, * which holds the frequency of null elements (expressed in the same terms: * the fraction of non-null rows that contain at least one null element). If * this member is omitted, the column is presumed to contain no null elements. * * Note: in current usage for tsvector columns, the stavalues elements are of * type text, even though their representation within tsvector is not * exactly text. */
#define STATISTIC_KIND_MCELEM 4 /* * A "distinct elements count histogram" slot describes the distribution of * the number of distinct element values present in each row of an array-type * column. Only non-null rows are considered, and only non-null elements. * staop contains the equality operator appropriate to the element type. * stavalues is not used and should be NULL. The last member of stanumbers is * the average count of distinct element values over all non-null rows. The * preceding M (>=2) members form a histogram that divides the population of * distinct-elements counts into M-1 bins of approximately equal population. * The first of these is the minimum observed count, and the last the maximum. */
#define STATISTIC_KIND_DECHIST 5 /* * A "length histogram" slot describes the distribution of range lengths in * rows of a range-type column. stanumbers contains a single entry, the * fraction of empty ranges. stavalues is a histogram of non-empty lengths, in * a format similar to STATISTIC_KIND_HISTOGRAM: it contains M (>=2) range * values that divide the column data values into M-1 bins of approximately * equal population. The lengths are stored as float8s, as measured by the * range type's subdiff function. Only non-null rows are considered. */
#define STATISTIC_KIND_RANGE_LENGTH_HISTOGRAM 6 /* * A "bounds histogram" slot is similar to STATISTIC_KIND_HISTOGRAM, but for * a range-type column. stavalues contains M (>=2) range values that divide * the column data values into M-1 bins of approximately equal population. * Unlike a regular scalar histogram, this is actually two histograms combined * into a single array, with the lower bounds of each value forming a * histogram of lower bounds, and the upper bounds a histogram of upper * bounds. Only non-NULL, non-empty ranges are included. */
#define STATISTIC_KIND_BOUNDS_HISTOGRAM 7
那么這些統計信息如何導入導出呢?
導出導入統計信息
https://postgrespro.com/docs/postgresproee/9.6/dump-stat.html
dump_stat這個插件是PostgreSQL pro推出的兼容9.6版本的導出統計信息的插件。
代碼如下:
https://github.com/postgrespro/postgrespro/tree/PGPRO9_6
https://github.com/postgrespro/postgrespro/tree/PGPRO9_6/contrib/dump_stat
導出
通過dump_stat導出(導出的結構已經是SQL形式),然后通過SQL導入。
$ psql test -A
test=# \t
test=# \o dump_stat.sql
test=# select dump_statistic();
pg_statistic的每一條記錄,產生一條如下SQL:
WITH upsert as ( UPDATE pg_catalog.pg_statistic SET column_name = expression [, ...] WHERE to_schema_qualified_relation(starelid) = t_relname AND to_attname(t_relname, staattnum) = t_attname AND to_atttype(t_relname, staattnum) = t_atttype AND stainherit = t_stainherit RETURNING *)
ins as ( SELECT expression [, ...] WHERE NOT EXISTS (SELECT * FROM upsert) AND to_attnum(t_relname, t_attname) IS NOT NULL AND to_atttype(t_relname, t_attname) = t_atttype)
INSERT INTO pg_catalog.pg_statistic SELECT * FROM ins; where expression can be one of: array_in(array_text, type_name::regtype::oid, -1)
value::type_name
stat導入的實際例子
(表示public.test表的info列的統計信息,如果存在則更新,不存在則插入。)
WITH upsert as ( UPDATE pg_catalog.pg_statistic SET stanullfrac = 0, stawidth = 4, stadistinct = -1, stakind1 = 2, stakind2 = 3, stakind3 = 0, stakind4 = 0, stakind5 = 0, staop1 = 'pg_catalog.<(pg_catalog.text, pg_catalog.text)'::regoperator, staop2 = 'pg_catalog.<(pg_catalog.text, pg_catalog.text)'::regoperator, staop3 = '0'::regoperator, staop4 = '0'::regoperator, staop5 = '0'::regoperator, stanumbers1 = NULL::real[], stanumbers2 = '{-0.5}'::real[], stanumbers3 = NULL::real[], stanumbers4 = NULL::real[], stanumbers5 = NULL::real[], stavalues1 = array_in('{abc,cde,test}', 'pg_catalog.text'::regtype, -1)::anyarray, stavalues2 = NULL::anyarray, stavalues3 = NULL::anyarray, stavalues4 = NULL::anyarray, stavalues5 = NULL::anyarray WHERE to_schema_qualified_relation(starelid) = 'public.test' AND to_attname('public.test', staattnum) = 'info' AND to_atttype('public.test', staattnum) = 'pg_catalog.text' AND stainherit = false RETURNING * ), ins as ( SELECT 'public.test'::regclass, to_attnum('public.test', 'info'), 'false'::boolean, 0::real, 4::integer, -1::real, 2, -- stakind=2 表示柱狀圖 3, -- stakind=3 表示相關性 0, 0, 0, 'pg_catalog.<(pg_catalog.text, pg_catalog.text)'::regoperator, 'pg_catalog.<(pg_catalog.text, pg_catalog.text)'::regoperator, '0'::regoperator, '0'::regoperator, '0'::regoperator, NULL::real[], '{-0.5}'::real[], NULL::real[], NULL::real[], NULL::real[], array_in('{abc,cde,test}', 'pg_catalog.text'::regtype, -1)::anyarray, NULL::anyarray, NULL::anyarray, NULL::anyarray, NULL::anyarray WHERE NOT EXISTS (SELECT * FROM upsert) AND to_attnum('public.test', 'info') IS NOT NULL AND to_atttype('public.test', 'info') = 'pg_catalog.text' )
INSERT INTO pg_catalog.pg_statistic SELECT * FROM ins;
導入
1、修改postgresql.conf,允許修改系統表,重啟數據庫生效配置
vi postgresql.conf allow_system_table_mods=on pg_ctl restart -m fast
2、導入統計信息
-- 1 pg_class update pg_class set reltuples=?, relpages=? where oid=?; -- 2 pg_statistic WITH upsert as ( UPDATE pg_catalog.pg_statistic SET column_name = expression [, ...] WHERE to_schema_qualified_relation(starelid) = t_relname AND to_attname(t_relname, staattnum) = t_attname AND to_atttype(t_relname, staattnum) = t_atttype AND stainherit = t_stainherit RETURNING *)
ins as ( SELECT expression [, ...] WHERE NOT EXISTS (SELECT * FROM upsert) AND to_attnum(t_relname, t_attname) IS NOT NULL AND to_atttype(t_relname, t_attname) = t_atttype)
INSERT INTO pg_catalog.pg_statistic SELECT * FROM ins; where expression can be one of: array_in(array_text, type_name::regtype::oid, -1)
value::type_name
3、導入完成后,將allow_system_table_mods設置為off,重啟數據庫。
dump_statistic代碼
CREATE FUNCTION dump_statistic(relid oid) RETURNS SETOF TEXT AS $$ DECLARE result text; cmd text; -- main query in_args text; -- args for insert up_args text; -- args for upsert fstaop text := '%s::regoperator'; arr_in text := 'array_in(%s, %s::regtype, -1)::anyarray'; stacols text[] = ARRAY['stakind', 'staop', 'stanumbers', 'stavalues' ]; r record; i int; j text; ncols int := 26; -- number of columns in pg_statistic stanum text[]; -- stanumbers{1, 5} staval text[]; -- stavalues{1, 5} staop text[]; -- staop{1, 5} relname text; -- quoted relation name attname text; -- quoted attribute name atttype text; -- quoted attribute type BEGIN for r in select * from pg_catalog.pg_statistic where starelid = relid and get_namespace(starelid) != to_namespace('information_schema') and get_namespace(starelid) != to_namespace('pg_catalog') loop relname := to_schema_qualified_relation(r.starelid); attname := quote_literal(to_attname(relname, r.staattnum)); atttype := quote_literal(to_atttype(relname, r.staattnum)); relname := quote_literal(relname); -- redefine relname in_args := ''; up_args = 'stanullfrac = %s, stawidth = %s, stadistinct = %s, '; cmd := 'WITH upsert as ( ' || 'UPDATE pg_catalog.pg_statistic SET %s ' || 'WHERE to_schema_qualified_relation(starelid) = ' || relname || ' ' 'AND to_attname(' || relname || ', staattnum) = ' || attname || ' ' 'AND to_atttype(' || relname || ', staattnum) = ' || atttype || ' ' 'AND stainherit = ' || r.stainherit || ' ' || 'RETURNING *), ' || 'ins as ( ' || 'SELECT %s ' || 'WHERE NOT EXISTS (SELECT * FROM upsert) ' || 'AND to_attnum(' || relname || ', ' || attname || ') IS NOT NULL ' 'AND to_atttype(' || relname || ', ' || attname || ') = ' || atttype || ') ' 'INSERT INTO pg_catalog.pg_statistic SELECT * FROM ins;'; for i in 1..ncols loop in_args := in_args || '%s'; if i != ncols then in_args := in_args || ', '; end if; end loop; for j in 1..4 loop for i in 1..5 loop up_args := up_args || format('%s%s = %%s', stacols[j], i); if i * j != 20 then up_args := up_args || ', '; end if; end loop; end loop; cmd := format(cmd, up_args, in_args); --prepare template for main query staop := array[format(fstaop, quote_literal(to_schema_qualified_operator(r.staop1))), format(fstaop, quote_literal(to_schema_qualified_operator(r.staop2))), format(fstaop, quote_literal(to_schema_qualified_operator(r.staop3))), format(fstaop, quote_literal(to_schema_qualified_operator(r.staop4))), format(fstaop, quote_literal(to_schema_qualified_operator(r.staop5)))]; stanum := array[r.stanumbers1::text, r.stanumbers2::text, r.stanumbers3::text, r.stanumbers4::text, r.stanumbers5::text]; for i in 1..5 loop if stanum[i] is null then stanum[i] := 'NULL::real[]'; else stanum[i] := '''' || stanum[i] || '''::real[]'; end if; end loop; if r.stavalues1 is not null then staval[1] := format(arr_in, quote_literal(r.stavalues1), quote_literal( to_schema_qualified_type( anyarray_elemtype(r.stavalues1)))); else staval[1] := 'NULL::anyarray'; end if; if r.stavalues2 is not null then staval[2] := format(arr_in, quote_literal(r.stavalues2), quote_literal( to_schema_qualified_type( anyarray_elemtype(r.stavalues2)))); else staval[2] := 'NULL::anyarray'; end if; if r.stavalues3 is not null then staval[3] := format(arr_in, quote_literal(r.stavalues3), quote_literal( to_schema_qualified_type( anyarray_elemtype(r.stavalues3)))); else staval[3] := 'NULL::anyarray'; end if; if r.stavalues4 is not null then staval[4] := format(arr_in, quote_literal(r.stavalues4), quote_literal( to_schema_qualified_type( anyarray_elemtype(r.stavalues4)))); else staval[4] := 'NULL::anyarray'; end if; if r.stavalues5 is not null then staval[5] := format(arr_in, quote_literal(r.stavalues5), quote_literal( to_schema_qualified_type( anyarray_elemtype(r.stavalues5)))); else staval[5] := 'NULL::anyarray'; end if; --DEBUG --staop := array['{arr}', '{arr}', '{arr}', '{arr}', '{arr}']; --stanum := array['{num}', '{num}', '{num}', '{num}', '{num}']; --staval := array['{val}', '{val}', '{val}', '{val}', '{val}']; result := format(cmd, r.stanullfrac, r.stawidth, r.stadistinct, -- stakind r.stakind1, r.stakind2, r.stakind3, r.stakind4, r.stakind5, -- staop staop[1], staop[2], staop[3], staop[4], staop[5], -- stanumbers stanum[1], stanum[2], stanum[3], stanum[4], stanum[5], -- stavalues staval[1], staval[2], staval[3], staval[4], staval[5], -- first 6 columns format('%s::regclass', relname), format('to_attnum(%s, %s)', relname, attname), '''' || r.stainherit || '''::boolean', r.stanullfrac || '::real', r.stawidth || '::integer', r.stadistinct || '::real', -- stakind r.stakind1, r.stakind2, r.stakind3, r.stakind4, r.stakind5, -- staop staop[1], staop[2], staop[3], staop[4], staop[5], -- stanumbers stanum[1], stanum[2], stanum[3], stanum[4], stanum[5], -- stavalues staval[1], staval[2], staval[3], staval[4], staval[5]); return next result; end loop; return; END;
$$ LANGUAGE plpgsql;
我們甚至可以將Oracle數據庫的統計信息,平移到PG數據庫,對齊需要的元素即可:
記錄數、占用多少個數據塊。每列的空值比例、平均列寬、唯一值比例或個數、高頻值以及頻率、柱狀圖分布、存儲相關性、多值列(高頻元素及比例、元素柱狀圖分布)。
好處:在遷移ORACLE數據時,可以關閉autovacuumm(提高導入速度),通過這種方法來導入統計信息。(只要元素對應即可,當然有些元素可能是ORACLE中不采集的,比如多值列的統計信息)。
參考
https://github.com/postgrespro/postgrespro/tree/PGPRO9_6/contrib/dump_stat