如何收集及删除列的统计信息

本文只涉及使用dbm_stats来收集或删除列的统计信息的一些命令,以备查询。

测试表如下(实验环境为10.2.0.4):

SQL> create table test(i int,a varchar2(30));
Table created.

SQL> insert into test select rownum,object_name from all_objects;
9907 rows created.

简单的说,列的统计信息,主要包括两种类型:

  • 只有基本信息:收集的统计信息只有1个桶(bucket)
  • 包含柱状图信息:收集的统计信息包含2到254个桶

也就是说,如果想收集列的基本信息,同时不希望收集柱状图,则需要指定bucket的size为1:

SQL> exec dbms_stats.gather_table_stats(user, 'TEST',
cascade=>false, method_opt=>'for columns i size 1');

PL/SQL procedure successfully completed.                                                                             

SQL> select column_name,num_distinct,num_buckets,low_value,high_value,density,avg_col_len,histogram
  2  from user_tab_columns where table_name='TEST';

COLUMN_NAM NUM_DISTINCT NUM_BUCKETS LOW_VALUE  HIGH_VALUE    DENSITY AVG_COL_LEN HISTOGRAM
---------- ------------ ----------- ---------- ---------- ---------- ----------- ---------------
I                  9907           1 C102       C26408     .000100939           4 NONE
A                                                                                NONE

如果要收集列的柱状图信息,则bucket的个数必须大于等于2(最多不超过254)

SQL> exec dbms_stats.gather_table_stats(user, 'TEST',
cascade=>false, method_opt=>'for columns i size 2');

PL/SQL procedure successfully completed.

SQL> select column_name,num_distinct,num_buckets,low_value,high_value,density,avg_col_len,histogram
  2  from user_tab_columns where table_name='TEST';

COLUMN_NAM NUM_DISTINCT NUM_BUCKETS LOW_VALUE  HIGH_VALUE    DENSITY AVG_COL_LEN HISTOGRAM
---------- ------------ ----------- ---------- ---------- ---------- ----------- ---------------
I                  9907           2 C102       C26408     .000100939           4 HEIGHT BALANCED
A                                                                                NONE

如果要删除列已有的柱状图信息而保留列的基本统计信息,则需要重新收集bucket为1的统计信息

SQL> exec dbms_stats.gather_table_stats(user, 'TEST',
cascade=>false, method_opt=>'for columns i size 1');

PL/SQL procedure successfully completed.

SQL> select column_name,num_distinct,num_buckets,low_value,high_value,density,avg_col_len,histogram
  2  from user_tab_columns where table_name='TEST';

COLUMN_NAM NUM_DISTINCT NUM_BUCKETS LOW_VALUE  HIGH_VALUE    DENSITY AVG_COL_LEN HISTOGRAM
---------- ------------ ----------- ---------- ---------- ---------- ----------- ---------------
I                  9907           1 C102       C26408     .000100939           4 NONE
A                                                                                NONE

这个操作明显不太合理,重新收集统计信息的代价有时候是很大的,所以Oracle11g对此做出了改进,允许只删除柱状图而保留基本统计信息,命令语法如下:

exec dbms_stats.delete_column_stats('', '
', '', col_stat_type=>'HISTOGRAM');

而要彻底删除整个列的统计信息,则需要调用delete_column_stats过程

SQL> exec dbms_stats.delete_column_stats(user, 'TEST', 'I');

PL/SQL procedure successfully completed.

SQL> select column_name,num_distinct,num_buckets,low_value,high_value,density,avg_col_len,histogram
  2  from user_tab_columns where table_name='TEST';

COLUMN_NAM NUM_DISTINCT NUM_BUCKETS LOW_VALUE  HIGH_VALUE    DENSITY  AVG_COL_LEN HISTOGRAM
---------- ------------ ----------- ---------- ---------- ----------  ----------- ---------------
I                                                                                 NONE
A                                                                                 NONE

可以在同一个过程中收集多个列的统计信息,并且可以为不同的列指定不同的bucket个数:

SQL> exec dbms_stats.gather_table_stats(user, 'TEST',
cascade=>false, method_opt => 'for columns size 1 T for columns size 2 A');

PL/SQL procedure successfully completed.

SQL> select column_name,num_distinct,num_buckets,low_value,high_value,density,avg_col_len,histogram
  2  from user_tab_columns where table_name='TEST';

COLUMN_NAM NUM_DISTINCT NUM_BUCKETS LOW_VALUE  HIGH_VALUE              DENSITY AVG_COL_LEN HISTOGRAM
---------- ------------ ----------- ---------- -------------------- ---------- ----------- ---------------
I                  9907           1 C102       C26408               .000100939           4 NONE
A                  7376           2 41         5F75746C245F6C6E635F .000185239          18 HEIGHT BALANCED
                                               696E645F7061727473

值得注意的是,从上一篇文章我们知道,9i的dbms_stats中,method_opt的默认值是FOR ALL COLUMNS SIZE 1,也就是收集列的基本统计信息而不收集柱状图信息,而10g的默认值则变成了FOR ALL COLUMNS SIZE AUTO,则Oracle在收集列的基本信息之外,还会根据情况收集某些列的柱状图。