[20130327]dbms_stats.seed_col_usage的使用.txt
http://blogs.oracle.com/optimizer/entry/how_do_i_know_what_extended_statistics_are_needed_for_a_given_workload
http://www.itpub.net/thread-1498587-1-1.html
oracle 11.2.0.2介绍了Auto Column Group Creation,自动确立在那些字段上建立extended statistics。我以前在11.2.0.1下使用。
它也有这个命令,执行出现如下错误:
执行如下,以sys用户,出现:
SQL> exec DBMS_STATS.SEED_COL_USAGE (NULL,NULL,300);
BEGIN DBMS_STATS.SEED_COL_USAGE (NULL,NULL,300); END;
*
ERROR at line 1:
ORA-13754: "SQL Tuning Set" "" does not exist for user "SYS".
ORA-06512: at "SYS.DBMS_SQLTUNE_INTERNAL", line 17379
ORA-06512: at "SYS.DBMS_STATS", line 28451
ORA-06512: at line 1
前一阵子,终于将测试环境升级到了11.2.0.3,自己重复这个测试:
1.建立测试环境:
SQL> @ver
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
create table t1 as select rownum id1 ,mod(rownum,2) id2,'test' name from dual connect by level
insert into t1 select rownum+10000 ,3 id2,decode(mod(rownum,2),1,'mmmm',0,'test') from dual connect by level
commit ;
SQL> select id2,name ,count(*) from t1 group by id2,name ;
ID2 NAME COUNT(*)
---------- -------------------- ----------
3 mmmm 50
1 test 5000
0 test 5000
3 test 50
SQL> exec dbms_stats.gather_table_stats(user,'t1',method_opt => 'for all columns size 1');
PL/SQL procedure successfully completed.
SQL> Select * from t1 where id2=3 and name='mmmm';
SQL> @dpc '' ''
PLAN_TABLE_OUTPUT
-------------------------------------------------------------
SQL_ID f6x401mx48a43, child number 0
-------------------------------------
Select * from t1 where id2=3 and name='mmmm'
Plan hash value: 3617692013
--------------------------------------------------------
| Id | Operation | Name | E-Rows | Cost (%CPU)|
--------------------------------------------------------
| 0 | SELECT STATEMENT | | | 10 (100)|
|* 1 | TABLE ACCESS FULL| T1 | 1683 | 10 (0)|
--------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(("ID2"=3 AND "NAME"='mmmm'))
--可以发现 E-Rows=1683与实际的50相差很远。
再次执行:
PLAN_TABLE_OUTPUT
-----------------------------------------------------------
SQL_ID f6x401mx48a43, child number 1
-------------------------------------
Select * from t1 where id2=3 and name='mmmm'
Plan hash value: 3617692013
--------------------------------------------------------
| Id | Operation | Name | E-Rows | Cost (%CPU)|
--------------------------------------------------------
| 0 | SELECT STATEMENT | | | 10 (100)|
|* 1 | TABLE ACCESS FULL| T1 | 50 | 10 (0)|
--------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(("ID2"=3 AND "NAME"='mmmm'))
Note
-----
- cardinality feedback used for this statement
- Warning: basic plan statistics not available. These are only collected when:
* hint 'gather_plan_statistics' is used for the statement or
* parameter 'statistics_level' is set to 'ALL', at session or system level
25 rows selected.
--可以发现 E-Rows 正确,主要是使用了11G的新特性cardinality feedback。
2.使用dbms_stats.seed_col_usage报:
SQL> exec dbms_stats.seed_col_usage(null,'SCOTT',100);
BEGIN dbms_stats.seed_col_usage(null,'SCOTT',100); END;
*
ERROR at line 1:
ORA-13754: "SQL Tuning Set" "" does not exist for user "SCOTT".
ORA-06512: at "SYS.DBMS_SQLTUNE_INTERNAL", line 17617
ORA-06512: at "SYS.DBMS_STATS", line 33396
ORA-06512: at line 1
SQL> exec dbms_stats.seed_col_usage(null,null,100);
PL/SQL procedure successfully completed.
--这样正常!不知道为什么?
--执行以上语句多次:
SQL> Select * from t1 where id2=3 and name='mmmm';
SQL> select dbms_stats.report_col_usage(user,'t1') from dual ;
DBMS_STATS.REPORT_COL_USAGE(USER,'T1')
-----------------------------------------------------------------------------------
LEGEND:
.......
EQ : Used in single table EQuality predicate
RANGE : Used in single table RANGE predicate
LIKE : Used in single table LIKE predicate
NULL : Used in single table is (not) NULL predicate
EQ_JOIN : Used in EQuality JOIN predicate
NONEQ_JOIN : Used in NON EQuality JOIN predicate
FILTER : Used in single table FILTER predicate
JOIN : Used in JOIN predicate
GROUP_BY : Used in GROUP BY expression
...............................................................................
###############################################################################
COLUMN USAGE REPORT FOR SCOTT.T1
................................
1. ID2 : EQ
2. NAME : EQ
3. (ID2, NAME) : FILTER
###############################################################################
--根据提示可以在可以在ID2,name上建立column groups。
3.建立column groups:
http://blogs.oracle.com/optimizer/entry/how_do_i_know_what_extended_statistics_are_needed_for_a_given_workload
At this point you can get Oracle to automatically create the column groups for each of the tables based on the usage
information captured during the monitoring window. You simply have to call the DBMS_STATS.CREATE_EXTENDED_STATS function
for each table.This function requires just two arguments, the schema name and the table name. From then on, statistics
will be maintained for each column group whenever statistics are gathered on the table. In this example you will see two
column groups were created based on the information captured from the two queries in this workload.
SQL> select dbms_stats.create_extended_stats(user,'T1') from dual;
DBMS_STATS.CREATE_EXTENDED_STATS(USER,'T1')
---------------------------------------------------------------------------------
###############################################################################
EXTENSIONS FOR SCOTT.T1
.......................
1. (ID2, NAME) : SYS_STUT3NILYU40ICEIZ7TLWV#HOP created
###############################################################################
SQL> exec dbms_stats.gather_table_stats(user,'t1',method_opt => 'for all columns size 1');
PL/SQL procedure successfully completed.
SQL> SELECT column_name, data_type, num_distinct, low_value, high_value,HISTOGRAM FROM dba_tab_cols WHERE wner = USER AND table_name = 'T1';
COLUMN_NAME DATA_TYPE NUM_DISTINCT LOW_VALUE HIGH_VALUE HISTOGRAM
-------------------- ---------- ------------ ---------- ---------- ---------------
ID1 NUMBER 10100 C102 C30202 NONE
ID2 NUMBER 3 80 C104 NONE
NAME CHAR 2 6D6D6D6D 74657374 NONE
SYS_STUT3NILYU40ICEI NUMBER 4 CA03165324 CA11624601 NONE
Z7TLWV#HOP 5E300A2717 5213202344
5E 36
4.再次执行:
SQL> Select * from t1 where id2=3 and name='mmmm';
PLAN_TABLE_OUTPUT
------------------------------------------------------------
SQL_ID f6x401mx48a43, child number 0
-------------------------------------
Select * from t1 where id2=3 and name='mmmm'
Plan hash value: 3617692013
--------------------------------------------------------
| Id | Operation | Name | E-Rows | Cost (%CPU)|
--------------------------------------------------------
| 0 | SELECT STATEMENT | | | 10 (100)|
|* 1 | TABLE ACCESS FULL| T1 | 2525 | 10 (0)|
--------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(("ID2"=3 AND "NAME"='mmmm'))
--E-Rows行还是相差太远。
--在分组上建立直方图看看。
SQL> exec dbms_stats.gather_table_stats(user,'t1',method_opt => 'for columns size 1 for columns SYS_STUT3NILYU40ICEIZ7TLWV#HOP size 254');
PL/SQL procedure successfully completed.
SQL> alter system flush shared_pool;
SQL> Select * from t1 where id2=3 and name='mmmm';
PLAN_TABLE_OUTPUT
------------------------------------------------------------
SQL_ID f6x401mx48a43, child number 0
-------------------------------------
Select * from t1 where id2=3 and name='mmmm'
Plan hash value: 3617692013
--------------------------------------------------------
| Id | Operation | Name | E-Rows | Cost (%CPU)|
--------------------------------------------------------
| 0 | SELECT STATEMENT | | | 10 (100)|
|* 1 | TABLE ACCESS FULL| T1 | 50 | 10 (0)|
--------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(("ID2"=3 AND "NAME"='mmmm'))
--可以发现现在 E-Rows =50,估计正确。
SQL> Select * from t1 where id2=3 and name='test';
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID 2rh118hps0wfj, child number 0
-------------------------------------
Select * from t1 where id2=3 and name='test'
Plan hash value: 3617692013
--------------------------------------------------------
| Id | Operation | Name | E-Rows | Cost (%CPU)|
--------------------------------------------------------
| 0 | SELECT STATEMENT | | | 10 (100)|
|* 1 | TABLE ACCESS FULL| T1 | 50 | 10 (0)|
--------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(("ID2"=3 AND "NAME"='test'))
--可以发现现在 E-Rows =50,估计正确。
BTW:这样的分组仅仅对等值判断正确,对于范围扫描依旧不行。
PLAN_TABLE_OUTPUT
------------------------------------------------------------
SQL_ID 17jqnvxzm8bh9, child number 0
-------------------------------------
Select * from t1 where id2 between 2 and 3 and name='test'
Plan hash value: 3617692013
--------------------------------------------------------
| Id | Operation | Name | E-Rows | Cost (%CPU)|
--------------------------------------------------------
| 0 | SELECT STATEMENT | | | 10 (100)|
|* 1 | TABLE ACCESS FULL| T1 | 99 | 10 (0)|
--------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(("ID2">=2 AND "NAME"='test' AND "ID2"