[20130327]dbms_stats.seed_col_usage的使用.txt

[20130327]dbms_stats.seed_col_usage的使用.txt

http://blogs.oracle.com/optimizer/entry/how_do_i_know_what_extended_statistics_are_needed_for_a_given_workload
http://www.itpub.net/thread-1498587-1-1.html

oracle 11.2.0.2介绍了Auto Column Group Creation,自动确立在那些字段上建立extended statistics。我以前在11.2.0.1下使用。
它也有这个命令,执行出现如下错误:

执行如下,以sys用户,出现:
SQL> exec DBMS_STATS.SEED_COL_USAGE (NULL,NULL,300);
BEGIN DBMS_STATS.SEED_COL_USAGE (NULL,NULL,300); END;

*
ERROR at line 1:
ORA-13754: "SQL Tuning Set" "" does not exist for user "SYS".
ORA-06512: at "SYS.DBMS_SQLTUNE_INTERNAL", line 17379
ORA-06512: at "SYS.DBMS_STATS", line 28451
ORA-06512: at line 1

前一阵子,终于将测试环境升级到了11.2.0.3,自己重复这个测试:

1.建立测试环境:

SQL> @ver
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
create table t1 as select rownum id1 ,mod(rownum,2) id2,'test' name from dual connect by level
insert into t1 select rownum+10000 ,3 id2,decode(mod(rownum,2),1,'mmmm',0,'test') from dual connect by level
commit ;
SQL> select id2,name ,count(*) from t1 group by id2,name ;
       ID2 NAME                   COUNT(*)
---------- -------------------- ----------
         3 mmmm                         50
         1 test                       5000
         0 test                       5000
         3 test                         50
SQL> exec dbms_stats.gather_table_stats(user,'t1',method_opt => 'for all columns size 1');
PL/SQL procedure successfully completed.
SQL> Select * from t1 where id2=3 and name='mmmm';
SQL> @dpc '' ''
PLAN_TABLE_OUTPUT
-------------------------------------------------------------
SQL_ID  f6x401mx48a43, child number 0
-------------------------------------
Select * from t1 where id2=3 and name='mmmm'
Plan hash value: 3617692013
--------------------------------------------------------
| Id  | Operation         | Name | E-Rows | Cost (%CPU)|
--------------------------------------------------------
|   0 | SELECT STATEMENT  |      |        |    10 (100)|
|*  1 |  TABLE ACCESS FULL| T1   |   1683 |    10   (0)|
--------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(("ID2"=3 AND "NAME"='mmmm'))

--可以发现 E-Rows=1683与实际的50相差很远。

再次执行:

PLAN_TABLE_OUTPUT
-----------------------------------------------------------
SQL_ID  f6x401mx48a43, child number 1
-------------------------------------
Select * from t1 where id2=3 and name='mmmm'
Plan hash value: 3617692013
--------------------------------------------------------
| Id  | Operation         | Name | E-Rows | Cost (%CPU)|
--------------------------------------------------------
|   0 | SELECT STATEMENT  |      |        |    10 (100)|
|*  1 |  TABLE ACCESS FULL| T1   |     50 |    10   (0)|
--------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(("ID2"=3 AND "NAME"='mmmm'))
Note
-----
   - cardinality feedback used for this statement
   - Warning: basic plan statistics not available. These are only collected when:
       * hint 'gather_plan_statistics' is used for the statement or
       * parameter 'statistics_level' is set to 'ALL', at session or system level
25 rows selected.

--可以发现 E-Rows 正确,主要是使用了11G的新特性cardinality feedback。
 
2.使用dbms_stats.seed_col_usage报:

SQL>  exec dbms_stats.seed_col_usage(null,'SCOTT',100);
BEGIN dbms_stats.seed_col_usage(null,'SCOTT',100); END;
*
ERROR at line 1:
ORA-13754: "SQL Tuning Set" "" does not exist for user "SCOTT".
ORA-06512: at "SYS.DBMS_SQLTUNE_INTERNAL", line 17617
ORA-06512: at "SYS.DBMS_STATS", line 33396
ORA-06512: at line 1
SQL>  exec dbms_stats.seed_col_usage(null,null,100);
PL/SQL procedure successfully completed.

--这样正常!不知道为什么?
--执行以上语句多次:

SQL> Select * from t1 where id2=3 and name='mmmm';
SQL> select dbms_stats.report_col_usage(user,'t1') from dual ;
DBMS_STATS.REPORT_COL_USAGE(USER,'T1')
-----------------------------------------------------------------------------------
LEGEND:
.......
EQ         : Used in single table EQuality predicate
RANGE      : Used in single table RANGE predicate
LIKE       : Used in single table LIKE predicate
NULL       : Used in single table is (not) NULL predicate
EQ_JOIN    : Used in EQuality JOIN predicate
NONEQ_JOIN : Used in NON EQuality JOIN predicate
FILTER     : Used in single table FILTER predicate
JOIN       : Used in JOIN predicate
GROUP_BY   : Used in GROUP BY expression
...............................................................................
###############################################################################
COLUMN USAGE REPORT FOR SCOTT.T1
................................
1. ID2                                 : EQ
2. NAME                                : EQ
3. (ID2, NAME)                         : FILTER
###############################################################################

--根据提示可以在可以在ID2,name上建立column groups。

3.建立column groups:

http://blogs.oracle.com/optimizer/entry/how_do_i_know_what_extended_statistics_are_needed_for_a_given_workload

    At this point you can get Oracle to automatically create the column groups for each of the tables based on the usage
information captured during the monitoring window. You simply have to call the DBMS_STATS.CREATE_EXTENDED_STATS function
for each table.This function requires just two arguments, the schema name and the table name. From then on, statistics
will be maintained for each column group whenever statistics are gathered on the table. In this example you will see two
column groups were created based on the information captured from the two queries in this workload.

SQL> select  dbms_stats.create_extended_stats(user,'T1') from dual;
DBMS_STATS.CREATE_EXTENDED_STATS(USER,'T1')
---------------------------------------------------------------------------------
###############################################################################
EXTENSIONS FOR SCOTT.T1
.......................
1. (ID2, NAME)                         : SYS_STUT3NILYU40ICEIZ7TLWV#HOP created
###############################################################################
SQL> exec dbms_stats.gather_table_stats(user,'t1',method_opt => 'for all columns size 1');
PL/SQL procedure successfully completed.
SQL> SELECT column_name, data_type, num_distinct, low_value, high_value,HISTOGRAM FROM dba_tab_cols WHERE wner = USER AND table_name = 'T1';
COLUMN_NAME          DATA_TYPE  NUM_DISTINCT LOW_VALUE  HIGH_VALUE HISTOGRAM
-------------------- ---------- ------------ ---------- ---------- ---------------
ID1                  NUMBER            10100 C102       C30202     NONE
ID2                  NUMBER                3 80         C104       NONE
NAME                 CHAR                  2 6D6D6D6D   74657374   NONE
SYS_STUT3NILYU40ICEI NUMBER                4 CA03165324 CA11624601 NONE
Z7TLWV#HOP                                   5E300A2717 5213202344
                                             5E         36
4.再次执行:
SQL> Select * from t1 where id2=3 and name='mmmm';
PLAN_TABLE_OUTPUT
------------------------------------------------------------
SQL_ID  f6x401mx48a43, child number 0
-------------------------------------
Select * from t1 where id2=3 and name='mmmm'
Plan hash value: 3617692013
--------------------------------------------------------
| Id  | Operation         | Name | E-Rows | Cost (%CPU)|
--------------------------------------------------------
|   0 | SELECT STATEMENT  |      |        |    10 (100)|
|*  1 |  TABLE ACCESS FULL| T1   |   2525 |    10   (0)|
--------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(("ID2"=3 AND "NAME"='mmmm'))

--E-Rows行还是相差太远。

--在分组上建立直方图看看。

SQL> exec dbms_stats.gather_table_stats(user,'t1',method_opt => 'for columns size 1 for columns SYS_STUT3NILYU40ICEIZ7TLWV#HOP size 254');
PL/SQL procedure successfully completed.
SQL> alter system flush shared_pool;
SQL> Select * from t1 where id2=3 and name='mmmm';
PLAN_TABLE_OUTPUT
------------------------------------------------------------
SQL_ID  f6x401mx48a43, child number 0
-------------------------------------
Select * from t1 where id2=3 and name='mmmm'
Plan hash value: 3617692013
--------------------------------------------------------
| Id  | Operation         | Name | E-Rows | Cost (%CPU)|
--------------------------------------------------------
|   0 | SELECT STATEMENT  |      |        |    10 (100)|
|*  1 |  TABLE ACCESS FULL| T1   |     50 |    10   (0)|
--------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(("ID2"=3 AND "NAME"='mmmm'))

--可以发现现在 E-Rows =50,估计正确。

SQL> Select * from t1 where id2=3 and name='test';
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID  2rh118hps0wfj, child number 0
-------------------------------------
Select * from t1 where id2=3 and name='test'
Plan hash value: 3617692013
--------------------------------------------------------
| Id  | Operation         | Name | E-Rows | Cost (%CPU)|
--------------------------------------------------------
|   0 | SELECT STATEMENT  |      |        |    10 (100)|
|*  1 |  TABLE ACCESS FULL| T1   |     50 |    10   (0)|
--------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(("ID2"=3 AND "NAME"='test'))

--可以发现现在 E-Rows =50,估计正确。

BTW:这样的分组仅仅对等值判断正确,对于范围扫描依旧不行。

PLAN_TABLE_OUTPUT
------------------------------------------------------------
SQL_ID  17jqnvxzm8bh9, child number 0
-------------------------------------
Select * from t1 where id2 between 2 and 3  and name='test'
Plan hash value: 3617692013
--------------------------------------------------------
| Id  | Operation         | Name | E-Rows | Cost (%CPU)|
--------------------------------------------------------
|   0 | SELECT STATEMENT  |      |        |    10 (100)|
|*  1 |  TABLE ACCESS FULL| T1   |     99 |    10   (0)|
--------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(("ID2">=2 AND "NAME"='test' AND "ID2"
时间: 2024-09-11 00:17:20

[20130327]dbms_stats.seed_col_usage的使用.txt的相关文章

[20130328]dbms_stats.seed_col_usage的使用2.txt

[20130328]dbms_stats.seed_col_usage的使用2.txt 对昨天dbms_stats.seed_col_usage的使用做一些补充. SQL> exec dbms_stats.reset_col_usage(NULL,NULL) PL/SQL procedure successfully completed. -- This procedure deletes the recorded column (group) usage information -- from

[20130528]dbms_stats.gather_table_stats的method_opt.txt

[20130528]dbms_stats.gather_table_stats的method_opt.txt 同事想删除某个表的直方图信息,结果老是删除不掉.问我,我看了一下,马上明白使用参数错误,他使用的是method_opt=>'for columns size 1'.缺少了all参数.应该写成method_opt=>'for all columns size 1'. 实际上这个问题我以前做一些测试的时候我就发现,一直没有好好的总结.找个机会总结一下. 1.测试环境: SQL> @v

[20130502]dbms_stats缺省参数.txt

[20130502]dbms_stats缺省参数.txt 今天抽空看了dbms_stats缺省参数,我的测试环境是11G. column cascade format a30column degree format a10column estimate_percent format a30column method_opt format a30column no_invalidate format a30column granularity format a10column publish fo

[20130529] dbms_stats.report_col_usage.txt

[20130529] dbms_stats.report_col_usage.txt 11G下dbms_stats包增加了report_col_usage函数,我曾经写过两篇blog. http://space.itpub.net/267265/viewspace-757319http://space.itpub.net/267265/viewspace-757320 要收集exented statistics,先要执行,exec dbms_stats.seed_col_usage(null,n

深入并行:从并行加载到12c Adaptive特性深度理解Oracle并行

陈焕生 Oracle Real-World Performance Group 成员,senior performance engineer,专注于 OLTP.OLAP 系统 在 Exadata 平台和 In-Memory 特性上的最佳实践.个人博客 http://dbsid.com . 编辑手记:感谢陈焕生的精品文章,这篇文章首发在ACOUG,在此转载分享给大家,Sidney撰写这个系列的文章时间跨度也有两年,下篇刚刚出炉. 上篇分为两篇文章: 深入并行:从生产者到消费者模型深度理解Oracl

Oracle 12C优化器的巨大变化,上生产必读(下)

在Oracle 12c数据库中,随着新的查询优化自适应方法的引入,还有对可用的统计信息的强化,优化器实现了一个巨大的飞跃.今天就让我们继续这个话题,一起来揭晓Oracle 12C其他的强大功能吧. 16 全局临时表上的会话级统计信息  全局临时表通常用于存储应用程序上下文中的中间结果.一个全局临时表的定义,是全系统中拥有适当权限的所有用户所共享的,但其数据永远是会话私有的.在全局临时表(必须是会话级临时表,PRESERVE ROWS ON COMMIT,提交时保留数据)上收集统计信息是行得通的:

Oracle多列统计信息

Oracle多列统计信息    通常,当我们将SQL语句提交给Oracle数据库时,Oracle会选择一种最优方式来执行,这是通过查询优化器Query Optimizer来实现的.CBO(Cost-Based Optimizer)是Oracle默认使用的查询优化器模式.在CBO中,SQL执行计划的生成,是以一种寻找成本(Cost)最优为目标导向的执行计划探索过程.所谓成本(Cost)就是将CPU和IO消耗整合起来的量化指标,每一个执行计划的成本就是经过优化器内部公式估算出的数字值.      

[20150228]DBMS_STATS Tracing.txt

[20150228]DBMS_STATS Tracing.txt --这个是很久的链接,可以跟踪dbms_stats的操作过程,自己测试看看. http://www.pythian.com/blog/options-for-tracing-oracle-dbms_stats/ Tracing is enabled by calling dbms_stats.set_global_prefs('trace',) Following are the possible values for the t

[20130924]12c dbms_stats包的一些缺省参数.txt

[20130924]12c dbms_stats包的一些缺省参数.txt 11G下: SQL> @ver BANNER -------------------------------------------------------------------------------- Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production SELECT DBMS_STATS.get_param ('AU