[20170615]Histograms - height-balanced histograms (11g).txt

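--//Yesterday I read some material on histograms and re-read Jonathan Lewis's <CBO> book. While testing, I ran into
--//some results that differ from what the book describes, so I am repeating the tests myself.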

1. Environment and test setup:
SCOTT@book> @ &r/ver1
PORT_STRING                    VERSION        BANNER
------------------------------ -------------- --------------------------------------------------------------------------------
x86_64/Linux 2.4.xx            11.2.0.4.0     Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production

SCOTT@book> define m_demo_size=80
SCOTT@book> drop table t1 purge ;
Table dropped.

create table t1 (
    skew        not null,   
    padding
)
as
with generator as (
    select    --+ materialize
        rownum     id
    from    all_objects
    where    rownum <= 5000
)
select
    /*+ ordered use_nl(v2) */
    v1.id,
    rpad('x',400)
from
    generator    v1,
    generator    v2
where
    v1.id <= &m_demo_size
and    v2.id <= &m_demo_size
and    v2.id <= v1.id
order by
    v2.id,v1.id
;

create index t1_i1 on t1(skew);

begin
    dbms_stats.gather_table_stats(
        user,
        't1',
        cascade => true,
        estimate_percent => null,
        method_opt => 'for all columns size 75'
    );
end;
/

select
    num_distinct, density, num_Buckets,histogram,sample_size
from
    user_tab_columns
where
    table_name = 'T1'
and    column_name = 'SKEW'
;

NUM_DISTINCT    DENSITY NUM_BUCKETS HISTOGRAM       SAMPLE_SIZE
------------ ---------- ----------- --------------- -----------
          80 .013973812          75 HEIGHT BALANCED        3240

select
    endpoint_number, endpoint_value
from
    user_tab_histograms
where
    column_name = 'SKEW'
and    table_name = 'T1'
order by
    endpoint_number
;

ENDPOINT_NUMBER ENDPOINT_VALUE
--------------- --------------
              0              1
              1              9
              2             13
              3             16
              4             19
              5             21
              6             23
              7             25
              8             26
              9             28
             10             29
             11             31
             12             32
             13             33
             14             35
             15             36
             16             37
             17             38
             18             39
             19             40
             20             41
             21             42
             22             43
             23             44
             24             45
             25             46
             26             47
             27             48
             28             49
             29             50
             30             51
             32             52
             33             53
             34             54
             35             55
             37             56
             38             57
             39             58
             41             59
             42             60
             43             61
             45             62
             46             63
             48             64
             49             65
             51             66
             52             67
             54             68
             56             69
             57             70
             59             71
             60             72
             62             73
             64             74
             66             75
             67             76
             69             77
             71             78
             73             79
             75             80
60 rows selected.

prompt    equality on a popular value - uses bucket counts

select
    count(*)
from    t1
where    skew = 77
;
SCOTT@book> @ &r/dpc '' ''
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID  ftrt6fax5mrgr, child number 0
-------------------------------------
select     count(*) from    t1 where    skew = 77
Plan hash value: 2432955788
----------------------------------------------------------------------------
| Id  | Operation         | Name  | E-Rows |E-Bytes| Cost (%CPU)| E-Time   |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |       |        |       |     1 (100)|          |
|   1 |  SORT AGGREGATE   |       |      1 |     3 |            |          |
|*  2 |   INDEX RANGE SCAN| T1_I1 |     86 |   258 |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------
Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
   1 - SEL$1
   2 - SEL$1 / T1@SEL$1
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("SKEW"=77)

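--//skew=77 is a popular value: it spans 2 buckets (endpoint_number goes from 67 to 69). 2/75*3240=86.4, which matches E-Rows=86.
--//So for a popular value the estimate is (buckets spanned by the value / total buckets) * num_rows.
--//Note that 3240 is the row count of the table (equal to sample_size here because estimate_percent is null), not NDV.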
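
--//The popular-value arithmetic can be cross-checked against the dictionary. The query below is only a sketch: it
--//assumes the T1 table and SKEW histogram built above, treats a value as popular when its endpoint_number is more
--//than 1 above the previous one, and the aliases buckets_spanned and est_rows are my own names.

```sql
-- List popular values of T1.SKEW with the cardinality the bucket-count rule predicts.
select
    endpoint_value,
    buckets_spanned,
    round(buckets_spanned / num_buckets * num_rows, 1)    est_rows
from (
    select
        h.endpoint_value,
        -- buckets covered by this value = gap to the previous endpoint_number
        h.endpoint_number
          - lag(h.endpoint_number, 1, 0) over (order by h.endpoint_number)    buckets_spanned,
        c.num_buckets,
        t.num_rows
    from
        user_tab_histograms    h,
        user_tab_columns       c,
        user_tables            t
    where
        h.table_name  = 'T1'
    and h.column_name = 'SKEW'
    and c.table_name  = h.table_name
    and c.column_name = h.column_name
    and t.table_name  = h.table_name
)
where
    buckets_spanned > 1        -- keep only popular values
order by
    endpoint_value
;
```

--//For the histogram listed above, skew=77 should appear with buckets_spanned=2 and est_rows of about 86.4.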

prompt    equality on a non-popular value - uses density

select
    count(*)
from    t1
where    skew = 72
;

SCOTT@book> @ &r/dpc '' ''
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID  2u51xnc3hnfcf, child number 0
-------------------------------------
select     count(*) from    t1 where    skew = 72

Plan hash value: 2432955788

----------------------------------------------------------------------------
| Id  | Operation         | Name  | E-Rows |E-Bytes| Cost (%CPU)| E-Time   |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |       |        |       |     1 (100)|          |
|   1 |  SORT AGGREGATE   |       |      1 |     3 |            |          |
|*  2 |   INDEX RANGE SCAN| T1_I1 |     29 |    87 |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------

   1 - SEL$1
   2 - SEL$1 / T1@SEL$1

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("SKEW"=72)

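--//For a non-popular value the bucket formula above does not apply; the estimate should be num_rows*DENSITY instead.
--//3240*.013973812=45.27515088 ?? That does not match E-Rows=29, which suggests Oracle has changed the algorithm. Generate a 10053 trace to investigate.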

SCOTT@book> @ &r/10053x 2u51xnc3hnfcf 0
PL/SQL procedure successfully completed.
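
--//The contents of the 10053x script are not shown in this note. As a rough alternative, a 10053 trace can also be
--//produced with the classic session event. This is a sketch under assumptions: the session may use ALTER SESSION
--//and query v$diag_info, and the comment text trace_10053_demo is just a made-up marker to force a hard parse.

```sql
-- Turn on optimizer (10053) tracing for this session.
alter session set events '10053 trace name context forever, level 1';

-- The extra comment changes the SQL text, so the statement is hard parsed
-- and the optimizer's calculations are written to the trace file.
select /* trace_10053_demo */
    count(*)
from    t1
where    skew = 72
;

-- Turn tracing off again.
alter session set events '10053 trace name context off';

-- Show where this session's trace file was written (11g and later).
select value from v$diag_info where name = 'Default Trace File';
```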

--//Check the trace file:
***************************************
SINGLE TABLE ACCESS PATH
  Single Table Cardinality Estimation for T1[T1]
  Column (#1):
    NewDensity:0.008958, OldDensity:0.013974 BktCnt:75, PopBktCnt:32, PopValCnt:16, NDV:80
  Column (#1): SKEW(
    AvgLen: 3 NDV: 80 Nulls: 0 Density: 0.008958 Min: 1 Max: 80
    Histogram: HtBal  #Bkts: 75  UncompBkts: 75  EndPtVals: 60
  Table: T1  Alias: T1
    Card: Original: 3240.000000  Rounded: 29  Computed: 29.03  Non Adjusted: 29.03
  Access Path: TableScan
    Cost:  57.06  Resp: 57.06  Degree: 0
      Cost_io: 57.00  Cost_cpu: 2093652
      Resp_io: 57.00  Resp_cpu: 2093652
  Access Path: index (index (FFS))
    Index: T1_I1
    resc_io: 4.00  resc_cpu: 600650
    ix_sel: 0.000000  ix_sel_with_filters: 1.000000
  Access Path: index (FFS)
    Cost:  4.02  Resp: 4.02  Degree: 1
      Cost_io: 4.00  Cost_cpu: 600650
      Resp_io: 4.00  Resp_cpu: 600650
  Access Path: index (AllEqRange)
    Index: T1_I1
    resc_io: 1.00  resc_cpu: 13971
    ix_sel: 0.008958  ix_sel_with_filters: 0.008958
    Cost: 1.00  Resp: 1.00  Degree: 1
  Best:: AccessPath: IndexRange
  Index: T1_I1
         Cost: 1.00  Degree: 1  Resp: 1.00  Card: 29.03  Bytes: 0
    check parallelism for statement[<unnamed>]
kkfdtParallel: parallel is possible (no statement type restrictions)
    kkfdPaForcePrm: dop:1 ()
kkfdPaPrm: use dictionary DOP(1) on table
kkfdPaPrm:- The table : 90429
kkfdPaPrm:DOP = 1 (computed from hint/dictionary/autodop)
kkfdiPaPrm: dop:1 serial(?)
***************************************

--//Found the following link:
http://www.adellera.it/blog/2009/10/16/cbo-newdensity-replaces-density-in-11g-10204-densities-part-iii/

NewDensity is not stored anywhere in the data dictionary, but it is computed at query optimization time by the CBO (note
that density is still computed by dbms_stats using the old formula, but then it is ignored by the CBO). The NewDensity formula
is based mainly on some histogram-derived figures; using the same names found in 10053 traces:

NewDensity = [(BktCnt - PopBktCnt) / BktCnt] / (NDV - PopValCnt)
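--//Computing with this formula:
(75-32)/75/(80-16) = .00895833333333333333
--//This agrees with NewDensity:0.008958 in the trace, and 3240*.008958333=29.025 agrees with "Computed: 29.03" and E-Rows=29.
--//I searched the whole trace file and could not find the formula itself. Oracle presumably does not publish it; formulas like this are derived from statistical reasoning.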
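
--//The same figures can be rebuilt from the dictionary instead of being copied from the trace. The query below is a
--//sketch, assuming the T1/SKEW histogram above and the NewDensity formula quoted from the linked article; the
--//aliases pop_bkt_cnt, pop_val_cnt, new_density, est_rows_new and est_rows_old are my own names. It divides by
--//(num_distinct - pop_val_cnt), so it would fail if every distinct value were popular.

```sql
with h as (
    -- buckets spanned by each endpoint value
    select
        endpoint_value,
        endpoint_number
          - lag(endpoint_number, 1, 0) over (order by endpoint_number)    span
    from
        user_tab_histograms
    where
        table_name  = 'T1'
    and column_name = 'SKEW'
),
p as (
    -- popular values are those spanning more than one bucket
    select
        nvl(sum(case when span > 1 then span end), 0)    pop_bkt_cnt,
        count(case when span > 1 then 1 end)             pop_val_cnt
    from
        h
)
select
    p.pop_bkt_cnt,
    p.pop_val_cnt,
    round((c.num_buckets - p.pop_bkt_cnt) / c.num_buckets
          / (c.num_distinct - p.pop_val_cnt), 6)             new_density,
    round(t.num_rows * (c.num_buckets - p.pop_bkt_cnt) / c.num_buckets
          / (c.num_distinct - p.pop_val_cnt), 2)             est_rows_new,
    round(t.num_rows * c.density, 2)                         est_rows_old
from
    p,
    user_tab_columns    c,
    user_tables         t
where
    c.table_name  = 'T1'
and c.column_name = 'SKEW'
and t.table_name  = 'T1'
;
```

--//If the formula holds, pop_bkt_cnt and pop_val_cnt should match PopBktCnt:32 and PopValCnt:16 in the trace,
--//new_density should be close to 0.008958, and est_rows_new should be close to the optimizer's 29, while
--//est_rows_old reproduces the 45 obtained from the stored DENSITY.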
