[20140718]12c执行计划新特性- Partial Join Evaluation (PJE).txt
--以前经常一些blogl讲什么使用in还是exists好的相关讨论,实际上11g以上的版本查询转换多数情况下会选择好的执行计划,只要建立好
--对应的约束,索引建立好,oracle多会选择好的执行计划.
--12c在这些基础上引入了Partial Join Evaluation (PJE),能够进一步减少逻辑读,还是通过例子来说明问题:
SCOTT@test01p> @ver
BANNER CON_ID
-------------------------------------------------------------------------------- ----------
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production 0
create table emp1 as select * from emp;
alter table EMP1 modify empno number(10);
insert into EMP1(empno,deptno) select rownum+10000,40 from EMP,(select * from dual connect by level execute dbms_stats.gather_table_stats(user,'emp1',cascade=>true,method_opt=>'for all columns size 1',no_invalidate=>false);
--例子带的emp表没有deptno=40的雇员,增加deptno=40的雇员是更好的说明Partial Join Evaluation (PJE).
SCOTT@test01p> select owner,table_name,blocks from dba_tables where owner=user and table_name='EMP2';
OWNER TABLE_NAME BLOCKS
------ ---------- ----------
SCOTT EMP2 46
--emp2的blocks=46.
SCOTT@test01p> select /*+full(dept) */deptno from dept;
DEPTNO
----------
10
20
30
40
--dept表的depno仅仅有10,20,30,40.
SCOTT@test01p> alter session set statistics_level=all;
Session altered.
SCOTT@test01p> select /*+full(dept) */ deptno,dname from dept where deptno in ( select deptno from emp1);
DEPTNO DNAME
---------- --------------
20 RESEARCH
30 SALES
10 ACCOUNTING
40 OPERATIONS
--注:我不加提示执行计划使用MERGE JOIN SEMI,而不是hash join semi.我加了这个提示.
SCOTT@test01p> @dpc '' ''
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID 0u9dzjua9uc24, child number 0
-------------------------------------
select /*+full(dept) */ deptno,dname from dept where deptno in (select deptno from emp1)
Plan hash value: 1309553802
-----------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 17 (100)| 4 |00:00:00.01 | 13 | | | |
|* 1 | HASH JOIN SEMI | | 1 | 4 | 17 (0)| 4 |00:00:00.01 | 13 | 1599K| 1599K| 1010K (0)|
| 2 | TABLE ACCESS FULL| DEPT | 1 | 4 | 3 (0)| 4 |00:00:00.01 | 7 | | | |
| 3 | TABLE ACCESS FULL| EMP1 | 1 | 14000 | 14 (0)| 15 |00:00:00.01 | 6 | | | |
-----------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("DEPTNO"="DEPTNO")
--注意看执行ID=3行,A-ROWS=15行,buffers=6,也就是没有全表扫描完成!仅仅扫描15条EMP1的记录,就输出了结果,这个是因为前面15条记录已经
--包括了deptno=10,20,30,40的记录,继续扫描emp1表已经无意义.
--使用exists也是一样.
SCOTT@test01p> @dpc '' ''
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID 7xbwbhv50rcvt, child number 0
-------------------------------------
select /*+ full(dept) */ deptno,dname from dept where exists (select 1 from emp1 where emp1.deptno=dept.deptno)
Plan hash value: 1309553802
-----------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 17 (100)| 4 |00:00:00.01 | 13 | | | |
|* 1 | HASH JOIN SEMI | | 1 | 4 | 17 (0)| 4 |00:00:00.01 | 13 | 1599K| 1599K| 1007K (0)|
| 2 | TABLE ACCESS FULL| DEPT | 1 | 4 | 3 (0)| 4 |00:00:00.01 | 7 | | | |
| 3 | TABLE ACCESS FULL| EMP1 | 1 | 14000 | 14 (0)| 15 |00:00:00.01 | 6 | | | |
-----------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("EMP1"."DEPTNO"="DEPT"."DEPTNO")
--如果建立emp2表不包含deptno=40的记录,情况如何呢?
create table emp2 as select * from emp;
alter table EMP2 modify empno number(10);
insert into EMP2(empno,deptno) select rownum+10000,10 from EMP,(select * from dual connect by level execute dbms_stats.gather_table_stats(user,'emp2',cascade=>true,method_opt=>'for all columns size 1',no_invalidate=>false);
SCOTT@test01p> select /*+full(dept) */ deptno,dname from dept where deptno in ( select deptno from emp2);
DEPTNO DNAME
---------- --------------
20 RESEARCH
30 SALES
10 ACCOUNTING
SCOTT@test01p> @dpc '' ''
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID 106cp3bm86r84, child number 0
-------------------------------------
select /*+full(dept) */ deptno,dname from dept where deptno in (select deptno from emp2)
Plan hash value: 4213155305
--------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers | Reads | OMem | 1Mem | Used-Mem |
--------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 17 (100)| 3 |00:00:00.03 | 54 | 44 | | | |
|* 1 | HASH JOIN SEMI | | 1 | 3 | 17 (0)| 3 |00:00:00.03 | 54 | 44 | 1599K| 1599K| 1007K (0)|
| 2 | TABLE ACCESS FULL| DEPT | 1 | 4 | 3 (0)| 4 |00:00:00.01 | 7 | 0 | | | |
| 3 | TABLE ACCESS FULL| EMP2 | 1 | 14000 | 14 (0)| 14000 |00:00:00.03 | 47 | 44 | | | |
--------------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("DEPTNO"="DEPTNO")
--注意看执行ID=3行,A-ROWS=14000行,buffers=47,也就是因为emp2表不存在deptno=40的记录,直到结束也没有deptno=40的记录.
--我看执行计划的Outline,并没有PARTIAL_JOIN,也许是版本的问题.
@dpc '' ''
/*+
BEGIN_OUTLINE_DATA
IGNORE_OPTIM_EMBEDDED_HINTS
OPTIMIZER_FEATURES_ENABLE('12.1.0.1')
DB_VERSION('12.1.0.1')
ALL_ROWS
OUTLINE_LEAF(@"SEL$5DA710D3")
UNNEST(@"SEL$2")
OUTLINE(@"SEL$1")
OUTLINE(@"SEL$2")
FULL(@"SEL$5DA710D3" "DEPT"@"SEL$1")
FULL(@"SEL$5DA710D3" "EMP1"@"SEL$2")
LEADING(@"SEL$5DA710D3" "DEPT"@"SEL$1" "EMP1"@"SEL$2")
USE_HASH(@"SEL$5DA710D3" "EMP1"@"SEL$2")
END_OUTLINE_DATA
*/
--这个特性受隐含参数_optimizer_partial_join_eval的控制.
SYS@test> set linesize 200
SYS@test> @hide PARTIAL_JOIN
old 10: and lower(a.ksppinm) like lower('%&1%')
new 10: and lower(a.ksppinm) like lower('%PARTIAL_JOIN%')
NAME DESCRIPTION DEFAULT_VALUE SESSION_VALUE SYSTEM_VALUE
----------------------------- --------------------------------- ------------- ------------- ------------
_optimizer_partial_join_eval partial join evaluation parameter TRUE TRUE TRUE
--这个执行计划的特性仅仅有全部的之值在dept表.如果有一些记录没有匹配emp2表,只能全表扫描第2个表才能知道结果.
--这样像emp2表这种情况,以前会如何优化呢?
--通过在emp2上建立deptno索引来解决,而这个索引在正常的业务中很少会使用,重复值很多,许多执行计划会不使用,这样
--为了这样一条语句,代价有点大.看看具体的情况:
SCOTT@test01p> create index i_emp_deptno on emp2(deptno);
Index created.
SCOTT@test01p> select /*+f1ull(dept) */ deptno,dname from dept where deptno in ( select deptno from emp2);
DEPTNO DNAME
---------- --------------
10 ACCOUNTING
20 RESEARCH
30 SALES
SCOTT@test01p> @dpc '' ''
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID 32ckk3kt5z7hn, child number 0
-------------------------------------
select /*+f1ull(dept) */ deptno,dname from dept where deptno in (select deptno from emp2)
Plan hash value: 3987593338
----------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
----------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 7 (100)| 3 |00:00:00.01 | 14 |
| 1 | NESTED LOOPS SEMI | | 1 | 3 | 7 (0)| 3 |00:00:00.01 | 14 |
| 2 | TABLE ACCESS FULL| DEPT | 1 | 4 | 3 (0)| 4 |00:00:00.01 | 8 |
|* 3 | INDEX RANGE SCAN | I_EMP_DEPTNO | 4 | 10500 | 1 (0)| 3 |00:00:00.01 | 6 |
----------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("DEPTNO"="DEPTNO")
SCOTT@test01p> select /*+ full(dept) */ deptno,dname from dept where exists (select 1 from emp2 where emp2.deptno=dept.deptno);
DEPTNO DNAME
---------- --------------
10 ACCOUNTING
20 RESEARCH
30 SALES
SCOTT@test01p> @dpc '' ''
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID f0gzx28d1hz3z, child number 0
-------------------------------------
select /*+ full(dept) */ deptno,dname from dept where exists (select 1 from emp2 where emp2.deptno=dept.deptno)
Plan hash value: 3987593338
----------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
----------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 7 (100)| 3 |00:00:00.01 | 14 |
| 1 | NESTED LOOPS SEMI | | 1 | 3 | 7 (0)| 3 |00:00:00.01 | 14 |
| 2 | TABLE ACCESS FULL| DEPT | 1 | 4 | 3 (0)| 4 |00:00:00.01 | 8 |
|* 3 | INDEX RANGE SCAN | I_EMP_DEPTNO | 4 | 10500 | 1 (0)| 3 |00:00:00.01 | 6 |
----------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("EMP2"."DEPTNO"="DEPT"."DEPTNO")
--我们可以发现使用in 和 exists ,执行计划都是一样的.
--感到奇怪的是我这个版本修改参数并没有改变执行计划.不知道为什么?
SYS@test> alter session set "_optimizer_partial_join_eval"=false;
Session altered.
--继续看看11g的情况.
SCOTT@test> @ver
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
create table emp1 as select * from emp;
alter table EMP1 modify empno number(10);
insert into EMP1(empno,deptno) select rownum+10000,40 from EMP,(select * from dual connect by level execute dbms_stats.gather_table_stats(user,'emp1',cascade=>true,method_opt=>'for all columns size 1',no_invalidate=>false);
SCOTT@test> alter session set statistics_level=all;
Session altered.
SCOTT@test> select /*+full(dept) */ deptno,dname from dept where deptno in ( select deptno from emp1);
DEPTNO DNAME
---------- --------------
20 RESEARCH
30 SALES
10 ACCOUNTING
40 OPERATIONS
SCOTT@test> @dpc '' ''
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID 0u9dzjua9uc24, child number 0
-------------------------------------
select /*+full(dept) */ deptno,dname from dept where deptno in (
select deptno from emp1)
Plan hash value: 1309553802
-----------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 19 (100)| 4 |00:00:00.06 | 54 | | | |
|* 1 | HASH JOIN SEMI | | 1 | 3 | 19 (6)| 4 |00:00:00.06 | 54 | 1180K| 1180K| 1066K (0)|
| 2 | TABLE ACCESS FULL| DEPT | 1 | 6 | 4 (0)| 6 |00:00:00.01 | 8 | | | |
| 3 | TABLE ACCESS FULL| EMP1 | 1 | 14000 | 14 (0)| 14000 |00:00:00.01 | 46 | | | |
-----------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("DEPTNO"="DEPTNO")
--很明显,11g下id=3,emp1执行全部扫描.
--使用exists 也是这种情况.
SCOTT@test> @dpc '' ''
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID fzz6dafumdjgz, child number 0
-------------------------------------
select /*+ full(dept) */ deptno,dname from dept where exists (select 1 from emp1 where emp1.deptno=dept.deptno)
Plan hash value: 1309553802
-----------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 19 (100)| 4 |00:00:00.06 | 54 | | | |
|* 1 | HASH JOIN SEMI | | 1 | 3 | 19 (6)| 4 |00:00:00.06 | 54 | 1180K| 1180K| 768K (0)|
| 2 | TABLE ACCESS FULL| DEPT | 1 | 6 | 4 (0)| 6 |00:00:00.01 | 8 | | | |
| 3 | TABLE ACCESS FULL| EMP1 | 1 | 14000 | 14 (0)| 14000 |00:00:00.01 | 46 | | | |
-----------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("EMP1"."DEPTNO"="DEPT"."DEPTNO")
SCOTT@test> @hide PARTIAL_JOIN
no rows selected
--11g下一个简单的改进版本是:
SCOTT@test> create index i_emp_deptno on emp1(deptno);
Index created.
SCOTT@test> select /*+ full(dept) */ deptno,dname from dept where exists (select 1 from emp1 where emp1.deptno=dept.deptno and rownum DEPTNO DNAME
---------- --------------
10 ACCOUNTING
20 RESEARCH
30 SALES
40 OPERATIONS
SCOTT@test> @dpc '' advanced
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID c3ukmu0mxfs4n, child number 0
-------------------------------------
select /*+ full(dept) */ deptno,dname from dept where exists (select 1
from emp1 where emp1.deptno=dept.deptno and rownum
Plan hash value: 4010202402
-----------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time | A-Rows | A-Time | Buffers |
-----------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | | 7 (100)| | 4 |00:00:00.01 | 21 |
|* 1 | FILTER | | 1 | | | | | 4 |00:00:00.01 | 21 |
| 2 | TABLE ACCESS FULL| DEPT | 1 | 6 | 66 | 4 (0)| 00:00:01 | 6 |00:00:00.01 | 9 |
|* 3 | COUNT STOPKEY | | 6 | | | | | 4 |00:00:00.01 | 12 |
|* 4 | INDEX RANGE SCAN| I_EMP_DEPTNO | 6 | 1 | 3 | 1 (0)| 00:00:01 | 4 |00:00:00.01 | 12 |
-----------------------------------------------------------------------------------------------------------------------------
Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
1 - SEL$1
2 - SEL$1 / DEPT@SEL$1
3 - SEL$2
4 - SEL$2 / EMP1@SEL$2
Outline Data
-------------
/*+
BEGIN_OUTLINE_DATA
IGNORE_OPTIM_EMBEDDED_HINTS
OPTIMIZER_FEATURES_ENABLE('11.2.0.3')
DB_VERSION('11.2.0.3')
ALL_ROWS
OUTLINE_LEAF(@"SEL$2")
OUTLINE_LEAF(@"SEL$1")
FULL(@"SEL$1" "DEPT"@"SEL$1")
INDEX(@"SEL$2" "EMP1"@"SEL$2" ("EMP1"."DEPTNO"))
END_OUTLINE_DATA
*/
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter( IS NOT NULL)
3 - filter(ROWNUM 4 - access("EMP1"."DEPTNO"=:B1)
Column Projection Information (identified by operation id):
-----------------------------------------------------------
1 - "DEPTNO"[NUMBER,22], "DNAME"[VARCHAR2,14]
2 - "DEPTNO"[NUMBER,22], "DNAME"[VARCHAR2,14]
--使用rownum
--总结:
--12c下引入了Partial Join Evaluation (PJE)减少了逻辑读,但是要第1个表的相关信息在第2个表里面,而且与表2的数据分布有关,如果
--表2没有包含表1的全部信息一样要全表扫描.当然也要注意不要看到全表扫描,就武断的认为执行计划不好.