全表扫描是Oracle访问数据库表是较为常见的访问方式之一。很多朋友一看到SQL语句执行计划中的全表扫描,就要考虑对其进行修理一番。全表扫描的存在,的确存在可能优化的余地。但事实上很多时候全表扫描也并非是最低效的,完全要看不同的情形与场合,任一方式都是有利有弊的,也就是具体情况要具体分析。本文描述了什么是全表扫描以及何时发生全表扫描,何时全表扫描才低效。
本文涉及到的相关链接:
高水位线和全表扫描
启用 AUTOTRACE 功能 Oracle 测试常用表BIG_TABLE Oracle db_file_mulitblock_read_count参数
1、什么是全表扫描?
全表扫描就是扫表表中所有的行,实际上是扫描表中所有的数据块,因为Oracle中最小的存储单位是Oracle block。
扫描所有的数据块就包括高水位线以内的数据块,即使是空数据块在没有被释放的情形下也会被扫描而导致I/O增加。
在全表扫描期间,通常情况下,表上这些相邻的数据块被按顺序(sequentially)的方式访问以使得一次I/O可以读取多个数据块。
一次读取更多的数据块有助于全表扫描使用更少的I/O,对于可读取的数据块被限制于参数DB_FILE_MULTIBLOCK_READ_COUNT。
2、何时发生全表扫描?
a、表上的索引失效或无法被使用的情形(如对谓词使用函数、计算、NULL值、不等运算符、类型转换)
b、查询条件返回了整个表的大部分数据
c、使用了并行方式访问表
d、使用full 提示
e、统计信息缺失时使得Oracle认为全表扫描比索引扫描更高效
f、表上的数据块小于DB_FILE_MULTIBLOCK_READ_COUNT值的情形可能产生全表扫描
3、演示全表扫描的情形
a、准备演示环境 scott@ORA11G> select * from v$version where rownum<2; BANNER -------------------------------------------------------------------------------- Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production --创建表t scott@ORA11G> CREATE TABLE t 2 AS 3 SELECT rownum AS n, rpad('*',100,'*') AS pad 4 FROM dual 5 CONNECT BY level <= 1000; Table created. --添加索引 scott@ORA11G> create unique index t_pk on t(n); Index created. scott@ORA11G> alter table t add constraint t_pk primary key(n) using index t_pk; Table altered. --收集统计信息 scott@ORA11G> execute dbms_stats.gather_table_stats('SCOTT','T',cascade=>true); PL/SQL procedure successfully completed. scott@ORA11G> set autot trace exp; scott@ORA11G> select count(*) from t; --->count(*)的时候使用了索引快速扫描 Execution Plan ---------------------------------------------------------- Plan hash value: 454320086 ---------------------------------------------------------------------- | Id | Operation | Name | Rows | Cost (%CPU)| Time | ---------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 2 (0)| 00:00:01 | | 1 | SORT AGGREGATE | | 1 | | | | 2 | INDEX FAST FULL SCAN| T_PK | 1000 | 2 (0)| 00:00:01 | ---------------------------------------------------------------------- scott@ORA11G> set autot off; scott@ORA11G> alter table t move; --->进行move table Table altered. -->move 之后索引失效,如下所示 scott@ORA11G> @idx_info Enter value for owner: scott Enter value for table_name: t Table Name INDEX_NAME CL_NAM CL_POS STATUS IDX_TYP DSCD ------------- -------------- -------------------- ------ -------- --------------- ---- T T_PK N 1 UNUSABLE NORMAL ASC b、索引失效导致全表扫描 scott@ORA11G> set autot trace exp; scott@ORA11G> select count(*) from t; Execution Plan ---------------------------------------------------------- Plan hash value: 2966233522 ------------------------------------------------------------------- | Id | Operation | Name | Rows | Cost (%CPU)| Time | ------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 7 (0)| 00:00:01 | | 1 | SORT AGGREGATE | | 1 | | | | 2 | TABLE ACCESS FULL| T | 1000 | 7 (0)| 00:00:01 | ------------------------------------------------------------------- scott@ORA11G> set autot off; scott@ORA11G> alter index t_pk rebuild; -->重建索引 Index altered. scott@ORA11G> @idx_info Enter value for owner: scott Enter value for table_name: t Table Name INDEX_NAME CL_NAM CL_POS STATUS IDX_TYP DSCD -------------- ---------------- -------------------- ------ -------- --------------- ---- T T_PK N 1 VALID NORMAL ASC c、返回了整个表的大部分数据使用了全表扫描 scott@ORA11G> select count(pad) from t where n<=990; Execution Plan ---------------------------------------------------------- Plan hash value: 2966233522 --------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 105 | 7 (0)| 00:00:01 | | 1 | SORT AGGREGATE | | 1 | 105 | | | |* 2 | TABLE ACCESS FULL| T | 991 | 101K| 7 (0)| 00:00:01 | --------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - filter("N"<=990) --返回小部分数据时,使用的是索引扫描 scott@ORA11G> select count(pad) from t where n<=10; Execution Plan ---------------------------------------------------------- Plan hash value: 4270555908 ------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 105 | 3 (0)| 00:00:01 | | 1 | SORT AGGREGATE | | 1 | 105 | | | | 2 | TABLE ACCESS BY INDEX ROWID| T | 10 | 1050 | 3 (0)| 00:00:01 | |* 3 | INDEX RANGE SCAN | T_PK | 10 | | 2 (0)| 00:00:01 | ------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 3 - access("N"<=10) d、使用并行方式访问表时使用了全表扫描 scott@ORA11G> select /*+ parallel(3) */ count(pad) from t where n<=10; Execution Plan ---------------------------------------------------------- Plan hash value: 3126468333 ---------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | TQ |IN-OUT| PQ Distrib | ---------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 105 | 3 (0)| 00:00:01 | | | | | 1 | SORT AGGREGATE | | 1 | 105 | | | | | | | 2 | PX COORDINATOR | | | | | | | | | | 3 | PX SEND QC (RANDOM) | :TQ10000 | 1 | 105 | | | Q1,00 | P->S | QC (RAND) | | 4 | SORT AGGREGATE | | 1 | 105 | | | Q1,00 | PCWP | | | 5 | PX BLOCK ITERATOR | | 10 | 1050 | 3 (0)| 00:00:01 | Q1,00 | PCWC | | |* 6 | TABLE ACCESS FULL| T | 10 | 1050 | 3 (0)| 00:00:01 | Q1,00 | PCWP | | ---------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 6 - filter("N"<=10) Note ----- - Degree of Parallelism is 3 because of hint --Author : Robinson --Blog : http://blog.csdn.net/robinson_0612 e、使用full提示时使用了全表扫描 scott@ORA11G> select /*+ full(t) */ count(pad) from t where n<=10; Execution Plan ---------------------------------------------------------- Plan hash value: 2966233522 --------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 105 | 7 (0)| 00:00:01 | | 1 | SORT AGGREGATE | | 1 | 105 | | | |* 2 | TABLE ACCESS FULL| T | 10 | 1050 | 7 (0)| 00:00:01 | --------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - filter("N"<=10) f、统计信息缺失导致全表扫描的情形 scott@ORA11G> exec dbms_stats.delete_table_stats('SCOTT','T'); PL/SQL procedure successfully completed. scott@ORA11G> select count(pad) from t where n<=10; Execution Plan ---------------------------------------------------------- Plan hash value: 2966233522 --------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 65 | 7 (0)| 00:00:01 | | 1 | SORT AGGREGATE | | 1 | 65 | | | |* 2 | TABLE ACCESS FULL| T | 10 | 650 | 7 (0)| 00:00:01 | --------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - filter("N"<=10) Note ----- - dynamic sampling used for this statement (level=2) --上面的执行计划使用了全表扫描,而且提示使用了动态采样,也就是缺乏统计信息 --表上的数据块小于DB_FILE_MULTIBLOCK_READ_COUNT值的情形可能产生全表扫描的情形不演示
4、全表扫描何时低效?
--先来做几个实验 a、演示表上的相关信息 scott@ORA11G> @idx_info Enter value for owner: scott Enter value for table_name: big_table Table Name Index Name CL_NAM CL_POS Status IDX_TYP DSCD ------------------------- ------------------------- --------- ------ -------- --------------- ---- BIG_TABLE BIG_TABLE_PK ID 1 VALID NORMAL ASC scott@ORA11G> @idx_stat Enter value for input_table_name: big_table Enter value for owner: scott AVG LEAF BLKS AVG DATA BLKS BLEV IDX_NAME LEAF_BLKS DST_KEYS PER KEY PER KEY CLUST_FACT LAST_ANALYZED TB_BLKS TB_ROWS ---- -------------- ---------- ---------- ------------- ------------- ---------- ------------------ ---------- ---------- 1 BIG_TABLE_PK 208 100000 1 1 1483 20130524 10:45:51 1515 100000 --数据库参数设置 scott@ORA11G> show parameter optimizer_index_ NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ optimizer_index_caching integer 0 optimizer_index_cost_adj integer 100 scott@ORA11G> show parameter optimizer_mode NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ optimizer_mode string ALL_ROWS b、查询返回20%数据行的情形 scott@ORA11G> alter system flush buffer_cache; scott@ORA11G> select sum(object_id),avg(object_id) from big_table where id between 20000 and 40000; Execution Plan ---------------------------------------------------------- Plan hash value: 3098837282 -- 执行计划中,使用了索引范围扫描 --------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 18 | 341 (0)| 00:00:05 | | 1 | SORT AGGREGATE | | 1 | 18 | | | | 2 | TABLE ACCESS BY INDEX ROWID| BIG_TABLE | 20046 | 352K| 341 (0)| 00:00:05 | |* 3 | INDEX RANGE SCAN | BIG_TABLE_PK | 20046 | | 43 (0)| 00:00:01 | --------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 3 - access("ID">=20000 AND "ID"<=40000) Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 351 consistent gets 351 physical reads 0 redo size 427 bytes sent via SQL*Net to client 349 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed scott@ORA11G> alter system flush buffer_cache; scott@ORA11G> select /*+ full(big_table) */ sum(object_id),avg(object_id) from big_table where id between 20000 and 40000; Execution Plan ---------------------------------------------------------- Plan hash value: 599409829 ---- 使用了提示执行为全表扫描 -------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 18 | 413 (1)| 00:00:05 | | 1 | SORT AGGREGATE | | 1 | 18 | | | |* 2 | TABLE ACCESS FULL| BIG_TABLE | 20046 | 352K| 413 (1)| 00:00:05 | -------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - filter("ID"<=40000 AND "ID">=20000) Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 1486 consistent gets 1484 physical reads 0 redo size 427 bytes sent via SQL*Net to client 349 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed --注意对比上面两次操作中的consistent gets与physical reads c、查询返回30%数据行的情形 scott@ORA11G> alter system flush buffer_cache; scott@ORA11G> select sum(object_id),avg(object_id) from big_table where id between 20000 and 50000; Execution Plan ---------------------------------------------------------- Plan hash value: 599409829 --->尽管返回数据的总行数为30%,而此时优化器使用了全表扫描 -------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 18 | 413 (1)| 00:00:05 | | 1 | SORT AGGREGATE | | 1 | 18 | | | |* 2 | TABLE ACCESS FULL| BIG_TABLE | 30012 | 527K| 413 (1)| 00:00:05 | -------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - filter("ID"<=50000 AND "ID">=20000) Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 1486 consistent gets 1484 physical reads 0 redo size 427 bytes sent via SQL*Net to client 349 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed --下面使用提示来强制优化器走索引扫描 scott@ORA11G> alter system flush buffer_cache; scott@ORA11G> select /*+ index(big_table big_table_pk) */ sum(object_id),avg(object_id) 2 from big_table where id between 20000 and 50000; Execution Plan ---------------------------------------------------------- Plan hash value: 3098837282 --------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 18 | 511 (1)| 00:00:07 | | 1 | SORT AGGREGATE | | 1 | 18 | | | | 2 | TABLE ACCESS BY INDEX ROWID| BIG_TABLE | 30012 | 527K| 511 (1)| 00:00:07 | |* 3 | INDEX RANGE SCAN | BIG_TABLE_PK | 30012 | | 64 (0)| 00:00:01 | --------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 3 - access("ID">=20000 AND "ID"<=50000) Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 526 consistent gets 526 physical reads 0 redo size 427 bytes sent via SQL*Net to client 349 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed --注意观察每一次测试时所耗用的物理读与逻辑读 --从上面的测试可以看出,当表上所返回的数据行数接近于表上的30%时,Oracle 倾向于使用全表扫描 --而对于表上所返回的数据行数接近于表上的30%的情形,我们给与索引提示,此时比全表扫描更高效,即全表扫描是低效的 --笔者同时测试了数据返回总行数接近80%的情形以及创建了一个百万记录的进行对比测试 --大致结论,如果查询所返回的数据的总行数仅仅是表上数据的百分之八十以下,而使用了全表扫描,即可认为该全表扫描是低效的 --注: --具体情况需要具体分析,如果你的表是千万级的,返回总数据的百分之零点几都会导致很大的差异 --其次,表上的索引应具有良好的聚簇因子,如不然,测试的结果可能有天壤之别 --最后,上面所描述的返回总行数应与执行结果返回的行数有差异,是指多少行参与了sum(object_id)
5、小表的全表扫描是否高效?
--使用scott下dept表,仅有4行数据 scott@ORA11G> select * from dept where deptno>10; 3 rows selected. Execution Plan ---------------------------------------------------------- Plan hash value: 2985873453 --->执行计划选择了索引扫描 --------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 3 | 60 | 2 (0)| 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID| DEPT | 3 | 60 | 2 (0)| 00:00:01 | |* 2 | INDEX RANGE SCAN | PK_DEPT | 3 | | 1 (0)| 00:00:01 | --------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("DEPTNO">10) Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 4 consistent gets -->使用了4次逻辑读 0 physical reads 0 redo size 515 bytes sent via SQL*Net to client 349 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 3 rows processed -->下面强制使用全表扫描 scott@ORA11G> select /*+ full(dept) */ * from dept where deptno>10; 3 rows selected. Execution Plan ---------------------------------------------------------- Plan hash value: 3383998547 -------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 3 | 60 | 3 (0)| 00:00:01 | |* 1 | TABLE ACCESS FULL| DEPT | 3 | 60 | 3 (0)| 00:00:01 | -------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter("DEPTNO">10) Statistics ---------------------------------------------------------- 1 recursive calls 0 db block gets 4 consistent gets -->此时的逻辑读同样为4次 0 physical reads 0 redo size 515 bytes sent via SQL*Net to client 349 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 3 rows processed --下面来看看count(*)的情形 scott@ORA11G> select count(*) from dept; 1 row selected. Execution Plan ---------------------------------------------------------- Plan hash value: 3051237957 --->执行计划选择了索引全扫描 -------------------------------------------------------------------- | Id | Operation | Name | Rows | Cost (%CPU)| Time | -------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 1 (0)| 00:00:01 | | 1 | SORT AGGREGATE | | 1 | | | | 2 | INDEX FULL SCAN| PK_DEPT | 4 | 1 (0)| 00:00:01 | -------------------------------------------------------------------- Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 1 consistent gets -->逻辑读仅为1次 0 physical reads 0 redo size 335 bytes sent via SQL*Net to client 349 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed -->下面强制使用全表扫描 scott@ORA11G> select /*+ full(dept) */ count(*) from dept; 1 row selected. Execution Plan ---------------------------------------------------------- Plan hash value: 315352865 ------------------------------------------------------------------- | Id | Operation | Name | Rows | Cost (%CPU)| Time | ------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 3 (0)| 00:00:01 | | 1 | SORT AGGREGATE | | 1 | | | | 2 | TABLE ACCESS FULL| DEPT | 4 | 3 (0)| 00:00:01 | ------------------------------------------------------------------- Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 3 consistent gets -->使用了3次逻辑读 0 physical reads 0 redo size 335 bytes sent via SQL*Net to client 349 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed --对于小表,从上面的情形可以看出,使用索引扫描也是比全表扫描高效 --因此,建议始终为小表建立索引
更多参考
PL/SQL 联合数组与嵌套表
PL/SQL 变长数组
PL/SQL --> PL/SQL记录
dbms_xplan之display_cursor函数的使用