原创 转载请注明出处
先引出ORACLE WAIT INTERFACE中的原话:
In addition to SQL tuning, it may also be worthwhile to check the index’s clustering factor if the execution plan calls for table access by index rowid. The clustering factor of an index defines how ordered the rows are in the table. It affects the number of I/Os required for the whole operation. If the DBA_INDEXES.CLUSTERING_FACTOR of the index approaches the number of blocks in the table, then most of the rows in the table are ordered. This is desirable. However, if the clustering factor approaches the number of rows in the table, it means the rows in the table are randomly ordered. In this case, it is unlikely for the index entries in the same leaf block to point to rows in the same data block, and thus it requires more I/Os to complete the operation. You can improve the index’s clustering factor by rebuilding the table so that rows are ordered according to the index key and rebuilding the index thereafter. What happens if the table has more than one index? Well, that is the downside. You can only cater to the most used index.
然后给出中文概念:
在索引的分析数据上clustering_factor是一个很重要的参数,表示的是索引和表之间的关系,因为,索引是按照一定的顺序排列的,但是,对于表来说是按照一种heap的形式存放,每一行可能分布在段上任何一个块上,所以要是通过索引来查找数据行的时候,就有可以一个索引块对应多个,甚至全部表的块,所以引入了clustering_factor这个参数来表示表上数据存放和索引之间的对应关系。这样CBO就可以根据这个参数来判断使用这个索引产生的cost是多少。
一般来说,如果这个表的排列是按照索引列的顺序存放数据的话,这个参数就应该和数据表上的块相类似。
然后做简单的试验证明:
1、试验1
SQL> create table test10
2 as select * from dba_tables;
SQL> create index test10_ind
2 on test10(SAMPLE_SIZE);
execute dbms_stats.gather_table_stats(ownname => 'sys',tabname => 'TEST10',cascade => true);
SQL> select BLOCKS from dba_tables where table_name='TEST10';
BLOCKS
----------
46
SQL> select dba_indexes.clustering_factor from dba_indexes where dba_indexes.index_name='TEST10_IND';
CLUSTERING_FACTOR
-----------------
496
由于是建立索引的时候是以SAMPLE_SIZE为关键字,这个时候索引会对其排序然后形成一个B-Tree的结构,当然这个时候索引上的一个块也许通过ROWID会访问TABLE上的多个块,所以因子就是496,相差很大。
2、试验2
SQL> create table test11
2 as
3 select * from dba_tables order by SAMPLE_SIZE;
SQL> create table index11
2 on test11(SAMPLE_SIZE);
SQL> execute dbms_stats.gather_table_stats(ownname => 'sys',tabname => 'TEST11',cascade => true);
SQL> select BLOCKS from dba_tables where table_name='TEST11';
BLOCKS
----------
46
SQL> select dba_indexes.clustering_factor from dba_indexes where dba_indexes.index_name='INDEX11';
CLUSTERING_FACTOR
-----------------
44
这次的试验建立表的时候我使用了索引关键字进行排序,这个时候看到索引的群集因子和表的块数相差不大,除去表中HEADER块,其实是相等的。这个时候索引块一一对应表块,通过索引访问表的I/O也会低,是最好的