[20150926]索引压缩问题.txt

--以前如果索引前缀重复值很多，我会选择索引压缩，这样减少磁盘空间占用，索引范围扫描也可以减少磁盘IO，虽然这样可能消耗一些
--CUP资源，感觉影响不大。

--看了链接感觉自己忽略一些问题，可能导致达不到预期效果。
--通过例子来说明：

1.建立测试环境：
SCOTT@test01p> @ver1

PORT_STRING VERSION BANNER CON_ID
------------------------------ -------------- -------------------------------------------------------------------------------- ----------
IBMPC/WIN_NT64-9.1.0 12.1.0.1.0 Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production 0

SCOTT@test01p> create table t as select rownum id , cast('test' as varchar2(20)) name from dual connect by level<=2e5;
Table created.

SCOTT@test01p> insert into t select * from t where id <=1e5;
100000 rows created.

SCOTT@test01p> commit ;
Commit complete.

2.建立索引看看：

SCOTT@test01p> create index i_t_id on t(id) pctfree 0;
Index created.

SCOTT@test01p> select index_name, leaf_blocks, blevel from dba_indexes where table_name='T';
INDEX_NAME LEAF_BLOCKS BLEVEL
-------------------- ----------- ----------
I_T_ID 598 2

--如果这个时候我选择索引压缩看看。
SCOTT@test01p> alter index i_t_id rebuild compress 1;
Index altered.

SCOTT@test01p> select index_name, leaf_blocks, blevel from dba_indexes where table_name='T';
INDEX_NAME LEAF_BLOCKS BLEVEL
-------------------- ----------- ----------
I_T_ID 688 2

--可以发现这种方式占用的磁盘空间反而变大。

3.实际上还可以通过分析索引确定选择压缩的前缀。

SCOTT@test01p> validate index i_t_id;
Index analyzed.

SCOTT@test01p> @i
HEIGHT BLOCKS NAME LF_ROWS LF_BLKS LF_ROWS_LEN LF_BLK_LEN BR_ROWS BR_BLKS BR_ROWS_LEN BR_BLK_LEN DEL_LF_ROWS DEL_LF_ROWS_LEN DISTINCT_KEYS
---------- ---------- ---------- ---------- ---------- ----------- ---------- ---------- ---------- ----------- ---------- ----------- --------------- -------------
3 768 I_T_ID 300000 688 3300000 7996 687 3 8259 8032 0 0 200000

MOST_REPEATED_KEY BTREE_SPACE USED_SPACE PCT_USED ROWS_PER_KEY BLKS_GETS_PER_ACCESS PRE_ROWS PRE_ROWS_LEN OPT_CMPR_COUNT OPT_CMPR_PCTSAVE
----------------- ----------- ---------- ---------- ------------ -------------------- ---------- ------------ -------------- ----------------
2 5525344 5496340 100 1.5 4.25 200020 2188081 0 12

--实际上执行@i ,就是查询index_stats视图，因为显示的原因，我分开执行了2次。
--OPT_CMPR_COUNT＝0 ，索引压缩前缀是0，选择不压缩。

4.继续插入数据：
SCOTT@test01p> insert into t select * from t where id <=1e5;
200000 rows created.

-- 这样id=1 - 1e5 的数据每个存在4条。

SCOTT@test01p> @i
HEIGHT BLOCKS NAME LF_ROWS LF_BLKS LF_ROWS_LEN LF_BLK_LEN BR_ROWS BR_BLKS BR_ROWS_LEN BR_BLK_LEN DEL_LF_ROWS DEL_LF_ROWS_LEN DISTINCT_KEYS
---------- ---------- ---------- ---------- ---------- ----------- ---------- ---------- ---------- ----------- ---------- ----------- --------------- -------------
3 1280 I_T_ID 500000 1138 5500000 7996 1137 5 13555 8032 0 0 200000

--这种情况下OPT_CMPR_COUNT＝1,选择索引压缩才有效果。看看对比情况：

SCOTT@test01p> alter index i_t_id rebuild compress 1;
Index altered.

SCOTT@test01p> select index_name, leaf_blocks, blevel from dba_indexes where table_name='T';
INDEX_NAME LEAF_BLOCKS BLEVEL
-------------------- ----------- ----------
I_T_ID 963 2

SCOTT@test01p> alter index i_t_id rebuild nocompress ;
Index altered.

SCOTT@test01p> select index_name, leaf_blocks, blevel from dba_indexes where table_name='T';
INDEX_NAME LEAF_BLOCKS BLEVEL
-------------------- ----------- ----------
I_T_ID 995 2

--对比发现即使在这样的情况下，选择索引压缩获得的效益也非常小。不如不选择索引压缩。

5.12c 还支持Index Advanced Compression技术，我目前的测试环境还不支持：

SCOTT@test01p> alter index i_t_id rebuild compress advanced low;
alter index i_t_id rebuild compress advanced low
*
ERROR at line 1:
ORA-65451: Advanced index compression is not supported for tablespaces on this storage type.

--无法测试。

6.总结：
--总之索引压缩要在前缀重复值达到一定程度的情况下有效，而且如果存在许多许多单一的值的情况效果会非产差。
--像我前面的测试1e5+1 - 2e5 之间的值仅仅出现一次，这个因素直接影响了索引压缩效率。

时间： 2024-11-13 03:37:12

[20150926]索引压缩问题.txt

[20150926]索引压缩问题.txt的相关文章

DB2面向OLTP环境的物理数据库设计：数据和索引压缩

[20171031]rman备份压缩模式.txt

[20171202]关于函数索引的状态.txt

[20150304]唯一索引与阻塞.txt

[20150626]建立索引pctfree=0.txt

[20130110]linux传输文件-tar+nc+各种压缩方式.txt

[20131013]索引部分数据.txt

[20130815]关于虚拟索引的问题.txt

[20121015]探索索引-学习bbed.txt