[20111223]索引键值在B tree索引块中的顺序.txt

参考链接：http://www.adellera.it/blog/2009/05/24/order-keys-inside-index-blocks/

自己为了加强理解重复一下对方的测试！

1.建立测试表以及索引

SQL> select * from v$version;
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
PL/SQL Release 11.2.0.1.0 - Production
CORE 11.2.0.1.0 Production
TNS for Linux: Version 11.2.0.1.0 - Production
NLSRTL Version 11.2.0.1.0 - Production

SQL> create table t (x varchar2(10));
SQL> create index i_t_x on t(x);

随机插入如下记录:

insert into t values('000000');
insert into t values('777777');
insert into t values('111111');
insert into t values('666666');
insert into t values('222222');
insert into t values('555555');
insert into t values('333333');
insert into t values('444444');
commit ;

2.转储对应的索引块:

SQL> select header_file,header_block from dba_segments where wner='SCOTT' and segment_name='I_T_X';

HEADER_FILE HEADER_BLOCK
----------- ------------
4 11298

SQL> alter system dump datafile 4 block 11299;

3.查看:

Block header dump: 0x01002c23
Object id on Block? Y
seg/obj: 0x12afc csc: 0x00.3d4076 itc: 2 flg: E typ: 2 - INDEX
brn: 0 bdba: 0x1002c20 ver: 0x01 opc: 0
inc: 0 exflg: 0

Itl           Xid                  Uba         Flag Lck        Scn/Fsc
0x01   0x0000.000.00000000 0x00000000.0000.00 ----    0 fsc 0x0000.00000000
0x02   0x0035.001.00000007 0x00c00ae0.000e.1a --U-    8 fsc 0x0000.003d4081
Leaf block dump
===============
header address 47190358755940=0x2aeb5c920a64
kdxcolev 0
KDXCOLEV Flags = - - -
kdxcolok 0
kdxcoopc 0x80: pcode=0: iot flags=--- is converted=Y
kdxconco 2
kdxcosdc 0
kdxconro 8
kdxcofbo 52=0x34
kdxcofeo 7904=0x1ee0
kdxcoavs 7852
kdxlespl 0
kdxlende 0
kdxlenxt 0=0x0
kdxleprv 0=0x0
kdxledsz 0
kdxlebksz 8032
row#0[8016] flag: ------, lock: 2, len=16
col 0; len 6; (6): 30 30 30 30 30 30         -- 000000
col 1; len 6; (6): 01 00 05 0e 00 00
row#1[7984] flag: ------, lock: 2, len=16
col 0; len 6; (6): 31 31 31 31 31 31         -- 111111
col 1; len 6; (6): 01 00 05 0e 00 02
row#2[7952] flag: ------, lock: 2, len=16
col 0; len 6; (6): 32 32 32 32 32 32         -- 222222
col 1; len 6; (6): 01 00 05 0e 00 04
row#3[7920] flag: ------, lock: 2, len=16
col 0; len 6; (6): 33 33 33 33 33 33         -- 333333
col 1; len 6; (6): 01 00 05 0e 00 06
row#4[7904] flag: ------, lock: 2, len=16
col 0; len 6; (6): 34 34 34 34 34 34         -- 444444
col 1; len 6; (6): 01 00 05 0e 00 07
row#5[7936] flag: ------, lock: 2, len=16
col 0; len 6; (6): 35 35 35 35 35 35         -- 555555
col 1; len 6; (6): 01 00 05 0e 00 05
row#6[7968] flag: ------, lock: 2, len=16
col 0; len 6; (6): 36 36 36 36 36 36         -- 666666
col 1; len 6; (6): 01 00 05 0e 00 03
row#7[8000] flag: ------, lock: 2, len=16
col 0; len 6; (6): 37 37 37 37 37 37         -- 777777
col 1; len 6; (6): 01 00 05 0e 00 01
----- end of leaf block dump -----
End dump data blocks tsn: 4 file#: 4 minblk 11299 maxblk 11299

注意看row#0 括号的数字,表示该键值的索引块的位置,col0是索引键值,col1是rowid。

插入顺序:   '000000' '777777' '111111'    '666666' '222222'    '555555'       '333333'    '444444'
在块中位置:   8016       8000    7984          7968           7952            7936       7920         7904
   可以看出索引在块中插入并不是安装键值排序的，而是从尾部按照插入顺序插入的。

4.在"row directory"中记录键值的具体位置：

Start dump data blocks tsn: 4 file#:4 minblk 11299 maxblk 11299
Block dump from cache:
Dump of buffer cache at level 4 for tsn=4, rdba=16788515
BH (0xa8ff5fd8) file#: 4 rdba: 0x01002c23 (4/11299) class: 1 ba: 0xa8f2a000
set: 6 pool 3 bsz: 8192 bsi: 0 sflg: 2 pwc: 186,19
dbwrid: 0 obj: 76540 objn: 76540 tsn: 4 afn: 4 hint: f
hash: [0xbe252cb0,0xbe252cb0] lru: [0xa8ff61f0,0xa8ff5f90]
ckptq: [NULL] fileq: [NULL] objq: [0xa8ff6218,0xba80aa40]
st: XCURRENT md: NULL tch: 2
flags: block_written_once redo_since_read
LRBA: [0x0.0.0] LSCN: [0x0.0] HSCN: [0xffff.ffffffff] HSUB: [1]
cr pin refcnt: 0 sh pin refcnt: 0
Block dump from disk:
buffer tsn: 4 rdba: 0x01002c23 (4/11299)
scn: 0x0000.003d4081 seq: 0x01 flg: 0x06 tail: 0x40810601
frmt: 0x02 chkval: 0xbf9f type: 0x06=trans data
Hex dump of block: st=0, typ_found=1
Dump of memory from 0x00002AEB5C920A00 to 0x00002AEB5C922A00
2AEB5C920A00 0000A206 01002C23 003D4081 06010000 [....#,...@=.....]
2AEB5C920A10 0000BF9F 00000002 00012AFC 003D4076 [.........*..v@=.]
2AEB5C920A20 00000000 00320002 01002C20 00000000 [......2. ,......]
2AEB5C920A30 00000000 00000000 00000000 00000000 [................]
2AEB5C920A40 00000000 00010035 00000007 00C00AE0 [....5...........]
2AEB5C920A50 001A000E 00002008 003D4081 00000000 [..... ...@=.....]
2AEB5C920A60 00000000 02800000 00000000 00340008 [..............4.]
2AEB5C920A70 1EAC1EE0 00000000 00000000 00000000 [................]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2AEB5C920A80 00000000 00001F60 1F301F50 1EF01F10 [....`...P.0.....]
2AEB5C920A90 1F001EE0 1F401F20 00000000 00000000 [.... .@.........]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2AEB5C920AA0 00000000 00000000 00000000 00000000 [................]
Repeat 489 times
2AEB5C922940 00000000 34060200 34343434 00010634 [.......444444...]
2AEB5C922950 07000E05 33060200 33333333 00010633 [.......333333...]
2AEB5C922960 06000E05 35060200 35353535 00010635 [.......555555...]
2AEB5C922970 05000E05 32060200 32323232 00010632 [.......222222...]
2AEB5C922980 04000E05 36060200 36363636 00010636 [.......666666...]
2AEB5C922990 03000E05 31060200 31313131 00010631 [.......111111...]
2AEB5C9229A0 02000E05 37060200 37373737 00010637 [.......777777...]
2AEB5C9229B0 01000E05 30060200 30303030 00010630 [.......000000...]
2AEB5C9229C0 00000E05 00000000 00000000 00000000 [................]
2AEB5C9229D0 00000000 00000000 00000000 00000000 [................]
Repeat 1 times
2AEB5C9229F0 00000000 00000000 00000000 40810601 [...............@]

kdxlebksz 8032,从row directory中确定如下：
2AEB5C920A80 00000000 00001F60 1F301F50 1EF01F10 [....`...P.0.....]
2AEB5C920A90 1F001EE0 1F401F20 00000000 00000000 [.... .@.........]

插入顺序: '000000' '777777' '111111' '666666' '222222' '555555' '333333' '444444'
在块中位置: 8016 8000 7984 7968 7952 7936 7920 7904

1F60=8032 end of "bottom heap" (kdxlebksz), also 8016 + 16

1F30=7984   '111111'
1F50=8016   '000000'
1EF0=7920   '333333'
1F10=7952   '222222'
1F00=7936   '555555'
1EE0=7904   '444444'
1F40=8000   '777777'
1F20=7968   '666666'

-- 摘抄链接，不自己写了，以免出现不必要的错误：http://www.adellera.it/blog/2009/05/24/order-keys-inside-index-blocks/

that is, the addresses in the row directory are ordered by the (binary) value of the key they point to - remember that you must swap the high and low 2-byte quantities in each 32-bit word (the usual little-endian ordering; I haven't checked whether the results would be different on a machine that is not little-endian, as my one - an x86 - is).
如果交换字节顺序，可以确定按照以上地址顺序排列的。说明在索引插入时，键值从底部插入，而排序仅仅需要修改这个row directory就ok了。这样插入效率比较高。

The reason for this ordering is to improve the efficiency of typical visits to the index block. Consider a range scan, when the kernel has to find all the index keys contained in the search interval [min_extreme, max_extreme] (possibly min_extreme=max_extreme for equality predicates). As it is well known, the kernel first walks the branch blocks to identify the leaf block that contains min_extreme (or max_extreme, but that would be the same for our discussion); then, it has to identify all the keys that are >= min_extreme and = min_extreme, and then walk the next entries in the row directory until it gets to the last one that is
The advantage is huge, since visiting all the keys is a O( N ) operation, while a binary search has only O ( log(2,N) ) complexity.

To fully appreciate this on a concrete example, consider that each key of our example needs 16+2 bytes of storage, hence a perfectly packed 8K block might contain approximately 8000 / 18 = 444 keys. Thanks to the ordering, for the very frequent equality search (classic for Primary Key searches for example), the processing consists essentially of the binary search only - hence the number of keys to be considered are down to about ceil( log(2, 444) ) = 9, thus consuming only 9/444 = 2%
(!!) of the resources.

It is also worth remembering that this way, only a portion of the block is accessed, thus needing less accesses to central memory (it is unlikely in fact that the whole 8K can be found in the processor's caches), thus reducing the elapsed time considerably since stalling for central memory is a very common bottleneck in database systems - and thus improving scalability for the whole system thanks to the reduction of the traffic on the memory bus.

Of course there's a price to be paid for modifications, since each modification has to keep the row directory ordered, thus shifting the slots of the row directory, which is a relatively expensive operation. Oracle pays an higher bill at modification time to save (hugely) at read time.

--modify 代价很大!

As a final observation: note that the branch block visit can be considered a binary search as well, "pre-calculated" by the B+tree structure - the ordering inside a block is just the same divide-and-conquer algorithm applied in a different way.

时间： 2024-10-24 21:19:06

[20111223]索引键值在B tree索引块中的顺序.txt

[20111223]索引键值在B tree索引块中的顺序.txt的相关文章

[20160711索引键值在B tree索引块中的顺序2

[20160711]索引键值在Btree索引块中的顺序3

非分区键的GLOBAL分区索引键值更新会造成麻烦吗？

第六章——根据执行计划优化性能（3）——键值查找

php数组索引与键值操作技巧实例分析_php技巧

php获取数组中键值最大数组项的索引值[原创]_php技巧

索引键的唯一性（1/4）：堆表上的唯一与非唯一非聚集索引的区别

C#中ConcurrentDictionary&amp;lt;TKey, TValue&amp;gt; 类中Tkey键是否排序和支持索引下标搜索

索引数据块中出现热块及Latch的场景示例