[20141217] Record length and block size.txt
--Yesterday I read http://savvinov.com/2014/12/15/4k-bug-is-not-a-bug/,
--which quotes the following:
A couple of weeks back I received an update on my "4k bug" SR (slow multirow DML performance for small block sizes). As
it turns out, the observed behavior is not a bug. It's actually more interesting than that. It is an undocumented
feature that controls the insert mode depending on the estimated row size (the sum of all column sizes). If the
estimated row size exceeds the size of the block, then Oracle switches to row-by-row processing.
--So when the estimated record length exceeds the block size, Oracle switches inserts to row-by-row processing. Let me test this myself with 8K data blocks.
SCOTT@test> @ver1
PORT_STRING VERSION BANNER
------------------------------ -------------- --------------------------------------------------------------------------------
x86_64/Linux 2.4.xx 11.2.0.3.0 Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
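--The tests below assume the default 8K block size; a quick sanity check (assuming select
--privileges on v$parameter):
select name, value from v$parameter where name = 'db_block_size';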
create table t1(x number, x2 varchar2(4000),x3 varchar2(4000)) SEGMENT CREATION IMMEDIATE;
create table t2(x number, x2 varchar2(1000),x3 varchar2(1000)) SEGMENT CREATION IMMEDIATE;
--SEGMENT CREATION IMMEDIATE removes any effect of deferred segment creation from the test.
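--Per the quoted article, the estimated row size is the sum of all column sizes. It can be
--computed from the dictionary (data_length is the declared length for the varchar2 columns
--and 22 for an unconstrained number):
select table_name, sum(data_length) est_row_length
  from user_tab_columns
 where table_name in ('T1', 'T2')
 group by table_name;
--T1: 22+4000+4000 = 8022 bytes, close to the 8K block; T2: 22+1000+1000 = 2022 bytes.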
set autotrace traceonly
insert into t1 select level, rpad(' ', 100, ' '),rpad('a',100,'a') from dual connect by level <=100000;

Execution Plan
----------------------------------------------------------
Plan hash value: 1236776825
------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
------------------------------------------------------------------------------
| 0 | INSERT STATEMENT | | 1 | 2 (0)| 00:00:01 |
| 1 | LOAD TABLE CONVENTIONAL | T1 | | | |
|* 2 | CONNECT BY WITHOUT FILTERING| | | | |
| 3 | FAST DUAL | | 1 | 2 (0)| 00:00:01 |
------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(LEVEL<=100000)

Statistics
----------------------------------------------------------
403 recursive calls
128692 db block gets
3553 consistent gets
3877 physical reads
46041644 redo size
840 bytes sent via SQL*Net to client
852 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
3 sorts (memory)
0 sorts (disk)
100000 rows processed
insert into t2 select level, rpad(' ', 100, ' '),rpad('a',100,'a') from dual connect by level <=100000;

Execution Plan
----------------------------------------------------------
Plan hash value: 1236776825
------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
------------------------------------------------------------------------------
| 0 | INSERT STATEMENT | | 1 | 2 (0)| 00:00:01 |
| 1 | LOAD TABLE CONVENTIONAL | T2 | | | |
|* 2 | CONNECT BY WITHOUT FILTERING| | | | |
| 3 | FAST DUAL | | 1 | 2 (0)| 00:00:01 |
------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(LEVEL<=100000)

Statistics
----------------------------------------------------------
358 recursive calls
30132 db block gets
6227 consistent gets
3094 physical reads
23675572 redo size
843 bytes sent via SQL*Net to client
852 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
3 sorts (memory)
0 sorts (disk)
100000 rows processed
--Note the redo size: 46041644 bytes for t1 versus 23675572 for t2.
--46041644/23675572 = 1.94, so the wide-column table generated roughly twice the redo.
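--To double-check the redo numbers without autotrace (assuming select privileges on the v$ views),
--one can sample the session's cumulative 'redo size' before and after each insert and subtract:
select sn.name, ms.value
  from v$mystat ms, v$statname sn
 where ms.statistic# = sn.statistic#
   and sn.name = 'redo size';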
exec sys.dbms_stats.gather_table_stats ( OwnName => 'SCOTT',TabName => 't1',Estimate_Percent => NULL,Method_Opt => 'FOR ALL COLUMNS SIZE 1 ',Cascade => True ,No_Invalidate => false)
exec sys.dbms_stats.gather_table_stats ( OwnName => 'SCOTT',TabName => 't2',Estimate_Percent => NULL,Method_Opt => 'FOR ALL COLUMNS SIZE 1 ',Cascade => True ,No_Invalidate => false)
SCOTT@test> set autotrace off
SCOTT@test> select table_name,blocks,avg_row_len from dba_tables where owner=user and table_name in ('T1','T2');
TABLE_NAME BLOCKS AVG_ROW_LEN
---------- ---------- -----------
T2 3016 207
T1 3016 207
--Both tables occupy the same number of blocks with the same average row length, as expected: the inserted data is identical and only the declared column lengths differ.
--Now let's try a table with smaller declared columns.
create table t3(x number, x2 varchar2(100),x3 varchar2(100)) SEGMENT CREATION IMMEDIATE;
set autotrace traceonly
insert into t3 select level, rpad(' ', 100, ' '),rpad('a',100,'a') from dual connect by level <=100000;

Execution Plan
----------------------------------------------------------
Plan hash value: 1236776825
------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
------------------------------------------------------------------------------
| 0 | INSERT STATEMENT | | 1 | 2 (0)| 00:00:01 |
| 1 | LOAD TABLE CONVENTIONAL | T3 | | | |
|* 2 | CONNECT BY WITHOUT FILTERING| | | | |
| 3 | FAST DUAL | | 1 | 2 (0)| 00:00:01 |
------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(LEVEL<=100000)

Statistics
----------------------------------------------------------
421 recursive calls
30203 db block gets
6270 consistent gets
3083 physical reads
23683316 redo size
843 bytes sent via SQL*Net to client
852 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
3 sorts (memory)
0 sorts (disk)
100000 rows processed
--Essentially the same statistics as the insert into t2.
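--Side by side (estimated row length = sum of declared column data_lengths):
--table  est_row_length  db block gets  redo size
--T1     8022            128692         46041644
--T2     2022            30132          23675572
--T3     222             30203          23683316
--Only T1, whose estimated row length approaches the 8K block size, shows the inflated block gets and redo.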
--Quoting the author again:
The query below can identify the tables that will have slow bulk DML because of the high maximum row length:
select c.owner,
c.table_name,
sum(data_length) est_row_length,
ts.block_size
from dba_tab_columns c,
(select owner, table_name, tablespace_name
from dba_tables
union select table_owner owner, table_name, tablespace_name
from dba_tab_partitions
) t,
dba_tablespaces ts
where c.owner = t.owner
and c.table_name = t.table_name
and ts.tablespace_name = t.tablespace_name
group by ts.block_size, c.owner, c.table_name
having sum(data_length)>=ts.block_size
order by 1, 2
--Note that his script filters on sum(data_length) >= ts.block_size, so it would not catch T1:
--T1's estimated row length is 22+4000+4000 = 8022 bytes, just under the 8192-byte block size,
--yet T1 clearly hit the row-by-row path. Presumably the internal threshold compares against the
--usable space in a block (block size minus overhead) rather than the raw block size.
For the tables returned by this query, bulk DML will be internally performed row-by-row, decreasing performance and
increasing redo generation. If you want to change this, you'll need to either shrink column size, or migrate the table
to a tablespace with a larger block size.
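--A minimal sketch of the first fix on the test table: shrinking a varchar2 column is allowed
--as long as no existing value exceeds the new length (here the data is only 100 characters wide).
alter table t1 modify (x2 varchar2(1000), x3 varchar2(1000));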
--Take this as an argument against the lazy habit of declaring every string column varchar2(4000)
--regardless of actual need. Don't design tables that way; choose lengths that fit the business data.
--Of course, this only matters for bulk operations.
--As for what row-by-row processing actually is, I'll read the documentation on it and write a follow-up blog post.