[20141217] Record length and block size.txt
--Yesterday I read http://savvinov.com/2014/12/15/4k-bug-is-not-a-bug/,
--which quotes the following:
A couple of weeks back I received an update on my "4k bug" SR (slow multirow DML performance for small block sizes). As
it turns out, the observed behavior is not a bug. It's actually more interesting than that. It is an undocumented
feature that controls the insert mode depending on the estimated row size (the sum of all column sizes). If the
estimated row size exceeds the size of the block, then Oracle switches to row-by-row processing.
--So when the estimated record length exceeds the block size, Oracle switches inserts to row-by-row processing. Let me test this myself with 8K data blocks.
SCOTT@test> @ver1
PORT_STRING VERSION BANNER
------------------------------ -------------- --------------------------------------------------------------------------------
x86_64/Linux 2.4.xx 11.2.0.3.0 Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
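--The tests below assume the default 8K block size; a quick sanity check (assuming select
--privileges on v$parameter):
select name, value from v$parameter where name = 'db_block_size';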
create table t1(x number, x2 varchar2(4000),x3 varchar2(4000)) SEGMENT CREATION IMMEDIATE;
create table t2(x number, x2 varchar2(1000),x3 varchar2(1000)) SEGMENT CREATION IMMEDIATE;
--SEGMENT CREATION IMMEDIATE removes any effect of deferred segment creation from the test.
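--Per the quoted article, the estimated row size is the sum of all column sizes. It can be
--computed from the dictionary (data_length is the declared length for the varchar2 columns
--and 22 for an unconstrained number):
select table_name, sum(data_length) est_row_length
  from user_tab_columns
 where table_name in ('T1', 'T2')
 group by table_name;
--T1: 22+4000+4000 = 8022 bytes, close to the 8K block; T2: 22+1000+1000 = 2022 bytes.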
set autotrace traceonly
insert into t1 select level, rpad(' ', 100, ' '),rpad('a',100,'a') from dual connect by level <=100000;

Execution Plan
----------------------------------------------------------
Plan hash value: 1236776825
------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
------------------------------------------------------------------------------
| 0 | INSERT STATEMENT | | 1 | 2 (0)| 00:00:01 |
| 1 | LOAD TABLE CONVENTIONAL | T1 | | | |
|* 2 | CONNECT BY WITHOUT FILTERING| | | | |
| 3 | FAST DUAL | | 1 | 2 (0)| 00:00:01 |
------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(LEVEL<=100000)

Statistics
----------------------------------------------------------
403 recursive calls
128692 db block gets
3553 consistent gets
3877 physical reads
46041644 redo size
840 bytes sent via SQL*Net to client
852 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
3 sorts (memory)
0 sorts (disk)
100000 rows processed
insert into t2 select level, rpad(' ', 100, ' '),rpad('a',100,'a') from dual connect by level <=100000;

Execution Plan
----------------------------------------------------------
Plan hash value: 1236776825
------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
------------------------------------------------------------------------------
| 0 | INSERT STATEMENT | | 1 | 2 (0)| 00:00:01 |
| 1 | LOAD TABLE CONVENTIONAL | T2 | | | |
|* 2 | CONNECT BY WITHOUT FILTERING| | | | |
| 3 | FAST DUAL | | 1 | 2 (0)| 00:00:01 |
------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(LEVEL<=100000)

Statistics
----------------------------------------------------------
358 recursive calls
30132 db block gets
6227 consistent gets
3094 physical reads
23675572 redo size
843 bytes sent via SQL*Net to client
852 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
3 sorts (memory)
0 sorts (disk)
100000 rows processed
--Note the redo size: 46041644 bytes for t1 versus 23675572 for t2.
--46041644/23675572 = 1.94, so the wide-column table generated roughly twice the redo.
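--To double-check the redo numbers without autotrace (assuming select privileges on the v$ views),
--one can sample the session's cumulative 'redo size' before and after each insert and subtract:
select sn.name, ms.value
  from v$mystat ms, v$statname sn
 where ms.statistic# = sn.statistic#
   and sn.name = 'redo size';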
exec sys.dbms_stats.gather_table_stats ( OwnName => 'SCOTT',TabName => 't1',Estimate_Percent => NULL,Method_Opt => 'FOR ALL COLUMNS SIZE 1 ',Cascade => True ,No_Invalidate => false)
exec sys.dbms_stats.gather_table_stats ( OwnName => 'SCOTT',TabName => 't2',Estimate_Percent => NULL,Method_Opt => 'FOR ALL COLUMNS SIZE 1 ',Cascade => True ,No_Invalidate => false)
SCOTT@test> set autotrace off
SCOTT@test> select table_name,blocks,avg_row_len from dba_tables where owner=user and table_name in ('T1','T2');
TABLE_NAME BLOCKS AVG_ROW_LEN
---------- ---------- -----------
T2 3016 207
T1 3016 207
--Both tables occupy the same number of blocks with the same average row length, as expected: the inserted data is identical and only the declared column lengths differ.
--Now let's try a table with smaller declared columns.
create table t3(x number, x2 varchar2(100),x3 varchar2(100)) SEGMENT CREATION IMMEDIATE;
set autotrace traceonly
insert into t3 select level, rpad(' ', 100, ' '),rpad('a',100,'a') from dual connect by level <=100000;

Execution Plan
----------------------------------------------------------
Plan hash value: 1236776825
------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
------------------------------------------------------------------------------
| 0 | INSERT STATEMENT | | 1 | 2 (0)| 00:00:01 |
| 1 | LOAD TABLE CONVENTIONAL | T3 | | | |
|* 2 | CONNECT BY WITHOUT FILTERING| | | | |
| 3 | FAST DUAL | | 1 | 2 (0)| 00:00:01 |
------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(LEVEL<=100000)

Statistics
----------------------------------------------------------
421 recursive calls
30203 db block gets
6270 consistent gets
3083 physical reads
23683316 redo size
843 bytes sent via SQL*Net to client
852 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
3 sorts (memory)
0 sorts (disk)
100000 rows processed
--Essentially the same statistics as the insert into t2.
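--Side by side (estimated row length = sum of declared column data_lengths):
--table  est_row_length  db block gets  redo size
--T1     8022            128692         46041644
--T2     2022            30132          23675572
--T3     222             30203          23683316
--Only T1, whose estimated row length approaches the 8K block size, shows the inflated block gets and redo.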
--Quoting the author again:
The query below can identify the tables that will have slow bulk DML because of the high maximum row length:
select c.owner,
c.table_name,
sum(data_length) est_row_length,
ts.block_size
from dba_tab_columns c,
(select owner, table_name, tablespace_name
from dba_tables
union select table_owner owner, table_name, tablespace_name
from dba_tab_partitions
) t,
dba_tablespaces ts
where c.owner = t.owner
and c.table_name = t.table_name
and ts.tablespace_name = t.tablespace_name
group by ts.block_size, c.owner, c.table_name
having sum(data_length)>=ts.block_size
order by 1, 2
--Note that his script filters on sum(data_length) >= ts.block_size, so it would not catch T1:
--T1's estimated row length is 22+4000+4000 = 8022 bytes, just under the 8192-byte block size,
--yet T1 clearly hit the row-by-row path. Presumably the internal threshold compares against the
--usable space in a block (block size minus overhead) rather than the raw block size.
For the tables returned by this query, bulk DML will be internally performed row-by-row, decreasing performance and
increasing redo generation. If you want to change this, you'll need to either shrink column size, or migrate the table
to a tablespace with a larger block size.
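--A minimal sketch of the first fix on the test table: shrinking a varchar2 column is allowed
--as long as no existing value exceeds the new length (here the data is only 100 characters wide).
alter table t1 modify (x2 varchar2(1000), x3 varchar2(1000));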
--Take this as an argument against the lazy habit of declaring every string column varchar2(4000)
--regardless of actual need. Don't design tables that way; choose lengths that fit the business data.
--Of course, this only matters for bulk operations.
--As for what row-by-row processing actually is, I'll read the documentation on it and write a follow-up blog post.