[20140209] Row migration and expdp export.txt
A while back I was chatting with someone about a system upgrade in which an expdp export was running very slowly. I know that
system well, and the moment he named the slow table I blurted out: doesn't that table have a large number of migrated rows?
I remember a remark from an Oracle class years ago: if every table in the application schema has PCTFREE set to 10, then
nobody is really administering that database. By that standard, most database systems in China have no DBA.
I used to pay close attention to row migration, though I have grown lazier about it; many DBAs do not take it very seriously
either, and over time I also stopped making these adjustments. The table in question goes through an update that flips a flag
from '0' to '1' and at the same time records the modification date in a remark column. The date is stored as a character
string, so a value like '2012-02-02 10:10:10' takes 19 characters, which means every updated row grows by 19 bytes. If the
original rows are short, each block holds many of them, and a great many rows end up migrating. I once tried hard to eliminate
row migration on this table completely, but eventually gave up: PCTFREE had to approach 40 before migration mostly went away,
and that wastes space, because rows are not all updated before the block fills to the PCTFREE threshold; inserts and updates
are interleaved. Another option is ALTER TABLE ... MINIMIZE RECORDS_PER_BLOCK; see the following link:
http://blog.itpub.net/267265/viewspace-763315/
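As a rough sanity check on that "PCTFREE near 40" observation, here is a sketch with assumed numbers (the 30-byte starting row size is a guess, not a measurement from this system): if a row inserted at about 30 bytes later grows by the 19-byte date string to about 49 bytes, the block must keep roughly growth/final of its space free for the update to stay in place.

```python
# Back-of-the-envelope estimate of the PCTFREE needed to absorb
# in-block row growth.  Row sizes here are illustrative assumptions.
def pctfree_needed(row_before: int, row_after: int) -> float:
    """Percent of the block that must stay free so a row inserted at
    row_before bytes can grow to row_after bytes without migrating."""
    growth = row_after - row_before
    return 100.0 * growth / row_after

# 30-byte row plus a 19-character date string -> ~49 bytes.
print(round(pctfree_needed(30, 49)))  # about 39, close to the ~40 observed
```

The estimate ignores block overhead and interleaved inserts, which is exactly why in practice even PCTFREE 40 only "mostly" eliminated the migration.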
Below is a test showing how expdp behaves when a table contains a large number of migrated rows.
1. Set up the test case:
SCOTT@test> @ver
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
create table t (id number, name varchar2(20)) pctfree 0;
insert into t
with a as (select /*+ materialize */ rownum id from dual connect by level <= 1000)
select rownum, null from a x, a where rownum <= 1000000;
commit;
update t set name = lpad(id, 20, 'y');
commit;
--Create the CHAINED_ROWS table.
--SQL> analyze table t list chained rows into chained_rows;
--Note: the chained_rows table must exist first ($ORACLE_HOME/rdbms/admin/utlchain.sql), preferably not in the SYSTEM tablespace.
--truncate table chained_rows;
SCOTT@test> select count(*) from chained_rows;
COUNT(*)
----------
1000000
--A large number of rows have migrated.
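For completeness, a sketch of how that count was produced, plus an alternative that does not need the utlchain.sql table (table name T per this test; CHAIN_CNT counts chained and migrated rows together):

```sql
-- Record the rowids of chained/migrated rows of T in CHAINED_ROWS.
ANALYZE TABLE t LIST CHAINED ROWS INTO chained_rows;
SELECT COUNT(*) FROM chained_rows WHERE table_name = 'T';

-- Alternative: ANALYZE ... COMPUTE STATISTICS populates CHAIN_CNT.
ANALYZE TABLE t COMPUTE STATISTICS;
SELECT chain_cnt FROM user_tables WHERE table_name = 'T';
```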
2. Export with expdp:
--CREATE OR REPLACE DIRECTORY DATA_PUMP_DIR AS '/u01/app/oracle11g/admin/test/dpdump/';
--GRANT READ, WRITE ON DIRECTORY DATA_PUMP_DIR TO EXP_FULL_DATABASE;
--GRANT READ, WRITE ON DIRECTORY DATA_PUMP_DIR TO IMP_FULL_DATABASE;
--GRANT READ, WRITE ON DIRECTORY DATA_PUMP_DIR TO SCOTT;
--exec dbms_workload_repository.create_snapshot();
expdp scott/xxxxxx directory=DATA_PUMP_DIR tables=t dumpfile=t.dmp
--exec dbms_workload_repository.create_snapshot();
--Querying select * from v$session_wait where wait_class <> 'Idle' during the export shows a large number of db file sequential read wait events.
$ expdp scott/xxxxxx directory=DATA_PUMP_DIR tables=t dumpfile=t.dmp
Export: Release 11.2.0.3.0 - Production on Mon Feb 10 10:38:30 2014
Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.
Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Starting "SCOTT"."SYS_EXPORT_TABLE_01": scott/******** directory=DATA_PUMP_DIR tables=t dumpfile=t.dmp
Estimate in progress using BLOCKS method...
Processing object type TABLE_EXPORT/TABLE/TABLE_DATA
Total estimation using BLOCKS method: 72 MB
Processing object type TABLE_EXPORT/TABLE/TABLE
. . exported "SCOTT"."T" 28.59 MB 1000000 rows
Master table "SCOTT"."SYS_EXPORT_TABLE_01" successfully loaded/unloaded
******************************************************************************
Dump file set for SCOTT.SYS_EXPORT_TABLE_01 is:
/u01/app/oracle11g/admin/test/dpdump/t.dmp
Job "SCOTT"."SYS_EXPORT_TABLE_01" successfully completed at 10:38:55
--Across several runs, this completes in about 25 seconds.
--Now create a new table tx and export that for comparison.
SCOTT@test> create table tx as select * from t;
Table created.
$ expdp scott/xxxxxx directory=DATA_PUMP_DIR tables=tx dumpfile=tx.dmp
Export: Release 11.2.0.3.0 - Production on Mon Feb 10 10:50:51 2014
Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.
Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Starting "SCOTT"."SYS_EXPORT_TABLE_01": scott/******** directory=DATA_PUMP_DIR tables=tx dumpfile=tx.dmp
Estimate in progress using BLOCKS method...
Processing object type TABLE_EXPORT/TABLE/TABLE_DATA
Total estimation using BLOCKS method: 35 MB
Processing object type TABLE_EXPORT/TABLE/TABLE
. . exported "SCOTT"."TX" 28.59 MB 1000000 rows
Master table "SCOTT"."SYS_EXPORT_TABLE_01" successfully loaded/unloaded
******************************************************************************
Dump file set for SCOTT.SYS_EXPORT_TABLE_01 is:
/u01/app/oracle11g/admin/test/dpdump/tx.dmp
Job "SCOTT"."SYS_EXPORT_TABLE_01" successfully completed at 10:51:00
--Across several runs, this completes in about 10 seconds.
3. Trace with strace:
Given the db file sequential read wait events: after several runs the blocks should all be cached, so few db file sequential
reads should remain. What, then, is making the export slow?
$ strace -c -o /tmp/aa1 expdp scott/xxxxxx directory=DATA_PUMP_DIR tables=t dumpfile=t.dmp
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
36.82 0.006311 37 169 2 read
12.45 0.002134 14 158 71 open
10.89 0.001866 30 62 write
8.61 0.001476 16 95 close
4.56 0.000782 9 89 mmap
3.99 0.000683 683 1 clone
3.61 0.000618 12 53 munmap
2.67 0.000457 7 64 fstat
2.38 0.000408 11 38 brk
1.70 0.000291 7 42 rt_sigaction
1.64 0.000281 7 38 fcntl
1.47 0.000252 7 36 lseek
1.35 0.000232 9 27 18 stat
1.02 0.000174 9 19 mprotect
0.88 0.000151 151 1 execve
0.65 0.000112 16 7 socket
0.65 0.000111 8 14 gettimeofday
0.58 0.000100 8 13 rt_sigprocmask
0.54 0.000093 12 8 6 access
0.46 0.000079 16 5 4 connect
0.40 0.000069 7 10 uname
0.40 0.000068 34 2 getdents64
0.34 0.000058 7 8 getrlimit
0.29 0.000050 7 7 getuid
0.20 0.000035 12 3 recvmsg
0.18 0.000030 30 1 readlink
0.16 0.000027 9 3 getcwd
0.15 0.000025 8 3 futex
0.13 0.000023 12 2 pipe
0.13 0.000022 22 1 sendto
0.12 0.000021 11 2 bind
0.08 0.000014 7 2 getsockname
0.08 0.000014 7 2 setrlimit
0.08 0.000013 7 2 times
0.08 0.000013 13 1 _sysctl
0.06 0.000010 10 1 ioctl
0.04 0.000007 7 1 getppid
0.04 0.000007 7 1 arch_prctl
0.04 0.000007 7 1 gettid
0.04 0.000007 7 1 time
0.04 0.000007 7 1 set_tid_address
------ ----------- ----------- --------- --------- ----------------
100.00 0.017138 994 101 total
$ strace -c -o /tmp/aa1 expdp scott/xxxxxx directory=DATA_PUMP_DIR tables=tx dumpfile=tx.dmp
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
33.23 0.004392 26 169 2 read
15.27 0.002018 33 62 write
13.14 0.001737 11 158 71 open
5.23 0.000691 691 1 clone
4.84 0.000640 7 89 mmap
4.72 0.000624 7 95 close
4.35 0.000575 11 53 munmap
2.50 0.000330 5 64 fstat
2.38 0.000315 8 38 brk
1.65 0.000218 5 42 rt_sigaction
1.58 0.000209 6 38 fcntl
1.40 0.000185 5 36 lseek
1.31 0.000173 6 27 18 stat
1.18 0.000156 156 1 execve
1.01 0.000133 7 19 mprotect
0.63 0.000083 12 7 socket
0.62 0.000082 16 5 4 connect
0.61 0.000081 10 8 6 access
0.57 0.000076 5 14 gettimeofday
0.54 0.000071 5 13 rt_sigprocmask
0.49 0.000065 33 2 getdents64
0.34 0.000045 5 10 uname
0.30 0.000039 5 8 getrlimit
0.26 0.000034 5 7 getuid
0.25 0.000033 33 1 readlink
0.22 0.000029 10 3 recvmsg
0.20 0.000026 26 1 sendto
0.17 0.000023 8 3 getcwd
0.17 0.000022 11 2 pipe
0.15 0.000020 7 3 futex
0.14 0.000018 9 2 bind
0.08 0.000011 6 2 getsockname
0.08 0.000011 11 1 _sysctl
0.08 0.000011 6 2 setrlimit
0.08 0.000010 5 2 times
0.06 0.000008 8 1 ioctl
0.04 0.000005 5 1 arch_prctl
0.04 0.000005 5 1 gettid
0.04 0.000005 5 1 time
0.04 0.000005 5 1 set_tid_address
0.03 0.000004 4 1 getppid
------ ----------- ----------- --------- --------- ----------------
100.00 0.013218 994 101 total
--From the strace summaries alone it is hard to see where the difference comes from.
4. Common fixes:
The comparison above shows that for a large table with heavy row migration, the time difference is significant.
The fundamental fix is to review the application and set a larger PCTFREE so that migration stops occurring.
Moving the table (to the same or another tablespace), online redefinition, or CTAS can all repair migration that has already
occurred, but they are expensive, especially for a very large table, and some indexes must be rebuilt afterwards.
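A minimal sketch of the move-and-rebuild approach (index name t_pk is a placeholder; the PCTFREE value would come from an analysis like the one in section 1):

```sql
-- Rewrite the table with a larger PCTFREE; this reorganizes every row
-- and removes existing migrated rows, but marks the indexes UNUSABLE.
ALTER TABLE t MOVE PCTFREE 20;

-- Every index on the moved table must then be rebuilt.
ALTER INDEX t_pk REBUILD;
```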
But how can the expdp itself be sped up in this situation? Googling "Row Migration expdp" turned up these links:
http://www.ora-solutions.net/web/2012/09/12/datapump-export-suffering-from-oracle-row-migration/
http://nzdba.wordpress.com/2013/06/30/migration-vs-datapump/
--Both mention Data Pump access methods; an excerpt:
With datapump there are 2 different access methods: EXTERNAL_TABLE and DIRECT_PATH. Usually, the datapump utility
decides on its own which method to use. It turned out that with EXTERNAL_TABLE, the table export takes only 10 minutes
and does not perform these single-block I/O. It only appears with DIRECT_PATH.
--expdp help=y does not list an access_method parameter; another hidden one, apparently? I ran the following:
$ expdp scott/xxxxxx directory=DATA_PUMP_DIR tables=t access_method=external_table dumpfile=t_good.dmp
Export: Release 11.2.0.3.0 - Production on Mon Feb 10 11:25:46 2014
Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.
Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Starting "SCOTT"."SYS_EXPORT_TABLE_01": scott/******** directory=DATA_PUMP_DIR tables=t access_method=external_table dumpfile=t_good.dmp
Estimate in progress using BLOCKS method...
Processing object type TABLE_EXPORT/TABLE/TABLE_DATA
Total estimation using BLOCKS method: 72 MB
Processing object type TABLE_EXPORT/TABLE/TABLE
. . exported "SCOTT"."T" 28.59 MB 1000000 rows
Master table "SCOTT"."SYS_EXPORT_TABLE_01" successfully loaded/unloaded
******************************************************************************
Dump file set for SCOTT.SYS_EXPORT_TABLE_01 is:
/u01/app/oracle11g/admin/test/dpdump/t_good.dmp
Job "SCOTT"."SYS_EXPORT_TABLE_01" successfully completed at 11:25:56
--Across several runs, this completes in about 10 seconds.
5. Addendum:
--From http://blog.csdn.net/tianlesoftware/article/details/6090757:
Important Need-To-Know's when the parameter ACCESS_METHOD is specified for a job:
The parameter ACCESS_METHOD is an undocumented parameter and should only be used when requested by Oracle Support.
If the parameter is not specified, then Data Pump will automatically choose the best method to load or unload the data.
If import Data Pump cannot choose due to conflicting restrictions, an error will be reported:
ORA-31696: unable to export/import TABLE_DATA:"SCOTT"."EMP" using client specified AUTOMATIC method
The parameter can only be specified when the Data Pump job is initially started (i.e. the parameter cannot be specified when the job is restarted).
If the parameter is specified, the method of loading or unloading the data is enforced on all tables that need to be loaded or unloaded with the job.
Enforcing a specific method may result in a slower performance of the overall Data Pump job, or errors such as:
...
Processing object type TABLE_EXPORT/TABLE/TABLE_DATA
ORA-31696: unable to export/import TABLE_DATA:"SCOTT"."MY_TAB" using client specified DIRECT_PATH method ...
Summary:
When a table has heavy row migration, adding access_method=external_table to the expdp command can noticeably speed up the export.