[20171122]参数filesystemio_options=setall.txt
--//首先给出oracle官方的解析:
https://docs.oracle.com/cd/E11882_01/server.112/e41573/os.htm#PFGRF94410
9.1.1.2 FILESYSTEMIO_OPTIONS Initialization Parameter
You can use the FILESYSTEMIO_OPTIONS initialization parameter to enable or disable asynchronous I/O or direct I/O on
file system files. This parameter is platform-specific and has a default value that is best for a particular platform.
FILESYTEMIO_OPTIONS can be set to one of the following values:
ASYNCH: enable asynchronous I/O on file system files, which has no timing requirement for transmission.
DIRECTIO: enable direct I/O on file system files, which bypasses the buffer cache.
SETALL: enable both asynchronous and direct I/O on file system files.
NONE: disable both asynchronous and direct I/O on file system files.
--//本文测试filesystemio_options=setall,SETALL: enable both asynchronous and direct I/O on file system files.
--//意味者OS的缓存毫无用处.通过以下测试说明问题.
1.环境:
--// 说明数据库已经全部移动到本地磁盘.
SYS@book> @ &r/ver1
PORT_STRING VERSION BANNER
-------------------- ---------- ----------------------------------------------------------------------------
x86_64/Linux 2.4.xx 11.2.0.4.0 Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
SYS@book> alter system set filesystemio_options=setall scope=spfile;
System altered.
--//重启数据库.
2.建立测试环境:
--//以前已经测试直接路径读的情况,为了避免直接路径读干扰关闭直接路径读.
SCOTT@book> alter system set "_serial_direct_read"=never scope=memory;
Session altered.
SCOTT@book> create table t as select rownum id from dual connect by level<=2;
Table created.
SCOTT@book> ALTER TABLE t MINIMIZE RECORDS_PER_BLOCK ;
Table altered.
--//这样可以实现每块2条记录.
SCOTT@book> insert into t select rownum+2 from dual connect by level <=8e4-2;
79998 rows created.
SCOTT@book> commit ;
Commit complete.
--//分析表略.
SCOTT@book> select OWNER,SEGMENT_NAME,SEGMENT_TYPE,HEADER_FILE,HEADER_BLOCK,BYTES,BLOCKS from dba_segments where owner=user and segment_name='T';
OWNER SEGMENT_NAME SEGMENT_TYPE HEADER_FILE HEADER_BLOCK BYTES BLOCKS
------ -------------------- ------------------ ----------- ------------ ---------- ----------
SCOTT T TABLE 4 546 335544320 40960
--//占用335544320/1024/1024=320M,40960块.
SCOTT@book> select object_id,data_object_id from dba_objects where owner=user and object_name='T';
OBJECT_ID DATA_OBJECT_ID
---------- --------------
90736 90736
3.测试os不缓存与oracle的数据缓存(大部分)的情况:
$ find /mnt/ramdisk/book/ -name "*.*" -print0 | xargs -0 -I{} cachedel {}
$ cachedel /mnt/ramdisk/book/users01.dbf
$ cachestats /mnt/ramdisk/book/users01.dbf
/mnt/ramdisk/book/users01.dbf pages in cache: 0/554498 (0.0%) [filesize=2217992.0K, pagesize=4K]
SCOTT@book> select count(*) from v$bh where OBJD=90736 and STATUS<>'free';
COUNT(*)
----------
40196
--//如果显示很少,执行1次select count(*) from t;让数据缓存相应的数据块.
SCOTT@book> set timing on
SCOTT@book> select count(*) from t;
COUNT(*)
----------
80000
Elapsed: 00:00:00.09
--//访问需要0.09秒.
SCOTT@book> select count(*) from v$bh where OBJD=90736 and STATUS<>'free';
COUNT(*)
----------
40196
$ cachestats /mnt/ramdisk/book/users01.dbf
/mnt/ramdisk/book/users01.dbf pages in cache: 0/554498 (0.0%) [filesize=2217992.0K, pagesize=4K]
--//注意看os不缓存/mnt/ramdisk/book/users01.dbf,filesystemio_options=setall本意从这里就可以看出来.
--//就是os不缓存数据文件.也就是无论如何读写,数据库都不会缓存数据文件,日志文件.
$ find /mnt/ramdisk/book/ -name "*.*" -print0 | xargs -0 -I{} cachedel {}
$ find /mnt/ramdisk/book/ -name "control*.*" -print0 | xargs -0 -I{} cachestats {}
/mnt/ramdisk/book/control02.ctl pages in cache: 0/2612 (0.0%) [filesize=10448.0K, pagesize=4K]
/mnt/ramdisk/book/control01.ctl pages in cache: 0/2612 (0.0%) [filesize=10448.0K, pagesize=4K]
$ find /mnt/ramdisk/book/ -name "redo0[123].*" -print0 | xargs -0 -I{} cachestats {}
/mnt/ramdisk/book/redo03.log pages in cache: 0/12801 (0.0%) [filesize=51200.5K, pagesize=4K]
/mnt/ramdisk/book/redo01.log pages in cache: 0/12801 (0.0%) [filesize=51200.5K, pagesize=4K]
/mnt/ramdisk/book/redo02.log pages in cache: 0/12801 (0.0%) [filesize=51200.5K, pagesize=4K]
--//清除os缓存后,无论数据库做如何操作,os都不会缓存oracle相关的文件.换一句话.oracle要读取数据不会利用os的缓存.
$ find /mnt/ramdisk/book/ -name "*.*" -print0 | xargs -0 -I{} cachestats {}
/mnt/ramdisk/book/redo03.log pages in cache: 0/12801 (0.0%) [filesize=51200.5K, pagesize=4K]
/mnt/ramdisk/book/system01.dbf pages in cache: 0/194562 (0.0%) [filesize=778248.0K, pagesize=4K]
/mnt/ramdisk/book/undotbs01.dbf pages in cache: 0/275202 (0.0%) [filesize=1100808.0K, pagesize=4K]
/mnt/ramdisk/book/control02.ctl pages in cache: 0/2612 (0.0%) [filesize=10448.0K, pagesize=4K]
/mnt/ramdisk/book/redostb04.log pages in cache: 0/12801 (0.0%) [filesize=51200.5K, pagesize=4K]
/mnt/ramdisk/book/redostb03.log pages in cache: 0/12801 (0.0%) [filesize=51200.5K, pagesize=4K]
/mnt/ramdisk/book/redo01.log pages in cache: 0/12801 (0.0%) [filesize=51200.5K, pagesize=4K]
/mnt/ramdisk/book/sysaux01.dbf_20171120 pages in cache: 0/240642 (0.0%) [filesize=962568.0K, pagesize=4K]
/mnt/ramdisk/book/redostb01.log pages in cache: 0/12801 (0.0%) [filesize=51200.5K, pagesize=4K]
/mnt/ramdisk/book/sysaux01.dbf pages in cache: 0/240642 (0.0%) [filesize=962568.0K, pagesize=4K]
/mnt/ramdisk/book/redostb02.log pages in cache: 0/12801 (0.0%) [filesize=51200.5K, pagesize=4K]
/mnt/ramdisk/book/temp01.dbf pages in cache: 0/105986 (0.0%) [filesize=423944.0K, pagesize=4K]
/mnt/ramdisk/book/control01.ctl pages in cache: 0/2612 (0.0%) [filesize=10448.0K, pagesize=4K]
/mnt/ramdisk/book/users01.dbf pages in cache: 0/554498 (0.0%) [filesize=2217992.0K, pagesize=4K]
/mnt/ramdisk/book/tea01.dbf pages in cache: 0/10242 (0.0%) [filesize=40968.0K, pagesize=4K]
/mnt/ramdisk/book/example01.dbf pages in cache: 0/88642 (0.0%) [filesize=354568.0K, pagesize=4K]
/mnt/ramdisk/book/redo02.log pages in cache: 0/12801 (0.0%) [filesize=51200.5K, pagesize=4K]
4.测试os不缓存,而oracle的数据不缓存呢:
# cachedel /mnt/ramdisk/book/users01.dbf
$ cachestats /mnt/ramdisk/book/users01.dbf
/mnt/ramdisk/book/users01.dbf pages in cache: 0/554498 (0.0%) [filesize=2217992.0K, pagesize=4K]
SCOTT@book> select count(*) from v$bh where OBJD=90736 and STATUS<>'free';
COUNT(*)
----------
40196
SCOTT@book> alter system flush buffer_cache;
System altered.
SCOTT@book> select count(*) from v$bh where OBJD=90736 and STATUS<>'free';
COUNT(*)
----------
0
SCOTT@book> select count(*) from t;
COUNT(*)
----------
80000
Elapsed: 00:00:01.00
--//你可以发现需要1.00秒.
$ cachestats /mnt/ramdisk/book/users01.dbf
/mnt/ramdisk/book/users01.dbf pages in cache: 0/554498 (0.0%) [filesize=2217992.0K, pagesize=4K]
--//可以发现os缓存还是0.
5.测试os缓存,而oracle的数据不缓存的情况:
$ dd if=/mnt/ramdisk/book/users01.dbf of=/dev/null bs=10M
216+1 records in
216+1 records out
2271223808 bytes (2.3 GB) copied, 3.6771 seconds, 618 MB/s
$ cachestats /mnt/ramdisk/book/users01.dbf
/mnt/ramdisk/book/users01.dbf pages in cache: 554498/554498 (100.0%) [filesize=2217992.0K, pagesize=4K]
SCOTT@book> alter system flush buffer_cache;
System altered.
SCOTT@book> select count(*) from v$bh where OBJD=90736 and STATUS<>'free';
COUNT(*)
----------
0
SCOTT@book> select count(*) from t;
COUNT(*)
----------
80000
Elapsed: 00:00:01.05
--//时间依旧是1秒.os缓存毫无用处.
SCOTT@book> select count(*) from v$bh where OBJD=90736 and STATUS<>'free';
COUNT(*)
----------
40196
Elapsed: 00:00:00.08
SCOTT@book> select count(*) from t;
COUNT(*)
----------
80000
Elapsed: 00:00:00.09
--//一旦数据库缓存相应数据块,查询很快.
--//其它情况不再测试,设置filesystemio_options=setall,采用是异步IO而直接读写相应数据文件.
--//而os的缓存对查询性能改善无用,也就是oracle会绕过os缓存直接读写数据文件.
--//讲的再通俗一点任何读写操作都是硬读,硬写数据文件.日志文件,控制文件.
$ find /mnt/ramdisk/book/ -name "*.*" -print0 | xargs -0 -I{} cachestats {}
/mnt/ramdisk/book/redo03.log pages in cache: 0/12801 (0.0%) [filesize=51200.5K, pagesize=4K]
/mnt/ramdisk/book/system01.dbf pages in cache: 0/194562 (0.0%) [filesize=778248.0K, pagesize=4K]
/mnt/ramdisk/book/undotbs01.dbf pages in cache: 0/275202 (0.0%) [filesize=1100808.0K, pagesize=4K]
/mnt/ramdisk/book/control02.ctl pages in cache: 0/2612 (0.0%) [filesize=10448.0K, pagesize=4K]
/mnt/ramdisk/book/redostb04.log pages in cache: 0/12801 (0.0%) [filesize=51200.5K, pagesize=4K]
/mnt/ramdisk/book/redostb03.log pages in cache: 0/12801 (0.0%) [filesize=51200.5K, pagesize=4K]
/mnt/ramdisk/book/redo01.log pages in cache: 0/12801 (0.0%) [filesize=51200.5K, pagesize=4K]
/mnt/ramdisk/book/sysaux01.dbf_20171120 pages in cache: 0/240642 (0.0%) [filesize=962568.0K, pagesize=4K]
/mnt/ramdisk/book/redostb01.log pages in cache: 0/12801 (0.0%) [filesize=51200.5K, pagesize=4K]
/mnt/ramdisk/book/sysaux01.dbf pages in cache: 0/240642 (0.0%) [filesize=962568.0K, pagesize=4K]
/mnt/ramdisk/book/redostb02.log pages in cache: 0/12801 (0.0%) [filesize=51200.5K, pagesize=4K]
/mnt/ramdisk/book/temp01.dbf pages in cache: 0/105986 (0.0%) [filesize=423944.0K, pagesize=4K]
/mnt/ramdisk/book/control01.ctl pages in cache: 0/2612 (0.0%) [filesize=10448.0K, pagesize=4K]
/mnt/ramdisk/book/users01.dbf pages in cache: 554492/554498 (100.0%) [filesize=2217992.0K, pagesize=4K]
/mnt/ramdisk/book/tea01.dbf pages in cache: 0/10242 (0.0%) [filesize=40968.0K, pagesize=4K]
/mnt/ramdisk/book/example01.dbf pages in cache: 0/88642 (0.0%) [filesize=354568.0K, pagesize=4K]
/mnt/ramdisk/book/redo02.log pages in cache: 0/12801 (0.0%) [filesize=51200.5K, pagesize=4K]
--//可以发现除了我执行dd缓存/mnt/ramdisk/book/users01.dbf文件,其它的缓存还是0.
--//这样带来一个"好处"就是一定程度减少数据文件,日志文件丢失,避开os缓存的影响.
--//也正是os不会缓存这些文件,你可以相应设置sga等相关参数大一些,避免一些内存浪费.
6.存疑:
--//我在测试中FILESYSTEMIO_OPTIONS=asynch中自己一直存在一个疑问:
--//就是要一定程度冒丢失数据的风险.
--//而FILESYSTEMIO_OPTIONS=setall因为os缓存没用.这样我觉得一定程度减少数据丢失风险,但是一定程度增加IO操作,
--//如果磁盘性能不好,这样会相对慢一些.
--//换一句话讲如果你数据库本身存在IO问题,这样的设置可能会提前暴露这些问题.
--//我个人的建议如果你内存不足,而磁盘存储很好(磁盘IO很快),可以采用这样的模式.
SELECT file_no
,filetype_id
,filetype_name
,ASYNCH_IO
,access_method
,RETRIES_ON_ERROR
FROM V$IOSTAT_FILE order by 1;
FILE_NO FILETYPE_ID FILETYPE_NAME ASYNCH_IO ACCESS_METH RETRIES_ON_ERROR
---------- ----------- ---------------------------- --------- ----------- ----------------
0 0 Other ASYNC_OFF OS_LIB 0
0 1 Control File ASYNC_OFF 0
0 18 Data Pump Dump File ASYNC_OFF 0
0 17 Flashback Log ASYNC_OFF 0
0 12 Data File Copy ASYNC_OFF 0
0 11 Archive Log Backup ASYNC_OFF 0
0 10 Data File Incremental Backup ASYNC_OFF 0
0 9 Data File Backup ASYNC_OFF 0
0 3 Log File ASYNC_OFF 0
0 4 Archive Log ASYNC_OFF 0
1 6 Temp File ASYNC_ON OS_LIB 0
1 2 Data File ASYNC_ON OS_LIB 0
2 2 Data File ASYNC_ON OS_LIB 0
3 2 Data File ASYNC_ON OS_LIB 0
4 2 Data File ASYNC_ON OS_LIB 0
5 2 Data File ASYNC_ON OS_LIB 0
6 2 Data File ASYNC_ON OS_LIB 0
17 rows selected.
--//找到一个链接:http://www.cnblogs.com/sopost/p/3589731.html,摘要如下:
LGWR 会绕过操作系统的缓冲,直接写入数据文件中,以确保REDO LOG 的信息不会因为操作系统出现故障(比如宕机)而丢失要求确保写入
REDO LOG 文件的数据。
--//这点很重要!!如何证明呢?我的感觉在设置filesystemio_options=asynch的情况下不是这样,而是也会OS缓存.我的测试:(略)
--//换一句话讲在filesystemio_options=setall的模式下,才是这样的操作模式,而filesystemio_options=asynch不是(我个人的理解).
实际上,虽然Oracle 数据库使用了绕过缓冲直接写REDO LOG 文件的方法,以避免操作系统故障导致的数据丢失,不过我们还是无法确保
这些数据已经确确实实被写到了物理磁盘上。因为我们RDBMS 使用的绝大多数存储系统都是带有写缓冲的,写缓冲可以有效的提高存储系
统写性能,不过也带来了另外的一个问题,就是说一旦存储出现故障,可能会导致REDO LOG 的信息丢失,甚至导致REDO LOG 出现严重损
坏。存储故障的概率较小,不过这种小概率事件一旦发生还是会导致一些数据库事务的丢失,因此虽然Oracle 的内部算法可以确保一旦
事务提交成功,事务就确认被保存完毕了,不过还是可能出现提交成功的事务丢失的现象。
实际上,Oracle 在设计REDO LOG 文件的时候,已经最大限度的考虑了REDO LOG 文件的安全性,REDO LOG 文件的BLOCK SIZE 和数据库
的BLOCK SIZE 是完全不同的,REDO LOG 文件的BLOCK SIZE 是和操作系统的IO BLOCK SZIE 完全相同的,这种设计确保了一个REDO LOG
BLOCK 是在一次物理IO 中同时写入的,因此REDOLOG BLOCK 不会出现块断裂的现象。
----------------------------
7.总结:
通过前面的比较可以发现设置filesystemio_options=setall,读取数据块时不会充分利用os的缓存,所有的io相关操作都是硬写硬读相关
文件,这样对磁盘性能提出很高的需求,如果你磁盘IO不行,这样设置可能会很慢,如果加上sql语句不佳,存在大量IO相关操作,性能更差.
但是如果你磁盘IO性能很好,可以这样设置,而且这样一定程度减少丢失数据风险,因为os不会缓存,你可以设置sga,db_cache_size等参数
大一些.提高数据库缓存的数据量.
作为dba一定要了解该参数具体含义,而不是道听途说,建立科学理性的分析.
最后我给承认许多知识理解还不是很透彻,完全是我自己乱猜想....希望内行高手指正.
--//如果你看Oracle Database11g DBA Handbook.pdf,提到:
Using Raw Devices P291
Raw devices are available with most Unix operating systems. When they are used, Oracle bypasses
the Unix buffer cache and eliminates the file system overhead. For I/O-intensive applications,
they may result in a performance improvement of around 20 percent over traditional file systems
(and a slightly smaller improvement over Automatic Storage Management). Recent file system
enhancements have largely overcome this performance difference.
Raw devices cannot be managed with the same commands as file systems. For example, the
tar command cannot be used to back up individual files; instead, the dd command must be used.
This is a much less flexible command to use and limits your recovery capabilities.
--//到底那种设置更合理,选择那种存储数据方式,还是给综合各种情况,认真理性分析.
--//实际上,回到头,合理使用资源,良好的设计,sql语句的优化才是王道..