Oracle I/O and Operating System Cache

[Repost]

Direct or Async I/O

It’s pretty well documented that the most efficient configuration for Oracle is direct asynchronous I/O.

Direct I/O means avoiding unnecessary copies of data from one memory location to another on the way between the hardware and the SGA. Every flavor of Unix implements some form of caching for block devices and filesystems; in Linux the buffer cache (for block devices) or page cache (for filesystem data) can cache database blocks. Most applications benefit from this caching; Oracle, however, does not. There are two primary reasons this caching is bad for Oracle:

1. Oracle caches this data itself in the SGA, and it does a far better job of predicting which blocks are best retained in memory. When the OS also caches the data you essentially end up with two copies of every cached data block, one of which is effectively unused. You can waste up to half the memory in your server with almost no performance benefit.
2. The data has to be copied extra times before arriving in the SGA, and these extra “copy” operations slow down your read operations.
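As a rough illustration of what “bypassing the OS cache” looks like from an application, here is a minimal Python sketch that reads one block with the Linux O_DIRECT open flag. This is not how Oracle implements it; the function name read_block and the 4096-byte block size are assumptions made for the example, and the fallback to buffered I/O exists only so the sketch still runs on filesystems that reject O_DIRECT.

```python
import mmap
import os

BLOCK = 4096  # assumed alignment; the real requirement depends on device and filesystem


def read_block(path, offset=0, size=BLOCK):
    """Read one aligned block, bypassing the page cache where O_DIRECT works."""
    # O_DIRECT needs an aligned buffer; an anonymous mmap is page-aligned.
    with mmap.mmap(-1, size) as buf:
        # Try direct I/O first, then plain buffered I/O as a fallback.
        for flags in (os.O_RDONLY | getattr(os, "O_DIRECT", 0), os.O_RDONLY):
            try:
                fd = os.open(path, flags)
            except OSError:
                continue  # this filesystem refused the flag at open time
            try:
                n = os.preadv(fd, [buf], offset)
                return bytes(buf[:n])
            except OSError:
                continue  # the direct read itself was rejected; try the next mode
            finally:
                os.close(fd)
    raise OSError("could not read " + path)
```

With O_DIRECT the offset and length generally have to be multiples of the block size as well, which is why the sketch reads whole blocks at block-aligned offsets.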

Asynchronous I/O means multitasking your read and write operations. Remember how in MS-DOS you could only run one program at a time? Believe it or not, this is how I/O works by default on most Unix filesystems: a process issues one request and waits for it to finish before issuing the next. Using asynchronous I/O is like upgrading to an operating system that lets you run more than one program at a time. It allows Oracle to issue multiple read and write operations concurrently, which keeps the storage busy instead of leaving it idle between requests.
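To make the idea of “several requests in flight at once” concrete, here is a small Python sketch. It does not use kernel asynchronous I/O, and it is not what Oracle does internally; it uses a thread pool with os.pread as a stand-in. The function name, block size, and worker count are made up for illustration.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def read_blocks_concurrently(path, offsets, size=8192, workers=4):
    """Issue several positional reads at once and return the results in order."""
    fd = os.open(path, os.O_RDONLY)
    try:
        with ThreadPoolExecutor(max_workers=workers) as pool:
            # os.pread takes an explicit offset, so the threads can share one
            # file descriptor without racing on a shared file position.
            return list(pool.map(lambda off: os.pread(fd, size, off), offsets))
    finally:
        os.close(fd)
```

The synchronous equivalent would be a plain loop that calls os.pread once per offset and waits for each call to return before starting the next.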

There are different ways of implementing Direct I/O and Asynchronous I/O in different environments (CIO, QIO, ODM, etc.), but it’s always best to enable them and then shift memory from the OS caches to the SGA.

The Linux I/O Path

I should start out by saying that I’m not a kernel expert. After some digging through source code and mailing lists I think that I’ve got a handle on the main concepts here, but I’m open to correction since this is complicated stuff and I could easily have missed some details of how it’s all implemented. Let me know if you have anything to add!

On Linux, asynchronous I/O requests are submitted through the io_submit() call. (Kevin Closson once wrote about tracing this call.) This function essentially puts the I/O request into a queue which is managed by the Linux kernel. I think that this workqueue is a generic kernel object which you can’t and shouldn’t need to tweak. The I/O request is subsequently picked up by one of several kernel AIO background threads and serviced.

What the kernel thread does depends on the underlying device. FC cards support asynchronous requests to the target (it’s part of the SCSI spec), so in this case the kernel thread will issue a number of SCSI read or write requests in parallel. This is where a second queue comes in! And this queue, unlike the kernel workqueues, can be tweaked.

The target device (Symmetrix, DSxxxx, 3PAR, etc.) also has an FC port which has its own queue for I/O requests that are being serviced. On a 3PAR device these ports only have between 500 and 2000 slots for concurrent I/O requests, depending on the model. This means that all the servers accessing any LUNs over that port cannot, between them, have more than 500-2000 I/O requests outstanding at once, or you will start to see “QUEUE FULL SCSI” errors.
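The port limit can be turned into simple arithmetic. The sketch below is only a back-of-the-envelope model: it assumes every host keeps every LUN queue completely full at the same moment, and the host count, LUN count, and per-LUN depth in the example are hypothetical numbers, not measurements.

```python
def worst_case_outstanding(hosts, luns_per_host, queue_depth):
    """Upper bound on requests a set of hosts could have in flight on one port."""
    return hosts * luns_per_host * queue_depth


def max_safe_queue_depth(port_slots, hosts, luns_per_host):
    """Largest per-LUN queue depth that cannot overrun the port in the worst case."""
    return port_slots // (hosts * luns_per_host)


# Hypothetical example: 4 hosts, 8 LUNs each, per-LUN queue depth of 32.
print(worst_case_outstanding(4, 8, 32))   # 1024
print(max_safe_queue_depth(500, 4, 8))    # 15
print(max_safe_queue_depth(2000, 4, 8))   # 62
```

In this made-up configuration a depth of 32 could overrun a 500-slot port in the worst case but leaves headroom on a 2000-slot port. Real workloads rarely fill every queue at once, so this is a ceiling rather than a prediction.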

My best guess: this means that the default limit set by the Linux qla2xxx driver for concurrent I/O requests on QLogic cards (32 per LUN) is conservative. The driver enforces this limit by capping the size of the queue. So naturally my next question was whether I could increase performance by raising this limit.
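Before changing anything, it helps to see what the per-device limit currently is. On Linux, SCSI disks expose it in sysfs as /sys/block/sdX/device/queue_depth. The following Python sketch just reads those files; the function name and the sysfs_root parameter (there so the function can be pointed at a test directory) are my own additions for illustration.

```python
import glob
import os


def scsi_queue_depths(sysfs_root="/sys"):
    """Return {device name: queue depth} for the SCSI disks found under sysfs."""
    depths = {}
    pattern = os.path.join(sysfs_root, "block", "sd*", "device", "queue_depth")
    for path in sorted(glob.glob(pattern)):
        # path looks like <root>/block/sda/device/queue_depth
        dev = path.split(os.sep)[-3]
        with open(path) as f:
            depths[dev] = int(f.read().strip())
    return depths


if __name__ == "__main__":
    for dev, depth in scsi_queue_depths().items():
        print(dev, depth)
```

As far as I know the qla2xxx driver also has a module parameter, ql2xmaxqdepth, that sets the per-LUN maximum, but check the documentation for your driver version before relying on it.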

Tweaking the HBA Queue Depth

I did run a few tests – but I think I’ll save the results for a second post since this one has already gotten pretty long. Any guesses about what the results were?

Time: 2024-08-03 23:20:45
