在6.x中的安装请参考http://digoal.lofter.com/post/6ced3_450adf3
前段时间在DELL R610服务器pci-e插槽上插了1块OCZ的SSD硬盘(价格3000多RMB, 比较实惠).
型号 : OCZ REVODRIVE3
容量 : 240 GB
因为服务器是单电源的, 两个PCI-E插槽看起来不能同时使用, 所以把另一个槽位上插的HBA卡拔掉就能使用了.
另外就是, OCZ网站上没有提供这个型号的SSD的linux驱动.
但是用Z的可以, 下载地址 :
http://www.oczenterprise.com/drivers.html
操作系统版本:
[root@db-172-16-3-150 ~]# cat /etc/redhat-release
CentOS release 5.7 (Final)
内核
[root@db-172-16-3-150 ~]# uname -a
Linux db-172-16-3-150.sky-mobi.com 2.6.18-274.el5 #1 SMP Fri Jul 22 04:43:29 EDT 2011 x86_64 x86_64 x86_64 GNU/Linux
驱动安装如下 :
wget http://www.oczenterprise.com/files/drivers/OCZ%20Centos_5.7_64-Bit_r1480.tar.gz
tar -zxvf OCZ\ Centos_5.7_64-Bit_r1480.tar.gz
cp ocz10xx.ko /lib/modules/2.6.18-274.el5/extra/
depmod -a
modprobe ocz10xx
接下来使用fdisk -l就能看到这块新加的硬盘了.
[root@db-172-16-3-150 ~]# fdisk -l
Disk /dev/sda: 146.1 GB, 146163105792 bytes
255 heads, 63 sectors/track, 17769 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/sda1 * 1 3824 30716248+ 83 Linux
/dev/sda2 3825 4868 8385930 82 Linux swap / Solaris
/dev/sda3 4869 17769 103627282+ 83 Linux
Disk /dev/sdb: 199.4 GB, 199447543808 bytes
255 heads, 63 sectors/track, 24248 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk /dev/sdb doesn't contain a valid partition table
Disk /dev/sdd: 240.0 GB, 240068197888 bytes
255 heads, 63 sectors/track, 29186 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk /dev/sdd doesn't contain a valid partition table
Disk /dev/dm-0: 199.4 GB, 199447543808 bytes
255 heads, 63 sectors/track, 24248 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk /dev/dm-0 doesn't contain a valid partition table
Disk /dev/dm-1: 240.0 GB, 240068197888 bytes
255 heads, 63 sectors/track, 29186 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk /dev/dm-1 doesn't contain a valid partition table
是个多路径设备 :
[root@db-172-16-3-150 ~]# multipath -ll
mpath1 (SATA_OCZ-REVODRIVE3_OCZ-Z2134R0TLQBNE659) dm-1 ATA,OCZ-REVODRIVE3
[size=224G][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=0][active]
\_ 4:0:126:0 sdd 8:48 [active][ready]
mpath0 (360026b902fe2ce0017c654500ba7cfad) dm-0 DELL,PERC 6/i
[size=186G][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=0][active]
\_ 0:2:1:0 sdb 8:16 [active][ready]
其中mpath0也是一块SSD, 只是这块不是PCI-E的, 是SATA接口的, DELL品牌硬盘(大概2万元1块), 插在DELL服务器自带的RAID卡上. 后面会对比一下两个SSD以及机械盘的FSYNC性能差异.
(mpath0这块SSD 硬盘: 200GB Solid State Drive SATA MLC 3G 2.5in Hot-plug Hard Drive-Limited Warranty Only)
查看这块OCZ SSD硬盘的信息 :
[root@db-172-16-3-150 ~]# hdparm -I /dev/sdd
/dev/sdd:
ATA device, with non-removable media
Model Number: OCZ-REVODRIVE3
Serial Number: OCZ-Z2134R0TLQBNE659
Firmware Revision: 2.15
Transport: Serial, ATA8-AST, SATA 1.0a, SATA II Extensions, SATA Rev 2.5
Standards:
Supported: 8 7 6 5
Likely used: 8
Configuration:
Logical max current
cylinders 16383 16383
heads 16 16
sectors/track 63 63
--
CHS current addressable sectors: 16514064
LBA user addressable sectors: 468883199
LBA48 user addressable sectors: 468883199
device size with M = 1024*1024: 228946 MBytes
device size with M = 1000*1000: 240068 MBytes (240 GB)
Capabilities:
LBA, IORDY(can be disabled)
Queue depth: 32
Standby timer values: spec'd by Standard, no device specific minimum
R/W multiple sector transfer: Max = 16 Current = 16
Advanced power management level: unknown setting (0x00fe)
DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 udma5 *udma6
Cycle time: min=120ns recommended=120ns
PIO: pio0 pio1 pio2 pio3 pio4
Cycle time: no flow control=120ns IORDY flow control=120ns
Commands/features:
Enabled Supported:
* SMART feature set
Security Mode feature set
* Power Management feature set
* Write cache
* Look-ahead
* Host Protected Area feature set
* WRITE_BUFFER command
* READ_BUFFER command
* NOP cmd
* DOWNLOAD_MICROCODE
* Advanced Power Management feature set
Power-Up In Standby feature set
* SET_FEATURES required to spinup after power up
* 48-bit Address feature set
* Mandatory FLUSH_CACHE
* FLUSH_CACHE_EXT
* SMART error logging
* SMART self-test
* General Purpose Logging feature set
* WRITE_{DMA|MULTIPLE}_FUA_EXT
* 64-bit World wide name
* IDLE_IMMEDIATE with UNLOAD
Write-Read-Verify feature set
* WRITE_UNCORRECTABLE command
* {READ,WRITE}_DMA_EXT_GPL commands
* Segmented DOWNLOAD_MICROCODE
* SATA-I signaling speed (1.5Gb/s)
* SATA-II signaling speed (3.0Gb/s)
* unknown 76[3]
* Native Command Queueing (NCQ)
* Host-initiated interface power management
* Phy event counters
* unknown 76[14]
* unknown 76[15]
DMA Setup Auto-Activate optimization
Device-initiated interface power management
* Software settings preservation
Security:
Master password revision code = 65534
supported
not enabled
not locked
not frozen
not expired: security count
not supported: enhanced erase
2min for SECURITY ERASE UNIT.
Checksum: correct
注意这块盘默认的WRITE CACHE是打开了, 为了安全建议关闭.
关闭Write cache前 :
[root@db-172-16-3-150 ~]# hdparm -I /dev/sdd|grep cache
* Write cache
关闭Write cache :
[root@db-172-16-3-150 ~]# hdparm -W 0 /dev/mapper/mpath1
/dev/mapper/mpath1:
setting drive write-caching to 0 (off)
HDIO_DRIVE_CMD(setcache) failed: Input/output error
关闭Write cache后 :
[root@db-172-16-3-150 ~]# hdparm -I /dev/mapper/mpath1|grep cache
Write cache
接下来简单的测试了一下fsync的能力, 为了比较方便, 到DELL服务器的RAID配置中, 关闭write back. 改为write through. 这样在测试DELL服务器自带的SAS 146GB 10k转硬盘以及SSD SATA硬盘时将不会使用RAID卡的写cache. 而是直接写到磁盘. 这样与OCZ的PCI-E SSD才有得好比.
1. 关闭RAID 写cache的详细操作略.
2. 将OCZ PCI-E SSD的写cache关闭.
hdparm -W 0 /dev/mapper/mpath1
放到rc.local, 否则重启后又会开启write cache.
3. 使用PostgreSQL 9.2.x带的pg_test_fsync测试fsync能力.
3.1 机械盘
pg9.2.0@db-172-16-3-150-> pg_test_fsync
2 seconds per test
O_DIRECT supported on this platform for open_datasync and open_sync.
Compare file sync methods using one 16kB write:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
open_datasync n/a
fdatasync 146.169 ops/sec
fsync 161.802 ops/sec
fsync_writethrough n/a
open_sync 166.520 ops/sec
Compare file sync methods using two 16kB writes:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
open_datasync n/a
fdatasync 166.658 ops/sec
fsync 162.162 ops/sec
fsync_writethrough n/a
open_sync 83.330 ops/sec
Compare open_sync with different write sizes:
(This is designed to compare the cost of writing 16kB
in different write open_sync sizes.)
1 * 16kB open_sync write 166.673 ops/sec
2 * 8kB open_sync writes 82.329 ops/sec
4 * 4kB open_sync writes 41.665 ops/sec
8 * 2kB open_sync writes 20.832 ops/sec
16 * 1kB open_sync writes 10.293 ops/sec
Test if fsync on non-write file descriptor is honored:
(If the times are similar, fsync() can sync data written
on a different descriptor.)
write, fsync, close 161.665 ops/sec
write, close, fsync 163.661 ops/sec
Non-Sync'ed 16kB writes:
write 148617.025 ops/sec
3.2 SSD SATA 2.5寸
pg9.2.0@db-172-16-3-150-> pg_test_fsync
2 seconds per test
O_DIRECT supported on this platform for open_datasync and open_sync.
Compare file sync methods using one 16kB write:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
open_datasync n/a
fdatasync 264.964 ops/sec
fsync 245.810 ops/sec
fsync_writethrough n/a
open_sync 266.335 ops/sec
Compare file sync methods using two 16kB writes:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
open_datasync n/a
fdatasync 135.904 ops/sec
fsync 131.025 ops/sec
fsync_writethrough n/a
open_sync 132.228 ops/sec
Compare open_sync with different write sizes:
(This is designed to compare the cost of writing 16kB
in different write open_sync sizes.)
1 * 16kB open_sync write 267.800 ops/sec
2 * 8kB open_sync writes 251.967 ops/sec
4 * 4kB open_sync writes 241.487 ops/sec
8 * 2kB open_sync writes 221.293 ops/sec
16 * 1kB open_sync writes 189.691 ops/sec
Test if fsync on non-write file descriptor is honored:
(If the times are similar, fsync() can sync data written
on a different descriptor.)
write, fsync, close 245.371 ops/sec
write, close, fsync 245.360 ops/sec
Non-Sync'ed 16kB writes:
write 149532.140 ops/sec
3.3 OCZ PCI-E SSD
pg9.2.0@db-172-16-3-150-> pg_test_fsync
2 seconds per test
O_DIRECT supported on this platform for open_datasync and open_sync.
Compare file sync methods using one 16kB write:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
open_datasync n/a
fdatasync 9466.480 ops/sec
fsync 261.271 ops/sec
fsync_writethrough n/a
open_sync 12701.793 ops/sec
Compare file sync methods using two 16kB writes:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
open_datasync n/a
fdatasync 6148.294 ops/sec
fsync 250.694 ops/sec
fsync_writethrough n/a
open_sync 6321.015 ops/sec
Compare open_sync with different write sizes:
(This is designed to compare the cost of writing 16kB
in different write open_sync sizes.)
1 * 16kB open_sync write 12598.168 ops/sec
2 * 8kB open_sync writes 7976.631 ops/sec
4 * 4kB open_sync writes 4740.957 ops/sec
8 * 2kB open_sync writes 2232.638 ops/sec
16 * 1kB open_sync writes 1138.060 ops/sec
Test if fsync on non-write file descriptor is honored:
(If the times are similar, fsync() can sync data written
on a different descriptor.)
write, fsync, close 250.710 ops/sec
write, close, fsync 250.055 ops/sec
Non-Sync'ed 16kB writes:
write 149753.941 ops/sec
【参考】
http://www.oczenterprise.com/briefs/ocz-vca-technology.pdf
http://en.wikipedia.org/wiki/TRIM
http://en.wikipedia.org/wiki/S.M.A.R.T.
http://serverascode.com/2012/05/08/ocz-zdrive-r4-installation-performance.html
http://www.oczenterprise.com/downloads/solutions/z-driver4-installation-guide.pdf