ZFS fsync IOPS performance in FreeBSD

In the previous article I went over ZFS performance tuning.

That article noted that on Linux (CentOS 6.5 x64), ZFS's fsync performance was poor, nowhere near ext4. So I installed FreeBSD 10 x64 on the same host and re-ran the fsync test on identical hardware.

For the PostgreSQL installation, see:

http://blog.163.com/digoal@126/blog/static/163877040201451181344545/

http://blog.163.com/digoal@126/blog/static/163877040201451282121414/

First, check the block devices. This machine has twelve 4 TB SATA disks:

# gpart list -a
Geom name: mfid[1-12]
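
(The gpart output is abridged: there is one GEOM per disk, mfid1 through mfid12.) As a hedged sketch, the disks' presence and geometry can also be cross-checked with stock FreeBSD tools:

# sysctl kern.disks     # every disk device the kernel currently sees
# diskinfo -v mfid1     # sector size, media size, etc. for a single disk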

Create the zpool, striping across all twelve disks:

# zpool create zp1 mfid1 mfid2 mfid3 mfid4 mfid5 mfid6 mfid7 mfid8 mfid9 mfid10 mfid11 mfid12
# zpool get all zp1
NAME  PROPERTY                       VALUE                          SOURCE
zp1   size                           43.5T                          -
zp1   capacity                       0%                             -
zp1   altroot                        -                              default
zp1   health                         ONLINE                         -
zp1   guid                           8490038421326880416            default
zp1   version                        -                              default
zp1   bootfs                         -                              default
zp1   delegation                     on                             default
zp1   autoreplace                    off                            default
zp1   cachefile                      -                              default
zp1   failmode                       wait                           default
zp1   listsnapshots                  off                            default
zp1   autoexpand                     off                            default
zp1   dedupditto                     0                              default
zp1   dedupratio                     1.00x                          -
zp1   free                           43.5T                          -
zp1   allocated                      285K                           -
zp1   readonly                       off                            -
zp1   comment                        -                              default
zp1   expandsize                     0                              -
zp1   freeing                        0                              default
zp1   feature@async_destroy          enabled                        local
zp1   feature@empty_bpobj            active                         local
zp1   feature@lz4_compress           enabled                        local
zp1   feature@multi_vdev_crash_dump  enabled                        local
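
Creating the pool from twelve bare disks gives a plain stripe with no redundancy, which maximizes IOPS but cannot survive a disk failure. Purely as a sketch (not what was tested in this article), the same disks could instead be grouped into two 6-disk raidz vdevs, trading some write IOPS for redundancy:

# zpool create zp1 \
    raidz mfid1 mfid2 mfid3 mfid4 mfid5 mfid6 \
    raidz mfid7 mfid8 mfid9 mfid10 mfid11 mfid12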

Create the ZFS dataset:

# zfs create -o mountpoint=/data01 -o atime=off zp1/data01
# zfs get all zp1/data01
NAME        PROPERTY              VALUE                  SOURCE
zp1/data01  type                  filesystem             -
zp1/data01  creation              Thu Jun 26 23:52 2014  -
zp1/data01  used                  32K                    -
zp1/data01  available             42.8T                  -
zp1/data01  referenced            32K                    -
zp1/data01  compressratio         1.00x                  -
zp1/data01  mounted               yes                    -
zp1/data01  quota                 none                   default
zp1/data01  reservation           none                   default
zp1/data01  recordsize            128K                   default
zp1/data01  mountpoint            /data01                local
zp1/data01  sharenfs              off                    default
zp1/data01  checksum              on                     default
zp1/data01  compression           off                    default
zp1/data01  atime                 off                    local
zp1/data01  devices               on                     default
zp1/data01  exec                  on                     default
zp1/data01  setuid                on                     default
zp1/data01  readonly              off                    default
zp1/data01  jailed                off                    default
zp1/data01  snapdir               hidden                 default
zp1/data01  aclmode               discard                default
zp1/data01  aclinherit            restricted             default
zp1/data01  canmount              on                     default
zp1/data01  xattr                 off                    temporary
zp1/data01  copies                1                      default
zp1/data01  version               5                      -
zp1/data01  utf8only              off                    -
zp1/data01  normalization         none                   -
zp1/data01  casesensitivity       sensitive              -
zp1/data01  vscan                 off                    default
zp1/data01  nbmand                off                    default
zp1/data01  sharesmb              off                    default
zp1/data01  refquota              none                   default
zp1/data01  refreservation        none                   default
zp1/data01  primarycache          all                    default
zp1/data01  secondarycache        all                    default
zp1/data01  usedbysnapshots       0                      -
zp1/data01  usedbydataset         32K                    -
zp1/data01  usedbychildren        0                      -
zp1/data01  usedbyrefreservation  0                      -
zp1/data01  logbias               latency                default
zp1/data01  dedup                 off                    default
zp1/data01  mlslabel                                     -
zp1/data01  sync                  disabled               local
zp1/data01  refcompressratio      1.00x                  -
zp1/data01  written               32K                    -
zp1/data01  logicalused           16K                    -
zp1/data01  logicalreferenced     16K                    -
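
The dataset is left at its defaults here (recordsize=128K, compression=off). As a hedged sketch for a dataset holding a PostgreSQL data directory, two properties are often adjusted, though neither was changed for this test:

# zfs set recordsize=8K zp1/data01     # match PostgreSQL's 8kB block size
# zfs set compression=lz4 zp1/data01   # lz4_compress is already enabled on this pool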

Now test fsync. This is a big improvement over Linux, essentially saturating the block devices: the zpool iostat output further below shows the resulting ZIL writes striped evenly across all twelve disks, at roughly 620 write ops/s each.

# /opt/pgsql9.3.4/bin/pg_test_fsync -f /data01/1
5 seconds per test
O_DIRECT supported on this platform for open_datasync and open_sync.

Compare file sync methods using one 8kB write:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
        open_datasync                                 n/a
        fdatasync                                     n/a
        fsync                            6676.001 ops/sec     150 usecs/op
        fsync_writethrough                            n/a
        open_sync                        6087.783 ops/sec     164 usecs/op

Compare file sync methods using two 8kB writes:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
        open_datasync                                 n/a
        fdatasync                                     n/a
        fsync                            4750.841 ops/sec     210 usecs/op
        fsync_writethrough                            n/a
        open_sync                        3065.099 ops/sec     326 usecs/op

Compare open_sync with different write sizes:
(This is designed to compare the cost of writing 16kB
in different write open_sync sizes.)
         1 * 16kB open_sync write        4965.249 ops/sec     201 usecs/op
         2 *  8kB open_sync writes       3039.074 ops/sec     329 usecs/op
         4 *  4kB open_sync writes       1598.735 ops/sec     625 usecs/op
         8 *  2kB open_sync writes       1326.517 ops/sec     754 usecs/op
        16 *  1kB open_sync writes        620.992 ops/sec    1610 usecs/op

Test if fsync on non-write file descriptor is honored:
(If the times are similar, fsync() can sync data written
on a different descriptor.)
        write, fsync, close              5422.742 ops/sec     184 usecs/op
        write, close, fsync              5552.278 ops/sec     180 usecs/op

Non-Sync'ed 8kB writes:
        write                           67460.621 ops/sec      15 usecs/op
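
Since open_datasync and fdatasync both report n/a here, fsync is the first supported method in PostgreSQL's wal_sync_method preference order, so the default resolves to it on this platform. A minimal sketch of pinning the choice explicitly in postgresql.conf (the data directory path depends on the installation referenced above and is not shown in this article):

wal_sync_method = fsync        # first method pg_test_fsync reports as supported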

Device-level activity while the test was running:

# zpool iostat -v 1
               capacity     operations    bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
zp1          747M  43.5T      0  7.31K      0  39.1M
  mfid1     62.8M  3.62T      0    638      0  3.27M
  mfid2     61.9M  3.62T      0    615      0  3.23M
  mfid3     62.8M  3.62T      0    615      0  3.23M
  mfid4     62.0M  3.62T      0    615      0  3.23M
  mfid5     62.9M  3.62T      0    616      0  3.24M
  mfid6     62.0M  3.62T      0    616      0  3.24M
  mfid7     62.9M  3.62T      0    620      0  3.24M
  mfid8     61.6M  3.62T      0    620      0  3.24M
  mfid9     62.2M  3.62T      0    619      0  3.23M
  mfid10    61.8M  3.62T      0    615      0  3.23M
  mfid11    62.2M  3.62T      0    648      0  3.41M
  mfid12    62.1M  3.62T      0    650      0  3.29M
----------  -----  -----  -----  -----  -----  -----
zroot       2.69G   273G      0      0      0      0
  mfid0p3   2.69G   273G      0      0      0      0
----------  -----  -----  -----  -----  -----  -----

# iostat -x 1
                        extended device statistics
device     r/s   w/s    kr/s    kw/s qlen svc_t  %b
mfid0      0.0   0.0     0.0     0.0    0   0.0   0
mfid1      0.0 416.6     0.0  7468.5    0   0.1   2
mfid2      0.0 416.6     0.0  7468.5    0   0.0   2
mfid3      0.0 429.6     0.0  7480.0    0   0.1   2
mfid4      0.0 433.6     0.0  7484.0    0   0.1   3
mfid5      0.0 433.6     0.0  7495.9    0   0.1   2
mfid6      0.0 421.6     0.0  7484.5    0   0.1   3
mfid7      0.0 417.6     0.0  7488.5    0   0.1   3
mfid8      0.0 438.6     0.0  7638.3    0   0.1   2
mfid9      0.0 437.6     0.0  7510.4    0   0.1   2
mfid10     0.0 428.6     0.0  7494.4    0   0.1   4
mfid11     0.0 416.6     0.0  7468.5    0   0.1   2
mfid12     0.0 416.6     0.0  7468.5    0   0.1   2
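
For a GEOM-level view of the same per-disk activity, FreeBSD's gstat is an alternative worth sketching; it refreshes in place and also reports queue length and busy percentage:

# gstat -f mfid -I 1s   # -f filters providers by regular expression, -I sets the refresh interval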

With sync disabled, FreeBSD and Linux perform about the same:

# zfs set sync=disabled zp1/data01
# /opt/pgsql9.3.4/bin/pg_test_fsync -f /data01/1
5 seconds per test
O_DIRECT supported on this platform for open_datasync and open_sync.

Compare file sync methods using one 8kB write:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
        open_datasync                                 n/a
        fdatasync                                     n/a
        fsync                           115687.300 ops/sec       9 usecs/op
        fsync_writethrough                            n/a
        open_sync                       126789.698 ops/sec       8 usecs/op

Compare file sync methods using two 8kB writes:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
        open_datasync                                 n/a
        fdatasync                                     n/a
        fsync                           65027.801 ops/sec      15 usecs/op
        fsync_writethrough                            n/a
        open_sync                       60239.232 ops/sec      17 usecs/op

Compare open_sync with different write sizes:
(This is designed to compare the cost of writing 16kB
in different write open_sync sizes.)
         1 * 16kB open_sync write       115246.114 ops/sec       9 usecs/op
         2 *  8kB open_sync writes      63999.355 ops/sec      16 usecs/op
         4 *  4kB open_sync writes      33661.426 ops/sec      30 usecs/op
         8 *  2kB open_sync writes      18960.527 ops/sec      53 usecs/op
        16 *  1kB open_sync writes       8251.087 ops/sec     121 usecs/op

Test if fsync on non-write file descriptor is honored:
(If the times are similar, fsync() can sync data written
on a different descriptor.)
        write, fsync, close             47380.701 ops/sec      21 usecs/op
        write, close, fsync             50214.128 ops/sec      20 usecs/op

Non-Sync'ed 8kB writes:
        write                           78263.057 ops/sec      13 usecs/op
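
Note that sync=disabled means transactions reported as committed can be lost if the host crashes or loses power, so it is unsafe for a production PostgreSQL cluster. A commonly suggested middle ground, sketched here with hypothetical SSD device names, is a dedicated low-latency log (SLOG) vdev: synchronous ZIL writes then land on fast media while sync semantics stay intact.

# zpool add zp1 log mirror ada1 ada2   # ada1/ada2 are hypothetical SSDs, not part of this test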

[References]

1. http://blog.163.com/digoal@126/blog/static/1638770402014526992910/

2. http://blog.163.com/digoal@126/blog/static/163877040201451181344545/

3. http://blog.163.com/digoal@126/blog/static/163877040201451282121414/
