ZIL (ZFS intent log) zil.c

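The ZIL (ZFS intent log) exists to speed up scattered, synchronous writes such as fsync on ZFS. When the ZIL is placed on a dedicated log device, that device is commonly called a SLOG (separate log).

It plays a role similar to a database's redo log or WAL.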

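Notes

1. Each dataset has its own ZIL. If a zpool holds several ZFS file systems and has a log device, that log device actually contains the ZIL entries of multiple datasets.

Once data has been written to the ZIL (on fsync), it can be used to recover the file system even if the server crashes.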

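2. Not every fsync ends up on the log device. According to the USE_SLOG macro in zil.c, the slog is used only while the current commit size is below zil_slog_limit, or while the total itx list size is below 2*zil_slog_limit; otherwise the log blocks are allocated from the main pool. If your log device is fast enough, you can raise the limit or set it to UINT64_MAX, which disables the check so that every commit uses the log device. In most cases no tuning is needed.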
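To make the fsync path mentioned in the notes concrete, here is a minimal sketch of an application-side synchronous write. The helper name `write_durably` is hypothetical and only for illustration; it uses nothing beyond the standard POSIX calls `open`, `write`, `fsync` and `close`. On a ZFS dataset, the `fsync` call is the point at which pending changes have to reach the intent log before the call returns.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: write len bytes from buf to path and return 0
 * only after fsync() has reported success. Returns -1 on any error.
 */
static int write_durably(const char *path, const void *buf, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    /* write() may accept fewer bytes than requested, so loop. */
    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            close(fd);
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }

    /* The synchronous request: data must be on stable storage on return. */
    if (fsync(fd) != 0) {
        close(fd);
        return -1;
    }
    return close(fd);
}
```

A caller would use it as `write_durably("/some/file", data, size)` and treat a non-zero result as "not known to be durable".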

/*
 * The zfs intent log (ZIL) saves transaction records of system calls
 * that change the file system in memory with enough information
 * to be able to replay them. These are stored in memory until
 * either the DMU transaction group (txg) commits them to the stable pool
 * and they can be discarded, or they are flushed to the stable log
 * (also in the pool) due to a fsync, O_DSYNC or other synchronous
 * requirement. In the event of a panic or power fail then those log
 * records (transactions) are replayed.
 *
 * There is one ZIL per file system. Its on-disk (pool) format consists
 * of 3 parts:
 *
 *      - ZIL header
 *      - ZIL blocks
 *      - ZIL records
 *
 * A log record holds a system call transaction. Log blocks can
 * hold many log records and the blocks are chained together.
 * Each ZIL block contains a block pointer (blkptr_t) to the next
 * ZIL block in the chain. The ZIL header points to the first
 * block in the chain. Note there is not a fixed place in the pool
 * to hold blocks. They are dynamically allocated and freed as
 * needed from the blocks available. Figure X shows the ZIL structure:
 */
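The header, block and record layout described in the comment above can be pictured with a toy in-memory model. This is only a sketch: every `toy_*` name is hypothetical, an ordinary C pointer stands in for the on-disk blkptr_t, and none of these types correspond to the real structures in zil.c.

```c
#include <stddef.h>

#define TOY_RECORDS_PER_BLOCK 4

/* Toy stand-in for a log record: one logged system call transaction. */
struct toy_record {
    unsigned long long seq;    /* record sequence number */
    int                txtype; /* kind of operation that was logged */
};

/* Toy stand-in for a ZIL block: several records plus a link onward. */
struct toy_block {
    struct toy_record records[TOY_RECORDS_PER_BLOCK];
    size_t            nrecords; /* how many slots are in use */
    struct toy_block *next;     /* plays the role of the blkptr_t */
};

/* Toy stand-in for the ZIL header: it only points at the first block. */
struct toy_header {
    struct toy_block *first;
};

/*
 * Walk the chain from the header in order, as a replay would, calling
 * visit() on each record if visit is non-NULL. Returns the record count.
 */
static size_t toy_walk(const struct toy_header *hdr,
                       void (*visit)(const struct toy_record *))
{
    size_t count = 0;
    for (const struct toy_block *b = hdr->first; b != NULL; b = b->next) {
        for (size_t i = 0; i < b->nrecords; i++) {
            if (visit != NULL)
                visit(&b->records[i]);
            count++;
        }
    }
    return count;
}
```

The point of the model is the shape only: one header per file system, blocks linked one after another, and many records per block.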

Tunable parameters:
/*
 * This global ZIL switch affects all pools
 */
int zil_replay_disable = 0;    /* disable intent logging replay */

/*
 * Tunable parameter for debugging or performance analysis.  Setting
 * zfs_nocacheflush will cause corruption on power loss if a volatile
 * out-of-order write cache is enabled.
 */
int zfs_nocacheflush = 0;

/*
 * Define a limited set of intent log block sizes.
 * These must be a multiple of 4KB. Note only the amount used (again
 * aligned to 4KB) actually gets written. However, we can't always just
 * allocate SPA_MAXBLOCKSIZE as the slog space could be exhausted.
 */
uint64_t zil_block_buckets[] = {
    4096,               /* non TX_WRITE */
    8192+4096,          /* data base */
    32*1024 + 4096,     /* NFS writes */
    UINT64_MAX
};
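One way to read the bucket table is as a first-fit lookup: take the smallest bucket that can hold the bytes to be written, and treat the UINT64_MAX sentinel as "use the maximum block size". The sketch below illustrates that reading; it is not copied from zil.c. The names `toy_block_buckets`, `toy_pick_block_size` and `TOY_MAX_BLOCKSIZE` are hypothetical, and the 128 KiB cap is an assumed stand-in for SPA_MAXBLOCKSIZE.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed cap standing in for SPA_MAXBLOCKSIZE; the real value may differ. */
#define TOY_MAX_BLOCKSIZE (128ULL * 1024)

/* Same values as the table quoted above, ending in a sentinel. */
static const uint64_t toy_block_buckets[] = {
    4096,
    8192 + 4096,
    32 * 1024 + 4096,
    UINT64_MAX
};

/*
 * Return the smallest bucket that can hold `needed` bytes. The sentinel
 * guarantees the loop terminates, since no uint64_t exceeds UINT64_MAX.
 */
static uint64_t toy_pick_block_size(uint64_t needed)
{
    size_t i = 0;
    while (needed > toy_block_buckets[i])
        i++;

    uint64_t size = toy_block_buckets[i];
    return (size == UINT64_MAX) ? TOY_MAX_BLOCKSIZE : size;
}
```

Under these assumptions a 100-byte commit lands in the 4 KiB bucket, while anything larger than 36 KiB falls through to the assumed maximum.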

/*
 * Use the slog as long as the current commit size is less than the
 * limit or the total list size is less than 2X the limit.  Limit
 * checking is disabled by setting zil_slog_limit to UINT64_MAX.
 */
unsigned long zil_slog_limit = 1024 * 1024;
#define USE_SLOG(zilog) (((zilog)->zl_cur_used < zil_slog_limit) || \
        ((zilog)->zl_itx_list_sz < (zil_slog_limit << 1)))

#if defined(_KERNEL) && defined(HAVE_SPL)
module_param(zil_replay_disable, int, 0644);
MODULE_PARM_DESC(zil_replay_disable, "Disable intent logging replay");

module_param(zfs_nocacheflush, int, 0644);
MODULE_PARM_DESC(zfs_nocacheflush, "Disable cache flushes");

module_param(zil_slog_limit, ulong, 0644);
MODULE_PARM_DESC(zil_slog_limit, "Max commit bytes to separate log device");
#endif
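The USE_SLOG condition is easier to reason about as a plain function. The sketch below restates the macro's two tests with explicit arguments instead of zilog fields; `toy_use_slog` is a hypothetical name, and the argument values in the usage note are made up for illustration.

```c
#include <stdint.h>

/*
 * Mirror of the USE_SLOG test: use the separate log device if the
 * current commit is below the limit, or if the whole itx list is
 * below twice the limit.
 */
static int toy_use_slog(uint64_t cur_used, uint64_t itx_list_sz,
                        uint64_t limit)
{
    return (cur_used < limit) || (itx_list_sz < (limit << 1));
}
```

With the default limit of 1 MiB, a 2 MiB commit with a 3 MiB itx list fails both tests and bypasses the log device, while the same commit with a 1.5 MiB list still qualifies. Setting the limit to UINT64_MAX makes the first test true for any realistic commit size, which is how the check is effectively disabled.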

Time: 2024-10-05 11:02:45
