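In a website cluster architecture, the NFS server sits in the back-end storage layer and gives the front-end Web server cluster a consistent place to store and read static data, mostly user uploads such as images and avatars. If the storage layer uses plain NFS rather than a distributed file system, NFS becomes a single point of failure: when it goes down, the whole cluster can no longer provide complete service (leaving CDN aside for now). To make the architecture highly available, NFS can be combined with DRBD + Heartbeat. One of two NFS servers acts as the primary and serves clients; when the primary fails, Heartbeat automatically switches service to the standby. The drawback of this approach is that one server always sits idle as the standby, so server utilization is low.
DRBD can be thought of as network-based RAID-1: it mirrors two underlying block devices over the network. DRBD will be covered in detail in a later post; until then, search online for more.
Heartbeat is a component of the Linux-HA project that handles heartbeat detection and resource takeover. Heartbeat already includes a control script for DRBD in its package, so it can be used directly.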
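Conventions:
The three servers are:
1. NFS primary: nfs-a.z-dig.com 172.16.1.110 /dev/sdb --DRBD--> /dev/drbd0 --mount--> /data
2. NFS standby: nfs-b.z-dig.com 172.16.1.120 /dev/sdb --DRBD--> /dev/drbd0 --mount--> /data
VIP for the NFS service: 172.16.1.100
3. Web server: web.z-dig.com 172.16.1.50 mount -t nfs 172.16.1.100:/data /www/upload
Virtual machine preparation:
On all three servers, turn off the firewall and disable SELinux, and add a new disk to each of the two NFS servers. DRBD needs to resolve host names to their IP addresses, so make sure /etc/hosts is filled in. The virtual machines have Internet access.
NFS primary: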
[root@nfs-a ~]# uname -nr
nfs-a.z-dig.com 2.6.32-504.el6.x86_64
[root@nfs-a ~]# cat /etc/redhat-release
CentOS release 6.6 (Final)
[root@nfs-a ~]# ifconfig eth0|awk -F '[ :]+' 'NR==2{print $4}'
172.16.1.110
[root@nfs-a ~]# tail -n 3 /etc/hosts
172.16.1.110 nfs-a.z-dig.com
172.16.1.120 nfs-b.z-dig.com
172.16.1.50 web.z-dig.com
[root@nfs-a ~]# /etc/init.d/iptables status
iptables: Firewall is not running.
[root@nfs-a ~]# getsebool
getsebool: SELinux is disabled
[root@nfs-a ~]# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sr0 11:0 1 1024M 0 rom
sda 8:0 0 8G 0 disk
├─sda1 8:1 0 200M 0 part /boot
├─sda2 8:2 0 1G 0 part [SWAP]
└─sda3 8:3 0 6.8G 0 part /
sdb 8:16 0 1G 0 disk
NFS standby:
[root@nfs-b ~]# uname -nr
nfs-b.z-dig.com 2.6.32-504.el6.x86_64
[root@nfs-b ~]# cat /etc/redhat-release
CentOS release 6.6 (Final)
[root@nfs-b ~]# ifconfig eth0|awk -F '[ :]+' 'NR==2{print $4}'
172.16.1.120
[root@nfs-b ~]# tail -n 3 /etc/hosts
172.16.1.110 nfs-a.z-dig.com
172.16.1.120 nfs-b.z-dig.com
172.16.1.50 web.z-dig.com
[root@nfs-b ~]# /etc/init.d/iptables status
iptables: Firewall is not running.
[root@nfs-b ~]# getsebool
getsebool: SELinux is disabled
[root@nfs-b ~]# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sr0 11:0 1 1024M 0 rom
sda 8:0 0 8G 0 disk
├─sda1 8:1 0 200M 0 part /boot
├─sda2 8:2 0 1G 0 part [SWAP]
└─sda3 8:3 0 6.8G 0 part /
sdb 8:16 0 1G 0 disk
[root@nfs-b ~]#
Web:
[root@web ~]# uname -nr
web.z-dig.com 2.6.32-504.el6.x86_64
[root@web ~]# cat /etc/redhat-release
CentOS release 6.6 (Final)
[root@web ~]# tail -n 3 /etc/hosts
172.16.1.110 nfs-a.z-dig.com
172.16.1.120 nfs-b.z-dig.com
172.16.1.50 web.z-dig.com
[root@web ~]# /etc/init.d/iptables status
iptables: Firewall is not running.
[root@web ~]# getsebool
getsebool: SELinux is disabled
[root@web ~]#
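The transcripts above confirm the /etc/hosts entries by eye with tail. A minimal sketch of the same check as a reusable function follows; the function name check_hosts is hypothetical, not part of any package, and it only inspects a hosts-format file rather than doing real name resolution.

```shell
# check_hosts FILE NAME...
# Report whether each NAME appears as a host name in a hosts-format FILE.
# Returns non-zero if at least one name is missing.
check_hosts() {
    hosts_file=$1
    shift
    missing=0
    for name in "$@"; do
        if awk -v n="$name" '
            /^[[:space:]]*#/ { next }              # skip comment lines
            {
                for (i = 2; i <= NF; i++) {        # field 1 is the IP address
                    if ($i ~ /^#/) break           # stop at a trailing comment
                    if ($i == n) found = 1
                }
            }
            END { exit(found ? 0 : 1) }' "$hosts_file"; then
            echo "ok       $name"
        else
            echo "MISSING  $name"
            missing=1
        fi
    done
    return "$missing"
}
```

On each of the three machines this could be run as `check_hosts /etc/hosts nfs-a.z-dig.com nfs-b.z-dig.com web.z-dig.com`.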
I. Configure DRBD:
1. Install DRBD on both NFS servers:
The default official yum repositories do not carry DRBD, so use the ELRepo repository.
[root@nfs-a ~]# rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org
[root@nfs-a ~]# rpm -Uvh http://www.elrepo.org/elrepo-release-6-6.el6.elrepo.noarch.rpm
[root@nfs-a ~]# yum -y install drbd84-utils kmod-drbd84
[root@nfs-a ~]# modprobe drbd
[root@nfs-a ~]# lsmod |grep drbd
drbd 365931 0
libcrc32c 1246 1 drbd
[root@nfs-a ~]#
[root@nfs-b ~]# rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org
[root@nfs-b ~]# rpm -Uvh http://www.elrepo.org/elrepo-release-6-6.el6.elrepo.noarch.rpm
[root@nfs-b ~]# yum -y install drbd84-utils kmod-drbd84
[root@nfs-b ~]# modprobe drbd
[root@nfs-b ~]# lsmod|grep drbd
drbd 365931 0
libcrc32c 1246 1 drbd
[root@nfs-b ~]#
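2. Edit the configuration files:
The whole /dev/sdb disk serves as DRBD's backing device, and /dev/drbd0 is the device that will actually be used. The default /etc/drbd.conf stays unchanged. /etc/drbd.d/global_common.conf can either be left alone (with parameters defined in each resource file) or edited so that the settings apply globally. This example leaves everything at its defaults.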
[root@nfs-a ~]# grep -Ev '#|^$' /etc/drbd.conf
include "drbd.d/global_common.conf";
include "drbd.d/*.res";
[root@nfs-a ~]#
[root@nfs-a ~]# grep -Ev '#|^$' /etc/drbd.d/global_common.conf
global {
usage-count yes;
}
common {
handlers {
}
startup {
}
options {
}
disk {
}
net {
}
}
[root@nfs-a ~]#
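Create the resource file r0.res under /etc/drbd.d/. DRBD configuration covers many detailed network and disk parameters; see the official documentation for those. This example is only a simple demonstration.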
[root@nfs-a drbd.d]# cat /etc/drbd.d/r0.res
resource r0 {
net {
protocol C;
cram-hmac-alg "sha1";
shared-secret "c4f9375f9834b4e7f0a528cc65c055702bf5f24a";
}
device /dev/drbd0;
disk /dev/sdb;
meta-disk internal;
on nfs-a.z-dig.com {
address 172.16.1.110:7780;
}
on nfs-b.z-dig.com {
address 172.16.1.120:7780;
}
}
[root@nfs-a drbd.d]#
Copy the configuration file to nfs-b.z-dig.com:
[root@nfs-a drbd.d]# scp /etc/drbd.d/r0.res nfs-b.z-dig.com:/etc/drbd.d/
3. Initialize the device:
[root@nfs-a drbd.d]# drbdadm create-md r0
...
initializing activity log
NOT initializing bitmap
Writing meta data...
New drbd meta data block successfully created.
success
[root@nfs-a drbd.d]#
[root@nfs-a drbd.d]# drbdadm up r0
[root@nfs-a drbd.d]# cat /proc/drbd
version: 8.4.6 (api:1/proto:86-101)
GIT-hash: 833d830e0152d1e457fa7856e71e11248ccf3f70 build by phil@Build64R6, 2015-04-09 14:35:00
0: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r-----
ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:1048508
[root@nfs-a drbd.d]#
[root@nfs-b ~]# drbdadm create-md r0
...
The server's response is:
initializing activity log
NOT initializing bitmap
Writing meta data...
New drbd meta data block successfully created.
success
[root@nfs-b ~]#
[root@nfs-b ~]# drbdadm up r0
[root@nfs-b ~]# cat /proc/drbd
version: 8.4.6 (api:1/proto:86-101)
GIT-hash: 833d830e0152d1e457fa7856e71e11248ccf3f70 build by phil@Build64R6, 2015-04-09 14:35:00
0: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r-----
ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:1048508
[root@nfs-b ~]#
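Both nodes are now in the Secondary role. Manually promote nfs-a to Primary (--force is needed for this first promotion because neither side holds UpToDate data yet), then format /dev/drbd0 so it can be mounted.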
[root@nfs-a ~]# drbdadm primary --force r0
[root@nfs-a ~]# cat /proc/drbd
version: 8.4.6 (api:1/proto:86-101)
GIT-hash: 833d830e0152d1e457fa7856e71e11248ccf3f70 build by phil@Build64R6, 2015-04-09 14:35:00
0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-----
ns:73776 nr:0 dw:0 dr:74440 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:974732
[>...................] sync'ed: 7.5% (974732/1048508)K
finish: 0:02:51 speed: 5,672 (5,672) K/sec
[root@nfs-a ~]#
[root@nfs-a ~]# cat /proc/drbd
version: 8.4.6 (api:1/proto:86-101)
GIT-hash: 833d830e0152d1e457fa7856e71e11248ccf3f70 build by phil@Build64R6, 2015-04-09 14:35:00
0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
ns:1048508 nr:0 dw:0 dr:1049172 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
[root@nfs-a ~]#
[root@nfs-a ~]# mkfs.ext2 /dev/drbd0
mke2fs 1.41.12 (17-May-2010)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
65536 inodes, 262127 blocks
13106 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=268435456
8 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
32768, 98304, 163840, 229376
Writing inode tables: done
Writing superblocks and filesystem accounting information: done
This filesystem will be automatically checked every 20 mounts or
180 days, whichever comes first. Use tune2fs -c or -i to override.
[root@nfs-a ~]# tune2fs -c -1 /dev/drbd0
tune2fs 1.41.12 (17-May-2010)
Setting maximal mount count to -1
[root@nfs-a ~]#
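The transcript above re-reads /proc/drbd by hand until the disk states change from UpToDate/Inconsistent to UpToDate/UpToDate. A minimal sketch of scripting that check is shown here. The function names drbd_field and drbd_in_sync are hypothetical helpers, and the parsing assumes the DRBD 8.4 /proc/drbd layout shown above with a single resource on minor 0.

```shell
# drbd_field FIELD [FILE]
# Print the value of cs, ro or ds for DRBD minor 0.
# FILE defaults to /proc/drbd; passing a file makes the function testable.
drbd_field() {
    field=$1
    file=${2:-/proc/drbd}
    awk -v f="$field" '
        $1 == "0:" {                         # status line of minor 0
            for (i = 2; i <= NF; i++) {
                split($i, kv, ":")           # e.g. "cs:Connected"
                if (kv[1] == f) { print kv[2]; exit }
            }
        }' "$file"
}

# drbd_in_sync [FILE]
# Succeed only when both the local and the peer disk are UpToDate.
drbd_in_sync() {
    [ "$(drbd_field ds "$@")" = "UpToDate/UpToDate" ]
}
```

A wait loop built on it might look like `until drbd_in_sync; do sleep 5; done`, followed by the mkfs.ext2 step.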
4. Mount test:
[root@nfs-a ~]# mkdir /data
[root@nfs-a ~]# mount -t ext2 /dev/drbd0 /data
[root@nfs-a ~]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda3 6.6G 1.4G 4.9G 23% /
tmpfs 112M 0 112M 0% /dev/shm
/dev/sda1 190M 28M 153M 16% /boot
/dev/drbd0 1008M 1.3M 956M 1% /data
[root@nfs-a ~]# touch /data/drbd.test
[root@nfs-a ~]# ls /data
drbd.test lost+found
[root@nfs-a ~]#
5. Manually switch the roles of the two nodes and repeat the mount test:
[root@nfs-a ~]# umount /data
[root@nfs-a ~]# drbdadm secondary r0
[root@nfs-a ~]# cat /proc/drbd
version: 8.4.6 (api:1/proto:86-101)
GIT-hash: 833d830e0152d1e457fa7856e71e11248ccf3f70 build by phil@Build64R6, 2015-04-09 14:35:00
0: cs:Connected ro:Secondary/Secondary ds:UpToDate/UpToDate C r-----
ns:1065460 nr:0 dw:16952 dr:1049941 al:7 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
[root@nfs-a ~]#
[root@nfs-b ~]# drbdadm primary r0
[root@nfs-b ~]# cat /proc/drbd
version: 8.4.6 (api:1/proto:86-101)
GIT-hash: 833d830e0152d1e457fa7856e71e11248ccf3f70 build by phil@Build64R6, 2015-04-09 14:35:00
0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
ns:0 nr:1065460 dw:1065460 dr:664 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
[root@nfs-b ~]#
[root@nfs-b ~]# mkdir /data
[root@nfs-b ~]# mount -t ext2 /dev/drbd0 /data
[root@nfs-b ~]# ls /data/
drbd.test lost+found
[root@nfs-b ~]#
[root@nfs-b ~]# umount /data
[root@nfs-b ~]# drbdadm secondary r0
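At this point both DRBD nodes are configured and the manual mount test has passed. Of the two DRBD nodes, only the one in the Primary role serves data at any given time.
II. Configure NFS:
NFS exports /data, and /data is where /dev/drbd0 is mounted.
1. Install NFS: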
[root@nfs-a ~]# yum -y install rpcbind nfs-utils
[root@nfs-b ~]# yum -y install rpcbind nfs-utils
[root@web ~]# yum -y install rpcbind nfs-utils
2. Promote DRBD on nfs-a to Primary, mount /dev/drbd0 on /data, edit /etc/exports, and test locally:
[root@nfs-a ~]# cat /proc/drbd
version: 8.4.6 (api:1/proto:86-101)
GIT-hash: 833d830e0152d1e457fa7856e71e11248ccf3f70 build by phil@Build64R6, 2015-04-09 14:35:00
0: cs:Connected ro:Secondary/Secondary ds:UpToDate/UpToDate C r-----
ns:1065460 nr:20 dw:16972 dr:1049941 al:7 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
[root@nfs-a ~]#
[root@nfs-a ~]# drbdadm primary r0
[root@nfs-a ~]# mount -t ext2 /dev/drbd0 /data
[root@nfs-a ~]# cat /etc/exports
/data 172.16.1.0/24(rw,sync,all_squash)
[root@nfs-a ~]#
[root@nfs-a ~]# /etc/init.d/rpcbind start
[root@nfs-a ~]# /etc/init.d/nfs start
[root@nfs-a ~]# chown -R nfsnobody:nfsnobody /data/
[root@nfs-a ~]# showmount -e 172.16.1.110
Export list for 172.16.1.110:
/data 172.16.1.0/24
[root@nfs-a ~]# mount -t nfs 172.16.1.110:/data /mnt
[root@nfs-a ~]# ls /mnt
drbd.test lost+found
[root@nfs-a ~]# touch /mnt/nfs-a.test
[root@nfs-a ~]# ls /mnt/
drbd.test lost+found nfs-a.test
[root@nfs-a ~]#
3. Demote DRBD on nfs-a to Secondary, promote DRBD on nfs-b to Primary, mount /dev/drbd0 on /data, edit /etc/exports, and test locally:
[root@nfs-a ~]# umount /mnt
[root@nfs-a ~]# /etc/init.d/nfs stop
[root@nfs-a ~]# umount /data
[root@nfs-a ~]# drbdadm secondary r0
[root@nfs-b ~]# drbdadm primary r0
[root@nfs-b ~]# mount -t ext2 /dev/drbd0 /data
[root@nfs-b ~]# cat /etc/exports
/data 172.16.1.0/24(rw,sync,all_squash)
[root@nfs-b ~]# /etc/init.d/rpcbind start
[root@nfs-b ~]# /etc/init.d/nfs start
[root@nfs-b ~]# chown -R nfsnobody:nfsnobody /data/
[root@nfs-b ~]# showmount -e 172.16.1.120
Export list for 172.16.1.120:
/data 172.16.1.0/24
[root@nfs-b ~]# mount -t nfs 172.16.1.120:/data /mnt
[root@nfs-b ~]# ls /mnt
drbd.test lost+found nfs-a.test
[root@nfs-b ~]# touch /mnt/nfs-b.test
[root@nfs-b ~]# ls /mnt
drbd.test lost+found nfs-a.test nfs-b.test
[root@nfs-b ~]#
[root@nfs-b ~]# umount /mnt
[root@nfs-b ~]# /etc/init.d/nfs stop
[root@nfs-b ~]# umount /data
[root@nfs-b ~]# drbdadm secondary r0
At this point manual DRBD + NFS switchover, including mounting, has been tested successfully.
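The manual switchover above depends on ordering: the releasing node stops NFS, unmounts, then demotes DRBD, and the other node does the reverse. A minimal sketch that captures both sequences follows. The function names and the RUN variable are hypothetical; RUN defaults to echo so the functions only print the commands, and setting RUN to an empty string would execute them for real as root.

```shell
# RUN is a hypothetical dry-run switch: "echo" (the default) prints each
# command instead of running it; export RUN= (empty) to really execute.
RUN=${RUN-echo}

# Release the resources on the current primary, top of the stack first.
nfs_release() {
    $RUN /etc/init.d/nfs stop &&
    $RUN umount /data &&
    $RUN drbdadm secondary r0
}

# Take the resources over on the other node, bottom of the stack first.
nfs_takeover() {
    $RUN drbdadm primary r0 &&
    $RUN mount -t ext2 /dev/drbd0 /data &&
    $RUN /etc/init.d/nfs start
}
```

Run `nfs_release` on the node giving up the service first, and only then `nfs_takeover` on the other node, since DRBD here allows a single Primary at a time.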
III. Configure Heartbeat:
Configure Heartbeat so that when the NFS primary node fails, all resources move to the standby node automatically.
1. Install heartbeat on both NFS servers:
[root@nfs-a ~]# yum -y install heartbeat
[root@nfs-b ~]# yum -y install heartbeat
2. Copy the default configuration files to /etc/ha.d/ and modify them:
[root@nfs-a ~]# rpm -qd heartbeat|grep doc
/usr/share/doc/heartbeat-3.0.4/AUTHORS
/usr/share/doc/heartbeat-3.0.4/COPYING
/usr/share/doc/heartbeat-3.0.4/COPYING.LGPL
/usr/share/doc/heartbeat-3.0.4/ChangeLog
/usr/share/doc/heartbeat-3.0.4/README
/usr/share/doc/heartbeat-3.0.4/apphbd.cf
/usr/share/doc/heartbeat-3.0.4/authkeys
/usr/share/doc/heartbeat-3.0.4/ha.cf
/usr/share/doc/heartbeat-3.0.4/haresources
[root@nfs-a ~]#
[root@nfs-a ~]# cp /usr/share/doc/heartbeat-3.0.4/ha.cf /usr/share/doc/heartbeat-3.0.4/haresources /usr/share/doc/heartbeat-3.0.4/authkeys /etc/ha.d/
[root@nfs-a ~]# grep -Ev '#|^$' /etc/ha.d/ha.cf
logfile /var/log/ha-log
logfacility local0
keepalive 2
deadtime 30
warntime 10
initdead 60
mcast eth0 225.0.0.1 694 1 0
auto_failback on
node nfs-a.z-dig.com
node nfs-b.z-dig.com
[root@nfs-a ~]#
[root@nfs-a ~]# scp /etc/ha.d/ha.cf nfs-b.z-dig.com:/etc/ha.d/
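With the values in the ha.cf above, heartbeats are sent every 2 seconds and a silent node is declared dead after 30 seconds, so a failover cannot start sooner than that. The comments in the sample ha.cf also suggest keeping initdead at least twice deadtime. A rough sketch of a sanity check for these relationships is shown below; the function name ha_timing_check is hypothetical, and it assumes all four values are plain seconds with no unit suffix such as ms.

```shell
# ha_timing_check FILE
# Check keepalive < warntime < deadtime and initdead >= 2 * deadtime.
# Assumes the values are whole seconds (no "ms" suffix).
ha_timing_check() {
    awk '
        /^[[:space:]]*#/ { next }
        $1 == "keepalive" || $1 == "warntime" ||
        $1 == "deadtime"  || $1 == "initdead" { v[$1] = $2 + 0 }
        END {
            bad = 0
            if (!(v["keepalive"] < v["warntime"])) {
                print "warntime should exceed keepalive"; bad = 1
            }
            if (!(v["warntime"] < v["deadtime"])) {
                print "deadtime should exceed warntime"; bad = 1
            }
            if (v["initdead"] < 2 * v["deadtime"]) {
                print "initdead should be at least 2 x deadtime"; bad = 1
            }
            if (!bad)
                print "timing ok: node declared dead after " v["deadtime"] "s of silence"
            exit bad
        }' "$1"
}
```

It would be called as `ha_timing_check /etc/ha.d/ha.cf` before starting heartbeat.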
[root@nfs-a ~]# grep -Ev '#|^$' /etc/ha.d/authkeys
auth 1
1 sha1 c4f9375f9834b4e7f0a528cc65c055702bf5f24a
[root@nfs-a ~]# chmod 600 /etc/ha.d/authkeys
[root@nfs-a ~]# ll /etc/ha.d/authkeys
-rw------- 1 root root 700 Oct 14 06:38 /etc/ha.d/authkeys
[root@nfs-a ~]#
[root@nfs-a ~]# scp -p /etc/ha.d/authkeys nfs-b.z-dig.com:/etc/ha.d/
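The authkeys file above reuses the same 40-character string that r0.res uses as its DRBD shared-secret. One option, sketched here, is to generate a separate random key for Heartbeat. The function name gen_authkeys is hypothetical, and the sketch assumes sha1sum from coreutils is installed.

```shell
# gen_authkeys
# Print a two-line authkeys body whose sha1 key is derived from
# 512 random bytes read from /dev/urandom.
gen_authkeys() {
    key=$(dd if=/dev/urandom bs=512 count=1 2>/dev/null | sha1sum | awk '{ print $1 }')
    printf 'auth 1\n1 sha1 %s\n' "$key"
}
```

For example, `(umask 077; gen_authkeys > /etc/ha.d/authkeys)` writes the file with mode 600 in one step; the same file must then be copied to the other node, as the scp -p command above does.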
[root@nfs-a ~]# grep -Ev '#|^$' /etc/ha.d/haresources
nfs-a.z-dig.com 172.16.1.100 drbddisk::r0 Filesystem::/dev/drbd0::/data::ext2 nfs
[root@nfs-a ~]#
[root@nfs-a ~]# scp /etc/ha.d/haresources nfs-b.z-dig.com:/etc/ha.d/
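The haresources line reads as: preferred node first, then the resources. Heartbeat starts them left to right on takeover and stops them right to left on release, and a bare IP address is shorthand for the IPaddr resource. The small sketch below just prints that order for a given line, to make the dependency chain (VIP, DRBD role, filesystem, NFS) explicit. The function name haresources_order is hypothetical and it does not talk to Heartbeat.

```shell
# haresources_order start|stop LINE
# Print the resources of one haresources line, one per row, in the
# order they are started (left to right) or stopped (right to left).
haresources_order() {
    printf '%s\n' "$2" | awk -v mode="$1" '
        mode == "stop" { for (i = NF; i >= 2; i--) print $i; next }
                       { for (i = 2; i <= NF; i++) print $i }'
}
```

For instance, `haresources_order stop "$(grep -Ev '#|^$' /etc/ha.d/haresources)"` lists nfs first and the VIP last, the same order used in the manual switchover earlier.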
3. Start heartbeat:
[root@nfs-a ~]# /etc/init.d/heartbeat start
[root@nfs-b ~]# /etc/init.d/heartbeat start
4. Test heartbeat:
[root@nfs-a ~]# ip addr|grep 172.16.1.100
inet 172.16.1.100/24 brd 172.16.1.255 scope global secondary eth0
[root@nfs-a ~]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda3 6.6G 1.5G 4.8G 24% /
tmpfs 112M 0 112M 0% /dev/shm
/dev/sda1 190M 28M 153M 16% /boot
/dev/drbd0 1008M 1.3M 956M 1% /data
[root@nfs-a ~]# showmount -e 172.16.1.100
Export list for 172.16.1.100:
/data 172.16.1.0/24
[root@nfs-a ~]#
[root@nfs-a ~]# /etc/init.d/heartbeat stop
[root@nfs-a ~]# ip addr|grep 172.16.1.100
[root@nfs-a ~]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda3 6.6G 1.5G 4.8G 24% /
tmpfs 112M 0 112M 0% /dev/shm
/dev/sda1 190M 28M 153M 16% /boot
[root@nfs-a ~]#
[root@nfs-b ~]# ip addr|grep 172.16.1.100
inet 172.16.1.100/24 brd 172.16.1.255 scope global secondary eth0
[root@nfs-b ~]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda3 6.6G 1.5G 4.8G 24% /
tmpfs 112M 0 112M 0% /dev/shm
/dev/sda1 190M 28M 153M 16% /boot
/dev/drbd0 1008M 1.3M 956M 1% /data
[root@nfs-b ~]# showmount -e 172.16.1.100
Export list for 172.16.1.100:
/data 172.16.1.0/24
[root@nfs-b ~]#
At this point heartbeat can automatically take over the DRBD, NFS, and VIP resources.
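The test above finds the VIP with a plain grep, in which the dots match any character. A slightly stricter sketch that reads ip addr output on stdin and matches the address literally is given below; the function name has_vip is hypothetical.

```shell
# has_vip IP
# Read `ip addr` output on stdin; succeed if IP is configured as an
# IPv4 address (matched literally, followed by the prefix length).
has_vip() {
    grep -qF "inet $1/"
}
```

On either NFS node, `ip addr | has_vip 172.16.1.100 && echo "this node holds the VIP"` shows which side is currently active.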
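IV. Final test:
Test plan: make nfs-a.z-dig.com the primary, serving NFS to web.z-dig.com through the VIP 172.16.1.100. web.z-dig.com mounts the NFS export and uses a for loop to keep writing files into the mounted directory. Then manually shut down nfs-a.z-dig.com, and heartbeat automatically switches the service to nfs-b.z-dig.com. Finally, check how many of the created files were lost.
Adjust the state of the two NFS servers so that nfs-a serves as the primary and nfs-b as the standby, and have web mount the NFS export through the VIP 172.16.1.100.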
[root@nfs-a ~]# ip addr|grep 172.16.1.100
inet 172.16.1.100/24 brd 172.16.1.255 scope global secondary eth0
[root@nfs-a ~]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda3 6.6G 1.5G 4.8G 24% /
tmpfs 112M 0 112M 0% /dev/shm
/dev/sda1 190M 28M 153M 16% /boot
/dev/drbd0 1008M 1.3M 956M 1% /data
[root@nfs-a ~]#
[root@web ~]# mkdir -p /www/upload
[root@web ~]# showmount -e 172.16.1.100
Export list for 172.16.1.100:
/data 172.16.1.0/24
[root@web ~]# mount -t nfs 172.16.1.100:/data /www/upload
[root@web ~]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda3 6.6G 1.4G 4.9G 23% /
tmpfs 112M 0 112M 0% /dev/shm
/dev/sda1 190M 27M 153M 16% /boot
172.16.1.100:/data 1008M 1.3M 956M 1% /www/upload
[root@web ~]#
[root@web ~]# rm -rf /www/upload/*
[root@web ~]# ls /www/upload/
Run the script on web. After ten-odd seconds, manually power off nfs-a, then stop the script a while later.
[root@web ~]# for i in `seq 10000`;do touch /www/upload/$i&&ls /www/upload/ >/tmp/nfs.test;done
^C
[root@web ~]#
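The loop above shows how many files end up on the share, but not which writes failed or how long the client stalled during the switchover. A rough variant that logs both is sketched here. The function name write_probe and the 1-second stall threshold are arbitrary choices for illustration, and timing uses whole seconds from date.

```shell
# write_probe DIR COUNT
# Create DIR/1 .. DIR/COUNT one by one and report every write that
# fails or that takes longer than 1 second (arbitrary threshold).
write_probe() {
    dir=$1
    count=$2
    i=1
    while [ "$i" -le "$count" ]; do
        start=$(date +%s)
        if ! touch "$dir/$i" 2>/dev/null; then
            echo "FAILED $i"
        fi
        took=$(( $(date +%s) - start ))
        if [ "$took" -gt 1 ]; then
            echo "STALL $i ${took}s"
        fi
        i=$((i + 1))
    done
    return 0
}
```

It could replace the for loop as `write_probe /www/upload 10000`; a STALL line around the moment nfs-a is powered off would give a rough measure of the failover time as seen by the client.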
Check the test results:
[root@web ~]# ls /www/upload/|wc -l
577
[root@web ~]# sort -n /tmp/nfs.test|tail -n 1
577
[root@web ~]#
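In the result above the file count (577) equals the highest file number (577), which suggests no file in the sequence is missing, assuming the directory holds only the numbered files. A small sketch that checks this directly by listing any gaps follows; the function name missing_numbers is hypothetical.

```shell
# missing_numbers MAX
# Read file names on stdin and print every number in 1..MAX that does
# not appear among them. Non-numeric names are ignored.
missing_numbers() {
    awk -v max="$1" '
        /^[0-9]+$/ { seen[$0 + 0] = 1 }
        END { for (i = 1; i <= max; i++) if (!(i in seen)) print i }'
}
```

For this run, `ls /www/upload | missing_numbers 577` printing nothing would confirm that every file from 1 to 577 was created.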