CentOS 6.5 x64 RHCS GFS Configuration


Lab environment used in this article:
CentOS 6.5 x64, RHCS, GFS

Configuration overview (this article originally appeared at http://koumm.blog.51cto.com):

1. Shared storage is provided over iSCSI by Openfiler.
2. Fencing is implemented with the virtual fence of VMware ESXi 5.
3. The CentOS 6.5 vmware-fence-soap agent provides the RHCS fence device functionality.
4. GFS2 functionality is tested on the resulting RHCS lab environment.

 

Related references for RHCS configuration on RHEL/CentOS 5.x:

An IBM x3650 M3 + GFS + IPMI fence production configuration example
http://koumm.blog.51cto.com/703525/1544971

Red Hat 5.8 x64 RHCS Oracle 10gR2 HA configuration in practice (note: vmware-fence-soap)
http://koumm.blog.51cto.com/703525/1161791

 

 

I. Prepare the base environment

1. Prepare the network environment

On both nodes, node01 and node02:

# cat /etc/hosts

192.168.0.181  node01.abc.com node01
192.168.0.182  node02.abc.com node02

2. Configure the YUM repository

(1) Mount the installation ISO

# mount /dev/cdrom /mnt

(2) Configure the YUM client

Note: the local installation DVD is used as the yum repository.

# vi /etc/yum.repos.d/rhel.repo

[rhel]   
name=rhel6    
baseurl=file:///mnt    
enabled=1    
gpgcheck=0
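
As a quick sanity check (an addition, not part of the original write-up), the local repository can be verified before installing anything:

# yum clean all
# yum repolist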

(3) Openfiler iSCSI storage configuration

The Openfiler setup itself is omitted; the disk space is planned as follows:
qdisk 100MB
data  150GB

(4) Attach the storage on node01 and node02

# yum install iscsi-initiator-utils   
# chkconfig iscsid on    
# service iscsid start

# iscsiadm -m discovery -t sendtargets -p 192.168.0.187   
192.168.0.187:3260,1 iqn.2006-01.com.openfiler:tsn.dea898a36535

# iscsiadm -m node -T iqn.2006-01.com.openfiler:tsn.dea898a36535 -p 192.168.0.187 -l
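
After logging in, it is worth confirming the iSCSI session and the newly visible block devices before continuing (a quick check added here, not part of the original article; the device names are assumptions and may differ on your system):

# iscsiadm -m session        (the Openfiler target should be listed)
# fdisk -l                   (the ~100 MB qdisk LUN and the 150 GB data LUN should show up as new disks, e.g. /dev/sdb and /dev/sdc)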

 

II. Install the RHCS packages

1. Install luci and the RHCS packages on node01

1) Install the RHCS packages on node01 (the management node). luci is the management-side package and is installed only on the management node.

yum -y install luci cman modcluster ricci gfs2-utils rgmanager lvm2-cluster

2) Install the RHCS packages on node02

yum -y install cman modcluster ricci gfs2-utils rgmanager lvm2-cluster

3) Set the ricci user's password on each node (node01 and node02)

passwd ricci

4) Configure the RHCS services to start at boot

chkconfig ricci on   
chkconfig rgmanager on    
chkconfig cman on    
service ricci start    
service rgmanager start    
service cman start

# The startup output looks like this:

Starting oddjobd:                                          [  OK  ]
generating SSL certificates...  done
Generating NSS database...  done
Starting ricci:                                            [  OK  ]
Starting Cluster Service Manager:                          [  OK  ]
Starting cluster:
   Checking if cluster has been disabled at boot...        [  OK  ]
   Checking Network Manager...                             [  OK  ]
   Global setup...                                         [  OK  ]
   Loading kernel modules...                               [  OK  ]
   Mounting configfs...                                    [  OK  ]
   Starting cman... xmlconfig cannot find /etc/cluster/cluster.conf
                                                           [FAILED]
Stopping cluster:
   Leaving fence domain...                                 [  OK  ]
   Stopping gfs_controld...                                [  OK  ]
   Stopping dlm_controld...                                [  OK  ]
   Stopping fenced...                                      [  OK  ]
   Stopping cman...                                        [  OK  ]
   Unloading kernel modules...                             [  OK  ]
   Unmounting configfs...                                  [  OK  ]
#

Note: the cman failure is expected at this point, because /etc/cluster/cluster.conf does not exist yet; it is generated when the cluster is created through luci in section III, after which cman starts normally.

2. Install and start the luci service on the management node (node01)

1) Start the luci service

chkconfig luci on   
service luci start

Adding following auto-detected host IDs (IP addresses/domain names), corresponding to `node01' address, to the configuration of self-managed certificate `/var/lib/luci/etc/cacert.config' (you can change them by editing `/var/lib/luci/etc/cacert.config', removing the generated certificate `/var/lib/luci/certs/host.pem' and restarting luci):   
        (none suitable found, you can still do it manually as mentioned above)

Generating a 2048 bit RSA private key   
writing new private key to '/var/lib/luci/certs/host.pem'    
Start luci...                                              [  OK  ]
Point your web browser to https://node01:8084 (or equivalent) to access luci

2) Management address; in RHCS 6 you log in with the root user and its password.

https://node01:8084    
root/111111
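
If the luci page cannot be reached, check the firewall first. In this lab the simplest option is to disable iptables on both nodes (shown below); otherwise at least TCP 8084 (luci) and TCP 11111 (ricci) must be opened, along with the cluster communication ports. This is a lab-setup assumption, not a step from the original article:

# service iptables stop
# chkconfig iptables off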

 

III. RHCS cluster configuration

1. Add the cluster

Log in to the management interface, click Manage Clusters --> Create, and fill in the following:

Cluster Name: gfs

Node Name        Password     Ricci Hostname     Ricci Port
node01.abc.com   111111       node01.abc.com     11111
node02.abc.com   111111       node02.abc.com     11111

Select the following option, then submit:
Use locally installed packages.

Note: this step generates the cluster configuration file /etc/cluster/cluster.conf.
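
Once the cluster has been created and both nodes have joined, membership can be verified from either node; these are standard cluster utilities rather than steps shown in the original post:

# clustat
# cman_tool status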

2. Add Fence Devices

Note:
For RHCS to deliver full cluster functionality, fencing must be implemented. Because no physical servers (and therefore no hardware fence devices) are available here, the virtual fencing of VMware ESXi 5.x is used as the fence device.
It is only because a usable fence device is available that RHCS can be tested completely.

(1) Log in to the management interface and click cluster -> Fence Devices.
(2) Click Add and select "VMware Fencing (SOAP Interface)".
(3) Name: "ESXi_fence"
(4) IP Address or Hostname: "192.168.0.21" (the ESXi host address)
(5) Login: "root"
(6) Password: "111111"

3. Bind a fence device to each node

Add the fence for node 1

1) Click the node01.abc.com node, then Add Fence Method; name it node01_fence.
2) Add a fence instance and select "ESXi_fence" VMware Fencing (SOAP Interface).
3) VM NAME: "kvm_node1"
4) VM UUID: "564d6fbf-05fb-1dd1-fb66-7ea3c85dcfdf"; check "ssl"

Note: VM NAME is the virtual machine's name, and VM UUID is the "uuid.location" value from the VM's .vmx file, formatted as in the strings below.

# /usr/sbin/fence_vmware_soap -a 192.168.0.21 -z -l root -p 111111 -n kvm_node2 -o list   
kvm_node2,564d4c42-e7fd-db62-3878-57f77df2475e    
kvm_node1,564d6fbf-05fb-1dd1-fb66-7ea3c85dcfdf

Add the fence for node 2

1) Click the node02.abc.com node, then Add Fence Method; name it node02_fence.
2) Add a fence instance and select "ESXi_fence" VMware Fencing (SOAP Interface).
3) VM NAME: "kvm_node2"
4) VM UUID: "564d4c42-e7fd-db62-3878-57f77df2475e"; check "ssl"

# Example of testing the fence function manually:

# /usr/sbin/fence_vmware_soap -a 192.168.0.21 -z -l root -p 111111 -n kvm_node2 -o reboot
Status: ON

Options:
-o : the action to perform, e.g. list, status, reboot
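
With cluster.conf in place, fencing can also be exercised through the cluster stack itself, which verifies the per-node fence method rather than just the agent. Be aware that this really does power-cycle the target node; it is offered here as a suggestion, not a step from the original article:

# fence_node node02.abc.com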

4. Add the Failover Domain configuration

Name "gfs_failover"   
Prioritized    
Restricted    
node01.abc.com    1     
node02.abc.com    1

5. Configure the GFS service

(1) GFS service configuration

Enable CLVM's integrated cluster locking on both node01 and node02:

lvmconf --enable-cluster    
chkconfig clvmd on

service clvmd start    
Activating VG(s):   No volume groups found      [  OK  ]
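
To confirm that lvmconf --enable-cluster took effect, check the LVM locking type; it should read locking_type = 3 (clustered locking). This check is an addition to the original steps:

# grep "locking_type" /etc/lvm/lvm.conf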

(2) On either node, partition the shared data disk to create sdc1, then format it as GFS2 (a partitioning sketch follows).
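
A minimal partitioning sketch, assuming the 150 GB data LUN appears as /dev/sdc (device names may differ on your system); the interactive fdisk keystrokes are annotated:

# fdisk /dev/sdc
   n        (create a new partition)
   p        (primary), partition number 1, accept the default start and end to use the whole disk
   w        (write the partition table and exit)
# partprobe /dev/sdc        (re-read the partition table; run this on the other node as well)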

On node01:

# pvcreate /dev/sdc1   
  Physical volume "/dev/sdc1" successfully created

# pvs   
  PV         VG       Fmt  Attr PSize   PFree  
  /dev/sda2  vg_node01 lvm2 a--   39.51g      0     
  /dev/sdc1           lvm2 a--  156.25g 156.25g

# vgcreate gfsvg /dev/sdc1   
  Clustered volume group "gfsvg" successfully created

# lvcreate -l +100%FREE -n data gfsvg   
  Logical volume "data" created

On node02:
# /etc/init.d/clvmd start

(3) Create the GFS2 filesystem

On node01:

[root@node01 ~]# mkfs.gfs2 -p lock_dlm -t gfs:gfs2 -j 2 /dev/gfsvg/data   
This will destroy any data on /dev/gfsvg/data.    
It appears to contain: symbolic link to `../dm-2'

Are you sure you want to proceed? [y/n] y

Device:                    /dev/gfsvg/data   
Blocksize:                 4096    
Device Size                156.25 GB (40958976 blocks)    
Filesystem Size:           156.25 GB (40958975 blocks)    
Journals:                  2    
Resource Groups:           625    
Locking Protocol:          "lock_dlm"    
Lock Table:                "gfs:gfs2"    
UUID:                      e28655c6-29e6-b813-138f-0b22d3b15321

Notes:
In gfs:gfs2, "gfs" is the cluster name and "gfs2" is a name you choose, much like a label; together they form the lock table name.
-j sets the number of journals, i.e. the number of hosts that will mount this filesystem at the same time; if omitted, it defaults to 1 (the management node only).
This lab has two nodes, hence -j 2.
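
If a third node joins the cluster later, an additional journal can be added to the mounted filesystem without reformatting. This is standard gfs2-utils usage rather than a step from the original article (run it against the mount point created in the next section):

# gfs2_jadd -j 1 /vmdata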

6. Mount the GFS2 filesystem

Create the GFS mount point on both node01 and node02:

# mkdir /vmdata

(1) Mount manually on node01 and node02 to test; once the mount succeeds, create a file to verify that the cluster filesystem behaves correctly (see the cross-node check below).
# mount.gfs2 /dev/gfsvg/data /vmdata
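
A simple cross-node check (the file name is arbitrary): a file created on one node should be visible immediately on the other, since both mount the same GFS2 volume.

[root@node01 ~]# touch /vmdata/test_from_node01
[root@node02 ~]# ls -l /vmdata/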

(2) Configure automatic mounting at boot
# vi /etc/fstab    
/dev/gfsvg/data   /vmdata gfs2 defaults 0 0
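
On RHEL/CentOS 6 the gfs2 init script shipped with gfs2-utils is what mounts the GFS2 entries in /etc/fstab once the cluster services are up, so enabling it is worth considering; this is an assumption about the boot sequence, not a step spelled out in the original post:

# chkconfig gfs2 on
# service gfs2 start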

[root@node01 vmdata]# df -h

Filesystem                    Size  Used Avail Use% Mounted on   
/dev/mapper/vg_node01-lv_root   36G  3.8G   30G  12% /    
tmpfs                         1.9G   32M  1.9G   2% /dev/shm    
/dev/sda1                     485M   39M  421M   9% /boot    
/dev/gfsvg/data               157G  259M  156G   1% /vmdata

7. Configure the quorum disk

Notes:
# The quorum disk is shared storage and does not need to be large; this example creates it on the 100 MB LUN, /dev/sdb1.

[root@node01 ~]# fdisk -l

Disk /dev/sdb: 134 MB, 134217728 bytes   
5 heads, 52 sectors/track, 1008 cylinders    
Units = cylinders of 260 * 512 = 133120 bytes    
Sector size (logical/physical): 512 bytes / 512 bytes    
I/O size (minimum/optimal): 512 bytes / 512 bytes    
Disk identifier: 0x80cdfae9

   Device Boot      Start         End      Blocks   Id  System   
/dev/sdb1               1        1008      131014   83  Linux

(1) Create the quorum disk

[root@node01 ~]# mkqdisk -c /dev/sdb1 -l myqdisk   
mkqdisk v0.6.0    
Writing new quorum disk label 'myqdisk' to /dev/sdb1.
WARNING: About to destroy all data on /dev/sdb1; proceed [N/y] ? y
Initializing status block for node 1...    
Initializing status block for node 2...    
Initializing status block for node 3...    
Initializing status block for node 4...    
Initializing status block for node 5...    
Initializing status block for node 6...    
Initializing status block for node 7...    
Initializing status block for node 8...    
Initializing status block for node 9...    
Initializing status block for node 10...    
Initializing status block for node 11...    
Initializing status block for node 12...    
Initializing status block for node 13...    
Initializing status block for node 14...    
Initializing status block for node 15...    
Initializing status block for node 16...

(2) View the quorum disk information

[root@node01 ~]# mkqdisk -L   
mkqdisk v3.0.12.1

/dev/block/8:17:   
/dev/disk/by-id/scsi-14f504e46494c455242553273306c2d4b72697a2d544e6b4f-part1:    
/dev/disk/by-path/ip-192.168.0.187:3260-iscsi-iqn.2006-01.com.openfiler:tsn.dea898a36535-lun-0-part1:    
/dev/sdb1:    
        Magic:                eb7a62c2    
        Label:                myqdisk    
        Created:              Thu Jan  1 23:42:00 2015    
        Host:                 node02.abc.com    
        Kernel Sector Size:   512    
        Recorded Sector Size: 512

(3) Configure the quorum disk (qdisk)

# In the management interface, go to Manage Clusters --> gfs --> Configure --> QDisk

Device       : /dev/sdb1

Path to program : ping -c3 -t2 192.168.0.253   
Interval        : 3    
Score           : 2    
TKO             : 10    
Minimum Score   : 1

# Click Apply

(4) Start the qdisk service

chkconfig qdiskd on   
service qdiskd start     
clustat -l

[root@node01 ~]# clustat -l

Cluster Status for gfs @ Thu Jan  1 23:50:53 2015   
Member Status: Quorate

Member Name                                                     ID   Status   
------ ----                                                     ---- ------    
node01.abc.com                                                      1 Online, Local    
node02.abc.com                                                      2 Online    
/dev/sdb1                                                           0 Online, Quorum Disk

[root@node01 ~]#

 

8. Test GFS

1) Run the following on node02 (the sysrq trigger crashes the kernel immediately, simulating a node failure):

# echo c > /proc/sysrq-trigger

2) Watch the log on node01:

# tail -f /var/log/messages    
Jan  2 01:37:47 node01 ricci: startup succeeded    
Jan  2 01:37:47 node01 rgmanager[2196]: I am node #1    
Jan  2 01:37:47 node01 rgmanager[2196]: Resource Group Manager Starting    
Jan  2 01:37:47 node01 rgmanager[2196]: Loading Service Data    
Jan  2 01:37:49 node01 rgmanager[2196]: Initializing Services    
Jan  2 01:37:49 node01 rgmanager[2196]: Services Initialized    
Jan  2 01:37:49 node01 rgmanager[2196]: State change: Local UP    
Jan  2 01:37:49 node01 rgmanager[2196]: State change: node02.abc.com UP    
Jan  2 01:37:52 node01 polkitd[3125]: started daemon version 0.96 using authority implementation `local' version `0.96'    
Jan  2 01:37:52 node01 rtkit-daemon[3131]: Sucessfully made thread 3129 of process 3129 (/usr/bin/pulseaudio) owned by '42' high priority at nice level -11.    
Jan  2 01:40:52 node01 qdiskd[1430]: Assuming master role    
Jan  2 01:40:53 node01 qdiskd[1430]: Writing eviction notice for node 2    
Jan  2 01:40:54 node01 qdiskd[1430]: Node 2 evicted    
Jan  2 01:40:55 node01 corosync[1378]:   [TOTEM ] A processor failed, forming new configuration.    
Jan  2 01:40:57 node01 corosync[1378]:   [QUORUM] Members[1]: 1    
Jan  2 01:40:57 node01 corosync[1378]:   [TOTEM ] A processor joined or left the membership and a new membership was formed.    
Jan  2 01:40:57 node01 kernel: dlm: closing connection to node 2    
Jan  2 01:40:57 node01 corosync[1378]:   [CPG   ] chosen downlist: sender r(0) ip(192.168.0.181) ; members(old:2 left:1)    
Jan  2 01:40:57 node01 corosync[1378]:   [MAIN  ] Completed service synchronization, ready to provide service.    
Jan  2 01:40:57 node01 kernel: GFS2: fsid=gfs:gfs2.1: jid=0: Trying to acquire journal lock...    
Jan  2 01:40:57 node01 fenced[1522]: fencing node node02.abc.com    
Jan  2 01:40:57 node01 rgmanager[2196]: State change: node02.abc.com DOWN    
Jan  2 01:41:11 node01 fenced[1522]: fence node02.abc.com success    
Jan  2 01:41:12 node01 kernel: GFS2: fsid=gfs:gfs2.1: jid=0: Looking at journal...    
Jan  2 01:41:12 node01 kernel: GFS2: fsid=gfs:gfs2.1: jid=0: Done    
Jan  2 01:41:30 node01 corosync[1378]:   [TOTEM ] A processor joined or left the membership and a new membership was formed.    
Jan  2 01:41:30 node01 corosync[1378]:   [QUORUM] Members[2]: 1 2    
Jan  2 01:41:30 node01 corosync[1378]:   [QUORUM] Members[2]: 1 2    
Jan  2 01:41:30 node01 corosync[1378]:   [CPG   ] chosen downlist: sender r(0) ip(192.168.0.181) ; members(old:1 left:0)    
Jan  2 01:41:30 node01 corosync[1378]:   [MAIN  ] Completed service synchronization, ready to provide service.    
Jan  2 01:41:38 node01 qdiskd[1430]: Node 2 shutdown    
Jan  2 01:41:50 node01 kernel: dlm: got connection from 2    
Jan  2 01:41:59 node01 rgmanager[2196]: State change: node02.abc.com UP

Note: fencing worked as expected, and the GFS filesystem remained available throughout.

9. Configuration file

cat /etc/cluster/cluster.conf

<?xml version="1.0"?>   
<cluster config_version="9" name="gfs">    
        <clusternodes>    
                <clusternode name="node01.abc.com" nodeid="1">    
                        <fence>    
                                <method name="node01_fence">    
                                        <device name="ESXi_fence" port="kvm_node1" ssl="on" uuid="564d6fbf-05fb-1dd1-fb66-7ea3c85dcfdf"/>    
                                </method>    
                        </fence>    
                </clusternode>    
                <clusternode name="node02.abc.com" nodeid="2">    
                        <fence>    
                                <method name="node02_fence">    
                                        <device name="ESXi_fence" port="kvm_node2" ssl="on" uuid="564d4c42-e7fd-db62-3878-57f77df2475e"/>    
                                </method>    
                        </fence>    
                </clusternode>    
        </clusternodes>    
        <cman expected_votes="3"/>    
        <fencedevices>    
                <fencedevice agent="fence_vmware_soap" ipaddr="192.168.0.21" login="root" name="ESXi_fence" passwd="111111"/>    
        </fencedevices>    
        <rm>    
                <failoverdomains>    
                        <failoverdomain name="gfs_failover" nofailback="1" ordered="1">    
                                <failoverdomainnode name="node01.abc.com" priority="1"/>    
                                <failoverdomainnode name="node02.abc.com" priority="1"/>    
                        </failoverdomain>    
                </failoverdomains>    
        </rm>    
        <quorumd device="/dev/sdb1" min_score="1">    
                <heuristic interval="3" program="ping -c3 -t2 192.168.0.253" score="2" tko="10"/>    
        </quorumd>    
</cluster>
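
If cluster.conf is ever edited by hand rather than through luci, it can be validated and propagated to the other node with the standard cman tools (remember to increase config_version first); this tip is an addition to the original article:

# ccs_config_validate
# cman_tool version -r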

 

Configuration summary:

1. With FC storage, the network-related problems that an IP SAN runs into would not occur.

2. When configuring an IP SAN, use a dedicated NIC and network segment whenever possible. After this setup was working, the NIC was switched to a bridged interface while building a KVM virtualization environment; the network remained reachable, but GFS would no longer start. The remark below, found on the linux-cluster mailing list, finally explained the cause; once the interface connecting to the IP SAN was moved back to its own dedicated interface, the GFS environment returned to normal.

https://www.mail-archive.com/linux-cluster@redhat.com/msg03800.html

You'll need to check the routing of the interfaces. The most common cause of this sort of error is having two interfaces on the same physical (or internal) network.

3. In a real environment, a physical fence device can replace fence_vmware_soap to provide fencing.

This article originally appeared on the blog "koumm's Linux technology blog"; please keep this attribution: http://koumm.blog.51cto.com/703525/1598367
