keepalived modify notify.c & vrrp.c to enforce waiting notify script execute success

我们在设计数据库HA的时候, 为了防止主备同时读写一份数据, 或者主备同时对外提供服务. 备机在切换成主机前, 需要有一个fence操作, 需先将主节点fence掉, 然后再激活备库为主库.

例如我们这边用的一个HA脚本 : 

https://raw.githubusercontent.com/digoal/sky_postgresql_cluster/master/INSTALL.txt

或者RHEL的HA套件, 都有类似的FENCE操作.

但是keepalived目前没有切换前的自定义脚本, 只有切换后的自定义脚本(notify), 并且不等待脚本执行.

如图 : 

keepalived如果用于数据库的HA切换还需要完善一下.

本文将讲解一下, 如何修改keepalived代码, 来实现这方面的功能, 即备切换成主之前, 必须先等待fence主节点.

最终目的如图 : 

首先要了解一下keepalived notify的机制 : 

顺序是这样的 : 起vip, 起静态路由, fork进程调用notify脚本(并且不等待). 

注意调用notify脚本在起VIP之后.

我们的目的是把它放到最前面, 并且需要等待notify脚本调用完成再起VIP.

keepalived有几项配置和notify有关, 即当路由器的角色转换后, 调用用户配置的脚本.

doc/keepalived.conf.SYNOPSIS

    notify_master <STRING>|<QUOTED-STRING> # Script to run during MASTER transit
    notify_backup <STRING>|<QUOTED-STRING> # Script to run during BACKUP transit
    notify_fault <STRING>|<QUOTED-STRING>  # Script to run during FAULT transit
    notify <STRING>|<QUOTED-STRING>        # Script to run during ANY state transit (1)
    smtp_alert           # Send email notif during state transit
}

(1) The "notify" script is called AFTER the corresponding notify_* script has
    been called, and is given exactly 4 arguments (the whole string is interpreted
    as a litteral filename so don't add parameters!):

    $1 = A string indicating whether it's a "GROUP" or an "INSTANCE"
    $2 = The name of said group or instance
    $3 = The state it's transitioning to ("MASTER", "BACKUP" or "FAULT")
    $4 = The priority value

    $1 and $3 are ALWAYS sent in uppercase, and the possible strings sent are the
    same ones listed above ("GROUP"/"INSTANCE", "MASTER"/"BACKUP"/"FAULT").

当路由器在进入对应的角色后, 会调用对应的脚本如notify_master, notify_backup, notify_fault.

当以上脚本调用完后, 再调用notify脚本.

相关代码举例 : 

进入master角色的代码

keepalived/vrrp/vrrp.c

/* MASTER state processing */
int
vrrp_state_master_tx(vrrp_t * vrrp, const int prio)
{
        int ret = 0;

        if (!VRRP_VIP_ISSET(vrrp)) {
                log_message(LOG_INFO, "VRRP_Instance(%s) Entering MASTER STATE"
                                    , vrrp->iname);
                vrrp_state_become_master(vrrp);   // 进入master角色的操作对应的函数
                ret = 1;
        } else if (vrrp->garp_refresh && timer_cmp(time_now, vrrp->garp_refresh_timer) > 0) {
                vrrp_send_link_update(vrrp);
                vrrp->garp_refresh_timer = timer_add_long(time_now, vrrp->garp_refresh);
        }

        vrrp_send_adv(vrrp,
                      (prio == VRRP_PRIO_OWNER) ? VRRP_PRIO_OWNER :
                                                  vrrp->effective_priority);
        return ret;
}

当进入master角色后, 在启用虚拟IP后, 会调用notify脚本.
/* becoming master */
void
vrrp_state_become_master(vrrp_t * vrrp)
{
        /* add the ip addresses */  // 先启用虚拟IP
        if (!LIST_ISEMPTY(vrrp->vip))
                vrrp_handle_ipaddress(vrrp, IPADDRESS_ADD, VRRP_VIP_TYPE);
        if (!LIST_ISEMPTY(vrrp->evip))
                vrrp_handle_ipaddress(vrrp, IPADDRESS_ADD, VRRP_EVIP_TYPE);
        vrrp->vipset = 1;

        /* add virtual routes */  // 然后添加静态路由
        if (!LIST_ISEMPTY(vrrp->vroutes))
                vrrp_handle_iproutes(vrrp, IPROUTE_ADD);

        /* remotes neighbour update */  // 然后发送免费ARP报文
        vrrp_send_link_update(vrrp);

        /* set refresh timer */
        if (vrrp->garp_refresh) {
                vrrp->garp_refresh_timer = timer_add_long(time_now, vrrp->garp_refresh);
        }

        /* Check if notify is needed */  // 最后是调用自定义的notify脚本
        notify_instance_exec(vrrp, VRRP_STATE_MAST);

#ifdef _WITH_SNMP_  // 触发snmp
        vrrp_snmp_instance_trap(vrrp);
#endif
....

这里要注意的是, notify脚本调用时fork进程通过system函数来调用的, 并且主进程直接返回0, 不等待子进程执行完毕.

对应的代码 : 

lib/notify.c

/* perform a system call */
int
system_call(char *cmdline)
{
        int retval;

        retval = system(cmdline);

        if (retval == 127) {
                /* couldn't exec command */
                log_message(LOG_ALERT, "Couldn't exec command: %s", cmdline);
        } else if (retval == -1) {
                /* other error */
                log_message(LOG_ALERT, "Error exec-ing command: %s", cmdline);
        }

        return retval;
}
/* Execute external script/program */
int
notify_exec(char *cmd)
{
        pid_t pid;
        int ret;

        pid = fork();

        /* In case of fork is error. */
        if (pid < 0) {
                log_message(LOG_INFO, "Failed fork process");
                return -1;
        }

        /* In case of this is parent process */  // 主进程直接退出, 没有wait 子进程.
        if (pid)
                return 0;

        signal_handler_destroy();
        closeall(0);

        open("/dev/null", O_RDWR);

        ret = dup(0);
        if (ret < 0) {
                log_message(LOG_INFO, "dup(0) error");
        }

        ret = dup(0);
        if (ret < 0) {
                log_message(LOG_INFO, "dup(0) error");
        }

        system_call(cmd);

        exit(0);
}

所以, 为了达到目的, 我们需要改2个地方. 为了方便观察, 加几个日志输出.

# cd /opt/soft_bak/keepalived-1.2.13

1. 把调用notify脚本移到vip前面

# vi keepalived/vrrp/vrrp.c
修改函数, 把调用notify脚本从这个函数剥离出来
/* becoming master */
void
vrrp_state_become_master(vrrp_t * vrrp)
{
...
        /* Check if notify is needed */
        //notify_instance_exec(vrrp, VRRP_STATE_MAST);  //注释这里 

修改函数
/* MASTER state processing */
int
vrrp_state_master_tx(vrrp_t * vrrp, const int prio)
{
...
        // if (!VRRP_VIP_ISSET(vrrp)) {  // 改成如下, 即把notify脚本放到vrrp_state_become_master之前, 并且返回成功才会调用vrrp_state_become_master.
        if ((!VRRP_VIP_ISSET(vrrp)) && notify_instance_exec(vrrp, VRRP_STATE_MAST)) {
                log_message(LOG_INFO, "VRRP_Instance(%s) Entering MASTER STATE"
                                    , vrrp->iname);
                vrrp_state_become_master(vrrp);
                ret = 1;

2. 需要让主进程等待notify脚本执行完

# vi lib/notify.c
添加等待需要的头文件
#include <sys/types.h>
#include <sys/wait.h>

修改函数
/* Execute external script/program */
int
notify_exec(char *cmd)
{
        pid_t pid;
        int ret;
        pid_t wait_pid;
        int wait_ret;

        pid = fork();

        /* In case of fork is error. */
        if (pid < 0) {
                log_message(LOG_INFO, "Failed fork process");
                return -1;
        }

        /* child process */  // 子进程执行
        if (pid == 0) {

          signal_handler_destroy();
          closeall(0);

          open("/dev/null", O_RDWR);

          ret = dup(0);
          if (ret < 0) {
                log_message(LOG_INFO, "dup(0) error");
          }

          ret = dup(0);
          if (ret < 0) {
                log_message(LOG_INFO, "dup(0) error");
          }

          log_message(LOG_INFO, "notify executing.");  // 添加日志输出
          system_call(cmd);
          exit(0);
        }

        /* In case of this is parent process */  // 主进程等待
        else {
                log_message(LOG_INFO, "waiting notify execute success.");   // 添加日志输出
                wait_pid = waitpid(pid, &wait_ret, 0);
                if (wait_pid == pid && WIFEXITED(wait_ret)) {
                  log_message(LOG_INFO, "notify execute successed.");    // 添加日志输出
                  return 0;
                }
                return -1;
        }

}

重新编译

# gmake && gmake install

测试

配置文件

[root@192_168_173_203 keepalived-1.2.13]# cd /opt/keepalived/etc/keepalived/
[root@192_168_173_203 keepalived]# cat keepalived.conf
! Configuration File for keepalived  !号或#号开头为注释

global_defs {      # 全局配置除了router_id, 其他都无所谓, 因为我这里的测试只是观察script weight对vrrp的影响.
   notification_email {
     acassen@firewall.loc
     failover@firewall.loc
     sysadmin@firewall.loc
   }
   notification_email_from Alexandre.Cassen@firewall.loc
   smtp_server 192.168.200.1
   smtp_connect_timeout 30
   router_id DIGOAL_TEST1   # 随便配置, 因为这个不在VRRP包里面体现, 在VRRP包里体现的是虚拟路由器ID.
}

vrrp_script pos {   # 脚本配置
    script "/root/pos.sh"
    interval 1
    weight 0   # 配置默认weight 0, 后面可以在instance中覆盖
    fall 1    # 从OK到KO需要1次检测失败.
    rise 1    # 从KO到OK需要1次检测成功.
}

vrrp_script nag {   # 脚本配置
    script "/root/nag.sh"
    interval 1
    weight 0   # 配置默认weight 0, 后面可以在instance中覆盖
    fall 1    # 从OK到KO需要1次检测失败.
    rise 1    # 从KO到OK需要1次检测成功.
}

vrrp_script zero {   # 脚本配置
    script "/root/zero.sh"
    interval 1
    weight 0   # 配置默认weight 0, 后面可以在instance中覆盖
    fall 1    # 从OK到KO需要1次检测失败.
    rise 1    # 从KO到OK需要1次检测成功.
}

vrrp_instance vi_1 {
    state MASTER    # 初始状态
    interface eth0
    virtual_router_id 51
    priority 100   # 初始优先级, 注意两个节点配置要不一样. 一高一低, 高的选举为master.
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 12345678  # 最大长度为8, 如果要修改最大长度,
                            # 可参考 http://blog.163.com/digoal@126/blog/static/163877040201472123810816/
    }
    track_script {
        pos weight 3
        zero weight 0
        nag weight -5
    }
    unicast_peer {   # 使用单播代替组播发送vrrp心跳.
        192.168.173.203
        192.168.173.204
    }
    virtual_ipaddress {   # 虚拟地址配置, 和ifconfig兼容
        172.16.173.100/24 brd 172.16.173.255 dev eth0 scope link label eth0:1
    }
    debug   # 打开debug, 方便调试, 本例没有用到
    nopreempt   # 不主动竞争master, 这样的话优先级只用于决定第一次搭建HA时的主库节点, 发生failover后, 不会主动提升.
    notify_master "/root/notify.sh"
}

配置脚本

[root@192_168_173_203 keepalived]# vi /root/notify.sh
#!/bin/bash

sleep 1000
exit 0

# chmod 555 /root/notify.sh

启动keepalived

# keepalived -f /opt/keepalived/etc/keepalived/keepalived.conf -D

观察日志 : 

# tail -f -n 1 /var/log/messages
Aug 26 10:13:31 192_168_173_203 Keepalived[14531]: Starting Keepalived v1.2.13 (08/19,2014)
Aug 26 10:13:31 192_168_173_203 Keepalived[14532]: Starting Healthcheck child process, pid=14533
Aug 26 10:13:31 192_168_173_203 Keepalived[14532]: Starting VRRP child process, pid=14534
Aug 26 10:13:31 192_168_173_203 Keepalived_vrrp[14534]: Netlink reflector reports IP 192.168.173.203 added
Aug 26 10:13:31 192_168_173_203 Keepalived_vrrp[14534]: Netlink reflector reports IP 192.168.173.156 added
Aug 26 10:13:31 192_168_173_203 Keepalived_vrrp[14534]: Netlink reflector reports IP fe80::f6ce:46ff:fe86:4234 added
Aug 26 10:13:31 192_168_173_203 Keepalived_healthcheckers[14533]: Netlink reflector reports IP 192.168.173.203 added
Aug 26 10:13:31 192_168_173_203 Keepalived_healthcheckers[14533]: Netlink reflector reports IP 192.168.173.156 added
Aug 26 10:13:31 192_168_173_203 Keepalived_vrrp[14534]: Registering Kernel netlink reflector
Aug 26 10:13:31 192_168_173_203 Keepalived_vrrp[14534]: Registering Kernel netlink command channel
Aug 26 10:13:31 192_168_173_203 Keepalived_healthcheckers[14533]: Netlink reflector reports IP fe80::f6ce:46ff:fe86:4234 added
Aug 26 10:13:31 192_168_173_203 Keepalived_vrrp[14534]: Registering gratuitous ARP shared channel
Aug 26 10:13:31 192_168_173_203 Keepalived_healthcheckers[14533]: Registering Kernel netlink reflector
Aug 26 10:13:31 192_168_173_203 Keepalived_healthcheckers[14533]: Registering Kernel netlink command channel
Aug 26 10:13:31 192_168_173_203 Keepalived_healthcheckers[14533]: Opening file '/opt/keepalived/etc/keepalived/keepalived.conf'.
Aug 26 10:13:31 192_168_173_203 Keepalived_vrrp[14534]: Opening file '/opt/keepalived/etc/keepalived/keepalived.conf'.
Aug 26 10:13:31 192_168_173_203 Keepalived_healthcheckers[14533]: Configuration is using : 7972 Bytes
Aug 26 10:13:31 192_168_173_203 Keepalived_vrrp[14534]: Configuration is using : 69370 Bytes
Aug 26 10:13:31 192_168_173_203 Keepalived_vrrp[14534]: Using LinkWatch kernel netlink reflector...
Aug 26 10:13:31 192_168_173_203 Keepalived_vrrp[14534]: VRRP sockpool: [ifindex(2), proto(112), unicast(1), fd(10,11)]
Aug 26 10:13:31 192_168_173_203 Keepalived_healthcheckers[14533]: Using LinkWatch kernel netlink reflector...
Aug 26 10:13:31 192_168_173_203 Keepalived_vrrp[14534]: VRRP_Script(nag) succeeded
Aug 26 10:13:31 192_168_173_203 Keepalived_vrrp[14534]: VRRP_Script(zero) succeeded
Aug 26 10:13:31 192_168_173_203 Keepalived_vrrp[14534]: VRRP_Script(pos) succeeded
Aug 26 10:13:32 192_168_173_203 Keepalived_vrrp[14534]: VRRP_Instance(vi_1) Transition to MASTER STATE
Aug 26 10:13:33 192_168_173_203 Keepalived_vrrp[14534]: waiting notify execute success.
Aug 26 10:13:33 192_168_173_203 Keepalived_vrrp[14564]: notify executing.
... 1000秒后, 进入master, 并且启动VIP, 测试成功.
Aug 26 10:30:13 192_168_173_203 Keepalived_vrrp[14534]: notify execute successed.
Aug 26 10:30:13 192_168_173_203 Keepalived_vrrp[14534]: VRRP_Instance(vi_1) Entering MASTER STATE
Aug 26 10:30:13 192_168_173_203 Keepalived_vrrp[14534]: VRRP_Instance(vi_1) setting protocol VIPs.
Aug 26 10:30:13 192_168_173_203 Keepalived_vrrp[14534]: VRRP_Instance(vi_1) Sending gratuitous ARPs on eth0 for 172.16.173.100
Aug 26 10:30:13 192_168_173_203 Keepalived_healthcheckers[14533]: Netlink reflector reports IP 172.16.173.100 added
Aug 26 10:30:18 192_168_173_203 Keepalived_vrrp[14534]: VRRP_Instance(vi_1) Sending gratuitous ARPs on eth0 for 172.16.173.100

我们只要把fence操作放在这个脚本里面, 就可以实现先fence, 再启VIP的过程.

[注意]

本文仅提供思路, 仅供测试, 用于生产请自行承担风险.

[参考]
1. lib/notify.c

2. keepalived/vrrp/vrrp.c

3. man 2 wait

4. man 3 system

5. doc/keepalived.conf.SYNOPSIS

6. https://raw.githubusercontent.com/digoal/sky_postgresql_cluster/master/INSTALL.txt

7. http://blog.163.com/digoal@126/blog/static/163877040201472123810816/

时间: 2024-09-13 12:28:49

keepalived modify notify.c & vrrp.c to enforce waiting notify script execute success的相关文章

Nginx+Keepalived实现站点高可用

公司内部 OA 系统要做线上高可用,避免单点故障,所以计划使用2台虚拟机通过 Keepalived 工具来实现 nginx 的高可用(High Avaiability),达到一台nginx入口服务器宕机,另一台备机自动接管服务的效果.(nginx做反向代理,实现后端应用服务器的负载均衡)快速搭建请直接跳至 第2节. 1. Keepalived介绍 Keepalived是一个基于VRRP协议来实现的服务高可用方案,可以利用其来避免IP单点故障,类似的工具还有heartbeat.corosync.p

centos系统keepalived安装和配置过程详解

[keepalived安装] 1.下载keepalived. wget http://www.keepalived.org/software/keepalived-1.2.13.tar.gz 2.编译安装 yum install -y gcc openssl openssl-devel ipvsadm libnl-devel tar zxvf keepalived-1.2.13.tar.gz cd keepalived-1.2.13 ./configure --sysconfdir=/etc 

keepalived的原理、安装和配置

什么是Keepalived呢,keepalived观其名可知,保持存活,在网络里面就是保持在线了,也就是所谓的高可用或热备,用来防止单点故障(单点故障是指一旦某一点出现故障就会导致整个系统架构的不可用)的发生,那说到keepalived时不得不说的一个协议就是VRRP协议,可以说这个协议就是keepalived实现的基础,那么首先我们来看看VRRP协议 一,keepalived的原理 1,VRRP协议 学过网络的朋友都知道,网络在设计的时候必须考虑到冗余容灾,包括线路冗余,设备冗余等,防止网络存

Linux高可用(HA)之MySQL多主一从+Keepalived跨机房集群部署

添加host解析.时间同步和ssh互信(注:这里的做ssh互信的时候使用到一个脚本借助expect实现了面交互操作了) [root@DS-CentOS51 ~]# echo "172.16.0.51 mysql-master01 > 172.16.0.60 mysql-master02 > 172.16.0.63 mysql-slave01 > 172.16.0.69 mysql-slave02" >> /etc/hosts [root@DS-CentOS

keepalived track script introduce

前面简单的介绍了一下keepalived track script | interface weight & instance priority & election的关系,  http://blog.163.com/digoal@126/blog/static/163877040201471992259474/ 本文将重点介绍一下keepalived跟踪脚本的配置用法以及参数的解释 :  keepalived VRRP配置中有一个配置项, vrrp_script :  截取自 : kee

haproxy+keepalived实现高可用负载均衡(理论篇)_Linux

HAProxy相比LVS的使用要简单很多,功能方面也很丰富.当 前,HAProxy支持两种主要的代理模式:"tcp"也即4层(大多用于邮件服务器.内部协议通信服务器等),和7层(HTTP).在4层模式 下,HAProxy仅在客户端和服务器之间转发双向流量.7层模式下,HAProxy会分析协议,并且能通过允许.拒绝.交换.增加.修改或者删除请求 (request)或者回应(response)里指定内容来控制协议,这种操作要基于特定规则. 我现在用HAProxy主要在于它有以下优点,这里我

Keepalived+HAProxy实现MySQL高可用负载均衡的配置_Mysql

 Keepalived 由于在生产环境使用了mysqlcluster,需要实现高可用负载均衡,这里提供了keepalived+haproxy来实现.       keepalived主要功能是实现真实机器的故障隔离及负载均衡器间的失败切换.可在第3,4,5层交换.它通过VRRPv2(Virtual Router Redundancy Protocol) stack实现的.       Layer3:Keepalived会定期向服务器群中的服务器.发送一个ICMP的数据包(既我们平时用的Ping程

keepalived实现服务高可用

第1章 keepalived服务说明 1.1 keepalived是什么? Keepalived软件起初是专为LVS负载均衡软件设计的,用来管理并监控LVS集群系统中各个服务节点的状态,后来又加入了可以实现高可用的VRRP功能.因此,Keepalived除了能够管理LVS软件外,还可以作为其他服务(例如:Nginx.Haproxy.MySQL等)的高可用解决方案软件. Keepalived软件主要是通过VRRP协议实现高可用功能的.VRRP是Virtual Router RedundancyPr

PostgreSQL Notify/Listen Like ESB

ESB : 基于消息的调用企业服务的通信模块. 上个礼拜和一位老朋友吃饭,聊到他现在在做的一个产品.才初略的对ESB有点了解. 发现PostgreSQL独有的Notify和Listen与总线实现的东西类似.一般用于监测表的改变,配合触发器使用,通知接收方表的数据有改变,接收方及时采取刷新动作. 下面来简单的了解一下,详细的请参看PG源码src/backend/commands/async.c部分. LISTEN   用来注册一个消息通道.   如果在事务中执行listen, 那么事务必须成功co