早上francs问的一个问题, 在rc.local中配置的如下 :
# add by pg_clusterd
/sbin/service network start
/sbin/ifup eth0
/sbin/ifdown eth0:1
ifdown eth0:1目的是用来关闭虚拟IP, 这样做可能会导致eth0:1的IP会短暂的启用(service network start后).
实际上是配置不当导致, 因为alias不是用ONBOOT来控制是否自动启动的.
参考 /usr/share/doc/initscripts*/sysconfig.txt
ONBOOT=yes|no (not valid for alias devices; use ONPARENT)
我们来分析一下网络配置 :
配置文件如下 :
# less ifcfg-eth0
DEVICE="eth0"
BOOTPROTO="static"
IPADDR="172.16.3.221"
NETMASK="255.255.255.0"
HWADDR="5C:F3:FC:94:12:00"
IPV6INIT="no"
MTU="1500"
NM_CONTROLLED="no"
ONBOOT="yes"
TYPE="Ethernet"
UUID="5904b633-4b24-47d4-b710-292feedda75b"
# less ifcfg-eth0:1
DEVICE="eth0:1"
BOOTPROTO="static"
IPADDR="172.16.3.222"
NETMASK="255.255.255.0"
IPV6INIT="no"
MTU="1500"
NM_CONTROLLED="no"
ONBOOT="no"
TYPE="Ethernet"
注意子接口eth0:1的ONBOOT=no, 并不会导致不启动, 是没有用的, 如果要控制alias接口不要自动启动, 请使用ONPARENT=no.
ONBOOT只被ifup boot参数使用.
SYNOPSIS
ifup IFACE [boot]
ifdown IFACE
DESCRIPTION
The ifup and ifdown commands may be used to configure (or, respec- tively, deconfigure) network interfaces
based on interface definitions in the files /etc/sysconfig/network and /etc/sysconfig/network-
scripts/ifcfg-<configuration>
These scripts take one argument normally: the name of the configuration (e.g. eth0). They are called with a
second argument of "boot" during the boot sequence so that devices that are not meant to be brought up on boot
(ONBOOT=no, see below) can be ignored at that time.
正常情况下配置了ONBOOT=no, 使用network服务来管理的话, 这个接口是不会起来的, 为什么呢?
看看network这个脚本
# less /etc/init.d/network
action $"Bringing up interface $i: " ./ifup $i boot
ifup 带 boot参数只启动ONBOOT=yes的接口.
例如如果配置了eth0:1的ONBOOT=NO, 我们直接使用ifup启动eth0:1是不会起这个接口的.
ifup eth0:1 boot
没有返回.
但是如果没有配置ONPARENT=no的话, 启动network 服务时, 子接口的IP还是起来了, 原因是启eth0时, 会连带起eth0:1, 并且没有探测eth0:1配置的IP.
原因是ifup-post中包含了ifup-aliases, 用来启动子接口.
ifup eth0
这个操作会启动eth0以及所有子接口, 不管子接口的ONBOOT是否为no, 只关心ONPARENT是否为no.
假设广播域中已经存在eth0:1配置的IP, service network start来启动的话, 会导致eth0:1起来, 也就是广播域会出现两个同样的IP,.
这也就是rc.local里面配置ifdown eth0:1的原因.
但是这样短暂的冲突时无法避免的.
解决办法推荐 :
方法1.
可以换一个方式, 不使用子接口来管理虚拟IP, 直接使用arping和ip来管理 VIP.
删除配置文件 ifcfg-eth0:1
启动时不再包含VIP.
要使用VIP时, 直接在link上添加IP即可.
例如 :
启动VIP前最好也先检测一下IP在网络中是否已经存在.
/sbin/arping -q -c 2 -w 3 -D -I eth0 172.16.3.222 && ip addr add 172.16.3.222/24 dev eth0
要删除VIP, 直接删除即可.
ip addr del 172.16.3.222/24 dev eth0
IP存在, arping返回1 :
注意必须要使用正确的接口, 如这里是br0.
[root@db-172-16-3-221 network-scripts]# /sbin/arping -q -c 2 -w 3 -D -I br0 172.16.3.2
[root@db-172-16-3-221 network-scripts]# echo $?
1
[root@db-172-16-3-221 network-scripts]# arp -a
150.sky-mobi.com (172.16.3.150) at 00:22:19:60:77:8f [ether] on br0
? (172.16.3.1) at 00:00:5e:00:01:03 [ether] on br0
? (172.16.3.3) at 68:ef:bd:08:3f:bf [ether] on br0
? (172.16.3.2) at e2:e8:2c:5e:7c:af [ether] on br0
IP不存在, arping返回0
[root@db-172-16-3-221 network-scripts]# /sbin/arping -q -c 2 -w 3 -D -I br0 172.16.3.4
[root@db-172-16-3-221 network-scripts]# echo $?
0
[root@db-172-16-3-221 network-scripts]# arp -a
150.sky-mobi.com (172.16.3.150) at 00:22:19:60:77:8f [ether] on br0
? (172.16.3.1) at 00:00:5e:00:01:03 [ether] on br0
? (172.16.3.3) at 68:ef:bd:08:3f:bf [ether] on br0
? (172.16.3.2) at 00:25:84:66:d0:3f [ether] on br0
? (172.16.3.4) at <incomplete> on br0
以下摘自ifup-aliases, ifup-eth
if ! LC_ALL=C ip addr ls ${REALDEVICE} | LC_ALL=C grep -q "${ipaddr[$idx]}/${prefix[$idx]}" ; then
if [ "${REALDEVICE}" != "lo" ] && [ "${arpcheck[$idx]}" != "no" ] ; then
echo $"Determining if ip address ${ipaddr[$idx]} is already in use for device ${REALDEVICE}..."
if ! /sbin/arping -q -c 2 -w 3 -D -I ${REALDEVICE} ${ipaddr[$idx]} ; then
net_log $"Error, some other host already uses address ${ipaddr[$idx]}."
exit 1
fi
fi
if [ "$setup_this" = "yes" ] ; then
if [ "${parent_device}" != "lo" ] && [ "${ARPCHECK}" != "no" ] && \
is_available ${parent_device} && \
grep -qswi "up" /sys/class/net/${parent_device}/operstate ; then
echo $"Determining if ip address ${IPADDR} is already in use for device ${parent_device}..."
if ! /sbin/arping -q -c 2 -w 3 -D -I ${parent_device} ${IPADDR} ; then
net_log $"Error, some other host already uses address ${IPADDR}."
return 1
fi
fi
方法2.
修改ifup-post脚本, 注释掉启动aliases接口的脚本. 那么就不会自动启动子接口.
# vi /etc/sysconfig/network-scripts/ifup-post
#if [ "$ISALIAS" = no ] ; then
# /etc/sysconfig/network-scripts/ifup-aliases ${DEVICE} ${CONFIG}
#fi
测试 :
启动network服务, 不会再自动启动子接口了.
[root@db6 network-scripts]# cp ifcfg-bond0 ifcfg-bond0:1
[root@db6 network-scripts]# vi ifcfg-bond0:1
DEVICE=bond0:1
ONBOOT=no
BOOTPROTO=none
IPADDR=172.16.3.2
NETMASK=255.255.255.0
BROADCAST=172.16.3.255
TYPE=Ethernet
[root@db6 network-scripts]# service network restart
Shutting down interface bond0: [ OK ]
Shutting down loopback interface: [ OK ]
Bringing up loopback interface: [ OK ]
Bringing up interface bond0: [ OK ]
[root@db6 network-scripts]# ifconfig
bond0 Link encap:Ethernet HWaddr 00:23:7D:A3:F0:0A
inet addr:172.16.3.174 Bcast:172.16.3.255 Mask:255.255.255.0
inet6 addr: fe80::223:7dff:fea3:f00a/64 Scope:Link
UP BROADCAST RUNNING MASTER MULTICAST MTU:1500 Metric:1
RX packets:10 errors:0 dropped:0 overruns:0 frame:0
TX packets:43 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:732 (732.0 b) TX bytes:10573 (10.3 KiB)
手工启动子接口 :
[root@db6 network-scripts]# ifup bond0:1
[root@db6 network-scripts]# ifconfig
bond0 Link encap:Ethernet HWaddr 00:23:7D:A3:F0:0A
inet addr:172.16.3.174 Bcast:172.16.3.255 Mask:255.255.255.0
inet6 addr: fe80::223:7dff:fea3:f00a/64 Scope:Link
UP BROADCAST RUNNING MASTER MULTICAST MTU:1500 Metric:1
RX packets:319 errors:0 dropped:0 overruns:0 frame:0
TX packets:145 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:36732 (35.8 KiB) TX bytes:26445 (25.8 KiB)
bond0:1 Link encap:Ethernet HWaddr 00:23:7D:A3:F0:0A
inet addr:172.16.3.2 Bcast:172.16.3.255 Mask:255.255.255.0
UP BROADCAST RUNNING MASTER MULTICAST MTU:1500 Metric:1
方法3.
子接口中使用ONPARENT=no, 例如 :
DEVICE=bond0:1
ONPARENT=no
ONBOOT=no
BOOTPROTO=none
IPADDR=172.16.3.2
NETMASK=255.255.255.0
BROADCAST=172.16.3.255
TYPE=Ethernet
GATEWAY=172.16.3.1
最后还有一个问题, ifup-aliases启动 子接口时, 不会检测子接口的IP是否存在.
例如 :
[root@db6 network-scripts]# ifup-aliases bond0
[root@db6 network-scripts]# ifconfig
bond0 Link encap:Ethernet HWaddr 00:23:7D:A3:F0:0A
inet addr:172.16.3.174 Bcast:172.16.3.255 Mask:255.255.255.0
inet6 addr: fe80::223:7dff:fea3:f00a/64 Scope:Link
UP BROADCAST RUNNING MASTER MULTICAST MTU:1500 Metric:1
RX packets:946 errors:0 dropped:0 overruns:0 frame:0
TX packets:788 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:102384 (99.9 KiB) TX bytes:129685 (126.6 KiB)
bond0:1 Link encap:Ethernet HWaddr 00:23:7D:A3:F0:0A
inet addr:172.16.3.150 Bcast:172.16.3.255 Mask:255.255.255.0
UP BROADCAST RUNNING MASTER MULTICAST MTU:1500 Metric:1
使用ifup会检测IP是否存在.
[root@db6 network-scripts]# ifdown bond0:1
[root@db6 network-scripts]# ifup bond0:1
Error, some other host already uses address 172.16.3.150.
这个问题如何解决?
修改ifup-aliases脚本 :
/etc/sysconfig/network-scripts/ifup-aliases :
##echo "setting up device $DEVICE"
/sbin/ifconfig ${DEVICE} ${IPADDR} netmask ${NETMASK} broadcast ${BROADCAST}
改成
##echo "setting up device $DEVICE"
/sbin/arping -q -c 2 -w 3 -D -I ${DEVICE} ${IPADDR} && /sbin/ifconfig ${DEVICE} ${IPADDR} netmask ${NETMASK} broadcast ${BROADCAST}
[小结]
1. 搞清楚aliases接口的启动控制不是ONBOOT, 而是ONPARENT.
2. 对于子接口启动不检查子接口地址是否在网络中重复的问题, 需要修改/etc/sysconfig/network-scripts/ifup-aliases.
[参考]
1. man ifup
2. /etc/sysconfig/network-scripts/ifup
3. /etc/sysconfig/network-scripts/ifup-eth
4. /etc/init.d/network
5. /usr/share/doc/initscripts-*/sysconfig.txt
6. arping - send ARP REQUEST to a neighbour host
-D Duplicate address detection mode (DAD). See RFC2131, 4.4.1. Returns 0, if DAD succeeded i.e. no replies
are received