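Next, let's test how to set things up when SERVICEs are used to separate workloads across two public network segments.
1. Configuration problems found during testing
We use the following statements (as the oracle user) to create a SERVICE that is active on all 3 nodes and is set to start automatically:
[oracle@racc ~]$ srvctl add service -d racdb -s pub1 -r racdb1,racdb2,racdb3 -y AUTOMATIC -k 1
[oracle@racc ~]$ srvctl add service -d racdb -s pub2 -r racdb1,racdb2,racdb3 -y AUTOMATIC -k 2
The network has to be specified here to keep the services apart, because we now have two sets of public NICs. If the public IP of one machine fails on one of the two networks, that machine's VIP fails over and its local listener stops; the SCAN VIP relocates as well, and the SCAN listener starts on another machine. At that point the instance on that machine can no longer be reached through this public network, while the other network is unaffected.
This produced an error:
[oracle@racc ~]$ srvctl add service -d racdb -s pub1 -r racdb1,racdb2,racdb3 -y AUTOMATIC -k 1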
PRCR-1006 : Failed to add resource ora.racdb.testall.svc for testall
PRCR-1071 : Failed to register or update resource ora.racdb.testall.svc
CRS-2566: User 'oracle' does not have sufficient permissions to operate on resource 'scanapp1', which is part of the dependency specification.
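The error is clearly a permissions problem. Comparing the permission settings showed that the SCANAPP VIPs we had created appeared to have the wrong permissions.
We then changed them with the following statements, granting execute permission to other users.
(PS: you need to be fairly familiar with RESOURCE attributes to do this.)
[root@racc ~]# /oracle/app/grid/product/11.2.0/bin/crsctl modify resource scanapp1 -attr "ACL='owner:root:rwx,pgrp:root:r-x,other::r-x,user:root:r-x'"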
[root@racc ~]# /oracle/app/grid/product/11.2.0/bin/crsctl modify resource scanapp2 -attr "ACL='owner:root:rwx,pgrp:root:r-x,other::r-x,user:root:r-x'"
[root@racc ~]# /oracle/app/grid/product/11.2.0/bin/crsctl modify resource scanapp3 -attr "ACL='owner:root:rwx,pgrp:root:r-x,other::r-x,user:root:r-x'"
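As a rough illustration of what the ACL change does, the sketch below checks whether an ACL string grants execute permission to "other". The function name acl_other_can_execute is hypothetical, the parsing assumes the comma-separated key:name:perms layout seen in the commands above, and the "before" value is made up for the example (the original ACL is not shown in these notes).

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit status 0) when the "other" entry of a
# Clusterware ACL string includes the execute bit.
# Assumes the "owner:root:rwx,pgrp:root:r-x,other::r-x" layout shown above.
acl_other_can_execute() (
    acl=$1
    # Split on commas and pick the entry that starts with "other:".
    other=$(printf '%s\n' "$acl" | tr ',' '\n' | grep '^other:' | head -n 1)
    # The permission triplet is the text after the last colon.
    perms=${other##*:}
    case $perms in
        ??x) exit 0 ;;
        *)   exit 1 ;;
    esac
)

# "before" is an invented value for illustration; "after" is the ACL set above.
before='owner:root:rwx,pgrp:root:r-x,other::r--,user:root:r-x'
after='owner:root:rwx,pgrp:root:r-x,other::r-x,user:root:r-x'

for acl in "$before" "$after"; do
    if acl_other_can_execute "$acl"; then
        echo "other can execute:    $acl"
    else
        echo "other cannot execute: $acl"
    fi
done
```

On a live cluster the ACL string would have to be read from the resource profile first; that step is left out here.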
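{While we are here, let's also fix the incorrect permissions on LISTENER2 from earlier:
Modify it while it is running:
crsctl modify resource "ora.LISTENER2.lsnr" -attr "ACL='owner:grid:rwx,pgrp:oinstall:rwx,other::r--'"
Stop:
srvctl stop listener -l LISTENER2
Start:
srvctl start listener -l LISTENER2
I also ran into a problem here: making the changes in the wrong order left my LISTENER2 resource in the UNKNOWN state on node racc, and it could not be cleaned.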
ora.LISTENER2.lsnr
ONLINE ONLINE raca
ONLINE ONLINE racb
ONLINE UNKNOWN racc
After that I ran:
crsctl modify resource "ora.LISTENER2.lsnr" -attr "ACL='owner:root:rwx,pgrp:root:r-x,other::r--'"
Then I stopped and started it, and only after redoing the steps in the normal order did it work. The error had been:
[grid@racc ~]$ crsctl stop res ora.LISTENER2.lsnr -f
CRS-2679: Attempting to clean 'ora.LISTENER2.lsnr' on 'racc'
CRS-2680: Clean of 'ora.LISTENER2.lsnr' on 'racc' failed
CRS-5807: Agent failed to process the message
CRS-4000: Command Stop failed, or completed with errors.
}
Then add the SERVICEs again and see what happens:
[oracle@racc ~]$ srvctl add service -d racdb -s pub1 -r racdb1,racdb2,racdb3 -y AUTOMATIC -k 1
[oracle@racc ~]$ srvctl add service -d racdb -s pub2 -r racdb1,racdb2,racdb3 -y AUTOMATIC -k 2
Both complete without problems.
Start the SERVICEs.
On inspection, both sets of SCAN listeners (LISTENER_SCAN* and SCAN_LISTENERAPP*) show the following information:
Service "testpub1" has 3 instance(s).
Instance "racdb1", status READY, has 1 handler(s) for this service...
Instance "racdb2", status READY, has 1 handler(s) for this service...
Instance "racdb3", status READY, has 1 handler(s) for this service...
Service "testpub2" has 3 instance(s).
Instance "racdb1", status READY, has 1 handler(s) for this service...
Instance "racdb2", status READY, has 1 handler(s) for this service...
Instance "racdb3", status READY, has 1 handler(s) for this service...
LISTENER1 and LISTENER2 on each node show the corresponding information as well.
However, if we now run ifconfig bond0 down, both SCAN_LISTENERAPP* and SCAN_LISTENER* show the following:
Service "testpub2" has 3 instance(s).
Instance "racdb1", status READY, has 1 handler(s) for this service...
Instance "racdb2", status READY, has 1 handler(s) for this service...
Instance "racdb3", status READY, has 1 handler(s) for this service...
Service "testpub1" has 2 instance(s).
Instance "racdb1", status READY, has 1 handler(s) for this service...
Instance "racdb2", status READY, has 1 handler(s) for this service...
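The meaning is clear: the instance on node 3 can no longer be reached through PUB1. So when we want to use SERVICEs to separate workloads, each SERVICE we create must also be tied to its NETWORK.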
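The comparison between the listings can also be done mechanically. The sketch below reads listener output on stdin and prints each service followed by the instances registered for it. The function name service_instances is hypothetical, and the parsing assumes the exact 'Service "..." has N instance(s).' and 'Instance "...", status ...' line format shown in the listings above.

```shell
#!/bin/sh
# Hypothetical helper: print "<service> <instance> <instance> ..." for each
# service found in listener output read from stdin.
service_instances() {
    awk -F'"' '
        /^[[:space:]]*Service "/  { svc = $2; order[++n] = svc; next }
        /^[[:space:]]*Instance "/ { inst[svc] = inst[svc] " " $2 }
        END { for (i = 1; i <= n; i++) print order[i] inst[order[i]] }
    '
}

# Sample input copied from the listing taken after bond0 was brought down.
service_instances <<'EOF'
Service "testpub2" has 3 instance(s).
Instance "racdb1", status READY, has 1 handler(s) for this service...
Instance "racdb2", status READY, has 1 handler(s) for this service...
Instance "racdb3", status READY, has 1 handler(s) for this service...
Service "testpub1" has 2 instance(s).
Instance "racdb1", status READY, has 1 handler(s) for this service...
Instance "racdb2", status READY, has 1 handler(s) for this service...
EOF
# prints:
# testpub2 racdb1 racdb2 racdb3
# testpub1 racdb1 racdb2
```

With this output, racdb3 missing from the testpub1 line is easy to spot or to grep for in a monitoring script.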
The connect descriptors should therefore be written like this, one per network:
RACSCAN2 =
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST = scantwo.gaopp.com)(PORT = 1521))
)
(CONNECT_DATA =
(SERVICE_NAME = testpub2 )
)
)
RACSCAN1 =
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST = scanone.gaopp.com)(PORT = 1521))
)
(CONNECT_DATA =
(SERVICE_NAME = testpub1)
)
)
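Since the two entries differ only in alias, SCAN host and service name, they can be generated from a small template. This is a minimal sketch; the function name tns_entry is hypothetical and the output follows the flat layout used in these notes.

```shell
#!/bin/sh
# Hypothetical helper: print a tnsnames.ora entry in the layout used above.
# Arguments: alias, SCAN host name, service name, optional port (default 1521).
tns_entry() {
    printf '%s =\n' "$1"
    printf '(DESCRIPTION =\n'
    printf '(ADDRESS_LIST =\n'
    printf '(ADDRESS = (PROTOCOL = TCP)(HOST = %s)(PORT = %s))\n' "$2" "${4:-1521}"
    printf ')\n'
    printf '(CONNECT_DATA =\n'
    printf '(SERVICE_NAME = %s)\n' "$3"
    printf ')\n'
    printf ')\n'
}

# One entry per network: each SCAN name is paired with the service
# that was created on the same network.
tns_entry RACSCAN1 scanone.gaopp.com testpub1
tns_entry RACSCAN2 scantwo.gaopp.com testpub2
```

Keeping the pairing in one place makes it harder to point a service at the SCAN name of the wrong network by accident.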
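Tedious, isn't it? This is what it takes to meet the requirement, and it needs a lot of resources: 3 nodes mean 6 VIP resources, 6 local listeners, 6 SCAN IPs and 6 SCAN listeners, and every database needs 2 SERVICE resources for the separation. Add the DISKGROUP resources, the resources of multiple databases and the default resources, and the total comes to roughly 60-70. That is very hard to manage and demands a lot from the administrator.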
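The network-related part of that count can be worked out with simple arithmetic. The sketch below only covers the resources listed above (VIPs, local listeners, SCAN IPs, SCAN listeners and services); it assumes 3 SCAN IPs per network and a single database, and it does not try to model the disk group, database or default resources that make up the rest of the 60-70 estimate.

```shell
#!/bin/sh
# Assumed inputs for illustration: 3 nodes, 2 public networks,
# 3 SCAN IPs per network, 1 database with one service per network.
nodes=3
networks=2
scan_ips_per_network=3
databases=1

vips=$((nodes * networks))                      # one VIP per node per network
local_listeners=$((nodes * networks))           # one local listener per VIP
scan_ips=$((scan_ips_per_network * networks))   # SCAN VIPs
scan_listeners=$scan_ips                        # one SCAN listener per SCAN VIP
services=$((databases * networks))              # one service per network per database

total=$((vips + local_listeners + scan_ips + scan_listeners + services))
echo "VIPs=$vips local_listeners=$local_listeners scan_ips=$scan_ips scan_listeners=$scan_listeners services=$services"
echo "network-related resources: $total"
```

With these inputs the network-related resources alone come to 26, before anything else is counted.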
2. Workload separation test, including whether the SERVICE moves back together with the VIP after the node is restarted
[oracle@racc ~]$ srvctl add service -d racdb -s hrpub1 -r racdb3 -a racdb1,racdb2 -y automatic -k 1
[oracle@racc ~]$ srvctl add service -d racdb -s hrpub2 -r racdb3 -a racdb1,racdb2 -y automatic -k 2
[oracle@racc ~]$ srvctl start service -d racdb -s hrpub1
[oracle@racc ~]$ srvctl start service -d racdb -s hrpub2
That adds 2 more resources. Sigh.
ora.racdb.hrpub1.svc
1 ONLINE ONLINE racc
ora.racdb.hrpub2.svc
1 ONLINE ONLINE racc
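The two srvctl add service commands above differ only in the service name suffix and the network number, so they can be produced by a small dry-run helper. This is a sketch: the function name print_add_service_cmds is hypothetical, and it only prints the commands (with the same flags used above) instead of running them.

```shell
#!/bin/sh
# Hypothetical dry-run helper: print (do not run) one "srvctl add service"
# command per network, using the flags from the commands above.
# Arguments: database, service base name, preferred instances,
#            available instances, number of networks.
print_add_service_cmds() (
    db=$1; base=$2; preferred=$3; available=$4; networks=$5
    n=1
    while [ "$n" -le "$networks" ]; do
        printf 'srvctl add service -d %s -s %s%s -r %s -a %s -y automatic -k %s\n' \
            "$db" "$base" "$n" "$preferred" "$available" "$n"
        n=$((n + 1))
    done
)

print_add_service_cmds racdb hrpub racdb3 racdb1,racdb2 2
# prints:
# srvctl add service -d racdb -s hrpub1 -r racdb3 -a racdb1,racdb2 -y automatic -k 1
# srvctl add service -d racdb -s hrpub2 -r racdb3 -a racdb1,racdb2 -y automatic -k 2
```

Printing first and reviewing the commands before running them as the oracle user keeps the service-to-network pairing visible.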
Bring bond1 down directly to simulate a failure, so that its VIP, app VIP and app SCAN listener all relocate and the local LISTENER goes offline:
ifconfig bond1 down
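Test result: the SERVICE cannot
3. Finally, I went back to the SCAN+VIP model for testing
This removes several resources: the 3 application VIPs are gone and there are 3 fewer SCAN listeners. It is also simpler and comparatively easier to maintain.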
[oracle@racc ~]$ srvctl add service -d racdb -s hrpub1 -r racdb3 -a racdb1,racdb2 -y automatic -k 1
[oracle@racc ~]$ srvctl add service -d racdb -s hrpub2 -r racdb3 -a racdb1,racdb2 -y automatic -k 2
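Note that hrpub1 is tied to network1. Say network1 is the 172.16.14.0 segment and network2 is the 172.16.1.0 segment. If the NIC carrying 172.16.14.113 on the racdb3 machine fails, the VIP fails over and the local listener shuts down. Because racdb3 can no longer be reached on NETWORK1, service hrpub1 relocates. The SCAN VIP and SCAN_LISTENER relocate as well, and the instance registered with SCAN_LISTENER changes from racdb3 to its backup, racdb1 or racdb2. NETWORK2, however, is fine: hrpub2 is still registered with the local listener on the racdb3 machine, the registered instance is still racdb3, and it is unaffected.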
ora.racdb.hrpub1.svc
1 ONLINE ONLINE racc
ora.racdb.hrpub2.svc
1 ONLINE ONLINE racc
That is the normal state.
Then we bring down bond0, which carries 172.16.14.113:
[root@racc ~]# ifconfig bond0 down
ora.rac1vip.vip
1 ONLINE ONLINE raca
ora.rac2vip.vip
1 ONLINE ONLINE racb
ora.rac3vip.vip
1 ONLINE ONLINE racc
ora.raca.vip
1 ONLINE ONLINE raca
ora.racb.vip
1 ONLINE ONLINE racb
ora.racc.vip
1 ONLINE INTERMEDIATE racb FAILED OVER
We can see that ora.racc.vip has failed over, while ora.rac3vip.vip is unaffected.
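Spotting the failed-over VIP in a long status listing can be automated. The sketch below scans status text on stdin and reports every resource whose detail column says FAILED OVER, together with the node it is now running on. The function name failed_over is hypothetical, and the parsing assumes the layout shown above: a resource name line starting with "ora." followed by lines of the form "<n> <target> <state> <server> <details>".

```shell
#!/bin/sh
# Hypothetical helper: report resources marked FAILED OVER in status
# output read from stdin.
failed_over() {
    awk '
        /^ora\./      { res = $1; next }              # remember the resource name
        /FAILED OVER/ { print res, "is now on", $4 }  # field 4 is the server
    '
}

# Sample input copied from the listing above.
failed_over <<'EOF'
ora.rac3vip.vip
1 ONLINE ONLINE racc
ora.racb.vip
1 ONLINE ONLINE racb
ora.racc.vip
1 ONLINE INTERMEDIATE racb FAILED OVER
EOF
# prints: ora.racc.vip is now on racb
```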
ora.LISTENER.lsnr
ONLINE ONLINE raca
ONLINE ONLINE racb
ONLINE OFFLINE racc
ora.LISTENER2.lsnr
ONLINE ONLINE raca
ONLINE ONLINE racb
ONLINE ONLINE racc
Listener1 (ora.LISTENER.lsnr) depends on ora.racc.vip, so it has gone OFFLINE.
ora.racdb.hrpub1.svc
1 ONLINE ONLINE raca
ora.racdb.hrpub2.svc
1 ONLINE ONLINE racc
This also matches what we expected.
Check the listener:
Service "hrpub1" has 1 instance(s).
Instance "racdb1", status READY, has 1 handler(s) for this service...
Service "hrpub2" has 1 instance(s).
Instance "racdb3", status READY, has 2 handler(s) for this service...
And of course the connect descriptors are:
RACPUB2 =
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST = 172.16.1.114)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = 172.16.1.115)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = 172.16.1.116)(PORT = 1521))
)
(CONNECT_DATA =
(SERVICE_NAME = hrpub2)
)
)
RACPUB1 =
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST = scanone.gaopp.com)(PORT = 1521))
)
(CONNECT_DATA =
(SERVICE_NAME = hrpub1)
)
)
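When entries mix raw addresses and SCAN names like this, it helps to check which network each address belongs to before pairing it with a service. The sketch below maps an IPv4 address to the network numbers used in these notes. The function name network_of_ip is hypothetical, and it assumes both segments (172.16.14.0 and 172.16.1.0) use a /24 mask, which the notes do not state.

```shell
#!/bin/sh
# Hypothetical helper: map an address to the network number used above.
# Assumes /24 masks: 172.16.14.0/24 is network 1, 172.16.1.0/24 is network 2.
network_of_ip() {
    case $1 in
        172.16.14.*) echo 1 ;;
        172.16.1.*)  echo 2 ;;
        *)           echo unknown ;;
    esac
}

# Addresses taken from the text and the RACPUB2 entry above.
for ip in 172.16.14.113 172.16.1.114 172.16.1.115 172.16.1.116; do
    echo "$ip -> network $(network_of_ip "$ip")"
done
```

Under that assumption all three RACPUB2 addresses fall in network 2, which matches hrpub2 having been created with -k 2.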