References:
11gR2 Clusterware and Grid Home - What You Need to Know (Doc ID 1053147.1)
Troubleshooting Grid Infrastructure Startup Issues (Doc ID 1623340.1)
Oracle 11gR2 reclassifies CRSD resources into two groups, Local Resources and Cluster Resources. Both can be viewed with crsctl:
[root@rac1 ~]# crsctl stat res -t
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA.dg
ONLINE ONLINE rac1
OFFLINE OFFLINE rac2
ora.FRA.dg
ONLINE ONLINE rac1
OFFLINE OFFLINE rac2
ora.LISTENER.lsnr
ONLINE ONLINE rac1
ONLINE ONLINE rac2
ora.OCR_VOTE.dg
ONLINE ONLINE rac1
ONLINE ONLINE rac2
ora.asm
ONLINE ONLINE rac1 Started
ONLINE ONLINE rac2 Started
ora.eons
ONLINE ONLINE rac1
ONLINE ONLINE rac2
ora.gsd
OFFLINE OFFLINE rac1
OFFLINE OFFLINE rac2
ora.net1.network
ONLINE ONLINE rac1
ONLINE ONLINE rac2
ora.ons
ONLINE ONLINE rac1
ONLINE ONLINE rac2
ora.registry.acfs
ONLINE ONLINE rac1
ONLINE ONLINE rac2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
1 ONLINE ONLINE rac1
ora.oc4j
1 OFFLINE OFFLINE
ora.rac1.vip
1 ONLINE ONLINE rac1
ora.rac2.vip
1 ONLINE ONLINE rac2
ora.scan1.vip
1 ONLINE ONLINE rac1
ora.test.db
1 ONLINE ONLINE rac1 Open
2 OFFLINE OFFLINE        # I deliberately shut down the database instance on node rac2
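Comparing the two sections: Local Resources run on every node and are managed per node without failing over (disk groups, ASM, the node listener, the network, ONS), while Cluster Resources are cluster-aware and can relocate to another node (the SCAN listener, the VIPs, the database).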
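To compare the two sections programmatically, the listing can be grouped by section and resource name. The following is a minimal sketch, assuming the flattened layout shown above (a resource name line followed by one state line per node or instance); `parse_stat_res` and `SAMPLE` are hypothetical names, and the sample is a shortened copy of the output above rather than new data.

```python
def parse_stat_res(text):
    """Group `crsctl stat res -t` style text by section and resource name."""
    sections = {}
    section = None
    resource = None
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines, dashed separators and the column header.
        if not line or line.startswith("-") or line.startswith("NAME"):
            continue
        if line in ("Local Resources", "Cluster Resources"):
            section = line
            sections[section] = {}
            resource = None
            continue
        if section is None:
            continue
        if line.startswith("ora."):
            resource = line
            sections[section][resource] = []
            continue
        if resource is None:
            continue
        fields = line.split()
        # Cluster resources carry a leading instance number; drop it.
        if fields[0].isdigit():
            fields = fields[1:]
        if len(fields) < 2:
            continue
        target, state = fields[0], fields[1]
        server = fields[2] if len(fields) > 2 else None
        sections[section][resource].append((target, state, server))
    return sections


# Shortened copy of the listing above (assumption: same flattened layout).
SAMPLE = """\
Local Resources
ora.DATA.dg
ONLINE ONLINE rac1
OFFLINE OFFLINE rac2
ora.asm
ONLINE ONLINE rac1 Started
ONLINE ONLINE rac2 Started
Cluster Resources
ora.scan1.vip
1 ONLINE ONLINE rac1
ora.test.db
1 ONLINE ONLINE rac1 Open
2 OFFLINE OFFLINE
"""

if __name__ == "__main__":
    for section, resources in parse_stat_res(SAMPLE).items():
        print(section, sorted(resources))
```

Local resources yield one entry per node, whereas cluster resources yield one entry per instance, and an instance that is not running has no server.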
The resources managed by ohasd can be viewed with the following command:
[root@rac1 ~]# crsctl stat res -init -t        # run on node 1
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
1 ONLINE ONLINE rac1 Started
ora.crsd
1 ONLINE ONLINE rac1
ora.cssd
1 ONLINE ONLINE rac1
ora.cssdmonitor
1 ONLINE ONLINE rac1
ora.ctssd
1 ONLINE ONLINE rac1 OBSERVER
ora.diskmon
1 ONLINE ONLINE rac1
ora.drivers.acfs
1 ONLINE ONLINE rac1
ora.evmd
1 ONLINE ONLINE rac1
ora.gipcd
1 ONLINE ONLINE rac1
ora.gpnpd
1 ONLINE ONLINE rac1
ora.mdnsd
1 ONLINE ONLINE rac1
[root@rac2 ~]# crsctl stat res -init -t        # run on node 2
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
1 ONLINE ONLINE rac2 Started
ora.crsd
1 ONLINE ONLINE rac2
ora.cssd
1 ONLINE ONLINE rac2
ora.cssdmonitor
1 ONLINE ONLINE rac2
ora.ctssd
1 ONLINE ONLINE rac2 OBSERVER
ora.diskmon
1 ONLINE ONLINE rac2
ora.drivers.acfs
1 ONLINE ONLINE rac2
ora.evmd
1 ONLINE ONLINE rac2
ora.gipcd
1 ONLINE ONLINE rac2
ora.gpnpd
1 ONLINE ONLINE rac2
ora.mdnsd
1 ONLINE ONLINE rac2
The HAS daemon on each node sees and manages a different set of resources; in other words, HAS manages only the processes on its own server. Next, let's stop HAS:
[root@rac1 bin]# ./crsctl stop has
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'rac1'
CRS-2673: Attempting to stop 'ora.crsd' on 'rac1'
CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on 'rac1'
CRS-2673: Attempting to stop 'ora.LISTENER_SCAN1.lsnr' on 'rac1'
CRS-2673: Attempting to stop 'ora.OCRVOTING.dg' on 'rac1'
CRS-2673: Attempting to stop 'ora.sdd.db' on 'rac1'
CRS-2673: Attempting to stop 'ora.LISTENER.lsnr' on 'rac1'
CRS-2673: Attempting to stop 'ora.oc4j' on 'rac1'
CRS-2673: Attempting to stop 'ora.cvu' on 'rac1'
CRS-2677: Stop of 'ora.LISTENER_SCAN1.lsnr' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.scan1.vip' on 'rac1'
CRS-2677: Stop of 'ora.LISTENER.lsnr' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.rac1.vip' on 'rac1'
CRS-2677: Stop of 'ora.rac1.vip' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.rac1.vip' on 'rac2'
CRS-2677: Stop of 'ora.scan1.vip' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.scan1.vip' on 'rac2'
CRS-2676: Start of 'ora.scan1.vip' on 'rac2' succeeded
CRS-2676: Start of 'ora.rac1.vip' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.LISTENER_SCAN1.lsnr' on 'rac2'
CRS-2677: Stop of 'ora.sdd.db' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.DATA.dg' on 'rac1'
CRS-2673: Attempting to stop 'ora.FRA.dg' on 'rac1'
CRS-2676: Start of 'ora.LISTENER_SCAN1.lsnr' on 'rac2' succeeded
CRS-2677: Stop of 'ora.FRA.dg' on 'rac1' succeeded
CRS-2677: Stop of 'ora.DATA.dg' on 'rac1' succeeded
CRS-2677: Stop of 'ora.oc4j' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.oc4j' on 'rac2'
CRS-2677: Stop of 'ora.cvu' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.cvu' on 'rac2'
CRS-2676: Start of 'ora.cvu' on 'rac2' succeeded
CRS-2677: Stop of 'ora.OCRVOTING.dg' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.asm' on 'rac1'
CRS-2677: Stop of 'ora.asm' on 'rac1' succeeded
CRS-2676: Start of 'ora.oc4j' on 'rac2' succeeded
CRS-2673: Attempting to stop 'ora.ons' on 'rac1'
CRS-2677: Stop of 'ora.ons' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.net1.network' on 'rac1'
CRS-2677: Stop of 'ora.net1.network' on 'rac1' succeeded
CRS-2792: Shutdown of Cluster Ready Services-managed resources on 'rac1' has completed
CRS-2677: Stop of 'ora.crsd' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.mdnsd' on 'rac1'
CRS-2673: Attempting to stop 'ora.ctssd' on 'rac1'
CRS-2673: Attempting to stop 'ora.evmd' on 'rac1'
CRS-2673: Attempting to stop 'ora.asm' on 'rac1'
CRS-2677: Stop of 'ora.evmd' on 'rac1' succeeded
CRS-2677: Stop of 'ora.mdnsd' on 'rac1' succeeded
CRS-2677: Stop of 'ora.ctssd' on 'rac1' succeeded
CRS-2677: Stop of 'ora.asm' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'rac1'
CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'rac1'
CRS-2677: Stop of 'ora.cssd' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.crf' on 'rac1'
CRS-2677: Stop of 'ora.crf' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'rac1'
CRS-2677: Stop of 'ora.gipcd' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.gpnpd' on 'rac1'
CRS-2677: Stop of 'ora.gpnpd' on 'rac1' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'rac1' has completed
CRS-4133: Oracle High Availability Services has been stopped.
[root@rac1 bin]#
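Note:
I tested this on Oracle 11gR2. Running the command on node 1 stopped only the processes on node 1 and relocated the related resources to node 2, which confirms the point made above: the command affects only the current server.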
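Which resources were actually relocated can be read off the shutdown log by pairing each successful stop on the local node (CRS-2677) with a successful start on another node (CRS-2676). The following is a rough sketch rather than an official tool: `relocated` and `SAMPLE_LOG` are hypothetical names, the sample lines are copied from the log above, and the pattern tolerates missing spaces around the quoted names.

```python
import re

# CRS-2677 = "Stop of ... succeeded", CRS-2676 = "Start of ... succeeded",
# as they appear in the shutdown log above.
EVENT = re.compile(
    r"CRS-(?:2676|2677): (Start|Stop) of\s*'([^']+)'\s*on\s*'([^']+)'\s*succeeded"
)


def relocated(log, from_node):
    """Return {resource: new_node} for resources stopped on from_node
    and successfully started on a different node."""
    stopped = set()
    started_elsewhere = {}
    for line in log.splitlines():
        match = EVENT.search(line)
        if not match:
            continue
        action, resource, node = match.groups()
        if action == "Stop" and node == from_node:
            stopped.add(resource)
        elif action == "Start" and node != from_node:
            started_elsewhere[resource] = node
    return {r: n for r, n in started_elsewhere.items() if r in stopped}


# A few lines copied from the log above.
SAMPLE_LOG = """\
CRS-2677: Stop of 'ora.rac1.vip' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.rac1.vip' on 'rac2'
CRS-2676: Start of 'ora.rac1.vip' on 'rac2' succeeded
CRS-2677: Stop of 'ora.DATA.dg' on 'rac1' succeeded
CRS-2677: Stop of 'ora.oc4j' on 'rac1' succeeded
CRS-2676: Start of 'ora.oc4j' on 'rac2' succeeded
"""

if __name__ == "__main__":
    print(relocated(SAMPLE_LOG, "rac1"))
```

On these sample lines only ora.rac1.vip and ora.oc4j are reported as moved to rac2. The disk group ora.DATA.dg is stopped but never started elsewhere.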
Starting HAS
[root@rac1 bin]# ./crsctl start has
CRS-4123: Oracle High Availability Services has been started.
[root@rac1 bin]#
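The output only reports that HAS has been started. In fact, all resources managed by Oracle Restart are then started in the background. This can be verified with the crs_stat command (deprecated in 11gR2 in favor of crsctl stat res -t), although startup in Oracle 11g is slow and takes some patience.
After HAS is stopped, the following processes running as the grid user are shut down: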
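Because the resources come up slowly after `crsctl start has` returns, a simple polling loop can replace repeated manual checks. The following is a sketch under stated assumptions: `targets_reached`, `wait_until` and `crsctl_probe` are hypothetical helpers, the timeout and interval are arbitrary, and "ready" is taken to mean that every resource whose TARGET is ONLINE also shows STATE ONLINE in `crsctl stat res -t` style output.

```python
import subprocess
import time


def targets_reached(stat_output):
    """True when every ONLINE target in the listing is also in state ONLINE.
    Resources whose target is OFFLINE (for example ora.gsd) are ignored."""
    seen_state_line = False
    for line in stat_output.splitlines():
        fields = line.split()
        if fields and fields[0].isdigit():  # drop the instance number
            fields = fields[1:]
        if len(fields) >= 2 and fields[0] in ("ONLINE", "OFFLINE"):
            seen_state_line = True
            if fields[0] == "ONLINE" and fields[1] != "ONLINE":
                return False
    return seen_state_line


def wait_until(probe, timeout=600.0, interval=10.0,
               sleep=time.sleep, clock=time.monotonic):
    """Call probe() until it returns True or the timeout expires."""
    deadline = clock() + timeout
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


def crsctl_probe(crsctl="crsctl"):
    """Assumed wiring: run `crsctl stat res -t` and test its output."""
    result = subprocess.run([crsctl, "stat", "res", "-t"],
                            capture_output=True, text=True)
    return result.returncode == 0 and targets_reached(result.stdout)
```

On a cluster node, `wait_until(crsctl_probe)` would block until the stack is up or ten minutes have passed. The probe is passed in as a parameter so the loop can be exercised without a cluster.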
[root@rac1 ~]# ps -fu grid
UID PID PPID C STIME TTY TIME CMD
grid 4899 1 0 22:28 ? 00:00:00 /u01/app/11.2.0/grid/bin/oraagent.bin
grid 4912 1 0 22:28 ? 00:00:00 /u01/app/11.2.0/grid/bin/gipcd.bin
grid 4917 1 0 22:28 ? 00:00:00 /u01/app/11.2.0/grid/bin/mdnsd.bin
grid 4932 1 0 22:28 ? 00:00:00 /u01/app/11.2.0/grid/bin/gpnpd.bin
grid 4992 1 1 22:28 ? 00:00:01 /u01/app/11.2.0/grid/bin/ocssd.bin
grid 5008 1 0 22:28 ? 00:00:00 /u01/app/11.2.0/grid/bin/diskmon.bin -d -f
These processes are explained below:
(3)Grid Plug and Play (GPNPD):
Provides access to the Grid Plug and Play profile, and coordinates updates to the profile among the nodes of the cluster to ensure that all of the nodes have the most recent profile.
(4)Grid Interprocess Communication (GIPC):
A support daemon that enables Redundant Interconnect Usage.
(5) Multicast Domain Name Service (mDNS, resource ora.mdnsd):
Used by Grid Plug and Play to locate profiles in the cluster, as well as by GNS to perform name resolution. The mDNS process is a background process on Linux and UNIX, and a service on Windows.
(6)Cluster Time Synchronization Service (CTSS):
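Provides time management in a cluster for Oracle Clusterware. In the query output above, the state of CTSS is OBSERVER.
In 11gR2, a RAC installation can synchronize time in two ways: with NTP or with CTSS. When the installer finds that NTP is not active, the Cluster Time Synchronization Service is installed in active mode and synchronizes the time across all nodes. If NTP is found to be configured, the Cluster Time Synchronization Service starts in observer mode, and Oracle Clusterware performs no active time synchronization in the cluster.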
(7)Automatic Storage Management Cluster File System (Oracle ACFS):
Oracle Automatic Storage Management Cluster File System (Oracle ACFS) is a multi-platform, scalable file system, and storage management technology that extends Oracle Automatic Storage Management (Oracle ASM) functionality to support customer files maintained outside of Oracle Database. Oracle ACFS supports many database and application files, including executables, database trace files, database alert logs, application reports, BFILEs, and configuration files. Other supported files are video, audio, text, images, engineering drawings, and other general-purpose application file data.
An Oracle ACFS file system is a layer on Oracle ASM and is configured with Oracle ASM storage, as shown in Figure 5-1. Oracle ACFS leverages Oracle ASM functionality that enables:
· Oracle ACFS dynamic file system resizing
· Maximized performance through direct access to Oracle ASM disk group storage
· Balanced distribution of Oracle ACFS across Oracle ASM disk group storage for increased I/O parallelism
· Data reliability through Oracle ASM mirroring protection mechanisms
[root@rac1 u01]# sh crs_stat.sh
Name Target State Host
------------------------------ ------------------- -------
ora.DATA.dg ONLINE ONLINE rac1
ora.FRA.dg ONLINE ONLINE rac1
ora.LISTENER.lsnr ONLINE ONLINE rac1
ora.LISTENER_SCAN1.lsnr ONLINE ONLINE rac2
ora.OCRVOTING.dg ONLINE ONLINE rac1
ora.asm ONLINE ONLINE rac1
ora.cvu ONLINE ONLINE rac2
ora.gsd OFFLINE OFFLINE
ora.net1.network ONLINE ONLINE rac1
ora.oc4j ONLINE ONLINE rac2
ora.ons ONLINE ONLINE rac1
ora.rac1.ASM1.asm ONLINE ONLINE rac1
ora.rac1.LISTENER_RAC1.lsnr ONLINE ONLINE rac1
ora.rac1.gsd OFFLINE OFFLINE
ora.rac1.ons ONLINE ONLINE rac1
ora.rac1.vip ONLINE ONLINE rac1
ora.rac2.ASM2.asm ONLINE ONLINE rac2
ora.rac2.LISTENER_RAC2.lsnr ONLINE ONLINE rac2
ora.rac2.gsd OFFLINE OFFLINE
ora.rac2.ons ONLINE ONLINE rac2
ora.rac2.vip ONLINE ONLINE rac2
ora.scan1.vip ONLINE ONLINE rac2
ora.sdd.db ONLINE ONLINE rac2
2.2.3 Disable automatic startup of HAS (Restart) after a server reboot
[root@rac1 bin]# ./crsctl disable has
CRS-4621: Oracle High Availability Services autostart is disabled.
[root@rac1 bin]#
2.2.4 View the HAS (Restart) autostart configuration
[root@rac1 bin]# ./crsctl config has
CRS-4621: Oracle High Availability Services autostart is disabled.
2.2.5 Enable automatic startup of HAS (Restart) after a server reboot
[root@rac1 bin]# ./crsctl enable has
CRS-4622: Oracle High Availability Services autostart is enabled.
-- Check the HAS configuration again to verify the effect of the command:
[root@rac1 bin]# ./crsctl config has
CRS-4622: Oracle High Availability Services autostart is enabled.
[root@rac1 bin]#
2.2.6 Check the current state of Restart
[root@rac1 bin]# ./crsctl check has
CRS-4638: Oracle High Availability Services is online
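For scripting, the message codes are easier to test than the message text. The following sketch maps only the three codes that appear in the outputs above (CRS-4621, CRS-4622, CRS-4638); the table is deliberately incomplete, and `HAS_MESSAGES` and `interpret` are hypothetical names.

```python
# Codes taken from the crsctl outputs shown above; any other code is ignored.
HAS_MESSAGES = {
    "CRS-4621": ("autostart", False),  # autostart is disabled
    "CRS-4622": ("autostart", True),   # autostart is enabled
    "CRS-4638": ("online", True),      # HAS is online
}


def interpret(output):
    """Translate `crsctl config has` / `crsctl check has` text into facts."""
    facts = {}
    for line in output.splitlines():
        code = line.split(":", 1)[0].strip()
        if code in HAS_MESSAGES:
            key, value = HAS_MESSAGES[code]
            facts[key] = value
    return facts


if __name__ == "__main__":
    text = ("CRS-4622: Oracle High Availability Services autostart is enabled.\n"
            "CRS-4638: Oracle High Availability Services is online\n")
    print(interpret(text))
```

An unknown or missing code leaves the corresponding key out of the result, so a caller can tell "disabled" apart from "no answer".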
2.2.7 View the status of resources managed by OHASD in Oracle Restart
[root@rac1 bin]# ./crsctl stat res -t
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA.dg
ONLINE ONLINE rac1
ONLINE ONLINE rac2
ora.FRA.dg
ONLINE ONLINE rac1
ONLINE ONLINE rac2
ora.LISTENER.lsnr
ONLINE ONLINE rac1
ONLINE ONLINE rac2
ora.OCRVOTING.dg
ONLINE ONLINE rac1
ONLINE ONLINE rac2
ora.asm
ONLINE ONLINE rac1 Started
ONLINE ONLINE rac2 Started
ora.gsd
OFFLINE OFFLINE rac1
OFFLINE OFFLINE rac2
ora.net1.network
ONLINE ONLINE rac1
ONLINE ONLINE rac2
ora.ons
ONLINE ONLINE rac1
ONLINE ONLINE rac2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
1 ONLINE ONLINE rac2
ora.cvu
1 ONLINE ONLINE rac2
ora.oc4j
1 ONLINE ONLINE rac2
ora.rac1.vip
1 ONLINE ONLINE rac1
ora.rac2.vip
1 ONLINE ONLINE rac2
ora.scan1.vip
1 ONLINE ONLINE rac2
ora.sdd.db
1 ONLINE ONLINE rac1 Open
2 ONLINE ONLINE rac2 Open
[root@rac1 bin]#
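2.3 Managing Restart (OHASD) with SRVCTL
Oracle Restart can be managed manually with SRVCTL, which adds components to or removes them from the Oracle Restart configuration. Once a component has been added to Oracle Restart manually and enabled with SRVCTL, Oracle Restart begins managing it and restarts it when necessary.
The official documentation: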
SRVCTL Command Reference for Oracle Restart
http://docs.oracle.com/cd/E11882_01/server.112/e25494/restart005.htm
Configuring Oracle Restart
http://docs.oracle.com/cd/E11882_01/server.112/e10595/restart002.htm