How to manually kill all CRS processes in Oracle 11.2 without rebooting the host

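As we all know, killing the ocssd.bin process in a RAC environment causes the host to reboot.
But sometimes the system is already in an abnormal state and CRS cannot be shut down cleanly, while the host is an old system that has not been rebooted for years and nobody dares to reboot it. What then?
The only option is to try killing the processes by hand and then repair CRS manually (note that a 10.2 RAC has only three d.bin processes).
Test environment: the operating system is OEL 6.6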
[root@lunar1 ~]# cat /etc/oracle-release
Oracle Linux Server release 6.6
[root@lunar1 ~]#
[root@lunar1 ~]# uname -a
Linux lunar1 3.8.13-44.1.1.el6uek.x86_64 #2 SMP Wed Sep 10 06:10:25 PDT 2014 x86_64 x86_64 x86_64 GNU/Linux
[root@lunar1 ~]#
The CRS version of this RAC is 11.2.0.4:
[root@lunar1 ~]# crsctl query crs activeversion
Oracle Clusterware active version on the cluster is [11.2.0.4.0]
[root@lunar1 ~]# crsctl query crs releaseversion
Oracle High Availability Services release version on the local node is [11.2.0.4.0]
[root@lunar1 ~]# crsctl query crs softwareversion
Oracle Clusterware version on node [lunar1] is [11.2.0.4.0]
[root@lunar1 ~]#
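Note that a standard 12.1 RAC (not Flex Cluster) behaves the same way as described in this article, and the approach and procedure are the same.
Check the current CRS status: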
[root@lunar1 ~]# crsctl status res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS      
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.CRSDG.dg
               ONLINE  ONLINE       lunar1                                      
               ONLINE  ONLINE       lunar2                                      
ora.DATADG1.dg
               ONLINE  ONLINE       lunar1                                      
               ONLINE  ONLINE       lunar2                                      
ora.DATADG2.dg
               ONLINE  ONLINE       lunar1                                      
               ONLINE  ONLINE       lunar2                                      
ora.LISTENER.lsnr
               ONLINE  ONLINE       lunar1                                      
               ONLINE  ONLINE       lunar2                                      
ora.asm
               ONLINE  ONLINE       lunar1                   Started            
               ONLINE  ONLINE       lunar2                   Started            
ora.gsd
               OFFLINE OFFLINE      lunar1                                      
               OFFLINE OFFLINE      lunar2                                      
ora.net1.network
               ONLINE  ONLINE       lunar1                                      
               ONLINE  ONLINE       lunar2                                      
ora.ons
               ONLINE  ONLINE       lunar1                                      
               ONLINE  ONLINE       lunar2                                      
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       lunar2                                      
ora.cvu
      1        ONLINE  ONLINE       lunar2                                      
ora.lunar.db
      1        ONLINE  ONLINE       lunar1                   Open               
      2        ONLINE  OFFLINE                               STARTING           
ora.lunar1.vip
      1        ONLINE  ONLINE       lunar1                                      
ora.lunar2.vip
      1        ONLINE  ONLINE       lunar2                                      
ora.oc4j
      1        ONLINE  ONLINE       lunar1                                      
ora.scan1.vip
      1        ONLINE  ONLINE       lunar2                                      
[root@lunar1 ~]#
List all current CRS processes:
[root@lunar1 ~]# ps -ef|grep d.bin
root      3860     1  0 19:31 ?        00:00:12 /u01/app/11.2.0.4/grid/bin/ohasd.bin reboot
grid      3972     1  0 19:31 ?        00:00:04 /u01/app/11.2.0.4/grid/bin/oraagent.bin
grid      3983     1  0 19:31 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/mdnsd.bin
grid      3994     1  0 19:31 ?        00:00:02 /u01/app/11.2.0.4/grid/bin/gpnpd.bin
root      4004     1  0 19:31 ?        00:00:15 /u01/app/11.2.0.4/grid/bin/orarootagent.bin
grid      4007     1  0 19:31 ?        00:00:12 /u01/app/11.2.0.4/grid/bin/gipcd.bin
root      4019     1  0 19:31 ?        00:00:05 /u01/app/11.2.0.4/grid/bin/osysmond.bin
root      4032     1  0 19:31 ?        00:00:02 /u01/app/11.2.0.4/grid/bin/cssdmonitor
root      4051     1  0 19:31 ?        00:00:02 /u01/app/11.2.0.4/grid/bin/cssdagent
grid      4063     1  0 19:31 ?        00:00:12 /u01/app/11.2.0.4/grid/bin/ocssd.bin
root      4157     1  0 19:31 ?        00:00:06 /u01/app/11.2.0.4/grid/bin/octssd.bin reboot
grid      4180     1  0 19:31 ?        00:00:06 /u01/app/11.2.0.4/grid/bin/evmd.bin
grid      4343  4180  0 19:32 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/evmlogger.bin -o /u01/app/11.2.0.4/grid/evm/log/evmlogger.info -l /u01/app/11.2.0.4/grid/evm/log/evmlogger.log
root      5385     1  1 19:39 ?        00:00:17 /u01/app/11.2.0.4/grid/bin/crsd.bin reboot
grid      5456     1  0 19:39 ?        00:00:04 /u01/app/11.2.0.4/grid/bin/oraagent.bin
root      5473     1  0 19:39 ?        00:00:07 /u01/app/11.2.0.4/grid/bin/orarootagent.bin
grid      5475     1  0 19:39 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/scriptagent.bin
grid      6535     1  0 19:50 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/tnslsnr LISTENER -inherit
oracle    7132     1  0 20:04 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/oraagent.bin
root      7350  7273  0 20:04 pts/2    00:00:00 grep d.bin
[root@lunar1 ~]#
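A side note on the command above: `grep d.bin` also matches the grep command itself, and the dot matches any character. As a small sketch (not part of the original test), the listing can be narrowed to binaries that live under the Grid home. The function name `list_crs_procs` and the `GRID_BIN` default are assumptions taken from the paths shown in this article.

```shell
# Sketch only: GRID_BIN defaults to the Grid home used in this article (an assumption).
GRID_BIN="${GRID_BIN:-/u01/app/11.2.0.4/grid/bin}"

# Reads `ps -ef` style lines on stdin and prints "PID NAME" for every
# process whose command path starts with $GRID_BIN/.
list_crs_procs() {
    awk -v bin="$GRID_BIN/" '
        index($8, bin) == 1 {                  # field 8 of ps -ef is the command
            print $2, substr($8, length(bin) + 1)
        }'
}
```

It would be used as `ps -ef | list_crs_procs`. The grep line and the `/bin/sh /etc/init.d/init.ohasd run` wrapper are skipped because their command field does not start with the Grid home.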
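For how all these processes relate to each other, see the article on the 11.2 RAC startup process.
Now let's start simulating the kills. First, kill /u01/app/11.2.0.4/grid/bin/ohasd.bin (it is restarted automatically; see the 11.2 RAC startup process):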
[root@lunar1 ~]# kill -9 3860
[root@lunar1 ~]# ps -ef|grep d.bin
grid      3983     1  0 19:31 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/mdnsd.bin
grid      3994     1  0 19:31 ?        00:00:03 /u01/app/11.2.0.4/grid/bin/gpnpd.bin
grid      4007     1  0 19:31 ?        00:00:13 /u01/app/11.2.0.4/grid/bin/gipcd.bin
root      4019     1  0 19:31 ?        00:00:05 /u01/app/11.2.0.4/grid/bin/osysmond.bin
root      4032     1  0 19:31 ?        00:00:02 /u01/app/11.2.0.4/grid/bin/cssdmonitor
grid      4063     1  0 19:31 ?        00:00:13 /u01/app/11.2.0.4/grid/bin/ocssd.bin
root      4157     1  0 19:31 ?        00:00:06 /u01/app/11.2.0.4/grid/bin/octssd.bin reboot
grid      4180     1  0 19:31 ?        00:00:07 /u01/app/11.2.0.4/grid/bin/evmd.bin
grid      4343  4180  0 19:32 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/evmlogger.bin -o /u01/app/11.2.0.4/grid/evm/log/evmlogger.info -l /u01/app/11.2.0.4/grid/evm/log/evmlogger.log
root      5385     1  1 19:39 ?        00:00:19 /u01/app/11.2.0.4/grid/bin/crsd.bin reboot
grid      5456     1  0 19:39 ?        00:00:04 /u01/app/11.2.0.4/grid/bin/oraagent.bin
root      5473     1  0 19:39 ?        00:00:07 /u01/app/11.2.0.4/grid/bin/orarootagent.bin
grid      5475     1  0 19:39 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/scriptagent.bin
grid      6535     1  0 19:50 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/tnslsnr LISTENER -inherit
oracle    7132     1  0 20:04 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/oraagent.bin
grid      7490     1  0 20:06 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/tnslsnr LISTENER_SCAN1 -inherit
root      7534  2487 14 20:07 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/ohasd.bin restart
grid      7571     1  6 20:07 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/oraagent.bin
root      7575     1  8 20:07 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/orarootagent.bin
root      7578     1  2 20:07 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/cssdagent
root      7588     1  3 20:07 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/cssdmonitor
root      7676  7273  0 20:07 pts/2    00:00:00 grep d.bin
[root@lunar1 ~]#
Next, we kill cssdmonitor:
[root@lunar1 ~]# kill -9 4032
-bash: kill: (4032) - No such process
[root@lunar1 ~]#
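That process no longer exists, which means cssdmonitor has already been restarted: the old PID 4032 is gone and a new cssdmonitor is running as PID 7588
(see the 11.2 RAC startup process):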
[root@lunar1 ~]# ps -ef|grep d.bin
grid      3983     1  0 19:31 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/mdnsd.bin
grid      3994     1  0 19:31 ?        00:00:03 /u01/app/11.2.0.4/grid/bin/gpnpd.bin
grid      4007     1  0 19:31 ?        00:00:13 /u01/app/11.2.0.4/grid/bin/gipcd.bin
root      4019     1  0 19:31 ?        00:00:05 /u01/app/11.2.0.4/grid/bin/osysmond.bin
grid      4063     1  0 19:31 ?        00:00:13 /u01/app/11.2.0.4/grid/bin/ocssd.bin
root      4157     1  0 19:31 ?        00:00:06 /u01/app/11.2.0.4/grid/bin/octssd.bin reboot
grid      4180     1  0 19:31 ?        00:00:07 /u01/app/11.2.0.4/grid/bin/evmd.bin
grid      4343  4180  0 19:32 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/evmlogger.bin -o /u01/app/11.2.0.4/grid/evm/log/evmlogger.info -l /u01/app/11.2.0.4/grid/evm/log/evmlogger.log
root      5385     1  1 19:39 ?        00:00:19 /u01/app/11.2.0.4/grid/bin/crsd.bin reboot
grid      5456     1  0 19:39 ?        00:00:05 /u01/app/11.2.0.4/grid/bin/oraagent.bin
root      5473     1  0 19:39 ?        00:00:07 /u01/app/11.2.0.4/grid/bin/orarootagent.bin
grid      5475     1  0 19:39 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/scriptagent.bin
grid      6535     1  0 19:50 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/tnslsnr LISTENER -inherit
oracle    7132     1  0 20:04 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/oraagent.bin
grid      7490     1  0 20:06 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/tnslsnr LISTENER_SCAN1 -inherit
root      7534  2487  3 20:07 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/ohasd.bin restart
grid      7571     1  0 20:07 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/oraagent.bin
root      7575     1  1 20:07 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/orarootagent.bin
root      7578     1  0 20:07 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/cssdagent
root      7588     1  0 20:07 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/cssdmonitor
root      7740  7273  0 20:07 pts/2    00:00:00 grep d.bin
[root@lunar1 ~]#
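In the output above, the processes with a start time between 20:04 and 20:07 are the ones that were restarted automatically in the background after /u01/app/11.2.0.4/grid/bin/ohasd.bin itself was restarted.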
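Instead of reading start times by eye, two process snapshots can be compared by PID. The following is only a sketch: `new_pids` is a hypothetical helper, and both snapshot files are assumed to hold one "PID NAME" pair per line.

```shell
# new_pids BEFORE AFTER
# Prints every "PID NAME" line of AFTER whose PID does not occur in BEFORE,
# i.e. processes that were started between the two snapshots.
new_pids() {
    awk 'NR == FNR { seen[$1] = 1; next }   # first file: remember old PIDs
         !($1 in seen)                      # second file: print unseen PIDs
        ' "$1" "$2"
}
```

A snapshot could be taken with `ps -eo pid=,comm= > /tmp/before` ahead of a kill and again into `/tmp/after` afterwards, followed by `new_pids /tmp/before /tmp/after`. The file names are placeholders.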
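Now we kill mdnsd, gpnpd, gipcd and osysmond.
Of these four processes, the first three are the earliest ones started during CRS startup, apart from ohasd.
If any of them is killed, ohasd restarts it: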
[root@lunar1 ~]# kill -9 3983 3994 4007 4019
[root@lunar1 ~]# ps -ef|grep d.bin
grid      4063     1  0 19:31 ?        00:00:13 /u01/app/11.2.0.4/grid/bin/ocssd.bin
grid      6535     1  0 19:50 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/tnslsnr LISTENER -inherit
grid      7490     1  0 20:06 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/tnslsnr LISTENER_SCAN1 -inherit
root      7534  2487  2 20:07 ?        00:00:01 /u01/app/11.2.0.4/grid/bin/ohasd.bin restart
grid      7571     1  0 20:07 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/oraagent.bin
root      7575     1  1 20:07 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/orarootagent.bin
root      7578     1  0 20:07 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/cssdagent
root      7588     1  0 20:07 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/cssdmonitor
grid      7756     1  1 20:07 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/gpnpd.bin
grid      7758     1  1 20:07 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/mdnsd.bin
root      7776  7273  0 20:07 pts/2    00:00:00 grep d.bin
[root@lunar1 ~]#
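Here gpnpd.bin and mdnsd.bin are already back with new PIDs, but gipcd.bin and osysmond.bin are still missing. What happened?
Don't worry, it just isn't time yet: ohasd only restarts a process after its check has noticed the failure.
Next, we kill the listeners: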
[root@lunar1 ~]# kill -9 6535 7490
[root@lunar1 ~]# ps -ef|grep d.bin
grid      4063     1  0 19:31 ?        00:00:13 /u01/app/11.2.0.4/grid/bin/ocssd.bin
root      7534  2487  2 20:07 ?        00:00:01 /u01/app/11.2.0.4/grid/bin/ohasd.bin restart
grid      7571     1  0 20:07 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/oraagent.bin
root      7575     1  0 20:07 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/orarootagent.bin
root      7578     1  0 20:07 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/cssdagent
root      7588     1  0 20:07 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/cssdmonitor
grid      7756     1  1 20:07 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/gpnpd.bin
grid      7758     1  0 20:07 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/mdnsd.bin
grid      7783     1  2 20:07 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/gipcd.bin
root      7785     1  2 20:07 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/osysmond.bin
root      7844     1  1 20:07 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/ologgerd -m lunar2 -r -d /u01/app/11.2.0.4/grid/crf/db/lunar1
root      7853     1  0 20:07 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/octssd.bin
grid      7873     1  1 20:07 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/evmd.bin
root      7874     1 14 20:07 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/crsd.bin reboot
grid      7944  7873  0 20:08 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/evmlogger.bin -o /u01/app/11.2.0.4/grid/evm/log/evmlogger.info -l /u01/app/11.2.0.4/grid/evm/log/evmlogger.log
grid      7979     1  9 20:08 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/oraagent.bin
grid      7982     1  3 20:08 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/scriptagent.bin
oracle    7986     1  4 20:08 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/oraagent.bin
root      8001     1  3 20:08 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/orarootagent.bin
grid      8025  7979  0 20:08 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/lsnrctl status LISTENER
grid      8028  7979  0 20:08 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/lsnrctl status LISTENER_SCAN1
root      8083  7273  0 20:08 pts/2    00:00:00 grep d.bin
[root@lunar1 ~]#
Look at that: all the processes killed earlier have been restarted. RAC 11.2 really is robust.
Now we kill the /etc/init.d/init.ohasd process, together with ohasd.bin:
[root@lunar1 ~]# ps -ef|grep ohasd
root      2487     1  0 19:20 ?        00:00:00 /bin/sh /etc/init.d/init.ohasd run
root      7534  2487  1 20:07 ?        00:00:01 /u01/app/11.2.0.4/grid/bin/ohasd.bin restart
root      8191  7273  0 20:08 pts/2    00:00:00 grep ohasd
[root@lunar1 ~]# kill -9 2487 7534
[root@lunar1 ~]# ps -ef|grep ohasd
root      8239     1  0 20:08 ?        00:00:00 /bin/sh /etc/init.d/init.ohasd run
root      8257  8239  0 20:08 ?        00:00:00 /bin/sh /etc/init.d/init.ohasd run
root      8258  8257  0 20:08 ?        00:00:00 /bin/sh /etc/init.d/init.ohasd run
root      8267  7273  0 20:08 pts/2    00:00:00 grep ohasd
[root@lunar1 ~]# ps -ef|grep ohasd
root      8239     1  0 20:08 ?        00:00:00 /bin/sh /etc/init.d/init.ohasd run
root      8299  7273  0 20:08 pts/2    00:00:00 grep ohasd
[root@lunar1 ~]#
What we see here is /etc/init.d/init.ohasd being restarted automatically by the system. This is recorded in /var/log/messages:
[root@lunar1 ~]# tail -f /var/log/messages
Jan 24 19:45:31 lunar1 kernel: e1000 0000:00:03.0 eth0: Reset adapter
Jan 24 20:03:50 lunar1 kernel: e1000: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
Jan 24 20:03:52 lunar1 kernel: e1000: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
Jan 24 20:07:01 lunar1 clsecho: /etc/init.d/init.ohasd: ohasd is restarting 1/10.
Jan 24 20:07:01 lunar1 logger: exec /u01/app/11.2.0.4/grid/perl/bin/perl -I/u01/app/11.2.0.4/grid/perl/lib /u01/app/11.2.0.4/grid/bin/crswrapexece.pl /u01/app/11.2.0.4/grid/crs/install/s_crsconfig_lunar1_env.txt /u01/app/11.2.0.4/grid/bin/ohasd.bin "restart"
Jan 24 20:08:26 lunar1 init: oracle-ohasd main process (2487) killed by KILL signal
Jan 24 20:08:26 lunar1 init: oracle-ohasd main process ended, respawning
Jan 24 20:13:58 lunar1 init: oracle-ohasd main process (8239) killed by KILL signal
Jan 24 20:13:58 lunar1 init: oracle-ohasd main process ended, respawning
Jan 24 20:14:12 lunar1 root: exec /u01/app/11.2.0.4/grid/perl/bin/perl -I/u01/app/11.2.0.4/grid/perl/lib /u01/app/11.2.0.4/grid/bin/crswrapexece.pl /u01/app/11.2.0.4/grid/crs/install/s_crsconfig_lunar1_env.txt /u01/app/11.2.0.4/grid/bin/ohasd.bin "reboot"
^C
[root@lunar1 ~]#
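The "respawning" lines in the log above come from init restarting the oracle-ohasd job. As a rough sketch for checking how often that has happened, the matching lines can be counted. The function name `count_ohasd_respawns` is hypothetical, and the message text is taken verbatim from the log shown above.

```shell
# Reads syslog lines on stdin and prints how many times init reported
# that it respawned the oracle-ohasd job.
count_ohasd_respawns() {
    grep -c 'oracle-ohasd main process ended, respawning'
}
```

It would be called as `count_ohasd_respawns < /var/log/messages`. For the excerpt above it would count the two respawn lines at 20:08:26 and 20:13:58.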
The other processes have also been restarted automatically (note that at this point the crsd process has not been restarted yet):
[root@lunar1 ~]# ps -ef|grep d.bin
grid      4063     1  0 19:31 ?        00:00:14 /u01/app/11.2.0.4/grid/bin/ocssd.bin
root      7578     1  0 20:07 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/cssdagent
root      7588     1  0 20:07 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/cssdmonitor
grid      7756     1  0 20:07 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/gpnpd.bin
grid      7758     1  0 20:07 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/mdnsd.bin
grid      7783     1  0 20:07 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/gipcd.bin
root      7785     1  1 20:07 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/osysmond.bin
root      7844     1  0 20:07 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/ologgerd -m lunar2 -r -d /u01/app/11.2.0.4/grid/crf/db/lunar1
root      7853     1  0 20:07 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/octssd.bin
grid      7873     1  0 20:07 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/evmd.bin
root      7874     1  3 20:07 ?        00:00:01 /u01/app/11.2.0.4/grid/bin/crsd.bin reboot
grid      7944  7873  0 20:08 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/evmlogger.bin -o /u01/app/11.2.0.4/grid/evm/log/evmlogger.info -l /u01/app/11.2.0.4/grid/evm/log/evmlogger.log
grid      7979     1  0 20:08 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/oraagent.bin
grid      7982     1  0 20:08 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/scriptagent.bin
oracle    7986     1  0 20:08 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/oraagent.bin
root      8001     1  0 20:08 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/orarootagent.bin
grid      8119     1  0 20:08 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/tnslsnr LISTENER_SCAN1 -inherit
grid      8120     1  0 20:08 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/tnslsnr LISTENER -inherit
root      8321  8319  1 20:08 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/crsctl.bin check has
root      8325  7273  0 20:08 pts/2    00:00:00 grep d.bin
[root@lunar1 ~]#
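Now we kill, in this order: evmlogger.bin, gpnpd.bin, mdnsd.bin, gipcd.bin, evmd.bin, oraagent.bin, scriptagent.bin, oraagent.bin, orarootagent.bin and the two listeners (octssd.bin and crsd.bin are not in the kill list, yet they no longer appear in the ps output that follows):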
[root@lunar1 ~]# kill -9 7944 7756 7758 7783 7873 7979 7982 7986 8001 8119 8120
[root@lunar1 ~]# ps -ef|grep d.bin
grid      4063     1  0 19:31 ?        00:00:14 /u01/app/11.2.0.4/grid/bin/ocssd.bin
root      7578     1  0 20:07 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/cssdagent
root      7588     1  0 20:07 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/cssdmonitor
root      7785     1  1 20:07 ?        00:00:01 /u01/app/11.2.0.4/grid/bin/osysmond.bin
root      7844     1  0 20:07 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/ologgerd -m lunar2 -r -d /u01/app/11.2.0.4/grid/crf/db/lunar1
root      8593  8591  0 20:09 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/crsctl.bin check has
root      8597  7273  0 20:09 pts/2    00:00:00 grep d.bin
[root@lunar1 ~]#
Then kill osysmond.bin, ologgerd, cssdmonitor and cssdagent:
[root@lunar1 ~]# kill -9 7785 7844 7588 7578 
[root@lunar1 ~]#
Now only ocssd.bin is left:
[root@lunar1 ~]# ps -ef|grep d.bin
grid      4063     1  0 19:31 ?        00:00:14 /u01/app/11.2.0.4/grid/bin/ocssd.bin
root      8629  7273  0 20:10 pts/2    00:00:00 grep d.bin
[root@lunar1 ~]#
Now we kill ocssd.bin, the process that is said to reboot the host as soon as it is killed:
[root@lunar1 ~]# kill -9 4063
[root@lunar1 ~]#
And the system is still fine: it did not reboot, and the resources have all been released cleanly:
[root@lunar1 ~]# ipcs -ma
 
------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status     
 
------ Semaphore Arrays --------
key        semid      owner      perms      nsems    
0x00000000 0          root       600        1        
0x00000000 65537      root       600        1        
 
------ Message Queues --------
key        msqid      owner      perms      used-bytes   messages   
 
[root@lunar1 ~]#
[root@lunar1 ~]#
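The empty "Shared Memory Segments" section above is what shows that nothing was left behind. A minimal sketch of the same check in script form follows; `count_shm_segments` is a hypothetical name, and it assumes the Linux `ipcs -m` layout shown above, where every segment line starts with a hexadecimal key.

```shell
# Reads `ipcs -m` output on stdin and prints the number of shared memory
# segments, counting the lines that begin with a 0x... key.
count_shm_segments() {
    awk '/^0x/ { n++ } END { print n + 0 }'
}
```

Used as `ipcs -m | count_shm_segments`, this would print 0 for the state shown above. Only the shared memory section should be piped in, because semaphore lines also start with a key.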
Recovery is simple: just start CRS again:
[root@lunar1 ~]# ps -ef | grep -v grep|grep -E 'init|d.bin|ocls|evmlogger|UID'
UID        PID  PPID  C STIME TTY          TIME CMD
root         1     0  0 19:20 ?        00:00:01 /sbin/init
root      2486     1  0 19:20 ?        00:00:00 /bin/sh /etc/init.d/init.tfa run
root      8924     1  0 20:13 ?        00:00:00 /bin/sh /etc/init.d/init.ohasd run
[root@lunar1 ~]# crsctl start crs
CRS-4123: Oracle High Availability Services has been started.
[root@lunar1 ~]# ps -ef|grep ohasd
root      8924     1  0 20:13 ?        00:00:00 /bin/sh /etc/init.d/init.ohasd run
root      8968     1  4 20:14 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/ohasd.bin reboot
root      9187  7273  0 20:14 pts/2    00:00:00 grep ohasd
[root@lunar1 ~]#
[root@lunar1 ~]# ps -ef|grep d.bin
root      8968     1  0 20:14 ?        00:00:08 /u01/app/11.2.0.4/grid/bin/ohasd.bin reboot
grid      9090     1  0 20:14 ?        00:00:02 /u01/app/11.2.0.4/grid/bin/oraagent.bin
grid      9101     1  0 20:14 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/mdnsd.bin
grid      9112     1  0 20:14 ?        00:00:02 /u01/app/11.2.0.4/grid/bin/gpnpd.bin
root      9122     1  0 20:14 ?        00:00:09 /u01/app/11.2.0.4/grid/bin/orarootagent.bin
grid      9126     1  0 20:14 ?        00:00:08 /u01/app/11.2.0.4/grid/bin/gipcd.bin
root      9139     1  0 20:14 ?        00:00:12 /u01/app/11.2.0.4/grid/bin/osysmond.bin
root      9150     1  0 20:14 ?        00:00:01 /u01/app/11.2.0.4/grid/bin/cssdmonitor
root      9169     1  0 20:14 ?        00:00:01 /u01/app/11.2.0.4/grid/bin/cssdagent
grid      9180     1  0 20:14 ?        00:00:04 /u01/app/11.2.0.4/grid/bin/ocssd.bin
root      9212     1  1 20:14 ?        00:00:28 /u01/app/11.2.0.4/grid/bin/ologgerd -M -d /u01/app/11.2.0.4/grid/crf/db/lunar1
root      9340     1  0 20:18 ?        00:00:02 /u01/app/11.2.0.4/grid/bin/octssd.bin reboot
grid      9363     1  0 20:18 ?        00:00:03 /u01/app/11.2.0.4/grid/bin/evmd.bin
root      9455     1  0 20:18 ?        00:00:09 /u01/app/11.2.0.4/grid/bin/crsd.bin reboot
grid      9532  9363  0 20:18 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/evmlogger.bin -o /u01/app/11.2.0.4/grid/evm/log/evmlogger.info -l /u01/app/11.2.0.4/grid/evm/log/evmlogger.log
grid      9569     1  0 20:18 ?        00:00:02 /u01/app/11.2.0.4/grid/bin/oraagent.bin
grid      9572     1  0 20:18 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/scriptagent.bin
root      9591     1  0 20:18 ?        00:00:05 /u01/app/11.2.0.4/grid/bin/orarootagent.bin
grid      9682     1  0 20:18 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/tnslsnr LISTENER -inherit
grid      9684     1  0 20:18 ?        00:00:00 /u01/app/11.2.0.4/grid/bin/tnslsnr LISTENER_SCAN1 -inherit
oracle    9774     1  0 20:19 ?        00:00:03 /u01/app/11.2.0.4/grid/bin/oraagent.bin
root     10642  7273  0 20:38 pts/2    00:00:00 grep d.bin
[root@lunar1 ~]#
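In the listing above, ohasd.bin starts at 20:14 but crsd.bin only appears at 20:18, so `crsctl start crs` returning does not mean the whole stack is up. One way to wait for it is a small polling loop. This is a sketch under assumptions: `wait_for` is a hypothetical helper, and the retry count and delay are arbitrary.

```shell
# wait_for TRIES DELAY COMMAND [ARGS...]
# Runs COMMAND up to TRIES times, sleeping DELAY seconds after each failure.
# Returns 0 as soon as COMMAND succeeds, 1 if it never does.
wait_for() {
    _tries=$1
    _delay=$2
    shift 2
    while [ "$_tries" -gt 0 ]; do
        if "$@"; then
            return 0
        fi
        _tries=$((_tries - 1))
        sleep "$_delay"
    done
    return 1
}
```

For example, `wait_for 60 10 sh -c 'crsctl check crs | grep -q CRS-4537'` would poll for up to about ten minutes until `crsctl check crs` reports CRS-4537 (Cluster Ready Services is online).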
[root@lunar1 ~]# crsctl status res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS      
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.CRSDG.dg
               ONLINE  ONLINE       lunar1                                      
ora.DATADG1.dg
               ONLINE  ONLINE       lunar1                                      
ora.DATADG2.dg
               ONLINE  ONLINE       lunar1                                      
ora.LISTENER.lsnr
               ONLINE  ONLINE       lunar1                                      
ora.asm
               ONLINE  ONLINE       lunar1                   Started            
ora.gsd
               OFFLINE OFFLINE      lunar1                                      
ora.net1.network
               ONLINE  ONLINE       lunar1                                      
ora.ons
               ONLINE  ONLINE       lunar1                                      
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       lunar1                                      
ora.cvu
      1        ONLINE  ONLINE       lunar1                                      
ora.lunar.db
      1        ONLINE  ONLINE       lunar1                   Open               
      2        ONLINE  OFFLINE                                                  
ora.lunar1.vip
      1        ONLINE  ONLINE       lunar1                                      
ora.lunar2.vip
      1        ONLINE  INTERMEDIATE lunar1                   FAILED OVER        
ora.oc4j
      1        ONLINE  ONLINE       lunar1                                      
ora.scan1.vip
      1        ONLINE  ONLINE       lunar1                                      
[root@lunar1 ~]#
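Only node 1 is shown here, because I had shut down node 2.
The test shows that as long as you kill the cssdmonitor and cssdagent processes first (more precisely, cssagent; this relationship can also be seen in the classic CRS startup diagram) and only then kill ocssd.bin, the system does not reboot.
In addition, a standard 12.1 RAC (not Flex Cluster) behaves the same way as described in this article, and the approach and procedure are the same.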
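The ordering rule from this test can be written down as a small sketch. It only sorts a list; it does not kill anything, and it is not a recovery tool. `kill_order` is a hypothetical name, and the input is assumed to be "PID NAME" lines for the clusterware processes that remain after init.ohasd and ohasd.bin have been dealt with as shown earlier.

```shell
# Reads "PID NAME" lines on stdin and prints them reordered:
#   1. every other process, in input order
#   2. cssdmonitor and cssdagent
#   3. ocssd.bin, always last
kill_order() {
    awk '{
        rank = 1
        if ($2 == "cssdmonitor" || $2 == "cssdagent") rank = 2
        if ($2 == "ocssd.bin") rank = 3
        print rank, NR, $1, $2            # NR keeps the input order within a rank
    }' | sort -k1,1n -k2,2n | awk '{ print $3, $4 }'
}
```

A dry run could print the commands instead of executing them, for example `printf '%s\n' '4063 ocssd.bin' '7578 cssdagent' '7785 osysmond.bin' | kill_order | while read -r pid name; do echo "kill -9 $pid  # $name"; done`. Whether the host really survives still depends on the environment, so this should only be tried on a test system first.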

Time: 2024-08-03 10:18:48
