After the RAC installation completed, the database was shut down. Restarting it produced the following errors:
oracle@rac1:/tmp>sqlplus "/as sysdba"
SQL*Plus: Release 11.2.0.1.0 Production on Fri Sep 2 13:17:14 2011
Copyright (c) 1982, 2009, Oracle. All rights reserved.
Connected to an idle instance.
SQL> startup
ORA-01078: failure in processing system parameters
ORA-01565: error in identifying file '+DATA1/rac/spfilerac.ora'
ORA-17503: ksfdopn:2 Failed to open file +DATA1/rac/spfilerac.ora
ORA-12547: TNS:lost contact
The alert log contains the following entries:
TNS-12547: TNS:lost contact
ns secondary err code: 12560
nt main err code: 517
TNS-00517: Lost contact
nt secondary err code: 32
nt OS err code: 0
ERROR: Failed to connect with connect string: (DESCRIPTION=(ADDRESS=(PROTOCOL=beq)(PROGRAM=/opt/rac/11.2.0/grid/bin/oracle)(ARGV0=oracle+ASM1_asmb_rac1)(ENVS='ORACLE_HOME=/opt/rac/11.2.0/grid,ORACLE_SID=+ASM1')(ARGS='(DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))'))(enable=setuser))
Errors in file /opt/rac/oracle/diag/rdbms/rac/rac1/trace/rac1_asmb_21386.trc:
ORA-15055: unable to connect to ASM instance
ORA-12547: TNS:lost contact
Fatal NI connect error 12547, connecting to:
(DESCRIPTION=(ADDRESS=(PROTOCOL=beq)(PROGRAM=/opt/rac/11.2.0/grid/bin/oracle)(ARGV0=oracle+ASM1_asmb_rac1)(ENVS='ORACLE_HOME=/opt/rac/11.2.0/grid,ORACLE_SID=+ASM1')(ARGS='(DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))'))(enable=setuser)(CONNECT_DATA=(CID=(PROGRAM=oracle@rac1)(HOST=rac1)(USER=oracle))))
VERSION INFORMATION:
TNS for Linux: Version 11.2.0.1.0 - Production
Oracle Bequeath NT Protocol Adapter for Linux: Version 11.2.0.1.0 - Production
Time: 02-SEP-2011 13:31:15
Tracing not turned on.
Tns error struct:
ns main err code: 12547
TNS-12547: TNS:lost contact
ns secondary err code: 12560
nt main err code: 517
TNS-00517: Lost contact
nt secondary err code: 32
nt OS err code: 0
ERROR: Failed to connect with connect string: (DESCRIPTION=(ADDRESS=(PROTOCOL=beq)(PROGRAM=/opt/rac/11.2.0/grid/bin/oracle)(ARGV0=oracle+ASM1_asmb_rac1)(ENVS='ORACLE_HOME=/opt/rac/11.2.0/grid,ORACLE_SID=+ASM1')(ARGS='(DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))'))(enable=setuser))
Fri Sep 02 13:31:18 2011
Starting background process ASMB
Fri Sep 02 13:31:18 2011
ASMB started with pid=26, OS id=21450
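A log excerpt like the one above repeats the same few error codes many times. As a minimal sketch (the function name summarize_errors is hypothetical, and the log path is whatever your alert log happens to be), the distinct ORA-/TNS- codes can be counted like this:

```shell
# Hypothetical helper: list each distinct ORA-/TNS- code found in a log file,
# most frequent first. $1 is the path to the log file.
summarize_errors() {
    grep -oE '(ORA|TNS)-[0-9]+' "$1" | sort | uniq -c | sort -rn
}

# Example call (ALERT_LOG is a placeholder you set yourself):
#   summarize_errors "$ALERT_LOG"
```

In this case the list would include ORA-15055 (unable to connect to ASM instance), which points to the connection between the database instance and ASM rather than to the spfile itself.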
Checking the ASM instance at this point shows that the ASM disk groups are online:
grid@rac1:/home/grid>crsctl stat res -t
NAME               TARGET   STATE    SERVER  STATE_DETAILS
Local Resources
--------------------------------------------------------------------------------
ora.DATA1.dg       ONLINE   ONLINE   rac1    ========>DG are online
ora.DATA2.dg       ONLINE   ONLINE   rac1
ora.LISTENER.lsnr  ONLINE   ONLINE   rac1
ora.asm            ONLINE   ONLINE   rac1    Started
ora.ons            OFFLINE  OFFLINE  rac1
--------------------------------------------------------------------------------
NAME               TARGET   STATE    SERVER  STATE_DETAILS
Cluster Resources
--------------------------------------------------------------------------------
ora.cssd1          ONLINE   ONLINE   rac1
ora.diskmon1       ONLINE   ONLINE   rac1
ora.evmd1          ONLINE   ONLINE   rac1
ora.jrdwyf.db1     OFFLINE  OFFLINE          Instance Shutdown
Further checking shows that the problem lies in the permissions of the ORACLE_HOME/bin/oracle and GI_HOME/bin/oracle binaries:
grid@rac1:/opt/11.2.0/grid/bin>ls -al oracle
-rwxr-x--x 1 grid oinstall 200678464 Feb 28 14:54 oracle =============>incorrect
oracle@rac1:/opt/oracle/11.2.0/jrdwyf/bin>ls -al oracle
-rwxr-x--x 1 oracle asmadmin 228886191 Feb 28 15:41 oracle =============>incorrect
Test logging in to the ASM instance:
grid@rac1:/home/grid>sqlplus "/ as sysasm"
SQL*Plus: Release 11.2.0.2.0 Production on Fri Mar 11 13:45:35 2011
Copyright (c) 1982, 2010, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
With the Automatic Storage Management option
SQL>
The correct settings are:
$ls -l $GI_HOME/bin/oracle:
Correct permission should be:
-rwsr-s--x 1 grid oinstall 152400480 Sep 2 15:49 oracle
$ls -l $ORACLE_HOME/bin/oracle:
Correct permission should be:
-r-sr-sr-x 1 oracle asmadmin 173389085 Sep 2 15:51 oracle
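The two listings above correspond to octal modes 6751 (-rwsr-s--x) and 6555 (-r-sr-sr-x). A minimal sketch for comparing a binary's current mode against the expected one follows; check_mode is a hypothetical helper name, and stat -c '%a' assumes GNU coreutils as found on Linux:

```shell
# Hypothetical helper: compare a file's octal mode with the expected value.
# Returns 0 on match, 1 on mismatch, 2 if stat fails.
# Assumes GNU stat (Linux); '%a' prints the mode in octal, e.g. 6751.
check_mode() {
    file=$1
    expected=$2
    actual=$(stat -c '%a' "$file") || return 2
    if [ "$actual" = "$expected" ]; then
        echo "OK   $file ($actual)"
    else
        echo "BAD  $file (mode $actual, expected $expected)"
        return 1
    fi
}

# Example calls, using the expected modes quoted in this post:
#   check_mode "$GI_HOME/bin/oracle" 6751
#   check_mode "$ORACLE_HOME/bin/oracle" 6555
```

With the incorrect listing shown earlier (-rwxr-x--x), stat would report 751, so both calls would print a BAD line.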
The fix is therefore to correct the permissions of $ORACLE_HOME/bin/oracle and $GI_HOME/bin/oracle on every node:
#cd /opt/11.2.0/grid/bin
#chmod 6751 oracle
#ls -l oracle
Correct permission should be:
-rwsr-s--x 1 grid oinstall 152400480 09-02 16:12 oracle
#cd /opt/oracle/11.2.0/db/bin
#chmod 6555 oracle
#ls -l oracle
Correct permission should be:
-r-sr-sr-x 1 oracle asmadmin 173389085 09-02 16:16 oracle
After the change, the database starts up normally.
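Because the change has to be made on every node, it can help to write the commands out per node before running anything. The sketch below only prints the commands and does not execute them. The function name print_fix_commands, the use of ssh as root, and any node name other than rac1 are assumptions for illustration; the default paths are the ones used in the chmod steps above.

```shell
# Hypothetical dry-run helper: print the chmod commands for each node name
# given as an argument. Nothing is executed on any node.
# GI_HOME / ORACLE_HOME are used if set; otherwise the paths from this post.
print_fix_commands() {
    gi_home=${GI_HOME:-/opt/11.2.0/grid}
    db_home=${ORACLE_HOME:-/opt/oracle/11.2.0/db}
    for node in "$@"; do
        echo "ssh root@$node chmod 6751 $gi_home/bin/oracle"
        echo "ssh root@$node chmod 6555 $db_home/bin/oracle"
    done
}

# Example call (rac2 is an assumed second node name):
#   print_fix_commands rac1 rac2
```

Reviewing the printed lines first makes it easier to confirm that each home path is right for each node before applying the change.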
The official explanation is attached below:
The issue is caused by incorrect permissions of GI_HOME/bin/oracle and ORACLE_HOME/bin/oracle, which lead to connections failed from an RDBMS instance (RDBMS) to the ASM instance.
For this customer, the ASM instance works fine and the DG have been mounted, the permissions of GI_HOME/bin/oracle and ORACLE_HOME/bin/oracle are incorrect.
If the permissions are incorrect, this will lead to connections failed from an RDBMS instance (RDBMS) to the ASM instance.
$ORACLE_HOME/bin/oracle is the main Oracle RDBMS kernel binary, and implements the functions of all background daemons: PMON, SMON, DBWR, LGWR, CKPT, etc.
Connection functionality implies attach to SGA shared memory, which needs to be initialized at instance startup; this happens with routines inside bin/oracle and (at a very high level) user privileges of who starts up the instance need to match with user privileges of who wants to connect to the instance.
Database open implies opening files; this also happens in code inside bin/oracle. Again, privileges of who starts up the instance and attempts to open the database need to match the privileges of files where the data actually resides.
ASM is implemented as an Oracle instance - starting out of the bin/oracle in GI_HOME; so you'll also have connections from an ORACLE_HOME instance (RDBMS) and a GI_HOME instance (ASM) for I/O path translations.
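Both corrected modes in this post (6751 and 6555) start with the digit 6, which means the setuid and setgid bits are set. On Unix, these bits make a process started from the file run with the effective user and group of the file's owner, rather than those of whoever launched it. The incorrect listings (-rwxr-x--x, i.e. 751) lacked both bits. As a small sketch, with special_bits being a hypothetical helper name, the special bits of an octal mode can be decoded with shell arithmetic:

```shell
# Hypothetical helper: print which special bits are set in an octal mode
# such as 6751. Pure arithmetic; it does not touch any file.
special_bits() {
    mode=$((0$1))                        # leading 0: the shell reads it as octal
    [ $((mode & 04000)) -ne 0 ] && echo setuid
    [ $((mode & 02000)) -ne 0 ] && echo setgid
    [ $((mode & 01000)) -ne 0 ] && echo sticky
    return 0
}

# Example calls:
#   special_bits 6751   # prints setuid and setgid, one per line
#   special_bits 751    # prints nothing
```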