在安装完RAC11.2.0.4,并打上PSU后,在通过DBCA创建数据库实例是出现诡异现象:
弹出来的框为单实例创建,二并非RAC实例。反复试了几次都是同样的问题,检查安装无问题。
在metalink上查看了下,问题原因很多,同事通过sqlplus连接空实例测试出现如下报错:
[oracle@zderp2vprd01 cfgtoollogs]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.4.0 Production on Thu Nov 17 09:59:10 2016
Copyright (c) 1982, 2013, Oracle. All rights reserved.
ERROR:
ORA-12547: TNS:lost contact
Enter user-name: ^C
上述报错通常出现在打完PSU后,看到这个错误,怀疑是一个经常遇到的老问题了。目录权限问题,检查如下:
[oracle@zderp2vprd01 11.2.0]$ ls -lrt
total 8
drwxr-xr-x 73 root oinstall 4096 Nov 16 21:29 grid_1
drwxrwxr-x 78 grid oinstall 4096 Nov 17 09:40 db_1
[oracle@zderp2vprd01 11.2.0]$ chown oracle:oinstall db_1
[root@zderp2vprd01 11.2.0]# ls -lrt
total 8
drwxr-xr-x 73 root oinstall 4096 Nov 16 21:29 grid_1
drwxrwxr-x 78 grid oinstall 4096 Nov 17 09:40 db_1
[root@zderp2vprd01 11.2.0]# chown oracle:oinstall db_1
修改后终于可以正常创建RAC实例了。
此处将metalink上的信息也贴出来,供大家参考学习。
Oracle Database - Enterprise Edition - Version 10.2.0.1 to 11.2.0.4 [Release 10.2 to 11.2]
Information in this document applies to any platform.
SYMPTOMS
After CRS or Grid Infrastructure installation and RDBMS installation, using dbca to create a RAC database, it gives a welcome screen without the 'Oracle Real Applications Clusters' and 'Oracle single instance' database options. (same for netca).
While it supposes to be:
This can happen for the following cases.
Case I
With 11.2 Grid Infrastructure, using dbca to create a RAC database from a 10.x or 11.1.x ORACLE_HOME.
$ORACLE_HOME/cfgtoologs/dbca/trace.log shows:
[main] [10:36:4:10] [OCR.<init>:259] Primary Group of Current user: oracle
[main] [10:36:4:11] [HASContext.getInstance:200] Module init : 24
[main] [10:36:4:11] [HASContext.getInstance:223] Local Module init : 24
[main] [10:36:4:13] [ClusterLock.<init>:56] ClusterLock Instance Allocation Failed: oracle.ops.mgmt.has.HASContextException: OCR Error(Native: prsr_initCLSS:[21])
[main] [10:36:4:20] [Cluster.isCluster:206] oracle.ops.mgmt.cluster.Cluster.isCluster(Cluster.java:260)
oracle.ops.mgmt.cluster.Cluster.isCluster(Cluster.java:197)
oracle.sysman.assistants.util.ClusterUtils.<init>(ClusterUtils.java:185)
oracle.sysman.assistants.util.ClusterUtils.getInstance(ClusterUtils.java:236)
oracle.sysman.assistants.dbca.backend.Host.checkOPS(Host.java:2070)
oracle.sysman.assistants.dbca.backend.Host.startOperation(Host.java:2303)
oracle.sysman.assistants.dbca.Dbca.execute(Dbca.java:115)
oracle.sysman.assistants.dbca.Dbca.main(Dbca.java:180)
[main] [10:36:4:20] [Host.checkOPS:2073] cluster existence:false
[main] [10:36:4:20] [Host.checkOPS:2111] Cluster installed=false
[Finalizer] [10:36:4:25] [ClusterLock.finalize:88] ClusterLock: finalized called for oracle.ops.mgmt.has.ClusterLock@5c7734
Case II
RDBMS is at the same release as CRS or Grid Infrastructure, RDBMS is installed with Real Application Cluster option
Case III
RDBMS is at the same release as CRS or Grid Infrastructure, RDBMS is installed on each individual node without selecting Real Application Cluster option.
Case IV, V, VI & VII
Grid Infrastructure and RDBMS are on the same release, they are installed as different user: grid and oracle. Grid Infrastructure is running on all nodes, RDBMS is installed with RAC option.
CAUSE
Case I
This is caused by the cluster nodes not being pinned in 11.2 Grid Infrastructure.
See Oracle Grid Infrastructure Installation Guide 11g Release 2 (11.2) for Linux Chapter 5.3.3 Pinning Cluster Nodes for Oracle Database Release 10.x or 11.x for more details.
Case II
This is caused by the problem inventory, eg: oraInventory/ContentsXML/inventory.xml does not list RAC nodes for the corresponding RDBMS ORACLE_HOME. For example:
/HOME>
<HOME NAME="OraDb11g_home" LOC="/opt/oracle/product/11.2.0/racdb" TYPE="O" IDX="3">
</HOME>
While it should be similar to:
/HOME>
<HOME NAME="OraDb11g_home" LOC="/opt/oracle/product/11.2.0/racdb" TYPE="O" IDX="3">
<NODE_LIST>
<NODE NAME="racnode1"/>
<NODE NAME="racnode2"/>
</NODE_LIST>
</HOME>
Case III
If the RDBMS software is not installed with RAC option, then dbca will not show RAC option either. To confirm if RDBMS software is installed with RAC option, please refer Note 284785.1 How to Check Whether Oracle Binary/Instance is RAC Enabled and Relink Oracle Binary in RAC.
Case IV
The problem is caused by RDBMS ORACLE_HOME wrongly owned by grid user. It should be owned by oracle user.
dbca trace shows:
[main] [ 2012-08-09 03:46:46.745 GST ] [OracleHome.isDatabaseOptionOn:1133] Checking Database Option: Real Application Clusters
[main] [ 2012-08-09 03:46:46.745 GST ] [OracleHome.isDatabaseOptionOn:1190] Database Option Real Application Clusters is false
Despite RDBMS ORACLE_HOME has RAC option installed correctly.
Further strace against dbca reveals there are ORA-600 reported before dbca display the option page:
21999 02:45:37.053231 open("/u01/app/oracle/product/11.2.0/dbhome_1/rdbms/log/dbua4535717_ora_21999.trc", O_WRONLY|O_CREAT|O_TRUNC, 0660) = 3
...
21999 02:45:37.058573 write(3, "ORA-00600: internal error code, "..., 139) = 139
21999 02:45:37.214275 --- SIGSEGV (Segmentation fault) @ 0 (0) ---
checking /u01/app/oracle/product/11.2.0/dbhome_1/rdbms/log/dbua4535717_ora_21999.trc, shows:
ORA-00600: internal error code, arguments: [spstp: ORACLE_HOME uid does not match euid], [1100], [1101], [], [], [], [], [], [], [], [], []
While:
# id grid
uid=1100(grid) gid=1000(oinstall) groups=1000(oinstall),1200(asmadmin),1201(asmdba),1202(asmoper),1300(dba)
# id oracle
uid=1101(oracle) gid=1000(oinstall) groups=1000(oinstall),1201(asmdba),1300(dba),1301(oper)
$ ls -l /u01/app/oracle/product/11.2.0
drwxrwxr-x 75 grid oinstall 4096 Aug 8 14:21 dbhome_1 << this is wrongly owned by grid user
Case V
The problem is caused by an prior abnormal termination of dbca which left a test instance DBUA0 running on 1 of the cluster nodes, this prevents a new test instance to be created before the RAC option is displayed. dbca trace shows:
[main] [ 2011-03-11 21:21:08.473 GMT ] [OracleHome.initOptions:1226] Initializing Database Options with for dummy sid=DBUA0 using initfile=/data/oracle/product/11.2.0.2/dbs/initDBUA0.ora using pwdfile=/data/oracle/product/11.2.0.2/dbs/orapwDBUA0
[main] [ 2011-03-11 21:21:08.899 GMT ] [OracleHome.initOptions:1240] executing: startup nomount pfile='/data/oracle/product/11.2.0.2/dbs/initDBUA0.ora'
[main] [ 2011-03-11 21:21:32.871 GMT ] [SQLEngine.done:2167] Done called
[main] [ 2011-03-11 21:21:32.872 GMT ] [OracleHome.initOptions:1247] ORA-304: requested INSTANCE_NUMBER is busy <<<<
oracle.sysman.assistants.util.sqlEngine.SQLFatalErrorException: ORA-304: requested INSTANCE_NUMBER is busy
at
oracle.sysman.assistants.util.sqlEngine.SQLEngine.executeImpl(SQLEngine.java:1
...
[main] [ 2011-03-11 21:21:33.499 GMT ] [OracleHome.isDatabaseOptionOn:1129] Checking Database Option: Real Application Clusters
[main] [ 2011-03-11 21:21:33.500 GMT ] [OracleHome.isDatabaseOptionOn:1186] Database Option Real Application Clusters is false <<<<
Case VI
The problem is caused by $ORACLE_HOME/rdbms/audit directory missing or not writable.
$ORACLE_BASE/cfgtoollogs/dbca/trace.log_<ORACLE_HOMENAME>_<date> shows:
[main] [ 2014-06-27 00:15:15.835 EST ] [OracleHome.initOptions:1236] Initializing Database Options with for dummy sid=DBUA1500252 using initfile=/u01/app/oradb1/product/11.2.0/dbhome_1/dbs/initDBUA1500252.ora using pwdfile=/u01/app/oradb1/product/11.2.0/dbhome_1/dbs/orapwDBUA1500252
...
[main] [ 2014-06-27 00:15:38.793 EST ] [OracleHome.initOptions:1285] ORA-09925: Unable to create audit trail file
Case VII
The problem is caused by insufficient semaphores configed which leads to test instance startup failure.
dbca trace.log shows:
[main] [ 2015-08-06 09:28:58.544 EDT ] [OracleHome.initOptions:1253] executing: startup nomount pfile='/oracle/product/11.2.0.4.5/db11g/dbs/initDBUA2857847.ora'
[main] [ 2015-08-06 09:29:00.158 EDT ] [SQLEngine.done:2189] Done called
[main] [ 2015-08-06 09:29:00.159 EDT ] [OracleHome.initOptions:1262] ORA-27154: post/wait create failed
ORA-27300: OS system dependent operation:semget failed with status: 28
ORA-27301: OS failure message: No space left on device
ORA-27302: failure occurred at: sskgpbitsper
Other Possible causes
1. RDBMS software corrupted, oracle binary shows different size for different nodes
2. Grid_home is not relinked properly
SOLUTION
Case I
1. To fix the issue, pin all nodes within the cluster using:
as root user:
crsctl pin css -n<node1> <node2>
2. To list all pinned nodes:
olsnodes -t -n
After this, dbca shows Oracle Real Application Clusters database option correctly.
For 10.x and 11.1.x dbca to work with 11.2 listener, patch 8288940 needs to be applied. Please refer to Note 948456.1 for other potential issues in such mixed environments.
Case II
Fix the inventory.xml via detaching the problem ORACLE_HOME, then re-attaching with correct information.
1. To detach ORACLE_HOME, on each cluster node, issue:
$ORACLE_HOME/oui/bin/runInstaller -detachHome -silent -local ORACLE_HOME=<$ORACLE_HOME path>
eg:
$ORACLE_HOME/oui/bin/runInstaller -detachHome -silent -local ORACLE_HOME=/opt/oracle/product/11.2.0/racdb
2. To attach ORACLE_HOME, on each cluster node, issue:
$ORACLE_HOME/oui/bin/runInstaller -silent -local -ignoreSysPrereqs -attachHome ORACLE_HOME=<$ORACLE_HOME path> ORACLE_HOME_NAME=<$ORACLE_HOME_NAME> CLUSTER_NODES=<local-node>,<remote-node> LOCAL_NODE=<local-node>
For example, on node 1, issue:
$ORACLE_HOME/oui/bin/runInstaller -silent -attachHome -local ORACLE_HOME="/opt/oracle/product/11.2.0/racdb" ORACLE_HOME_NAME="OraDb11g_home" "CLUSTER_NODES={racnode1,racnode2}" LOCAL_NODE='racnode1'
on node 2, issue:
$ORACLE_HOME/oui/bin/runInstaller -silent -attachHome -local ORACLE_HOME="/opt/oracle/product/11.2.0/racdb" ORACLE_HOME_NAME="OraDb11g_home" "CLUSTER_NODES={racnode1,racnode2}" LOCAL_NODE='racnode2'
Case III
1. Relink Oracle binary to include RAC option.
cd $ORACLE_HOME/rdbms/lib
make -f ins_rdbms.mk rac_on ioracle
2. Fix the inventory using the same procedure as case II.
Case IV
Change RDBMS ORACLE_HOME to be owned by oracle user:
As root user:
cd /u01/app/oracle/product/11.2.0
chown oracle dbhome_1
Case V
Manually shutdown the instance DBUA0.
Please ensure there is no instance with name DBUA0 running on any node of the cluster before start dbca. This is not required if there is no earlier abnormal termination of dbca happened.
Case VI
Create the default audit directory under $ORACLE_HOME/rdbms/audit if it is missing, otherwise correct the permission and ownership:
$ mkdir $ORACLE_HOME/rdbms/audit
It should have permission 755 (drwxr-xr-x) and ownership <oracle>:oinstall
Case VII
Either increase the semaphores setting or clear up left over semaphores. Please refer to Note 825001.1 for solution.
Others
1. Reinstall RDBMS oracle software
2. Relinking Grid Infrastructure binaries, refer to
Oracle? Grid Infrastructure Installation Guide
11g Release 2 (11.2) for Linux
6.3 Relinking Oracle Grid Infrastructure for a Cluster Binaries