以下命令均在Standby端执行
1.如果是使用ARCH传递redo数据,那么执行以下命令:
检查是否有gap archive
SELECT THREAD#, LOW_SEQUENCE#, HIGH_SEQUENCE# FROM V$ARCHIVE_GAP;
如果有则register
ALTER DATABASE REGISTER PHYSICAL LOGFILE 'filespec1';
实行Failover:
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
ALTER DATABASE ACTIVATE STANDBY DATABASE;
ALTER DATABASE MOUNT;
ALTER DATABASE OPEN;
2.如果是使用LGWR传递redo数据,那么执行以下命令:
检查是否有gap archive
SELECT THREAD#, LOW_SEQUENCE#, HIGH_SEQUENCE# FROM V$ARCHIVE_GAP;
如果有则register
ALTER DATABASE REGISTER PHYSICAL LOGFILE 'filespec1';
如果是由于网络问题而导致需要切换,那么通常standby端的RFS进程并不会意识到primary已经不可访问,所以RFS进程也不会释放当前的standby redo log文件。
如果是primary端的数据库实例由于故障中断,那么一般情况下standby端的RFS进程会立刻意识到primary已经不可访问,也就会立刻释放当前的standby redo log文件。
只要RFS进程没有释放standby redo log文件,那么执行ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH命令就会在alertlog文件中发现如下的报错信息
Warning: log 4 of thread 1 is being archived or modified
Recovery interrupted.
Media Recovery failed with error 261
如果在报上述错误的时候,执行SWITCH,那么将会出现下面的错误:
ORA-16139: media recovery required
所以必须检查alertlog文件,直到发现如下信息才表示RFS进程已经释放了standby redo log文件,这时候才可以作FINISH:
RFS: Possible network disconnect with primary database
促使RFS进程释放standby redo log 文件有两种方法:
1. 等待RFS进程的network timeout,通常需要等待8分钟左右
2. 关闭standby数据库,再重新开启,这样会强制RFS进程释放standby redo log
我们可以通过v$managed_standby视图来监控RFS进程何时释放
实行Failover:
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH;
alertlog中将显示如下信息,表示finish成功:
Terminal Incomplete Recovery: UNTIL CHANGE 3738452
Terminal Incomplete Recovery: End-Of-Redo log allocation
Terminal Incomplete Recovery: log 4 reserved for thread 1 seq# 8772
TERMINAL RECOVERY changing datafile format version from 8.0.0.0.0 to
9.0.0.0.0
Switching logfile format version from 8.0.0.0.0 to 9.0.0.0.0
Terminal Incomplete Recovery: clearing standby redo logs.
Terminal Incomplete Recovery: thread 1 seq# 8772 redo required
Terminal Incomplete Recovery: End-Of-Redo log /global/oradata/ctsdb/stdby_redo04.log
Identified end-of-REDO for thread 1 sequence 8772
Terminal Incomplete Recovery: end checkpoint SCN 3738453
Media Recovery Complete
Switching logfile format version from 9.0.0.0.0 to 8.0.0.0.0
Terminal Incomplete Recovery: successful completion
Begin: Wait for standby logfiles to be archived
Wed Sep 1 13:42:28 2004
ARC1: Evaluating archive log 4 thread 1 sequence 8772
Wed Sep 1 13:42:28 2004
ARC0: Evaluating archive log 4 thread 1 sequence 8772
Wed Sep 1 13:42:28 2004
ARC1: Beginning to archive log 4 thread 1 sequence 8772
Wed Sep 1 13:42:28 2004
ARC0: Unable to archive log 4 thread 1 sequence 8772
Wed Sep 1 13:42:28 2004
Creating archive destination LOG_ARCHIVE_DEST_1: '/global/oradata/ctsdb/archive/arch1_8772.log'
Wed Sep 1 13:42:28 2004
Log actively being archived by another process
Wed Sep 1 13:42:28 2004
ARC1: Completed archiving log 4 thread 1 sequence 8772
Wed Sep 1 13:42:43 2004
End: All standby logfiles have been archived
Resetting standby activation ID 4038461969 (0xf0b60a11)
Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH
FINSH成功之后再执行SWITCH:
ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY;
SWITCH成功之后,重新启动数据库:
SHUTDOWN IMMEDIATE;
STARTUP;
使用Data Guard Broker
创建Management Server repository:
emca
启动Management Server:
oemctl start oms
检查Management Server状态:
oemctl status oms sysman/oem_temp@bftest
启动Intelligent Agent:
agentctl start agent
如果启动agent报错,则检查相应的log文件,如果log文件中有如下错误:
Failed while initializing user subsystem
Error initializing subsystems
nmiumini_initializeUM: Unable to initialize UQAgent
则进行如下操作之后,重新启动agent:
rm $ORACLE_HOME/network/agent/*.q
alter system set resource_manager_plan='SYSTEM_PLAN' scope=both;
在所有站点上将BROKER启动。
SQL> ALTER SYSTEM SET DG_BROKER_START=TRUE SCOPE=BOTH;
System altered.
SQL> SHOW PARAMETER DG_BROKER_START
NAME TYPE VALUE
------------------------------------
dg_broker_start boolean TRUE
连接Data Guard Manager,必须使用具有sysdba权限的用户连接到Primary库上
>dgmgrl
DGMGRL> connect sys/dba
创建配置方案
DGMGRL> CREATE CONFIGURATION 'cts' AS
PRIMARY SITE IS 'bftest'
RESOURCE IS 'ctsdb'
HOSTNAME IS 'bftest'
INSTANCE NAME IS 'ctsdb'
SERVICE NAME IS 'ctsdb.primary'
SITE IS MAINTAINED AS PHYSICAL;
创建备用站点方案
DGMGRL> CREATE SITE 'report'
RESOURCE IS 'ctsdb'
HOSTNAME IS 'report'
INSTANCE NAME IS 'ctsdb'
SERVICE NAME IS 'ctsdb.standby'
SITE IS MAINTAINED AS PHYSICAL;
激活配置方案
DGMGRL> ENABLE CONFIGURATION;
激活资源
DGMGRL> ENABLE RESOURCE 'ctsdb';
资源的日志传送模式必须和Primary库的数据保护模式相匹配,比如数据保护模式是maximize availability,那么需要配置资源的LogXptMode属性为SYNC方式。
DGMGRL>ALTER RESOURCE 'ctsdb' ON SITE 'Boston' SET PROPERTY LogXptMode=SYNC;
DGMGRL>ALTER RESOURCE 'report_db' ON SITE 'Beijing' SET PROPERTY LogXptMode=SYNC;
DGMGRL> ALTER CONFIGURATION SET PROTECTION MODE AS MAXAVAILABILITY;
查看资源情况
DGMGRL> show resource verbose 'ctsdb';
查看某个节点上资源中的某一属性
DGMGRL> show resource verbose 'ctsdb' 'LogXptMode' on site 'Boston';
DGMGRL> SHOW RESOURCE 'ctsdb' LogXptStatus;
查看Broker的日志
DGMGRL> show log latest on site 'Boston';
查看数据库告警日志
DGMGRL> show log alert latest on site 'Boston';
查看资源的各种属性
DGMGRL> SHOW RESOURCE 'ctsdb' SendQEntries;
DGMGRL> SHOW RESOURCE 'report_db' SbyLogQueue;
DGMGRL> show resource verbose 'ctsdb' InconsistentLogXptProps;
修改资源属性,将自动修改数据库的相应初始化参数
DGMGRL> ALTER RESOURCE product_db on site v280 SET PROPERTY StandbyArchiveDest = '/global/oradata/ctsdb/archive';
Property "standbyarchivedest" updated.
DGMGRL> ALTER RESOURCE product_db on site v280 SET PROPERTY StandbyFileManagement = 'AUTO';
Property "standbyfilemanagement" updated.
DGMGRL> ALTER RESOURCE product_db on site v280 SET PROPERTY ArchiveLagTarget = '3600';
Property "archivelagtarget" updated.
停止Data Guard环境中的某个节点
DGMGRL> ALTER RESOURCE 'report_db' ON SITE 'Beijing' SETSTATE='offline';
启动Data Guard环境中的某个节点
DGMGRL> ALTER RESOURCE 'report_db' ON SITE 'Beijing' SETSTATE='LOGICAL-APPLY-ON';
修改 Data Guard环境中的某个节点的状态
DGMGRL> ALTER RESOURCE 'report_db' ON SITE 'v480' SETSTATE='READ-ONLY';
先停止连接到备用库上的所有连接
DGMGRL> ALTER RESOURCE 'report_db' ON SITE 'v480' SETSTATE='PHYSICAL-APPLY-ON';
停止Broker
SQL> ALTER SYSTEM SET DG_BROKER_START=FALSE;
作Switchover
DGMGRL> SWITCHOVER TO 'v480';
然后关闭Pirmary和Standby,重新启动
七.在Cluster环境中的主备切换步骤
在应用中cluster环境是很常见的,下面简单介绍一下在Sun Cluster 3.0的环境中,如果作Data Guard主备数据库的Switchover工作。
1.由于Cluster环境的监控,我们要手动关闭数据库的话,必须先关闭cluster,单独起一个节点的oracle。其中listener.ora.sigle的配置文件是预先写好的监听配置,主要不同是用主机的真实IP替换原先Cluster环境中的虚拟IP。
/usr/cluster/bin/scswitch -F -g oracle-rg
mount /global/oradata
cd /export/home/oracle/app/oracle/product/9.2.0/network/admin
cp listener.ora.sigle listener.ora
lsnrctl start
lsnrctl start listener_dg
sqlplus “/ as sysdba”
startup
2.在SQL*Plus中依次进行以下操作,完成切换Primary和Standby的工作
主数据库端:
ALTER DATABASE COMMIT TO SWITCHOVER TO PHYSICAL STANDBY;
SHUTDOWN IMMEDIATE;
STARTUP NOMOUNT;
ALTER DATABASE MOUNTSTANDBY DATABASE;
备用数据库端:
ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY;
SHUTDOWN IMMEDIATE;
STARTUP;
主数据库端:
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION;
八.参考文档
Oracle Data Guard Concepts and Administration Release 2 (9.2)
Oracle9i Data Guard Broker Release 2 (9.2)
技术专题总结:standby Database - snowhite、chao_ping
Oracle 9i备用数据库配置使用参考手册 - piner
[作者简介]
张乐奕,通常使用的网名为kamus,也曾用过seraphim,现在任职于北京某大型软件公司,Oracle数据库DBA,主要负责证券行业的核心交易系统数据库管理及维护工作。
热切关注Oracle技术和相关操作系统技术,出没于各大数据库技术论坛,目前是中国最大的Oracle技术论坛www.itpub.net的数据库管理版版主,