oracle报错ORA-32701 error in alert.log M000 hang event ‘not in wait’ during flush AWR

环境是一套11.2.0.3  2nodes RAC on hpux-ia31, alert中出现ora-32701 hangmgr错误, 从trace文件中发现是m000进程是mmon的辅助进程,用于flush AWR相关数据,有一个wait event: enq: WF – contention, 这也是flush AWR数据时相关的enqueue等待,但是blocker进程是not in wait, (另mmon hang是可以直接kill 该进程spid,  通常会在不重启实例的前掉下重启该进程)。 这里简单的记录该问题

# db alert

Thread 2 advanced to log sequence 170773 (LGWR switch)
  Current log# 8 seq# 170773 mem# 0: /dev/tza_oravg02/rtza_redo08
Thu May 26 17:08:11 2016
Errors in file /oracle/app/oracle/diag/rdbms/ucisa/anbob2/trace/anbob2_dia0_29439.trc  (incident=720137):
ORA-32701: Possible hangs up to hang ID=0 detected
Incident details in: /oracle/app/oracle/diag/rdbms/ucisa/anbob2/incident/incdir_720137/anbob2_dia0_29439_i720137.trc
Thu May 26 17:08:12 2016
Sweep [inc][720137]: completed
System State dumped to trace file /oracle/app/oracle/diag/rdbms/ucisa/anbob2/incident/incdir_720137/anbob2_m001_22862_i720137_a.trc
DIA0 terminating blocker (ospid: 13964 sid: 5649 ser#: 7241) of hang with ID = 3
    requested by master DIA0 process on instance 1
    Hang Resolution Reason: Although the number of affected sessions did not
    justify automatic hang resolution initially, this previously ignored
    hang was automatically resolved.
    by terminating session sid: 5649 ospid: 13964
DIA0 successfully terminated session sid:5649 ospid:13964 with status 31.
Thu May 26 17:09:46 2016
Errors in file /oracle/app/oracle/diag/rdbms/ucisa/anbob2/trace/anbob2_dia0_29439.trc  (incident=720138):
ORA-32701: Possible hangs up to hang ID=0 detected
Incident details in: /oracle/app/oracle/diag/rdbms/ucisa/anbob2/incident/incdir_720138/anbob2_dia0_29439_i720138.trc
Thu May 26 17:09:47 2016
Sweep [inc][720138]: completed
Thu May 26 17:09:47 2016
Sweep [inc2][720138]: completed
Sweep [inc2][720137]: completed
System State dumped to trace file /oracle/app/oracle/diag/rdbms/ucisa/anbob2/incident/incdir_720138/anbob2_m003_23913_i720138_a.trc
DIA0 terminating blocker (ospid: 13964 sid: 5649 ser#: 7241) of hang with ID = 3
    requested by master DIA0 process on instance 1
    Hang Resolution Reason: Although the number of affected sessions did not
    justify automatic hang resolution initially, this previously ignored
    hang was automatically resolved.
   by terminating the process
DIA0 successfully terminated process ospid:13964.
Thu May 26 17:17:16 2016
Thread 2 advanced to log sequence 170774 (LGWR switch)
  Current log# 9 seq# 170774 mem# 0: /dev/tza_oravg02/rtza_redo09
# trace file: /oracle/app/oracle/diag/rdbms/ucisa/anbob2/trace/anbob2_dia0_29439.trc

Dump continued from file: /oracle/app/oracle/diag/rdbms/ucisa/anbob2/trace/anbob2_dia0_29439.trc
ORA-32701: Possible hangs up to hang ID=0 detected

========= Dump for incident 720137 (ORA 32701) ========
----- Beginning of Customized Incident Dump(s) -----
 
There are resolvable hangs on your system.  Hang Manger will
attempt to resolve these hangs.  Some information about these
is output below.  Complete information is available in an
incident trace file on instance 1.
 
If the hang is to be resolved by terminating the session or
process that is the root or victim of the hang, additional
information will be output on the local instance of that
session or process.  Below are the hangs for which resolution
will be attempted.
 

*** 2016-05-26 17:08:11.765
Resolvable Hangs in the System
                      Root       Chain Total               Hang              
   Hang Hang          Inst Root  #hung #hung  Hang   Hang  Resolution        
     ID Type Status   Num  Sess   Sess  Sess  Conf   Span  Action            
  ----- ---- -------- ---- ----- ----- ----- ------ ------ -------------------
      3 HANG RSLNPEND    2  5649     2     2   HIGH GLOBAL Terminate Process 
  Hang Resolution Reason: Although the number of affected sessions did not
    justify automatic hang resolution initially, this previously ignored
    hang was automatically resolved.
 
      inst# SessId  Ser#     OSPID PrcNm Event
      ----- ------ ----- --------- ----- -----
          1   6238 38599      3697  M000 enq: WF - contention
          2   5649  7241     13964  M000 not in wait
 
Dumping process info of pid[212.13964] (sid:5649, ser#:7241)
    requested by master DIA0 process on instance 1.
 

*** 2016-05-26 17:08:11.765
Process diagnostic dump for oracle@qdtza2 (M000), OS id=13964,
pid: 212, proc_ser: 66, sid: 5649, sess_ser: 7241
-------------------------------------------------------------------------------
os thread scheduling delay history: (sampling every 1.000000 secs)
  0.000000 secs at [ 17:08:11 ]
    NOTE: scheduling delay has not been sampled for 0.103181 secs  0.000667 secs from [ 17:08:07 - 17:08:12 ], 5 sec avg
  0.003777 secs from [ 17:07:12 - 17:08:12 ], 1 min avg
  0.000976 secs from [ 17:03:12 - 17:08:12 ], 5 min avg
loadavg : 0.19 0.18 0.17                                                                                    <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
Swapinfo :
        Avail = 380207.03Mb Used = 165771.44Mb
        Swap free = 214385.22Mb Kernel rsvd = 15198.44Mb
        Free Mem  = 83933.84Mb
  F S      UID   PID  PPID  C PRI NI             ADDR   SZ            WCHAN    STIME TTY       TIME COMD
1401 R   oracle 13964     1 254 178 20 e000000d77ef2040 102092                - 17:00:09 ?         7:57 ora_m000_anbob2        <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
Short stack dump:
ksedsts()+544<-ksdxfstk()+48<-ksdxcb()+3216<-sspuser()+688<-<-kghalf()+385<-kqlfbctc()+1024<-kgligi()+128<-kglic_cbk()+832<-kglic0()+1376<-kglic()+144<-kqlfbct()+560<-qerfxFetch()+2784<-qerjotRowProc()+2432<-qerhjWalkHashBucket()+768<-qerhjInnerProbeHashTable()+1168<-qerfxFetch()+1680<-rwsfcd()+320<-qerhjFetch()+1040<-qerjotFetch()+3664<-rwsfcd()+320<-qerltcFetch()+1696<-insexe()+1584<-opiexe()+16192<-kpoal8()+4624<-opiodr()+2416<-kpoodrc()+64<-rpiswu2()+1120<-kpoodr()+1008<-upirtrc()+2640<-kpurcsc()+256<-kpuexec()+6672<-OCIStmtExecute()+80<-kewrose_oci_stmt_exec()+96<-kewrgwxf1_gwrsql_exft_1()+576<-kewrgwxf_gwrsql_exft()+928<-kewrftbs_flush_table_by_sql()+352<-kewrfabt_flush_attribute_table()+1120<-kewrfsb_flush_sqlbind()+80<-kewrft_flush_table()+800<-kewrftec_flush_table_ehdlcx()+832<-kewrftsq_flush_topsql()+624<-kewrft_flush_table()+800<-kewrftec_flush_table_ehdlcx()+832<-kewrfat_flush_all_tables()+2144<-kewrfos_flush_onesnap()+272<-kewrfsc_flush_snapshot_c()+832<-kewrafs_auto_flush_slave()+1008<-kebm_slave_main()+944<-ksvrdp()+5440<-opirip()+1360<-opidrv()+1152<-sou2o()+256<-opimai_real()+352<-ssthrdmain()+576<-main()+336<-main_opd_entry()+80
 
-------------------------------------------------------------------------------
Process diagnostic dump actual duration=0.392000 sec
  (max dump time=15.000000 sec)

  <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
*** 2016-05-26 17:08:12.156
current sql: insert into wrh$_sql_bind_metadata   (snap_id, dbid,    sql_id, name, position, dup_position,    datatype, datatype_string,    character_sid, precision, scale, max_length)  SELECT /*+ ordered use_nl(bnd) index(bnd sql_id) */      :lah_snap_id, :dbid,      bnd.sql_id, name, position, dup_position,      datatype, dataty
 
                    ----------------------------------------
                    SO: 0xc000002071d7c428, type: 2, owner: 0x0000000000000000, flag: INIT/-/-/0x00 if: 0x3 c: 0x3
                     proc=0xc000002071d7c428, name=process, file=ksu.h LINE:12616 ID:, pg=0
                    (process) Oracle pid:212, ser:66, calls cur/top: 0xc000001fb321ec20/0xc000001fb31300e8
                              flags : (0x2) SYSTEM
                              flags2: (0x30),  flags3: (0x10)
                              intr error: 0, call error: 0, sess error: 0, txn error 0
                              intr queue: empty
                      ksudlp FALSE at location: 0
  (post info) last post received: 0 0 80
              last post received-location: kji.h LINE:3418 ID:kjata: wake up enqueue owner
              last process to post me: c0000020b1d72b30 2 6
              last post sent: 0 0 26
              last post sent-location: ksa2.h LINE:285 ID:ksasnd
              last process posted by me: c000002071d60198 2 6
                      (latch info) wait_event=0 bits=0
                      Process Group: DEFAULT, pseudo proc: 0xc0000020a20d45a0
                      O/S info: user: oracle, term: UNKNOWN, ospid: 13964
                      OSD pid info: Unix process pid: 13964, image: oracle@qdtza2 (M000)
                      ----------------------------------------
                      SO: 0xc0000020505807f8, type: 10, owner: 0xc000002071d7c428, flag: INIT/-/-/0x00 if: 0x1 c: 0x1
                       proc=0xc000002071d7c428, name=FileOpenBlock, file=ksfd.h LINE:6448 ID:, pg=0
                      (FOB) c0000020505807f8 flags=2050 fib=c0000020c8afa058 incno=0 pending i/o cnt=0
                       fname=/dev/tza_oravg04/rtza_lv16_403
                       fno=820 lblksz=16384 fsiz=1105919
       
    service name: SYS$BACKGROUND
                      Current Wait Stack:
                        Not in wait; last wait ended 7 min 48 sec ago           <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
                      There are 1 sessions blocked by this session.
                      Dumping one waiter:
                        inst: 1, sid: 6238, ser: 38599
                        wait event: 'enq: WF - contention'
                          p1: 'name|mode'=0x57460006
                          p2: '0'=0x46
                          p3: '0'=0x0
                        row_wait_obj#: 4294967295, block#: 0, row#: 0, file# 0         <<<<<<<<<<<<<<<<<<<<<<<<<<
                        min_blocked_time: 445 secs, waiter_cache_ver: 50989
                      Wait State:
                        fixed_waits=0 flags=0x20 boundary=0x0000000000000000/-1
                      Session Wait History:
                          elapsed time of 7 min 48 sec since last wait
                       0: waited for 'gc cr block 2-way'
                          =0x3, =0x115a3, =0x1
                          wait_id=5905 seq_num=5906 snap_id=1
                          wait times: snap=0.000424 sec, exc=0.000424 sec, total=0.000424 sec
                          wait times: max=infinite
                          wait counts: calls=1 os=1
                          occurred after 0.000712 sec of elapsed time
                       1: waited for 'gc current block 2-way'
                          =0x3, =0x13148, =0x1
                          wait_id=5904 seq_num=5905 snap_id=1
                          wait times: snap=0.000325 sec, exc=0.000325 sec, total=0.000325 sec
                          wait times: max=infinite
                          wait counts: calls=1 os=1
                          occurred after 0.001039 sec of elapsed time
                       2: waited for 'enq: WF - contention'
                          name|mode=0x57460006, 0=0x38, 0=0x0
                          wait_id=5903 seq_num=5904 snap_id=1
                          wait times: snap=0.000257 sec, exc=0.000257 sec, total=0.000257 sec
                          wait times: max=infinite
                          wait counts: calls=2 os=2
                          occurred after 1.078632 sec of elapsed time
                       3: waited for 'enq: WF - contention'
                          name|mode=0x57460006, 0=0x46, 0=0x0
                          wait_id=5902 seq_num=5903 snap_id=1
                          wait times: snap=0.000282 sec, exc=0.000282 sec, total=0.000282 sec
                          wait times: max=infinite
                          wait counts: calls=2 os=2
                          occurred after 0.002117 sec of elapsed time
                       4: waited for 'gc current block 2-way'
                          =0x123, =0x31410, =0x2000001
                          wait_id=5901 seq_num=5902 snap_id=1
                          wait times: snap=0.000276 sec, exc=0.000276 sec, total=0.000276 sec
                          wait times: max=infinite
                          wait counts: calls=1 os=1
                          occurred after 0.000529 sec of elapsed time
                       5: waited for 'gc current block 2-way'
                          =0x123, =0x71a2, =0x2000009
                          wait_id=5900 seq_num=5901 snap_id=1
                          wait times: snap=0.000141 sec, exc=0.000141 sec, total=0.000141 sec
                          wait times: max=infinite
                          wait counts: calls=1 os=1
                          occurred after 0.000231 sec of elapsed time
...

解决方法:
有个类似 Bug 20733574 : HANG DETECTED WHILE CREATING AWR SNAPSHOT, 但当时因为收集信息不足,开发未确认为bug;
还有一种解决方法是跳过收集与该表相关的数据(这个案例是wrh$_sql_bind_metadata记录的是SQL Bind Metadata),使用下面的命令:
alter system set “_awr_disabled_flush_tables” = ‘wrh$_sql_bind_metadata’;
该命令是动态的,不需要重启实例, 禁用多个表时参数用逗号分隔;

原理禁用了和上面表相关的统计信息flush ash to awr, 除了上述的bug ,还有其它一些原因时如空间不足、hang、也可以禁用,除了上面的禁用方法还有另一种禁用方法,先查询AWR Table Info (KEW layer)使用下面的SQL:

select table_id_kewrtb, table_name_kewrtb from x$kewrtb order by 1;

从上面的查询中得到table_id用于下面的level 值,比如我们这里表wrh$_sql_bind_metadata对应的是56

alter session set events ‘immediate trace name awr_flush_table_off level 56’;

后期如果想打开flushing to awr如果使用第一种方法就是把”_awr_disabled_flush_tables” reset掉;如果第二种方法禁用,打开使用

alter session set events ‘immediate trace name awr_flush_table_on level 56’;

时间: 2025-01-20 10:58:23

oracle报错ORA-32701 error in alert.log M000 hang event ‘not in wait’ during flush AWR的相关文章

Oracle报错ORA-00604 ORA-00376 数据库redo undo丢失恢复例子

运维DBA反映数据库存储故障,导致redo undo两个表空间数据文件丢失,数据库无法open启动 某集团的ebs系统因磁盘空间不足把redo和undo存放到raid 0之上,而且该库无任何备份.最终悲剧发生了,raid 0异常导致redo undo全部丢失,数据库无法正常启动(我接手之时数据库已经resetlogs过,但是未成功) 1.Oracle报错ORA-00604 ORA-00376 Sun Jul 27 11:31:27 2014 SMON: enabling cache recove

项目从mysql迁移到oracle报错

问题描述 项目从mysql迁移到oracle报错 本来没有一点错误,迁移之后出现了很多,不知道哪位大神见过下面的问题: 17:23:54,967 ERROR BasicPropertyAccessor:118 - IllegalArgumentException in class: com.ebupt.dataWarehouse.entity.dataQualityMonitoring.MonTaskConEntity, setter method of property: TemporaryI

pl/sql developer-PL/SQL Developer 远程连接Oracle 报错无法解析指定的标识符

问题描述 PL/SQL Developer 远程连接Oracle 报错无法解析指定的标识符 环境: 虚拟机中安装CentOS 6.3 CentOS安装Oracle 11gr2 客户机Win7 问题介绍: 已经在CentOS 启动了实例,监听: 开放了1521端口:虚拟机跟客户机网络连通: 使用DBVisualizer可以连接成功目标实例: PL/SQL Dev连接就报错"ORA-12154: TNS:could not resolve the connect identifier specifi

lsnrctl启动报错,Linux Error: 29: Illegal seek

lsnrctl启动报错,Linux Error: 29: Illegal seek lsnrctl启动报错,信息如下 [oracle@sap admin]$ lsnrctl start LSNRCTL for Linux: Version 11.2.0.1.0 - Production on 15-SEP-2014 02:33:08 Copyright (c) 1991, 2009, Oracle.  All rights reserved. Starting /home/oracle/app/

c++-win32 C++应用程序放到Server08下连接oracle报错

问题描述 win32 C++应用程序放到Server08下连接oracle报错 程序是C++的,连接本地Oracle数据库,方法为 CoInitialize(NULL);ConnectionPtr conn; try{ conn.CreateInstance(_uuidof(Connection)); conn->Open("Provider=OraOLEDB.Oracle.1;Data Source=ORACLE;Persist Security Info=false;Password=

网页错误 rror-在虚拟机部署网页报错:Server Error in &amp;amp;#39;/&amp;amp;#39; Application.

问题描述 在虚拟机部署网页报错:Server Error in '/' Application. 在虚拟机部署网页时不时会报此错,重启IIS后暂时没问题,过多一会儿又提示该错! 参数:IIS6.0.asp.net 4.0.Win2003 SP2 备注:此为应用服务器,数据服务器在另外一台机 解决方案 是不是页面本身处理有问题 解决方案二: 在本机(非虚拟机)中部署无出现任何问题,在虚拟机中就报错...

mysql-MYSQL登录报错,提示Error 2002

问题描述 MYSQL登录报错,提示Error 2002 重新安装mysql后,在启动服务登录时报ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock' (2) 这样的错误,求解答. 解决方案 查看端口,netstat,看mysql服务是否在对应端口侦听. 解决方案二: 检查网络和防火墙,还有sql的服务器端是否运行 解决方案三: 参考:http://blo

hibernate连接oracle报错

问题描述 hibernate连接oracle报错 Exception in thread "main" java.lang.AbstractMethodError: Method oracle/jdbc/driver/OracleDatabaseMetaData.supportsNamedParameters()Z is abstract 解决方案 PB连接Oracle报错plsql 连接 oracle 报错:ora-12514 解决方案二: 你看看是不是你的驱动包的问题,该异常说的是

oracle 报错 ora-00905

问题描述 oracle 报错 ora-00905 report如下,求大神来看看 在行: 99 上开始执行命令时出错 - ALTER TABLE vet ADD CONSTRAINT vetpractice_vet_fk FOREIGN KEY (vetpractice_no, pet_no, owner_no) REFERENCES vetpractice (vetpractice_no, pet_no, owner_no) ON DELETE NO ACTION ON UPDATE NO A