[20170707]cursor: pin S wait on X(10G)

[20170707]cursor: pin S wait on X(10G).txt

--生产系统遇到1个bug,版本:
EXAM@xxx> @ &r/ver1
PORT_STRING         VERSION    BANNER
------------------- ---------- ----------------------------------------------------------------
x86_64/Linux 2.4.xx 10.2.0.4.0 Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bi

--//我今天才介入解决问题,就是调用存储过程某个update语句时,应用hang住了.我看等待事件cursor: pin S wait on X很容易联想到
--//bug.

http://www.dba-oracle.com/t_cursor_pin_wait_on_x.htm

Question:  My AWR report shows "cursor: pin S wait on X" responsible for over 90% of my database processing time. How do
I fix the "cursor: pin S wait on X" event?

Answer:  The "cursor: pin S wait on X" is related to the new "mutex" option.  Oracle 10g release 2 and beyond replaced
some latch mechanisms with the mutex approach, claiming that they are faster and more efficient than traditional locking
mechanisms.  To improve cursor execution speed and hard pare time within the library cache, mutexes replace library
cache latches and library cache pins. Oracle claims that mutexes are faster and use less CPU, which is important for
CPU-bound database where large data buffers remove I/O as a primary source of contention.

Beware, Oracle bug 5184776 can affect 11g databases that deploy the mutex latching, causing excessive 'cursor pin s wait
on x' wait events.

If you suspect that you have a bug, you can revert to the pre-mutex method with the consent of Oracle Support using
these commands that bounce your database into the older latching mode, which may result in less deadlocks:

alter system set "_kks_use_mutex_pin"=false scope=spfile;
shutdown immediate
startup

NOTE:  This hidden parameter should only be used with the consent of Oracle support, and this parameter should NOT ever
be set in Oracle 12c

The cursor pin S wait on X wait event is mostly related to mutex and hard parse and it happens when a session waits for
this event when it is requesting a shared mutex pin and another session is holding an exclusive mutex pin on the same
cursor object. 

These are the parameters associated with this mutex wait event:

P1 Hash value of cursor
P2 Mutex value (top 2 bytes contains SID holding mutex in exclusive mode, and bottom two bytes usually hold the value 0)
P3 Mutex where (an internal code locator) OR'd with Mutex Sleeps 

--// 以下根据作者的执行的脚本分析:
SELECT s.inst_id AS inst
      ,s.sid AS blocked_sid
      ,s.username AS blocked_user
      ,sa.sql_id AS blocked_sql_id
      ,TRUNC (s.p2 / 4294967296) AS blocking_sid
      ,b.username AS blocking_user
      ,b.sql_id AS blocking_sql_id
  FROM gv$session s
       JOIN gv$sqlarea sa ON sa.hash_value = s.p1
       JOIN gv$session b
          ON TRUNC (s.p2 / 4294967296) = b.sid AND s.inst_id = b.inst_id
       JOIN gv$sqlarea sa2 ON b.sql_id = sa2.sql_id
WHERE s.event = 'cursor: pin S wait on X';

Record View
As of: 2017/7/7 10:30:03

INST:             1
BLOCKED_SID:      1577
BLOCKED_USER:     EXAM
BLOCKED_SQL_ID:   guv6x80vk33wb
BLOCKING_SID:     1579
BLOCKING_USER:    EXAM
BLOCKING_SQL_ID:  guv6x80vk33wb

--//sid=1579 阻塞了 sid=1577.

select sql_id,loaded_versions,executions,loads,invalidations,parse_calls
from gv$sql
where inst_id=1 and sql_id='guv6x80vk33wb';

SQL_ID        LOADED_VERSIONS EXECUTIONS      LOADS INVALIDATIONS PARSE_CALLS
------------- --------------- ---------- ---------- ------------- -----------
guv6x80vk33wb               0          0          1             1           0

--//注:我前面做了alter system flush shared_pool;,这里看到的EXECUTIONS次数是0.
--//另外我在刷新前执行kill -9 os进程(当前正在执行这条sql语句的会话),我当时以为这样问题就可以解决.脚本如下:
select 'kill -9 '||spid,a.*  from V$PROCESS a where  addr in (
  SELECT paddr
    FROM v$session
   WHERE wait_class <> 'Idle' AND sql_id = 'guv6x80vk33wb'
);

--//查看sid=1579的情况:
select * from v$session where sid=1579;
Record View
As of: 2017/7/7 10:32:19
SADDR:                      000000007C90B688
SID:                        1579
SERIAL#:                    34307
AUDSID:                     54134981
PADDR:                      000000007D545808
USER#:                      58
USERNAME:                   EXAM
COMMAND:                    6
OWNERID:                    2147483644
TADDR:                     
LOCKWAIT:                  
STATUS:                     KILLED
SERVER:                     DEDICATED
SCHEMA#:                    58
SCHEMANAME:                 EXAM
OSUSER:                     SYSTEM
PROCESS:                    3324:10116
MACHINE:                    WORKGROUP\BROKER
TERMINAL:                   BROKER
PROGRAM:                    Kodak.GCGateway.Adapter.Service.exe
TYPE:                       USER
SQL_ADDRESS:                00000000733A0DF8
SQL_HASH_VALUE:             924946315
SQL_ID:                     guv6x80vk33wb
SQL_CHILD_NUMBER:           3
PREV_SQL_ADDR:              0000000074233B68
PREV_HASH_VALUE:            152532753
PREV_SQL_ID:                7wp2a204jfxsj
PREV_CHILD_NUMBER:          5
PLSQL_ENTRY_OBJECT_ID:      116247
PLSQL_ENTRY_SUBPROGRAM_ID:  1
PLSQL_OBJECT_ID:           
PLSQL_SUBPROGRAM_ID:       
MODULE:                     Kodak.GCGateway.Adapter.Service.exe
MODULE_HASH:                163112698
ACTION:                    
ACTION_HASH:                0
CLIENT_INFO:                192.168.29.13
FIXED_TABLE_SEQUENCE:       183286255
ROW_WAIT_OBJ#:              -1
ROW_WAIT_FILE#:             0
ROW_WAIT_BLOCK#:            0
ROW_WAIT_ROW#:              0
LOGON_TIME:                 2017/6/28 13:03:40
LAST_CALL_ET:               768474
PDML_ENABLED:               NO
FAILOVER_TYPE:              NONE
FAILOVER_METHOD:            NONE
FAILED_OVER:                NO
RESOURCE_CONSUMER_GROUP:   
PDML_STATUS:                DISABLED
PDDL_STATUS:                ENABLED
PQ_STATUS:                  ENABLED
CURRENT_QUEUE_DURATION:     0
CLIENT_IDENTIFIER:         
BLOCKING_SESSION_STATUS:    UNKNOWN
BLOCKING_INSTANCE:         
BLOCKING_SESSION:          
SEQ#:                       31
EVENT#:                     254
EVENT:                      single-task message
P1TEXT:                    
P1:                         0
P1RAW:                      00
P2TEXT:                    
P2:                         0
P2RAW:                      00
P3TEXT:                    
P3:                         0
P3RAW:                      00
WAIT_CLASS_ID:              2723168908
WAIT_CLASS#:                6
WAIT_CLASS:                 Idle
WAIT_TIME:                  0
SECONDS_IN_WAIT:            768474
STATE:                      WAITING
SERVICE_NAME:               mid
SQL_TRACE:                  DISABLED
SQL_TRACE_WAITS:            FALSE
SQL_TRACE_BINDS:            FALSE

--//昏,这个漏网了,因为WAIT_CLASS=Idle,而且很奇怪STATUS=KILLED.如果前面执行条件没有wait_class <> 'Idle',估计问题早解决了.
--//另外从时间看2017/6/28 13:03:40,正好也是网络出现环路的时间,正好也可以相互印证.
--//再次说明做事情要认真小心....

SELECT * FROM V$PROCESS WHERE addr IN (SELECT paddr FROM v$session WHERE sid = 1579);

Record View
As of: 2017/7/7 10:34:31

ADDR:              000000007D545808
PID:               20
SPID:              6086
USERNAME:          oracle
SERIAL#:           73
TERMINAL:          UNKNOWN
PROGRAM:           oracle@qyymid
TRACEID:          
BACKGROUND:       
LATCHWAIT:        
LATCHSPIN:        
PGA_USED_MEM:      556053
PGA_ALLOC_MEM:     779293
PGA_FREEABLE_MEM:  0
PGA_MAX_MEM:       779293

--//删除进程号6086.
$ kill -9 6086

--//再测试ok,问题消失.顺便贴分析时的awr报表的前面部分:

WORKLOAD REPOSITORY report for

DB Name         DB Id    Instance     Inst Num Release     RAC Host       
------------ ----------- ------------ -------- ----------- --- ------------
MID           2812601929 mid                 1 10.2.0.4.0  NO  qyymid     

              Snap Id      Snap Time      Sessions Curs/Sess
            --------- ------------------- -------- ---------
Begin Snap:     44914 07-Jul-17 09:00:11        50       3.8
  End Snap:     44915 07-Jul-17 09:29:29        57       4.7
   Elapsed:               29.30 (mins)
   DB Time:                1.46 (mins)

--//很奇怪这个版本看到DB Time很小,仅仅1.46 (mins)

Cache Sizes
~~~~~~~~~~~                       Begin        End
                             ---------- ----------
               Buffer Cache:       136M       136M  Std Block Size:         8K
           Shared Pool Size:       320M       320M      Log Buffer:     6,152K

--//不是我配的(数据库运行在虚拟机器上的),明显Buffer Cache小了.不过这个服务器做的事情很少.

Load Profile
~~~~~~~~~~~~                            Per Second       Per Transaction
                                   ---------------       ---------------
                  Redo size:              1,698.35             16,406.59
              Logical reads:                477.12              4,609.12
              Block changes:                 22.35                215.91
             Physical reads:                  0.30                  2.87
            Physical writes:                  0.66                  6.38
                 User calls:                 13.05                126.08
                     Parses:                102.20                987.24
                Hard parses:                  0.27                  2.58
                      Sorts:                178.27              1,722.16
                     Logons:                  2.18                 21.09
                   Executes:                103.94              1,004.12
               Transactions:                  0.10

  % Blocks changed per Read:    4.68    Recursive Call %:    98.56
Rollback per transaction %:    4.40       Rows per Sort:     1.86

Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            Buffer Nowait %:  100.00       Redo NoWait %:  100.00
            Buffer  Hit   %:   99.96    In-memory Sort %:  100.00
            Library Hit   %:   99.61        Soft Parse %:   99.74
         Execute to Parse %:    1.68         Latch Hit %:  100.00
Parse CPU to Parse Elapsd %:   30.64     % Non-Parse CPU:   85.20

Shared Pool Statistics        Begin    End
                              ------  ------
             Memory Usage %:   93.17   91.29
    % SQL with executions>1:   82.12   79.33
  % Memory for SQL w/exec>1:   83.60   80.99

Top 5 Timed Events                                         Avg %Total
~~~~~~~~~~~~~~~~~~                                        wait   Call
Event                                 Waits    Time (s)   (ms)   Time Wait Class
------------------------------ ------------ ----------- ------ ------ ----------
cursor: pin S wait on X           1,786,725      19,351     11 ###### Concurrenc
--//很奇怪,这个版本为什么不把cursor: pin S wait on X计入db time时间.
CPU time                                             56          63.8          
log file sync                           356          24     66   26.9     Commit
control file parallel write             582          21     36   23.7 System I/O
log file parallel write                 749          20     27   22.6 System I/O
          -------------------------------------------------------------

--//查询有问题sql_id=guv6x80vk33wb,也无法在awr报表中体现出来.

时间: 2024-11-10 05:17:11

[20170707]cursor: pin S wait on X(10G)的相关文章

[20130628]关于cursor pin S以及cursor pin S wait on X的相关问题.txt

[20130628]关于cursor pin S以及cursor pin S wait on X的相关问题.txt 生产系统最近一段时间总是出现cursor pin S wait on X,以前从来没有遇到,今天抽空看了一下.了解关于cursor pin S以及cursor pin S wait on X的相关问题,我看了一些链接: http://www.pythian.com/blog/cursor-pin-s-wait-on-x-in-the-top-5-wait-events/http:/

cursor: pin S

cursor: pin S OTN 解释如下:   cursor: pin SA session waits on this event when it wants to update a shared mutex pin and another session is currently in the process of updating a shared mutex pin for the same cursor object. This wait event should rarely b

latch: row cache objects 和cursor: pin S wait on X共同出现

 Snap Id Snap Time Sessions Cursors/Session Begin Snap: 5980 11-Nov-13 16:00:37 1403  39.6 End Snap: 5981 11-Nov-13 16:31:29 1422  42.8 Elapsed:    30.86 (mins)    DB Time:    3,660.60 (mins)    Per Second Per Transaction Redo size:  1,185,514.75  6,

cursor: pin S wait on X等待事件模拟(转)

转自:http://warehouse.itpub.net/post/777/493962 这是10.2版本提出的mutex(互斥)机制用来解决library cache bin latch争夺问题引入的新事件,是否使用这种机制受到隐含参数_kks_use_mutex_pin的限制,从10.2.0.2开始该参数default为true,使用这种机制oracle是为了解决library cache bin latch的串行使用问题,但是mutex貌似还不是很稳定,在很多系统中会出现cursor:

【中亦安图】小机上运行Oracle需要注意的进程调度bug(1)

第一章 技术人生系列 · 我和数据中心的故事--第一期 第一期:技术人生系列 · 我和数据中心的故事(第一期)小机上运行Oracle需要注意的进程调度bug. 2016-01-11 小y 中亦安图   前 言   小y这个名字,是笔者临时想的一个笔名,其实没有什么特殊的含义,就暂且用他来代表我们这些为各个数据中心奉献自己青春的一群默默无闻的IT人吧!   小y今天要和大家分享的是一个疑难杂症的分析过程.如果大家有耐心读完这个案例,一定会或多或少有些收获,也就没浪费小y的一片苦心.   具体来说是

为什么用 PostgreSQL 绑定变量 没有 Oracle pin S 等待问题

早上看到盖国强老师在朋友圈里分享了一篇关于软解析带来的Pin S等待的问题.有感而发,跟大家聊一聊为什么PostgreSQL不存在这个问题. Oracle 在Oracle中多个会话高并发的执行同一条SQL,如果使用了绑定变量的话,会产生pin s的等待事件.原因如下(取自互联网http://www.dbafree.net/?p=778) 每个child cursor(你可以认为是一条SQL的plan tree)下面都有一个mutexes这样的简单内存结构,当有session要执行该SQL而需要p

[20120206]Cursor Invalidation与分析表.txt

在分析表的是否有一个参数no_invalidate:缺省值是DBMS_STATS.AUTO_INVALIDATE.AUTO_INVALIDATE.     10g中默认是AUTO_INVALIDATE,就是说分析表后,游标不会马上invalidate,已经存在的SQL的执行计划不会受新的统计信息影响.可以手工DDL invalidate游标.又或者等待隐藏参数_optimizer_invalidation_period(time window for invalidation of cursor

【MOS】常见问题cursor library cache类型的等待事件

[MOS]常见问题:'cursor:mutex ..'/ 'cursor:pin ..'/ 'library cache:mutex ..'类型的等待事件 (文档 ID 1525791.1) 文档内容 用途 问题和答案   什么是 'cursor: ' 等待事件?   最常见的等待事件是什么?   等待事件最常见的原因是什么?   如何避免这些等待事件?   可以在什么位置找到原因诊断以及关于这些等待事件的更多信息?   有用参考 参考 适用于: Oracle Database - Enterp

Oracle中的游标、硬解析、软解析、软软解析、解析失败

Oracle中的游标.硬解析.软解析.软软解析.解析失败 一. 游标的分类及共享游标 游标(Cursor)是Oracle数据库中SQL解析和执行的载体,它可以分为共享游标(Shared Cursor)和会话游标(Session Cursor).共享游标可以细分为父游标(Parent Cursor)和子游标(Child Cursor),可以通过视图V$SQLAREA和V$SQL来查看当前缓存在库缓存(Library Cache)中的父游标和子游标,其中V$SQLAREA用于查看父游标,V$SQL用