[20150331]生产系统登陆缓慢的问题.txt
--上午生产系统出现登陆缓慢,查看等待事件library cache lock ,还好开发自己发现是11g设置口令大小写问题,及时纠正了错误.
--事后,我想如果开发不发现,我估计要诊断一段时间才能定位问题,我自己在测试环境做一个模拟看看,便于以后定位:
1.建立测试环境:
SYS@test> @ver1
PORT_STRING VERSION BANNER
------------------------------ -------------- --------------------------------------------------------------------------------
x86_64/Linux 2.4.xx 11.2.0.3.0 Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
--建立用户test,略,并且设置profile=MONITORING_PROFILE,MONITORING_PROFILE,因为这个用户口令设置许多参数都是无限制的.
ALTER USER TEST PROFILE MONITORING_PROFILE;
$ cat ay.sql
connect test/a
connect test/a
connect test/a
connect test/a
connect test/a
connect test/a
connect test/a
connect test/a
connect test/a
connect test/a
connect test/a
connect test/a
connect test/a
connect test/a
connect test/a
connect test/a
connect test/a
connect test/a
connect test/a
connect test/a
connect test/a
--注意口令是不正确的.
SYS@test> host cat /home/oracle11g/sqllaji/wx.sql
select p1,p2,p3,p1raw,p2raw,p3raw,sid,serial#,seq#,event,state,wait_time_micro,seconds_in_wait,sql_id from v$session where wait_class'Idle' and event not like 'SQL*Net%' order by event ;
2. 开始测试,执行如下:
sqlplus -s /nolog @ay &
sqlplus -s /nolog @ay &
sqlplus -s /nolog @ay &
SCOTT@test> @wx
P1 P2 P3 P1RAW P2RAW P3RAW SID SERIAL# SEQ# EVENT STATE WAIT_TIME_MICRO SECONDS_IN_WAIT SQL_ID
---------- ---------- ---------- ---------------- ---------------- ---------------- ---------- ---------- ---------- ------------------- ------------------- --------------- --------------- -------------
3190141048 3195489736 5177346 00000000BE25B078 00000000BE774DC8 00000000004F0002 9 13 3 library cache lock WAITING 8903409 9
3190141048 2966733664 5177346 00000000BE25B078 00000000B0D4C360 00000000004F0002 411 5 3 library cache lock WAITED KNOWN TIME 9994796 19
3190141048 3189066464 5177347 00000000BE25B078 00000000BE154AE0 00000000004F0003 10 41 7 library cache lock WAITING 9137097 9
--这个时候如果用户登陆,即使口令正确,也许要很长时间.
$ time echo "select sysdate from dual;" | sqlplus -s test/btbtms
SYSDATE
-------------------
2015-03-30 17:11:45
real 0m16.715s
user 0m0.030s
sys 0m0.011s
--16秒才退出.
3.如何定位:
SYS@test> select * from V$EVENT_NAME where lower(name) like lower('%library cache lock%');
EVENT# EVENT_ID NAME PARAMETER1 PARAMETER2 PARAMETER3 WAIT_CLASS_ID WAIT_CLASS# WAIT_CLASS
---------- ---------- ---------------------------------------- -------------------- -------------------- -------------------- ------------- ----------- --------------------
286 916468430 library cache lock handle address lock address 100*mode+namespace 3875070507 4 Concurrency
--第1个想到的是ash,查看报表,意义不大,看到与上面查不多.
--P1是handle address,可以执行如下:
SCOTT@test> set serveroutput on
SCOTT@test> exec print_table('select * from x$kglob where kglhdadr=''00000000BE25B078''');
ADDR : 0000002A976D7850
INDX : 1125
INST_ID : 1
KGLHDADR : 00000000BE25B078
KGLHDPAR : 00000000BE25B078
KGLHDCLT : 17744
KGLNAOWN :
KGLNAOBJ : 96
KGLFNOBJ : 96
KGLNADLK :
KGLNAHSH : 3565438288
KGLNAHSV : 36212cd0bb9712c7fc79aeb5d4844550
KGLNATIM :
KGLNAPTM :
KGLHDNSP : 79
KGLHDNSD : ACCOUNT_STATUS
KGLHDLMD : 3
KGLHDPMD : 0
KGLHDFLG : 10241
KGLHDOBJ : 00
KGLHDLDC : 0
KGLHDIVC : 0
KGLHDEXC : 0
KGLHDLKC : 1
KGLHDKMK : 0
KGLHDDMK : 0
KGLHDAMK : 0
KGLOBFLG : 0
KGLOBSTA : 0
KGLOBTYP : -1
KGLOBTYD : NONE
KGLOBHMK : 0
KGLOBHS0 : 0
KGLOBHS1 : 0
KGLOBHS2 : 0
KGLOBHS3 : 0
KGLOBHS4 : 0
KGLOBHS5 : 0
KGLOBHS6 : 0
KGLOBHS7 : 0
KGLOBHD0 : 00
KGLOBHD1 : 00
KGLOBHD2 : 00
KGLOBHD3 : 00
KGLOBHD4 : 00
KGLOBHD5 : 00
KGLOBHD6 : 00
KGLOBHD7 : 00
KGLOBPC0 : 0
KGLOBPC6 : 0
KGLOBTP0 : 00
KGLOBT00 : 0
KGLOBT01 : 0
KGLOBT02 : 0
KGLOBT03 :
KGLOBT04 : 0
KGLOBT05 : 0
KGLOBT35 : 0
KGLOBT06 : 0
KGLOBT07 : 0
KGLOBT08 : 0
KGLOBT09 : 0
KGLOBT10 : 0
KGLOBT11 : 0
KGLOBT12 : 0
KGLOBT13 : 0
KGLOBT14 : 0
KGLOBT15 : 0
KGLOBT16 : 0
KGLOBT17 : 0
KGLOBT18 : 0
KGLOBT19 : 0
KGLOBT20 : 0
KGLOBT21 : 0
KGLOBT22 : 0
KGLOBT23 : 62
KGLOBT24 : 0
KGLOBT25 : 0
KGLOBT26 : 0
KGLOBT28 : 0
KGLOBT29 : 0
KGLOBT30 : 0
KGLOBT31 : 0
KGLOBT27 : 0
KGLOBT32 : 0
KGLOBT33 : 0
KGLOBWAP : 0
KGLOBWCC : 0
KGLOBWCL : 0
KGLOBWUI : 0
KGLOBWDW : 0
KGLOBT42 : 0
KGLOBT43 : 0
KGLOBT44 : 0
KGLOBT45 : 0
KGLOBT46 : 0
KGLOBT47 : 0
KGLOBT49 : 0
KGLOBT50 : 0
KGLOBT52 : 0
KGLOBT53 : 0
KGLOBTL0 : 0
KGLOBTL1 : 0
KGLOBTS0 :
KGLOBTS1 :
KGLOBTN0 :
KGLOBTN1 :
KGLOBTN2 :
KGLOBTN3 :
KGLOBTN4 :
KGLOBTN5 :
KGLOBTS2 :
KGLOBTS3 :
KGLOBTS5 :
KGLOBTT0 :
KGLOBCCE :
KGLOBCCEH : 0
KGLOBCLA :
KGLOBCLC : 0
KGLOBCCC : 0
KGLOBTS4 :
KGLOBCBCA :
KGLOBT48 : 0
KGLOBDSO : 0
KGLOBDEX : 0
KGLOBDPX : 0
KGLOBDLD : 0
KGLOBDIV : 0
KGLOBDPS : 0
KGLOBDDR : 0
KGLOBDDW : 0
KGLOBDBF : 0
KGLOBDRO : 0
KGLOBDCP : 0
KGLOBDEL : 0
KGLOBDFT : 0
KGLOBDEF : 0
KGLOBDUI : 0
KGLOBDCL : 0
KGLOBDAP : 0
KGLOBDCC : 0
KGLOBDPL : 0
KGLOBDJV : 0
KGLOBACS : 0
KGLOBTS6 :
KGLOBTS7 :
KGLOBT54 : 0
KGLOBT55 : 0
KGLOBT56 : 0
KGLOBT57 : 0
KGLOBDCO : 0
KGLOBDCI : 0
KGLOBDRR : 0
KGLOBDRB : 0
KGLOBDWR : 0
KGLOBDWB : 0
KGLOBT58 : 0
KGLOBDOR : 0
KGLHDMTX : 00000000BE25B1A8
KGLHDMVL : 0
KGLHDMSP : 7
KGLHDMGT : 733
KGLHDDMTX : 00000000BE25B128
KGLHDDVL : 0
KGLHDDSP : 0
KGLHDDGT : 0
KGLHDBID : 17744
KGLHDBMTX : 00000000B7EF81F8
KGLHDBVL : 0
KGLHDBSP : 0
KGLHDBGT : 420
KGLOBT59 : 0
KGLOBDCU : 0
-----------------
PL/SQL procedure successfully completed.
SCOTT@test> column object format a30
SCOTT@test> select kglnaown "Owner", kglnaobj "Object" from x$kglob where kglhdadr='00000000BE25B078';
Owner Object
------ ------------------------------
96
--从以上输出基本可以确定用户信息相关.
SCOTT@test> select user#,name from sys.user$ where user#=96;
USER# NAME
---------- --------------------
96 TEST
--如果打开登陆审计的,一般错误登陆的信息会记载.执行如下:
SCOTT@test> SELECT username,userhost,timestamp,action_name,returncode FROM DBA_AUDIT_SESSION WHERE returncode != 0 and timestamp>= sysdate-1/1440;
USERNAME USERHOST TIMESTAMP ACTION_NAME RETURNCODE
--------- --------- ------------------- ------------ ----------
TEST xxxdg 2015-03-31 08:16:59 LOGON 1017
TEST xxxdg 2015-03-31 08:17:09 LOGON 1017
TEST xxxdg 2015-03-31 08:17:19 LOGON 1017
TEST xxxdg 2015-03-31 08:17:29 LOGON 1017
$ oerr ora 1017
01017, 00000, "invalid username/password; logon denied"
// *Cause:
// *Action:
--这样基本可以确定用户登陆口令错误.
4.从P2,P3的信息确定:
SCOTT@test> set linesize 2000
SCOTT@test> select * from x$kgllk where kgllkhdl='00000000BE25B078';
ADDR INDX INST_ID KGLLKADR KGLLKUSE KGLLKSES KGLLKSNM KGLLKHDL KGLLKPNC KGLLKPNS KGLLKCNT KGLLKMOD KGLLKREQ KGLLKFLG KGLLKSPN KGLNAHSH KGLLKSQLID KGLHDPAR KGLHDNSP USER_NAME KGLNAOBJ KGLLKEST KGLLKEXC KGLLKCTP
---------------- ---------- ---------- ---------------- ---------------- ---------------- ---------- ---------------- ---------------- ---------------- ---------- ---------- ---------- ---------- ---------- ---------- ------------- ---------------- ---------- ---------- --------- ------------------- ---------- --------
0000002A97543AF8 79 1 00000000B0BC4F58 00000000BD2D93A0 00000000BD2D93A0 20 00000000BE25B078 00 00 0 0 2 0 4 3565438288 gsydfqra88jah 00000000BE25B078 79 SYS 96 1970-01-01 08:00:04 0 OPEN
0000002A97543BF8 80 1 00000000B091B940 00000000BD2B8768 00000000BD2B8768 31 00000000BE25B078 00 00 0 0 2 0 4 3565438288 gsydfqra88jah 00000000BE25B078 79 SYS 96 1970-01-01 08:00:04 0 OPEN
0000002A97543CF8 81 1 00000000B0CFA8B0 00000000BD75CE40 00000000BD75CE40 404 00000000BE25B078 00 00 1 3 0 1 18 3565438288 gsydfqra88jah 00000000BE25B078 79 SYS 96 1970-01-01 08:00:18 0 OPEN
SCOTT@test> @wx
P1 P2 P3 P1RAW P2RAW P3RAW SID SERIAL# SEQ# EVENT STATE WAIT_TIME_MICRO SECONDS_IN_WAIT SQL_ID
---------- ---------- ---------- ---------------- ---------------- ---------------- ---------- ---------- ---------- ------------------- ------------------- --------------- --------------- -------------
3190141048 2965131096 5177346 00000000BE25B078 00000000B0BC4F58 00000000004F0002 20 99 3 library cache lock WAITING 4705883 5
3190141048 2966398896 5177346 00000000BE25B078 00000000B0CFA7B0 00000000004F0002 404 157 4 library cache lock WAITED KNOWN TIME 9981035 15
3190141048 2962340160 5177346 00000000BE25B078 00000000B091B940 00000000004F0002 31 63 7 library cache lock WAITING 4757373 5
--x$kgllk的KGLLKADR就是对应P2,因为会变化,我还是选择查询handle address.
--很奇怪查询v$sql视图无法找到KGLLKSQLID=gsydfqra88jah的记录.
SCOTT@test> select sql_text from v$sql where sql_id='gsydfqra88jah';
no rows selected
SCOTT@test> select * from X$KGLOB where kglobt03='gsydfqra88jah';
no rows selected
-- 从P3看,参数格式是PARAMETER3=100*mode+namespace,注意这个是16进制的.
-- mode=4f(10进制79),namespace=2 或者 3.
SYS@test> select distinct KGLHDNSP,KGLHDNSD from x$kglob;
KGLHDNSP KGLHDNSD
---------- ----------------------------------------------------------------
0 SQL AREA
4 INDEX
1 TABLE/PROCEDURE
52 SCHEDULER EARLIEST START TIME
3 TRIGGER
64 EDITION
48 MULTI-VERSION OBJECT FOR INDEX
69 DBLINK
2 BODY
79 ACCOUNT_STATUS
45 MULTI-VERSION OBJECT FOR TABLE
28 SUBSCRIPTION
10 QUEUE
23 RULESET
73 SCHEMA
24 RESOURCE MANAGER
51 SCHEDULER GLOBAL ATTRIBUTE
74 DBINSTANCE
38 RULE EVALUATION CONTEXT
82 SQL AREA BUILD
75 SQL AREA STATS
5 CLUSTER
18 PUB SUB INTERNAL INFORMATION
23 rows selected.
--ACCOUNT_STATUS说明library cache lock是在account上,可能是用错误的用户名密码登录.。
5.转储library cache看看,实际上我觉得这个是没有办法的办法,这个实际在生产系统的共享池都很大,转储的文件也会很大,
SYS@test> Alter session set events 'immediate trace name library_cache level 10';
Session altered.
--直接查找Address=0xbe25b078.
LibraryHandle: Address=0xbe25b078 Hash=d4844550 LockMode=X PinMode=0 LoadLockMode=0 Status=VALD
ObjectName: Name=96
FullHashValue=36212cd0bb9712c7fc79aeb5d4844550 Namespace=ACCOUNT_STATUS(79) Type=NONE(255) Identifier=0 OwnerIdn=2147483644
Statistics: InvalidationCount=0 ExecutionCount=0 LoadCount=0 ActiveLocks=1 TotalLockCount=8865 TotalPinCount=0
Counters: BrokenCount=1 RevocablePointer=1 KeepDependency=0 BucketInUse=10529 HandleInUse=10529 HandleReferenceCount=0
Concurrency: DependencyMutex=0xbe25b128(0, 0, 0, 0) Mutex=0xbe25b1a8(404, 48039, 1959, 6)
Flags=PIN/TIM/[00002801]
WaitersLists:
Lock=0xbe25b108[0xacc75fa0,0xb09b6528]
Pin=0xbe25b0e8[0xbe25b0e8,0xbe25b0e8]
LoadLock=0xbe25b160[0xbe25b160,0xbe25b160]
Timestamp: Current=NULL
--实际上与查询x$kglob获取的信息基本一致.