Thread Dumps and Java Application Diagnosis (repost)

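A thread dump is a very useful tool for diagnosing problems in Java applications. Every Java virtual machine can generate, on demand, a thread dump showing the state of all threads at a given moment. The output format differs slightly from one JVM to another, but a thread dump always lists the threads together with each thread's running state, identifier, and call stack. The call stack contains the fully qualified class name, the method being executed, and, where available, the source line number.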

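Characteristics of thread dumps:

- They work on every operating system
- They work with every Java application server
- They can be taken in production with little impact on system performance
- They can pinpoint a problem down to the line of application code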
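Problems a thread dump can help diagnose include:

- Memory leaks, commonly caused by a program loading large amounts of data into a cache
- Deadlocked threads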
On Sun's JVM, a thread dump can be produced as follows:

1. Solaris OS
<Ctrl>-\ (Control-Backslash)
 kill -QUIT <pid>

2. HP-UX/UNIX/Linux
kill -3 <pid>
The PID can be found with:
ps -efHl | grep 'java'

3. Windows
Press Ctrl-Break in the console (MS-DOS) window in which the program is running

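Some Java application servers, such as WebLogic, run in a console. To make the thread dump easy to capture, redirect standard output to a file when starting WebLogic, for example with "nohup sh startWebLogic.sh > start.log &". After "kill -3 <pid>", the stack traces are written to start.log. Tomcat writes its thread dump to the command-line console or to the catalina.out file under logs. To capture how thread states change over time, take several thread dumps in succession, 10-20 s apart.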
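
The advice to take several dumps at intervals can also be followed from inside the JVM. The following is a minimal sketch, not the signal-based method described above: it uses the standard java.lang.management API, and the class name PeriodicThreadDump and its method names are made up for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

final class PeriodicThreadDump {
    /** Renders one snapshot of all live threads as text (name, id, state, stack). */
    static String snapshot() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        StringBuilder sb = new StringBuilder();
        // false, false: skip locked-monitor and synchronizer details to keep the dump cheap
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            sb.append("\"").append(info.getThreadName()).append("\" id=")
              .append(info.getThreadId()).append(" state=")
              .append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                sb.append("    at ").append(frame).append('\n');
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    /** Takes count snapshots, sleeping intervalMillis between consecutive ones. */
    static String[] capture(int count, long intervalMillis) throws InterruptedException {
        String[] dumps = new String[count];
        for (int i = 0; i < count; i++) {
            dumps[i] = snapshot();
            if (i < count - 1) {
                Thread.sleep(intervalMillis);
            }
        }
        return dumps;
    }

    public static void main(String[] args) throws InterruptedException {
        // Three dumps, 10 s apart, in line with the 10-20 s interval suggested above
        for (String dump : capture(3, 10_000L)) {
            System.out.println("=== thread dump ===");
            System.out.print(dump);
        }
    }
}
```

Comparing consecutive snapshots shows which threads stay in the same frame across dumps, which is usually the interesting signal.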

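Generating a thread dump on the IBM JVM:

With IBM's JVM on AIX, an out-of-memory condition by default produces a javacore file (CPU-related) and a heapdump file (memory-related). If these files are not produced, proceed as follows:
1. Choose one cluster member and set the following environment variables before starting WAS (they can be added to the startup script):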
export IBM_HEAPDUMP=true
export IBM_HEAP_DUMP=true
export IBM_HEAPDUMP_OUTOFMEMORY=true
export IBM_HEAPDUMPDIR=<directory path>

2. Use the set command to confirm that the DISABLE_JAVADUMP parameter is not set, then start this cluster member.

3. When free memory falls below 50% while the server is not under heavy load, run kill -3 <pid>. This generates javacore and heapdump files (<pid> is the ID of the WAS Java process, which can be found with ps -ef | grep java). Run it several times, following the sequence below:

ps -ef > psef1.txt
ps aux > psaux1.txt
vmstat 5 10 > vmstat.txt
kill -3 <app server pid>
sleep 120    # wait 2 minutes
kill -3 <app server pid>
sleep 120    # wait 2 minutes
kill -3 <app server pid>
netstat -an > netstat2.txt
ps -ef > psef2.txt
ps aux > psaux2.txt
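Copy the txt files produced above, together with the /usr/WebSphere/AppServer/javacore* files and the heapdump files, to your local machine, then delete them from the server, because they take up a large amount of file-system space.
Also copy out the logs generated that day under /usr/WebSphere/AppServer/logs/wlmserver1 (or wlmserver2).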

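Common states of application server Web container threads in a javacore or thread dump file produced by the IBM JVM:

Idle thread: a thread that is ready to accept requests but has no connection established with the plug-in or a client
Keep-alive thread: a thread that is ready to accept requests and already has a connection established with the plug-in or a client
Thread receiving a request: a thread that is reading the body or headers of a request

The following shows how each kind of thread appears in a javacore or thread dump:

Idle thread: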
"Servlet.Engine.Transports : 20" (TID:0x427F190, sys_thread_t:0x15D175E8, state:R, native ID:0xBB8) prio=5
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:429)
at com.ibm.ws.util.BoundedBuffer.take(BoundedBuffer.java:161)
at com.ibm.ws.util.ThreadPool.getTask(ThreadPool.java(Compiled Code))
at com.ibm.ws.util.ThreadPool$Worker.run(ThreadPool.java(Compiled Code))

Keep-alive thread (non-SSL mode):
"Servlet.Engine.Transports : 20" (TID:0x427F190, sys_thread_t:0x15D175E8, state:R, native ID:0xBB8) prio=5
at java.net.SocketInputStream.socketRead(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:86)
at com.ibm.ws.io.Stream.read(Stream.java)
at com.ibm.ws.io.ReadStream.readBuffer(ReadStream.java)
at com.ibm.ws.io.ReadStream.read(ReadStream.java)
at com.ibm.ws.http.HttpRequest.readRequestLine(HttpRequest.java)
at com.ibm.ws.http.HttpRequest.readRequest(HttpRequest.java)
at com.ibm.ws.http.HttpConnection.readAndHandleRequest(HttpConnection.java)
at com.ibm.ws.http.HttpConnection.run(HttpConnection.java)
at com.ibm.ws.util.CachedThread.run(ThreadPool.java)

Keep-alive thread (SSL mode):
"Servlet.Engine.Transports : 12" (TID:0x458DBA18, sys_thread_t:0x60B297C0, state:R, native ID:0x427E) prio=5
at java.net.SocketInputStream.socketRead(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java(Compiled Code))
at com.ibm.sslite.s.a(Unknown Source)(Compiled Code)
at com.ibm.sslite.s.b(Unknown Source)(Compiled Code)
at com.ibm.sslite.s.a(Unknown Source)(Compiled Code)
at com.ibm.sslite.a.read(Unknown Source)(Compiled Code)
at com.ibm.jsse.a.read(Unknown Source)(Compiled Code)
at com.ibm.ws.io.Stream.read(Stream.java(Compiled Code))
at com.ibm.ws.io.ReadStream.readBuffer(ReadStream.java(Inlined Compiled Code))
at com.ibm.ws.io.ReadStream.read(ReadStream.java(Inlined Compiled Code))
at com.ibm.ws.http.HttpRequest.readRequestLine(HttpRequest.java(Compiled Code))
at com.ibm.ws.http.HttpRequest.readRequest(HttpRequest.java(Compiled Code))
at com.ibm.ws.http.HttpConnection.readAndHandleRequest(HttpConnection)
at com.ibm.ws.http.HttpConnection.run(HttpConnection.java(Compiled Code))
at com.ibm.ws.util.ThreadPool$Worker.run(ThreadPool.java:672)

Thread receiving a request:
at java.net.SocketInputStream.socketRead(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:85)
at com.ibm.ws.io.Stream.read(Stream.java:17)
at com.ibm.ws.io.ReadStream.readBuffer(ReadStream.java:411)
at com.ibm.ws.io.ReadStream.read(ReadStream.java:110)
at com.ibm.ws.http.HttpConnection.run(HttpConnection.java:448)
at com.ibm.ws.util.ThreadPool$Worker.run(ThreadPool.java:672)
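
The three patterns above can be told apart mechanically by the frames they contain. The sketch below is only a heuristic derived from the sample traces in this article, not an IBM tool or API: the class WebContainerThreadKind, its enum, and the choice of marker frames are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;

final class WebContainerThreadKind {
    enum Kind { IDLE, KEEP_ALIVE, RECEIVING, OTHER }

    /** Classifies one thread from the "at ..." lines of its stack trace. */
    static Kind classify(List<String> frames) {
        boolean poolWait = false;
        boolean socketRead = false;
        boolean readRequestLine = false;
        boolean httpConnection = false;
        for (String frame : frames) {
            if (frame.contains("com.ibm.ws.util.BoundedBuffer.take")) poolWait = true;
            if (frame.contains("java.net.SocketInputStream.socketRead")) socketRead = true;
            if (frame.contains("com.ibm.ws.http.HttpRequest.readRequestLine")) readRequestLine = true;
            if (frame.contains("com.ibm.ws.http.HttpConnection.run")) httpConnection = true;
        }
        if (poolWait) return Kind.IDLE;                          // parked in the pool, no connection
        if (socketRead && readRequestLine) return Kind.KEEP_ALIVE; // connected, waiting for the next request line
        if (socketRead && httpConnection) return Kind.RECEIVING;   // reading request headers or body
        return Kind.OTHER;
    }

    public static void main(String[] args) {
        List<String> idle = Arrays.asList(
                "at java.lang.Object.wait(Native Method)",
                "at com.ibm.ws.util.BoundedBuffer.take(BoundedBuffer.java:161)");
        System.out.println(classify(idle)); // prints IDLE
    }
}
```

Counting the threads in each category across several dumps gives a quick picture of whether the Web container pool is mostly idle or mostly busy reading requests.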

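Common thread states on the Sun JVM:

In a thread dump, the things to focus on are each thread's state and its execution stack.
Thread states generally fall into three categories:
Runnable (R): the thread is currently able to run
Waiting on monitor (CW): the thread has voluntarily called wait
Waiting for monitor entry (MW): the thread is waiting to acquire a lock
Usually the first and third states are the ones to examine:
If the CPU is busy, look at the runnable threads
If the CPU is idle, look at the threads waiting for monitor entry
A typical deadlock occurs when a server-side component (for example, a servlet) requests a resource that is served by the same WebLogic server instance, so that the caller and the nested request compete for threads from the same execute queue.
The solution is to run that servlet in a separate execute queue.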
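
To get a quick count of how many threads are in each of these states without reading a whole dump, something like the following sketch can be used inside the application. It relies only on java.lang.Thread.State. The mapping to the dump labels is a rough correspondence (for example, WAITING also covers threads parked on java.util.concurrent locks), and the class and method names are hypothetical.

```java
import java.util.EnumMap;
import java.util.Map;

final class ThreadStateSummary {
    /** Counts the live threads of this JVM by java.lang.Thread.State. */
    static Map<Thread.State, Integer> countByState() {
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            counts.merge(t.getState(), 1, Integer::sum);
        }
        return counts;
    }

    /** Approximate mapping from Thread.State to the labels used in the list above. */
    static String dumpLabel(Thread.State state) {
        switch (state) {
            case RUNNABLE:
                return "Runnable (R)";
            case WAITING:
            case TIMED_WAITING:
                return "Waiting on monitor (CW)";
            case BLOCKED:
                return "Waiting for monitor entry (MW)";
            default:
                return "other";
        }
    }

    public static void main(String[] args) {
        for (Map.Entry<Thread.State, Integer> e : countByState().entrySet()) {
            System.out.println(dumpLabel(e.getKey()) + ": " + e.getValue());
        }
    }
}
```

A large BLOCKED count on an idle CPU points toward lock contention, while a large RUNNABLE count on a busy CPU points toward hot code paths.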

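Below is a typical hung thread (note the STUCK keyword). The thread is reported as runnable, but it is sitting in a socket read waiting for a response from the database: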

"[STUCK] ExecuteThread: '2' for queue: 'weblogic.kernel.Default (self-tuning)'" daemon prio=10 tid=02fe9a18 nid=35 lwp_id=7518924 runnable [440dd000..440db878]
 at java.net.SocketInputStream.socketRead0(Native Method)
 at java.net.SocketInputStream.read(SocketInputStream.java:134)
 at weblogic.jdbc.oracle.net8.OracleDataProvider.getArrayOfBytesFromSocket(Unknown Source)
 at weblogic.jdbc.oracle.net8.OracleDataProvider.readFirstPacketInBuffer(Unknown Source)
 at weblogic.jdbc.oracle.net8.OracleDataProvider.readPacket(Unknown Source)
 at weblogic.jdbc.oracle.net8.OracleDataProvider.receive(Unknown Source)
 at weblogic.jdbc.oracle.net8.OracleNet8NSPTDAPacket.sendRequest(Unknown Source)
 at weblogic.jdbc.oracle.OracleImplStatement.fetchNext(Unknown Source)
 at weblogic.jdbc.oracle.OracleImplStatement.fetchNext2(Unknown Source)
 at weblogic.jdbc.oracle.OracleImplResultset.fetchAtPosition(Unknown Source)
 at weblogic.jdbc.base.BaseImplResultSet.next(Unknown Source)
 at weblogic.jdbc.base.BaseResultSet.next(Unknown Source)
 - locked <55f25550> (a weblogic.jdbc.oracle.OracleConnection)
 at weblogic.jdbc.wrapper.ResultSet_weblogic_jdbc_base_BaseResultSet.next(Unknown Source)
 at org.hibernate.loader.Loader.doQuery(Loader.java:685)

On UNIX/Linux, the top, vmstat, or prstat commands can be used to observe system resource usage.

Mandy Chung's Blog has a post about Thread Dump and Concurrency Locks, excerpted below:
Thread dumps are very useful for diagnosing synchronization related problems such as deadlocks on object monitors. Ctrl-\ on Solaris/Linux or Ctrl-Break on Windows has been a common way to get a thread dump of a running application. On Solaris or Linux, you can send a QUIT signal to the target application. The target application in both cases prints a thread dump to the standard output and also detects if there is any deadlock involving object monitors.
jstack, a new troubleshooting utility introduced in Tiger (J2SE 5.0), provides another way to obtain a thread dump of an application. Alan Bateman has a nice blog about jstack and its several improvements in Mustang (Java SE 6). Mustang jstack works like a remote Ctrl-\ or Ctrl-Break if you are on Windows.
jconsole is a JMX-compliant GUI tool which allows you to get a thread dump on the fly. The "Using JConsole to Monitor Applications" article gives you an overview of the Tiger monitoring and management functionality.
Mustang extends the thread dump, jstack, and jconsole to support java.util.concurrent.locks to improve its diagnosability. For example, the Threads tab in the Mustang jconsole now shows which synchronizer a thread is waiting to acquire when the thread is blocked to lock a ReentrantLock and also which thread is owning that lock.
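
The same "which synchronizer, owned by which thread" information is available programmatically through ThreadInfo. The following is a minimal sketch under the assumption of a Java 6 or later JVM: LockOwnerReport, describe, and demo are illustrative names, and the demo simply holds a ReentrantLock while a second thread tries to take it.

```java
import java.lang.management.LockInfo;
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.util.concurrent.locks.ReentrantLock;

final class LockOwnerReport {
    /** Describes what the given thread is blocked on, or returns null if it is not blocked. */
    static String describe(long threadId) {
        ThreadInfo info = ManagementFactory.getThreadMXBean().getThreadInfo(threadId);
        if (info == null) {
            return null; // thread is not alive
        }
        LockInfo lock = info.getLockInfo();
        if (lock == null) {
            return null; // not waiting on a monitor or synchronizer
        }
        return "\"" + info.getThreadName() + "\" waits for " + lock.getClassName()
                + " owned by \"" + info.getLockOwnerName() + "\"";
    }

    /** Holds a ReentrantLock, lets a second thread block on it, and reports that thread. */
    static String demo() {
        ReentrantLock lock = new ReentrantLock();
        lock.lock(); // the calling thread now owns the lock
        try {
            Thread waiter = new Thread(() -> {
                lock.lock();
                lock.unlock();
            }, "waiter");
            waiter.setDaemon(true);
            waiter.start();
            // Poll for up to about 5 s until the waiter is parked on the lock
            for (int i = 0; i < 500; i++) {
                String description = describe(waiter.getId());
                if (description != null && description.contains("ReentrantLock")) {
                    return description;
                }
                try {
                    Thread.sleep(10);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return null;
                }
            }
            return null;
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```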

In addition, it has a new "detect deadlock" button (in the bottom). When you click on the "detect deadlock" button, it will send a request to the target application to perform the deadlock detection operation. If the target application is running on Mustang, it finds deadlocks involving both object monitors as well as the java.util.concurrent.locks. If the target application is running on Tiger, it finds deadlocks involving object monitors only. Each deadlock cycle will be displayed in a separate Deadlock tab.

JDK 6 has a nice demo FullThreadDump under $JDK_HOME/demo/management/FullThreadDump where JDK_HOME is the location of your JDK 6. This demo has been included in JDK 5.0 and is updated to use the new Mustang API. It demonstrates the use of the java.lang.management API to get the thread dump and detect deadlock programmatically.
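
Programmatic deadlock detection with java.lang.management can be sketched as follows. This is not the FullThreadDump demo itself, only an illustration in the same spirit: the class DeadlockProbe and its helper methods are hypothetical, while findDeadlockedThreads and findMonitorDeadlockedThreads are the standard ThreadMXBean calls.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

final class DeadlockProbe {
    /** Returns the ids of deadlocked threads, or an empty array if there are none. */
    static long[] findDeadlocks() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] ids = mx.isSynchronizerUsageSupported()
                ? mx.findDeadlockedThreads()          // monitors and java.util.concurrent locks
                : mx.findMonitorDeadlockedThreads();  // object monitors only
        return ids == null ? new long[0] : ids;
    }

    /** Starts two daemon threads that acquire two ReentrantLocks in opposite order. */
    static void startDeadlock() {
        ReentrantLock a = new ReentrantLock();
        ReentrantLock b = new ReentrantLock();
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        spawn("deadlock-1", a, b, bothHoldFirst);
        spawn("deadlock-2", b, a, bothHoldFirst);
    }

    private static void spawn(String name, ReentrantLock first, ReentrantLock second,
                              CountDownLatch bothHoldFirst) {
        Thread t = new Thread(() -> {
            first.lock();
            bothHoldFirst.countDown();
            try {
                bothHoldFirst.await(); // make sure each thread already holds its first lock
            } catch (InterruptedException e) {
                return;
            }
            second.lock(); // never succeeds: the other thread holds this lock
        }, name);
        t.setDaemon(true); // lets the JVM exit despite the deadlock
        t.start();
    }

    public static void main(String[] args) throws InterruptedException {
        startDeadlock();
        long[] ids = new long[0];
        for (int i = 0; i < 500 && ids.length == 0; i++) {
            Thread.sleep(10);
            ids = findDeadlocks();
        }
        for (ThreadInfo info : ManagementFactory.getThreadMXBean().getThreadInfo(ids)) {
            System.out.println(info.getThreadName() + " waits for a lock held by "
                    + info.getLockOwnerName());
        }
    }
}
```

The demo uses ReentrantLock on purpose: a cycle of this kind is found by findDeadlockedThreads, whereas findMonitorDeadlockedThreads only covers object monitors, matching the Mustang versus Tiger difference described in the excerpt.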

 

Time: 2024-08-07 10:32:08
