Handling the "480000 millis timeout while waiting for channel to be ready for write" exception

 2014-08-25 15:35:05,691 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.130.136.136:50010, storageID=DS-1533727399-10.130.136.136-50010-1388038551296, infoPort=50075, ipcPort=50020):DataXceiver
java.net.SocketTimeoutException: 480000 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/10.130.136.136:50010 remote=/10.130.136.136:34264]
        at org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:246)
        at org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:159)
        at org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:198)
        at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendChunks(BlockSender.java:392)
        at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:490)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:202)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:104)
        at java.lang.Thread.run(Thread.java:724)
2014-08-25 15:35:06,464 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.130.136.136:37121, dest: /10.130.136.136:50010, bytes: 67108864, op: HDFS_WRITE, cliID: DFSClient_hb_rs_xx,60020,1388115177740_1837727868_26, offset: 0, srvID: DS-1533727399-10.130.136.136-50010-1388038551296, blockid: blk_-3628597342762703578_40720686, duration: 6339411379
2014-08-25 15:35:06,464 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 2 for block blk_-3628597342762703578_40720686 terminating
2014-08-25 15:35:06,465 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block blk_-7509787569548089877_40720689 src: /10.130.136.136:37142 dest: /10.130.136.136:50010
2014-08-25 15:35:06,724 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.130.136.136:50010, dest: /10.130.136.136:33647, bytes: 5921280, op: HDFS_READ, cliID: DFSClient_hb_rs_xx,60020,1388115177740_1837727868_26, offset: 388096, srvID: DS-1533727399-10.130.136.136-50010-1388038551296, blockid: blk_2616588945174162483_33797955, duration: 547889496646
2014-08-25 15:35:06,725 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.130.136.136:50010, storageID=DS-1533727399-10.130.136.136-50010-1388038551296, infoPort=50075, ipcPort=50020):Got exception while serving blk_2616588945174162483_33797955 to /10.130.136.136:
java.net.SocketTimeoutException: 480000 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/10.130.136.136:50010 remote=/10.130.136.136:33647]
        at org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:246)
        at org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:159)
        at org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:198)
        at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendChunks(BlockSender.java:392)
        at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:490)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:202)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:104)
        at java.lang.Thread.run(Thread.java:724)

2014-08-25 15:35:06,725 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.130.136.136:50010, storageID=DS-1533727399-10.130.136.136-50010-1388038551296, infoPort=50075, ipcPort=50020):DataXceiver
java.net.SocketTimeoutException: 480000 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/10.130.136.136:50010 remote=/10.130.136.136:33647]
        at org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:246)
        at org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:159)
        at org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:198)
        at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendChunks(BlockSender.java:392)
        at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:490)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:202)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:104)
        at java.lang.Thread.run(Thread.java:724)

Cause: the exception means the DataNode's BlockSender waited the full socket write timeout (480000 ms = 8 * 60 * 1000 ms, the default dfs.datanode.socket.write.timeout) for the reading client's channel to become writable and then gave up. The reader here is an HBase RegionServer on the same host; the clienttrace duration field (in nanoseconds) shows how slow the read was: 547889496646 ns ≈ 548 s, well over the 480 s limit.

Solution: raise the HDFS timeouts and handler counts in hdfs-site.xml and restart the affected DataNodes and NameNode:
  <property>
    <name>dfs.socket.timeout</name>
    <value>900000</value>
  </property>

  <property>
    <name>dfs.datanode.handler.count</name>
    <value>20</value>
  </property>

  <property>
    <name>dfs.namenode.handler.count</name>
    <value>30</value>
  </property>

  <property>
    <name>dfs.datanode.socket.write.timeout</name>
    <value>10800000</value>
    <description>Set to 3 hours (10800000 ms); the default is 8 * 60 * 1000 ms = 480000 ms, which is exactly the "480000 millis timeout while waiting for channel to be ready for write" reported above.</description>
  </property>
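
The same two timeout keys are also read by the HDFS client (DFSClient), so a long-running reader or writer can raise them in its own Configuration as well. Below is a minimal, illustrative Java sketch under that assumption; the class name is hypothetical and the values simply mirror the hdfs-site.xml entries above, not tuned recommendations.

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FileSystem;

  public class LongTimeoutHdfsClient {
      public static void main(String[] args) throws Exception {
          Configuration conf = new Configuration();
          // Read timeout (default 60 * 1000 ms); raised to 15 minutes.
          conf.setInt("dfs.socket.timeout", 900000);
          // Write timeout (default 8 * 60 * 1000 ms = 480000 ms, the value in the
          // exception above); raised to 3 hours.
          conf.setInt("dfs.datanode.socket.write.timeout", 10800000);

          // FileSystem.get picks up the overrides for this client only.
          FileSystem fs = FileSystem.get(conf);
          System.out.println("Connected to " + fs.getUri());
          fs.close();
      }
  }

On Hadoop 2.x and later, the value actually in effect on a node can be checked with "hdfs getconf -confKey dfs.datanode.socket.write.timeout".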

