Active NameNode stops with error: flush failed for required journal

The active NameNode fails intermittently while everything else looks fine: the standby NameNode is healthy and all of the JournalNodes are still running, but the NameNode process on the master node keeps shutting down.

Checking the NameNode log on the Hadoop master node shows:

2016-11-21 22:36:40,908 WARN org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Waited 19822 ms (timeout=20000 ms) for a response for sendEdits. No responses yet.
2016-11-21 22:36:41,088 FATAL org.apache.hadoop.hdfs.server.namenode.FSEditLog: Error: flush failed for required journal (JournalAndStream(mgr=QJM to [192.168.58.183:8485, 192.168.58.181:8485, 192.168.58.182:8485], stream=QuorumOutputStream starting at txid 24533))
java.io.IOException: Timed out waiting 20000ms for a quorum of nodes to respond.
	at org.apache.hadoop.hdfs.qjournal.client.AsyncLoggerSet.waitForWriteQuorum(AsyncLoggerSet.java:137)
	at org.apache.hadoop.hdfs.qjournal.client.QuorumOutputStream.flushAndSync(QuorumOutputStream.java:107)
	at org.apache.hadoop.hdfs.server.namenode.EditLogOutputStream.flush(EditLogOutputStream.java:113)
	at org.apache.hadoop.hdfs.server.namenode.EditLogOutputStream.flush(EditLogOutputStream.java:107)
	at org.apache.hadoop.hdfs.server.namenode.JournalSet$JournalSetOutputStream$8.apply(JournalSet.java:533)
	at org.apache.hadoop.hdfs.server.namenode.JournalSet.mapJournalsAndReportErrors(JournalSet.java:393)
	at org.apache.hadoop.hdfs.server.namenode.JournalSet.access$100(JournalSet.java:57)
	at org.apache.hadoop.hdfs.server.namenode.JournalSet$JournalSetOutputStream.flush(JournalSet.java:529)
	at org.apache.hadoop.hdfs.server.namenode.FSEditLog.logSync(FSEditLog.java:639)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2645)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2520)
	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:579)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:394)
	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:975)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2040)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2036)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1656)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2034)
2016-11-21 22:36:41,089 WARN org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Aborting QuorumOutputStream starting at txid 24533
2016-11-21 22:36:41,113 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
2016-11-21 22:36:41,122 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: Slave2/192.168.58.182:8485. Already tried 0 time(s); maxRetries=45
2016-11-21 22:36:41,123 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: Slave1/192.168.58.181:8485. Already tried 0 time(s); maxRetries=45
2016-11-21 22:36:41,123 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: StandByNameNode/192.168.58.183:8485. Already tried 0 time(s); maxRetries=45
2016-11-21 22:36:41,137 WARN org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Took 20050ms to send a batch of 1 edits (218 bytes) to remote journal 192.168.58.182:8485
2016-11-21 22:36:41,137 WARN org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Took 20052ms to send a batch of 1 edits (218 bytes) to remote journal 192.168.58.181:8485
2016-11-21 22:36:41,137 WARN org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Took 20065ms to send a batch of 1 edits (218 bytes) to remote journal 192.168.58.183:8485
2016-11-21 22:36:41,145 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at CentOSMaster/192.168.58.180
************************************************************/

The log shows the active NameNode waited the full 20000 ms (the default dfs.qjournal.write-txns.timeout.ms) for a quorum of JournalNodes to acknowledge an edit batch, then aborted the QuorumOutputStream and exited. First make sure dfs.namenode.edits.dir and dfs.journalnode.edits.dir are set correctly (a sketch of those settings follows the timeout block below), then raise the quorum-journal timeouts in hdfs-site.xml:

<!-- Quorum-journal client timeouts, in milliseconds. -->
<property>
  <name>dfs.qjournal.start-segment.timeout.ms</name>
  <value>600000000</value>
</property>

<property>
  <name>dfs.qjournal.prepare-recovery.timeout.ms</name>
  <value>600000000</value>
</property>

<property>
  <name>dfs.qjournal.accept-recovery.timeout.ms</name>
  <value>600000000</value>
</property>

<property>
  <name>dfs.qjournal.finalize-segment.timeout.ms</name>
  <value>600000000</value>
</property>

<property>
  <name>dfs.qjournal.select-input-streams.timeout.ms</name>
  <value>600000000</value>
</property>

<property>
  <name>dfs.qjournal.get-journal-state.timeout.ms</name>
  <value>600000000</value>
</property>

<property>
  <name>dfs.qjournal.new-epoch.timeout.ms</name>
  <value>600000000</value>
</property>

<property>
  <name>dfs.qjournal.write-txns.timeout.ms</name>
  <value>600000000</value>
</property>
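
For reference, a minimal sketch of the edits-directory settings mentioned above. The local paths and the nameservice ID "mycluster" are placeholders, not values from the original post, and must match your own cluster; the JournalNode addresses are the ones that appear in the log:

<!-- Sketch only: local paths and the nameservice ID are assumptions. -->
<property>
  <name>dfs.namenode.edits.dir</name>
  <value>/data/hadoop/hdfs/namenode</value>
</property>

<property>
  <name>dfs.journalnode.edits.dir</name>
  <value>/data/hadoop/hdfs/journalnode</value>
</property>

<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://192.168.58.183:8485;192.168.58.181:8485;192.168.58.182:8485/mycluster</value>
</property>

After editing hdfs-site.xml, the NameNodes (and the JournalNodes, if dfs.journalnode.edits.dir changed) generally need to be restarted for the new values to take effect.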

This seems to have fixed it; as of this morning there have been no further failures.

