HDFS file read fails: Incorrect value for packet payload size

I. Symptom

        On Hadoop 2.7.2, reading a file's contents with the hadoop shell fails with the error below when the file is large. Small files are not affected.

hadoop fs -cat    /tmp/hue_database_dump4.json
16/09/29 15:13:37 WARN hdfs.DFSClient: Exception while reading from BP-1776288592-10.7.12.154-1468904160674:blk_1073998236_257465 of /tmp/hue_database_dump4.json from DatanodeInfoWithStorage[10.7.12.156:50010,DS-80fb296c-1085-40ce-9dcf-e3a08327aa0d,DISK]
java.io.IOException: Incorrect value for packet payload size: 57616164
        at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:159)
        at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:102)
        at org.apache.hadoop.hdfs.RemoteBlockReader2.readNextPacket(RemoteBlockReader2.java:201)
        at org.apache.hadoop.hdfs.RemoteBlockReader2.read(RemoteBlockReader2.java:152)
        at org.apache.hadoop.hdfs.DFSInputStream$ByteArrayStrategy.doRead(DFSInputStream.java:775)
        at org.apache.hadoop.hdfs.DFSInputStream.readBuffer(DFSInputStream.java:831)
        at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:891)
        at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:934)
        at java.io.DataInputStream.read(DataInputStream.java:100)
        at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:85)
        at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:59)
        at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:119)
        at org.apache.hadoop.fs.shell.Display$Cat.printToStdout(Display.java:107)
        at org.apache.hadoop.fs.shell.Display$Cat.processPath(Display.java:102)
        at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:317)
        at org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:289)
        at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:271)
        at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:255)
        at org.apache.hadoop.fs.shell.Command.processRawArguments(Command.java:201)
        at org.apache.hadoop.fs.shell.Command.run(Command.java:165)
        at org.apache.hadoop.fs.FsShell.run(FsShell.java:287)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
        at org.apache.hadoop.fs.FsShell.main(FsShell.java:340)
16/09/29 15:13:37 WARN hdfs.DFSClient: Exception while reading from BP-1776288592-10.7.12.154-1468904160674:blk_1073998236_257465 of /tmp/hue_database_dump4.json from DatanodeInfoWithStorage[10.7.12.155:50010,DS-87568e9c-b339-4b0e-a09f-292118bcb752,DISK]
java.io.IOException: Incorrect value for packet payload size: 57616164
        at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:159)
        at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:102)
        at org.apache.hadoop.hdfs.RemoteBlockReader2.readNextPacket(RemoteBlockReader2.java:201)
        at org.apache.hadoop.hdfs.RemoteBlockReader2.read(RemoteBlockReader2.java:152)
        at org.apache.hadoop.hdfs.DFSInputStream$ByteArrayStrategy.doRead(DFSInputStream.java:775)
        at org.apache.hadoop.hdfs.DFSInputStream.readBuffer(DFSInputStream.java:831)
        at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:891)
        at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:934)
        at java.io.DataInputStream.read(DataInputStream.java:100)
        at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:85)
        at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:59)
        at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:119)
        at org.apache.hadoop.fs.shell.Display$Cat.printToStdout(Display.java:107)
        at org.apache.hadoop.fs.shell.Display$Cat.processPath(Display.java:102)
        at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:317)
        at org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:289)
        at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:271)
        at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:255)
        at org.apache.hadoop.fs.shell.Command.processRawArguments(Command.java:201)
        at org.apache.hadoop.fs.shell.Command.run(Command.java:165)
        at org.apache.hadoop.fs.FsShell.run(FsShell.java:287)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
        at org.apache.hadoop.fs.FsShell.main(FsShell.java:340)
16/09/29 15:13:37 INFO hdfs.DFSClient: Could not obtain BP-1776288592-10.7.12.154-1468904160674:blk_1073998236_257465 from any node: java.io.IOException: No live nodes contain block BP-1776288592-10.7.12.154-1468904160674:blk_1073998236_257465 after checking nodes = [DatanodeInfoWithStorage[10.7.12.156:50010,DS-80fb296c-1085-40ce-9dcf-e3a08327aa0d,DISK], DatanodeInfoWithStorage[10.7.12.155:50010,DS-87568e9c-b339-4b0e-a09f-292118bcb752,DISK]], ignoredNodes = null No live nodes contain current block Block locations: DatanodeInfoWithStorage[10.7.12.156:50010,DS-80fb296c-1085-40ce-9dcf-e3a08327aa0d,DISK] DatanodeInfoWithStorage[10.7.12.155:50010,DS-87568e9c-b339-4b0e-a09f-292118bcb752,DISK] Dead nodes:  DatanodeInfoWithStorage[10.7.12.155:50010,DS-87568e9c-b339-4b0e-a09f-292118bcb752,DISK] DatanodeInfoWithStorage[10.7.12.156:50010,DS-80fb296c-1085-40ce-9dcf-e3a08327aa0d,DISK]. Will get new block locations from namenode and retry...
16/09/29 15:13:37 WARN hdfs.DFSClient: DFS chooseDataNode: got # 1 IOException, will wait for 1086.6056359410977 msec.
16/09/29 15:13:38 WARN hdfs.DFSClient: Exception while reading from BP-1776288592-10.7.12.154-1468904160674:blk_1073998236_257465 of /tmp/hue_database_dump4.json from DatanodeInfoWithStorage[10.7.12.156:50010,DS-80fb296c-1085-40ce-9dcf-e3a08327aa0d,DISK]
java.io.IOException: Incorrect value for packet payload size: 57616164
        at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:159)
        at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:102)
        at org.apache.hadoop.hdfs.RemoteBlockReader2.readNextPacket(RemoteBlockReader2.java:201)
        at org.apache.hadoop.hdfs.RemoteBlockReader2.read(RemoteBlockReader2.java:152)
        at org.apache.hadoop.hdfs.DFSInputStream$ByteArrayStrategy.doRead(DFSInputStream.java:775)
        at org.apache.hadoop.hdfs.DFSInputStream.readBuffer(DFSInputStream.java:831)
        at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:891)
        at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:934)
        at java.io.DataInputStream.read(DataInputStream.java:100)
        at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:85)
        at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:59)
        at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:119)
        at org.apache.hadoop.fs.shell.Display$Cat.printToStdout(Display.java:107)
        at org.apache.hadoop.fs.shell.Display$Cat.processPath(Display.java:102)
        at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:317)
        at org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:289)
        at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:271)
        at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:255)
        at org.apache.hadoop.fs.shell.Command.processRawArguments(Command.java:201)
        at org.apache.hadoop.fs.shell.Command.run(Command.java:165)
        at org.apache.hadoop.fs.FsShell.run(FsShell.java:287)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
        at org.apache.hadoop.fs.FsShell.main(FsShell.java:340)
16/09/29 15:13:38 WARN hdfs.DFSClient: Exception while reading from BP-1776288592-10.7.12.154-1468904160674:blk_1073998236_257465 of /tmp/hue_database_dump4.json from DatanodeInfoWithStorage[10.7.12.155:50010,DS-87568e9c-b339-4b0e-a09f-292118bcb752,DISK]
java.io.IOException: Incorrect value for packet payload size: 57616164
        at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:159)
        at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:102)
        at org.apache.hadoop.hdfs.RemoteBlockReader2.readNextPacket(RemoteBlockReader2.java:201)
        at org.apache.hadoop.hdfs.RemoteBlockReader2.read(RemoteBlockReader2.java:152)
        at org.apache.hadoop.hdfs.DFSInputStream$ByteArrayStrategy.doRead(DFSInputStream.java:775)
        at org.apache.hadoop.hdfs.DFSInputStream.readBuffer(DFSInputStream.java:831)
        at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:891)
        at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:934)
        at java.io.DataInputStream.read(DataInputStream.java:100)
        at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:85)
        at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:59)
        at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:119)
        at org.apache.hadoop.fs.shell.Display$Cat.printToStdout(Display.java:107)
        at org.apache.hadoop.fs.shell.Display$Cat.processPath(Display.java:102)
        at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:317)
        at org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:289)
        at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:271)
        at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:255)
        at org.apache.hadoop.fs.shell.Command.processRawArguments(Command.java:201)
        at org.apache.hadoop.fs.shell.Command.run(Command.java:165)
        at org.apache.hadoop.fs.FsShell.run(FsShell.java:287)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
        at org.apache.hadoop.fs.FsShell.main(FsShell.java:340)
16/09/29 15:13:38 INFO hdfs.DFSClient: Could not obtain BP-1776288592-10.7.12.154-1468904160674:blk_1073998236_257465 from any node: java.io.IOException: No live nodes contain block BP-1776288592-10.7.12.154-1468904160674:blk_1073998236_257465 after checking nodes = [DatanodeInfoWithStorage[10.7.12.156:50010,DS-80fb296c-1085-40ce-9dcf-e3a08327aa0d,DISK], DatanodeInfoWithStorage[10.7.12.155:50010,DS-87568e9c-b339-4b0e-a09f-292118bcb752,DISK]], ignoredNodes = null No live nodes contain current block Block locations: DatanodeInfoWithStorage[10.7.12.156:50010,DS-80fb296c-1085-40ce-9dcf-e3a08327aa0d,DISK] DatanodeInfoWithStorage[10.7.12.155:50010,DS-87568e9c-b339-4b0e-a09f-292118bcb752,DISK]
Dead nodes:  DatanodeInfoWithStorage[10.7.12.155:50010,DS-87568e9c-b339-4b0e-a09f-292118bcb752,DISK]
DatanodeInfoWithStorage[10.7.12.156:50010,DS-80fb296c-1085-40ce-9dcf-e3a08327aa0d,DISK]. Will get new block locations from namenode and retry...
16/09/29 15:13:38 WARN hdfs.DFSClient: DFS chooseDataNode: got # 2 IOException, will wait for 6199.040804985275 msec.
16/09/29 15:13:45 WARN hdfs.DFSClient: Exception while reading from BP-1776288592-10.7.12.154-1468904160674:blk_1073998236_257465 of /tmp/hue_database_dump4.json from DatanodeInfoWithStorage[10.7.12.156:50010,DS-80fb296c-1085-40ce-9dcf-e3a08327aa0d,DISK]
java.io.IOException: Incorrect value for packet payload size: 57616164
        at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:159)
        at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:102)
        at org.apache.hadoop.hdfs.RemoteBlockReader2.readNextPacket(RemoteBlockReader2.java:201)
        at org.apache.hadoop.hdfs.RemoteBlockReader2.read(RemoteBlockReader2.java:152)
        at org.apache.hadoop.hdfs.DFSInputStream$ByteArrayStrategy.doRead(DFSInputStream.java:775)
        at org.apache.hadoop.hdfs.DFSInputStream.readBuffer(DFSInputStream.java:831)
        at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:891)
        at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:934)
        at java.io.DataInputStream.read(DataInputStream.java:100)
        at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:85)
        at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:59)
        at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:119)
        at org.apache.hadoop.fs.shell.Display$Cat.printToStdout(Display.java:107)
        at org.apache.hadoop.fs.shell.Display$Cat.processPath(Display.java:102)
        at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:317)
        at org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:289)
        at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:271)
        at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:255)
        at org.apache.hadoop.fs.shell.Command.processRawArguments(Command.java:201)
        at org.apache.hadoop.fs.shell.Command.run(Command.java:165)
        at org.apache.hadoop.fs.FsShell.run(FsShell.java:287)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
        at org.apache.hadoop.fs.FsShell.main(FsShell.java:340)
16/09/29 15:13:45 INFO hdfs.DFSClient: Could not obtain BP-1776288592-10.7.12.154-1468904160674:blk_1073998236_257465 from any node: java.io.IOException: No live nodes contain block BP-1776288592-10.7.12.154-1468904160674:blk_1073998236_257465 after checking nodes = [DatanodeInfoWithStorage[10.7.12.155:50010,DS-87568e9c-b339-4b0e-a09f-292118bcb752,DISK], DatanodeInfoWithStorage[10.7.12.156:50010,DS-80fb296c-1085-40ce-9dcf-e3a08327aa0d,DISK]], ignoredNodes = null No live nodes contain current block Block locations: DatanodeInfoWithStorage[10.7.12.155:50010,DS-87568e9c-b339-4b0e-a09f-292118bcb752,DISK] DatanodeInfoWithStorage[10.7.12.156:50010,DS-80fb296c-1085-40ce-9dcf-e3a08327aa0d,DISK] Dead nodes:  DatanodeInfoWithStorage[10.7.12.155:50010,DS-87568e9c-b339-4b0e-a09f-292118bcb752,DISK] DatanodeInfoWithStorage[10.7.12.156:50010,DS-80fb296c-1085-40ce-9dcf-e3a08327aa0d,DISK]. Will get new block locations from namenode and retry...
16/09/29 15:13:45 WARN hdfs.DFSClient: DFS chooseDataNode: got # 3 IOException, will wait for 13991.541436655532 msec.
16/09/29 15:13:59 WARN hdfs.DFSClient: DFS Read
java.io.IOException: Incorrect value for packet payload size: 57616164
        at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:159)
        at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:102)
        at org.apache.hadoop.hdfs.RemoteBlockReader2.readNextPacket(RemoteBlockReader2.java:201)
        at org.apache.hadoop.hdfs.RemoteBlockReader2.read(RemoteBlockReader2.java:152)
        at org.apache.hadoop.hdfs.DFSInputStream$ByteArrayStrategy.doRead(DFSInputStream.java:775)
        at org.apache.hadoop.hdfs.DFSInputStream.readBuffer(DFSInputStream.java:831)
        at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:891)
        at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:934)
        at java.io.DataInputStream.read(DataInputStream.java:100)
        at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:85)
        at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:59)
        at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:119)
        at org.apache.hadoop.fs.shell.Display$Cat.printToStdout(Display.java:107)
        at org.apache.hadoop.fs.shell.Display$Cat.processPath(Display.java:102)
        at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:317)
        at org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:289)
        at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:271)
        at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:255)
        at org.apache.hadoop.fs.shell.Command.processRawArguments(Command.java:201)
        at org.apache.hadoop.fs.shell.Command.run(Command.java:165)
        at org.apache.hadoop.fs.FsShell.run(FsShell.java:287)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
        at org.apache.hadoop.fs.FsShell.main(FsShell.java:340)
cat: Incorrect value for packet payload size: 57616164

II. Analysis

        A close look at the exception output above yields two main pieces of information:

        1. Could not obtain BP-1776288592-10.7.12.154-1468904160674:blk_1073998236_257465 from any node: java.io.IOException: No live nodes contain block BP-1776288592-10.7.12.154-1468904160674:blk_1073998236_257465 after checking nodes = [DatanodeInfoWithStorage[10.7.12.155:50010,DS-87568e9c-b339-4b0e-a09f-292118bcb752,DISK], DatanodeInfoWithStorage[10.7.12.156:50010,DS-80fb296c-1085-40ce-9dcf-e3a08327aa0d,DISK]], ignoredNodes = null No live nodes contain current block Block locations: DatanodeInfoWithStorage[10.7.12.155:50010,DS-87568e9c-b339-4b0e-a09f-292118bcb752,DISK] DatanodeInfoWithStorage[10.7.12.156:50010,DS-80fb296c-1085-40ce-9dcf-e3a08327aa0d,DISK] Dead nodes:  DatanodeInfoWithStorage[10.7.12.155:50010,DS-87568e9c-b339-4b0e-a09f-292118bcb752,DISK] DatanodeInfoWithStorage[10.7.12.156:50010,DS-80fb296c-1085-40ce-9dcf-e3a08327aa0d,DISK]. Will get new block locations from namenode and retry...

        2. Incorrect value for packet payload size: 57616164

        Taking them one at a time, start with the first. Searching the source for "No live nodes contain block" leads to the following code:

  /**
   * Get the best node from which to stream the data.
   * @param block LocatedBlock, containing nodes in priority order.
   * @param ignoredNodes Do not choose nodes in this array (may be null)
   * @return The DNAddrPair of the best node.
   * @throws IOException
   */
  private DNAddrPair getBestNodeDNAddrPair(LocatedBlock block,
      Collection<DatanodeInfo> ignoredNodes) throws IOException {
    DatanodeInfo[] nodes = block.getLocations();
    StorageType[] storageTypes = block.getStorageTypes();
    DatanodeInfo chosenNode = null;
    StorageType storageType = null;
    if (nodes != null) {
      for (int i = 0; i < nodes.length; i++) {
        if (!deadNodes.containsKey(nodes[i])
            && (ignoredNodes == null || !ignoredNodes.contains(nodes[i]))) {
          chosenNode = nodes[i];
          // Storage types are ordered to correspond with nodes, so use the same
          // index to get storage type.
          if (storageTypes != null && i < storageTypes.length) {
            storageType = storageTypes[i];
          }
          break;
        }
      }
    }
    if (chosenNode == null) {
      throw new IOException("No live nodes contain block " + block.getBlock() +
          " after checking nodes = " + Arrays.toString(nodes) +
          ", ignoredNodes = " + ignoredNodes);
    }
    final String dnAddr =
        chosenNode.getXferAddr(dfsClient.getConf().connectToDnViaHostname);
    if (DFSClient.LOG.isDebugEnabled()) {
      DFSClient.LOG.debug("Connecting to datanode " + dnAddr);
    }
    InetSocketAddress targetAddr = NetUtils.createSocketAddr(dnAddr);
    return new DNAddrPair(chosenNode, targetAddr, storageType);
  }
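
        The selection loop above can be tried out in isolation. The following is a minimal sketch, not the Hadoop implementation: the class and method names are hypothetical, nodes are plain strings, and it assumes that every read attempt fails so that each replica ends up in the client's local dead-node set. Once that happens there is nothing left to choose, which is the condition behind the "No live nodes contain block" message.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class DeadNodeSketch {
    /** Returns the first location that is not in the dead set, or null if none is left. */
    static String chooseNode(List<String> locations, Set<String> deadNodes) {
        for (String node : locations) {
            if (!deadNodes.contains(node)) {
                return node;
            }
        }
        return null;
    }

    /** Simulates a client whose read fails on every replica; returns the resulting dead set. */
    static Set<String> failAllReplicas(List<String> locations) {
        Set<String> deadNodes = new HashSet<>();
        String node;
        while ((node = chooseNode(locations, deadNodes)) != null) {
            // Assumption: the read from this node throws, so the client blacklists it locally.
            deadNodes.add(node);
        }
        return deadNodes;
    }

    public static void main(String[] args) {
        List<String> locations = Arrays.asList("10.7.12.156:50010", "10.7.12.155:50010");
        Set<String> dead = failAllReplicas(locations);
        if (chooseNode(locations, dead) == null) {
            System.out.println("No live nodes contain block after checking nodes = " + locations);
        }
    }
}
```

        Nothing in this sketch consults the cluster's view of node health: a replica lands in the dead set purely because a read from it failed.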

        However, the message goes on to report nodes = [DatanodeInfoWithStorage[10.7.12.155:50010,DS-87568e9c-b339-4b0e-a09f-292118bcb752,DISK], DatanodeInfoWithStorage[10.7.12.156:50010,DS-80fb296c-1085-40ce-9dcf-e3a08327aa0d,DISK]] and ignoredNodes = null, so it is fairly safe to conclude that no DataNode was excluded through ignoredNodes. Reading further into the log turns up the key detail:

Dead nodes:  DatanodeInfoWithStorage[10.7.12.155:50010,DS-87568e9c-b339-4b0e-a09f-292118bcb752,DISK] DatanodeInfoWithStorage[10.7.12.156:50010,DS-80fb296c-1085-40ce-9dcf-e3a08327aa0d,DISK]

        Both nodes have been marked as dead, yet the web UI shows both DataNodes running normally.


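        That is a little odd. The dead-node list in this message is kept by the client itself, which adds a node to it whenever a read from that node fails, so the real cause has to lie in the read failure. That leaves the second piece of output, Incorrect value for packet payload size: 57616164, which the source traces to the doRead() method of PacketReceiver: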

    if (totalLen < 0 || totalLen > MAX_PACKET_SIZE) {
      throw new IOException("Incorrect value for packet payload size: " +
                            payloadLen);
    }

        In other words, while the client was receiving a packet, the packet's total length exceeded the threshold MAX_PACKET_SIZE, which is 16 MB:

  /**
   * The max size of any single packet. This prevents OOMEs when
   * invalid data is sent.
   */
  private static final int MAX_PACKET_SIZE = 16 * 1024 * 1024;
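
        As a rough standalone illustration of this limit, the sketch below re-creates the check with the same constant. The excerpt does not show how totalLen is derived, so the sketch assumes it is the payload length plus the header length. The class name, the helper isAccepted() and the header length of 25 are made up for the example.

```java
import java.io.IOException;

class PacketSizeCheck {
    /** Same value as the constant quoted above: 16 * 1024 * 1024 = 16777216 bytes. */
    static final int MAX_PACKET_SIZE = 16 * 1024 * 1024;

    /** Mirrors the quoted sanity check; totalLen = payloadLen + headerLen is an assumption. */
    static void checkPacketSize(int payloadLen, int headerLen) throws IOException {
        int totalLen = payloadLen + headerLen;
        // A negative totalLen covers both garbage input and int overflow.
        if (totalLen < 0 || totalLen > MAX_PACKET_SIZE) {
            throw new IOException("Incorrect value for packet payload size: " + payloadLen);
        }
    }

    /** Hypothetical convenience wrapper: true when the packet would pass the check. */
    static boolean isAccepted(int payloadLen, int headerLen) {
        try {
            checkPacketSize(payloadLen, headerLen);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        int headerLen = 25; // hypothetical header length, for illustration only
        System.out.println(MAX_PACKET_SIZE);                 // 16777216
        System.out.println(isAccepted(57616164, headerLen)); // false: the value from the log
        System.out.println(isAccepted(4132, headerLen));     // true: a small packet
    }
}
```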

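        The 57616164 in the exception puts this packet at roughly 55 MB (57616164 / 1024 / 1024 ≈ 54.9). This explains the difference between small and large files: a small file can only produce small packets, while a large file can produce a packet big enough to cross the threshold. But the file being read is only about 50 MB, so where do the extra 4 MB or so come from? The packet layout gives a hint: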

    // Each packet looks like:
    //   PLEN    HLEN      HEADER     CHECKSUMS  DATA
    //   32-bit  16-bit   <protobuf>  <variable length>
    //
    // PLEN:      Payload length
    //            = length(PLEN) + length(CHECKSUMS) + length(DATA)
    //            This length includes its own encoded length in
    //            the sum for historical reasons.
    //
    // HLEN:      Header length
    //            = length(HEADER)
    //
    // HEADER:    the actual packet header fields, encoded in protobuf
    // CHECKSUMS: the crcs for the data chunk. May be missing if
    //            checksums were not requested
    // DATA       the actual block data
    Preconditions.checkState(curHeader == null || !curHeader.isLastPacketInBlock());

    curPacketBuf.clear();
    curPacketBuf.limit(PacketHeader.PKT_LENGTHS_LEN);
    doReadFully(ch, in, curPacketBuf);
    curPacketBuf.flip();
    int payloadLen = curPacketBuf.getInt();

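        As the comment shows, a packet starts with the payload length, PLEN = length(PLEN) + length(CHECKSUMS) + length(DATA), followed by the header length, HLEN = length(HEADER), and then HEADER, CHECKSUMS and DATA. Everything except DATA is overhead added by the packet format, which is why a packet is larger than the file data it carries.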
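
        To put numbers on the PLEN formula, here is a small sketch. It assumes 512-byte chunks and 4-byte checksums, which are common HDFS defaults but are not stated in this article, and the class and method names are hypothetical.

```java
class PayloadLength {
    /** The 32-bit PLEN field counts its own 4 bytes. */
    static final int PLEN_FIELD_BYTES = 4;

    /** Number of checksum chunks needed to cover dataLen bytes (rounded up). */
    static long numberOfChunks(long dataLen, int chunkSize) {
        return (dataLen + chunkSize - 1) / chunkSize;
    }

    /** PLEN = length(PLEN) + length(CHECKSUMS) + length(DATA). */
    static long payloadLen(long dataLen, int chunkSize, int checksumSize) {
        long checksumsLen = numberOfChunks(dataLen, chunkSize) * checksumSize;
        return PLEN_FIELD_BYTES + checksumsLen + dataLen;
    }

    public static void main(String[] args) {
        int chunkSize = 512;   // assumed bytes per checksum chunk
        int checksumSize = 4;  // assumed bytes per checksum
        System.out.println(payloadLen(4096L, chunkSize, checksumSize));     // 4132
        System.out.println(payloadLen(57169520L, chunkSize, checksumSize)); // 57616164
    }
}
```

        Under these assumed values the checksums add 4 bytes for every 512 bytes of data, a little under 1%, and a payload of 57616164 corresponds to 57169520 bytes of data in a single packet.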

        So why is the packet this large in the first place? Follow the sending side next. In BlockSender, the method that sends a packet is sendPacket():

  /**
   * Sends a packet with up to maxChunks chunks of data.
   *
   * @param pkt buffer used for writing packet data
   * @param maxChunks maximum number of chunks to send
   * @param out stream to send data to
   * @param transferTo use transferTo to send data
   * @param throttler used for throttling data transfer bandwidth
   */
  private int sendPacket(ByteBuffer pkt, int maxChunks, OutputStream out,
      boolean transferTo, DataTransferThrottler throttler) throws IOException {

        ByteBuffer pkt is the buffer that holds the packet to be sent, and its size determines how large a packet can be. How is that size chosen? The answer is in doSendBlock():

      ByteBuffer pktBuf = ByteBuffer.allocate(pktBufSize);

      while (endOffset > offset && !Thread.currentThread().isInterrupted()) {
        manageOsCache();
        long len = sendPacket(pktBuf, maxChunksPerPacket, streamForSendChunks,
            transferTo, throttler);
        offset += len;
        totalRead += len + (numberOfChunks(len) * checksumSize);
        seqno++;
      }

        The code first allocates the buffer ByteBuffer pktBuf according to pktBufSize and then calls sendPacket() to send the packets. Clearly, all we need to know is how pktBufSize is determined:

      if (transferTo) {
        FileChannel fileChannel = ((FileInputStream)blockIn).getChannel();
        blockInPosition = fileChannel.position();
        streamForSendChunks = baseStream;
        maxChunksPerPacket = numberOfChunks(TRANSFERTO_BUFFER_SIZE);

        // Smaller packet size to only hold checksum when doing transferTo
        pktBufSize += checksumSize * maxChunksPerPacket;
      } else {
        maxChunksPerPacket = Math.max(1,
            numberOfChunks(HdfsConstants.IO_FILE_BUFFER_SIZE));
        // Packet size includes both checksum and data
        pktBufSize += (chunkSize + checksumSize) * maxChunksPerPacket;
      }

        pktBufSize depends on maxChunksPerPacket, which is computed as follows:

maxChunksPerPacket = Math.max(1,
            numberOfChunks(HdfsConstants.IO_FILE_BUFFER_SIZE));

        So it comes down to IO_FILE_BUFFER_SIZE, that is, the io.file.buffer.size parameter, which defaults to 4096 bytes (4 KB):

  public static final String  IO_FILE_BUFFER_SIZE_KEY =
    "io.file.buffer.size";
  /** Default value for IO_FILE_BUFFER_SIZE_KEY */
  public static final int     IO_FILE_BUFFER_SIZE_DEFAULT = 4096;

        The value actually configured on the cluster, however, was a full 125M. That is why the packets were so large.
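
        The effect of the setting can be worked through with a short sketch of the non-transferTo branch quoted above. It is an illustration under assumptions: 512-byte chunks, 4-byte checksums, "125M" read as 125 * 1024 * 1024 bytes, and the header's contribution to pktBufSize ignored. The class and method names are hypothetical.

```java
class PacketBufferSize {
    static final int MAX_PACKET_SIZE = 16 * 1024 * 1024;

    /** Number of chunks needed to cover dataLen bytes (rounded up). */
    static int numberOfChunks(long dataLen, int chunkSize) {
        return (int) ((dataLen + chunkSize - 1) / chunkSize);
    }

    /** Same shape as the quoted expression: Math.max(1, numberOfChunks(IO_FILE_BUFFER_SIZE)). */
    static int maxChunksPerPacket(int ioFileBufferSize, int chunkSize) {
        return Math.max(1, numberOfChunks(ioFileBufferSize, chunkSize));
    }

    /** Data plus checksum bytes a full packet can carry (header not included). */
    static long packetBodySize(int ioFileBufferSize, int chunkSize, int checksumSize) {
        return (long) (chunkSize + checksumSize) * maxChunksPerPacket(ioFileBufferSize, chunkSize);
    }

    public static void main(String[] args) {
        int chunkSize = 512;   // assumed
        int checksumSize = 4;  // assumed
        long withDefault = packetBodySize(4096, chunkSize, checksumSize);
        long with125M = packetBodySize(125 * 1024 * 1024, chunkSize, checksumSize);
        System.out.println(withDefault);                   // 4128
        System.out.println(with125M);                      // 132096000
        System.out.println(with125M > MAX_PACKET_SIZE);    // true
    }
}
```

        With the default, a full packet stays in the kilobyte range. With the oversized setting, one packet is allowed to carry far more than the receiver's 16 MB limit, so any block region larger than that limit is sent as a single packet the client rejects.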

        After the parameter was changed and the nodes were restarted, the file could be read normally.

III. Answer

        The io.file.buffer.size parameter was set too large, so the packets being sent exceeded the size limit enforced on the receiving side.
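
        The whole chain can be simulated end to end to see why only large files fail. This is a sketch under the same assumptions as before (512-byte chunks, 4-byte checksums, header length ignored in the size check), with hypothetical class and method names. It models a sender that puts at most maxChunksPerPacket chunks into each packet and a receiver that rejects any payload above 16 MB.

```java
import java.util.ArrayList;
import java.util.List;

class SendLoopSketch {
    static final int MAX_PACKET_SIZE = 16 * 1024 * 1024;
    static final int CHUNK_SIZE = 512;    // assumed
    static final int CHECKSUM_SIZE = 4;   // assumed

    static long numberOfChunks(long len) {
        return (len + CHUNK_SIZE - 1) / CHUNK_SIZE;
    }

    /** Payload length of each packet a sender would emit for fileLen bytes of data. */
    static List<Long> packetPayloads(long fileLen, int ioFileBufferSize) {
        long maxChunks = Math.max(1, numberOfChunks(ioFileBufferSize));
        List<Long> payloads = new ArrayList<>();
        long offset = 0;
        while (offset < fileLen) {
            // Each packet carries at most maxChunks chunks of data.
            long dataLen = Math.min(fileLen - offset, CHUNK_SIZE * maxChunks);
            payloads.add(4 + numberOfChunks(dataLen) * CHECKSUM_SIZE + dataLen);
            offset += dataLen;
        }
        return payloads;
    }

    /** True when every packet stays within the receiver's limit. */
    static boolean readable(long fileLen, int ioFileBufferSize) {
        for (long payload : packetPayloads(fileLen, ioFileBufferSize)) {
            if (payload > MAX_PACKET_SIZE) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        int oversized = 125 * 1024 * 1024; // assumed meaning of "125M"
        System.out.println(readable(1024L * 1024, oversized)); // true: small file, small packet
        System.out.println(readable(57169520L, oversized));    // false: one huge packet
        System.out.println(readable(57169520L, 4096));         // true: many small packets
    }
}
```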

Time: 2024-11-08 22:07:07
