Big-Endian and Little-Endian Issues in Java

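1. Solving the Endian Problem: A Summary
Everything in Java binary files is stored in big-endian form, most significant byte first. This is sometimes called network order. That is good news: if you use only Java, all files are handled the same way on all platforms (Mac, PC, Solaris, etc.). You can freely exchange binary data electronically over the Internet, or on floppy disk, without considering endianness. The problem arises when you exchange data files with programs that are not written in Java. Such programs often use little-endian order, typically C programs on a PC. Some platforms use big-endian byte order internally (Mac, IBM 390); others use little-endian byte order (Intel). Java hides the endian issue from you.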

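In a binary file there are no separators between fields; the file is in binary form, not readable ASCII. If the data you want to read is not in the standard Java format, usually because it was prepared by a non-Java program, you have four choices:

1). Rewrite the exporting program that produces the input file, so that it directly writes the big-endian binary format of DataOutputStream, or a character-based format.

2). Write a separate translation program that reads the bytes and rearranges them. It can be written in any language.

3). Read the data as bytes and rearrange them on the fly.

4). The easiest way is to use the LEDataInputStream, LEDataOutputStream and LERandomAccessFile classes I wrote. They mimic DataInputStream, DataOutputStream and RandomAccessFile, but work with little-endian byte streams. You can read about LEDataStream. You can download the code and source free. You can get help from the File I/O Amanuensis to show you how to use the classes. Just tell it you have little-endian binary data.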
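
As an illustration of choice 3, here is a minimal sketch that rearranges bytes on the fly using the standard java.nio.ByteBuffer class with ByteOrder.LITTLE_ENDIAN. It is not the LEDataInputStream approach described above. The class name, the helper names intAt and doubleAt, and the 12-byte record layout are assumptions made up for this example.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class ByteBufferLittleEndianDemo {

    // Returns the little-endian int stored at the given offset of raw.
    static int intAt(byte[] raw, int offset) {
        return ByteBuffer.wrap(raw).order(ByteOrder.LITTLE_ENDIAN).getInt(offset);
    }

    // Returns the little-endian double stored at the given offset of raw.
    static double doubleAt(byte[] raw, int offset) {
        return ByteBuffer.wrap(raw).order(ByteOrder.LITTLE_ENDIAN).getDouble(offset);
    }

    public static void main(String[] args) {
        // Hypothetical record: a 4-byte int followed by an 8-byte double,
        // laid out the way a C program on an Intel PC would typically write them.
        byte[] raw = ByteBuffer.allocate(12)
                .order(ByteOrder.LITTLE_ENDIAN)
                .putInt(0x12345678)
                .putDouble(1.5)
                .array();

        System.out.println(Integer.toHexString(intAt(raw, 0)));
        System.out.println(doubleAt(raw, 4));
    }
}
```

A ByteBuffer defaults to big-endian, so the order(...) call is what makes the difference; forgetting it silently yields byte-swapped values.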

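2. You May Not Even Have a Problem
Many Java newcomers who come from C assume they need to consider whether the platform they are running on uses big-endian or little-endian order internally. In Java this is not an issue. Further, without resorting to native classes, you cannot even find out how values are stored internally. Java has no struct I/O and no unions or any of the other endian-sensitive language constructs.

You only need to consider endianness when communicating with legacy C/C++ applications. The following code produces the same result on both big-endian and little-endian machines: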

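// take 16-bit short apart into two 8-bit bytes.
// 0xabcd is an int literal outside the short range, so it needs an explicit cast.
short x = (short) 0xabcd;
byte high = (byte) (x >>> 8);
byte low = (byte) x; /* cast implies & 0xff */
// prints x=-21555 high=-85 low=-51 on any platform
System.out.println( "x=" + x + " high=" + high + " low=" + low );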

3. Reading Little-Endian Binary Files
The most common problem is dealing with files stored in little-endian format.

I had to implement routines parallel to those in java.io.DataInputStream which reads raw binary, in my LEDataInputStream and LEDataOutputStream classes. Don't confuse this with the io.DataInput human-readable character-based file-interchange format.

If you wanted to do it yourself, without the overhead of the full LEDataInputStream and LEDataOutputStream classes, here is the basic technique:

Presuming your integers are in 2's complement little-endian format, shorts are pretty easy to handle:

--------------------------------------------------------------------------------

short readShortLittleEndian( )
{
    // 2 bytes
    int low = readByte() & 0xff;
    int high = readByte() & 0xff;
    return (short)( high << 8 | low );
}

Or if you want to get clever and puzzle your readers, you can avoid one mask since the high bits will later be shaved off by conversion back to short.

short readShortLittleEndian( )
{
    // 2 bytes
    int low = readByte() & 0xff;
    // avoid masking here
    int high = readByte();
    return (short)( high << 8 | low );
}

--------------------------------------------------------------------------------

Longs are a little more complicated:

--------------------------------------------------------------------------------

long readLongLittleEndian( )
{
    // 8 bytes
    long accum = 0;
    for ( int shiftBy = 0; shiftBy < 64; shiftBy += 8 )
    {
        // must cast to long or shift done modulo 32
        accum |= (long)( readByte() & 0xff ) << shiftBy;
    }
    return accum;
}

--------------------------------------------------------------------------------

In a similar way we handle char and int.

--------------------------------------------------------------------------------

char readCharLittleEndian( )
{
    // 2 bytes
    int low = readByte() & 0xff;
    // no mask needed; the cast to char discards the sign-extension bits
    int high = readByte();
    return (char)( high << 8 | low );
}

--------------------------------------------------------------------------------

int readIntLittleEndian( )
{
    // 4 bytes
    int accum = 0;
    for ( int shiftBy = 0; shiftBy < 32; shiftBy += 8 )
    {
        accum |= ( readByte() & 0xff ) << shiftBy;
    }
    return accum;
}

--------------------------------------------------------------------------------
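
An alternative to assembling the value byte by byte is to let DataInputStream read the value big-endian and then swap the bytes with the standard reverseBytes methods (Short, Character, Integer and Long each have one, since Java 5). The sketch below assumes the input is wrapped in a java.io.DataInputStream; the class name ReverseBytesDemo and the sample bytes are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

class ReverseBytesDemo {

    // Each method reads a big-endian value, then reverses its byte order.
    static short readShortLittleEndian(DataInputStream in) throws IOException {
        return Short.reverseBytes(in.readShort());
    }

    static char readCharLittleEndian(DataInputStream in) throws IOException {
        return Character.reverseBytes(in.readChar());
    }

    static int readIntLittleEndian(DataInputStream in) throws IOException {
        return Integer.reverseBytes(in.readInt());
    }

    static long readLongLittleEndian(DataInputStream in) throws IOException {
        return Long.reverseBytes(in.readLong());
    }

    public static void main(String[] args) throws IOException {
        // A little-endian short 0x1234 followed by a little-endian int 0x12345678.
        byte[] raw = { 0x34, 0x12, 0x78, 0x56, 0x34, 0x12 };
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(raw));
        System.out.println(Integer.toHexString(readShortLittleEndian(in)));
        System.out.println(Integer.toHexString(readIntLittleEndian(in)));
    }
}
```

Both approaches should give the same results; this one is shorter, while the explicit shift loop makes the byte layout visible in the code.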

Floating point is a little trickier. Presuming your data is in IEEE little-endian format, you need something like this:

--------------------------------------------------------------------------------

double readDoubleLittleEndian( )
{
    // 8 bytes
    long accum = 0;
    for ( int shiftBy = 0; shiftBy < 64; shiftBy += 8 )
    {
        // must cast to long or shift done modulo 32
        accum |= (long)( readByte() & 0xff ) << shiftBy;
    }
    return Double.longBitsToDouble( accum );
}

--------------------------------------------------------------------------------

float readFloatLittleEndian( )
{
    // 4 bytes
    int accum = 0;
    for ( int shiftBy = 0; shiftBy < 32; shiftBy += 8 )
    {
        accum |= ( readByte() & 0xff ) << shiftBy;
    }
    return Float.intBitsToFloat( accum );
}

--------------------------------------------------------------------------------
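
To make the floating-point case concrete, here is a small worked example using the same shift-and-accumulate technique. It assumes the bytes come from a java.io.DataInputStream; the class name is hypothetical. The sample bytes are the IEEE 754 encodings of 1.0: 0x3F800000 as a float and 0x3FF0000000000000 as a double, each stored least significant byte first.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

class FloatLittleEndianDemo {

    static float readFloatLittleEndian(DataInputStream in) throws IOException {
        int accum = 0;
        for (int shiftBy = 0; shiftBy < 32; shiftBy += 8) {
            accum |= (in.readByte() & 0xff) << shiftBy;
        }
        // Reinterpret the 32 assembled bits as an IEEE 754 float.
        return Float.intBitsToFloat(accum);
    }

    static double readDoubleLittleEndian(DataInputStream in) throws IOException {
        long accum = 0;
        for (int shiftBy = 0; shiftBy < 64; shiftBy += 8) {
            // Cast to long before shifting, otherwise the shift is done modulo 32.
            accum |= (long) (in.readByte() & 0xff) << shiftBy;
        }
        // Reinterpret the 64 assembled bits as an IEEE 754 double.
        return Double.longBitsToDouble(accum);
    }

    public static void main(String[] args) throws IOException {
        byte[] raw = {
            0x00, 0x00, (byte) 0x80, 0x3F,                          // 1.0f, little-endian
            0x00, 0x00, 0x00, 0x00, 0x00, 0x00, (byte) 0xF0, 0x3F   // 1.0, little-endian
        };
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(raw));
        System.out.println(readFloatLittleEndian(in));
        System.out.println(readDoubleLittleEndian(in));
    }
}
```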

You don't need a readByteLittleEndian since the code would be identical to readByte, though you might create one just for consistency:

--------------------------------------------------------------------------------

byte readByteLittleEndian( )
{
    // 1 byte
    return readByte();
}

--------------------------------------------------------------------------------
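
Writing little-endian data is the mirror image of reading it: emit the least significant byte first. The following is only a sketch of that idea, not the code of LEDataOutputStream; the class and method names are assumptions chosen for this example. It relies on the documented behaviour of OutputStream.write(int), which writes only the low-order 8 bits of its argument.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

class LittleEndianWriteDemo {

    static void writeIntLittleEndian(OutputStream out, int value) throws IOException {
        for (int shiftBy = 0; shiftBy < 32; shiftBy += 8) {
            // write(int) emits only the low 8 bits, so no mask is needed.
            out.write(value >>> shiftBy);
        }
    }

    static void writeLongLittleEndian(OutputStream out, long value) throws IOException {
        for (int shiftBy = 0; shiftBy < 64; shiftBy += 8) {
            out.write((int) (value >>> shiftBy));
        }
    }

    // Convenience helper for the demo: the 4 little-endian bytes of value.
    static byte[] intToLittleEndianBytes(int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(4);
        try {
            writeIntLittleEndian(out, value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        for (byte b : intToLittleEndianBytes(0x12345678)) {
            System.out.printf("%02x ", b);
        }
        System.out.println();
    }
}
```

Floats and doubles can be written the same way after converting them with Float.floatToIntBits or Double.doubleToLongBits.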

4. History
In Gulliver's Travels the Lilliputians liked to break their eggs on the small end and the Blefuscudians on the big end. They fought wars over this. There is a computer analogy. Should numbers be stored most or least significant byte first? This is sometimes referred to as byte sex.

Those in the big-endian camp (most significant byte stored first) include the Java VM virtual computer, the Java binary file format, the IBM 360 and follow-on mainframes such as the 390, and the Motorola 68K and most mainframes. The Power PC is endian-agnostic.

Blefuscudians (big-endians) assert this is the way God intended integers to be stored, most important part first. At an assembler level fields of mixed positive integers and text can be sorted as if it were one big text field key. Real programmers read hex dumps, and big-endian is a lot easier to comprehend.

In the little-endian camp (least significant byte first) are the Intel 8080, 8086, 80286, Pentium and follow-ons, and the MOS Technology 6502 popularised by the Apple ][.

Lilliputians (little-endians) assert that putting the low order part first is more natural because when you do arithmetic manually, you start at the least significant part and work toward the most significant part. This ordering makes writing multi-precision arithmetic easier since you work up, not down. It made implementing 8-bit microprocessors easier. At the assembler level (not in Java) it also lets you cheat and pass the address of a 32-bit positive int to a routine expecting only a 16-bit parameter and still have it work. Real programmers read hex dumps, and little-endian is more of a stimulating challenge.

If a machine is word addressable, with no finer addressing supported, the concept of endianness means nothing since words are fetched from RAM in parallel, both ends first.

5. What Sex Is Your CPU?
Byte Sex (Endianness) of CPUs

CPU | Endianness | Notes
MOS Technology 6502; AMD Duron, Athlon, Thunderbird | little | The 6502 was used in the Apple ][; the Duron, Athlon and Thunderbird in machines running Windows 95/98/ME/NT/2000/XP.
Apple ][ 6502 | little |
Apple Mac 68000 | big | Uses Motorola 68000.
Apple Power PC | big | CPU is bisexual but stays big in the Mac OS.
Burroughs 1700, 1800, 1900 | ? | Bit addressable. Used different interpreter firmware instruction sets for each language.
Burroughs 7800 | ? | Algol machine.
CDC LGP-30 | word-addressable only, hence no endianness | 31½ bit words. Low order bit must be 0 on the drum, but can be 1 in the accumulator.
CDC 3300, 6600 | word-addressable | ?
DEC PDP, Vax | little |
IBM 360, 370, 380, 390 | big |
IBM 7044, 7090 | word-addressable | 36 bits.
IBM AS-400 | big | ?
Power PC | either | The endian-agnostic PowerPCs have a foot in both camps. They are bisexual, but the OS usually imposes one convention or the other, e.g. Mac PowerPCs are big-endian.
Intel 8080, 8086, 80286, 80386, 80486, Pentium I, II, III, IV | little | Chips used in PCs.
Intel 8051 | big |
MIPS R4000, R5000, R10000 | big | Used in Silicon Graphics IRIX.
Motorola 6800, 6809, 680x0, 68HC11 | big | Early Macs used the 68000. Amiga.
NCR 8500 | big |
NCR Century | big |
Sun Sparc and UltraSparc | big | Sun's Solaris. Normally used as big-endian, but also supports operating in little-endian mode, including being able to switch endianness under program control for particular loads and stores.
Univac 1100 | word-addressable | 36-bit words.
Univac 90/30 | big | IBM 370 clone.
Zilog Z80 | little | Used in CP/M machines.
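
If you would rather ask the machine than look it up, the java.nio package (added in Java 1.4) exposes the byte order of the underlying hardware through ByteOrder.nativeOrder(). The sketch below is only an illustration; the class and method names are made up. Note that this reports the platform's order, which has no effect on the big-endian format Java uses for DataInputStream and DataOutputStream.

```java
import java.nio.ByteOrder;

class NativeOrderDemo {

    // Returns "little" or "big" depending on the hardware the JVM is running on.
    static String describeNativeOrder() {
        ByteOrder order = ByteOrder.nativeOrder();
        return order == ByteOrder.LITTLE_ENDIAN ? "little" : "big";
    }

    public static void main(String[] args) {
        System.out.println("This CPU is " + describeNativeOrder() + "-endian.");
    }
}
```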

If you know the endianness of other CPUs/OSes/platforms please email me at roedy@mindprod.com.

In theory data can have two different byte sexes but CPUs can have four. Let us give thanks, in this world of mixed left and right hand drive, that there are not real CPUs with all four sexes to contend with.

The Four Possible Byte Sexes for CPUs

Which Byte Is Stored in the Lower-Numbered Address? | Which Byte Is Addressed? | Used In
LSB | LSB | Intel, AMD, Power PC, DEC.
LSB | MSB | None that I know of.
MSB | LSB | Perhaps one of the old word mark architecture machines.
MSB | MSB | Mac, IBM 390, Power PC.
