by jcranmer 3476 days ago
The JVM uses an 8-bit opcode, of which 202 are defined. Some opcodes have implicit operands (e.g., iload_1 pushes local variable 1 onto the stack as an integer, and iconst_m1 pushes -1), while others take an explicit operand, usually a 1-byte index (or a 2-byte offset for jumps). There is also a wide prefix instruction that makes the next instruction use a 2-byte index instead.

Instruction decoding for the entire JVM instruction set, save tableswitch and lookupswitch, amounted to about 50 lines of code, not counting the enum listing every opcode.
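To give a feel for why the decoder can be that small, here is a minimal table-driven sketch in Python. It is not the code described above: it covers only a hand-picked subset of opcodes, and the names OPCODES, WIDENABLE, and decode are made up for illustration. The opcode values themselves are the real ones.

```python
# Hand-picked subset of real JVM opcodes:
# opcode value -> (mnemonic, operand size in bytes, operand is signed?)
OPCODES = {
    0x02: ("iconst_m1", 0, False),
    0x10: ("bipush", 1, True),
    0x11: ("sipush", 2, True),
    0x15: ("iload", 1, False),
    0x1B: ("iload_1", 0, False),
    0x36: ("istore", 1, False),
    0x60: ("iadd", 0, False),
    0xA7: ("goto", 2, True),
    0xAC: ("ireturn", 0, False),
}
WIDE = 0xC4
# Opcodes in this subset whose local variable index the wide prefix extends.
WIDENABLE = {0x15, 0x36}


def decode(code):
    """Return a list of (offset, mnemonic, operand or None) tuples."""
    out = []
    pc = 0
    while pc < len(code):
        start = pc
        opcode = code[pc]
        pc += 1
        prefix = ""
        if opcode == WIDE:
            if pc >= len(code):
                raise ValueError("wide prefix at end of code")
            opcode = code[pc]
            pc += 1
            if opcode not in WIDENABLE:
                raise ValueError(f"wide not handled for opcode {opcode:#04x}")
            prefix = "wide "
        if opcode not in OPCODES:
            raise ValueError(f"opcode {opcode:#04x} is not in this subset")
        name, size, signed = OPCODES[opcode]
        if prefix:
            size = 2  # wide turns the 1-byte index into a 2-byte index
        if pc + size > len(code):
            raise ValueError("truncated operand")
        operand = None
        if size:
            # Multi-byte operands are always big-endian in JVM bytecode.
            operand = int.from_bytes(code[pc:pc + size], "big", signed=signed)
        pc += size
        out.append((start, prefix + name, operand))
    return out
```

Filling in the rest of the table is mostly data entry. The variable-length tableswitch and lookupswitch, and the wide form of iinc, are the cases that need special handling and are left out here.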


What I meant was more: what shape does the instruction stream take? Is it literally just a stream of individual bytes that have to be decoded when pushing constants, etc.?
The Code attribute in the JVM spec consists of the maximum stack depth and the number of local variable slots, a byte array holding the instructions, an exception handler table, and a collection of nested attributes to represent things like the stack map or line number tables.

Most of the opcodes have zero or one operands, usually a single byte (or two bytes with a prefix) that's an index into the local variable array or the constant pool. There are several ways to load constants: the integer constants -1 through 5, the long constants 0 and 1, the float constants 0, 1, and 2, and the double constants 0 and 1 all have 0-operand opcodes (e.g., iconst_0); there's an opcode to load a 1-byte immediate (bipush), another to load a two-byte immediate (sipush), and the ldc family of opcodes to load a constant from the constant pool (which can be a string, a Class<?> reference, an integer, long, double, or float).
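As a rough illustration of how those options stack up for an int constant, here is a small Python sketch that picks the shortest encoding. The function name push_int and the list used as a stand-in for the constant pool are assumptions for the example. A real constant pool holds tagged entries of several kinds, not bare integers.

```python
def push_int(value, pool):
    """Return the bytecode that pushes the int `value`.

    `pool` is a plain list standing in for the constant pool (a
    simplification); it is extended when the value needs a pool entry.
    """
    if not -2**31 <= value < 2**31:
        raise ValueError("not a 32-bit int")
    if -1 <= value <= 5:
        # iconst_m1 is 0x02 and iconst_0 .. iconst_5 are 0x03 .. 0x08.
        return bytes([0x03 + value])
    if -128 <= value <= 127:
        # bipush: one signed immediate byte.
        return b"\x10" + value.to_bytes(1, "big", signed=True)
    if -32768 <= value <= 32767:
        # sipush: two immediate bytes, big-endian, signed.
        return b"\x11" + value.to_bytes(2, "big", signed=True)
    if value not in pool:
        pool.append(value)
    index = pool.index(value) + 1  # constant pool indices start at 1
    if index <= 0xFF:
        return b"\x12" + bytes([index])  # ldc: 1-byte pool index
    return b"\x13" + index.to_bytes(2, "big")  # ldc_w: 2-byte pool index
```

For example, push_int(5, []) yields the single byte 0x08, while push_int(100000, pool) falls through to ldc and adds 100000 to the pool.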

Note that the JVM specifies a fixed big-endian format for its bytecode; as a result, even the two-byte immediates are described in the manual as two operands of a single byte each.
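In code, combining those two single-byte operands looks roughly like this. It is a Python sketch; the function name signed16 is made up, and the sign extension applies to signed operands such as branch offsets and the sipush immediate.

```python
def signed16(byte1, byte2):
    """Combine two operand bytes (high byte first) into a signed 16-bit value."""
    value = (byte1 << 8) | byte2
    # Sign-extend: values with the top bit set are negative.
    return value - 0x10000 if value & 0x8000 else value
```

Because each operand is fetched as an individual byte, the host machine's byte order never enters into it.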

That's one possible design, but it's common to use a constant pool. Placing constants in the instruction stream is possible but awkward, since there is no guarantee that they are either aligned or of the right endianness (assuming you are trying to write a more or less portable system).

Constants that fit in a byte don't have this problem and can be easily placed in the instruction stream using dedicated byte constant instructions, sometimes including dedicated single-byte instructions for common constants like zero, one, etc.
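To make the alignment and byte-order concern for inline multi-byte constants concrete, here is a small Python sketch. The instruction layout (one opcode byte followed by a 4-byte constant) and the opcode value 0x01 are hypothetical, not taken from any real VM.

```python
import struct

# Hypothetical layout: one opcode byte (0x01 = "push int32") followed by
# an inline 4-byte constant. The constant therefore starts at offset 1,
# which is not 4-byte aligned.
STREAM = bytes([0x01, 0x00, 0x00, 0x01, 0x02])


def read_i32(stream, offset, byteorder):
    """Read a signed 32-bit constant at `offset`; byteorder is '>' or '<'."""
    return struct.unpack_from(byteorder + "i", stream, offset)[0]


big = read_i32(STREAM, 1, ">")     # bytes read as big-endian
little = read_i32(STREAM, 1, "<")  # same bytes read as little-endian
```

The same four bytes give 258 when read as big-endian and 33619968 when read as little-endian, so a portable format has to pin the byte order down and the interpreter has to read unaligned data byte by byte (or via something like struct) rather than casting a pointer.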