1. IJVM Instruction Set Notes
These notes elaborate Figure 4-11 "The IJVM instruction set" on page 250 of our text. I also have a
spreadsheet of the IJVM instructions grouped by functionality.
This is a 0-address, stack-based instruction set architecture.
The memory limit is 4 GB, or 1 gigaword. A word is 4 bytes. We're only dealing
with integers, so most things are 4-byte quantities. Opcodes are 1 byte.
Implicit registers:
- Constant pool (CPP) - read-only area, all constants are an offset from this
base address
- Local variable frame (LV) - like the frame pointer in our IA-32 assembly
code, points to the starting point on the stack of local variables for the
current method
- Stack pointer (SP) - top of the stack, just like IA-32
- Program counter (PC) - address of the next instruction to fetch
Because IJVM is all integer:
- CPP, LV, SP register values are words (word=4 bytes)
- Offsets are therefore word offsets, so CPP+1 refers to the 2nd word in the
constant pool, not the 2nd byte
- PC values are bytes, offsets are therefore bytes
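To make the word-versus-byte distinction concrete, here is a minimal sketch of the address arithmetic. It assumes that CPP, LV and SP hold word numbers, so the byte address is the word number times 4; the function name and the sample register value are made up for illustration.

```python
WORD_SIZE = 4                      # bytes per IJVM word
MEMORY_BYTES = 2 ** 32             # the 4 GB limit
MEMORY_WORDS = MEMORY_BYTES // WORD_SIZE   # 2**30, i.e. 1 gigaword


def byte_address(word_register, word_offset):
    """Byte address of the word at word_register + word_offset.

    Assumes the register holds a word number, not a byte address.
    """
    return (word_register + word_offset) * WORD_SIZE


# Hypothetical CPP value, chosen only for the example.
cpp = 0x100
second_constant = byte_address(cpp, 1)   # CPP+1 is the 2nd word, 4 bytes past CPP
```

PC is the odd one out: it already counts bytes, so PC+1 really is the next byte of the method.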
Instruction set notes:
- BIPUSH puts a 1 byte quantity on the stack, but the stack holds 4-byte
words. BIPUSH sign-extends the byte to a full word as it pushes it, so the
upper 3 bytes carry no new information. This is implied in the example stack
shown in Figure 4-15 on page 255.
- The branch offset parameter is 2 bytes. It's added to the address of the
branch instruction itself. Negative offsets are allowed, and a 2's complement
representation is used.
- Note that the varnum parameter is a 1 byte quantity that indicates
the word offset from the local variable frame.
- The index parameter of LDC_W is again a word offset from the CPP register.
- Note that all memory references are made through the CPP, LV or SP
registers, not directly like we have been doing.
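The byte and 2-byte operands in the notes above are both 2's complement values, so decoding them comes down to sign extension. The following is a rough sketch of that arithmetic, not the book's microcode; the function names are my own, and the high-byte-first operand order is taken from the 0x00 0x0D example in these notes.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a 2's complement number."""
    sign_bit = 1 << (bits - 1)
    return (value & (sign_bit - 1)) - (value & sign_bit)


def bipush_word(operand_byte):
    """The 32-bit word pattern BIPUSH leaves on the stack for a 1-byte operand."""
    return sign_extend(operand_byte, 8) & 0xFFFFFFFF


def branch_target(opcode_address, high_byte, low_byte):
    """Byte address a branch jumps to: opcode address plus signed 16-bit offset."""
    offset = sign_extend((high_byte << 8) | low_byte, 16)
    return opcode_address + offset
```

For example, bipush_word(0x05) is just 5, while bipush_word(0xFF) is 0xFFFFFFFF, which is -1 as a 32-bit word. A branch at byte address 10 with operand bytes 0x00 0x0D lands at address 23.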
Note that all the shenanigans in the book surrounding INVOKEVIRTUAL (can you
say "call"... jeez) and IRETURN are just about implementing a stack
pointer and frame pointer that allows function parameters and local variables to
be placed on the same stack. Heck, we already know how to do this manually in
our IA-32 assembly coding, so this is easy.
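Here is a toy model of that call/return bookkeeping, as I read the book's scheme: the caller pushes OBJREF and the arguments, the callee's LV points at the OBJREF slot, and that slot is overwritten with a link to the saved PC, with the saved LV right above it. The class and method names are invented, the stack is a Python list with one slot per word, and the method-area lookup is skipped, so treat it as a sketch to check against the text rather than a faithful implementation.

```python
class ToyIJVM:
    def __init__(self):
        self.stack = []   # one list slot per 4-byte word
        self.lv = 0       # word index of the current frame's first slot
        self.pc = 0       # assumed to already hold the return address at call time

    def push(self, word):
        self.stack.append(word)

    def invoke(self, nparams, nlocals, target_pc):
        """nparams counts OBJREF plus the arguments the caller already pushed."""
        new_lv = len(self.stack) - nparams     # slot holding OBJREF
        self.stack.extend([0] * nlocals)       # room for the callee's locals
        link = len(self.stack)                 # where the saved PC will live
        self.stack.append(self.pc)             # saved PC
        self.stack.append(self.lv)             # saved LV
        self.stack[new_lv] = link              # OBJREF slot becomes the link pointer
        self.lv = new_lv
        self.pc = target_pc

    def ireturn(self):
        """Pop the callee's frame, leaving the return value on the caller's stack."""
        value = self.stack[-1]
        link = self.stack[self.lv]
        frame_start = self.lv
        self.pc = self.stack[link]
        self.lv = self.stack[link + 1]
        del self.stack[frame_start:]           # drop params, locals, saved PC/LV
        self.stack.append(value)


m = ToyIJVM()
m.pc = 100
m.push(7)              # something already on the caller's stack
m.push(0)              # OBJREF placeholder
m.push(3)              # first argument
m.push(4)              # second argument
m.invoke(nparams=3, nlocals=1, target_pc=200)
first_arg = m.stack[m.lv + 1]   # arguments are reached as offsets from LV
m.push(12)             # the callee's result
m.ireturn()
```

After the return, the caller's stack is [7, 12]: the OBJREF and arguments are gone, and the return value sits where OBJREF used to be.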
Note the example in Figure 4-14 on page 254:
- See how easy the encoding of instructions is.
- See the variable offsets from LV being used, rather than actual memory
references as in IA-32
- See the penalty for how simple an instruction set this is... 4 statements
required to decrement j.
- See the encoding for IF_ICMPEQ L1. Here "L1" is encoded as 0x00 0x0D,
or 13. So, the branch is 13 bytes after the address of the IF_ICMPEQ
instruction. You should be able to count this out. The same is true of
GOTO L2. Try it.
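To see why decrementing j costs 4 statements, here is a tiny interpreter sketch that handles just those 4 instructions. The opcode values are the ones I believe the figure lists (they match the standard JVM opcodes). Using local variable number 2 for j is an assumption, and the interpreter ignores 32-bit overflow.

```python
ILOAD, ISTORE, BIPUSH, ISUB = 0x15, 0x36, 0x10, 0x64


def run(code, local_vars):
    """Execute a byte sequence using only ILOAD, ISTORE, BIPUSH and ISUB."""
    stack = []
    pc = 0                                   # byte offset into code
    while pc < len(code):
        opcode = code[pc]
        if opcode == BIPUSH:
            operand = code[pc + 1]
            # sign-extend the 1-byte operand
            stack.append(operand - 256 if operand >= 128 else operand)
            pc += 2
        elif opcode == ILOAD:
            stack.append(local_vars[code[pc + 1]])   # varnum is an offset from LV
            pc += 2
        elif opcode == ISTORE:
            local_vars[code[pc + 1]] = stack.pop()
            pc += 2
        elif opcode == ISUB:
            top = stack.pop()
            below = stack.pop()
            stack.append(below - top)        # second-from-top minus top
            pc += 1
        else:
            raise ValueError(f"unsupported opcode 0x{opcode:02X}")
    return local_vars


# j = j - 1, assuming j is local variable number 2
DECREMENT_J = bytes([ILOAD, 2, BIPUSH, 1, ISUB, ISTORE, 2])

result = run(DECREMENT_J, [0, 10, 5, 0])     # j starts at 5
```

The whole sequence is 7 bytes: two bytes each for ILOAD, BIPUSH and ISTORE, and a single byte for ISUB. Counting bytes this way is also how you check the branch offsets in the figure.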