1. IJVM Instruction Set Notes

These notes elaborate Figure 4-11 "The IJVM instruction set" on page 222 of our text. I also have a spreadsheet of the IJVM instructions grouped by functionality.


This is a 0-address, stack-based, instruction set architecture.

Memory is limited to 4 GB, or 1 gigaword (2^30 words). Word = 4 bytes. We're only dealing with integers, so most things are 4-byte quantities. Opcodes are 1 byte.

Implicit registers:

  • Constant pool (CPP) - read-only area, every constant is addressed as an offset from this base address
  • Local variable frame (LV) - like the frame pointer in our IA32 assembly code, points to the starting point on the stack of local variables for the current method
  • Stack pointer (SP) - top of the stack, just like IA32
  • Program counter (PC) - address of the next instruction to fetch

Because IJVM is all integer:

  • CPP, LV, SP register values are words (word=4 bytes)
  • Offsets are therefore word offsets, so CPP+1 refers to the 2nd word in the constant pool, not the 2nd byte
  • PC values are byte addresses, so offsets from the PC (branch offsets) are byte offsets
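
The word/byte distinction above can be sketched in a few lines of Python. This is only an illustration: the function name and the sample CPP value are made up, and it assumes that CPP, LV and SP hold word addresses that map to byte addresses by multiplying by 4.

```python
WORD_SIZE = 4  # bytes per IJVM word

def word_to_byte_address(base_word, word_offset):
    """Byte address of the word at base_word + word_offset (both in words)."""
    return (base_word + word_offset) * WORD_SIZE

cpp = 0x100  # hypothetical CPP value, in words
print(word_to_byte_address(cpp, 0))  # first word of the constant pool
print(word_to_byte_address(cpp, 1))  # second word: 4 bytes later, not 1
```

PC needs no such scaling, since it already counts bytes.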

Instruction set notes:

  • BIPUSH takes a 1 byte operand, but the stack holds 4-byte words. BIPUSH sign-extends the byte to a full 32-bit word before pushing it, so the operand can range from -128 to 127 and the upper 3 bytes of the stack word carry only the sign. This is consistent with the example stack shown in Figure 4-15 on page 227.
  • The branch offset parameter is 2 bytes. It is a signed, 2's complement value that is added to the address of the branch instruction's opcode, so a negative offset branches backward.
  • Note that the varnum parameter is a 1 byte quantity that indicates the word offset from the base of the local variable frame (LV).
  • The index parameter of LDC_W is a 2 byte quantity, and is again a word offset, this time from the CPP register.
  • Note that all memory references are made through the CPP, LV or SP registers, not through absolute addresses like we have been doing.
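
The operand handling in the notes above can be sketched as follows. The helper names are invented for illustration, and the sketch assumes that the first operand byte of a 2-byte offset is the high-order byte (which matches 0x00 0x0D meaning 13).

```python
def sign_extend_byte(b):
    """Interpret an 8-bit operand (0..255) as a signed value, as BIPUSH does."""
    return b - 0x100 if b & 0x80 else b

def branch_offset(hi, lo):
    """Combine two operand bytes (high byte first) into a signed 16-bit offset."""
    value = (hi << 8) | lo
    return value - 0x10000 if value & 0x8000 else value

def as_stack_word(value):
    """32-bit pattern that ends up on the stack for a signed value."""
    return value & 0xFFFFFFFF

print(hex(as_stack_word(sign_extend_byte(0x03))))  # 0x3
print(hex(as_stack_word(sign_extend_byte(0xFD))))  # 0xfffffffd, i.e. -3
print(branch_offset(0x00, 0x0D))                   # 13
print(branch_offset(0xFF, 0xF3))                   # -13
```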

Note that all the shenanigans in the book surrounding INVOKEVIRTUAL (can you say "call"... geez) and IRETURN are just about maintaining a stack pointer and a frame pointer so that function parameters and local variables can be placed on the same stack. Heck, we already know how to do this manually in our IA32 assembly coding, so this is easy.
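
Here is a rough Python sketch of that bookkeeping, based on one reading of the book's description, so check it against the text. The assumptions: the stack is a word-addressed list that grows upward, the caller has already pushed OBJREF and the parameters, the parameter count includes the OBJREF slot, and the OBJREF slot is overwritten with a link pointer to the saved PC. The function names and the numbers in the example are hypothetical.

```python
def invoke(stack, sp, lv, return_pc, nparams, nlocals, method_pc):
    """Set up a callee frame; returns the new (sp, lv, pc)."""
    new_lv = sp - nparams + 1   # OBJREF slot becomes the base of the new frame
    sp += nlocals               # reserve the callee's local variable slots
    sp += 1
    stack[sp] = return_pc       # save the caller's PC
    link = sp
    sp += 1
    stack[sp] = lv              # save the caller's LV
    stack[new_lv] = link        # link pointer replaces OBJREF
    return sp, new_lv, method_pc

def ireturn(stack, sp, lv):
    """Tear down the callee frame; returns the caller's (sp, lv, pc)."""
    result = stack[sp]          # return value is on top of the stack
    link = stack[lv]            # find the saved PC through the link pointer
    return_pc = stack[link]
    caller_lv = stack[link + 1]
    sp = lv                     # drop the whole frame...
    stack[sp] = result          # ...and leave the result where OBJREF was
    return sp, caller_lv, return_pc

stack = [0] * 32
sp, lv = 2, 0                        # hypothetical caller state
for word in (0, 7, 9):               # push OBJREF and two parameters
    sp += 1
    stack[sp] = word
sp, lv, pc = invoke(stack, sp, lv, return_pc=100, nparams=3, nlocals=2, method_pc=200)
sp += 1
stack[sp] = stack[lv + 1] + stack[lv + 2]   # callee pushes param1 + param2
sp, lv, pc = ireturn(stack, sp, lv)
print(sp, lv, pc, stack[sp])         # 3 0 100 16
```

This is the same idea as saving and restoring the frame pointer around a call in IA32: parameters and locals are both reached as small offsets from LV.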

Note the example in Figure 4-14 on page 226:

  • See how easy the encoding of instructions is.
  • See the variable offsets from LV being used, rather than actual memory references as in IA32.
  • See the penalty for how simple an instruction set this is... 4 statements required to decrement j.
  • See the encoding for IF_ICMPEQ L1. Here "L1" is encoded as 0x00 0x0D, or 13. So, the branch target is 13 bytes after the address of the IF_ICMPEQ opcode. You should be able to count this out. The same counting works for GOTO L2. Try it.
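
If you want to check your counting, the following Python sketch walks a byte sequence and reports where each branch lands. The byte listing is an assumed transcription of the example (i, j, k in local variable slots 1, 2, 3, the fragment starting at address 0, and 0x00 0x07 as the GOTO operand), so compare it against the figure before trusting it. The names `OPERAND_BYTES`, `BRANCHES` and `branch_targets` are made up for this illustration.

```python
# Operand byte counts for the opcodes used here (opcode byte not included).
OPERAND_BYTES = {
    0x15: 1,  # ILOAD varnum
    0x36: 1,  # ISTORE varnum
    0x10: 1,  # BIPUSH byte
    0x60: 0,  # IADD
    0x64: 0,  # ISUB
    0x9F: 2,  # IF_ICMPEQ offset
    0xA7: 2,  # GOTO offset
}
BRANCHES = {0x9F, 0xA7}

# Assumed encoding of the example, with i, j, k in slots 1, 2, 3.
CODE = bytes([
    0x15, 0x02,        # ILOAD j
    0x15, 0x03,        # ILOAD k
    0x60,              # IADD
    0x36, 0x01,        # ISTORE i
    0x15, 0x01,        # ILOAD i
    0x10, 0x03,        # BIPUSH 3
    0x9F, 0x00, 0x0D,  # IF_ICMPEQ L1
    0x15, 0x02,        # ILOAD j
    0x10, 0x01,        # BIPUSH 1
    0x64,              # ISUB
    0x36, 0x02,        # ISTORE j
    0xA7, 0x00, 0x07,  # GOTO L2
    0x10, 0x00,        # L1: BIPUSH 0
    0x36, 0x03,        # ISTORE k
])                     # L2: first byte after this fragment

def branch_targets(code):
    """Return {address of branch opcode: target address} for a byte sequence."""
    targets = {}
    pc = 0
    while pc < len(code):
        opcode = code[pc]
        if opcode in BRANCHES:
            # The 2-byte operand is a signed offset from the opcode's address.
            offset = int.from_bytes(code[pc + 1:pc + 3], "big", signed=True)
            targets[pc] = pc + offset
        pc += 1 + OPERAND_BYTES[opcode]
    return targets

print(branch_targets(CODE))  # {11: 24, 21: 28}
```

With this listing, IF_ICMPEQ sits at byte 11 and lands on byte 24 (the BIPUSH 0 at L1), and GOTO sits at byte 21 and lands on byte 28, just past the fragment.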

Question: With only 1 and 2 byte operand sizes, how do you set something, like a parameter or local variable, to a 4 byte quantity? Maybe multiple BIPUSH instructions? Multiple IINC calls?


Author: William Krieger, Oct 2003 CSC 220 Fall 2003