1. IJVM Instruction Set Notes

These notes elaborate on Figure 4-11 "The IJVM instruction set" on page 250 of our text. I also have a spreadsheet of the IJVM instructions grouped by functionality.


This is a 0-address, stack-based, instruction set architecture.

Memory is limited to 4 GB, which is 1 giga-word (2^30 words), since a word is 4 bytes. We're only dealing with integers, so most things are 4-byte quantities. Opcodes are 1 byte.

Implicit registers:

  • Constant pool (CPP) - base address of a read-only area; every constant is addressed as an offset from this base
  • Local variable frame (LV) - like the frame pointer in our IA-32 assembly code, points to the starting point on the stack of local variables for the current method
  • Stack pointer (SP) - top of the stack, just like IA-32
  • Program counter (PC) - address of the next instruction to fetch

Because IJVM is all integer:

  • CPP, LV and SP hold word addresses (word = 4 bytes)
  • Offsets from them are therefore word offsets, so CPP+1 refers to the 2nd word in the constant pool, not the 2nd byte
  • PC holds a byte address, so offsets from it are byte offsets
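
To make the word/byte distinction concrete, here is a small Python sketch. The function name and the sample CPP value are made up for illustration; the only fact taken from the notes is that a word is 4 bytes.

```python
WORD_SIZE = 4  # bytes per IJVM word, as stated in these notes

def word_to_byte_address(word_address):
    """Convert a word address (the kind held in CPP, LV or SP) to a byte address."""
    return word_address * WORD_SIZE

cpp = 0x100  # hypothetical CPP value, counted in words

first_constant = word_to_byte_address(cpp)       # 1st word of the constant pool
second_constant = word_to_byte_address(cpp + 1)  # 2nd word of the constant pool

# CPP+1 lands 4 bytes past CPP+0, not 1 byte past it.
print(second_constant - first_constant)
```

PC is the odd one out: it already counts bytes, so PC+1 really is the next byte of the instruction stream.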

Instruction set notes:

  • BIPUSH puts a 1-byte quantity on the stack, but the stack holds 4-byte words. BIPUSH widens the byte to a full word by sign extension, so the 3 extra bytes are just copies of the sign bit. This is implied in the example stack shown in Figure 4-15 on page 255.
  • The branch offset parameter is 2 bytes. It's added to the address of the branch instruction itself. Negative offsets are allowed, and a 2's complement representation is used.
  • Note that the varnum parameter is a 1 byte quantity that indicates the word offset from the local variable frame.
  • The index parameter of LDC_W is again a word offset from the CPP register.
  • Note that all memory references are made through the CPP, LV or SP registers, not directly like we have been doing.
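
The operand handling in the notes above can be sketched in a few lines of Python. The helper names and the sample address 100 are my own inventions for illustration; the sketch assumes the 2 offset bytes are stored high byte first and that the offset is relative to the address of the branch opcode.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a 2's complement number."""
    sign_bit = 1 << (bits - 1)
    value &= (1 << bits) - 1
    return (value ^ sign_bit) - sign_bit

def bipush_operand(byte):
    """Widen the 1-byte BIPUSH operand to a signed word value."""
    return sign_extend(byte, 8)

def branch_target(opcode_address, hi, lo):
    """Byte address reached by a branch whose 2 offset bytes are hi, lo."""
    offset = sign_extend((hi << 8) | lo, 16)
    return opcode_address + offset

print(bipush_operand(0x05))            # small positive byte stays 5
print(bipush_operand(0xFF))            # 0xFF is -1 once sign-extended
print(branch_target(100, 0x00, 0x0D))  # forward 13 bytes from a branch at 100
print(branch_target(100, 0xFF, 0xF3))  # backward 13 bytes (0xFFF3 is -13)
```

The same sign_extend helper covers both cases because the only difference between the BIPUSH byte and the branch offset is the width.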

Note that all the shenanigans in the book surrounding INVOKEVIRTUAL (can you say "call"... jeez) and IRETURN are just about implementing a stack pointer and a frame pointer that allow function parameters and local variables to be placed on the same stack. Heck, we already know how to do this manually in our IA-32 assembly coding, so this is easy.
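
Here is a simplified Python model of that idea, not the book's microcode. The function names, the placeholder word and the exact frame layout are assumptions made for illustration; the layout only loosely follows the book's scheme of saving the caller's PC and LV on the stack and finding them again through a link stored at the start of the frame. The Python list plays the role of the stack, with SP implied by its length.

```python
def invoke(stack, lv, return_pc, num_args, num_locals):
    """Enter a method. The caller has already pushed one placeholder word
    followed by num_args arguments. Returns the callee's LV."""
    new_lv = len(stack) - num_args - 1   # slot holding the placeholder
    stack.extend([0] * num_locals)       # room for the callee's locals
    stack[new_lv] = len(stack)           # link: index where the saved PC goes
    stack.append(return_pc)              # saved caller PC
    stack.append(lv)                     # saved caller LV
    return new_lv

def ireturn(stack, lv):
    """Leave a method whose return value is on top of the stack.
    Returns (caller_lv, return_pc)."""
    result = stack.pop()
    link = stack[lv]
    return_pc, caller_lv = stack[link], stack[link + 1]
    del stack[lv:]                       # drop args, locals and saved state
    stack.append(result)                 # result lands where the frame began
    return caller_lv, return_pc

# Hypothetical call of a method that returns a + b, with a=5 and b=7.
stack = [99]               # something the caller already had on its stack
caller_lv = 0
stack += [0, 5, 7]         # placeholder, then the two arguments
lv = invoke(stack, caller_lv, return_pc=0x40, num_args=2, num_locals=1)
stack.append(stack[lv + 1] + stack[lv + 2])   # callee body: push a + b
caller_lv, pc = ireturn(stack, lv)
print(stack, caller_lv, hex(pc))
```

Arguments and locals are both reached as offsets from LV, which is the same trick as addressing parameters and locals off the frame pointer in IA-32.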

Note the example in Figure 4-14 on page 254:

  • See how easy the encoding of instructions is.
  • See the variable offsets from LV being used, rather than actual memory references as in IA-32.
  • See the penalty for how simple an instruction set this is... 4 statements required to decrement j.
  • See the encoding for IF_ICMPEQ L1. Here "L1" is encoded as 0x00 0x0D, or 13. So the branch target is 13 bytes after the address of the IF_ICMPEQ instruction. You should be able to count this out. The same holds for GOTO L2. Try it.
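
To see why decrementing j costs 4 instructions, here is a toy Python interpreter for just those 4 opcodes. It is a sketch, not a full IJVM: the opcode values are the ones I read from Figure 4-11 (check them against the figure), and the slot number for j and its starting value are assumptions for the example.

```python
# Opcode values as listed in Figure 4-11 (verify against the text).
BIPUSH, ILOAD, ISTORE, ISUB = 0x10, 0x15, 0x36, 0x64

def run(code, local_vars):
    """Execute a straight-line byte sequence using only the 4 opcodes above."""
    stack = []
    pc = 0  # byte offset into code
    while pc < len(code):
        opcode = code[pc]
        if opcode == BIPUSH:
            byte = code[pc + 1]
            stack.append(byte - 256 if byte >= 128 else byte)  # sign-extend
            pc += 2
        elif opcode == ILOAD:
            stack.append(local_vars[code[pc + 1]])   # push local at LV + varnum
            pc += 2
        elif opcode == ISTORE:
            local_vars[code[pc + 1]] = stack.pop()   # pop into local at LV + varnum
            pc += 2
        elif opcode == ISUB:
            top = stack.pop()
            stack.append(stack.pop() - top)          # next-to-top minus top
            pc += 1
        else:
            raise ValueError(f"unsupported opcode 0x{opcode:02X}")
    return local_vars

J = 2  # assumed local-variable slot for j
code = bytes([ILOAD, J, BIPUSH, 1, ISUB, ISTORE, J])  # j = j - 1
local_vars = [0, 0, 10, 0]                            # j starts at 10
run(code, local_vars)
print(len(code), local_vars[J])
```

The whole decrement is 7 bytes of code, with no memory addresses anywhere in it, only small offsets from LV.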

  


Author: William Krieger, Nov 2005 CSC 220