6. Micro-architecture Speedup Notes
These notes elaborate on the speedups to the Mic-1 micro-architecture in
Chapter 4, section 4.4. These new, improved architectures are:
- Mic-2: improvements based on adding hardware, "The datapath for Mic-2", figure 4-29
on page 280
- Mic-3: a 4-stage pipeline, "The three bus datapath used in the Mic-3", figure 4-31
on page 284
- Mic-4: a 7-stage pipeline, "The main components of the Mic-4", figure 4-35
on page 289
General speed-ups
- Reduce the number of clock cycles per instruction
- Simplify the organization so that the clock cycle is shorter => faster clock
- Execute more than one instruction at a time using parallelism and pipelining
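The three levers above all act on the same product: instruction count times
average cycles per instruction times clock cycle time. The sketch below only
illustrates that arithmetic; the function name and every number in it are
made-up assumptions, not measurements of any Mic design.

```python
def execution_time_ns(instructions, cycles_per_instruction, cycle_time_ns):
    """Run time = instruction count x average cycles per instruction x cycle time."""
    return instructions * cycles_per_instruction * cycle_time_ns


# Hypothetical baseline: 1,000,000 instructions, 4 cycles each, 10 ns per cycle.
baseline = execution_time_ns(1_000_000, 4.0, 10.0)

# Lever 1: fewer clock cycles per instruction (4 -> 3), same clock.
fewer_cycles = execution_time_ns(1_000_000, 3.0, 10.0)

# Lever 2: a shorter clock cycle (10 ns -> 8 ns), same cycle count.
faster_clock = execution_time_ns(1_000_000, 4.0, 8.0)

print(baseline, fewer_cycles, faster_clock)
```

Pipelining and parallelism, the third lever, show up in this model as a lower
effective cycles-per-instruction figure, because several instructions overlap.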
Mic-2 speed-up: adding hardware
One easy speed-up in Mic-2 is the elimination of the Main1 micro-instruction. The
work of this "interpreter loop", in which Main1 increments PC and fetches the
next byte, can be folded into the end (or near the end) of most instructions.
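As an illustration of folding Main1 into an instruction, the sketch below
compares two micro-instruction sequences for POP. The sequences are written
from memory of the textbook's POP example, so treat the exact wording of each
micro-instruction as an assumption and check it against the book. The point is
only that the idle wait cycle can absorb the work Main1 used to do.

```python
# Separate interpreter loop: POP takes its own three steps, then Main1.
SEPARATE_MAIN1_POP = [
    "pop1: MAR = SP = SP - 1; rd",
    "pop2: (idle, waiting for the memory read)",
    "pop3: TOS = MDR; goto Main1",
    "Main1: PC = PC + 1; fetch; goto (MBR)",
]

# Merged version: the idle cycle now does Main1's increment and fetch.
MERGED_MAIN1_POP = [
    "pop1: MAR = SP = SP - 1; rd",
    "Main1.pop: PC = PC + 1; fetch",
    "pop3: TOS = MDR; goto (MBR)",
]


def cycles(sequence):
    """Each micro-instruction takes one clock cycle in this simple model."""
    return len(sequence)


saved = cycles(SEPARATE_MAIN1_POP) - cycles(MERGED_MAIN1_POP)
print("cycles saved per POP:", saved)
```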
The Mic-2 design adds the following components:
- Add a third bus so that we don't have to shuffle operands through the
H register, which can cost an extra cycle.
- Add an Instruction Fetch Unit (IFU) so that we aren't waiting for
instructions and operands to be fetched from memory.
  - Concept: get the next few bytes before you need them
  - Done in parallel so there's no penalty for doing this
  - PC is incremented by the IFU, outside the datapath, which frees the
    datapath for other work
  - See Figure 4-27 on page 277
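The prefetching idea can be sketched as a small queue that fills in the
background while the datapath consumes bytes from the front. This is a toy
model, not the IFU from the figure: the class name, the queue depth, and the
one-byte-per-tick fetch are all simplifying assumptions, and the sample bytes
are just placeholder program data.

```python
from collections import deque


class PrefetchUnit:
    """Toy instruction fetch unit: keeps a few bytes queued ahead of the consumer."""

    def __init__(self, memory, depth=4):
        self.memory = memory      # program bytes
        self.depth = depth        # hypothetical queue capacity
        self.next_addr = 0        # address of the next byte to prefetch
        self.pc = 0               # count of bytes handed to the datapath
        self.queue = deque()

    def tick(self):
        """Background work for one clock cycle: prefetch one byte if there is room."""
        if len(self.queue) < self.depth and self.next_addr < len(self.memory):
            self.queue.append(self.memory[self.next_addr])
            self.next_addr += 1

    def consume(self):
        """Give the datapath the next byte, or None if it has not arrived yet."""
        if not self.queue:
            return None
        self.pc += 1              # PC advances here, not in the ALU
        return self.queue.popleft()


ifu = PrefetchUnit(bytes([0x10, 0x05, 0x10, 0x03, 0x60]))
ifu.tick()
ifu.tick()                        # two bytes are now waiting in the queue
first = ifu.consume()
print(hex(first), "queued:", len(ifu.queue), "pc:", ifu.pc)
```

Because `tick` runs independently of `consume`, the datapath only stalls when
the queue is empty, which is the "no penalty" case the notes describe.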
Mic-3 speed-up: a 4-stage pipeline
We can introduce parallelism by adding a latch that holds the data for each
bus: A, B, and C. The impact of this is:
- Slices the datapath into 3 micro-steps:
  - Load busses A, B
  - Perform ALU, shift operations
  - Write registers from bus C
- Each micro-step is faster, so we can increase the clock speed
- Each micro-step is isolated (by latches and registers), so we can execute
them independently
The 4th stage of the pipeline is the already-present IFU.
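To see why isolated stages help, the sketch below prints which operation
occupies which stage on every clock cycle. It assumes an ideal pipeline with no
stalls (real micro-instructions can depend on each other's results, which this
ignores), and the stage labels are paraphrases of the list above rather than
names from the textbook.

```python
STAGES = ["IFU fetch", "load A/B", "ALU and shift", "write from C"]


def pipeline_schedule(n_ops, stages=STAGES):
    """Per clock cycle, list which op index sits in each stage (None = empty)."""
    depth = len(stages)
    total_cycles = n_ops + depth - 1      # fill time plus one op finishing per cycle
    schedule = []
    for cycle in range(total_cycles):
        row = []
        for stage_index in range(depth):
            op = cycle - stage_index      # op that entered the pipe stage_index cycles ago
            row.append(op if 0 <= op < n_ops else None)
        schedule.append(row)
    return schedule


sched = pipeline_schedule(3)
for cycle, row in enumerate(sched):
    print(cycle, row)
```

With 3 operations and 4 stages the schedule has 3 + 4 - 1 = 6 rows, compared
with 3 x 4 = 12 cycles if each operation had to finish all stages before the
next one started.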
Mic-4 speed-up: A 7-stage pipeline
The change in Mic-4 is introducing parallelism to micro-instruction fetching.
This is not very different conceptually from the IFU and its fetching of IJVM
instructions. The key points are:
- Micro-instructions in the ROM must be in order... no more jumping around
unless you're done with all the micro-instructions for an IJVM instruction.
- All the micro-instructions for the current IJVM instruction are loaded into
a queue immediately when the IJVM instruction begins.
- The control words for the micro-instructions are loaded into MIR1, then
MIR2, then MIR3, then MIR4 as the micro-instruction works its way through
the datapath.
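The movement of control words through MIR1 to MIR4 behaves like a shift
register fed from the queue. The sketch below is a toy model under that
assumption: the class name, the placeholder words "u1" to "u3", and the
one-shift-per-clock timing are illustrative and not taken from the textbook.

```python
from collections import deque


class MirChain:
    """Toy model: control words shift MIR1 -> MIR2 -> MIR3 -> MIR4 once per clock."""

    def __init__(self):
        self.mir = [None, None, None, None]   # index 0 is MIR1, index 3 is MIR4

    def clock(self, new_word=None):
        """Advance one cycle: new_word enters MIR1, the word in MIR4 leaves."""
        leaving = self.mir[3]
        self.mir = [new_word] + self.mir[:3]
        return leaving


# Hypothetical queue holding the micro-instructions of one IJVM instruction.
queue = deque(["u1", "u2", "u3"])
chain = MirChain()
trace = []       # snapshot of MIR1..MIR4 after every clock
finished = []    # words that have passed through all four registers

while queue or any(word is not None for word in chain.mir):
    leaving = chain.clock(queue.popleft() if queue else None)
    trace.append(list(chain.mir))
    if leaving is not None:
        finished.append(leaving)

for cycle, snapshot in enumerate(trace):
    print(cycle, snapshot)
```

After the third clock all three words are in flight at once, one per register,
which is the overlap that the in-order queue makes possible.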