-
Taxonomy of ISA:
The Instruction Set:
is a critical interface between the software and the hardware
-
Taxonomy of ISA:
Stack:
both operands are implicit on the top of the stack, a data structure in which items are accessed in a last-in, first-out fashion.
-
Taxonomy of ISA:
Accumulator:
one operand is implicit in the accumulator, a special-purpose register.
-
Taxonomy of ISA:
General Purpose Register:
all operands are explicit in specified registers or memory locations. Depending on where operands are specified and stored, there are three different ISA groups.
-
The three different ISA groups:
o Register-Memory: one operand in a register and one in memory. Examples: IBM 360/370, Intel 80x86 family, Motorola 68000.
o Memory-Memory: both operands are in memory. Example: VAX.
o Register-Register (load & store): all operands, except for those in load and store instructions, are registers. Examples: SPARC (Sun Microsystems), MIPS, Precision Architecture (HP), PowerPC (IBM), Alpha (DEC), ARM (Apple, Nvidia, etc.).
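
To make the taxonomy concrete, here is a minimal sketch of how C = A + B might look on three of the styles above (stack, accumulator, and register-register). The mini-machines and their mnemonics are hypothetical and chosen only for illustration; they do not model any real ISA.

```python
# Hypothetical mini-machines for illustration only; mnemonics are not a real ISA.

def run_stack(program, memory):
    """Stack ISA: operands are implicit on the top of the stack."""
    stack = []
    for op, *args in program:
        if op == "push":
            stack.append(memory[args[0]])
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "pop":
            memory[args[0]] = stack.pop()
    return memory

def run_accumulator(program, memory):
    """Accumulator ISA: one operand is implicit in the accumulator."""
    acc = 0
    for op, addr in program:
        if op == "load":
            acc = memory[addr]
        elif op == "add":
            acc += memory[addr]
        elif op == "store":
            memory[addr] = acc
    return memory

def run_load_store(program, memory):
    """Register-register ISA: only lw/sw touch memory; add uses explicit registers."""
    regs = {}
    for op, *args in program:
        if op == "lw":
            regs[args[0]] = memory[args[1]]
        elif op == "add":
            dst, src1, src2 = args
            regs[dst] = regs[src1] + regs[src2]
        elif op == "sw":
            memory[args[1]] = regs[args[0]]
    return memory

stack_prog = [("push", "A"), ("push", "B"), ("add",), ("pop", "C")]
acc_prog = [("load", "A"), ("add", "B"), ("store", "C")]
ls_prog = [("lw", "r1", "A"), ("lw", "r2", "B"),
           ("add", "r3", "r1", "r2"), ("sw", "r3", "C")]

MACHINES = [("stack", run_stack, stack_prog),
            ("accumulator", run_accumulator, acc_prog),
            ("load-store", run_load_store, ls_prog)]

for name, run, prog in MACHINES:
    mem = run(prog, {"A": 2, "B": 3, "C": 0})
    print(f"{name}: {len(prog)} instructions, C = {mem['C']}")
```

Note how the operands move from implicit (stack, accumulator) to fully explicit (register-register) as you go down the list.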
-
Java Virtual Machine:
o The JVM is a stack machine; there are no pointers for local variables or constants. This was crucial in the “safe” design that allows someone else’s program to run on your computer. All addresses have to be specified as an offset from one of these regions.
o All JVM instructions have a 1-byte opcode.
o The JVM has addressing modes: immediate, indexed, and stack.
-
Motorola 68HC11:
The Motorola 68HC11 employs an Accumulator ISA
-
MIPS32:
MIPS32 employs a Register to Register ISA
-
ISA Principles about Pipelining:
- o Register-Register: RISC (ARM, MIPS, etc.)
- Load-store machine with no memory references outside load and store instructions
- Easy to pipeline, higher IC (instruction count)
- o Register-Memory: IBM 360
- Harder to pipeline, but reduces IC
- o Memory-Memory: VAX
- Hardest to pipeline, lowest IC (most compact code)
-
The Processor: Datapath Control:
Simplified to contain only:
- Memory-reference instructions: lw, sw
- Arithmetic-logical instructions: add, sub, and, or, slt
- Control flow instructions: beq, j
-
The Processor: Datapath Control:
Generic Implementation:
- Use the program counter (PC) to supply instruction address
- Get the instruction from memory
- Read registers
- Use the instruction to decide exactly what to do
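
The four steps above can be sketched as a small single-cycle interpreter for the simplified subset (lw, sw, add, sub, beq, j). This is a minimal sketch under stated assumptions: instructions are Python tuples rather than 32-bit words, the PC and data memory are word-indexed, and the function names are hypothetical.

```python
def step(pc, instr_mem, regs, data_mem):
    """Execute one instruction and return the next PC (word-indexed, an assumption)."""
    op, *fields = instr_mem[pc]      # PC supplies the address; fetch the instruction
    next_pc = pc + 1
    if op in ("add", "sub"):         # fields: rd, rs, rt
        rd, rs, rt = fields
        a, b = regs[rs], regs[rt]    # read registers
        regs[rd] = a + b if op == "add" else a - b   # ALU: operation execution
    elif op == "lw":                 # fields: rt, offset, rs
        rt, offset, rs = fields
        regs[rt] = data_mem[regs[rs] + offset]       # ALU: address calculation
    elif op == "sw":                 # fields: rt, offset, rs
        rt, offset, rs = fields
        data_mem[regs[rs] + offset] = regs[rt]       # ALU: address calculation
    elif op == "beq":                # fields: rs, rt, target
        rs, rt, target = fields
        if regs[rs] - regs[rt] == 0:                 # ALU: comparison
            next_pc = target
    elif op == "j":                  # fields: target
        next_pc = fields[0]
    return next_pc

def run_single_cycle(instr_mem, regs, data_mem):
    """Run until the PC falls off the end of the program."""
    pc = 0
    while pc < len(instr_mem):
        pc = step(pc, instr_mem, regs, data_mem)
    return regs, data_mem

program = [
    ("lw", 1, 0, 0),    # r1 = mem[r0 + 0]
    ("lw", 2, 1, 0),    # r2 = mem[r0 + 1]
    ("add", 3, 1, 2),   # r3 = r1 + r2
    ("sw", 3, 2, 0),    # mem[r0 + 2] = r3
]
regs, data = run_single_cycle(program, [0] * 8, {0: 4, 1: 6, 2: 0})
print(data[2])
```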
-
The Processor: Datapath Control
All instructions except jump use the ALU after reading the registers. Why?
- Memory-reference: address calculations (lw, sw)
- Arithmetic: operation execution (add, sub)
- Control flow: comparison (beq)
-
The Processor: Datapath Control
Single Cycle Problems:
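Wasteful of time, since the clock cycle is determined by the execution delay of the slowest instruction, and wasteful of area, since any functional unit needed more than once per cycle must be duplicated.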
-
The Processor: Datapath Control:
Multicycle Approach:
A “smaller” cycle time, with different instructions taking different numbers of cycles.
Functional units are reused: the ALU both computes addresses and increments the PC, and a single memory holds both instructions and data.
-
Pipeline motivation:
- · Need both low CPI and high frequency for best performance (a multicycle design gives high frequency, but needs a better CPI)
- · Like a factory assembly line: each task is called a stage, and the time period of a stage is one clock cycle (the cycle time)
-
MIPS 5-stage pipeline:
- · 5 stages for each instruction
- o IF: instruction fetch
- o ID: instruction decode and register file read
- o EX: instruction execution or effective address calculation
- o MEM: memory access for load and store
- o WB: write back results to register file
- · Delays of all 5 stages are roughly the same
- · Staging registers are used to hold data and control as instructions pass between stages
- · All instructions pass through all 5 stages
- · As an instruction leaves a stage in a particular clock period, the next instruction enters it
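
The overlap described above can be sketched as a cycle-by-cycle diagram. This is a minimal sketch of an ideal pipeline with no hazards or stalls; the function name is hypothetical.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]

def pipeline_diagram(num_instructions):
    """Return rows where rows[i][c] is the stage instruction i occupies in cycle c.

    Assumes an ideal pipeline: one instruction enters per cycle, no stalls.
    """
    total_cycles = num_instructions + len(STAGES) - 1
    rows = []
    for i in range(num_instructions):
        row = [""] * total_cycles
        for s, stage in enumerate(STAGES):
            row[i + s] = stage   # instruction i enters stage s in cycle i + s
        rows.append(row)
    return rows

for i, row in enumerate(pipeline_diagram(3)):
    print(f"I{i}: " + " ".join(f"{cell:>3}" for cell in row))
```

Each column is one clock cycle: as instruction I0 leaves IF for ID, I1 enters IF, and so on down the diagonal.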
-
WAW and WAR hazards:
CANNOT OCCUR in the MIPS 5-stage pipeline, because registers are read in ID and written only in WB, in program order.
-
Register results in WB:
WE NEED to write the register results back in the WB stage even if we adopt the forwarding technique.
-
The pipelining technique:
INCREASES per-instruction execution latency; however, it improves throughput by allowing several instructions to be in execution at the same time.
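
The latency/throughput trade-off can be checked with a little arithmetic. The stage delays and the staging-register overhead below are hypothetical numbers picked for illustration, and the comparison is against a single-cycle design; this is a sketch, not a measurement of any real processor.

```python
STAGE_DELAYS_PS = [200, 100, 200, 200, 100]  # hypothetical IF, ID, EX, MEM, WB delays
REGISTER_OVERHEAD_PS = 20                    # hypothetical staging-register delay

def single_cycle_time(n):
    """Total time for n instructions when each takes one long clock cycle."""
    return n * sum(STAGE_DELAYS_PS)

def pipelined_clock():
    """The pipelined clock must fit the slowest stage plus the register overhead."""
    return max(STAGE_DELAYS_PS) + REGISTER_OVERHEAD_PS

def pipelined_time(n):
    """Total time for n instructions in an ideal pipeline (fill time included)."""
    return (n + len(STAGE_DELAYS_PS) - 1) * pipelined_clock()

latency_single = sum(STAGE_DELAYS_PS)
latency_pipelined = len(STAGE_DELAYS_PS) * pipelined_clock()

print("latency per instruction (ps):", latency_single, "vs", latency_pipelined)
print("time for 1000 instructions (ps):",
      single_cycle_time(1000), "vs", pipelined_time(1000))
```

Under these assumed numbers each instruction takes longer from start to finish in the pipeline, yet a long run of instructions completes in far less total time.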
-
How the MIPS ISA simplifies pipelining:
Fixed length instruction simplifies:
- Fetch – just get the next 32 bits
- Decode – single step; don’t have to decode opcode before figuring out where to get the rest of the fields
-
How the MIPS ISA simplifies pipelining:
Source register fields always in the same location:
Can read source registers during decode
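
A minimal sketch of why fixed-length instructions with fixed field positions help: every field can be extracted with a shift and a mask before the opcode is known. The bit positions follow the standard MIPS32 R-type/I-type layout; the `decode` helper itself is hypothetical.

```python
def decode(word):
    """Split a 32-bit MIPS instruction word into its fields.

    rs and rt sit at the same bit positions in every format, so the register
    file can be read in parallel with opcode decoding.
    """
    return {
        "opcode": (word >> 26) & 0x3F,  # bits 31..26
        "rs":     (word >> 21) & 0x1F,  # bits 25..21
        "rt":     (word >> 16) & 0x1F,  # bits 20..16
        "rd":     (word >> 11) & 0x1F,  # bits 15..11 (R-type only)
        "funct":  word & 0x3F,          # bits 5..0   (R-type only)
        "imm":    word & 0xFFFF,        # bits 15..0  (I-type only)
    }

add_fields = decode(0x02324020)  # add $t0, $s1, $s2
lw_fields = decode(0x8E680020)   # lw  $t0, 32($s3)
print(add_fields)
print(lw_fields)
```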
-
How the MIPS ISA simplifies pipelining:
Load/store architecture:
ALU can be used for both arithmetic and EA calculation
Memory instructions require about the same amount of work as arithmetic ones, easing pipelining of the two together
-
How the MIPS ISA simplifies pipelining:
Memory data must be aligned:
Read or write accesses can be done in one cycle
-
Pipeline hazards:
- A hazard is a conflict, regarding data, control, or hardware resources
- Data hazards are conflicts for register values
- Control hazards occur due to the delay in executing branch and jump instructions
- Structural hazards are conflicts for hardware resources, such as
- *A single memory for instruction and data
- *A multi-cycle, non-pipelined functional unit (such as a divider)
-
Data dependences:
- A read after write (RAW) dependence occurs when the register written by an instruction is a source register of a subsequent instruction.
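
The definition above can be turned into a small checker. This is a minimal sketch: each instruction is modeled as a hypothetical `(dest, sources)` pair, and the function reports which earlier instruction most recently wrote each source register.

```python
def raw_dependences(program):
    """Return (producer_index, consumer_index, register) triples.

    program is a list of (dest, sources) pairs; dest may be None for
    instructions that write no register (a modeling assumption).
    """
    deps = []
    for j, (_, sources) in enumerate(program):
        for reg in dict.fromkeys(sources):      # de-duplicate, keep order
            for i in range(j - 1, -1, -1):      # most recent earlier writer
                if program[i][0] == reg:
                    deps.append((i, j, reg))
                    break
    return deps

program = [
    ("$1", ["$2", "$3"]),   # add $1, $2, $3
    ("$4", ["$1", "$5"]),   # sub $4, $1, $5
    ("$6", ["$1", "$7"]),   # and $6, $1, $7
]
print(raw_dependences(program))
```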
-
Eliminate RAW hazards:
- by forwarding.
-
Stalling the stage behind the load:
- Force a nop (“no operation”) instruction into the EX stage on the next clock cycle
- Hold instructions in ID and IF stages for one clock cycle
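
The stall rule above can be sketched as a pass that inserts a bubble after any load whose result is needed by the very next instruction. This is a minimal sketch: the instruction dictionaries and the `ex_sources` field (registers the instruction needs at the start of EX) are hypothetical modeling choices, not a description of real hazard-detection hardware.

```python
NOP = {"op": "nop", "dest": None, "ex_sources": []}

def insert_load_use_stalls(program):
    """Return a copy of program with a nop after each load-use pair.

    ex_sources lists only registers needed at the start of EX; a register a
    store needs only later, in MEM, would not be listed here (an assumption).
    """
    out = []
    for instr in program:
        prev = out[-1] if out else None
        if prev and prev["op"] == "lw" and prev["dest"] in instr["ex_sources"]:
            out.append(dict(NOP))   # bubble in EX; the dependent instruction waits
        out.append(instr)
    return out

program = [
    {"op": "lw",  "dest": "$2", "ex_sources": ["$1"]},        # lw  $2, 20($1)
    {"op": "and", "dest": "$4", "ex_sources": ["$2", "$5"]},  # and $4, $2, $5
    {"op": "or",  "dest": "$8", "ex_sources": ["$2", "$6"]},  # or  $8, $2, $6
]
for instr in insert_load_use_stalls(program):
    print(instr["op"])
```

Only the instruction immediately after the load needs the bubble; one cycle later the loaded value can be forwarded.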
-
Load-Store:
- Load: produces results in MEM – can forward to an immediately following store instruction
- Goal: avoid a stall
- Solution: add forwarding into the memory access stage of store instruction