Documenti di Didattica
Documenti di Professioni
Documenti di Cultura
Pipelining
CS6133 - Computer Architecture I
Vikram Padman
Polytechnic Institute of New York University
vikram@poly.edu
1 / 25
Agenda
Pipelining Vikram Padman Agenda Introduction Simple CPU Pipelined CPU Hazards 5 Activity 3 4 1 2
Instruction fetch IF Instruction decode/register fetch ID Execution/eective address cycle EX Memory Access MEM Write-back cycle WB
Activity
2 / 25
What is Pipelining?
Pipelining Vikram Padman Agenda Introduction Simple CPU Pipelined CPU Hazards Activity
3 / 25
Pipelined CPU
Pipelining Vikram Padman Agenda Introduction Simple CPU Pipelined CPU Hazards Activity 2
Pipelining is an implementation technique whereby multiple instructions are overlapped in execution. Case for pipelining a CPU:
1
An instruction is executed by many stages within a CPU, sequentially. In an unpiplined CPU only one stage is active at any given clock cycle. Pipelining increases CPUs eciency dramatically by executing subsequent instruction at every clock cycle. In a pipelined CPU every stage could be active every clock cycle.
4 / 25
Simple CPU
Un-Pipelined
Pipelining Vikram Padman Agenda Introduction Simple CPU Pipelined CPU Hazards Activity
6 / 25
The Control Unit decodes the instruction The Register le reads source operands as specied in the instruction Sign-extend the oset eld
7 / 25
Execution Stage EX
Pipelining Vikram Padman Agenda Introduction Simple CPU
Execution Stages Performance
Calculate new PC by adding sign-extended oset to current PC, in case of branch instruction The ALU performs one of the following operation:
1
Memory reference - ALU adds the base register and sign-extended oset to form the eective address Register-Register - ALU operates on operands as specied by ALU-Opcode Register-Immediate - ALU uses the rst operand and sign-extended oset to perform operation specied by ALU-Opcode
8 / 25
Memory MEM
Pipelining Vikram Padman Agenda Introduction Simple CPU
Execution Stages Performance
The memory either write or output data from the eective address calculated in the previous stage For a store instruction causes the memoru to store data present in write data
9 / 25
Write Back WB
Pipelining Vikram Padman Agenda Introduction Simple CPU
Execution Stages Performance
Data results from R-type (ALU) or load instruction is written back to register le.
10 / 25
11 / 25
12 / 25
Pipelined CPU
Pipelining Vikram Padman Agenda Introduction Simple CPU Pipelined CPU
Performance
Hazards Activity
Hazards Activity
Hazards
Pipelining Vikram Padman Agenda Introduction 1 Simple CPU Pipelined CPU Hazards
Structural Data Control
Hazards are conditions that prevents the execution of an instruction in its designated clock cycle. Hazards could reduces the performance gained from pipeling and could be categorized into three major types:
2 3
Structural hazards arise due to resource conicts Data hazards due to overlapping instructions Control hazards due to ow altering instructions (Branch/Jump)
Activity
Stalls or NOPs are injected into the pipeline to mitigate hazards Compilers re-organize the instruction stream to mitigate hazards Finally, the pipeline by itself could be re-organized or redesigned with additional hardware to prevent hazards and stalls
16 / 25
Hazards
Structural Hazards
Pipelining Vikram Padman Agenda Introduction Simple CPU Pipelined CPU Hazards
Structural Data Control
ALU Instruction 1 Mem Reg Mem Reg Load Time (in clock cycles) CC 1 CC 2 CC 3 CC 4 CC 5 CC 6 CC 7 CC 8
Typically caused due to resource constraints in a CPU that is not fully pipelined. Assume there was only one memory, instead of two:
Mem
ALU
Reg
Mem
Reg
Activity
Instruction 2 Mem Reg ALU Mem Reg
Instruction 3
Mem
Reg
ALU
Mem
Reg
Instruction 4
Mem
Reg
ALU
Mem
Hazards
Structural Hazards
Pipelining Vikram Padman Agenda Introduction Simple CPU Pipelined CPU Hazards
Structural Data Control
Structural Hazards are generally xed by adding or modifying components in a CPU In situations were power, size and/or application requirements prevent additional hardware. Software assistance is necessary to inject NOPs or re-order instructions to prevent structural hazards.
Activity
18 / 25
Hazards
Data Hazards
Pipelining Vikram Padman Agenda Introduction Simple CPU Pipelined CPU Hazards
Structural Data Control
Data Hazards occurs when there are data dependencies within instructions that are in the pipeline. For example:
1 2
Activity
19 / 25
Hazards
Data Hazards
Pipelining Vikram Padman Agenda Introduction Simple CPU 1 Pipelined CPU Hazards
Structural Data Control
Data Hazards could be solved by: Forwarding A hardware module that allows data to bypass some modules. Stalls/NOPs Injecting NOPs into the pipeline Instruction Reordering Either a compiler or CPU itself re-order instruction to mitigate data hazards
2 3
Activity
20 / 25
Hazards
Data Hazards
Pipelining Vikram Padman Agenda Introduction Simple CPU Pipelined CPU Hazards
Structural Data Control
Forwarding :
Stall/NOPs :
Activity
Hazards
Data Hazards
Pipelining Vikram Padman Agenda Introduction Simple CPU Pipelined CPU Hazards
Structural Data Control
Instruction Reordering :
1 2 3 1 2 3
lw $s0, 20($t1) sub $t2, $s0, $t3 add $s5, $s8, $t6 lw $s0, 20($t1) add $s5, $s8, $t6 sub $t2, $s0, $t3
Activity
22 / 25
Hazards
Data Hazards
Pipelining Vikram Padman Agenda Introduction Simple CPU Pipelined CPU Hazards
Structural Data Control
Activity
Hazards
Control Hazards
Pipelining Vikram Padman Agenda Introduction Simple CPU Pipelined CPU Hazards
Structural Data Control
Control Hazards are more complex and expensive than the previous once. Expensive in terms of performance penalties it causes and implementation complicity to mitigate control hazards. This hazard occurs from the fact that a branch or Jump depends on the result of another instruction
Activity
Week 10 Activity 1
Pipelining Vikram Padman Agenda Introduction Simple CPU Pipelined CPU Hazards Activity 2 3 1
Read section 2 in Appendix C(Ed. 5) or A (Ed. 4) in Computer Architecture - A Quantitative Approach and answer the following:
1
How could the eects of control hazards be minimized by reordering instructions? Describe static and dynamic branch predication techniques. How could a compiler inuence the branch predication hardware in a CPU?
25 / 25