Single-Cycle Processors:
Arvind
M.I.T.
6.823 L5- 2
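Processor Performance

$$\frac{\text{Time}}{\text{Program}} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Cycles}}{\text{Instruction}} \times \frac{\text{Time}}{\text{Cycle}}$$

Microarchitecture          CPI   cycle time
Microcoded                 >1    short
Single-cycle unpipelined   1     long
Pipelined                  1     short

This lecture: the single-cycle unpipelined design, then the pipelined one.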
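To see how the three factors trade off, here is a minimal Python sketch of the performance equation. The instruction count, the CPI of 4 for the microcoded design and all cycle times are made-up illustrative numbers, not figures from the lecture.

```python
def execution_time(instructions, cpi, cycle_time):
    """Time/program = instructions/program * cycles/instruction * time/cycle."""
    return instructions * cpi * cycle_time


# Hypothetical workload and design points (assumptions, for illustration only).
N = 1_000_000
designs = {
    "microcoded": execution_time(N, cpi=4, cycle_time=1.0),    # CPI > 1, short cycle
    "single-cycle": execution_time(N, cpi=1, cycle_time=5.0),  # CPI = 1, long cycle
    "pipelined": execution_time(N, cpi=1, cycle_time=1.0),     # CPI = 1, short cycle
}

for name, t in designs.items():
    print(f"{name:>12}: {t:.0f} time units")
```

With these particular numbers the pipelined design wins because it keeps CPI at 1 without paying for the long cycle.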
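Microarchitecture: Implementation of an ISA
[Figure: a controller and a data path. The controller drives the control points of the data path; the data path reports back to the controller over status lines.]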
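Hardware Elements
Combinational circuits: Mux, Demux, Decoder, ALU
[Figure: a mux and a demux, each steered by lg(n) Sel bits; a decoder with an lg(n)-bit input A and outputs O0, O1, ..., On-1; an ALU with an OpSelect input and outputs Result and Comp?.]
[Figure: a flip-flop (ff) with input D, output Q, enable En and clock Clk; a register built from n such flip-flops sharing En and Clk, with inputs D0 ... Dn-1 and outputs Q0 ... Qn-1.]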
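Register Files
[Figure: a register file with two read ports and one write port (2R+1W). Inputs: Clock, WE (we), ReadSel1 (rs1), ReadSel2 (rs2), WriteSel (ws), WriteData (wd). Outputs: ReadData1 (rd1), ReadData2 (rd2).]
[Figure: implementation with 32 registers (register 0 ... register 31), each 32 bits wide. ws and we select the register that latches wd on clk; rs1 and rs2 select which registers drive rd1 and rd2.]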
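The 2R+1W interface can be sketched behaviourally in Python. This is an illustrative model only: it assumes reads are combinational and that the write takes effect on a clock edge, and the class and method names are hypothetical.

```python
class RegisterFile:
    """32 x 32-bit register file with two read ports and one write port (2R+1W)."""

    def __init__(self, n_regs=32, width=32):
        self.mask = (1 << width) - 1
        self.regs = [0] * n_regs

    def read(self, rs1, rs2):
        """Read ports (assumed combinational): returns (rd1, rd2)."""
        return self.regs[rs1], self.regs[rs2]

    def clock_edge(self, we, ws, wd):
        """Write port: on a clock edge, store wd into register ws if we is set."""
        if we:
            self.regs[ws] = wd & self.mask


rf = RegisterFile()
rf.clock_edge(we=True, ws=5, wd=0x1_0000_0001)  # value is truncated to 32 bits
rf.clock_edge(we=False, ws=5, wd=7)             # ignored: write enable is low
print(rf.read(5, 0))
```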
[Figure: the MAGIC RAM block. Inputs: Clock, WriteEnable, Address, WriteData. Output: ReadData.]
Implementing MIPS:
Processor State
Data types
Instruction Execution
Execution of an instruction involves
1. instruction fetch
2. decode and register fetch
3. ALU operation
4. memory operation (optional)
5. write back
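[Figure: datapath for register-register ALU instructions. The PC (clk) drives addr of the instruction memory, and an adder computes PC + 0x4. inst<25:21> and inst<20:16> drive rs1 and rs2 of the GPRs, and inst<15:11> drives ws; rd1 and rd2 feed the ALU, whose result returns to wd. ALU Control derives the ALU's OpCode from inst<5:0>; RegWrite drives we.]

bits     31:26   25:21   20:16   15:11   10:6   5:0
width    6       5       5       5       5      6
field    0       rs      rt      rd             func

RegWrite Timing?
September 26, 2005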
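The field boundaries of the register-register format map directly onto shifts and masks. The following Python sketch extracts them from a 32-bit word; the function name and the example word are hypothetical, and the func value 0x20 is an arbitrary placeholder.

```python
def decode_r_type(inst):
    """Split a 32-bit instruction word into the register-register format fields."""
    return {
        "opcode": (inst >> 26) & 0x3F,  # inst<31:26>
        "rs": (inst >> 21) & 0x1F,      # inst<25:21>
        "rt": (inst >> 16) & 0x1F,      # inst<20:16>
        "rd": (inst >> 11) & 0x1F,      # inst<15:11>
        "func": inst & 0x3F,            # inst<5:0>
    }


# Example word: opcode 0, rs = 1, rt = 2, rd = 3, func = 0x20 (arbitrary).
word = (0 << 26) | (1 << 21) | (2 << 16) | (3 << 11) | 0x20
fields = decode_r_type(word)
print(fields)
```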
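[Figure: datapath for register-immediate ALU instructions. The PC (clk) drives addr of the instruction memory, and an adder computes PC + 0x4. inst<25:21> drives rs1 and inst<20:16> drives ws; inst<15:0> passes through Imm Ext (controlled by ExtSel) to the second ALU input. ALU Control derives OpCode from inst<31:26>; RegWrite drives we.]

bits     31:26    25:21   20:16   15:0
width    6        5       5       16
field    opcode   rs      rt      immediate

rt ← (rs) op immediate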
[Figure: the register-register and register-immediate datapaths overlaid. inst<20:16> (rt) and inst<15:11> (rd) both need to reach ws, and both rd2 and the Imm Ext output need to reach the second ALU input; ALU Control takes inst<31:26> and inst<5:0>. Annotation: Introduce muxes.]

width    6        5    5    5    5    6
field    0        rs   rt   rd        func

width    6        5    5    16
field    opcode   rs   rt   immediate
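[Figure: merged datapath for ALU instructions. A RegDst mux selects rt or rd (inst<20:16> or inst<15:11>) as ws, and a BSrc mux selects Reg (rd2) or Imm (the Imm Ext output for inst<15:0>) as the second ALU input. ALU Control takes inst<31:26> and inst<5:0> and produces OpCode. Control signals shown: RegWrite, RegDst, ExtSel, OpSel, BSrc.]

width    6        5    5    5    5    6
field    0        rs   rt   rd        func

width    6        5    5    16
field    opcode   rs   rt   immediate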
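The two muxes can be modelled as plain selections. The sketch below is a hypothetical Python model, not the lecture's hardware description: the function name is invented, and the register contents are arbitrary test values.

```python
def select_alu_inputs(regs, rs, rt, rd, imm_ext, reg_dst, b_src):
    """Model the RegDst and BSrc muxes of the merged ALU datapath."""
    a = regs[rs]                                 # first ALU input (rd1)
    b = regs[rt] if b_src == "Reg" else imm_ext  # BSrc mux: Reg / Imm
    ws = rd if reg_dst == "rd" else rt           # RegDst mux: rt / rd
    return a, b, ws


regs = list(range(100, 132))  # arbitrary contents: register i holds 100 + i

# Register-register instruction: second operand from rt, result goes to rd.
r_type = select_alu_inputs(regs, rs=1, rt=2, rd=3, imm_ext=7, reg_dst="rd", b_src="Reg")
# Register-immediate instruction: second operand is the immediate, result goes to rt.
i_type = select_alu_inputs(regs, rs=1, rt=2, rd=3, imm_ext=7, reg_dst="rt", b_src="Imm")
print(r_type, i_type)
```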
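[Figure: datapath for load and store instructions. The ALU adds the base register value (rd1) and the extended displacement (disp) to form addr of the data memory; rd2 supplies wdata, and MemWrite drives the data memory's we. A WBSrc mux selects ALU or Mem (rdata) as the register write data wd. Control signals shown: RegWrite, MemWrite, WBSrc, RegDst, ExtSel, OpSel, BSrc.]

bits     31:26    25:21   20:16   15:0
width    6        5       5       16
field    opcode   rs      rt      displacement

addressing mode: (rs) + displacement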
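The addressing mode (rs) + displacement amounts to one sign extension and one 32-bit addition. A small Python sketch follows; the helper names are hypothetical, and the displacement is assumed to be sign-extended, as the sExt16 entries for LW and SW in the control table indicate.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def effective_address(base, displacement16):
    """Address = (rs) + sign-extended 16-bit displacement, wrapped to 32 bits."""
    return (base + sign_extend16(displacement16)) & MASK32


print(hex(effective_address(0x1000, 0x0008)))  # positive displacement
print(hex(effective_address(0x1000, 0xFFFC)))  # 0xFFFC encodes -4
```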
Control instruction formats:

width    6        5    5    16
field    opcode   rs        offset      BEQZ, BNEZ
field    opcode   rs                    JR, JALR

width    6        26
field    opcode   target                J, JAL
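The three formats correspond to three ways of forming the next PC. The Python sketch below uses the PCSrc labels that appear in the control table of these slides (pc+4, br, jabs, rind). The target arithmetic is the usual MIPS convention and is stated here as an assumption, since the slide text does not spell it out: branch offsets count words relative to pc+4, and absolute jumps combine the top four bits of pc+4 with target × 4.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def next_pc(pc, pc_src, offset16=0, target26=0, rs_value=0):
    """Select the next PC according to PCSrc (assumed MIPS-style target arithmetic)."""
    pc_plus_4 = (pc + 4) & MASK32
    if pc_src == "pc+4":   # sequential execution
        return pc_plus_4
    if pc_src == "br":     # PC-relative branch (BEQZ, BNEZ)
        return (pc_plus_4 + (sign_extend16(offset16) << 2)) & MASK32
    if pc_src == "jabs":   # absolute jump (J, JAL)
        return (pc_plus_4 & 0xF0000000) | ((target26 & 0x03FFFFFF) << 2)
    if pc_src == "rind":   # register-indirect jump (JR, JALR)
        return rs_value & MASK32
    raise ValueError(f"unknown PCSrc: {pc_src}")


print(hex(next_pc(0x100, "br", offset16=0xFFFF)))  # offset -1 word: back to 0x100
```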
[Figure: datapath extended for control flow. A second adder takes pc+4, and a zero? test is applied to the register value read from the GPRs. The other components (PC, instruction memory, GPRs, Imm Ext, ALU, ALU Control, data memory) and control signals (RegWrite, MemWrite, RegDst, ExtSel, OpSel, BSrc, WBSrc) are unchanged.]
[Figures: four further versions of the datapath, built from the same components (PC, two adders, instruction memory, GPRs, Imm Ext, ALU, ALU Control, data memory) and control signals (RegWrite, MemWrite, RegDst, ExtSel, OpSel, BSrc, WBSrc, zero?). The last three add the constant 31 as a possible ws selection.]
Harvard architecture
We will assume the clock period is sufficiently long for all of the following steps to complete:
1. instruction fetch
2. decode and register fetch
3. ALU operation
4. data fetch if required
5. register write-back setup time
[Figure: the control unit as a block of combinational logic. Signals shown: OpSel, MemWrite, WBSrc, RegDst, RegWrite, PCSrc.]
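[Figure: ALU control and immediate extension. A Decode Map takes Inst<31:26> (Opcode) and produces ALUop and ExtSel. OpSel (Func, Op, +, 0?) selects the ALU operation from Inst<5:0> (Func), the decoded ALUop, +, or 0?. ExtSel selects sExt16, uExt16 or High16.]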
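The three ExtSel choices can be illustrated with a short Python function. The function name is hypothetical, and the behaviour assumed for High16 (placing the immediate in the upper half of the word, as a load-upper-immediate would need) is an interpretation rather than something the slide states.

```python
def extend_immediate(imm16, ext_sel):
    """Return the 32-bit value produced by the immediate extender for ExtSel."""
    imm16 &= 0xFFFF
    if ext_sel == "sExt16":   # sign-extend: replicate bit 15 into bits 31..16
        return imm16 | 0xFFFF0000 if imm16 & 0x8000 else imm16
    if ext_sel == "uExt16":   # zero-extend: upper 16 bits are 0
        return imm16
    if ext_sel == "High16":   # assumed: immediate moved to the upper 16 bits
        return imm16 << 16
    raise ValueError(f"unknown ExtSel: {ext_sel}")


for sel in ("sExt16", "uExt16", "High16"):
    print(sel, hex(extend_immediate(0x8001, sel)))
```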
Opcode      ExtSel   BSrc   OpSel   MemW   RegW   WBSrc   RegDst   PCSrc
ALU         *        Reg    Func    no     yes    ALU     rd       pc+4
ALUi        sExt16   Imm    Op      no     yes    ALU     rt       pc+4
ALUiu       uExt16   Imm    Op      no     yes    ALU     rt       pc+4
LW          sExt16   Imm    +       no     yes    Mem     rt       pc+4
SW          sExt16   Imm    +       yes    no     *       *        pc+4
BEQZ z=0    sExt16   *      0?      no     no     *       *        br
BEQZ z=1    sExt16   *      0?      no     no     *       *        pc+4
J           *        *      *       no     no     *       *        jabs
JAL         *        *      *       no     yes    PC      R31      jabs
JR          *        *      *       no     no     *       *        rind
JALR        *        *      *       no     yes    PC      R31      rind
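Because hardwired control is a pure function of the opcode class (and the zero? result for BEQZ), the table above can be written as a lookup. The Python sketch below encodes the rows as read from the slide; the row keys such as `BEQZ(z=0)` and the names `Control` and `CONTROL` are hypothetical, and `*` stands for a don't-care value.

```python
from collections import namedtuple

Control = namedtuple("Control", "ExtSel BSrc OpSel MemW RegW WBSrc RegDst PCSrc")

TABLE = """
ALU        *       Reg  Func  no   yes  ALU  rd   pc+4
ALUi       sExt16  Imm  Op    no   yes  ALU  rt   pc+4
ALUiu      uExt16  Imm  Op    no   yes  ALU  rt   pc+4
LW         sExt16  Imm  +     no   yes  Mem  rt   pc+4
SW         sExt16  Imm  +     yes  no   *    *    pc+4
BEQZ(z=0)  sExt16  *    0?    no   no   *    *    br
BEQZ(z=1)  sExt16  *    0?    no   no   *    *    pc+4
J          *       *    *     no   no   *    *    jabs
JAL        *       *    *     no   yes  PC   R31  jabs
JR         *       *    *     no   no   *    *    rind
JALR       *       *    *     no   yes  PC   R31  rind
"""


def parse_table(text):
    """Turn the whitespace-separated table into {row name: Control}."""
    rows = {}
    for line in text.strip().splitlines():
        name, ext, bsrc, opsel, memw, regw, wbsrc, regdst, pcsrc = line.split()
        rows[name] = Control(ext, bsrc, opsel, memw == "yes", regw == "yes",
                             wbsrc, regdst, pcsrc)
    return rows


CONTROL = parse_table(TABLE)
print(CONTROL["LW"])
print([name for name, c in CONTROL.items() if c.MemW])  # rows that write memory
```

Keeping the table as data makes properties easy to check, for example that SW is the only row that asserts MemW.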
Pipelined MIPS
To pipeline MIPS:
First build MIPS without pipelining with CPI=1
Next, add pipeline registers to reduce cycle
time while maintaining CPI=1
Pipelined Datapath
[Figure: the datapath divided into five phases: fetch phase (PC, 0x4 adder, instruction memory, IR), decode & Reg-fetch phase (GPRs, Imm Ext), execute phase (ALU), memory phase (data memory), write-back phase.]
The clock period can be reduced by dividing the execution of an instruction into multiple cycles:
$t_C > \max\{t_{IM}, t_{RF}, t_{ALU}, t_{DM}, t_{RW}\}$ (probably $= t_{DM}$)
However, CPI will increase unless instructions are pipelined.
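A quick calculation shows why a shorter cycle alone does not help. The Python sketch below uses the stage delays quoted later in the lecture (10, 1, 5, 10, 1 units). It assumes that every instruction passes through all five phases, treats the lower bound on the clock period as an equality, and assumes an ideal pipeline with no stalls.

```python
# Stage delays in arbitrary time units, as quoted later in the lecture.
stage_time = {"IM": 10, "RF": 1, "ALU": 5, "DM": 10, "RW": 1}

single_cycle_period = sum(stage_time.values())  # all steps fit in one long cycle
multi_cycle_period = max(stage_time.values())   # cycle set by the slowest phase
phases = len(stage_time)

# Time per instruction = CPI * clock period.
time_single_cycle = 1 * single_cycle_period       # CPI = 1, long cycle
time_multi_cycle = phases * multi_cycle_period    # CPI = 5 (assumed), short cycle
time_pipelined = 1 * multi_cycle_period           # CPI = 1, short cycle (ideal)

print(time_single_cycle, time_multi_cycle, time_pipelined)
```

Under these assumptions the unpipelined multi-cycle design is slower per instruction than the single-cycle one, and only pipelining recovers CPI = 1 at the short cycle time.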
An Ideal Pipeline
[Figure: four pipeline stages in sequence: stage 1, stage 2, stage 3, stage 4.]
$t_{IM} = 10$ units
$t_{DM} = 10$ units
$t_{ALU} = 5$ units
$t_{RF} = 1$ unit
$t_{RW} = 1$ unit
Alternative Pipelining
[Figure: the datapath divided into four phases: fetch phase (PC, 0x4 adder, instruction memory), execute phase (IR, GPRs, Imm Ext, ALU), memory phase (data memory), write-back phase. Cycle-time annotations on the figure: $= t_{DM}$ and $= t_{DM} + t_{RW}$.]
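Assumptions                                               Unpipelined $t_C$   Pipelined $t_C$   Speedup
$t_{IM} = t_{DM} = 10$, $t_{ALU} = 5$,                    27                  10                2.7
  $t_{RF} = t_{RW} = 1$, 4-stage pipeline
(assumptions not shown)                                   25                  10                2.5
(assumptions not shown)                                   25                  5                 5.0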
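The first row of the speedup table can be reproduced with a few lines of Python. The grouping of decode/register fetch with execute in the 4-stage pipeline is an assumption inferred from the phase labels of the alternative pipelining figure; the function name and the second grouping (write-back merged into the memory phase) are hypothetical illustrations.

```python
def pipeline_speedup(stage_time, groups):
    """Return (unpipelined t_C, pipelined t_C, speedup) for a grouping of steps."""
    unpipelined = sum(stage_time.values())  # one long cycle covers every step
    pipelined = max(sum(stage_time[s] for s in group) for group in groups)
    return unpipelined, pipelined, unpipelined / pipelined


stage_time = {"IM": 10, "RF": 1, "ALU": 5, "DM": 10, "RW": 1}

# Assumed 4-stage split: fetch | decode + execute | memory | write-back.
four_stage = [("IM",), ("RF", "ALU"), ("DM",), ("RW",)]
# Hypothetical variant: write-back merged into the memory phase.
merged_wb = [("IM",), ("RF", "ALU"), ("DM", "RW")]

print(pipeline_speedup(stage_time, four_stage))
print(pipeline_speedup(stage_time, merged_wb))
```

Merging write-back into the memory phase lengthens the critical stage from 10 to 11 units, so the saved pipeline register costs about 10% in cycle time.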
Thank you !