Lect 07

Electronic and Computer Engineering
School of Engineering and Design
Brunel University
Distributed System Engineering & Micro-Electronics Computer Architecture
Distributed System Engineering
Brunel Univ. E&CEng
Distributed System Engineering Computer Architecture (DS2A) lecture 7

RISC vs CISC
Peter van Santen Dept. of Electronic and Computer Engineering 2004
printed 15/11/2005 @ 17:50 PvS 2005
RISC vs CISC DS2A 7-1
RISC Overview
Complex Instruction Set Computers
Reduced Instruction Set Computers
Instruction Analysis
RISC machine Analysis
RISC strategy
Instruction comparison
Dynamic performance analysis
Advanced topics
  Delayed branch technique
  Register windows
Objectives
Historic review of CISC
Introduce RISC architecture
Review background to RISC development
Be able to analyse and compare performance issues
Relate concurrency issues to instruction-level parallelism and scheduling
Performance factors (Ic, p, m, k, tcyc):
  reduce instruction count Ic
  reduce processor cpi p
  reduce cycle time tcyc
Refs.: Hen03 Chapt. 2 http://books.elsevier.com/companions/1558605967
CISC
ISA for Complex Instruction Set Computer
Historic arguments for:
1. Greater variety in instructions would simplify compilers.
2. More sophisticated instructions to reduce software problems.
3. Metrics based on memory size (memory efficiency) and program length.
4. Micro-programming supported higher level functions directly executable by microcode.
5. Closure of the semantic gap.
CISC techniques
Memory-to-memory architecture: less complex compiler, greater hardware complexity.
Reduction in cost of hardware: micro-programming, semantic gap closure, complex (non-orthogonal) instruction sets.
Writeable control store: user instructions, virtual memory problems, limited address space, multi-process swapping.
Performance proportional to program size.
Main microprocessor Architecture families

RISC
  Digital Alpha series: 21064, 21164, 21264
  MIPS: R2000, 3000, 4000, 5000, 8000, 10000
  Sun SPARC: SPARC, MicroSPARC, SuperSPARC, UltraSPARC
  HP/PA-RISC
  PowerPC
  Intel: i960
CISC
  Intel x86 generic: 86, 286, 386, 486, Pentium, Pentium Pro, PIII, P4
  Motorola: M 68x0 & 680x0
  Digital VAX (VLSI)
Instruction Set Analysis

Statement     Fortran    C     Pascal   Average
Assignment      51       38      45       45
IF              10       43      29       27
Call             5       12      15       11
Loop             9        3       5        6
Goto             9        3       0        4
Other           16        1       6        8
(% of statements; averages contain round-off errors)
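The Average column and the round-off remark can be checked with a short sketch. The dictionary layout and the names `FREQ` and `rounded_average` are illustrative assumptions; only the numbers come from the table.

```python
# Statement frequencies (%) per language, copied from the table above.
FREQ = {
    "Assignment": {"Fortran": 51, "C": 38, "Pascal": 45},
    "IF":         {"Fortran": 10, "C": 43, "Pascal": 29},
    "Call":       {"Fortran": 5,  "C": 12, "Pascal": 15},
    "Loop":       {"Fortran": 9,  "C": 3,  "Pascal": 5},
    "Goto":       {"Fortran": 9,  "C": 3,  "Pascal": 0},
    "Other":      {"Fortran": 16, "C": 1,  "Pascal": 6},
}


def rounded_average(row):
    """Mean of one table row, rounded to the nearest whole percent."""
    return round(sum(row.values()) / len(row))


averages = {stmt: rounded_average(row) for stmt, row in FREQ.items()}

if __name__ == "__main__":
    for stmt, avg in averages.items():
        print(f"{stmt:<12}{avg:>3}")
    # Each language column sums to 100, but the rounded averages sum to 101.
    print("sum of averages:", sum(averages.values()))
```

Assignments and IF statements together account for roughly 70% of statements on average, which is the observation that motivates optimising simple operations.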
CISC and RISC processor Architectures

[Figure: two block diagrams.
 CISC architecture with micro-programmed control unit (control memory) and a unified instruction and data cache between data path and main memory.
 RISC architecture with hardwired control unit and split cache (instruction cache and data cache) between data path and main memory.]
RISC example Digital Alpha 21164
CISC Example Digital VAX
CISC/RISC example Intel Pentium Pro
Frequency of Variables
N     terms   locals   parameters
0       -       22        41
1      80       17        19
2      15       20        15
3       3       14         9
4       2        8         7
5       0       20         8
Where:
  Terms: % occurrence of the number of terms in assignment statements.
  Locals: % occurrence of the number of local variables per procedure/function.
  Parameters: % occurrence of the number of parameters in procedure calls.
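As a rough illustration of how these figures can be read, the sketch below sums the share of procedures whose locals or parameters fit within a given count. Relating this to the sizing of a register set is an assumption made here for illustration, and the helper name `share_up_to` is hypothetical; only the percentages come from the table.

```python
# Percentages for N = 0..5, copied from the locals and parameters figures.
N_VALUES = [0, 1, 2, 3, 4, 5]
LOCALS = [22, 17, 20, 14, 8, 20]
PARAMETERS = [41, 19, 15, 9, 7, 8]


def share_up_to(percentages, limit):
    """Summed % of cases with at most `limit` items."""
    return sum(p for n, p in zip(N_VALUES, percentages) if n <= limit)


if __name__ == "__main__":
    for limit in N_VALUES:
        print(f"N <= {limit}: locals {share_up_to(LOCALS, limit):>3}%  "
              f"parameters {share_up_to(PARAMETERS, limit):>3}%")
```

The totals do not reach exactly 100 because the published percentages are rounded.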

RISC vs CISC
RISC                                CISC
Simple single cycle instr.          Complex instr. in m-cycles
M ref. by Ld and St only            Any instr. may ref. mem.
Highly pipelined (overlap.)         Less pipelined or not
Instr. exec. by HW                  Instr. interpreted by micro prg.
Fixed format instr.                 Var. instr. format
Few instr. and modes                Many instr. and modes
Multiple reg. sets                  Single reg. set
Complexity in compiler              Complexity in micro prg.
RISC Processor examples

IBM 801                     started 1975, publ. Radin 1982
RISC I                      ~ 1980, Patterson et al. (VLSI)
MIPS                        ~ 1981, Hennessy (VLSI)
RISC II                     ~ 1982, Patterson & Sequin (VLSI)
HP Prec.                    ~ 1985, open architecture
SunSPARC                    ~ 1987, scalable processor arch.
MIPS R2000, R3000, R4000    ~ microprocessor without interlocked pipeline stages
Alpha                       ~ superscalar-superpipelined
Early work at IBM on the 801.
VLSI research at Berkeley and Stanford.
Berkeley used multiple register windows; the others relied on compiler optimisation.
Refs. Survey of RISC Architectures http://books.elsevier.com/companions/1558605967/appendices/1558605967-appendix-c.pdf

RISC strategy
Maximise the effective throughput of a design (considering hw and sw) by:
1. analysing applications for key instructions,
2. executing key operations in hardware,
3. performing most functions in sw.,
4. adding hw. features only if they yield a net performance gain,
5. including features only if indicated by detailed analysis of substantial HLL programs.
Observation by John Cocke (IBM, 1975): CISC computers execute mostly simple instructions.
Simple RISC pipeline

Pipeline stages (MIPS example):
  IF    Instruction Fetch
  D/RR  Decode / Reg to Reg fetch
  ALU   Execution / eff. address calc.
  MDA   Mem Data Access
  WBR   Write back to Reg.

Overlapped instr. against time (clock cycles):
instr.  1     2     3     4     5     6     7     8     9     10
  1     IF    D/RR  ALU   MDA   WBR
  2           IF    D/RR  ALU   MDA   WBR
  3                 IF    D/RR  ALU   MDA   WBR
  4                       IF    D/RR  ALU   MDA   WBR
  5                             IF    D/RR  ALU   MDA   WBR
  6                                   IF    D/RR  ALU   MDA   WBR
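The timing of the overlapped instructions can be sketched as follows, assuming an ideal pipeline with one stage per clock cycle and no stalls or hazards. The function names `total_cycles`, `schedule` and `render` are illustrative, not part of the lecture.

```python
STAGES = ["IF", "D/RR", "ALU", "MDA", "WBR"]


def total_cycles(n_instr, n_stages=len(STAGES)):
    """Cycles to complete n_instr instructions in an ideal pipeline."""
    return n_stages + n_instr - 1


def schedule(n_instr):
    """Map instruction number -> {cycle: stage}; instruction i enters IF in cycle i."""
    return {
        i: {i + offset: stage for offset, stage in enumerate(STAGES)}
        for i in range(1, n_instr + 1)
    }


def render(n_instr):
    """Plain-text timing diagram, one row per instruction."""
    cycles = range(1, total_cycles(n_instr) + 1)
    lines = ["instr " + "".join(f"{c:<6}" for c in cycles)]
    for i, row in schedule(n_instr).items():
        lines.append(f"{i:<6}" + "".join(f"{row.get(c, ''):<6}" for c in cycles))
    return "\n".join(lines)


if __name__ == "__main__":
    print(render(6))
    print("pipelined:", total_cycles(6), "cycles;",
          "unpipelined:", 6 * len(STAGES), "cycles")
```

For six instructions and five stages this gives 10 cycles, matching the time axis of the diagram, against 30 cycles if the instructions did not overlap.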
Instruction Comparison
Metric example, with size in bits: instr. 8, m.addr. 16, reg.addr. 4 and data 32
op: A := B + C

RR (register-register)     MR (memory-register)     MM (memory-memory)
R2 <- MB                   Acc <- MB                MA <- MB + MC
R3 <- MC                   Acc <- MC + Acc
R1 <- R2 + R3              MA  <- Acc
MA <- R1
I = 104 bits               I = 72 bits              I = 56 bits
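The three instruction-traffic figures follow from the field sizes given in the metric. The sketch below recomputes them; the encoding model (one opcode plus one address field per operand) and the names `instr_bits` and `PROGRAMS` are assumptions made for illustration.

```python
# Field sizes in bits, as given in the metric example.
OPCODE, MEM_ADDR, REG_ADDR = 8, 16, 4


def instr_bits(n_mem=0, n_reg=0):
    """Size of one instruction: opcode plus its memory and register address fields."""
    return OPCODE + n_mem * MEM_ADDR + n_reg * REG_ADDR


# A := B + C coded for each architecture style.
PROGRAMS = {
    "RR": [
        instr_bits(n_mem=1, n_reg=1),  # R2 <- MB
        instr_bits(n_mem=1, n_reg=1),  # R3 <- MC
        instr_bits(n_reg=3),           # R1 <- R2 + R3
        instr_bits(n_mem=1, n_reg=1),  # MA <- R1
    ],
    "MR": [
        instr_bits(n_mem=1),           # Acc <- MB
        instr_bits(n_mem=1),           # Acc <- MC + Acc
        instr_bits(n_mem=1),           # MA <- Acc
    ],
    "MM": [
        instr_bits(n_mem=3),           # MA <- MB + MC
    ],
}

TRAFFIC = {style: sum(sizes) for style, sizes in PROGRAMS.items()}

if __name__ == "__main__":
    for style, bits in TRAFFIC.items():
        print(f"{style}: I = {bits} bits")
```

On this metric alone the memory-to-memory form looks best, which is one reason program length was a historic argument for CISC.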
Dynamic Performance analysis

Time to run a given program can be calculated as:
Texec = tcyc * D * CPI

where:
  tcyc = time of a single clock cycle
  D = dynamic instruction count
  CPI = average cycles per instruction
For a given technology, tcyc will be comparable for a RISC and a CISC architecture, and can possibly be made smaller for RISC.
Dynamic Performance comparison
D is expected to be much greater for RISC, but is found not to be so, due to:
  instruction distribution
  advanced code generation techniques
  D for RISC is about 20% greater than for CISC
CPI is greatly reduced for RISC:
  from 5 - 10 cpi for CISC to 1.6 - 2.0 cpi for RISC, including cache and MM overheads
  0.25 cpi for superscalar
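Taken together, these figures bound the expected gain. The sketch below assumes equal cycle time for both machines and a RISC instruction count 1.2 times the CISC count; the function name `speedup` is illustrative.

```python
def speedup(cpi_cisc, cpi_risc, d_ratio=1.2):
    """CISC-to-RISC execution time ratio, assuming equal cycle time.

    d_ratio is the RISC dynamic instruction count relative to CISC.
    """
    return cpi_cisc / (cpi_risc * d_ratio)


# Least favourable case for RISC: lowest CISC CPI against highest RISC CPI.
low = speedup(cpi_cisc=5, cpi_risc=2.0)
# Most favourable case for RISC: highest CISC CPI against lowest RISC CPI.
high = speedup(cpi_cisc=10, cpi_risc=1.6)

if __name__ == "__main__":
    print(f"speedup between {low:.2f} and {high:.2f}")
```

Under these assumptions the RISC machine is between about 2 and about 5 times faster, despite executing more instructions.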
Dyn. Perf. Comparison VAX vs RISC

VAX-11/780: 1 Mips (10^6 average VAX instructions per sec.)
tcyc = 200 ns, CPI = 5
Texec = 0.2 * 5 * D µs (for VAX)
Texec = 0.2 * 2 * 1.2 D µs (for RISC)
RISC : CISC Texec ratio of 0.48, speedup of about 2
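The comparison can be reproduced numerically. In this sketch the instruction count `D` is an arbitrary placeholder, since it cancels in the ratio, and the RISC machine is assumed to run at the same 200 ns cycle time with CPI = 2 and 1.2 times the instruction count.

```python
def exec_time_us(t_cyc_us, cpi, instr_count):
    """Texec = tcyc * CPI * D, with tcyc in microseconds."""
    return t_cyc_us * cpi * instr_count


D = 1_000_000  # placeholder dynamic instruction count; cancels in the ratio

t_vax = exec_time_us(0.2, 5, D)
t_risc = exec_time_us(0.2, 2, 1.2 * D)

ratio = t_risc / t_vax        # RISC : CISC execution time
vax_speedup = t_vax / t_risc  # how many times faster the RISC machine is

# Consistency check: 200 ns per cycle at 5 cycles per instruction is 1 Mips.
vax_mips = 1 / (0.2 * 5)

if __name__ == "__main__":
    print(f"ratio {ratio:.2f}, speedup {vax_speedup:.2f}, VAX rate {vax_mips:.1f} Mips")
```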
Dyn. Perf. Comparison M68020 vs RISC ex.

M 68020: 2 Mips (10^6 average VAX instructions per sec.)
tcyc = 60 ns
Determine the average cycles per instruction (CPI), and calculate the speedup for RISC based on the data given in slide 19.
Misc. benefits
Reduced design complexity results in:
  lower design cost,
  reduced chip complexity,
  higher chip yield,
  reduced time to market, etc.
Operand instructions generally: R op R -> R or R op I -> R
Penalty:
  increased compiler complexity
  pre-runtime scheduling
  code optimisation
  hardware instruction issue units
Refs.: VAX ISA http://books.elsevier.com/companions/1558605967/appendices/1558605967-appendix-e.pdf
RISC summary
RISC designs aim to gain performance by:
  reducing the no. of cycles/instruction much faster than they lose performance by executing more instructions,
  fast context switching by using register windows,
  complex compiler code optimisation to reduce processor stalls,
  some processors have no interlocks for dependencies; these are dealt with by the compiler,
  super-scalar: use multiple instruction streams for instruction level parallelism,
  super-pipelined: use multi-phase clocks for enhanced pipelining over standard scalar pipelines,
  code optimisation for instruction level parallelism and scheduling carried out statically at compile time.
Legacy binaries: CISC code running on RISC cores (Intel x86, AMD), with translation and optimisation by hardware at runtime.
Superpipelined superscalar of degree (n, m): cpi < 1
2004: multiple cores, Hyper-Threading
Answer Dyn. Perf. Comparison

M68020 vs RISC
M 68020: 2 Mips (10^6 average VAX instructions per sec.)
tcyc = 60 ns
2 Mips gives 500 ns per instruction, so CPI = (500 / 60) = 8.3, taken as 8
Texec = tcyc * CPI * D
Texec = 0.06 * 8 * D µs (for M68020)
Texec = 0.06 * 2 * 1.2 D µs (for RISC)
RISC : CISC Texec ratio of 0.30, speedup 3.3
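The answer can be checked with a short sketch that derives CPI from the Mips rating and cycle time. It assumes the 2 Mips figure counts the machine's own instructions, and that the RISC machine runs at the same cycle time with CPI = 2 and 1.2 times the instruction count; the function names are illustrative.

```python
def cpi_from_mips(mips, t_cyc_ns):
    """Average cycles per instruction from an instruction rate and cycle time."""
    ns_per_instruction = 1000.0 / mips  # 1 Mips is one instruction per 1000 ns
    return ns_per_instruction / t_cyc_ns


RISC_CPI = 2
RISC_D_FACTOR = 1.2  # RISC instruction count relative to CISC


def risc_to_cisc_ratio(cisc_cpi):
    """RISC : CISC execution time ratio at equal cycle time."""
    return (RISC_CPI * RISC_D_FACTOR) / cisc_cpi


cpi_68020 = cpi_from_mips(mips=2, t_cyc_ns=60)   # 500 ns / 60 ns
ratio_rounded = risc_to_cisc_ratio(8)            # CPI rounded down to 8
ratio_exact = risc_to_cisc_ratio(cpi_68020)      # CPI left unrounded

if __name__ == "__main__":
    print(f"CPI = {cpi_68020:.2f}")
    print(f"CPI 8:    ratio {ratio_rounded:.2f}, speedup {1 / ratio_rounded:.2f}")
    print(f"CPI 8.33: ratio {ratio_exact:.2f}, speedup {1 / ratio_exact:.2f}")
```

Rounding CPI down to 8 gives the quoted ratio of 0.30; keeping the unrounded value gives a slightly larger speedup of about 3.5.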