Sei sulla pagina 1di 88

Digital Design

Chapter 5:
Register-Transfer Level
(RTL) Design
Slides to accompany the textbook Digital Design, with RTL Design, VHDL,
and Verilog, 2nd Edition,
by Frank Vahid, John Wiley and Sons Publishers, 2010.
http://www.ddvahid.com

Copyright 2010 Frank Vahid


Instructors of courses requiring Vahid's Digital Design textbook (published by John Wiley and Sons) have permission to modify and use these slides for customary course-related activities,
subject to keeping this copyright notice in place and unmodified. These slides may be posted as unanimated pdf versions on publicly-accessible course websites.. PowerPoint source (or pdf
Digital
2e
with animations)
may Design
not be posted
to publicly-accessible websites, but may be posted for students on internal protected sites or distributed directly to students by other electronic means.
Copyright

2010
1
Instructors may make printouts of the slides available to students for a reasonable photocopying charge, without incurring royalties. Any other use requires explicit permission. Instructors
Frank Vahid
may obtain PowerPoint
source or obtain special use permissions from Wiley see http://www.ddvahid.com for information.

5.1

Chpt 2
Capture Comb. behavior: Equations, truth tables
Convert to circuit: AND + OR + NOT Comb. logic

Chpt 3

Higher levels

Introduction
Registertransfer
level (RTL)

Logic level

Capture sequential behavior: FSMs


Convert to circuit: Register + Comb. logic Controller

Chpt 4
Datapath components, simple datapaths

Transistor level

Levels of digital
design abstraction

Chpt 5
Capture behavior: High-level state machine
Convert to circuit: Controller + Datapath Processor
Known as RTL (register-transfer level) design
Digital Design 2e
Copyright 2010
Frank Vahid

Processors:
Programmable
(microprocessor)
Custom
2

Note: Slides with animation are denoted with a small red "a" near the animated items

5.2

High-Level State Machines (HLSMs)


Some behaviors too complex for
equations, truth tables, or FSMs
Ex: Soda dispenser
c: bit input, 1 when coin deposited
a: 8-bit input having value of
deposited coin
s: 8-bit input having cost of a soda
d: bit output, processor sets to 1
when total value of deposited coins
equals or exceeds cost of a soda

c
d

8-bit input/output
Storage of current total
Addition (e.g., 25 + 10)

0 1 0 1 0
c
d

0 1 0
Digital Design 2e
Copyright 2010
Frank Vahid

Soda
dispenser
processor

FSM cant represent

50

a 25

25

Soda tot:
tot:
dispenser
25
processor 50

HLSMs

s
8

High-level state machine


(HLSM) extends FSM with:

c
d

Numbers:
Single-bit: '0' (single quotes)
Integer: 0 (no quotes)
Multi-bit: 0000 (double quotes)
a

== for equal, := for assignment


Multi-bit outputs must be
registered via local storage
// precedes a comment
Digital Design 2e
Copyright 2010
Frank Vahid

Soda
dispenser
processor

Multi-bit input/output
Local storage
Arithmetic operations
Conventions

Inputs: c (bit), a (8 bits), s (8 bits)


Outputs: d (bit) // '1' dispenses soda
Local storage: tot (8 bits)
c
Init

Wait

d:='0'
tot:=0

c*(tot<s)

tot:=tot+a
c'*(tot<s)
Disp

SodaDispenser

d:='1'
4

Ex: Cycles-High Counter


P = total number (in binary) of cycles that m is 1
Capture behavior as HLSM
Preg required (multibit outputs must be registered)
Inputs: m (bit)
Outputs: P (32 bits)
Local storage: Preg

Preg

32

Use to hold count


CountHigh

m
clk

CountHigh

CountHigh

S_Clr // Clear Preg to 0s


Preg := 0

Inputs: m (bit)
Outputs: P (32 bits)
Local storage: Preg

CountHigh

S_Clr // Clear Preg to 0s


Preg := 0

Inputs: m (bit)
Outputs: P (32 bits)
Local storage: Preg

S_Clr

// Clear Preg to 0s
Preg := 0
a

m'

S_Wt

// Wait for m == '1'

m'

m'

(a)

Digital Design 2e
Copyright 2010
Frank Vahid

(b)

S_Wt

// Wait for m == '1'

m
// Increment Preg
S_Inc Preg := Preg + 1
(c)

Note: Could have designed directly using an up-counter. But, that methodology
5
is ad hoc, and won't work for more complex examples, like the next one. a

Example: Laser-Based Distance Measurer


T (in seconds)
laser

D
Object of
interest

sensor

2D = T sec * 3*108 m/sec

Laser-based distance measurement pulse laser,


measure time T to sense reflection
Laser light travels at speed of light, 3*108 m/sec
Distance is thus D = (T sec * 3*108 m/sec) / 2

Digital Design 2e
Copyright 2010
Frank Vahid

Example: Laser-Based Distance Measurer


T (in seconds)
laser

sensor

from button

D
to display

16

Laser-based
distance
measurer

to laser

S
from sensor

Inputs/outputs

B: bit input, from button, to begin measurement


L: bit output, activates laser
S: bit input, senses laser reflection
D: 16-bit output, to display computed distance

Digital Design 2e
Copyright 2010
Frank Vahid

Example: Laser-Based Distance Measurer


from button B

DistanceMeasurer
Inputs: B (bit), S (bit)
Outputs : L (bit), D (16 bits)
Local storage: Dreg(16)

to display

16

Laserbased
distance
measurer

to laser

from sensor

(required)
a

S0

(first state usually


initializes the system)

L := '0' // laser off


Dreg := 0 // distance is 0

Declare inputs, outputs, and local storage


Dreg required for multi-bit output

Create initial state, name it S0


Initialize laser to off (L:='0')
Initialize displayed distance to 0 (Dreg:=0)

Digital Design 2e
Copyright 2010
Frank Vahid

Recall: '0' means single bit,


0 means integer

Example: Laser-Based Distance Measurer


from button B

DistanceMeasurer
...
B' // button not pressed
to display

S0

S1

L := '0'
Dreg := 0

B
// button
pressed

16

Laserbased
distance
measurer

to laser

from sensor

Add another state, S1, that waits for a button press


B' stay in S1, keep waiting
B go to a new state S2

Q: What should S2 do?

A: Turn on the laser


a

Digital Design 2e
Copyright 2010
Frank Vahid

Example: Laser-Based Distance Measurer


from button B
DistanceMeasurer
...

S0
L := '0'
Dreg := 0

B'

to display
S1

S2

S3

L := '1'
// laser on

L := '0'
// laser off

16

Laserbased
distance
measurer

to laser

from sensor

Add a state S2 that turns on the laser (L:='1')


Then turn off laser (L:='0') in a state S3
Q: What do next? A: Start timer, wait to sense reflection
a
Digital Design 2e
Copyright 2010
Frank Vahid

10

Example: Laser-Based Distance Measurer


B
from button

DistanceMeasurer

Inputs : B (bit), S (bit) Outputs : L (bit), D (16 bits)


Local storage: Dreg, Dctr (16 bits)
B'
S' // no reflection

S0

S1

L := '0'
Dreg := 0

Dctr := 0
// reset cycle
count

D
to display

16

Laser-based
distance
measurer

to laser

from sensor

S // reflection
?

S2

S3

L := '1'

L := '0'
Dctr := Dctr + 1
// count cycles

Stay in S3 until sense reflection (S)


To measure time, count cycles while in S3
To count, declare local storage Dctr
Initialize Dctr to 0 in S1. In S2 would have been O.K. too.
Don't forget to initialize local storagecommon mistake

Increment Dctr each cycle in S3


Digital Design 2e
Copyright 2010
Frank Vahid

11

Example: Laser-Based Distance Measurer


from button

DistanceMeasurer

Inputs: B (bit), S (bit) Outputs: L (bit), D (16 bits)


Local storage: Dreg, Dctr (16 bits)
to display

B'

S'

S0

S1

L := '0'
Dreg := 0

Dctr := 0

S2
L := '1'

S3

Laserbased
16

distance
measurer

to laser

S
from sensor

S4

Dreg := Dctr/2
L := '0'
Dctr := Dctr+1 // calculate D

Once reflection detected (S), go to new state S4


Calculate distance
Assuming clock frequency is 3x108, Dctr holds number of meters, so
Dreg:=Dctr/2

After S4, go back to S1 to wait for button again


Digital Design 2e
Copyright 2010
Frank Vahid

12

HLSM Actions: Updates Occur Next Clock Cycle


S'

Local storage updated on clock edges only


Enter state on clock edge
Storage writes in that state occur on next clock edge
Can think of as occurring on outgoing transitions

S3

S
Dctr := Dctr+1

S' / Dctr := Dctr+1

Thus, transition conditions use the OLD value,


not the newly-written value

S3

Example:
Inputs: B (bit)
Outputs: P (bit) // if B, 2 cycles high
Local storage: Jreg (8 bits)
B'

!(Jreg<2)
S0
P := '0'
Jreg := 1

Digital Design 2e
Copyright 2010
Frank Vahid

Jreg<2
S1

P := '1'
Jreg := Jreg + 1

(a)

clk

S0

S1

S1

S0

1
?

2
1

3
2

S/
Dctr := Dctr+1

B
Jreg
P
(b)

13

5.3

RTL Design Process


Capture behavior
Convert to circuit
Need target architecture
Datapath capable of HLSM's
data operations
Controller to control datapath

External
control
...
inputs
External ...
control
outputs

DP
control
inputs
...

External data
inputs
...

Datapath

Controller
...
DP
control
outputs

...
External data
outputs

Digital Design 2e
Copyright 2010
Frank Vahid

14

Ctrl/DP Example for Earlier


Cycles-High Counter

CountHigh
000...00001

A
B
add1
S
32

CountHigh
m
First clear Preg to 0s
Then increment Preg for each
clock cycle that m is 1

Preg

(a)

Preg_clr
Preg_ld

(c)

We
created
this
HLSM
earlier

m'

S_Wt

(b)
Digital Design 2e
Copyright 2010
Frank Vahid

Connect
with
controller

Derive
controller

32

CountHigh
m

//Preg := 0
S_Clr Preg_clr = 1
Preg_ld = 0

m'
m'

//Increment Preg
S_Inc Preg := Preg + 1

Create DP

DP

000...00001

//Wait for m=='1'

m'

clr I
ld Preg
Q

CountHigh Inputs: m (bit)


Outputs: P (32 bits)
LocStr: Preg (32 bits)
//Clear Preg to 0s
S_Clr Preg := 0

//Wait for m=1 Preg_clr


S_Wt Preg_clr = 0
Preg_ld = 0
Preg_ld
m
DP
//Preg:=Preg+1
S_Inc Preg_clr = 0
Preg_ld = 1

A
B
add1
S
32
clr I
ld Preg
Q

Controller

(d)

32

15

RTL Design Process

Digital Design 2e
Copyright 2010
Frank Vahid

16

Example: Soda Dispenser from Earlier


s

Quick overview example.


More details of each step to come.
tot_ld
tot_clr

Inputs: c (bit), a (8 bits), s (8 bits)


Outputs: d (bit) // '1' dispenses soda
Local storage: tot (8 bits)

ld
clr

d:='0'
tot:=0

Wait
c*(tot<s)

Digital Design 2e
Copyright 2010
Frank Vahid

8-bit
adder

8-bit
<

tot_lt_s

tot:=tot+a

Step 2A

c'*(tot<s)

d:='1'

Step 1

Datapath

Disp
SodaDispenser

tot

c
Init

c
tot_ld

tot_clr
Controller

tot_lt_s

Datapath

Step 2B

17

Example: Soda Dispenser


Quick overview example.
More details of each step to come.

c
tot_ld

tot_clr

Inputs: c (bit), a (8 bits), s (8 bits)


Outputs: d (bit) // '1' dispenses soda
Local storage: tot (8 bits)

Controller

tot_lt_s

Datapath

Step 2B

c
Init
d:='0'
tot:=0

Wait
c*(tot<s)

tot:=tot+a
c'*(tot<s)

Inputs : c, tot_lt_s (bit)


Outputs : d, tot_ld , tot_clr (bit)
c
d
Init

Wait

d=0
tot_clr=1

c' *
tot_lt_s

Disp
SodaDispenser

d:='1'

Step 1
Digital Design 2e
Copyright 2010
Frank Vahid

tot_ld

c
Add

tot_clr

tot_ld=1

tot_lt_s

c*tot_lt_s
Disp

Controller

d=1

Step 2C

18

Example: Soda Dispenser


Quick overview example.
More details of each step to come.

Inputs : c, tot_lt_s (bit)


Outputs : d, tot_ld , tot_clr (bit)

Wait
Add
Disp

c
0
0
1

0
0
0
0
1
1

tot_clr

s0
0
0
0

tot_ld

s1
0
0
0

tot_lt_s

Init

1
1
0
0
1

0
0
0
0
0

0
0
0
0
1

0
0
0
0
0

n1
0
0
0

n0
1
1
1

d
0
0
0

1
1
1
1
0

0
0
1
1
0

0
1
0
1
0

1
0
1
1
0

0
1

tot_ld

c
Add

tot_clr

tot_ld=1

tot_lt_s

d
Init

Wait

d=0
tot_clr=1

c *
tot_lt_s

c*tot_lt_s
Disp
d=1

Controller

Step 2C
Use controller design process
(Ch3) to complete the design

a
Digital Design 2e
Copyright 2010
Frank Vahid

19

RTL Design ProcessStep 2A: Create a datapath


Sub-steps
HLSM data inputs/outputs Datapath inputs/outputs.
HLSM local storage item Instantiated register
"Instantiate": Add new component ("instance") to design

Each HLSM state action and transition condition data computation


Datapath components and connections
Also instantiate multiplexors as needed

Need component library from which to choose

clr I
ld reg
Q

A
B
add
S

A
B
cmp
lt eq gt

clk^ and clr=1: Q=0


clk^ and ld=1: Q=I
else Q stays same

S = A+B

(unsigned)
A<B: lt=1
A=B: eq=1
A>B: gt=1

Digital Design 2e
Copyright 2010
Frank Vahid

I
shift<L/R>
Q
shiftL1: <<1
shiftL2: <<2
shiftR1: >>1
...

I1 I0
mux2x1
s0 Q
s0=0: Q=I0
s0=1: Q=I1

20

Step 2A: Create a DatapathSimple Examples


X

Preg = X + Y + Z

Preg = Preg + X

Preg

Preg

Preg

regQ

(a)
X

DP

DP

clr I
ld Preg
Q
P

Digital Design 2e
Copyright 2010
Frank Vahid

A B
add1
S

A
B
add2
S

0 clr I
1 ld Preg
Q

0 clr I
1 ld regQ
Q

A
B
add1
S
0
1

(d)
Y

A
B
add1
S

A
B
add2
S

DP

X+Y

0
1

(c)

A
B
add1
S

A
B
add2
S
X+Y+Z

(b)

k=0: Preg = Y + Z
k=1: Preg = X + Y
Preg

Preg=X+Y; regQ=Y+Z

clr I
ld Preg
Q

I1 I0
mux2x1
s0 Q

DP

0 clr I
1 ld Preg
Q
P

21

Laser-Based Distance MeasurerStep 2A: Create a


Datapath
DistanceMeasurer

Inputs: B (bit), S (bit) Outputs: L (bit), D (16 bits)


Local storage: Dreg, Dctr (16 bits)
B'

S'

S0

S1

L := '0'
Dreg := 0

Dctr := 0

S2
L := '1'

S3

S4

L := '0'
Dreg := Dctr/2
Dctr := Dctr+1 // calculate D

1
a

HLSM data I/O DP I/O


HLSM local storage reg
HLSM state action and
transition condition data
computation Datapath
components and connections

Dreg_clr
Dreg_ld
Dctr_clr
Dctr_ld

Datapath

16

A
B
Add1: add(16)
S
16

16

I
Shr1: shiftR1(16)
Q
16

I
clr
ld Dctr: reg(16)

clr
ld

I
Dreg: reg(16)

Q
16

D
Digital Design 2e
Copyright 2010
Frank Vahid

22

Laser-Based Distance MeasurerStep 2B: Connecting


the Datapath to a Controller
from button

to laser

Controller

from sensor

Dreg_clr
Dreg_ld
Dctr_clr

Datapath

Dctr_ld

D
to display
16
300 MHz Clock

Digital Design 2e
Copyright 2010
Frank Vahid

23

Laser-Based Distance MeasurerStep 2C: Derive the


Controller FSM
HLSM
1

DistanceMeasurer

Inputs: B (bit), S (bit) Outputs: L (bit), D (16 bits)


Local storage: Dreg, Dctr (16 bits)
B'

Dreg_clr
Dreg_ld

S'

S0

S1

L := '0'
Dreg := 0

Dctr := 0

FSM has same


states,
transitions, and
control I/O
Achieve each
HLSM data
operation using
datapath control
signals in FSM
Digital Design 2e
Copyright 2010
Frank Vahid

S2

Dctr_clr
Dctr_ld

S3

L := '1'

A
B
Add1: add(16)
S
16

16

I
clr
ld Dctr: reg(16)

S4

I
Shr1: shiftR1(16)
Q
16
clr
ld

I
Dreg: reg(16)
Q
16

L := '0'
Dreg := Dctr/2
Dctr := Dctr+1 // calculate D

Inputs: B, S

Controller

Datapath

16

Outputs: L, Dreg_clr, Dreg_ld, Dctr_clr, Dctr_ld

S
a

S0

S1

L=0
Dreg_clr = 1
Dreg_ld = 0
Dctr_clr = 0
Dctr_ld = 0
(laser off)
(clear Dreg)

L=0
Dreg_clr = 0
Dreg_ld = 0
Dctr_clr = 1
Dctr_ld = 0
(clear count)

S2

S3

L=1
Dreg_clr = 0
Dreg_ld = 0
Dctr_clr = 0
Dctr_ld = 0
(laser on)

L=0
Dreg_clr = 0
Dreg_ld = 0
Dctr_clr = 0
Dctr_ld = 1
(laser off)
(count up)

S4
L=0
Dreg_clr = 0
Dreg_ld = 1
Dctr_clr = 0
Dctr_ld = 0
(load Dreg with Dctr/2)
(stop counting)

24

Laser-Based Distance MeasurerStep 2C: Derive the


Controller FSM
Controller

Inputs: B, S

Outputs: L, Dreg_clr, Dreg_ld, Dctr_clr, Dctr_ld

S0

S1

L=0
Dreg_clr = 1
(laser off)
(clear Dreg)

Dctr_clr = 1
(clear count)

Same FSM, using


convention of
unassigned
outputs implicitly
assigned 0
Digital Design 2e
Copyright 2010
Frank Vahid

S2

S3

L=1
(laser on)

L=0
Dctr_ld = 1
(laser off)
(count up)

S4

Dreg_ld = 1
Dctr_ld = 0
(load Dreg with Dctr/2)
(stop counting)

Some assignments to 0 still shown, due to


their importance in understanding
desired controller behavior
25

5.4

More RTL Design


Additional datapath components
(signed)
A
B
sub
S

S = A-B
(signed)

Digital Design 2e
Copyright 2010
Frank Vahid

A
B
mul
P

A
abs
Q

P = A*B
Q = |A|
(unsigned) (unsigned)

clr
inc upcnt
Q

clk^ and clr=1: Q=0


clk^ and inc=1: Q=Q+1
else Q stays same

W_d
W_a
clk^ and W_e=1:
W_e
RF[W_a]= W_d
RF R_e=1:
R_a
R_e
R_d = RF[R_a]
R_d

26

RTL Design Involving Register File or Memory


HLSM array: Ordered list of items
Ex: Local storage: A[4](8-bit) 4 8-bit items
Accessed using notation "A[i]", i is index
A[0] := 9; A[1] := 8; A[2] := 7; A[3] := 22
Array contents now: <9, 8, 7, 22>
X := A[1] will set X to 8
Note: First element's index is 0

Array can be mapped to instantiated register file or memory

Digital Design 2e
Copyright 2010
Frank Vahid

27

ArrayEx Inputs: (none)


Outputs: P (11 bits)
Local storage: A[4](11 bits)
Preg (11 bits)
Init1 Preg := 0
A[0] := 9

Simple Array Example


a

(A[0] == 8)'
Init2 A[1] := 12

12
11

A[0] == 8

A_s

Out1 Preg := A[1]

(a)

A_Wa0
A_Wa1
A_We
A_Ra0
A_Ra1
A_Re

ArrayEx Inputs: A_eq_8


Outputs: A_s, A_Wa0, ...

Init1

Preg_clr = 1
A_s = 0
A_Wa1=0, A_Wa1=0
A_We = 1

Init2

A_eq_8

11

I1 I0
Amux
s0 Q
W_d
W_a
W_e

A
RF[4](11)
R_a
R_e
R_d

8
A
B
Acmp
lt eq gt

A_eq_8

(A_eq_8)'

Preg_clr

A_s = 1
A_Wa1=0, A_Wa0=1
A_We = 1
A_Ra1=0, A_Ra0=0
A_Re = 1

Preg_ld
DP

(b)

clr I
ld Preg
Q

11

Out1 Preg_ld = 1
Digital Design 2e
Copyright 2010
Frank Vahid

Controller

(c)
a

28

RTL Example: Video Compression Sum of Absolute


Only difference: ball moving

Differences

Frame 1

Frame 2

Frame 1

Frame 2

Digitized

Digitized

Digitized

Difference of

frame 1

frame 2

frame 1

2 from 1

1 Mbyte

1 Mbyte

1 Mbyte

0.01 Mbyte

(a)

(b )

Video is a series of frames (e.g., 30 per second)


Most frames similar to previous frame

Just send
difference

Compression idea: just send difference from previous frame


Digital Design 2e
Copyright 2010
Frank Vahid

29

RTL Example: Video Compression Sum of Absolute


Differences

compare
Frame 1

Frame 2

Each is a pixel, assume


represented as 1 byte
(actually, a color picture
might have 3 bytes per
pixel, for intensity of
red, green, and blue
components of pixel)

Need to quickly determine whether two frames are similar


enough to just send difference for second frame
Compare corresponding 16x16 blocks
Treat 16x16 block as 256-byte array

Compute the absolute value of the difference of each array item


Sum those differences if above a threshold, send complete frame
for second frame; if below, can use difference method (using
another technique, not described)
Digital Design 2e
Copyright 2010
Frank Vahid

30

Array Example: Video CompressionSum-of-Absolute


Differences
SAD

A
RF[256](8)
B
RF[256](8)

Inputs: A, B [256](8 bits); go (bit)


Outputs: sad (32 bits)
Local storage: sum, sadreg (32 bits); i (9 bits)
sad
!go

S0
go

go

S1

sum := 0
i := 0
a

S0: wait for go


S1: initialize sum and index
S2: check if done ( (i<256) )
S3: add difference to sum,
increment index
S4: done, write to output sad_reg
Digital Design 2e
Copyright 2010
Frank Vahid

(i<256)

S2
i<256
sum:=sum+abs(A[i]-B[i])
S3
i := i + 1
S4

sadreg := sum
(b)

31

Inputs: A, B [256](8 bits); go (bit)


Outputs: sad (32 bits)
Local storage: sum, sadreg (32 bits); i (9 bits)
S0

!go

go
S1

sum := 0
i := 0

Array Example: Video


CompressionSum-ofAbsolute Differences

!(i<256)

S2
i<256
sum:=sum+abs(A[i]-B[i])
S3
i := i + 1
S4

sadreg := sum

go

S0
go

(i<256) (i_lt_256)

S1

i_lt_256

go
sum=0 sum_clr=1
i=0 i_clr=1

A
cmp B

i_clr
sum_ld

i<256 i_lt_256
sum=sum+abs(A[i]-B[i])
S3
sum_ld=1; AB_rd=1
i=i+1 i_inc=1

sum_clr

sad_reg = sum
sadreg_ld=1

256

A_data B_data

8
a

S2

S4

lt

i_inc

Controller
Digital Design 2e
Copyright 2010
Frank Vahid

AB_addr

AB_rd

sum

32

abs
8

32 32
sadreg_ld
sadreg_clr
Datapath

sadreg

32
sad

32

Circuit vs. Microprocessor


Circuit: Two states (S2 & S3) for each i, 256 is 512 clock cycles
Microprocessor: Loop (for i = 1 to 256), but for each i, must move
memory to local registers, subtract, compute absolute value, add to
sum, increment i say 6 cycles per array item 256*6 = 1536 cycles
Circuit is about 3 times (300%) faster (assuming equal cycle lengths)
Later, well see how to build SAD circuit that is much faster

(i<256)

S2
i<256
sum:=sum+abs(A[i]-B[i])
S3
i:=i+1

Digital Design 2e
Copyright 2010
Frank Vahid

33

Common RTL Design Pitfall Involving Storage Updates


Questions

Local storage: R, Q (8 bits)

Value of Q after state A?


Final state is C or D?

R<100
A

R:=99
Q:=R

R:=R+1

Answers
Q is NOT 99 after state A
Q is 99 in state B, so final state is C
Storage update actions in state
occur simultaneously on next clock
edge
Thus, order actions are written is
irrelevant
A's actions same if:
Q:=R R:=99
R:=99 Q:=R
Digital Design 2e
Copyright 2010
Frank Vahid

(R<100)'
D

R<100

99

100

99

100

clk

C
a

or

34

Common RTL Design Pitfall Involving Storage Updates


New HLSM
using extra
state so read of
R occurs after
write of R

Local storage : R, Q (8 bits)


R<100
A

R:=99
Q:=R

R:=R+1
Q:=R

B2

(R<100)'

R<100 (R<100)'

Digital Design 2e
Copyright 2010
Frank Vahid

clk

B2

99
?

100
99

100

100

99

99

35

RTL Design Involving a Timer


Commonly need explicit time intervals

Ex: Repeatedly blink LED on 1 second, off 1 second

Pre-instantiate timer that HLSM can then use


BlinkLed

BlinkLed

T_M
T_ld

T_en

load
enable

Outputs: L (bit)

32
T_Q'

M
32-bit
1-microsec
Q timer T

T_Q
L

Timer: T

T_Q'
T_Q

T_Q

Init

Off

On

L:='0'
T:=1000000
T_en:='0'

L:='0'
T_en:='1'

L:='1'
T_en:='1'

(a)

(b)

Pre-instantiated timer

HLSM making use of timer

Digital Design 2e
Copyright 2010
Frank Vahid

36

Button Debouncing
Press button
Ideally, output changes to 1
Actually, output bounces

button
B

Due to mechanical reasons


Like ball bouncing when dropped to
floor

Ideal: B

Digital circuit can convert actual


signal closer to ideal signal

Actual: B
bounce

Inputs: Bin (bit) Outputs: Bout (bit)


Timer: T

ButtonDebouncer
Bin'

T_Q'
Bin

Digital Design 2e
Copyright 2010
Frank Vahid

Bin
Bin'

T_Q

Init

WaitBin

Wait20

Bout :='0'
T:=20000
T_en:='0'

Bout:='0'
T_en:='0'

Bout:='1'
T_en:='1'

WhileBin
Bout:='1'
T_en:='0'

37
a

Data Dominated RTL Design Example


Data dominated design: Extensive DP,
simple controller
Control dominated design: Complex
controller, simple DP
Example: Filter
Converts digital input stream to new
digital output stream
Ex: Remove noise

12

digital filter

12

clk

180, 180, 181, 180, 240, 180, 181


240 is probably noise, filter might replace
by 181

Simple filter: Output average of last N


values
Small N: less filtering
Large N: more filtering, but less sharp
output

Digital Design 2e
Copyright 2010
Frank Vahid

38

Data Dominated RTL Design Example: FIR Filter


FIR filter
Finite Impulse Response
Simply a configurable weighted
sum of past input values
y(t) = c0*x(t) + c1*x(t-1) + c2*x(t-2)
Above known as 3 tap
Tens of taps more common
Very general filter User sets
the constants (c0, c1, c2) to
define specific filter

X
12

12

clk

y(t) = c0*x(t) + c1*x(t-1) + c2*x(t-2)


Inputs: X (12 bits) Outputs: Y (12 bits)
Local storage: xt0, xt1, xt2, c0, c1, c2 (12 bits);
Yreg (12 bits)
Init

RTL design
Step 1: Create HLSM
Very simple states/transitions
FIR filter
Digital Design 2e
Copyright 2010
Frank Vahid

digital filter

Yreg := 0
xt0 := 0
xt1 := 0
xt2 := 0
c0 := 3
c1 := 2
c2 := 2

FC
Yreg :=
c0*xt0 +
c1*xt1 +
c2*xt2
xt0 := X
xt1 := xt0
xt2 := xt1

Assume constants set to 3, 2, and 2

39

FIR
Filter

Inputs: X (12 bits) Outputs: Y (12 bits)


Local storage: xt0, xt1, xt2, c0, c1, c2 (12 bits);
Yreg (12 bits)
Init

FIR filter

FC

Yreg := 0
xt0 := 0
xt1 := 0
xt2 := 0
c0 := 3
c1 := 2
c2 := 2

Yreg :=
c0*xt0 +
c1*xt1 +
c2*xt2
xt0 := X
xt1 := xt0
xt2 := xt1

3
xt0_clr

c0
xt0

c2_ld
c1

xt1

c2

...

c1_ld

...

xt0_ld

Set clr and ld lines appropriately

c0_ld

Step 2A: Create datapath


Step 2B: Connect Ctrlr/DP (as
earlier examples)
Step 2C: Derive FSM

xt2

12

clk

x(t)

x(t-1)

x(t-2)

*
Yreg_clr
Yreg_ld

Y
Yreg

Datapath for 3-tap FIR filter


Digital Design 2e
Copyright 2010
Frank Vahid

12

40

Circuit vs. Microprocessor

y(t) = c0*x(t) + c1*x(t-1) + c2*x(t-2)

Comparing the FIR circuit to microprocessor instructions


Microprocessor
100-tap filter: 100 multiplications, 100 additions. Say 2 instructions
per multiplication, 2 per addition. Say 10 ns per instruction.
(100*2 + 100*2)*10 = 4000 ns

Circuit
Assume adder has 2 ns delay, multiplier has 20 ns delay
Longest path goes through one multiplier and two adders
20 + 2 + 2 = 24 ns delay

100-tap filter, following design on previous slide, would have about a


34 ns delay: 1 multiplier and 7 adders on longest path

Circuit is more than 100 times faster (4000/34). Wow.

Digital Design 2e
Copyright 2010
Frank Vahid

41

5.5

Determining Clock Frequency


Designers of digital circuits
often want fastest
performance

clk

Means want high clock


frequency

Frequency limited by longest


register-to-register delay
Known as critical path
If clock is any faster, incorrect
data may be stored into register
Longest path on right is 2 ns

2 ns
delay

+
c

Ignoring wire delays, and


register setup and hold times,
for simplicity

Digital Design 2e
Copyright 2010
Frank Vahid

42

Critical Path
Example shows four paths

Digital Design 2e
Copyright 2010
Frank Vahid

5 ns
delay

2 ns

1 / 7 ns = 142 MHz

Max
(2,7,7,5)
= 7 ns

7 ns

Longest path is thus 7 ns


Fastest frequency

2 ns
delay

7 ns

a to c through +: 2 ns
a to d through + and *: 7 ns
b to d through + and *: 7 ns
b to d through *: 5 ns

2 ns

d
a

43

Critical Path Considering Wire Delays


Real wires have delay too
Must include in critical path

Example shows two paths

clk

Each is 0.5 + 2 + 0.5 = 3 ns

Wire delays may even be greater than


logic delays!

0.5 ns

2 ns
a

0.5 ns

3 ns

1980s/1990s: Wire delays were tiny


compared to logic delays
But wire delays not shrinking as fast as
logic delays

0.5 ns

3 ns

Trend

Must also consider register setup and


hold times, also add to path
Then add some time to the computed
path, just to be safe
e.g., if path is 3 ns, say 4 ns instead
Digital Design 2e
Copyright 2010
Frank Vahid

44

A Circuit May Have Numerous Paths


Paths can exist
In the datapath
In the controller
Between the
controller and
datapath
May be
hundreds or
thousands of
paths

Timing analysis
tools that evaluate
all possible paths
automatically very
helpful
Digital Design 2e
Copyright 2010
Frank Vahid

s
Combinational logic

tot_ld

ld

tot

t ot_clr

clr

(c )

tot_lt_s

n1

8-bit
<

n0

8-bit
adder
8

tot_lt_s

Datapath
s1
clk

s0

(b )

(a)

State register

45

Behavioral Level Design: C to Gates


Inputs : A, B [256](8 bits); go (bit)
Outputs : sad (32 bits)
Local storage : sum, sadreg (32 bits); i (9 bits)
S0
go
S1

!go
sum := 0
i := 0

(i<256)

S2
i<256
S3 sum:=sum+abs(A[i]-B[i])
i := i + 1
S4

sadreg := sum

5.6

C code
int SAD (byte A[256], byte B[256]) // not quite C syntax
{
uint sum; short uint I;
sum = 0;
i = 0;
while (i < 256) {
sum = sum + abs(A[i] B[i]);
i = i + 1;
}
return sum;
}

Earlier sum-of-absolute-differences example


Started with high-level state machine
C code is an even better starting point -- easier to understand
Digital Design 2e
Copyright 2010
Frank Vahid

46

Converting from C to High-Level State Machine


Convert each C construct to
equivalent states and
transitions
Assignment statement
Becomes one state with
assignment

target :=
expression

target = expression;

If-then statement
Becomes state with condition
check, transitioning to then
statements if condition true,
otherwise to ending state
then statements would also
be converted to states
Digital Design 2e
Copyright 2010
Frank Vahid

cond
if (cond) {
// then stmts
}

cond
a

(then stmts)
(end)

47

Converting from C to High-Level State Machine


If-then-else
Becomes state with condition
check, transitioning to then
statements if condition true, or
to else statements if condition
false

cond
if (cond) {
// then stmts
}
else {
// else stmts
}

cond
(then stmts) (else stmts)
a

(end)

While loop statement


Becomes state with condition
check, transitioning to while
loops statements if true, then
transitioning back to condition
check
Digital Design 2e
Copyright 2010
Frank Vahid

cond
while (cond) {
// while stmts
}

cond
(while stmts)

(end)

48

Simple Example of Converting from C to HighLevel State Machine


Inputs: uint X, Y
Outputs: uint Max

(X>Y)

(X>Y)

X>Y

X>Y

if (X > Y) {
Max = X;

(then stmts)

(else stmts)

Max:=X

Max:=Y

}
else {
Max = Y;

(end)

(end)

}
a

(a)

(b)

(c)

Simple example: Computing the maximum of two numbers


Convert if-then-else statement to states (b)
Then convert assignment statements to states (c)
Digital Design 2e
Copyright 2010
Frank Vahid

49

Example: SAD C
code to HLSM
Convert each construct
to states

Inputs: byte A[256],B[256]


bit go;
Output: int sad
main()
{
uint sum; short uint i;
while (1) {

Actually, subset of C
(not all C constructs
easily convertible)
Can use language
other than C

go'

go'

go'

go
sum:=0
i:=0

i=0

sum = 0;
i = 0;

go
sum:=0

while (!go);

Simplify, e.g., merge


states

RTL design process to


convert to circuit
Can thus convert C to
circuit using
straightforward process

(go')'

(c)

(b)

while (i < 256) {


sum = sum + abs(A[i] B[i]);
i = i + 1;
}
sad = sum;
go'

(d)

go

go'

go

(a)
sum:=0
i:=0
go'

sum:=0
i:=0

go
(i<256)'
sum:=0
i:=0

i<256
(i<256)'

sum:=sum
+ abs...
i := i + 1

i<256
while stmts

(i<256)'
i<256
sum:=sum
+ abs...
i := i + 1
sadreg :=
sum

(g)
Digital Design 2e
Copyright 2010
Frank Vahid

sadreg :=
sum

(e)

(f)

50

5.7

Memory Components

Some components are used


outside the controller and DP

MxN memory

M words

RTL design instantiates


datapath components to
create datapath, controlled
by a controller

M words, N bits wide each

Several varieties of memory,


which we now introduce

N-bits
wide each

MN memory

Digital Design 2e
Copyright 2010
Frank Vahid

51

Random Access Memory (RAM)


RAM Readable and writable memory
Random access memory
Strange nameCreated several decades ago to
contrast with sequentially-accessed storage like
tape drives

Logically same as register fileMemory with


address inputs, data inputs/outputs, and control

32
4

32
W_data

R_data

W_addr

R_addr

W_en

1632
register file

R_en

Register file from Chpt. 4

RAM usually one port; RF usually two or more

RAM vs. RF
RAM typically larger than about 512 or 1024 words
RAM typically stores bits using a bit storage
approach that is more efficient than a flip-flop
RAM typically implemented on a chip in a square
rather than rectangular shapekeeps longest
wires (hence delay) short

32

data

10
addr
rw

1024 32
RAM

en

RAM block symbol


Digital Design 2e
Copyright 2010
Frank Vahid

52

RAM Internal Structure


32
10

wdata(N-1) wdata(N-2) wdata0

Let A = log2M

data
addr
rw

1024x32
RAM

d0

en

a0
a1 AxM
d1
decoder
a(A-1)

addr0
addr1

addr(A-1)
clk

en
rw

w ord

data cell
word word
enable enable
rw data

d(M-1)
to all cells

wdata0

wdata
(N-1)

rdata
(N-1)

bit storage
block
(aka cell)

rdata0

Combining rd and wr
data lines

word
enable

data0

Similar internal structure as register file

rdata(N-1) rdata(N-2) rdata0

RAM cell

rw

data(N-1)

Digital Design 2e
Copyright 2010
Frank Vahid

Decoder enables appropriate word based on address inputs


rw controls whether cell is written or read
rd and wr data lines typically combined
53
Lets see whats inside each RAM cell

Static RAM (SRAM)


32
10

SRAM cell
data

data

cell

data

addr
rw

1024x32
RAM

en

word
enable

SRAM cell
data
0

data
1

Static RAM cell


6 transistors (recall inverter is 2 transistors)
Writing this cell
word enable input comes from decoder
When 0, value d loops around inverters
That loop is where a bit stays stored

d
a

word
enable

When 1, the data bit value enters the loop


data is the bit to be stored in this cell
data enters on other side
Example shows a 1 being written into cell

Digital Design 2e
Copyright 2010
Frank Vahid

data

1
word
enable

data
cell
d

54

Static RAM (SRAM)


32
10

data
addr
rw

1024x32
RAM

en

Static RAM cell

SRAM cell

Somewhat trickier
When rw set to read, the RAM logic sets
both data and data to 1
The stored bit d will pull either the left line or
the right bit down slightly below 1
Sense amplifiers detect which side is
slightly pulled down

data
1

data
1

Reading this cell

d
1

0
a

word
enable

<1

To sense amplifiers

The electrical description of SRAM is really


beyond our scope just general idea here,
mainly to contrast with DRAM...
Digital Design 2e
Copyright 2010
Frank Vahid

55

Dynamic RAM (DRAM)


32
10

data
addr
rw

1024x32
RAM

en

DRAM cell

Dynamic RAM cell


1 transistor (rather than 6)
Relies on large capacitor to store bit
Write: Transistor conducts, data voltage
level gets stored on top plate of capacitor
Read: Just look at value of d
Problem: Capacitor discharges over time
Must refresh regularly, by reading d and
then writing it right back

data
cell
word
enable

capacitor
slowly
discharging
(a)

data
enable
d

Digital Design 2e
Copyright 2010
Frank Vahid

discharges
(b)

56

Comparing Memory Types


Register file
Fastest
But biggest size

SRAM
Fast
More compact than register file

DRAM

MxN Memory
implemented as a:
register
file
SRAM
DRAM

Slowest
And refreshing takes time

But very compact

Use register file for small items,


SRAM for large items, and DRAM
for huge items

Size comparison for same


number of bits (not to scale)

Note: DRAMs big capacitor requires


a special chip design process, so
DRAM is often a separate chip
Digital Design 2e
Copyright 2010
Frank Vahid

57

Reading and Writing a RAM


clk
2

1
addr

13

data

500

999

rw

3
9
Z

addr
500

1 means write

en

clk

data
rw

valid setup
time
valid
hold
time
setup
time

500

access
time

RAM[9]
RAM[13]
now equals 500 now equals 999

Writing

(b)

Put address on addr lines, data on data lines, set rw=1, en=1

Reading
Set addr and en lines, but put nothing (Z) on data lines, set rw=0
Data will appear on data lines

Dont forget to obey setup and hold times


In short keep inputs stable before and after a clock edge
Digital Design 2e
Copyright 2010
Frank Vahid

58

RAM Example: Digital Sound Recorder

wire
microphone

en

rw

addr

data

4096x16
RAM

16
analog-todigital
conver ter

ad_buf
ad_ld

12

Ra RrwRen

processor

digital-toanalog
conver ter

wire

da_ld

Behavior

speaker

Record: Digitize sound, store as series of 4096 12-bit digital values in RAM
Well use a 4096x16 RAM (12-bit wide RAM not common)

Play back later


Common behavior in telephone answering machine, toys, voice recorders

To record, processor should read a-to-d, store read values into


successive RAM words
To play, processor should read successive RAM words and enable d-to-a
Digital Design 2e
Copyright 2010
Frank Vahid

59

RAM Example: Digital Sound Recorder


4096x16
RAM

RTL design of processor


Create HLSM
Begin with the record behavior
Create local storage a
Stores current address,
ranges from 0 to 4095 (thus
need 12 bits)

Create state machine that


counts from 0 to 4095 using a
For each a
Read analog-to-digital conv.
ad_ld:=1, ad_buf:=1
Write to RAM at address a
Rareg:=a, Rrw:=1,
Ren:=1
Digital Design 2e
Copyright 2010
Frank Vahid

16
analog-todigital
converter

ad_buf
ad_ld

12

digital-toanalog
converter

Ra Rw Ren

processor

da_ld

Record behavior
Local register: a, Rareg (12 bits)
a<4095
S
a:=0

T
ad_ld:=1
ad_buf:=1
Rareg:=a
Rrw:=1
Ren:=1

U
a:=a+1
(a<4095)

60

RAM Example: Digital Sound Recorder


Now create play behavior
Use local register a again,
create state machine that
counts from 0 to 4095 again
For each a
Read RAM
Write to digital-to-analog conv.

Note: Must write d-to-a one


cycle after reading RAM, when
the read data is available on
the data bus

The record and play state


machines would be parts of a
larger state machine controlled
by signals that determine when
to record or play
Digital Design 2e
Copyright 2010
Frank Vahid

4096x16
RAM

data bus

16
analog-todigital
converter

ad_buf
ad_ld

12

digital-toanalog
converter

Ra Rw Ren

processor

da_ld

Play behavior
Local register: a,Rareg (12 bits)
a<4095
V
a:=0

W
ad_buf:=0
Rareg:=a
Rrw=0
Ren=1

X
da_ld:=1
a:=a+1
(a<4095)

61

Read-Only Memory ROM


Memory that can only be read from, not
written to

32

data

10

addr

Data lines are output only


No need for rw input

rw

1024 32
RAM

en

Advantages over RAM


Compact: May be smaller
Nonvolatile: Saves bits even if power supply
is turned off
Speed: May be faster (especially than
DRAM)
Low power: Doesnt need power supply to
save bits, so can extend battery life

Choose ROM over RAM if stored data wont


change (or wont change often)

RAM block symbol

32
10

data
addr 1024x32
ROM
en

ROM block symbol

For example, a table of Celsius to Fahrenheit


conversions in a digital thermometer
Digital Design 2e
Copyright 2010
Frank Vahid

62

Read-Only Memory ROM


32

data

10
addr

1024x32
ROM

Let A = log2M

en

d0

ROM block symbol

addr0
addr1

addr(A-1)
addr

a0
a1 AxM
d1
decoder
a(A-1)

clk

word
enable

bit storage
block
(aka cell)
w ord

data
word word
enable enable
data

d(M-1)

en
rdata(N-1) rdata(N-2) rdata0

ROM cell

Internal logical structure similar to RAM, without the data


input lines
Digital Design 2e
Copyright 2010
Frank Vahid

63

ROM Types
If a ROM can only be read, how
are the stored bits stored in the
first place?
Storing bits in a ROM known as
programming
Several methods

Mask-programmed ROM
Bits are hardwired as 0s or 1s
during chip manufacturing
2-bit word on right stores 10
word enable (from decoder) simply
passes the hardwired value
through transistor

data line
cell

data line
cell

word
enable

Notice how compact, and fast, this


memory would be
Digital Design 2e
Copyright 2010
Frank Vahid

64

ROM Types
Fuse-Based Programmable
ROM
Each cell has a fuse
A special device, known as a
programmer, blows certain fuses
(using higher-than-normal voltage)
Those cells will be read as 0s
(involving some special electronics)
Cells with unblown fuses will be read
as 1s
2-bit word on right stores 10

Also known as One-Time


Programmable (OTP) ROM

Digital Design 2e
Copyright 2010
Frank Vahid

data line
cell

data line
cell
a

word
enable
fuse

blown fuse

65

ROM Types
Erasable Programmable ROM
(EPROM)

Electrons become trapped in the gate


Only done for cells that should store 0
Other cells (without electrons trapped in
gate) will be 1
2-bit word on right stores 10

Details beyond our scope just general


idea is necessary here

floating-gate
transistor

Uses floating-gate transistor in each cell


Special programmer device uses higherthan-normal voltage to cause electrons to
tunnel into the gate

word
enable

data line

data line

cell

cell
a

1
0

1
e- e-

trapped electrons

To erase, shine ultraviolet light onto chip


Gives trapped electrons energy to escape
Requires chip package to have window
Digital Design 2e
Copyright 2010
Frank Vahid

66

ROM Types
Electronically-Erasable Programmable ROM
(EEPROM)
Similar to EPROM
Uses floating-gate transistor, electronic programming to
trap electrons in certain cells

But erasing done electronically, not using UV light


Erasing done one word at a time

Flash memory
Like EEPROM, but all words (or large blocks of
words) can be erased simultaneously
Became very common starting in late 1990s

Both types are in-system programmable


Can be programmed with new stored bits while in the
system in which the ROM operates
Requires bi-directional data lines, and write control input
Also need busy output to indicate that erasing is in
progress erasing takes some time
Digital Design 2e
Copyright 2010
Frank Vahid

32

data

10
addr

en

1024x32
EEPROM

write

busy

67

ROM Example: Talking Doll


Hello there!

4096x16 ROM

Hello there! audio


divided into 4096
samples, stored
in ROM

16

speaker

Hello there!
a

digital-toanalog
Ra Ren

processor

converter

vibration
sensor

da_ld
v

Doll plays prerecorded message, triggered by vibration


Message must be stored without power supply Use a ROM, not a RAM,
because ROM is nonvolatile
And because message will never change, may use a mask-programmed ROM or
OTP ROM

Processor should wait for vibration (v=1), then read words 0 to 4095 from
the ROM, writing each to the d-to-a
Digital Design 2e
Copyright 2010
Frank Vahid

68

ROM Example: Talking Doll


Local register: a, Rareg (12 bits)
4096x16 ROM

v
a:=0

16

digital-toanalog
converter

Ra Ren

processor

a<4095
T
Rareg:=a
Ren:=1
U

v
(a<4095)

da_ld

da_ld:=1
a:=a+1

HLSM
Create state machine that waits for v=1, and then counts from 0 to
4095 using a local storage a
For each a, read ROM, write to digital-to-analog converter
Digital Design 2e
Copyright 2010
Frank Vahid

69

ROM Example: Digital Telephone Answering Machine


Using a Flash Memory
Want to record the outgoing
announcement

What type of memory?


Should store without power
supply ROM, not RAM
Should be in-system
programmable EEPROM
or Flash, not EPROM, OTP
ROM, or mask-programmed
ROM
Will always erase entire
memory when
reprogramming Flash
better than EEPROM
Digital Design 2e
Copyright 2010
Frank Vahid

analog-todigital
converter

busy

erase

en

rw

Were not home.


addr

When rec=1, record digitized


sound in locations 0 to 4095
When play=1, play those
stored sounds to digital-toanalog converter

4096x16 Flash
data

16
ad_buf
ad_ld

12

Ra Rrw Ren er

processor

record

microphone

bu

digital-toanalog
converter

da_ld

rec
play

speaker

70

ROM Example: Digital Telephone Answering Machine


Using a Flash Memory
HLSM
Once rec=1, begin
erasing flash by setting
er=1
Wait for flash to finish
erasing by waiting for
bu=0
Execute loop that sets
local register a from 0 to
4095, reading analog-todigital converter and
writing to flash for each a

4096x16 Flash

analog-todigital
converter

ad_buf

12

ad_ld

Ra Rrw Ren er

processor

bu

digital-toanalog
converter

da_ld

rec
record
microphone

S
a:=0
er:=1
rec

Digital Design 2e
Copyright 2010
Frank Vahid

16

play
speaker

Local register: a, Rareg (13 bits)


bu
a<4096
T bu U
er:=0

ad_ld:=1
ad_buf:=1
Rareg:=a
Rrw:=1
Ren:=1
a:=a+1

V
(a<4096)

71

Blurring of Distinction Between ROM and RAM


We said that
RAM is readable and writable
ROM is read-only

ROM

Flash

EEPROM

RAM
a

NVRAM

But some ROMs act almost like RAMs


EEPROM and Flash are in-system programmable
Essentially means that writes are slow
Also, number of writes may be limited (perhaps a few million times)

And, some RAMs act almost like ROMs


Non-volatile RAMs: Can save their data without the power supply
One type: Built-in battery, may work for up to 10 years
Another type: Includes ROM backup for RAM controller writes RAM contents to
ROM before turning off

New memory technologies evolving that merge RAM and ROM benefits
e.g., MRAM

Bottom line
Lot of choices available to designer, must find best fit with design goals
Digital Design 2e
Copyright 2010
Frank Vahid

72

5.8

Queues (FIFOs)
A queue is another component
sometimes used during RTL
design
Queue: A list written to at the
back, from read from the front
Like a list of waiting restaurant
customers

Writing called a push, reading


called a pop
Because first item written into a
queue will be the first item read
out, also called a FIFO (first-infirst-out)
Digital Design 2e
Copyright 2010
Frank Vahid

back

front

write items

read (and

to the back
of the queue

remove) items
from front of
the queue

73

Queues
7

Queue has addresses, and two


pointers: rear and front

rf
0

Initially both point to 0

Push (write)
Item written to address pointed to
by rear
rear incremented

Pop (read)
Item read from address pointed
to by front
front incremented

If front or rear reaches 7, next


(incremented) value should be 0
(for a queue with addresses 0 to
7)
Digital Design 2e
Copyright 2010
Frank Vahid

r
2

r
1

f
0

f
0

B
r

f
74

Queues
Treat memory as a circle

If front or rear reaches 7, next (incremented)


value should be 0 rather than 8 (for a queue
with addresses 0 to 7)

Two conditions of interest

Full queue no room for more items

In 8-entry queue, means 8 items present


No further pushes allowed until a pop occurs
Causes front=rear

Empty queue no items


No pops allowed until a push occurs
Causes front=rear

Both conditions have front=rear


To detect whether front=rear means full or
empty, need state machine that detects if
previous operation was push or pop, sets full
or empty output signal (respectively)
Digital Design 2e
Copyright 2010
Frank Vahid

5
4
75

Queue Implementation

rear used as register files


write address, front as read
address

Simple controller would


set control lines for
pushes and pops, and
also detect full and empty
situations
FSM for controller not
shown
Digital Design 2e
Copyright 2010
Frank Vahid

8x16 register file

16

wdata

wdata

rdata

waddr

raddr

wr

reset

inc

3-bit
up counter
rear

eq

rdata

clr

inc
rd

16

rd

clr

wr

Controller

Can use register file for


item storage
Implement rear and front
using up counters

3-bit
up counter
front

=
full
empty
8-word 16-bit queue

76

Common Uses of a Queue


Computer keyboard
Pushes pressed keys onto queue, meanwhile pops and sends to
computer

Digital video recorder


Pushes captured frames, meanwhile pops frames, compresses
them, and stores them

Computer network routers


Pushes incoming packets onto queue, meanwhile pops packets,
processes destination information, and forwards each packet out
over appropriate port

Digital Design 2e
Copyright 2010
Frank Vahid

77

Queue Usage Example


Example series of pushes
and pops
Note how rear and front
pointers move
Note that popping doesnt
really remove the data from the
queue, but that data is no
longer accessible
Note how rear (and front)
wraps around from address 7
to 0

Note: pushing a full queue is


an error
So is popping an empty queue

Initially empt y
queue
6

rf
0

f
0

r
7

f
1

9
r
0
3

7
1. Aft er pushing
9, 5, 8, 5, 7, 2, 3
r
7
2. Aft er popping

3. Aft er pushing 6

4. Aft er pushing 3

f
1

data:
9

full

rf
Digital Design 2e
Copyright 2010
Frank Vahid

5. Aft er pushing 4

ERROR! Pushing a full queue


results in unknown state.

78

5.9

Multiple Processors
Using multiple processors
can ease design
Keeps distinct behaviors
separate
Ex: Laser-based distance
measurer with button
debounce

from ButtonDebouncer
button
Bin
Bout

16

Laser-based
distance
measurer

to display

L
to laser
S
from sensor

Use two processors

Ex: Code detector with


button press synchronizers
(BPS)
BPS processor for each
input, plus CodeDetector
processor

Star t
Red
Green
Blue

si

BPS

s
u

ri
gi

BPS

bi
ai

BPS

BPS

r
g

Code
detector

Door
lock

b
a

BPS
Digital Design 2e
Copyright 2010
Frank Vahid

79

Interfacing Multiple Processors


Use signal, register, or other component outside processors
Known as global

Common methods use global...


control signal, data signal, register, register file, queue

Typically all multiple processors and clocked globals use


same clock
Synchronized

Digital Design 2e
Copyright 2010
Frank Vahid

80

Ex: Temperature Statistics with Multiple Processors

16-bit unsigned input T from temperature sensor, 16-bit output A. Sample T


every 1 second. Compute output A every minute, should equal average of most
recent 64 samples.
Single HLSM: Complicated
Instead, two HLSMs (and hence two processors) and shared register file
Tsample HLSM: Store T into successive RF address, once per sec.
Avg HLSM: Compute and output average of all 64 RF words, once per min.
Note that each uses distinct timer

Keeping the
sampling and
averaging
behaviors
separate leads to
simple design
Digital Design 2e
Copyright 2010
Frank Vahid

TempStats

W_d
W_a
W_e

Tsample

W_d
W_a
R_a
W_e
R_e

TRF
RF[64](16)
R_d

R_a
R_e

Avg

R_d

81

Ex: Digital Camera with Mult. Processors and Queue


Read and Compress processors (Ch 1)
Compress may take longer, depends on picture
Use queue, read can push additional pics (up to 8)
Likewise, use queue between Compress and Store

Image sensor

8
Read
circuit

wdata
rdata
wr
rd
full Queue empty
[8](8)

8
Compress
circuit

Queue
[8](8)

Store
circuit

Memory
a

Digital Design 2e
Copyright 2010
Frank Vahid

82

5.10

Hierarchy A Key Design Concept

CityG
Country A

Province 3

Digital Design 2e
Copyright 2010
Frank Vahid

CityF

Province 3

To go from transistors to gates, muxes,


decoders, registers, ALUs, controllers,
datapaths, memories, queues, etc.
Imagine trying to comprehend a controller
and datapath at the level of gates

CityE

Province 2

Hierarchy helps us manage complexity

CityC

Province 1

1 item at top (the country)


Country item decomposed into
state/province items
Each state/province item decomposed into
city items

CityB

Province 2

Organization with few items at the top, with


each item decomposed into other items
Common example: Country

Province 1

Hierarchy

CityD

CityA

Country A

Map showing just top two levels


of hierarchy
83

Hierarchy and Abstraction


Abstraction
Hierarchy often involves not just
grouping items into a new item, but also
associating higher-level behavior with
the new item, known as abstraction
Ex: 8-bit adder has understandable highlevel behavioradds two 8-bit binary
numbers

Frees designer from having to


remember, or even understand, the
lower-level details

Digital Design 2e
Copyright 2010
Frank Vahid

a7.. a0

b7.. b0

8-bit adder
co

ci

s7.. s0

84

Hierarchy and Composing Larger Components


from Smaller Versions

A common task is to compose smaller components


into a larger one

i0

4x1
i0

Gates: Suppose you have plenty of 3-input AND gates,


but need a 9-input AND gate

i1

i1

Can simple compose the 9-input gate from several 3-input


gates

i2

i2

i3

i3

Muxes: Suppose you have 4x1 and 2x1 muxes, but


need an 8x1 mux
s2 selects either top or bottom 4x1
s1s0 select particular 4x1 input
Implements 8x1 mux 8 data inputs, 3 selects, one output

P
ro
vin
c
e1

d
2x1
s1

s0

i0
d

i4

4x1
i0

i5

i1

i6

i2

i1
s0
d

i3
s1

s0

0
s1

Digital Design 2e
Copyright 2010
Frank Vahid

s0

s2

85

Hierarchy and Composing Larger Components


from Smaller Versions
Composing memory very common
Making memory words wider

en

addr

Easy just place memories side-by-side until desired width obtained


Share address/control lines, concatenate data lines
Example: Compose 1024x8 ROMs into 1024x32 ROM
10

addr
1024x8
ROM
en
data

addr
1024x8
ROM
en
data

addr
1024x8
ROM
en
data

addr
1024x8
ROM
en
data

data(31..0)
10

1024x32
ROM
data
Digital Design 2e
Copyright 2010
Frank Vahid

32

86

Put memories on top of one another until the


number of desired words is achieved
Use decoder to select among the memories
Can use highest order address input(s) as
decoder input
Although actually, any address line could be
used

a10 just chooses


which memory
to access

Digital Design 2e
Copyright 2010
Frank Vahid

0 1
0 1

P
1 1 1
ro
vin
1 1 1

1
1

P
r1 1
1o
vin
1 c1 1

1 0
1 1

1 0 0 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 0 1
1 0 0 0 0 0 0 0 0 1 0
1 1 1 1 1 1 1 1 1 1 0
1 1 1 1 1 1 1 1 1 1 1

a9..a0
a10

addr

1x2 d0
dcd
i0
e d1

1024x8
ROM
en

data
8

addr
11

en

Example: Compose 1024x8 memories into


2048x8 memory
a10 a9a8
a0
0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 1
0 0 0 0 0 0 0 0 0 1 0

11

addr

Creating memory with more words

en

addr

Hierarchy and Composing Larger Components


from Smaller Versions

1024x8
ROM
2048x8
ROM

en

data
8

data
addr
1024x8
ROM
en data

addr
1024x8
ROM
en data

To create memory with more


words and wider words, can first
compose to enough words, then
widen.
87

Chapter Summary
Modern digital design involves creating processor-level components
High-level state machines
RTL design process
1. Capture behavior: Use HLSM
2. Convert to circuit
A. Create datapath B. Connect DP to controller C. Derive controller FSM

More RTL design


More components, arrays, timers, control vs. data dominated

Determining fastest clock frequency


By finding critical path

Behavioral-level design C to gates


By using method to convert C (subset) to high-level state machine

Memory components (RAM, ROM)


Queues
Multiple processors
Hierarchy: A key concept used throughout Chapters 2-5

Digital Design 2e
Copyright 2010
Frank Vahid

88

Potrebbero piacerti anche