Acknowledgement
Slides taken from http://bwrc.eecs.berkeley.edu/IcBook/index.htm, the web site of Digital Integrated Circuits: A Design Perspective by Rabaey, Chandrakasan, and Nikolic
Outline
Multipliers
Shifter
Multipliers
Expensive and slow operations
Multiplication units are found in state-of-the-art DSPs and microprocessors
Essentially complex adders; the earlier discussion on adders is relevant
Three steps: partial-product generation; accumulation; final summation
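$$Z = X \times Y = \sum_{k=0}^{M+N-1} Z_k 2^k = \left(\sum_{i=0}^{M-1} X_i 2^i\right)\left(\sum_{j=0}^{N-1} Y_j 2^j\right) = \sum_{i=0}^{M-1}\sum_{j=0}^{N-1} X_i Y_j 2^{i+j}$$
with
$$X = \sum_{i=0}^{M-1} X_i 2^i, \qquad Y = \sum_{j=0}^{N-1} Y_j 2^j$$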
Multiplication needs M cycles using an N-bit adder
In shift and add:
- M partial products are added
- Each partial product is the AND of a multiplier bit with the multiplicand, followed by a shift
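Example: 101010 × 1011 (multiplier recovered from the multiplicand and the result)

          1 0 1 0 1 0     Multiplicand
    ×         1 0 1 1     Multiplier
          1 0 1 0 1 0
        1 0 1 0 1 0       Partial products
      0 0 0 0 0 0
  + 1 0 1 0 1 0
    1 1 1 0 0 1 1 1 0     Result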
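The shift-and-add scheme can be sketched in a few lines of Python. This is a minimal software model, not a hardware description; the function name `shift_and_add` and its parameters are hypothetical, and the operands are assumed to be unsigned.

```python
def shift_and_add(x: int, y: int, n_bits: int) -> int:
    """Multiply unsigned x by the n_bits-wide unsigned multiplier y."""
    acc = 0
    for j in range(n_bits):
        y_j = (y >> j) & 1           # multiplier bit j
        partial = x if y_j else 0    # AND of the multiplicand with bit y_j
        acc += partial << j          # shift by the bit weight, then accumulate
    return acc


if __name__ == "__main__":
    # Operands 101010 and 1011 (42 and 11 in decimal).
    print(bin(shift_and_add(0b101010, 0b1011, 4)))
```

Each loop iteration corresponds to one cycle: one partial product is formed and added to the running sum.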
A partial product is the logical AND of multiplicand X and multiplier bit Yi
Adding zeros has no impact on the result
Recoding the multiplier can reduce the number of partial products by half
E.g. 0111 1110 → 1000 00$\bar{1}$0, where $\bar{1} = -1$
So only two partial products need be added!
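A small sketch of the recoding in the example above, assuming the standard radix-2 Booth rule (digit j equals bit j-1 minus bit j, with an implicit 0 to the right of the LSB). The function name `booth_radix2_digits` is hypothetical, and the multiplier is passed as a bit pattern of known width.

```python
def booth_radix2_digits(y_bits: int, n_bits: int) -> list[int]:
    """Recode an n_bits-wide multiplier into digits in {-1, 0, 1}, LSB first."""
    digits = []
    prev = 0                          # implicit bit to the right of the LSB
    for j in range(n_bits):
        cur = (y_bits >> j) & 1
        digits.append(prev - cur)     # 01 -> +1, 10 -> -1, 00 and 11 -> 0
        prev = cur
    return digits


if __name__ == "__main__":
    digits = booth_radix2_digits(0b01111110, 8)
    print(digits[::-1])               # MSB first, as written on the slide
    print(sum(1 for d in digits if d != 0), "non-zero partial products")
```

Only the non-zero digits generate partial products, which is why a run of ones collapses to one addition and one subtraction.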
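Modified (radix-4) recoding: $Y = \sum_{j=0}^{(N-1)/2} Y_j 4^j$ with $Y_j \in \{-2, -1, 0, 1, 2\}$
E.g. 01111111 is bunched into 01(1), 11(1), 11(1), 11(0), each group overlapping its neighbour by one bit
The recoded multiplier is 10 00 00 0$\bar{1}$ (see table)
Four partial products are developed instead of eight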
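The grouping above can be checked with a short sketch. It assumes the usual radix-4 rule (digit = y[2j-1] + y[2j] - 2·y[2j+1], with an implicit 0 below the LSB), a two's-complement bit pattern, and an even word length; the names `booth_radix4_digits` and `booth_multiply` are hypothetical.

```python
def booth_radix4_digits(y_bits: int, n_bits: int) -> list[int]:
    """Recode an n_bits two's-complement pattern into digits in {-2..2}, LSB first."""
    if n_bits % 2:
        raise ValueError("this sketch assumes an even word length")
    padded = y_bits << 1                       # append the implicit 0 below the LSB
    digits = []
    for j in range(n_bits // 2):
        group = (padded >> (2 * j)) & 0b111    # bits y[2j+1], y[2j], y[2j-1]
        hi, mid, lo = (group >> 2) & 1, (group >> 1) & 1, group & 1
        digits.append(lo + mid - 2 * hi)
    return digits


def booth_multiply(x: int, y_bits: int, n_bits: int) -> int:
    """Sum one partial product per radix-4 digit (each is 0, ±x or ±2x, shifted)."""
    digits = booth_radix4_digits(y_bits, n_bits)
    return sum(d * x * (4 ** j) for j, d in enumerate(digits))


if __name__ == "__main__":
    print(booth_radix4_digits(0b01111111, 8)[::-1])   # MSB first
    print(booth_multiply(5, 0b01111111, 8))
```

An 8-bit multiplier always yields four digits here, so at most four partial products are generated, each obtainable from the multiplicand by a shift and an optional negation.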
[Figure: 4 × 4 array multiplier built from half adders (HA) and full adders (FA); inputs X3..X0 and Y3..Y0, outputs Z7..Z0]
N partial products of M bits each
N × M two-input AND gates; N-1 M-bit adders
Layout need not be staggered; the routing takes care of the shift
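A bit-level sketch of this structure: N partial products formed by AND gates, accumulated by N-1 ripple-carry rows of M one-bit adders. The helper names (`full_adder`, `to_bits`, `from_bits`, `array_multiply`) are hypothetical, and every adder cell is modelled as a full adder even where the hardware could use a half adder.

```python
def full_adder(a: int, b: int, cin: int) -> tuple[int, int]:
    """One-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def to_bits(value: int, width: int) -> list[int]:
    return [(value >> k) & 1 for k in range(width)]    # LSB first


def from_bits(bits: list[int]) -> int:
    return sum(b << k for k, b in enumerate(bits))


def array_multiply(x_bits: list[int], y_bits: list[int]) -> list[int]:
    """Multiply two unsigned LSB-first bit vectors; returns M+N product bits."""
    row = [x & y_bits[0] for x in x_bits]    # first partial product (M AND gates)
    carry_out = 0
    z = []
    for j in range(1, len(y_bits)):
        z.append(row[0])                     # lowest bit of the running sum is final
        upper = row[1:] + [carry_out]        # the one-bit shift is done by wiring
        pp = [x & y_bits[j] for x in x_bits] # next partial product
        carry, new_row = 0, []
        for a, b in zip(upper, pp):          # one M-bit ripple-carry adder row
            s, carry = full_adder(a, b, carry)
            new_row.append(s)
        row, carry_out = new_row, carry
    return z + row + [carry_out]


if __name__ == "__main__":
    print(from_bits(array_multiply(to_bits(13, 4), to_bits(11, 4))))
```

Note that the shift never appears as an operation: it is only a change of index between rows, which mirrors the remark that routing takes care of it.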
Similar circuits for sum and carry generation
$t_{sum} = t_{carry}$ in this case
Carry-Save Multiplier
[Figure: carry-save multiplier array of half-adder (HA) and full-adder (FA) cells]
Multiplier Floorplan
[Figure: multiplier floorplan; inputs X3..X0 and Y0..Y3, rows of HA multiplier cells and FA multiplier cells passing carry (C) and sum (S) signals, outputs Z0..Z7]
Can make layout rectangular
Regular shape and layout
Amenable to automation
Wallace-Tree Multiplier
[Figure: Wallace-tree reduction by bit position (6 to 0), showing the partial products and the first stage]
Higher speeds
[Figure: full-adder (FA) arrangements combining inputs y0..y5, with carries Ci-1 in and Ci out, producing sum S and carry C]
3-to-2 compression
4-to-2 and higher-order compression have been proposed
Today's high-performance multipliers do just that!
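A word-level sketch of 3-to-2 compression applied in tree fashion. It is a simplification: a real Wallace tree assigns adders column by column, while this model compresses whole operands with bitwise operations. The names `compress_3_2` and `wallace_multiply` are hypothetical, and the operands are assumed unsigned.

```python
def compress_3_2(a: int, b: int, c: int) -> tuple[int, int]:
    """Carry-save step: three operands in, two out, with a + b + c == s + carry."""
    s = a ^ b ^ c                                  # per-bit sum, no carry propagation
    carry = ((a & b) | (a & c) | (b & c)) << 1     # per-bit majority, one position up
    return s, carry


def wallace_multiply(x: int, y: int, n_bits: int) -> tuple[int, int]:
    """Return (product, number of compression stages) for an n_bits-wide y."""
    operands = [(x << j) if (y >> j) & 1 else 0 for j in range(n_bits)]
    stages = 0
    while len(operands) > 2:
        nxt = []
        for k in range(0, len(operands) - 2, 3):   # full groups of three
            nxt.extend(compress_3_2(*operands[k:k + 3]))
        leftover = len(operands) % 3
        nxt.extend(operands[len(operands) - leftover:])   # passed on unchanged
        operands = nxt
        stages += 1
    return sum(operands), stages                   # sum() plays the final adder


if __name__ == "__main__":
    print(wallace_multiply(42, 11, 4))
```

The stage count grows roughly logarithmically with the number of partial products, and the only carry-propagating addition is the final one.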
Final adder choice is critical; it depends on the structure of the accumulator array
Carry-lookahead might be good if the data arrives simultaneously
Place a pipeline stage before the final addition
In non-pipelined designs, other adders give similar performance with less hardware
Multipliers Summary
Optimization goals differ from those of the binary adder
Once again: identify the critical path
Other possible techniques
- Logarithmic versus linear (Wallace-tree multiplier)
- Data encoding (Booth)
- Pipelining
First glimpse at system-level optimization
A 54 × 54 multiplier achieved a propagation delay of 4.4 ns
Combined Booth encoding and a Wallace tree using 4-2 compression
Built with pass transistors; mixed carry-select and carry-lookahead topology
Shifters
Needs extensive hardware support
Used for floating-point units, scalers, and multiplication by constants
A programmable shifter is more complex: it requires intricate multiplexer circuitry
[Figure: shifter bit-slice i, with inputs Ai, Ai-1 and outputs Bi, Bi-1]
Barrel shifter: the signal passes through only one transmission gate, so in theory the delay is constant regardless of shift value and shifter size
[Figure: barrel shifter with control lines Sh0..Sh3 and an output buffer; the legend distinguishes data wires from control wires]
$\mathrm{Width}_{\mathrm{barrel}} \approx 2\, p_m M$
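A behavioural sketch of the barrel-shifter idea: a matrix of switches in which exactly one control line is active, so each output connects to its source through a single switch. The function name `barrel_shift_right` is hypothetical, and two things are assumptions made for illustration: the control is one-hot, and vacated positions are filled with 0.

```python
def barrel_shift_right(a_bits: list[int], sh_onehot: list[int]) -> list[int]:
    """a_bits is LSB first; sh_onehot[k] == 1 selects a right shift by k."""
    if sum(sh_onehot) != 1:
        raise ValueError("exactly one control line must be active")
    width = len(a_bits)
    out = []
    for i in range(width):                 # one output line per bit
        bit = 0                            # assumed zero fill
        for k, sh in enumerate(sh_onehot):
            src = i + k                    # input column wired to output i on line Sh_k
            if sh and src < width:
                bit = a_bits[src]          # the single conducting switch
        out.append(bit)
    return out


if __name__ == "__main__":
    # 1101 shifted right by 2 (Sh2 active); bits are listed LSB first.
    print(barrel_shift_right([1, 0, 1, 1], [0, 0, 1, 0]))
```

The inner loop visits every control line for every output, which reflects the cost of the structure: the switch matrix grows with the product of word width and maximum shift.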
Logarithmic Shifter
[Figure: logarithmic shifter stages controlled by Sh1, Sh2 and Sh4]
Total shift is decomposed into powers of two
A maximum shift width of M needs $\log_2 M$ stages
The i-th stage shifts by $2^i$ or passes the data unchanged
Speed depends on shift length
The series connection of pass transistors slows the shifter down for larger shift values (intermediate buffers are needed)
Appropriate for larger shifts (in terms of area and speed)
Structure is regular
Can be parameterised / auto-generated
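The staged decomposition can be sketched as follows. This is a behavioural model only; the name `log_shift_right` is hypothetical, and the sketch assumes an unsigned word, a power-of-two width, zero fill, and a shift amount smaller than the width.

```python
def log_shift_right(value: int, shift: int, width: int) -> int:
    """Right-shift a width-bit word using one stage per bit of the shift amount."""
    if not 0 <= shift < width:
        raise ValueError("shift must be in the range 0..width-1")
    stages = (width - 1).bit_length()      # log2(width) when width is a power of two
    value &= (1 << width) - 1              # keep only the width-bit word
    for i in range(stages):
        if (shift >> i) & 1:               # control bit i (Sh1, Sh2, Sh4, ...)
            value >>= 1 << i               # stage i shifts by 2**i
        # otherwise the stage passes the data through unchanged
    return value


if __name__ == "__main__":
    print(bin(log_shift_right(0b10110100, 5, 8)))
```

A shift by 5 activates the stages for 1 and 4 and bypasses the stage for 2, so the data always crosses all stages in series, which is the source of the delay that buffers are meant to limit.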