Oct. 2014
Slide 1
Edition history: first released July 2003; revised July 2004, July 2005, Mar. 2006, Jan. 2007, Jan. 2008, Jan. 2009, Jan. 2011, and Oct. 2014.
Slide 2
Number Representation
Slide 3
[Fig. 13.3: data path with stages labeled instruction fetch (PC, next-address logic, instruction cache), ALU operation (register file, ALU), data access (data cache), and register writeback. Control signals shown: Br&Jump, RegDst, RegWrite, ALUSrc, ALUFunc, DataRead, DataWrite, RegInSrc.]
Slide 4
Slide 5
9 Number Representation
Arguably the most important topic in computer arithmetic:
Affects system compatibility and ease of arithmetic
Two's complement, floating-point (flp), and unconventional methods
Topics in This Chapter
9.1 Positional Number Systems
9.2 Digit Sets and Encodings
9.3 Number-Radix Conversion
9.4 Signed Integers
9.5 Fixed-Point Numbers
9.6 Floating-Point Numbers
Slide 6
For example:
$27 = (11011)_{two} = (1 \times 2^4) + (1 \times 2^3) + (0 \times 2^2) + (1 \times 2^1) + (1 \times 2^0)$
Number of digits for $[0, P]$: $k = \lceil \log_r (P + 1) \rceil = \lfloor \log_r P \rfloor + 1$
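The positional-value and digit-count formulas can be checked with a short sketch. The function names are hypothetical, chosen only for this illustration; integer arithmetic is used instead of logarithms to avoid rounding trouble.

```python
def positional_value(digits, radix):
    """Value of a digit list given most-significant digit first."""
    value = 0
    for d in digits:
        value = value * radix + d
    return value


def digits_needed(p, radix):
    """Smallest k such that k radix-r digits cover the interval [0, p]."""
    k = 1
    limit = radix              # limit == radix ** k
    while limit <= p:          # stop once radix ** k >= p + 1
        limit *= radix
        k += 1
    return k
```

For instance, `positional_value([1, 1, 0, 1, 1], 2)` evaluates the binary example above, and `digits_needed(27, 2)` gives the number of bits needed for the range [0, 27].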
Slide 7
[Figure: 4-bit binary codes 0000 through 1111 arranged on a wheel, representing the integers 0 through 15. Turn x notches counterclockwise to add x; turn y notches clockwise to subtract y.]
Slide 8
[Figure: a finite number range ending at max. Numbers smaller than max are representable; numbers larger than max fall in the overflow region.]
Slide 9
Slide 10
Carry-Save Numbers
Radix-2 numbers using the digits 0, 1, and 2
Example: $(1\ 0\ 2\ 1)_{two} = (1 \times 2^3) + (0 \times 2^2) + (2 \times 2^1) + (1 \times 2^0) = 13$
Possible encodings
(a) Binary
    Digit 0: 00
    Digit 1: 01
    Digit 2: 10
    11 (Unused)
    Digits        1 0 2 1
    MSB           0 0 1 0 = 2
    LSB           1 0 0 1 = 9
(b) Unary
    Digit 0: 00
    Digit 1: 01 (First alternate) or 10 (Second alternate)
    Digit 2: 11
    Digits        1 0 2 1
    First bit     0 0 1 1 = 3
    Second bit    1 0 1 0 = 10
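The binary encoding of carry-save digits can be sketched as follows. The helper names are hypothetical; the sketch assumes digits are listed most-significant first and splits them into the MSB and LSB words described on the slide.

```python
def carry_save_binary(digits):
    """Split carry-save digits (MSD first, each 0, 1, or 2) into MSB and LSB words."""
    msb_word = 0
    lsb_word = 0
    for d in digits:
        if d not in (0, 1, 2):
            raise ValueError("carry-save digits must be 0, 1, or 2")
        msb_word = (msb_word << 1) | (d >> 1)   # high bit of the 2-bit code
        lsb_word = (lsb_word << 1) | (d & 1)    # low bit of the 2-bit code
    return msb_word, lsb_word


def carry_save_value(msb_word, lsb_word):
    """Each MSB-word bit counts twice as much as the LSB-word bit beside it."""
    return 2 * msb_word + lsb_word
```

Applied to the digits 1 0 2 1, the two words come out as 2 and 9, and the weighted sum recovers 13.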
Slide 11
[Figure: a. Carry-save addition: a carry-save input and a binary input produce a carry-save output. Two carry-save inputs are combined through two carry-save additions.]
Slide 12
Slide 13
$(x_{k-1} x_{k-2} \cdots x_0)_r = x_{k-1} r^{k-1} + x_{k-2} r^{k-2} + \cdots + x_1 r + x_0$
$= x_0 + r (x_1 + r (x_2 + r (\cdots)))$
Justifying Horner's rule.
[Figure 9.4: x is split into the binary representation of ⌊x/2⌋ and the bit x mod 2.]
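A minimal sketch of both directions of radix conversion: Horner's rule for evaluating a digit string, and repeated division (x mod r, then ⌊x/r⌋) for producing digits. Function names are hypothetical, and digits are stored least-significant first to match the nesting order of Horner's rule.

```python
def horner(digits_lsd_first, radix):
    """Evaluate x0 + r(x1 + r(x2 + ...)) from the innermost term outward."""
    value = 0
    for d in reversed(digits_lsd_first):
        value = value * radix + d
    return value


def to_radix(x, radix):
    """Digits of x >= 0 in the given radix, least-significant digit first."""
    digits = [x % radix]
    x //= radix
    while x > 0:
        digits.append(x % radix)   # next digit is x mod r
        x //= radix                # continue with floor(x / r)
    return digits
```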
Slide 14
Signed-magnitude representation
+27 in 8-bit signed-magnitude binary code 0 0011011
−27 in 8-bit signed-magnitude binary code 1 0011011
−27 in 2-digit decimal code with BCD digits 1 0010 0111
Biased representation
Represent the interval of numbers [−N, P] by the unsigned
interval [0, P + N]; i.e., by adding N to every number
Slide 15
Two's-Complement Representation
With k bits, numbers in the range $[-2^{k-1}, 2^{k-1} - 1]$ are represented.
Negation is performed by inverting all bits and adding 1.
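As an illustration of the range and the negation rule, here is a small sketch with hypothetical helper names. Bit patterns are held as non-negative Python integers of width k.

```python
def to_twos_complement(value, k):
    """k-bit two's-complement pattern of a signed integer."""
    lo, hi = -(1 << (k - 1)), (1 << (k - 1)) - 1
    if not lo <= value <= hi:
        raise OverflowError("value outside the k-bit two's-complement range")
    return value & ((1 << k) - 1)


def from_twos_complement(pattern, k):
    """Signed value of a k-bit pattern (the sign bit has weight -2**(k-1))."""
    sign_bit = 1 << (k - 1)
    return (pattern ^ sign_bit) - sign_bit


def negate(pattern, k):
    """Invert all bits and add 1, keeping k bits."""
    mask = (1 << k) - 1
    return ((pattern ^ mask) + 1) & mask
```

Note that negating the most negative pattern (1000 for k = 4) returns the same pattern, since +8 is outside the 4-bit range.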
[Figure: 4-bit codes 0000 through 1111 arranged on a wheel, representing the two's-complement integers −8 through +7. Turn x notches counterclockwise to add x; turn 16 − y notches counterclockwise to add −y (subtract y).]
Slide 16
(i.e., 75)
Slide 17
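[Figure 9.6: k-bit adder used as an adder/subtractor. Inputs are x and either y or its complement, selected by AddSub, which also drives c_in; outputs are the k-bit result x ± y and c_out.]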
Slide 18
Slide 19
[Figure: 4-bit fixed-point codes 0.000 through 1.111 arranged on a wheel, representing two's-complement values from −1 to +.875 in steps of .125.]
Slide 20
Slide 21
Significand
Exponent base
Exponent
Also, 7E9
Slide 22
[Figure 9.8: 64-bit floating-point format.
Sign
Exponent: 11 bits, bias = 1023, range −1022 to 1023
Significand: 52 bits for fractional part (plus hidden 1 in integer part)]
Slide 23
Feature               Single/Short                  Double/Long
Word width (bits)     32                            64
Significand bits      23 + 1 hidden                 52 + 1 hidden
Significand range     [1, 2 − 2^−23]                [1, 2 − 2^−52]
Exponent bits         8                             11
Exponent bias         127                           1023
Zero (±0)             e + bias = 0, f = 0           e + bias = 0, f = 0
Subnormal             e + bias = 0, f ≠ 0           e + bias = 0, f ≠ 0
                      represents ±0.f × 2^−126      represents ±0.f × 2^−1022
Infinity (±∞)         e + bias = 255, f = 0         e + bias = 2047, f = 0
Not-a-number (NaN)    e + bias = 255, f ≠ 0         e + bias = 2047, f ≠ 0
Ordinary number       e + bias ∈ [1, 254]           e + bias ∈ [1, 2046]
                      e ∈ [−126, 127]               e ∈ [−1022, 1023]
                      represents 1.f × 2^e          represents 1.f × 2^e
min                   2^−126                        2^−1022
max                   ≈ 2^128                       ≈ 2^1024
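The classification by biased exponent and fraction can be tried out on real bit patterns. The sketch below uses Python's standard `struct` module to obtain the 32-bit single-format encoding; the function name `decode_single` and the class labels are choices made for this illustration.

```python
import struct


def decode_single(x):
    """Return (sign, biased exponent, fraction field, class) for x as a 32-bit float."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31
    biased = (bits >> 23) & 0xFF       # e + bias, 8 bits
    frac = bits & 0x7FFFFF             # f, 23 bits
    if biased == 0:
        kind = "zero" if frac == 0 else "subnormal"
    elif biased == 255:
        kind = "infinity" if frac == 0 else "nan"
    else:
        kind = "ordinary"
    return sign, biased, frac, kind
```

For example, 1.0 has biased exponent 127 (e = 0) and a zero fraction field, while 2^−130 is below the smallest ordinary number and decodes as subnormal.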
Slide 24
Slide 25
Half adder (HA)
Inputs    Outputs
x  y      c  s
0  0      0  0
0  1      0  1
1  0      0  1
1  1      1  0
Digit-set interpretation: {0, 1} + {0, 1} = {0, 2} + {0, 1}

Full adder (FA)
Inputs         Outputs
x  y  cin      cout  s
0  0  0        0     0
0  0  1        0     1
0  1  0        0     1
0  1  1        1     0
1  0  0        0     1
1  0  1        1     0
1  1  0        1     0
1  1  1        1     1
Digit-set interpretation: {0, 1} + {0, 1} + {0, 1} = {0, 2} + {0, 1}
Figures 10.1/10.2
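The two truth tables can be expressed as bit operations. This is a minimal sketch with hypothetical function names; the full adder is composed of two half adders plus an OR, which is one of several possible realizations.

```python
def half_adder(x, y):
    """Return (c, s) for one-bit inputs x and y."""
    return x & y, x ^ y


def full_adder(x, y, cin):
    """Return (cout, s) using two half adders and an OR gate."""
    c1, s1 = half_adder(x, y)
    c2, s = half_adder(s1, cin)
    return c1 | c2, s
```

In both cells the outputs satisfy 2c + s = sum of the inputs, which is the digit-set interpretation stated above.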
Slide 26
Full-Adder Implementations
[Figure: three full-adder implementations with inputs x, y, c_in and outputs s, c_out: (a) FA built of two HAs; (b) CMOS mux-based FA; (c) Two-level AND-OR FA.]
Slide 27
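[Figure 10.4: 32-bit ripple-carry adder built of full adders (FA). Inputs x0 to x31 and y0 to y31, with c0 = c_in; outputs s0 to s31 and c32 = c_out. The critical path runs along the carry chain from c0 to s31.]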
Slide 28
[Figure: carry propagation example for a 16-bit addition (bit positions 15 to 0, with c_out and c_in), with carry chains of lengths 4, 6, 3, and 2 marked; g = xy, p = x ⊕ y.]
Slide 29
Slide 30
g_i  p_i  Carry is:
0    0    annihilated or killed
0    1    propagated
1    0    generated
1    1    (impossible)
$g_i = x_i y_i$
$p_i = x_i \oplus y_i$
[Figure 10.5: the inputs x_i and y_i of each position produce g_i and p_i for positions 0 to k − 1; the carry network takes these signals and c_0 and produces c_1 through c_k; each sum bit s_i is formed from p_i and c_i.]
Figure 10.5 The main part of an adder is the carry network. The rest is just a set of gates to produce the g and p signals and the sum bits.
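The division of labor between the g/p gates, the carry network, and the sum gates can be sketched in a few lines. The sketch below assumes the simplest carry network, the ripple recurrence c_{i+1} = g_i ∨ p_i c_i; the function name is hypothetical.

```python
def add_with_carry_network(x, y, k, c0=0):
    """Add two k-bit numbers via generate/propagate signals; returns (sum, c_k)."""
    g = [(x >> i) & (y >> i) & 1 for i in range(k)]       # g_i = x_i AND y_i
    p = [((x >> i) ^ (y >> i)) & 1 for i in range(k)]     # p_i = x_i XOR y_i
    c = [c0]
    for i in range(k):                                    # ripple carry network
        c.append(g[i] | (p[i] & c[i]))
    s = sum((p[i] ^ c[i]) << i for i in range(k))         # s_i = p_i XOR c_i
    return s, c[k]
```

Faster adders keep the first and last lines as they are and replace only the loop that computes the carries.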
Slide 31
[Figure 10.6: ripple-carry network. The g and p signals of positions 0 to k − 1 are chained so that the carry passes from c0 through c1, c2, ..., c_{k−1} to c_k.]
Slide 32
Slide 33
[Figures 10.7/10.8: positions 4j to 4j+3 of a ripple-carry network, from c4j to c4j+4, with a skip path; driving analogy: one-way street versus freeway.]
Figures 10.7/10.8 A 4-bit section of a ripple-carry network with skip paths and the driving analogy.
Slide 34
[Figure: the same 4-bit section drawn twice, once with plain rippling from c4j to c4j+4 and once with a multiplexer controlled by the block propagate signal p[4j, 4j+3] that selects the value of c4j+4.]
Slide 35
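[Figure 10.9: counter built of a count register and an adder. A mux controlled by IncrInit selects between Data in and the adder output as the k-bit register input D; the register is loaded on Update; the adder adds the increment amount a to the count and has c_in and c_out.]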
Slide 36
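Substantially simpler than an adder
[Figure: carry network and sum logic for an incrementer. The g signals are 0, so carries c1 to c_k depend only on c0 and the inputs x0 to x_{k−1}, which act as the p signals; outputs are the sum bits s0 to s_{k−1}.]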
Slide 37
$[h, j] = [i + 1, j] \;¢\; [h, i]$
That is, $g_{[h,j]} = g_{[i+1,j]} \lor p_{[i+1,j]}\, g_{[h,i]}$ and $p_{[h,j]} = p_{[i+1,j]}\, p_{[h,i]}$.
Slide 38
Assuming $c_0 = 0$, we have $c_i = g_{[0, i-1]}$
[Figure 10.5, repeated: the carry network of an adder.]
Computer Architecture, The Arithmetic/Logic Unit
Slide 39
[Figure: 8-input carry network. Single-position inputs [i, i] are combined into the spans [0, 1], [2, 3], [4, 5], [6, 7], then [0, 3], [4, 7], and finally the outputs [0, 0] through [0, 7].]
Slide 40
[Figure: 8-input carry network built around a 4-input Brent-Kung carry network. Single-position inputs [0, 0] through [7, 7] are paired into [0, 1], [2, 3], [4, 5], [6, 7]; the outputs are [0, 0] through [0, 7].]
Slide 41
c8 = g [0,7]
c7 = g [0,6]
c6 = g [0,5]
c5 = g [0,4]
c4 = g [0,3]
c3 = g [0,2]
c2 = g [0,1]
c1 = g [0,0]
Slide 42
[Figure: two 8-input carry networks compared, each taking the single-position inputs [i, i] and producing the outputs [0, 0] through [0, 7].]
11 carry operators, 4 levels
17 carry operators, 3 levels
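The trade-off between operator count and depth can be explored with a small simulation. The sketch below builds a prefix network in the Kogge-Stone style (an assumption: the slide does not name the 17-operator design) and counts its carry operators and levels; `carry_op` and `kogge_stone` are hypothetical names.

```python
def carry_op(hi, lo):
    """Combine (g, p) of an upper span with (g, p) of the adjacent lower span."""
    g_hi, p_hi = hi
    g_lo, p_lo = lo
    return g_hi | (p_hi & g_lo), p_hi & p_lo


def kogge_stone(gp):
    """gp[i] = (g_i, p_i). Returns (spans [0, i], operator count, level count)."""
    spans = list(gp)
    operators = levels = 0
    distance = 1
    while distance < len(spans):
        nxt = list(spans)
        for i in range(distance, len(spans)):
            # span ending at i absorbs the adjacent span ending at i - distance
            nxt[i] = carry_op(spans[i], spans[i - distance])
            operators += 1
        spans = nxt
        levels += 1
        distance *= 2
    return spans, operators, levels
```

With 8 inputs the loop runs for distances 1, 2, and 4, using 7 + 6 + 4 operators, and the g component of span [0, i] is the carry c_{i+1} when c_0 = 0.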
Slide 43
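[Figure 10.13: four-bit lookahead carry generator. Inputs are the g and p signals of positions i to i+3 and c_i; outputs are the intermediate carries c_{i+1}, c_{i+2}, c_{i+3} and the block signals g[i, i+3] and p[i, i+3].]
Figure 10.13 Blocks needed in the design of carry-lookahead adders with four-way grouping of bits.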
Slide 44
[Figure 10.14: carry-select addition. The upper positions [a, b] are added twice, once with c_in = 0 (version 0 of sum bits) and once with c_in = 1 (version 1 of sum bits). The lower a positions (0 to a − 1) are added as usual.]
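The carry-select idea can be sketched behaviorally: compute both versions of the upper sum, then pick one with the carry out of the lower a positions. The function name and the use of Python's `+` for the sub-adders are simplifications made for this illustration.

```python
def carry_select_add(x, y, k, a):
    """Add two k-bit numbers, splitting at position a; returns (sum, c_out)."""
    low_mask = (1 << a) - 1
    low_sum = (x & low_mask) + (y & low_mask)   # lower a positions, added as usual
    carry_into_upper = low_sum >> a
    xu, yu = x >> a, y >> a
    version0 = xu + yu                          # upper sum assuming carry-in 0
    version1 = xu + yu + 1                      # upper sum assuming carry-in 1
    upper = version1 if carry_into_upper else version0
    width = k - a
    s = ((upper & ((1 << width) - 1)) << a) | (low_sum & low_mask)
    return s, upper >> width
```

In hardware the two upper versions are computed in parallel with the lower sum, so the selection adds only a multiplexer delay.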
Slide 45
[Figure 10.15: multiplexer-based logical shifting unit. A multiplexer selects one of the 32-bit candidates: x[31, 0] unshifted, the right-shifted values (0, x[31, 1] down to 00...0, x[31]), and the left-shifted values (x[30, 0], 0 up to x[0], 00...0).]
Slide 46
Arithmetic Shifts
Purpose: Multiplication and division by powers of 2
sra $t0,$s1,2
srav $t0,$s1,$s0

sra $t0,$s1,2 (ALU instruction):
  op (bits 31-26) = 000000
  rs (bits 25-21) = 00000   Unused
  rt (bits 20-16) = 10001   Source register
  rd (bits 15-11) = 01000   Destination register
  sh (bits 10-6)  = 00010   Shift amount
  fn (bits 5-0)   = 000011  sra = 3

srav $t0,$s1,$s0 (ALU instruction):
  op (bits 31-26) = 000000
  rs (bits 25-21) = 10000   Amount register
  rt (bits 20-16) = 10001   Source register
  rd (bits 15-11) = 01000   Destination register
  sh (bits 10-6)  = 00000   Unused
  fn (bits 5-0)   = 000111  srav = 7
Figure 10.16
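What an arithmetic right shift does to a 32-bit word can be modeled in a few lines. The function name `sra32` is hypothetical; the sketch relies on the fact that Python's `>>` on negative integers replicates the sign.

```python
def sra32(word, amount):
    """Arithmetic right shift of a 32-bit pattern by the 5 LSBs of amount."""
    amount &= 31
    signed = word - (1 << 32) if word & 0x80000000 else word
    return (signed >> amount) & 0xFFFFFFFF
```

Shifting the pattern for −8 right by 2 gives the pattern for −2, i.e., division by 4 with the sign preserved.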
Slide 47
[Figure 10.17: multistage shifting. In each stage a multiplexer selects among no shift, logical left, logical right, and arithmetic right; for a 1-bit stage the candidates are x[31, 0]; x[30, 0], 0; 0, x[31, 1]; and x[31], x[31, 1]. Stages for (0 or 4)-bit, (0 or 2)-bit, and (0 or 1)-bit shifts are cascaded.]
Slide 48
AND with mask to isolate a field: 0000 0000 0000 0000 1111 1100 0000 0000
Right-shift by 10 positions to move field to the right end of word
The result word ranges from 0 to 63, depending on the field pattern
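The mask-and-shift steps above translate directly into code. The mask constant below is the hexadecimal form of the bit pattern shown (bits 15 to 10 set); the names are chosen for this sketch only.

```python
FIELD_MASK = 0x0000FC00   # 0000 0000 0000 0000 1111 1100 0000 0000


def extract_field(word):
    """Isolate bits 15..10 of a 32-bit word and move them to the right end."""
    return (word & FIELD_MASK) >> 10
```

Because the field is 6 bits wide, the result always lies between 0 and 63.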
32-pixel (4 × 8) block of black-and-white image, rows 0 to 3
Representation as 32-bit word
Hex equivalent: 0xa0a80617
Slide 49
[Figure: operands 1 and 2 feed an arithmetic unit and a logic unit; a select signal for the function type (logic or arith) picks the result.]
Slide 50
[Figure: An ALU for MiniMIPS. A shifter (constant 5-bit amount or variable amount from the 5 LSBs, selected by ConstVar), an adder with AddSub control, and a logic unit feed a multiplexer selected by the function class; outputs are the 32-bit result s, Zero (from a 32-input NOR), and Ovfl. The shorthand symbol for the ALU has inputs x, y, Func and outputs s, Zero, Ovfl.
Shift function: 00 No shift, 01 Logical left, 10 Logical right, 11 Arith right
Function class: 00 Shift, 01 Set less, 10 Arithmetic, 11 Logic
Logic function: 00 AND, 01 OR, 10 XOR, 11 NOR]
Slide 51
Slide 52
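[Figure 11.1: multiplication of 4-bit numbers x and y in dot notation. The partial products bit-matrix has rows y0 x 2^0, y1 x 2^1, y2 x 2^2, y3 x 2^3, which are summed to form the product.]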
Slide 53
Figure 11.2
Example 11.1
Position        7 6 5 4 3 2 1 0
===============================
x 10^4          3 5 2 8
y                       4 0 6 7
===============================
z(0)            0 0 0 0
+y0 x 10^4    2 4 6 9 6
10 z(1)       2 4 6 9 6
z(1)          0 2 4 6 9 6
+y1 x 10^4    2 1 1 6 8
10 z(2)       2 3 6 3 7 6
z(2)            2 3 6 3 7 6
+y2 x 10^4    0 0 0 0 0
10 z(3)       0 2 3 6 3 7 6
z(3)            0 2 3 6 3 7 6
+y3 x 10^4    1 4 1 1 2
10 z(4)       1 4 3 4 8 3 7 6
z(4)            1 4 3 4 8 3 7 6
===============================

Position        7 6 5 4 3 2 1 0
===============================
x 2^4           1 0 1 0
y                       0 0 1 1
===============================
z(0)            0 0 0 0
+y0 x 2^4       1 0 1 0
2 z(1)        0 1 0 1 0
z(1)            0 1 0 1 0
+y1 x 2^4       1 0 1 0
2 z(2)        0 1 1 1 1 0
z(2)            0 1 1 1 1 0
+y2 x 2^4       0 0 0 0
2 z(3)        0 0 1 1 1 1 0
z(3)            0 0 1 1 1 1 0
+y3 x 2^4       0 0 0 0
2 z(4)        0 0 0 1 1 1 1 0
z(4)            0 0 0 1 1 1 1 0
===============================
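The step-by-step tables follow the right-shift recurrence z(j+1) = (z(j) + y_j x r^k) / r. A minimal sketch, with a hypothetical function name, that works for any radix:

```python
def shift_add_multiply(x, y, k, radix=2):
    """Right-shift multiplication of k-digit unsigned numbers x and y."""
    z = 0                          # z(0)
    scale = radix ** k
    for _ in range(k):
        digit = y % radix          # next multiplier digit y_j, LSD first
        y //= radix
        # add y_j * x aligned with the upper half, then shift right one digit;
        # the division is exact because the lowest digits of the sum are zero
        z = (z + digit * x * scale) // radix
    return z
```

After k steps the doublewidth partial product z(k) holds the full product, matching the last row of each table.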
Slide 54
Twos-Complement Multiplication
Figure 11.3
Example 11.2
Position        7 6 5 4 3 2 1 0
===============================
x 2^4           1 0 1 0
y                       0 0 1 1
===============================
z(0)          0 0 0 0 0
+y0 x 2^4     1 1 0 1 0
2 z(1)        1 1 0 1 0
z(1)          1 1 1 0 1 0
+y1 x 2^4     1 1 0 1 0
2 z(2)        1 0 1 1 1 0
z(2)          1 1 0 1 1 1 0
+y2 x 2^4     0 0 0 0 0
2 z(3)        1 1 0 1 1 1 0
z(3)          1 1 1 0 1 1 1 0
+(−y3 x 2^4)  0 0 0 0 0
2 z(4)        1 1 1 0 1 1 1 0
z(4)            1 1 1 0 1 1 1 0
===============================

Position        7 6 5 4 3 2 1 0
===============================
x 2^4           1 0 1 0
y                       1 0 1 1
===============================
z(0)          0 0 0 0 0
+y0 x 2^4     1 1 0 1 0
2 z(1)        1 1 0 1 0
z(1)          1 1 1 0 1 0
+y1 x 2^4     1 1 0 1 0
2 z(2)        1 0 1 1 1 0
z(2)          1 1 0 1 1 1 0
+y2 x 2^4     0 0 0 0 0
2 z(3)        1 1 0 1 1 1 0
z(3)          1 1 1 0 1 1 1 0
+(−y3 x 2^4)  0 0 1 1 0
2 z(4)        0 0 0 1 1 1 1 0
z(4)            0 0 0 1 1 1 1 0
===============================
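The only difference from unsigned multiplication is that the sign bit of y has negative weight, so the last term is subtracted. The sketch below shows this with plain left-shifted terms rather than the right-shift form of the tables; the function name is hypothetical.

```python
def twos_complement_multiply(x, y, k):
    """x and y are k-bit two's-complement patterns; returns the signed product."""
    sign = 1 << (k - 1)
    x_val = (x ^ sign) - sign              # signed value of the multiplicand
    z = 0
    for j in range(k):
        if (y >> j) & 1:
            term = x_val << j
            # bit k-1 of y has weight -2**(k-1): subtract instead of add
            z += -term if j == k - 1 else term
    return z
```

With x = 1010 (−6), the multiplier 0011 (+3) gives −18 and the multiplier 1011 (−5) gives +30, as in the two tables.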
Slide 55
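[Figure 11.4: hardware multiplier. Hi and Lo registers with shift, multiplicand x, a mux with Enable and Select inputs driven by the multiplier bit y_j, and an adder with c_in, c_out, and AddSub control.]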
Slide 56
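[Figure 11.5: the k-bit sum from the adder is wired into the partial product and multiplier registers (k − 1 bits each, plus k bits returned to the adder), with y_j taken from the multiplier.]
Figure 11.5 Shifting incorporated in the connections to the partial product register rather than as a separate phase.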
Slide 57
High-Radix Multipliers
Multiplicand
Multiplier
x
y
0, x, 2x, or 3x
Product
Slide 58
Tree Multipliers
[Figure 11.6: (a) Full-tree multiplier: all partial products enter a large tree of carry-save adders (log depth), followed by an adder (log depth) that forms the product. (b) Partial-tree multiplier: a small tree of carry-save adders and an adder form the product.]
Slide 59
Array Multipliers
[Figure 11.7: array multiplier built of rows of MA cells, one row per multiplier bit y0 to y3, with multiplicand bits x0 to x3 entering at the top. Product bits z0 to z3 emerge from the rows, and z4 to z7 from a final row of FA and HA cells. Annotations: our original dot notation representing multiplication; straightened dots to depict the array multiplier to the left; Figure 9.3a (recalling carry-save addition).]
Slide 60
mult  $s0,$s1      # set Hi,Lo to ($s0) × ($s1)
multu $s2,$s3      # set Hi,Lo to ($s2) × ($s3), unsigned
mfhi  $t0          # set $t0 to (Hi)
mflo  $t1          # set $t1 to (Lo)
Example 11.3
Finding the 32-bit product of 32-bit integers in MiniMIPS
Multiply; result will be obtained in Hi,Lo
For unsigned multiplication:
Hi should be all-0s and Lo holds the 32-bit result
For signed multiplication:
Hi should be all-0s or all-1s, depending on the sign bit of Lo
Slide 61
[Figure: register usage for programmed multiplication, drawn over the hardware multiplier: $t1 holds bit j of y, $t2 is the counter, and one register also holds the LSB of Hi during shift. Part of the control is in hardware.]
Slide 62
        move  $v0,$zero        # initialize Hi to 0
        move  $v1,$zero        # initialize Lo to 0
        addi  $t2,$zero,32     # init repetition counter to 32
mloop:  move  $t0,$zero        # set c-out to 0 in case of no add
        move  $t1,$a1          # copy ($a1) into $t1
        srl   $a1,$a1,1        # halve the unsigned value in $a1
        subu  $t1,$t1,$a1      # subtract ($a1) from ($t1) twice to
        subu  $t1,$t1,$a1      #   obtain LSB of ($a1), or y[j], in $t1
        beqz  $t1,noadd        # no addition needed if y[j] = 0
        addu  $v0,$v0,$a0      # add x to upper part of z
        sltu  $t0,$v0,$a0      # form carry-out of addition in $t0
noadd:  move  $t1,$v0          # copy ($v0) into $t1
        srl   $v0,$v0,1        # halve the unsigned value in $v0
        subu  $t1,$t1,$v0      # subtract ($v0) from ($t1) twice to
        subu  $t1,$t1,$v0      #   obtain LSB of Hi in $t1
        sll   $t0,$t0,31       # carry-out converted to 1 in MSB of $t0
        addu  $v0,$v0,$t0      # right-shifted $v0 corrected
        srl   $v1,$v1,1        # halve the unsigned value in $v1
        sll   $t1,$t1,31       # LSB of Hi converted to 1 in MSB of $t1
        addu  $v1,$v1,$t1      # right-shifted $v1 corrected
        addi  $t2,$t2,-1       # decrement repetition counter by 1
        bne   $t2,$zero,mloop  # if counter > 0, repeat multiply loop
        jr    $ra              # return to the calling program
Slide 63
[Figure 11.9: division in dot notation. From the dividend z, the subtracted bit-matrix with rows −y3 x 2^3, −y2 x 2^2, −y1 x 2^1, −y0 x 2^0 leaves the remainder s; y is the quotient.]
Slide 64
Figure 11.10
Example 11.5
Position        7 6 5 4 3 2 1 0
===============================
z               0 1 1 1 0 1 0 1
x 2^4           1 0 1 0
===============================
z(0)            0 1 1 1 0 1 0 1
2 z(0)        0 1 1 1 0 1 0 1
−y3 x 2^4       1 0 1 0            y3 = 1
z(1)            0 1 0 0 1 0 1
2 z(1)        0 1 0 0 1 0 1
−y2 x 2^4       0 0 0 0            y2 = 0
z(2)            1 0 0 1 0 1
2 z(2)        1 0 0 1 0 1
−y1 x 2^4       1 0 1 0            y1 = 1
z(3)            1 0 0 0 1
2 z(3)        1 0 0 0 1
−y0 x 2^4       1 0 1 0            y0 = 1
z(4)            0 1 1 1
s                       0 1 1 1
y                       1 0 1 1
===============================

Position        1 2 3 4 5 6 7 8
===============================
z              .1 4 3 5 1 5 0 2
x              .4 0 6 7
===============================
z(0)           .1 4 3 5 1 5 0 2
10 z(0)       1.4 3 5 1 5 0 2
−y1 x         1.2 2 0 1            y1 = 3
z(1)           .2 1 5 0 5 0 2
10 z(1)       2.1 5 0 5 0 2
−y2 x         2.0 3 3 5            y2 = 5
z(2)           .1 1 7 0 0 2
10 z(2)       1.1 7 0 0 2
−y3 x         0.8 1 3 4            y3 = 2
z(3)           .3 5 6 6 2
10 z(3)       3.5 6 6 2
−y4 x         3.2 5 3 6            y4 = 8
z(4)           .3 1 2 6
s              .0 0 0 0 3 1 2 6
y              .3 5 2 8
===============================
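The integer tables follow the left-shift recurrence z(j) = 2 z(j−1) − y_{k−j} x 2^k. A minimal binary sketch, assuming a 2k-bit dividend and a k-bit divisor; the function name and the overflow check are choices made for this illustration.

```python
def restoring_divide(z, x, k):
    """Divide a 2k-bit dividend z by a k-bit divisor x; returns (quotient, remainder)."""
    if x == 0 or (z >> k) >= x:
        raise OverflowError("quotient would not fit in k bits")
    y = 0
    for _ in range(k):
        z <<= 1                       # form 2 z(j-1)
        trial = z - (x << k)          # trial difference against x * 2**k
        y <<= 1
        if trial >= 0:                # quotient bit is 1: keep the difference
            z = trial
            y |= 1
    return y, z >> k                  # remainder sits in the upper half
```

For z = 0111 0101 and x = 1010 this returns the quotient 1011 and remainder 0111 shown in the first table.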
Slide 65
Figure 11.11
Example 11.6
Position        7 6 5 4 3 2 1 0
===============================
z               0 0 0 0 1 1 0 1
x 2^4           0 1 0 1
===============================
z(0)            0 0 0 0 1 1 0 1
2 z(0)        0 0 0 0 1 1 0 1
−y3 x 2^4       0 0 0 0            y3 = 0
z(1)            0 0 0 1 1 0 1
2 z(1)          0 0 1 1 0 1
−y2 x 2^4       0 0 0 0            y2 = 0
z(2)            0 0 1 1 0 1
2 z(2)          0 1 1 0 1
−y1 x 2^4       0 1 0 1            y1 = 1
z(3)            0 0 0 1 1
2 z(3)          0 0 1 1
−y0 x 2^4       0 0 0 0            y0 = 0
z(4)            0 0 1 1
s                       0 0 1 1
y                       0 0 1 0
===============================

Position        1 2 3 4 5 6 7 8
===============================
z              .0 1 0 1
x              .1 1 0 1
===============================
z(0)           .0 1 0 1
2 z(0)        0.1 0 1 0
−y1 x         0.0 0 0 0            y1 = 0
z(1)           .1 0 1 0
2 z(1)        1.0 1 0 0
−y2 x         0.1 1 0 1            y2 = 1
z(2)           .0 1 1 1
2 z(2)        0.1 1 1 0
−y3 x         0.1 1 0 1            y3 = 1
z(3)           .0 0 0 1
2 z(3)        0.0 0 1 0
−y4 x         0.0 0 0 0            y4 = 0
z(4)           .0 0 1 0
s              .0 0 0 0 0 0 1 0
y              .0 1 1 0
===============================
Slide 66
Signed Division
Method 1 (indirect): strip operand signs, divide, set result signs
Dividend    Divisor    Quotient    Remainder
z = 5       x = 3      y = 1       s = 2
z = 5       x = −3     y = −1      s = 2
z = −5      x = 3      y = −1      s = −2
z = −5      x = −3     y = 1       s = −2
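Method 1 can be sketched directly: divide the magnitudes, then attach signs so that the remainder takes the sign of the dividend. The function name is hypothetical.

```python
def signed_divide(z, x):
    """Indirect signed division: strip signs, divide, set result signs."""
    if x == 0:
        raise ZeroDivisionError("division by zero")
    y, s = divmod(abs(z), abs(x))      # unsigned division of magnitudes
    if (z < 0) != (x < 0):
        y = -y                         # quotient is negative if signs differ
    if z < 0:
        s = -s                         # remainder takes the sign of the dividend
    return y, s
```

In every case the results satisfy z = x y + s with |s| < |x|.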
Slide 67
[Figure 11.12: hardware divider. Hi and Lo registers with shift and load, divisor x, a mux with Enable and Select inputs, a quotient digit selector producing y_{k−j}, and an adder that always subtracts (c_in = 1) to form the trial difference.]
Slide 68
[Figure 11.13: the partial remainder and quotient registers (k bits each) are wired so that the MSB moves across and k bits go to the adder.]
Figure 11.13 Shifting incorporated in the connections to the partial remainder register rather than as a separate phase.
Slide 69
High-Radix Dividers
x
Divisor
Quotient
Dividend
0, x, 2x, or 3x
s
Remainder
Slide 70
Array Dividers
[Figure 11.14: array divider built of rows of MS cells, one row per quotient bit y3 to y0, with divisor bits x3 to x0 and dividend bits z7 to z0 as inputs; remainder bits s3 to s0 emerge from the last row. Annotations: our original dot notation for division; straightened dots to depict an array divider.]
Slide 71
div   $s0,$s1      # Lo = quotient, Hi = remainder
divu  $s2,$s3      # unsigned version of division
mfhi  $t0          # set $t0 to (Hi)
mflo  $t1          # set $t1 to (Lo)
Example 11.7
Compute z mod x, where z (signed) and x > 0 are integers
Divide; remainder will be obtained in Hi
if remainder is negative,
then add |x| to (Hi) to obtain z mod x
else Hi holds z mod x
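The correction step of Example 11.7 can be modeled as follows. The first helper (a hypothetical name) mimics a divide that leaves a remainder with the sign of the dividend; the second applies the "add |x| if negative" rule.

```python
def remainder_like_div(z, x):
    """Remainder with the sign of the dividend, as left in Hi by a signed divide."""
    q = abs(z) // abs(x)
    if (z < 0) != (x < 0):
        q = -q
    return z - q * x


def mod_positive(z, x):
    """z mod x for x > 0: fix up a negative remainder by adding x."""
    if x <= 0:
        raise ValueError("x must be positive")
    r = remainder_like_div(z, x)
    return r + x if r < 0 else r
```

For z = −5 and x = 3 the remainder is −2, so the result is −2 + 3 = 1.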
Slide 72
[Figure: register usage for programmed division, drawn over the hardware divider: $a0 holds the divisor x, $t1 holds bit k − j of y, and $t2 is the counter; the adder always subtracts (c_in = 1) to form the trial difference. Part of the control is in hardware.]
Slide 73
        move  $v0,$a2          # initialize Hi to ($a2)
        move  $v1,$a3          # initialize Lo to ($a3)
        addi  $t2,$zero,32     # initialize repetition counter to 32
dloop:  slt   $t0,$v0,$zero    # copy MSB of Hi into $t0
        sll   $v0,$v0,1        # left-shift the Hi part of z
        slt   $t1,$v1,$zero    # copy MSB of Lo into $t1
        or    $v0,$v0,$t1      # move MSB of Lo into LSB of Hi
        sll   $v1,$v1,1        # left-shift the Lo part of z
        sgeu  $t1,$v0,$a0      # quotient digit is 1 if (Hi) ≥ x,
        or    $t1,$t1,$t0      #   or if MSB of Hi was 1 before shifting
        sll   $a1,$a1,1        # shift y to make room for new digit
        or    $a1,$a1,$t1      # copy y[k-j] into LSB of $a1
        beq   $t1,$zero,nosub  # if y[k-j] = 0, do not subtract
        subu  $v0,$v0,$a0      # subtract divisor x from Hi part of z
nosub:  addi  $t2,$t2,-1       # decrement repetition counter by 1
        bne   $t2,$zero,dloop  # if counter > 0, repeat divide loop
        move  $v1,$a1          # copy the quotient y into $v1
        jr    $ra              # return to the calling program
Slide 74
[Figures 11.4, 11.12, 11.7, and 11.14 repeated for comparison: the hardware multiplier (multiplier y, doublewidth partial product z(j), multiplicand x) beside the hardware divider (quotient digit selector, divisor x, trial difference, always subtract), and the array multiplier beside the array divider. Turn upside-down: the array multiplier, turned upside-down, resembles the array divider.]
Slide 75
12 Floating-Point Arithmetic
Floating-point is no longer reserved for high-end machines
Multimedia and signal processing require flp arithmetic
Details of standard flp format and arithmetic operations
Topics in This Chapter
12.1 Rounding Modes
12.2 Special Values and Exceptions
12.3 Floating-Point Addition
12.4 Other Floating-Point Operations
12.5 Floating-Point Instructions
12.6 Result Precision and Errors
Slide 76
IEEE 754 Format
Sign, Exponent (11 bits, bias = 1023, −1022 to 1023), Significand
±0, ±∞, NaN
Denormals: ±0.f × 2^emin
Ordinary numbers: ±1.f × 2^e
Denormals allow graceful underflow
[Figure 12.1: distribution of floating-point numbers on the real line. Negative numbers (FLP−) run from max− to min−, positive numbers (FLP+) from min+ to max+; representable values are denser near zero and sparser toward ±max. Underflow regions lie around zero, and overflow regions lie beyond max− and max+. Typical, midway, underflow, and overflow examples are marked.]
Slide 77
Round-to-Nearest (Even)
[Figure 12.2: plots of rtnei(x), rounding to the nearest even integer, and rtni(x), rounding to the nearest integer, against x.]
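Round-to-nearest-even can be written out explicitly to show how ties are handled. This is a sketch for rounding to an integer; the function name follows the rtnei label on the slide, and the implementation itself is an illustration rather than a description of any hardware.

```python
import math


def rtnei(x):
    """Round x to the nearest integer; a tie goes to the even neighbor."""
    lower = math.floor(x)
    frac = x - lower
    if frac > 0.5:
        return lower + 1
    if frac < 0.5:
        return lower
    return lower if lower % 2 == 0 else lower + 1   # exact tie
```

Ties at 2.5 and 3.5 go to 2 and 4 respectively, so rounding errors do not drift in one direction on average.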
Slide 78
Directed Rounding
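[Figure: plots of ritni(x), rounding inward to the nearest integer, and rutni(x), rounding upward to the nearest integer, against x.]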
Slide 79
Slide 80
Exceptions
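Undefined results lead to NaN (not a number)
(±0) / (±0) = NaN
(+∞) + (−∞) = NaN
(±0) × (±∞) = NaN
(±∞) / (±∞) = NaN
Invalid operations:
Addition: (+∞) + (−∞)
Multiplication: 0 × ∞
Division: 0 / 0 or ∞ / ∞
Square-rooting: operand < 0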
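Some of these cases can be observed in Python, whose floats follow IEEE 754 double format. One caveat for this sketch: Python raises exceptions for 0.0 / 0.0 and for math.sqrt of a negative operand instead of returning NaN, so only the infinity cases are shown. The function name is hypothetical.

```python
import math


def undefined_results():
    """Operations on infinities whose results are NaN."""
    inf = math.inf
    return {
        "(+inf) + (-inf)": inf + (-inf),
        "0 * inf": 0.0 * inf,
        "inf / inf": inf / inf,
    }
```

Every value in the returned dictionary is NaN, which can be detected with `math.isnan` but not with `==`, since NaN compares unequal even to itself.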
Slide 81
Numbers to be added:
$x = 2^5 \times 1.00101101$
$y = 2^1 \times 1.11101101$ (operand with smaller exponent, to be preshifted)
Operands after alignment shift:
$x = 2^5 \times 1.00101101$
$y = 2^5 \times 0.000111101101$
Result of addition:
$s = 2^5 \times 1.010010111101$ (extra bits to be rounded off)
$s = 2^5 \times 1.01001100$ (rounded sum)
Figure 12.4
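The align, add, normalize, and round steps can be traced with integer arithmetic. The sketch below handles positive operands only and represents a significand 1.f as an integer scaled by 2^frac_bits; the function name, this scaling convention, and the choice of round-to-nearest-even are assumptions made for the illustration.

```python
def fp_add(exp_x, sig_x, exp_y, sig_y, frac_bits=8):
    """Add two positive floating-point operands; returns (exponent, significand)."""
    if exp_x < exp_y:                              # make x the larger-exponent operand
        exp_x, sig_x, exp_y, sig_y = exp_y, sig_y, exp_x, sig_x
    extra = exp_x - exp_y                          # alignment shift amount
    total = (sig_x << extra) + sig_y               # exact sum with `extra` low-order bits
    exp = exp_x
    while total >> (frac_bits + extra) >= 2:       # normalize: significand must be < 2
        extra += 1
        exp += 1
    kept = total >> extra
    if extra:                                      # round off the extra bits
        half = 1 << (extra - 1)
        dropped = total & ((1 << extra) - 1)
        if dropped > half or (dropped == half and kept & 1):
            kept += 1
    if kept >> frac_bits >= 2:                     # rounding carried into the next binade
        kept >>= 1
        exp += 1
    return exp, kept
```

With the operands of Figure 12.4, `fp_add(5, 0b100101101, 1, 0b111101101)` reproduces the rounded sum, exponent 5 and significand 1.01001100.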
Slide 82
Hardware for Floating-Point Addition
[Figure 12.5: inputs 1 and 2 are unpacked into signs, exponents, and significands; a mux performs a possible swap and complement; the significands are aligned, added, then normalized and rounded; the sign, exponent, and significand are packed into the output. Control and sign logic handles Add/Sub and the exponent subtraction.]
Figure 12.5 Simplified schematic of a floating-point adder.
Slide 83
Floating-point multiplication
Floating-point division
when e is even
when e is odd
Slide 84
Hardware for Floating-Point Multiplication and Division
[Figure: inputs 1 and 2 are unpacked into signs, exponents, and significands; the significands are multiplied or divided (MulDiv) under control and sign logic; the result is normalized and rounded, then sign, exponent, and significand are packed into the output.]
Slide 85
add.* $f0,$f8,$f10   # set $f0 to ($f8) +fp ($f10)
sub.* $f0,$f8,$f10   # set $f0 to ($f8) −fp ($f10)
mul.* $f0,$f8,$f10   # set $f0 to ($f8) ×fp ($f10)
div.* $f0,$f8,$f10   # set $f0 to ($f8) /fp ($f10)
neg.* $f0,$f8        # set $f0 to −($f8)

Floating-point instruction:
  op (bits 31-26) = 010001
  ex (bits 25-21) = 0000x   s = 0, d = 1
  ft (bits 20-16) = 01010   Source register 2
  fs (bits 15-11) = 01000   Source register 1
  fd (bits 10-6)  = 00000   Destination register
  fn (bits 5-0)   = 000xxx  add.* = 0, sub.* = 1, mul.* = 2, div.* = 3, neg.* = 7
Slide 86
[Figure 5.1: memory and processing subsystems. Memory of up to 2^30 words (locations up to m − 8 and m − 4); EIU, the execution and integer unit (main processor), with registers $0 to $31, the ALU, integer mul/div, Hi, and Lo; FPU, the floating-point unit (Coprocessor 1), with registers $0 to $31 and FP arithmetic; TMU. Pairs of registers, beginning with an even-numbered one, are used for double operands. Chapter labels 10, 11, and 12 mark the ALU, integer mul/div, and FP arith blocks.]
Slide 87
cvt.s.w $f0,$f8   # set $f0 to single(integer $f8)
cvt.d.w $f0,$f8   # set $f0 to double(integer $f8)
cvt.d.s $f0,$f8   # set $f0 to double($f8)
cvt.s.d $f0,$f8   # set $f0 to single($f8,$f9)
cvt.w.s $f0,$f8   # set $f0 to integer($f8)
cvt.w.d $f0,$f8   # set $f0 to integer($f8,$f9)

Floating-point instruction:
  op (bits 31-26) = 010001
  ex (bits 25-21) = 0000x   *.w = 0, w.s = 0, w.d = 1, *.* = 1
  ft (bits 20-16) = 00000   Unused
  fs (bits 15-11) = 01000   Source register
  fd (bits 10-6)  = 00000   Destination register
  fn (bits 5-0)   = 100xxx  To format: s = 32, d = 33, w = 36
Figure 12.8
Slide 88
lwc1  $f8,40($s3)
swc1  $f8,A($s3)
mov.s $f0,$f8
mov.d $f0,$f8
mfc1  $t0,$f12
mtc1  $f8,$t4

Floating-point instruction (mov.*):
  op (bits 31-26) = 010001
  ex (bits 25-21) = 0000x   s = 0, d = 1
  ft (bits 20-16) = 00000   Unused
  fs (bits 15-11) = 01000   Source register
  fd (bits 10-6)  = 00000   Destination register
  fn (bits 5-0)   = 000110  mov.* = 6

Floating-point instruction (mfc1, mtc1):
  op (bits 31-26) = 010001
  rs (bits 25-21) = 00x00   mfc1 = 0, mtc1 = 4
  rt (bits 20-16) = 01100   Source register
  rd (bits 15-11) = 01000   Destination register
  sh (bits 10-6)  = 00000   Unused
  fn (bits 5-0)   = 000000  Unused
Figure 12.9
Slide 89
bc1t   L
bc1f   L
c.eq.* $f0,$f8
c.lt.* $f0,$f8
c.le.* $f0,$f8

Floating-point instruction (bc1t, bc1f):
  op (bits 31-26) = 010001
  rs (bits 25-21) = 01000   bc1? = 8
  rt (bits 20-16) = 0000x   true = 1, false = 0
  operand / offset (bits 15-0) = 0000 0000 0011 1101   Offset

Floating-point instruction (c.eq.*, c.lt.*, c.le.*):
  op (bits 31-26) = 010001
  ex (bits 25-21) = 0000x   s = 0, d = 1
  ft (bits 20-16) = 01000   Source register 2
  fs (bits 15-11) = 00000   Source register 1
  fd (bits 10-6)  = 00000   Unused
  fn (bits 5-0)   = 11xxx0 (corrected)   c.eq.* = 50, c.lt.* = 60, c.le.* = 62
Figure 12.10
Slide 90
Table 12.1 Floating-Point Instructions of MiniMIPS
Class              Usage                ex    fn
Copy               mov.* fd,fs          #     6
                   mfc1 rt,rd           0
                   mtc1 rd,rt           4
Arithmetic         add.* fd,fs,ft       #     0
                   sub.* fd,fs,ft       #     1
                   mul.* fd,fs,ft       #     2
                   div.* fd,fs,ft       #     3
                   neg.* fd,fs          #     7
                   c.eq.* fs,ft         #     50
                   c.lt.* fs,ft         #     60
                   c.le.* fs,ft         #     62
Conversions        cvt.s.w fd,fs        0     32
                   cvt.d.w fd,fs        0     33
                   cvt.d.s fd,fs        1     33
                   cvt.s.d fd,fs        1     32
                   cvt.w.s fd,fs        0     36
                   cvt.w.d fd,fs        1     36
Memory access      lwc1 ft,imm(rs)      rs
                   swc1 ft,imm(rs)      rs
Control transfer   bc1t L               8
                   bc1f L               8
Slide 91
Compute a + b
$2^5 \times 0.00000011$
$a + b = 2^{-2} \times 1.10000000$
$c = -2^{-2} \times 1.01100101$
Compute (a + b) + c
$2^{-2} \times 0.00011011$
Sum $= 2^{-6} \times 1.10110000$
Compute a + (b + c)
$2^5 \times 0.00000000$
Sum = 0 (Normalize to special code for 0)
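The same loss of associativity shows up with ordinary double-precision floats. The values below are chosen for this sketch (they are not the operands of the slide): a and b cancel exactly, while c is too small to survive being added to b first.

```python
def sum_left(a, b, c):
    """Compute (a + b) + c."""
    return (a + b) + c


def sum_right(a, b, c):
    """Compute a + (b + c)."""
    return a + (b + c)


a, b, c = 1e16, -1e16, 1.0
left = sum_left(a, b, c)     # a and b cancel first, so c survives
right = sum_right(a, b, c)   # c is absorbed when added to b
```

Here `left` is 1.0 and `right` is 0.0, because neighboring doubles near 10^16 are 2 apart and b + c rounds back to b.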
Slide 92
b
a
Possible remedies
Carry extra precision in intermediate results (guard digits):
commonly used in calculators
Use alternate formula that does not produce cancellation errors
Certifiable arithmetic with intervals
A number is represented by its lower and upper bounds $[x_l, x_u]$
Example of arithmetic: $[x_l, x_u] +_{interval} [y_l, y_u] = [x_l +_{fp} y_l,\; x_u +_{fp} y_u]$
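Interval addition can be sketched with Python floats. To keep the true sum inside the result, the sketch widens each bound by one unit in the last place using `math.nextafter` (available from Python 3.9); this outward widening is an assumption of the illustration, standing in for directed rounding of the two bound computations.

```python
import math


def interval_add(x, y):
    """Add intervals x = (xl, xu) and y = (yl, yu), widening the bounds outward."""
    lo = math.nextafter(x[0] + y[0], -math.inf)   # push the lower bound down
    hi = math.nextafter(x[1] + y[1], math.inf)    # push the upper bound up
    return lo, hi
```

The result is slightly wider than necessary, but it is guaranteed to contain the exact sum of any two values drawn from the input intervals.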
Slide 93
Slide 94
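[Figure 12.12: function evaluation by table lookup and linear interpolation. The input x is split into xH and xL (k − h bits); tables for a and b are looked up, the b value is multiplied by xL, and the product is added to a to produce the output f(x). The accompanying plot shows f(x) with the best linear approximation in a subinterval.]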
Slide 95