DIVISION'S DIFFICULTY
Division is an inherently iterative algorithm: each quotient digit depends on the remainder left by the previous step (a Euclidean process), whereas multiplication can be reduced to a (fixed) series of bit-manipulation tricks.
http://scicomp.stackexchange.com/questions/187/why-is-division-so-much-more-complex-than-other-arithmetic-operations
3/14/2014
DIVISION
          1001ten                 (quotient)
1000ten ) 1001010ten             (dividend)
         -1000
             10
             101
            1010
           -1000
              10ten              (remainder)

Shift the divisor right and compare it with the current dividend:
If the divisor is larger, shift 0 in as the next bit of the quotient.
If the divisor is smaller, subtract to get the new dividend and shift 1 in as the next bit of the quotient.
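The shift-and-subtract procedure above can be sketched in software (a minimal illustration, not a hardware-accurate datapath; `long_divide` is a name introduced here):

```python
def long_divide(dividend, divisor, n_bits):
    """Shift-and-subtract (restoring) division: one quotient bit
    per iteration, mirroring the worked example above."""
    quotient, remainder = 0, 0
    for i in range(n_bits - 1, -1, -1):
        # Bring down the next bit of the dividend.
        remainder = (remainder << 1) | ((dividend >> i) & 1)
        if remainder >= divisor:
            remainder -= divisor            # divisor fits: quotient bit is 1
            quotient = (quotient << 1) | 1
        else:
            quotient <<= 1                  # divisor too large: quotient bit is 0
    return quotient, remainder

print(long_divide(0b1001010, 0b1000, 7))    # (9, 2): 1001ten, remainder 10ten
```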
 31 | 30 .. 23 | 22 .. 0
  S | Exponent | Fraction
1 bit   8 bits   23 bits

value = (-1)^S * (1 + Fraction) * 2^(Exponent - 127)
What is the bit pattern for dividing two floating point numbers?
To divide A by B in this format:

Sign: S = SA xor SB

  SA SB | S
   0  0 | 0
   0  1 | 1
   1  0 | 1
   1  1 | 0

Exponent: 2^(EA - 127) / 2^(EB - 127) = 2^(EA - EB), so the new biased exponent is EA - EB + 127.

Significand: (1 + FA) / (1 + FB), renormalized into [1, 2).

Result: (-1)^S * ((1 + FA) / (1 + FB)) * 2^(EA - EB)
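As an illustration of the field arithmetic for dividing two single-precision numbers, here is a small sketch (the `fields` helper is introduced here for illustration; normal numbers only):

```python
import struct

def fields(x):
    """Split a float into IEEE-754 single-precision fields:
    (sign, biased exponent, fraction bits)."""
    bits = struct.unpack('>I', struct.pack('>f', x))[0]
    return bits >> 31, (bits >> 23) & 0xFF, bits & 0x7FFFFF

# Dividing A = -12.0 by B = 4.0:
sa, ea, fa = fields(-12.0)
sb, eb, fb = fields(4.0)
print(sa ^ sb)        # 1: the quotient is negative
print(ea - eb + 127)  # 128: tentative biased exponent, before normalization
# significand: (1 + fa/2**23) / (1 + fb/2**23) = 1.5 / 1.0, already in [1, 2)
```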
DIVISION METHODS
Digit Recurrence
Functional Iteration
Very High Radix
Variable Latency
DIGIT RECURRENCE
Digit recurrence algorithms use subtractive methods to calculate quotients one digit per iteration.
Restoring Division
Long Division method shown earlier
Non-Restoring Division
Sweeney, Robertson, and Tocher (SRT) Method
Similar to non-restoring division, but it uses a lookup table based on the dividend and the divisor to determine each quotient digit.
Intel Pentium's floating-point division bug was caused by an incorrectly coded lookup table: five of the 1066 entries had been mistakenly omitted.
NON-RESTORING DIVISION
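A rough software sketch of the idea (not the actual hardware datapath): the partial remainder is allowed to go negative, and the divisor is added back on the next iteration instead of being restored immediately.

```python
def nonrestoring_divide(dividend, divisor, n_bits):
    """Non-restoring division sketch: alternate subtract/add depending
    on the sign of the partial remainder; one correction at the end."""
    r, q = 0, 0
    for i in range(n_bits - 1, -1, -1):
        add_back = r < 0
        # Shift the next dividend bit into the partial remainder.
        r = (r << 1) | ((dividend >> i) & 1)
        # Add the divisor back if the last step went negative,
        # otherwise subtract as usual.
        r = r + divisor if add_back else r - divisor
        q = (q << 1) | (0 if r < 0 else 1)
    if r < 0:            # single final correction for the remainder
        r += divisor
    return q, r

print(nonrestoring_divide(0b1001010, 0b1000, 7))  # (9, 2)
```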
DIGIT RECURRENCE
Advantages:
Works
Very simple

Drawbacks:
Execution time is relative to the length of the numbers
On 32-bit systems, let alone 64-bit, it can be very taxing
FUNCTIONAL ITERATION
q = a * (1 / b)
NEWTON-RAPHSON
FUNCTIONAL ITERATION
Unlike digit recurrence division, division by functional iteration utilizes multiplication as the fundamental
operation. The primary difficulty with subtractive division is the linear convergence to the quotient. Multiplicative
division algorithms, though, are able to take advantage of high-speed multipliers to converge to a result
quadratically. Rather than retiring a fixed number of quotients bits in every cycle, multiplication-based algorithms
are able to double the number of correct quotient bits in every iteration. However, the tradeoff between the two
classes is not only latency in terms of the number of iterations, but also the length of each iteration in cycles.
Additionally, if the divider shares an existing multiplier, the performance ramifications on regular multiplication
operations must be considered. Studies report that in typical floating-point applications, the performance
degradation due to a shared multiplier is small. Accordingly, if area must be minimized, an existing multiplier may
be shared with the division unit with only minimal system performance degradation.
For modern division implementations, the most common method of generating starting approximations is
through a look-up table. Such a table is typically implemented in the form of a ROM or a PLA.
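A minimal sketch of the Newton-Raphson reciprocal iteration, assuming the divisor has been scaled into [0.5, 1) as a hardware unit would do with the significand; the linear seed below stands in for the ROM/PLA lookup table mentioned above:

```python
def nr_reciprocal(b, iterations=4):
    """Newton-Raphson reciprocal: x_{i+1} = x_i * (2 - b * x_i).
    Each step roughly doubles the number of correct bits."""
    # Classic linear seed for b in [0.5, 1); max relative error 1/17.
    x = 48.0 / 17.0 - (32.0 / 17.0) * b
    for _ in range(iterations):
        x = x * (2.0 - b * x)   # quadratically convergent refinement
    return x

# Division becomes a reciprocal plus one multiply:
print(3.0 * nr_reciprocal(0.75))   # ~4.0
```

With a 1/17 initial error, four quadratic steps are already below double-precision rounding noise; a hardware table would trade a larger seed ROM for fewer iterations.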
FUNCTIONAL ITERATION
Advantages:
Faster than digit recurrence
Does not converge linearly to an answer
Can potentially reduce latency in terms of the number of iterations, although iterations take more clock cycles

Drawbacks:
Not as common as digit recurrence; unfamiliar
More difficult to implement
Does not give a remainder at the end
VERY HIGH RADIX
The term very high radix applies roughly to dividers that determine more than 10 bits of the quotient in every iteration.
Digit recurrence algorithms are readily applicable to low-radix division and square-root implementations. As the
radix increases, the quotient-digit selection hardware and the generation of divisor multiples become more complex,
increasing cycle time, area or both. To achieve very high radix division with acceptable cycle time, area, and means
for precise rounding, it is necessary to use a variant of the digit recurrence algorithms, with simpler quotient-digit
selection hardware.
ROUNDING ERROR
For example, the decimal number 0.1 is not representable in binary floating-point of any finite precision; the exact binary representation would have a "1100" sequence continuing
endlessly:
e = -4; s = 1100110011001100110011001100110011...,
where s is the significand and e is the exponent. When rounded to 24 bits this becomes
e = -4; s = 110011001100110011001101,
which is actually 0.100000001490116119384765625 in decimal.
As a further example, the real number pi, represented in binary as an infinite series of bits, is
11.0010010000111111011010101000100010000101101000110000100011010011...
but is
11.0010010000111111011011
when approximated by rounding to a precision of 24 bits.
In binary single-precision floating-point, this is represented as s = 1.10010010000111111011011 with e = 1. This has a decimal value of
3.1415927410125732421875,
whereas a more accurate approximation of the true value of pi is
3.14159265358979323846264338327950...
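The two roundings above can be reproduced by forcing values through single precision (a small sketch; `f32` is a helper name introduced here):

```python
import math
import struct

def f32(x):
    """Round a Python float (a double) to IEEE-754 single precision."""
    return struct.unpack('>f', struct.pack('>f', x))[0]

# 0.1 rounded to 24 significant bits:
print(f'{f32(0.1):.27f}')       # 0.100000001490116119384765625
# pi rounded to 24 significant bits:
print(f'{f32(math.pi):.22f}')   # 3.1415927410125732421875
```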
Advantages:
Similar advantages to digit recurrence
Provides a remainder

Drawbacks:
Similar disadvantages to digit recurrence
Rounding error is a larger problem
VARIABLE LATENCY
DIVISION CONSIDERATIONS
Instead of dedicating huge areas of silicon to a single-cycle divide unit, many CPUs have more than one unit for a
task; these units can operate in parallel while remaining optimized for their own specific situations.
Floating-point division has historically not been extremely important, or at least the tradeoff hasn't been worth
taking for computer architects. We are getting to the point where floating point is more commonplace, and it
might make sense to justify this use of chip real estate.
Chip makers must decide what to dedicate to division
Area and Power
Time