Stage 1:
• The larger exponent is found by comparing the exponents of operand A and operand B.
• If E1 is larger than E2, E2 is subtracted from E1 to obtain the number of positions by which mantissa B must be shifted right, so that mantissa A and mantissa B are aligned before the addition or subtraction in the next stage.
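The alignment step above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `align_mantissas` is hypothetical, mantissas are modelled as 24-bit integers with the hidden bit included, bits shifted out are simply truncated, and the case E2 > E1 (shifting mantissa A instead) is assumed to be symmetric, which the text does not spell out.

```python
def align_mantissas(e1, m1, e2, m2):
    """Return (common exponent, aligned mantissa A, aligned mantissa B).

    Assumption: mantissas are unsigned integers holding 1.f in 24 bits;
    bits shifted out on the right are dropped (no guard or rounding bits).
    """
    if e1 >= e2:
        shift = e1 - e2          # number of positions to shift mantissa B right
        return e1, m1, m2 >> shift
    shift = e2 - e1              # assumed symmetric case: shift mantissa A right
    return e2, m1 >> shift, m2


# Example: A = 1.5 * 2**3 (biased exponent 130), B = 1.0 * 2**1 (biased exponent 128)
exp, a, b = align_mantissas(130, 0xC00000, 128, 0x800000)
```

After the call both mantissas refer to the same exponent, so the next stage can add or subtract them directly.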
Figure 4. Flow chart of the 32-bit floating-point divider (blocks: exponent subtraction, 2K×25-bit ROM, 25-bit multiplier, 12-bit multipliers, aligned addition, aligned subtraction, leading-zero detection, normalization; output fields S, E, mantissa).

Suppose the dividend and the divisor, both in single precision, are in the following format.

$$m_1 = (-1)^{s_1} \times 2^{E_1} \times 1.f_1 \tag{2}$$

$$m_2 = (-1)^{s_2} \times 2^{E_2} \times 1.f_2 \tag{3}$$

So the quotient $q$ can be expressed as (4).

$$q = \frac{m_1}{m_2} = (-1)^{s_1 - s_2} \times 2^{E_1 - E_2} \times \frac{1.f_1}{1.f_2} \tag{4}$$

Figure 5. Split of A and B into high parts and low parts.

Therefore A and B, where $A = 1.f_1$ and $B = 1.f_2$, can be expressed as (5) and (6).

$$A = a_h + a_l \tag{5}$$

$$B = b_h + b_l \tag{6}$$

Substituting the above equations into (4), the last factor of the expression can be rewritten as follows.

$$\frac{1.f_1}{1.f_2} = \left(\frac{a_h}{b_h} + \frac{a_l}{b_h}\right)\left(\frac{1}{1 + b_l/b_h}\right) \tag{7}$$

The Taylor series expansion of $\frac{1}{1 + b_l/b_h}$ is given by (8).

$$\frac{1}{1 + b_l/b_h} = 1 - \frac{b_l}{b_h} + \left(\frac{b_l}{b_h}\right)^2 - \left(\frac{b_l}{b_h}\right)^3 + \cdots \tag{8}$$

By properly selecting $n$, we can neglect the second-order and higher terms in the above Taylor series with a reasonable approximation:

$$\frac{1}{1 + b_l/b_h} \approx 1 - \frac{b_l}{b_h}.$$

So the last factor of expression (4) can be approximated by equation (9).

$$\frac{1.f_1}{1.f_2} \approx \frac{a_h}{b_h} + \frac{a_l}{b_h} - \frac{a_h}{b_h} \times \frac{b_l}{b_h} - \frac{a_l}{b_h} \times \frac{b_l}{b_h} \tag{9}$$

For the same reason mentioned above, the last term in (9) can also be neglected. The value of $\frac{1}{b_h}$ is stored in a lookup table. With this algorithm, the division requires only four multiplications and two additions, which greatly reduces the chip cost and latency.

D. Error analysis

Only the division algorithm introduces approximation error. Is this approximation error acceptable? The maximum absolute approximation error can be found mathematically. The following inequalities (10) and (11) hold for this infinite Taylor series because $0 < \frac{b_l}{b_h} < 1$ and $1.0 < b_h < 2$.

$$\left(\frac{b_l}{b_h}\right)^2 - \left(\frac{b_l}{b_h}\right)^3 + \left(\frac{b_l}{b_h}\right)^4 - \cdots < \left(\frac{b_l}{b_h}\right)^2 < (b_l)^2 \tag{10}$$

The values of $a_l$ and $b_l$ can be expressed in the following format.

$$a_l = \underbrace{0.0\cdots0}_{24-n}\,\underbrace{xx\cdots xx}_{n} \tag{13}$$

$$b_l = \underbrace{0.0\cdots0}_{24-n}\,\underbrace{xx\cdots xx}_{n} \tag{14}$$

Using the above expressions, the values of $(b_l)^2$ and $a_l \times b_l$ take the following format.

$$(b_l)^2 = \underbrace{0.0\cdots0}_{2\times(24-n)-1}\,\underbrace{xx\cdots xx}_{2n} \tag{15}$$

$$a_l \times b_l = \underbrace{0.0\cdots0}_{2\times(24-n)-1}\,\underbrace{xx\cdots xx}_{2n} \tag{16}$$

Therefore the maximum absolute approximation error can be expressed as (17).

$$\mathrm{Error}_{\max} = 0.0\cdots0\,xx\cdots xx \tag{17}$$
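The approximation in (9), with its last term dropped, and the size of the resulting error can be checked numerically. The following is a minimal sketch, not the hardware design: the names `split`, `approx_divide` and `abs_error` are hypothetical, $n = 12$ is assumed (suggested by the 12-bit multipliers in Figure 4), and the lookup-table value $1/b_h$ is taken as exact, so the 25-bit quantization of the ROM is not modelled. Exact rational arithmetic is used so that only the approximation error itself is measured.

```python
from fractions import Fraction

N_LOW = 12           # assumed n: width of the low part in bits
SCALE = 1 << 23      # a mantissa 1.f is stored as an integer in [2**23, 2**24)


def split(mant):
    """Split a 24-bit mantissa into (high part, low part) as exact fractions."""
    low_mask = (1 << N_LOW) - 1
    high = Fraction(mant - (mant & low_mask), SCALE)
    low = Fraction(mant & low_mask, SCALE)
    return high, low


def approx_divide(mant_a, mant_b):
    """Approximate 1.f1 / 1.f2 using (9) without its last term."""
    ah, al = split(mant_a)
    bh, bl = split(mant_b)
    inv_bh = 1 / bh              # stands in for the lookup-table value of 1/bh
    t1 = ah * inv_bh             # multiplication 1
    t2 = al * inv_bh             # multiplication 2
    t3 = bl * inv_bh             # multiplication 3
    return t1 + t2 - t1 * t3     # multiplication 4, one addition, one subtraction


def abs_error(mant_a, mant_b):
    """Absolute difference between the exact quotient and the approximation."""
    exact = Fraction(mant_a, mant_b)
    return abs(exact - approx_divide(mant_a, mant_b))


# Worst-case style input: largest dividend, divisor with bh = 1 and all-ones bl
worst = abs_error(0xFFFFFF, 0x800FFF)
```

In this idealized model with $n = 12$, the neglected Taylor terms contribute less than $2^{-21}$ and the neglected last term of (9) less than $2^{-22}$, so `abs_error` stays below $2^{-21}$ for any pair of normalized mantissas.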
Figure 7. Flow chart for testing the floating-point units using DSP

V. VERIFICATION AND COMPARISON STUDY

An FFT algorithm was adopted to verify the correctness of the results obtained by the FPU and to compare the computation efficiency of the FPU with that of the DSP algorithms (supported by the run-time library). All these experiments were carried out on the digital platform shown in Fig. 8. The FFT results calculated by the FPU were exported using the "data saving" feature available in CCS and then analyzed in Matlab. The FFT results obtained by the FPU are fully consistent with those calculated by Matlab.

Figure 8. Proposed digital platform for testing FPUs

To study the computation efficiency of the DSP algorithms and the FPU, the FFT algorithm was run with different numbers of points. The time consumed by the DSP algorithms and by the FPU was measured with the DSP T2 timer as the interval between the very beginning of the FFT calculation and its end. Note that the operating environment was the same for all these tests. The computation time of the two methods is shown in Fig. 9. From this figure, it is clear that the computation time of the FPU is about five times shorter than that of the DSP algorithms. If the FPU were interfaced to the DSP with a 32-bit data width or integrated into the DSP like other peripherals, the computation efficiency would improve further, since at least two-thirds of the time is spent organizing, feeding and retrieving data to and from the FPU.

Figure 9. Computation time of FFT between FPU and DSP algorithms

VI. CONCLUSION

In this paper, a floating-point coprocessor is successfully configured by FPGA to enhance the computation capability and flexibility of the digital platform for power electronics applications. The coprocessor can operate at 25 MFLOPS. The computation error of the coprocessor, though larger than the 0.5 ulp required by the IEEE 754 standard, is less than 2 ulps, which is sufficiently accurate for common applications. The platform is being used for prototype development and implementation of computationally intensive algorithms for various power electronics applications.