International Journal of Advance Foundation and Research in Computer (IJAFRC)

Volume 2, Issue 11, November - 2015. ISSN 2348 4853, Impact Factor 1.317

Synthesis and Simulation of Floating Point Multipliers


Dr. P. N. Jain1, Dr. A.J. Patil2, M. Y. Thakre3
1Professor and Academic Dean, Department of E&TC, Shri. Gulabrao Deokar College of Engineering,
Jalgaon, India.
2Principal, Department of E&TC, Shri. Gulabrao Deokar College of Engineering, Jalgaon, India.
3PG Student, Department of E&TC, Shri. Gulabrao Deokar College of Engineering, Jalgaon, India.
Email- jainpnj@gmail.com, principal@sgdcoejalgaon.org, thakre.mayuri23@gmail.com
ABSTRACT
Performance of floating point arithmetic units is of prime importance in several areas of
computing, signal processing and medical imaging. The binary representation of decimal
floating-point numbers permits an efficient application of the advanced radix-independent IEEE
standard for floating-point arithmetic. Multiplication is a representative and core operation
which demands a high-performance and area-efficient implementation. A binary multiplier is an
integral part of the arithmetic logic unit (ALU) found in various processors. Integer
multiplication is likely to be inefficient and costly, in time and hardware, depending on the
representation of signed numbers. In this project, methods of implementing floating point
multiplication are explored. The prime objective is to develop a multiplier which
is compliant with the IEEE 754 floating-point standard. For this purpose two different algorithms
were studied and designs were developed for a serial multiplier and a parallel multiplier. The
concept of a control circuit that controls the arithmetic network was used to implement the
multipliers. This work is primarily aimed at implementing floating-point multiplication on an FPGA
platform using VHDL, owing to the logic resources available and the flexibility of implementation.
Xilinx ISE 9.2i was used for the purpose. Use of VHDL provided a technology-independent
hardware design. The design was targeted for the SPARTAN3 FPGA and the device chosen was
XC3S500e-5pq208. A test-bench was designed and the results were verified against hand
calculations and published results. The two multipliers were compared on the
basis of device utilization, timing summary and throughput of the design. The performance of the
present multipliers could be improved by pipelining, with a trade-off in area and
power.
I. FLOATING POINT MULTIPLICATION
A simple floating point multiplication of two binary numbers can be expressed mathematically as:
$$(F_1 \times 2^{E_1}) \times (F_2 \times 2^{E_2}) = (F_1 \times F_2) \times 2^{E_1 + E_2} = F \times 2^{E}$$
The basic steps involved in FP multiplication are as follows:
1. Add exponents
2. Multiply fractions
3. If product is zero, adjust for proper zero
4. Normalize product fraction
5. Check for exponent overflow or underflow
6. Round product fraction

27 | 2015, IJAFRC All Rights Reserved

www.ijafrc.org

Thus multiplication includes the addition of exponents and multiplication of fractions. The product
fraction is then normalized and exponent overflow or underflow is observed. Finally the product fraction
is rounded. The flow chart diagram for floating point multiplication is shown in Figure 1 [1].

Figure 1: Flow Chart Diagram for Floating Point Multiplication
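
The six steps listed above can be sketched as a small behavioral model in Python (this is an illustration, not the VHDL used in this work). The fraction width `FRAC_BITS`, the exponent range `E_MIN`..`E_MAX`, the function name `fp_multiply`, and the use of `E_MIN` as the exponent of a "proper zero" are assumptions made for the example. The exponent check is placed after rounding so that it also covers a carry out of the rounding step.

```python
from fractions import Fraction

FRAC_BITS = 3          # assumption: magnitude bits kept in the fraction
E_MIN, E_MAX = -8, 7   # assumption: 4-bit two's complement exponent range


def fp_multiply(f1, e1, f2, e2):
    """Return (fraction, exponent, status) for (f1 * 2**e1) * (f2 * 2**e2).

    Non-zero input fractions are expected to be normalized: 1/2 <= |f| < 1.
    """
    e = e1 + e2                               # 1. add exponents
    f = Fraction(f1) * Fraction(f2)           # 2. multiply fractions
    if f == 0:                                # 3. adjust for proper zero
        return Fraction(0), E_MIN, "zero"
    while abs(f) < Fraction(1, 2):            # 4. normalize product fraction
        f *= 2
        e -= 1
    scale = 2 ** FRAC_BITS                    # 6. round to FRAC_BITS bits
    mag = int(abs(f) * scale + Fraction(1, 2))
    if mag == scale:                          # rounding carried up to 1.0
        mag //= 2
        e += 1
    f = Fraction(mag, scale) * (1 if f > 0 else -1)
    if e > E_MAX:                             # 5. exponent overflow/underflow
        return f, e, "overflow"
    if e < E_MIN:
        return f, e, "underflow"
    return f, e, "ok"


# (0.5 * 2**2) * (0.75 * 2**1) = 2 * 1.5 = 3 = 0.75 * 2**2
print(fp_multiply(Fraction(1, 2), 2, Fraction(3, 4), 1))
```

Using exact rational arithmetic keeps the sketch independent of any particular register width; a hardware version would carry the same steps out on fixed-width registers.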


II. SYSTEM IMPLEMENTATION
The concept of a control circuit that controls the arithmetic network was used to implement the
multipliers. This concept is important for coordinating the behavior of the surrounding
subsystems. The use of a control unit, or system controller, leads to a systematic and structured
approach to digital system design [2]. Two different algorithms for high speed hardware multipliers
were studied and chosen for implementation.
1. Serial multiplier
2. Parallel multiplier
2.1 SERIAL MULTIPLIER:
The hardware required to implement the FP multiplier consists of an exponent adder and a fraction
multiplier. The project work started with the development of a simple multiplier unit and an adder
circuit for floating point numbers represented by a 4 bit fraction and a 4 bit exponent [1]. The
basic multiplier unit was modified to multiply signed binary numbers. The design of the serial
adder with accumulator is described in the section below, followed by the binary multiplier and the
signed binary multiplier [1]. This 4 bit multiplier is synthesized, simulated and tested for various
combinations, and the same system is extended for 32 bit floating point multiplication.
2.2 Design of an adder for serial multiplier:
A serial adder with accumulator, as in [1], was used for this purpose. The block diagram for a 4 bit
serial adder with control circuit is shown in Figure 2-2.

Figure 2-2: Block diagram for a 4 bit serial adder


Two shift registers are used to hold the 4-bit numbers to be added, X and Y. The box at the left end of
each shift register represents the inputs: Sh (shift), SI (serial input), and Clock. When Sh = 1 and the
clock is pulsed, SI is entered into the leftmost bit of the register as the contents of the register are
shifted right one position. The X register serves as the accumulator, and after four shifts, the number X
is replaced with the sum of X and Y. The addend register is connected as a cyclic shift register, so after
four shifts it is back to its original state and the number Y is not lost. The serial adder consists of a
full adder and a carry flip-flop. At each clock time, one pair of bits is added. When Sh = 1, the falling
edge of the clock shifts the sum bit into the accumulator, stores the carry bit in the carry flip-flop,
and causes the addend register to rotate right. Additional connections are needed for initially loading
the X and Y registers and clearing the carry flip-flop [5].
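
The behavior of the serial adder with accumulator described above can be modeled in a few lines of Python. This is a minimal behavioral sketch, not the VHDL of this work; the function name `serial_add` and the parameter `n` (register width) are assumptions introduced for illustration.

```python
def serial_add(x, y, n=4):
    """Bit-serial addition with an accumulator.

    After n clock pulses the X register holds (x + y) mod 2**n, the Y
    register has rotated back to its original value, and the carry
    flip-flop holds the carry out of the most significant bit.
    """
    mask = (1 << n) - 1
    x &= mask
    y &= mask
    carry = 0                              # carry flip-flop, cleared at start
    for _ in range(n):                     # one pair of bits per clock
        xi, yi = x & 1, y & 1              # rightmost bits feed the full adder
        s = xi ^ yi ^ carry                # full adder: sum bit
        carry = (xi & yi) | (xi & carry) | (yi & carry)  # full adder: carry
        x = (x >> 1) | (s << (n - 1))      # sum bit shifts into the accumulator
        y = (y >> 1) | (yi << (n - 1))     # addend register rotates right
    return x, y, carry


print(serial_add(5, 6))   # X becomes 5 + 6, Y is unchanged
```

Rotating Y instead of simply shifting it is what lets the addend survive the operation, as the text notes.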
2.3 Design of a Multiplier unit:
Figure 2-3 shows the hardware required to multiply two 4 bit fractions. Multiplication of two 4-bit
numbers needs a 4-bit multiplicand register, a 4-bit multiplier register, a 4-bit full adder, and an 8-bit
register for the product. The product register serves as an accumulator to accumulate the sum of the
partial products. The contents of the product register are shifted to the right each time, as shown in the
block diagram of Figure 2.4. This type of multiplier is sometimes referred to as a serial-parallel
multiplier, since the multiplier bits are processed serially, but the addition takes place in parallel.
Depending on whether the current multiplier bit is 0 or 1, either a shift or an add followed by a shift
takes place. The ACC register contains the product when the multiplication is complete.

Figure 2-3: 4 bit multiplier
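
The shift-and-add operation of the serial-parallel multiplier can be illustrated with the following Python model. It is a sketch under assumptions: the function name `shift_add_multiply` and the parameter `n` are hypothetical, operands are treated as unsigned, and the product is held in a Python integer that briefly uses one extra bit for the adder carry before each right shift.

```python
def shift_add_multiply(multiplicand, multiplier, n=4):
    """Unsigned n-bit by n-bit multiplication using add and shift-right."""
    mask = (1 << n) - 1
    multiplicand &= mask
    mplier = multiplier & mask             # multiplier register
    product = 0                            # product register (accumulator)
    for _ in range(n):                     # multiplier bits processed serially
        if mplier & 1:
            # Parallel addition of the multiplicand into the upper half.
            product += multiplicand << n
        product >>= 1                      # shift the partial product right
        mplier >>= 1                       # expose the next multiplier bit
    return product


print(shift_add_multiply(13, 11))          # 13 * 11
```

A multiplier bit of 0 costs only a shift, while a bit of 1 costs an add and a shift, which is why the cycle count of a hardware version depends on how the controller schedules these two operations.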


2.4 Design description and testing
The VHDL code for the system uses a behavioral style description. Two processes are used in this
description. The main process generates control signals for the system. A second process generates the
control signals for the fraction multiplier. Generation of unwanted latches was avoided by initializing
the output signals. The 4 bit FP multiplier unit was tested to cover all the special cases
in combination with positive and negative fractions, as well as positive and negative exponents [8].
III. PERFORMANCE ANALYSIS
This section presents the synthesis and simulation results for the floating point multipliers.
3.1 RESULTS FOR SERIAL FLOATING POINT MULTIPLIER
A 4 bit multiplier is synthesized and tested thoroughly and the same design is extended for a 32 bit
multiplier.

3.1.1 Synthesis Outcomes:


Table 3.1 and Table 3.2 show the device utilization and timing summary for the 4 bit and 32 bit serial
floating point multipliers. As expected, the 32 bit multiplier has much higher device utilization;
the device utilization grows with the bit size of the multiplier. The timing summary shows that
the maximum clock frequency at which the multiplier will work is almost the same for both sizes [6].
Since this is a serial multiplier, the final output, i.e. the product, is generated only after the
complete multiplication. An N bit multiplier is therefore likely to take (2N+1) cycles to produce the
final product, so the throughput of the machine is low [7].
Table 3.1: Device Utilization for 4 bit and 32 bit Serial Multiplier
Logic Utilization                        4 bit multiplier   32 bit multiplier
Number of Slice Flip Flops               18                 61
Number of 4 input LUTs                   61                 223
Number of occupied Slices                33                 113
Number of bonded IOBs                    32                 223
Total equivalent gate count for design   628                2316

Table 3.2: Timing Summary for 4 bit and 32 bit Serial Multiplier
Timing Parameter     4 bit multiplier   32 bit multiplier
Minimum period       8.004 ns           8.076 ns
Maximum Frequency    124.938 MHz        123.823 MHz
3.1.2 Simulation waveforms:
Figure 3-1 shows the results obtained from simulation of the 4 bit multiplier. The inputs are
tested for various combinations [3][4]. The signals have the interpretation as in Table 3.1. When the
start signal goes high, the multiplication begins and takes 9 cycles to complete. The done
signal is asserted when the multiplication is complete [5].

Figure 3-1: Simulation Results for 4 bit serial Multiplier


IV. PARALLEL MULTIPLIER
The goal of the design, to develop an IEEE compliant 32-bit floating point multiplier core, was met
using a simple methodology in which VHDL operators were used directly [11]. The core was required to
implement all four rounding modes: round to nearest, round toward +inf, round toward -inf and round
toward zero. All exceptions had to be handled and reported according to the IEEE standard. A generic
constant K, which can be set to 32 or 64, is provided so that this core can be extended to work with
the double precision format [12][13].
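
The four rounding modes named above can be illustrated on an unsigned significand magnitude from which some low-order bits must be discarded. The following Python sketch is an assumption-laden illustration rather than the core's actual logic: the function name `round_magnitude`, its parameters, and the mode strings are hypothetical, and "round to nearest" is taken to resolve ties to even, which is the IEEE 754 default.

```python
def round_magnitude(mag, drop, negative, mode):
    """Discard the low `drop` bits of the magnitude `mag` and round.

    `negative` is the sign of the value being rounded. The caller must
    renormalize if the result gains a bit (for example 0b111 -> 0b1000).
    """
    kept = mag >> drop
    rem = mag & ((1 << drop) - 1)          # discarded bits
    if rem == 0:
        return kept                        # exact, nothing to round
    half = 1 << (drop - 1)
    if mode == "nearest":                  # ties go to the even neighbor
        up = rem > half or (rem == half and kept & 1 == 1)
    elif mode == "+inf":                   # magnitude grows only if positive
        up = not negative
    elif mode == "-inf":                   # magnitude grows only if negative
        up = negative
    elif mode == "zero":                   # truncate
        up = False
    else:
        raise ValueError("unknown rounding mode: " + mode)
    return kept + 1 if up else kept


for mode in ("nearest", "+inf", "-inf", "zero"):
    print(mode, round_magnitude(0b10110, 2, False, mode))
```

Working on the magnitude with a separate sign keeps the directed modes symmetric: rounding toward +inf increases the magnitude of positive values and truncates negative ones, and rounding toward -inf does the opposite.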
4.1 Microarchitecture of Parallel Multiplier
Figure 4-1 shows a simple FP multiplier based on a parallel architecture. It consists of a Pre
Normalize block, a Multiplier, a Post Normalize - Round unit and an exception unit [14]. The two floating
point numbers opa and opb serve as inputs to the floating point multiplier.

A pre-normalization process is carried out in which the sum/difference of exponents is computed and
the inputs are checked for exponent overflow, exponent underflow and INF values.

Figure 4-1: Block Diagram for Floating Point Multiplier


4.2 Design description and testing
This multiplier also uses VHDL for design entry. The code is a mixed-style description that
combines dataflow and behavioral styles. Multiple processes are used in
this design. The processes load the inputs, unpack the floating point numbers and pre-normalize
them. After initial checking and multiplication, a process post-normalizes
and packs the generated output. Rounding and exception handling are also performed. A simple testing
strategy is used in which combinations of inputs are applied, which allows the features of the design
to be explored.
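
The unpack, multiply, post-normalize, round and pack sequence described above can be sketched in Python for the IEEE 754 single precision format. This is a simplified behavioral model under stated assumptions, not the VHDL core: the names `fmul32`, `unpack`, `f32_bits` and `bits_f32` are hypothetical, only round to nearest (ties to even) is implemented, and subnormal inputs and results are flushed to zero, which a fully compliant core would not do.

```python
import struct


def f32_bits(x):
    """Bit pattern of x as an IEEE 754 single precision number."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


def bits_f32(b):
    """Python float for a 32-bit single precision bit pattern."""
    return struct.unpack(">f", struct.pack(">I", b))[0]


def unpack(bits):
    """Split into sign (1 bit), biased exponent (8 bits), fraction (23 bits)."""
    return bits >> 31, (bits >> 23) & 0xFF, bits & 0x7FFFFF


def fmul32(a_bits, b_bits):
    sa, ea, fa = unpack(a_bits)
    sb, eb, fb = unpack(b_bits)
    sign = (sa ^ sb) << 31
    # Exception handling: NaN, infinity and zero operands.
    a_nan, b_nan = ea == 0xFF and fa != 0, eb == 0xFF and fb != 0
    a_inf, b_inf = ea == 0xFF and fa == 0, eb == 0xFF and fb == 0
    a_zero, b_zero = ea == 0 and fa == 0, eb == 0 and fb == 0
    if a_nan or b_nan or (a_inf and b_zero) or (a_zero and b_inf):
        return 0x7FC00000                  # quiet NaN
    if a_inf or b_inf:
        return sign | 0x7F800000           # infinity
    if ea == 0 or eb == 0:
        return sign                        # zero (subnormals flushed: simplification)
    # Pre-normalize: restore the hidden bit, add exponents, remove one bias.
    ma, mb = fa | 0x800000, fb | 0x800000
    exp = ea + eb - 127
    prod = ma * mb                         # 48-bit product of 24-bit significands
    # Post-normalize: the product lies in [1, 4), so at most one shift.
    drop = 23
    if prod >> 47:
        drop, exp = 24, exp + 1
    kept = prod >> drop
    rem = prod & ((1 << drop) - 1)
    half = 1 << (drop - 1)
    if rem > half or (rem == half and kept & 1):   # round to nearest even
        kept += 1
    if kept >> 24:                         # rounding overflowed the significand
        kept >>= 1
        exp += 1
    if exp >= 0xFF:
        return sign | 0x7F800000           # exponent overflow -> infinity
    if exp <= 0:
        return sign                        # exponent underflow flushed to zero
    return sign | (exp << 23) | (kept & 0x7FFFFF)  # pack


print(bits_f32(fmul32(f32_bits(1.5), f32_bits(2.5))))
```

For normal operands and results the model can be checked against the host arithmetic, because the exact product of two single precision values fits in a double and `struct.pack(">f", ...)` then rounds it to single precision.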
V. RESULTS FOR PARALLEL FLOATING POINT MULTIPLIER:
The single precision IEEE compliant multiplier is referred to as the Parallel Multiplier.
5.1.1 Synthesis Outcomes:
Table 5.1 presents the device utilization summary for the Parallel Multiplier.
Table 5.1: Device Utilization Summary for Parallel Multiplier
Logic Utilization                        Parallel Multiplier
Number of Slice Flip Flops               96
Number of 4 input LUTs                   256
Number of occupied Slices                148
Number of bonded IOBs                    104
Total equivalent gate count for design   2985

Table 5.2 presents the timing summary, indicating the highest speed at which the Parallel Multiplier
can work on the XC3S500e.
Table 5.2: Timing Summary for Parallel Multiplier
Timing Parameter     Parallel Multiplier
Minimum period       17.323 ns
Maximum Frequency    57.725 MHz

5.1.2 Simulation waveforms


Figure 5-1 shows the simulated output for the Parallel Multiplier. The inputs are tested for the
combinations as in Table 4.8. The results in the simulation waveform agree with hand calculations,
which were verified against [9][10]. The signals have the interpretation as in the Table. Every time
the ce signal is made high, the operands are loaded into the multiplier on the rising edge of the clock.
The computation takes place in a single cycle using the multiplication operator. The done signal is
asserted only after a successful, valid multiplication operation [15][16]. In case of exception
handling, with an input operand being zero or NaN, the done signal remains low.

Figure 5-1: Simulation waveform for the Parallel Multiplier

VI. COMPARISON BETWEEN SERIAL AND PARALLEL MULTIPLIER

Table 6.1 and Table 6.2 present the comparison of device utilization and timing summary for the serial
and the parallel multiplier. The serial multiplier is a simple multiplier built from a serial adder and
a serial-parallel multiplier. The parallel multiplier is a single precision IEEE compliant multiplier
that uses the addition and multiplication operators. Both work with 32 bits and have
exception handlers, so their device utilization is of the same order (2316 versus 2985 equivalent gates).
Table 6.1: Logic Utilization for Serial multiplier and Parallel multiplier
Logic Utilization                        Serial multiplier   Parallel multiplier
Number of Slice Flip Flops               61                  96
Number of 4 input LUTs                   223                 256
Number of occupied Slices                113                 148
Number of bonded IOBs                    223                 104
Total equivalent gate count for design   2316                2985

Table 6.2: Timing Summary for Serial multiplier and Parallel multiplier
Timing Parameter     Serial multiplier   Parallel multiplier
Minimum period       8.076 ns            17.323 ns
Maximum Frequency    123.823 MHz         57.725 MHz
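
The effect of the timing figures on throughput can be estimated with a short calculation. The sketch below is a rough estimate under assumptions, not a measured result: it takes the minimum periods from Table 6.2, assumes the serial multiplier needs (2N+1) cycles per product with N = 32 (the paper does not state whether N is the full word width or the fraction width), and assumes the parallel multiplier delivers one product per clock cycle. The function names are hypothetical.

```python
SERIAL_PERIOD_NS = 8.076      # minimum period, serial multiplier (Table 6.2)
PARALLEL_PERIOD_NS = 17.323   # minimum period, parallel multiplier (Table 6.2)


def serial_cycles(n_bits):
    """Cycles per product for an N bit serial multiplier: 2N + 1."""
    return 2 * n_bits + 1


def latency_ns(period_ns, cycles):
    """Time to produce one product."""
    return period_ns * cycles


def products_per_second(period_ns, cycles):
    """Throughput for a non-pipelined unit that starts one product at a time."""
    return 1e9 / latency_ns(period_ns, cycles)


N = 32  # assumption: N is the operand width in bits
serial = products_per_second(SERIAL_PERIOD_NS, serial_cycles(N))
parallel = products_per_second(PARALLEL_PERIOD_NS, 1)
print(f"serial:   {serial / 1e6:.2f} M products/s")
print(f"parallel: {parallel / 1e6:.2f} M products/s")
print(f"ratio:    {parallel / serial:.1f}x")
```

Under these assumptions the parallel multiplier produces on the order of 30 times more products per second, even though its clock is roughly half as fast.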
VII. CONCLUSION
This work presented the design, synthesis and simulation of 32 bit floating point multipliers. Two
different algorithms for high speed hardware multipliers, serial and parallel, were studied and chosen
for implementation. The serial multiplier uses an add and shift algorithm for multiplication, while the
parallel multiplier uses the multiplication operator. The parallel multiplier is an IEEE compliant
32-bit floating point multiplier satisfying all the requirements of rounding and exception handling.
The concept of a control circuit that controls the arithmetic network was used to implement the
multipliers.
The designs were developed using the Xilinx ISE environment, and VHDL was used for design entry. The
modules were targeted for FPGA implementation and the XC3S500e was chosen for this purpose. The
proposed designs were exhaustively tested and the calculations were verified against previous results.
A comparative analysis was done for both multipliers. The device utilization is almost the same
considering the features of the parallel multiplier. The serial multiplier operates at a higher clock
frequency than the parallel configuration (123.823 MHz versus 57.725 MHz), but its throughput is lower
because each product takes multiple cycles.
VIII. REFERENCES
[1] Charles H. Roth, Digital System Design Using VHDL. Singapore: Thomson, 2001.
[2] William Fletcher, An Engineering Approach to Digital Design. Prentice Hall, 2005.
[3] P Addanki and M Avana Venkat A., "An FPGA based high speed IEEE-754 Double Precision Floating
Point Adder/Substractor and Multiplier Using Verilog," International Journal of Advanced Science
and Technology, vol. 52, pp. 61-74, March 2013.
[4] Alvaro Vazquez, "High Performance Decimal Floating Point Units," University of Santiago, PhD thesis,
Jan 2009.

[5] Surapong Pongyupinpanich, "Optimal Design of Fixed-Point and Floating-Point Arithmetic Units for
Scientific Applications," Darmstadt University, Germany, PhD thesis, 2012.
[6] Hossam A. H. Fahmy, "A Redundant Digit Floating Point System," Stanford University, PhD thesis
2003.
[7] Anjana S and Philip Samuel Pradeep C., "Synthesize of High Speed Floating-point Multipliers Based
on Vedic Mathematics," in ICICT-2014, 2015, pp. 1294-1302.
[8] Galal S and M. Horowitz, "Energy-Efficient Floating-Point Unit," in IEEE Transactions on Computers,
2011, pp. 913-922.
[9] Concordia University, "Floating Point Adders and Multipliers".
[10] Eduardo Sanchez. Floating-Point Multipliers.
[11] Sukhvir Kaur and Parminder Singh Jassal, "Synthesis Of Double Precision Floating Point Multiplier
Using VHDL," Journal of Research in Electrical and Electronics Engineering, vol. 2, no. 2, pp. 33-39,
March 2014.
[12] P.Krishna Kumari, V.Vamsi Krishna,T.S.Trivedi P.Gayatri and V.Nancharaiah, "Design of Floating
Point Multiplier Using Vhdl," International Journal of Engineering Research and Development, vol. 10,
no. 3, pp. 73-78, March 2014.
[13] Bernie New and Bob Slous Tom Kean, "A Fast Constant Coefficient Multiplier for the XC6200," Xilinx,
Application Note.
[14] Baljinder Kaur and Vipasha Thakur, "Review of Booth Algorithm for Design of Multiplier,"
International Journal of Emerging Technology and Advanced Engineering, vol. 4, no. 4, pp. 134-137,
April 2014.
[15] Steve Wong, J Martin Dr David Parent, "A 6 Bit Multiplier for a DSP SOC,".
[16] Prashanth, P.A. Kumar, and G. Sreenivasulu, "Design & implementation of floating point ALU on a
FPGA processor," in International Conference on Computing, Electronics and Electrical Technologies
(ICCEET), Kumaracoil, 2012, pp. 772-776.
