Multiplication Paper

International Journal of Artificial Intelligence and Mechatronics
Volume 4, Issue 3, ISSN 2320 5121
Computational Time, Latency, Throughput

Improvement of Digit-Serial/Parallel Finite Field
Multiplier
Vinay Patel
Vinod Pathak
Student, Department of ECE

Lakshmi Narain College of Technology, Bhopal (M.P), India
Email: vinaypatel28290@gmail.com
Asst. Prof., Department of ECE

Lakshmi Narain College of Technology, Bhopal (M.P), India
Email: vpathak65@gmail.com
Abstract In this work a novel recursive decomposition

algorithm for Redundant Basis (RB) multiplication is use to
obtain high-throughput digit-serial implementation. A
parallel architecture using Manchester carry chain adder is
design for this purpose. Since it gives faster addition in
multiplication process with less number of transistors. The
carries of this adder are computed in parallel by two
independent 4-bit carry chains which reduce the carry chain
length. The circuit is propose for 8-bit adder module with
significant operating speed improvement compared to the
corresponding adders based on the standard 4-bit MCC
adder module. In this work the length of input sequence for
multiplication is from two to eight.
In [2] a low power and low area array multiplier with

using transmission gate with carry save adder is proposed.
Their propose circuit bypass the final addition stage of the
multiplier as compare to the conventional parallel array
multiplier. The conventional and proposed multiplier both
are design with 16Transistor full adder. By using
Transmission Gate in the design the number of transistor
requirement will reduce up to 56 transistors [2].
Diogo Brito, Taimur G. Rabuske, Jorge R. Fernandes,
Paulo Flores, and Jose Monteiro work on delay, area and
energy consumption constraint in IC design using
Quaternary Logic. A rise and fall delay calculates up to ps
range for 16X1 multiplexer using quaternary logic circuits.
The required logic to perform the shifts and to decode is
design using quaternary to binary converters and
look up tables. These blocks are implemented keeping
area and energy consumption as low as possible while
satisfying the delay constraints. This decoder is based on
voltage-mode self-referenced comparators that allows the
use of a standard CMOS technology and overcomes
previous design drawbacks. There presented design is a
valid solution to reduce the interconnections impact,
without increasing power consumption or losing
performance. But there design requires excessive number
of transistors for their design
Aravind E Vijayan, Arlene John, Deepak Sen work on
8-bit Vedic Multipliers for Image Processing Application.
Their design implemented on Xilinx Vertex 4 based FPGA
board and the performance was evaluated using Xilinx
ISim simulator. A comparative analysis has been
performed and the proposed architecture was found to be
efficient both in terms of area and total propagation delay
[4].
Keywords Computational Time, Latency, Throughput

Improvement, Digit-Serial/Parallel, Redundant Basis, Finite
Field Multiplier.
I. INTRODUCTION
Multiplication is the most time consuming process in
various signal processing operations like Convolution,
Circular Convolution, Cross Correlation, and autocorrelation, Image processing applications such as edge
detection, microprocessors arithmetic and logical units etc.
The performance of microprocessor is determined by
performance of the multiplier. Multiplier operation is
depends on the speed of adder. Speed of adder is affected
to its path propagation delay. Thus the multiplier is usually
slowest and area consuming element in the system. Our
designs offer tradeoffs between Computational time area,
latency and throughput for performing multiplication. So
to increase the speed of multiplier, it is requiring
improving the speed of addition. In this approach it is
required to find the longest critical paths in the multi-bit
adders and then shortening the path to reduce the total
critical path delay.
II. RELATED WORK LITERATURE REVIEW

Several digit-level serial/parallel structures for RB
multiplier over have been reported in the last years after its
introduction. A bit-serial word-parallel (BSWP)
architecture of RB multiplier is use to reduce the
complexity of implementation and for high-speed
realization. It is observed that the hardware utilization
efficiency and throughput of existing structures can be
improved by efficient design of algorithm and architecture
[1].
III. ARITHMETIC ADDITION

The multiplier module is design using continuous
addition of binary numbers. For this we can design the
Manchester carry full adders which is the series
connection of two half adder . The half adders are design
using xor, and , not gates. We will design the basic logic
gate cells and then by regularity and modularity methods
we can design four bit digital multiplier logic. We have
proposed an efficient recursive decomposition scheme for
digit-level RB multiplication, and based on that we have
derived parallel algorithms for high throughput digit-serial
multiplication. In the proposed RB multipliers, both
operands and are decomposed into a number equal to A.of
Copyright 2015 IJAIM right reserved

114

blocks to achieve digit-serial multiplication, and after that
the partial products corresponding to these blocks are
added together to obtain the desired product word.
Addition rules are the same as in the decimal system.
The sum or product of two digits may only produce one or
two digit numbers. In the latter case, if necessary, the first
digit is carried over to the next operation (on the left.) For
example, in base 7, addition of number (36)7 and (144)7 is
36 + 144 = 213. Indeed, from right to left, 6 + 4 = 13.
Then 3 + 4 + 1 = 11, and finally 1 + 1 = 2.
As everyone knows, 2 + 2 = 4. This is true in all base
systems. For example, in base 4, we have 2 + 2 = 10. In
base 3, 2 + 2 = 11. However, recollect that the number
system (4)10 = (10)4 = (11)3, and everything falls into its
right place again. Numbers equal in one base are equal in
any other base. Conversion between bases does not violate
arithmetic identities. In base 2, 2 + 2 = 4 appears as 10 +
10 = 100 - looking differently but having exactly the same
meaning [4,5].
The same, of course, is true of 2 2 = 4 which is true in
all bases starting with 5. In bases 4,3, and 2 it appears as
2 2 = 10
2 2 = 11
10 10 = 100,
Respectively.
Fig
shows the full adder schematic base on input
bypass logic. By observing the truth table of full adder, it
is shown that when the input B and C is equal to 00 or
11 then the sum output is equal to the binary value of
input A of full adder and carry output is equal to the
binary value of input B. For the input BC equal to binary
logic 01 or 10 , then the sum output is equal to input
A+1 i.e NOT A and the carry is generated with this
addition i.e
This operation is modelled by XOR gate and two 2X1
multiplexers. Thus for binary input BC =00 or 11 the
addition operation is bypass. This will reduce the
computation time and carry propagate time of adder, thus
increase the speed.
The conventional adder requires five number of logic

gates that in turn requires 45 transistors. While in our
design using bypass logic it requires 20 numbers of
transistors. In this bypass logic circuit the two 2X1
multiplexers requires 10 transistors, Xor gates requires 12
transistors, buffer requires 4 transistors and completer
logic requires 6 transistors, i.e total transistor require are
20. This will reduce the area of full adder circuit.
IV. METHODOLOGY
To illustrate this methodology, let us consider the
multiplication of different length of two decimal numbers.
Here we show the implementation between input sequence
length two to eight. In this technique, the numbers to be
multiplied let us say bi and ai are written on the
consecutive sides of the square table. On partitioning the
square into rows and columns, each row/column belongs
to one of the digit of the two numbers to be multiplied
such that every digit of one number has a small square
common to the digit of other number. These small squares
are further divided into two equal parts by crosswise lines.
Now the each digit of one number is multiplied with every
digit of second number and two digit products are placed
in their corresponding square.
Length of input sequence x(n) is = L= 2 Length of input
sequence h(n) is = M+1 = 2
Length of output sequence y(n) is = L+M+1-1 = 3
Multiplicand = [a0, a1]
Multiplier = [b0, b1]
Let us consider the two finite input sequence is of length
two as 99X99 the the result is 9801.
9 (a0)
9 (b0)
9 (b0)
81 (a0b0)
81 (a1b0)
9 (b1)
81 (a0b1)
81 (a1b1)
S0
81 (S00)
162 (S01)
81 (S02)
S1
8 (S10)
17 (S11)
10 (S12)
1 (S13)
S2
9 (S20)
8 (S21)
0 (S22)
1 (S23)
Then their product is evaluated as shown

s00 = a0b0
s01= a0b1+a1b0
s02=a1b1
At row S0, The element S00=a0.b0=81, element
S01=a0.b1+a1b0=81+81=162, element S02=a1.b1=81.
At row S1, The element S13=LSB (S02)=1, element
S12=MSB(S02)+LSB(S01)=2+8=10,
element
S11=
MSB(S01)+LSB(S00)=16+1=17, S10= MSB(S00)=8=8.
At row S2, The element S23=(S13)=1, element
S22=LSB(S12)=0=0, element
S21= LSB(S11)+mSB(S12)=7+1=8,
S20= MSB(S10)+LSB(S11)=8+1=9.
Fig.1. Binary input Bypass full adder logic
This gives the multiplication of 99X99=9801.
115

Similarly let take one more example of two number i.e
83X77 is
8 (b0)
7 (a0)
56 (a0b0)
7 (b0)
56 (a1b0)
3 (b1)
21 (a0b1)
21 (a1b1)
S0
56 (S00)
77 (S01)
21 (S02)
S1
5 (S10)
13 (S11)
9 (S12)
1 (S13)
S2
6 (S20)
3 (S21)
9 (S22)
1 (S23)
throughput since it gives faster addition in multiplication

process with less number of transistor.
REFERENCES
[1]
[2]
The multipliers and adders used are developed from

basic logic gates. The implementation and simulation was
carried out in VHDL and ISim respectively. The first step
carried out was the preparation and simulation of the
various subsystems required to build the multiplier [5,6].
[3]
[4]
[5]
V. ALGORITHM
1.
Take the first n bit element an which is to be

multiplied.
2. Take the second n bit element bn which is to be
multiplied.
3. For n bit multiplication, take n+1 row i.e rowS0, row
S1, row S2.up to row Sn+1.
4. Initially all elemnt of row S are initialize to zero.
5. At row S0, the elements of multiplier is s00 = a0b0,
s01= a0b1+a1b0. s02=anbn.
6. At row S1, The element S1(n+1)=LSB (S0n), element
S1(n)=MSB(S0n)+LSB(S0(n-1)), element S1(n-1)=
MSB(S0(n-1))+LSB(S0(n-2))..
7. Similarly all rows are evaluated.
8. Calculate the computational time.
9. Calculate the latency of multiplication.
10. Calculate the throughput of multiplication.
[6]
Jiafeng Xie, Pramod Kumar Meher, and Zhi-Hong Mao "HighThroughput Finite Field Multipliers Using Redundant Basis for
FPGA and ASIC Implementations " IEEE Transactions On
Circuits And SystemsI: Regular Papers, Vol. 62, No. 1,
January 2015 pp no 110.
Gustavo D. Sutter, Jean-Pierre Deschamps, and Jos Luis Imaa
"Efficient Elliptic Curve Point Multiplication Using Digit-Serial
Binary Field Operations" IEEE Transactions On Industrial
Electronics, Vol. 60, No. 1, January 2013 pp no. 217.
Diptendu Kumar Kundu1, Supriyo Srimani2, Saradindu Panda3,
Prof. Bansibadan Maji4 " Implementation of Optimized High
Performance 4x4 Multiplier using Ancient Vedic Sutra in 45 nm
Technology" IEEE Second International Conference on Devices,
Circuits and Systems (ICDCS) 2014.
Aravind E Vijayan, Arlene John, Deepak Sen "Efficient
Implementation of 8-bit Vedic Multipliers for Image Processing
Application" IEEE International Conference on Contemporary
Computing and Informatics (IC3I) 2014 pp no. 544-550.
Alex Pappachen James, Dinesh S. Kumar, and Arun Ajayan
"Threshold Logic Computing: Memristive-CMOS Circuits for
Fast Fourier Transform and Vedic Multiplication" IEEE
Transactions On Very Large Scale Integration (VLSI) Systems
Jan 2015.
P. E. Allen and D. R. Holberg, CMOS Analog Circuit Design.
New York, NY, USA: Oxford Univ. Press, 2011.
VI. CONCLUSION
In this propose work, the different type multiplication
operation use in convolution, circular convolution, cross
correlation, auto correlation is possible. A fast
computation of multiplication operations of two finite
length sequences implemented with the help of this
proposes algorithm is possible using VHDL language. In
our work we proposed a Manchester carry chain adder
base algorithm for RB multiplication to derive highthroughput digit-serial multipliers. By using the propose
flow diagram we will design high-throughput digit-serial
RB multipliers are derived using hardware description
language (VHDL) to achieve significantly less gate counts
than the existing ones. Moreover, efficient structures with
low register-count have been derived for area-constrained
implementation. The design has an efficient recursive
decomposition scheme for digit-level RB multiplication,
and based on that we have derived parallel algorithms for
high throughput digit-serial multiplication. In our work we
propose Manchester carry chain adder to improve
116

Multiplication Paper

Caricato da

Informazioni sul documento

Copyright

Formati disponibili

Condividi questo documento

Condividi o incorpora il documento

Opzioni di condivisione

Hai trovato utile questo documento?

Questo contenuto è inappropriato?

Copyright:

Formati disponibili

Multiplication Paper

Caricato da

Copyright:

Formati disponibili

International Journal of Artificial Intelligence and Mechatronics

Volume 4, Issue 3, ISSN 2320 5121

Computational Time, Latency, Throughput

Student, Department of ECE

Asst. Prof., Department of ECE

Abstract In this work a novel recursive decomposition

In [2] a low power and low area array multiplier with

Keywords Computational Time, Latency, Throughput

II. RELATED WORK LITERATURE REVIEW

III. ARITHMETIC ADDITION

Copyright 2015 IJAIM right reserved

International Journal of Artificial Intelligence and Mechatronics

The conventional adder requires five number of logic

Then their product is evaluated as shown

International Journal of Artificial Intelligence and Mechatronics

throughput since it gives faster addition in multiplication

The multipliers and adders used are developed from

Take the first n bit element an which is to be

Potrebbero piacerti anche