 0  4  8 12         0  4  8 12         0  4  8 12
 1  5  9 13         5  9 13  1        13  1  5  9
 2  6 10 14        10 14  2  6        10 14  2  6
 3  7 11 15        15  3  7 11         7 11 15  3
(a) Input          (b) After ShiftRow (c) After InvShiftRow
Table 1 Result of ShiftRow and InvShiftRow

C. MixColumn and InverseMixColumn
Galois field multiplication is essential for the mix column transformation. In the 32-bit system, the columns are represented as polynomials with parameters in $GF(2^8)$. Every byte can be calculated by a function that differs between encryption and decryption. The form of the polynomials is

$$a(x) = a_3 x^3 + a_2 x^2 + a_1 x + a_0$$

In the mix column transformation, the column is multiplied modulo $x^4 + 1$ with a constant polynomial $c(x)$:

$$c(x) = \{03\}x^3 + \{01\}x^2 + \{01\}x + \{02\}$$

In decryption, the inverse is a multiplication by a fixed polynomial $d(x)$, given by

$$d(x) = c^{-1}(x) = \{0b\}x^3 + \{0d\}x^2 + \{09\}x + \{0e\}$$

Implementing this inverse mix column equation directly requires complicated multiplications by {0b}, {0d}, {09} and {0e}. It can be implemented as follows:

$$d(x) = c(x) + e(x) + f(x)$$

where

$$e(x) = \{08\}x^3 + \{08\}x^2 + \{08\}x + \{08\}$$

$$f(x) = \{04\}x^2 + \{04\}$$

However, this method of implementing InvMixColumn is very complex and inefficient, which results in a large circuit. In this design, instead of using the method mentioned above, we apply a different method following [4].

Figure 5 Implementation of mix column and inverse mix column

As shown in Figure 5, MixColumn and InvMixColumn in this system are designed to share logic and resources in order to optimize the number of slices occupied, by using the following relations (all products taken modulo $x^4 + 1$):

$$c(x) \cdot d(x) = \{01\}$$

$$c(x) \cdot d(x)^2 = d(x)$$

$$d(x)^2 = \{04\}x^2 + \{05\}$$

Instead of computing $d(x)$ directly, $d(x)^2$ can be applied after $c(x)$ in order to get the inverse. As can be seen, $d(x)^2$ has only two multiplications, and each of them is straightforward to implement. Moreover, {05} equals {04}+{01}, which means that only one multiplication needs to be calculated.
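The polynomial relations used for the shared MixColumn/InvMixColumn logic can be checked numerically. The following is a minimal Python sketch, not the authors' hardware description: the helper names (`gf_mul`, `poly_mul`, `poly_add`, `mix_column`, `inv_mix_column`) are hypothetical, and polynomials are stored as lists of four bytes with the list index equal to the power of x. It assumes the standard AES field polynomial x^8 + x^4 + x^3 + x + 1 from [3].

```python
# Software model of the polynomial arithmetic only; it says nothing about
# slice counts or timing. A polynomial k0 + k1*x + k2*x^2 + k3*x^3 is [k0, k1, k2, k3].

def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1 (0x11B)."""
    result = 0
    while b:
        if b & 1:
            result ^= a          # add (XOR) the current multiple of a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B           # reduce back into 8 bits
        b >>= 1
    return result

def poly_mul(p, q):
    """Multiply two 4-term polynomials modulo x^4 + 1 (x^4 wraps around to 1)."""
    out = [0, 0, 0, 0]
    for i, p_i in enumerate(p):
        for j, q_j in enumerate(q):
            out[(i + j) % 4] ^= gf_mul(p_i, q_j)
    return out

def poly_add(p, q):
    """Add two polynomials coefficient by coefficient (XOR)."""
    return [a ^ b for a, b in zip(p, q)]

C = [0x02, 0x01, 0x01, 0x03]   # c(x) = {03}x^3 + {01}x^2 + {01}x + {02}
D = [0x0E, 0x09, 0x0D, 0x0B]   # d(x) = {0b}x^3 + {0d}x^2 + {09}x + {0e}
E = [0x08, 0x08, 0x08, 0x08]   # e(x) = {08}x^3 + {08}x^2 + {08}x + {08}
F = [0x04, 0x00, 0x04, 0x00]   # f(x) = {04}x^2 + {04}
D2 = poly_mul(D, D)            # d(x)^2, expected to be {04}x^2 + {05}

def mix_column(col):
    """MixColumn: multiply the column polynomial by c(x)."""
    return poly_mul(C, col)

def inv_mix_column(col):
    """InvMixColumn built from the shared c(x) stage preceded by d(x)^2."""
    return mix_column(poly_mul(D2, col))

if __name__ == "__main__":
    column = [0xDB, 0x13, 0x53, 0x45]
    mixed = mix_column(column)
    print([hex(b) for b in mixed])
    print(inv_mix_column(mixed) == column)
```

Because multiplication modulo x^4 + 1 is commutative, the d(x)^2 stage can be placed before or after the c(x) stage in this model; which order the circuit of Figure 5 uses is not something this sketch settles.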
By using this method, the complexity of the system is much reduced. As a result, this component occupies a total of 56 slices for both MixColumn and InvMixColumn.

D. Key Expansion
Generally, there are two ways to implement key expansion. The first way is to process the key schedule for every block of encrypted data. In this way one can change the key very fast with no delay, but the key schedule is repeated every time, which results in speed reduction and inefficiency. The other way is to complete the whole key expansion first and store all keys in block RAM, then read them just before key addition. This second way also helps in utilising the resources, especially the SubByte component.
In order to achieve a low-area design, the second way is used. In this way, the SubByte component can be shared with the Encryption/Decryption unit. We first precompute all round keys and then store them in a 512x9 single-port block RAM (RAMB16_36); the diagram is shown in Figure 6.

[Figure 6: block diagram with blocks labelled key_in, reset, FSM, RotWord, SBox, Rcon, delay and 3-deep SRL16]
Figure 6 Structure of Key Expander

As a new key comes in, the first three columns are stored in a 3-deep shift register after one clock delay. When the fourth column flows in, it passes through RotWord and the S-Box and is then added with Rcon and the first column read from the shift register. After that, the newly created column is delayed by one clock and added with the second column. This process is repeated until all 10 round keys are done and stored. The control of Rcon and the choice of the column being calculated are done by a Finite State Machine.

E. Control Unit
The control of the data path used for encryption/decryption is done by controlling the multiplexers. The control signals for the multiplexers come from the Finite State Machine, which works as a master controller. Moreover, this master controller is also used to control the round key read
order during reading from block RAM. In this design, the controller is a 256-state FSM, which is implemented using an 8-bit counter.
In general, the value of the 8-bit counter corresponds to the address of the block RAM. In this way, the round key can easily be read from block RAM and then added to the data.
To control the datapath, the multiplexers are driven by a single Encryption/Decryption signal (E/D) through simple logic gates. Moreover, in the case of decryption, the round keys must be read in reverse order. This is done with a simple subtraction circuit that reverses the state of the FSM.
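The precomputed key schedule of Section D and the forward/reverse read order described above can be modelled in a few lines. This is a word-level Python sketch under stated assumptions, not the authors' circuit: the 3-deep shift register, the clock delays and the 256-state counter are not modelled, the list `ram` merely stands in for the block RAM, and the helper names (`expand_key`, `round_key_address`, `read_round_key`) and the round-index-to-address mapping are hypothetical. The key schedule itself follows the 128-bit expansion defined in [3].

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1 (0x11B)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result

def _sbox_entry(value: int) -> int:
    """AES S-box: multiplicative inverse in GF(2^8) followed by the affine map."""
    inverse = 0                              # {00} is mapped to itself
    for candidate in range(1, 256):
        if gf_mul(value, candidate) == 1:
            inverse = candidate
            break
    out = 0x63
    for shift in range(5):                   # XOR of the byte rotated left by 0..4 bits
        out ^= ((inverse << shift) | (inverse >> (8 - shift))) & 0xFF
    return out

SBOX = [_sbox_entry(v) for v in range(256)]

def expand_key(key: bytes):
    """Return the 44 four-byte columns (11 round keys) for a 128-bit key."""
    words = [list(key[4 * i:4 * i + 4]) for i in range(4)]
    rcon = 0x01
    for i in range(4, 44):
        temp = list(words[i - 1])
        if i % 4 == 0:                       # first column of each new round key
            temp = temp[1:] + temp[:1]       # RotWord
            temp = [SBOX[b] for b in temp]   # S-Box
            temp[0] ^= rcon                  # add Rcon
            rcon = gf_mul(rcon, 0x02)
        words.append([a ^ b for a, b in zip(words[i - 4], temp)])
    return words

def round_key_address(round_index: int, decrypt: bool) -> int:
    """Assumed mapping: decryption subtracts the round index to reverse the order."""
    return 10 - round_index if decrypt else round_index

def read_round_key(ram, round_index: int, decrypt: bool):
    """Fetch the four columns of one round key from the stored schedule."""
    base = 4 * round_key_address(round_index, decrypt)
    return ram[base:base + 4]

if __name__ == "__main__":
    ram = expand_key(bytes.fromhex("2b7e151628aed2a6abf7158809cf4f3c"))
    print([bytes(w).hex() for w in read_round_key(ram, 0, decrypt=False)])
    print([bytes(w).hex() for w in read_round_key(ram, 0, decrypt=True)])
```

The key in the usage lines is the 128-bit example key from [3]. With `decrypt=True`, round 0 returns the last stored round key, which mirrors the reverse read order that the subtraction circuit provides in hardware.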
IV. RESULT AND COMPARISON
TABLE 2 PERFORMANCE COMPARISON BETWEEN AES IMPLEMENTATIONS

                         Ours         P. Chodowiec   G. Rouvroy    S. McMillan       K. Gaj
                                      et al. [4]     et al. [5]    et al. [6]        et al. [7]
Device used              Spartan3     Spartan2-6     Spartan3      Virtex            Virtex-6
Functionality            Both         Both           Both          Encryption only   Both
Key length               128-bit      128-bit        128-bit       External key      External key
                                                                   expansion         expansion
CLB slices               288          222            163           240               2902
BRAMs                    3            3              3             8                 0
Throughput (Mbps)        195          166            208           250               331.5
Clock (MHz)              130          60             71            136               26
Clock cycles per round   8            4              4             7                 1

Both = both encryption and decryption.
As shown in Table 2, this design is comparable to the other available designs. The first issue is the device used. Our design uses a Spartan3 device, a small and low-cost FPGA, which is comparable to the devices used in [4] and [5], whereas [6] and [7] use Virtex FPGAs, which are much more advanced and thus more expensive. In terms of functionality, most of the designs provide both encryption and decryption; the exception is [6], which has only an encryption core. In terms of key length, our design, as well as [4] and [5], has an internal key expansion unit that works with 128-bit keys. The designs of [6] and [7] have no internal key expansion, so key expansion from an outside source is required. In terms of area, our design requires 288 slices with 3 block RAMs, which is slightly higher than the other low-area designs [4], [5] and [6]. However, our throughput of 195 Mbps is higher than that of [4], the design on which ours is based. The interesting part of our result is the maximum clock frequency. Our design achieves a maximum clock frequency of 130 MHz, which is much higher than that of the other designs on Spartan-family devices and almost equal to that of the Virtex implementation [6]. With this design, the Spartan3 device can work at very high speed, which permits a high clock frequency for other circuits implemented in the same FPGA.
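The area, throughput and clock rows of Table 2 can be related to each other with two simple ratios. The Python sketch below copies the figures from Table 2 and computes throughput per slice and the clock cycles per 128-bit block implied by throughput and clock frequency. The function names are hypothetical, block RAM usage and device differences are ignored, and the cycle estimate assumes one block is processed at a time, so the results are only a rough guide.

```python
# Figures copied from Table 2: (CLB slices, throughput in Mbps, clock in MHz).
TABLE_2 = {
    "Ours": (288, 195.0, 130.0),
    "[4]": (222, 166.0, 60.0),
    "[5]": (163, 208.0, 71.0),
    "[6]": (240, 250.0, 136.0),
    "[7]": (2902, 331.5, 26.0),
}

BLOCK_BITS = 128  # AES block size in bits

def throughput_per_slice(slices: int, mbps: float) -> float:
    """Throughput divided by occupied CLB slices (Mbps per slice)."""
    return mbps / slices

def cycles_per_block(mbps: float, mhz: float) -> float:
    """Clock cycles per block implied by throughput = BLOCK_BITS * f / cycles."""
    return BLOCK_BITS * mhz / mbps

if __name__ == "__main__":
    for name, (slices, mbps, mhz) in TABLE_2.items():
        print(
            f"{name:>5}: {throughput_per_slice(slices, mbps):.3f} Mbps/slice, "
            f"{cycles_per_block(mbps, mhz):.1f} cycles/block"
        )
```

The implied cycles per block can be compared with ten times the "Clock cycles per round" row of Table 2 as a consistency check on the reported figures.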
V. CONCLUSION
In this work, a compact and fast implementation of AES on FPGA was presented. The design is shown to have one of the highest throughput-per-slice figures among the designs in Table 2. The implementation was done on the smallest Spartan-3 FPGA and occupies 288 slices and 3 block RAMs, achieving a throughput of 195 Mbps at a clock frequency of 130 MHz. The design can serve a wide range of embedded systems, from latency-sensitive applications that need high-speed connections, such as video conferencing, down to applications that require low area, such as smart cards.
REFERENCES
[1] S. N. Han and X. J. Li, "Area and Power Optimized Serial AES Encrypt/Decrypt Circuit," Microelectronics, vol. 40, Beijing: Chinese Academy of Sciences, 2010.
[2] W.-K. Chen, Linear Networks and Systems. Belmont, CA: Wadsworth, 1993, pp. 123-135.
[3] National Institute of Standards and Technology (NIST), Information Technology Laboratory (ITL), "Advanced Encryption Standard (AES)," Federal Information Processing Standards (FIPS) Publication 197, November 2001.
[4] P. Chodowiec, K. Gaj, P. Bellows and B. Schott, "Experimental Testing of the Gigabit IPSec-Compliant Implementations of Rijndael and Triple DES Using SLAAC-1V FPGA Accelerator Board," Information Security Conference (ISC 2001), Malaga, Spain, 2001.
[5] G. Rouvroy, F.-X. Standaert, J.-J. Quisquater and J.-D. Legat, "Compact and efficient encryption/decryption module for FPGA implementation of the AES Rijndael very well suited for small embedded applications," in Proc. IEEE Int. Conf. on Inf. Tech.: Coding and Computing, vol. 2, pp. 583-587, Las Vegas, NV, USA, April 2004.
[6] S. McMillan and C. Patterson, "JBits Implementations of the Advanced Encryption Standard (Rijndael)," Field-Programmable Logic and Application (FPL 2001), Belfast, Northern Ireland, UK, 2001.
[7] K. Gaj and P. Chodowiec, "Comparison of the hardware performance of the AES candidates using reconfigurable hardware," Third Advanced Encryption Standard (AES3) Candidate Conference, New York, 2000.