Documenti di Didattica
Documenti di Professioni
Documenti di Cultura
The resulting speed in terms of throughput rate and Figure 1. General structure for the AES encryption
implementation area is evaluated and compared with algorithm.
existing design implementations in terms of the same gate
technology factor and pipeline architectural measure in 2.1. General AES structure without pipelining
[12] . We began the design of the AES by analyzing the
basic architecture as introduced in figure 1 [3], which
shows structure of the AES round for encryption. In
In this architecture we implement only one round with
Iterative looping structure; such that, the data-path
addition, let’s define the following notations:
consists of Key Addition, Mix Column, Byte-Substitution,
and Row Shift as clearly demonstrated in Figure 1 [7].
Ngate: Number of cascaded gates
Gate_delay: ALTERA MAX3000A 2-input
2.1.1. Time analysis
AND gate delay, 1.636 nano
second (nsec)
Number_of_rounds: Number of AES rounds The amount of time required to encrypt N blocks for 10
Number_of_blocks: Number of raw data coming rounds can be evaluated using the following:
blocks
N_logest_pipe: Longest cascaded number of T = Number_of_blocks × Number_of_rounds ×
gates between two pipelines, 9 Ngate× Gate_delay (1)
gates
N_BS: Cascaded number of gates for
ByteSubstitution, 33 gates For one block of input data which is equal to 16 byte of
N_SR: Cascaded number of gates for information:
ShiftRowTransformations, 5 gates
N_MC: Cascaded number of gates for T = 1 x 33 x 10 x 1.636 + (N_SR + N_MC +
MixColumns Transformations, 3 N_ARK) x 10 (2)
gates = (54 + K) x 10
N_ARK: Cascaded number of gates for = (540 + 10K) ns
AddRoundKey Transformations, = (540 + 10 x 11 x 1.636) = 719.9 ns
3 gates
K: (N_SR + N_MC + N_ARK) x For 1600 blocks of input data which is equal to 25600
Gate_delay byte of information:
M: Number of pipeline stages
N: Number of data blocks, where T = 1600 x 33 x 10 x 1.636 + (N_SR + N_MC +
each block is 16 bytes of data N_ARK) x 10
= 1600 x (54 + K) x 10 = (864000 + 16000K) (3)
ns = 1,1150,880 ns
For 1600 blocks of input data which is equal to 25600 2.3.1. Time analysis
byte of information:
The amount of time required to encrypt N blocks for 10
T = [(1600 – 1) + 10] x (33 x 1.636 + K) rounds can be evaluated using the following:
= [86886 + 1609K] ns (6)
= 115,735 ns T = [((N x 16/4) -1) + K] x [N_logest_pipe x (7)
Gate_delay] + (N x 16/4
2.3. Proposed AES pipeline structure
For one block of input data which is equal to 16-byte of
information:
In this section, the architecture evaluation of the proposed
design is achieved, where the AES is divided equally into
T = [(16/4 -1) + 11] x [9 x 1.636] + 4
ten pipeline stages with equal gates delay of about nine
= 14 x 15 = 214 ns (8)
gates per stage. In addition, 10 AES blocks are used
which are separated by pipeline stages resulting in total
For 1600 blocks
eleven stages per each round as depicted in Figure 3. In
order to reduce the area and alleviate the processing of all
T = [((1600 x 16/4) – 1) + 110] x [9 x 1.636] +
16-byte (one block) at a single AES, the ByteSub and
(1600 x 16/4) (9)
ShiftRow operation in Figure 1 are reordered as clearly
= 6509 x 15 = 104,035 ns
shown in this figure.
1,000,000
800,000 hardware design.
600,000
400,000
116,867 111,890
200,000 6. REFERENCES
0
General AES Fully pipelining AES Fully pipelining with
without pipelining 10 sub pipelining [1] FIPS PUB 197, Advanced Encryption Standard (AES).
AES National Institute of Standards and Technology, U.S.
Design Name Department of Commerce, November 2001;
http://csrc.nist.gov/publications/fips/fips197/fips-197.pdf
Figure 13. Total time for 1600 blocks. [2] Kris Gaj, and Pawel Chodowiec, Hardware performance
of the AES finalist's survey and analysis of results. George
Mason University. Proc. 3rd Advanced Encryption Standard
As a result, the following observations can be depicted: (AES) Candidate Conference, New York, 2000:1-5.
[3] A. Elbirt, Recon_gurable Computing for Symmetric-Key
Al-gorithms, Ph.D. thesis, Department of Electrical
Throughput for fully pipelining AES is 89.846% Engineering, Worcester Polytechnic Institute, 2002.
times higher in than general AES without pipelining [4] Elena Trichina, and Tymur Kokishko, Secure AES
(iterative looping); however, the area is about Hardware Module for Resource Constrained Devices. Security
90.79% times larger than the general AES without in Ad-hoc and Sensor Networks. First European Workshop.
pipelining. ESAS 2004, Heidelberg ,Germany, August 6, 2004; (3313):215-
229.
Throughput for fully pipelining with ten sub [5] Sumio Morioka, and Akashi Satoh, An Optimized S-BOX
pipelining AES is 90.27% times higher than general Circuit Architecture for low power AES Design. IBM Research,
Tokyo Research laboratories, CHES 2002; (2523):172-186.
AES without pipelining, but the area is about 68.75%
[6] N. Sklavos, O. Koufopavlou, "Architectures and VLSI
times larger than the general AES without pipelining. Implementations of the AES-Proposal Rijndael," IEEE
Transactions on Computers, Vol. 51, Issue 12, pp. 1454-1459,
Throughput for fully pipelining with ten sub 2002.
pipelining AES is 4.25% times higher than fully [7] Akashi Satoh, and Sumio Morioka, Unified Hardware
pipelining AES; however, the area is about 56% Architecture for 128-Bits Block Ciphers AES and Camellia.
smaller than the fully pipelining AES IBM Research, Tokyo Research laboratories, CHES 2003;
(2779):304-318.
[8] Wolfram Drescher, Kay Bachmann 1, and Gerhard
Fettweis, VLSI Architecture for Non-Sequential Inversion Over
5. CONCLUSION GF(2m) Using the Euclidian Algorithm. Mobile
Communications Systems, Dresden University of Technology, D
New hardware architecture is proposed for AES - 01062 Dresden. sponsored in part by the Deutsche
algorithm over GF((2)4)2and compared against two AES Forschungsgemeinschaft within the Sonder for schung sbereich
hardware structures which are iterative looping and ten SFB 358.
rounds pipeline approach. The physical implementations [9] Joan Daemen and Vincent Rijmen, The Design of
of the three structures were conducted through FPGA Rijndael, AES - The Advanced Encryption Standard, Springer-
ALTER MAX3000A device. The comparison analysis Verlag 2002; 238.
[10] Pawel Chodowiec, and Kris Gaj, Very Compact FPGA
show that our proposed design has the advantage of
Implementation of the AES Algorithm. George Mason
saving area of about 56% over general ten AES round University, CHES 2003; (2779):319-333.
pipeline structure, and provides a higher throughput of [11] Tim Good, and Mohammad Benaissa, AES on FPGA
about 4.25%. Furthermore, it gives a superior From the Fastest to the Smallest. University of Sheffield,
throughput advantage over iterative looping AES Department of Electrical and computer Engineering, CHES
structure of about 90.27% rate; however, the area is 3.6 2005; (3659):427-440.
times larger than the iterative looping structure. [12] Amos R. Omondi, The Microarchitecture of Pipelined and
The main key advantage for saving area is the Superscalar Computers, Kluwer Academic Publishers, 1999.
rewording of ByteSub and ShiftRow which streamlines
the process of evaluating four bytes of data in parallel
rather than having 16 bytes in parallel with all required
hardware. Another key feature is having high throughput
by partitioning the AES into ten sub-blocks with
intermediate buffers between them; thus, creating a deep
pipelining structure for complete ten AES blocks.
As a result, for applications that starve for
increase instruction throughput and small area overhead,