I. INTRODUCTION
Manuscript received February 12, 2013; revised July 11, 2013; accepted
July 28, 2013. Date of publication September 9, 2013; date of current version
June 23, 2014.
E. Consoli is with Maxim Integrated Products, Catania 92100, Italy (e-mail:
elioconsoli83@gmail.com).
G. Palumbo is with the DIEEI, Università di Catania, Catania I-95125, Italy
(e-mail: gaetano.palumbo@dieei.unict.it).
J. M. Rabaey is with the Electrical Engineering and Computer Science
Department, University of California, Berkeley, CA 94720 USA (e-mail:
jan@eecs.berkeley.edu).
M. Alioto is with the Electronics and Computer Engineering Department, National University of Singapore, 117576 Singapore (e-mail:
malioto@ieee.org).
Color versions of one or more of the figures in this paper are available
online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TVLSI.2013.2276100
Fig. 2. (a) TGPL topology. (b) Pulse generator topology (area in dashed line is shareable among multiple cells).
1063-8210 © 2013 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 22, NO. 7, JULY 2014
Fig. 3.
Fig. 4.
Fig. 5.
Fig. 6. Clock phase generator and waveforms defining CPr and CPf pulses.
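D_min,CP³L ≈ D_min,CSP³L = [right-hand side garbled in extraction; surviving denominators: 3C_in and 3]   (1)
D_min,TGPL = [right-hand side garbled in extraction; surviving terms: 5C_L/(3C_in) and 34/9]   (2)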
where C_L and C_in are, respectively, the load and the input
capacitance of the pulsed latch. From (1) and (2), CP³L and
CSP³L have essentially the same minimum D-Q delay, as
expected given that they share the same D-Q critical
path (M1-M8 in Figs. 5 and 8).
From (1) and (2), CP³L and CSP³L are also always faster than
TGPL. Their theoretical maximum speed advantage is about
2.3× and is obtained at light loads (i.e., electrical effort
Fig. 9. Layout under sizing for minimum ED. (a) TGPL. (b) CP³L. (c) CSP³L. (Area in dashed line is shareable among multiple cells.)
Fig. 10.
TABLE I
AREA COMPARISON (65 nm, STD CELL HEIGHT: 3.9 μm)
384:1
TABLE II
AVERAGE AND STANDARD DEVIATION OF MAIN PARAMETERS OF INTEREST (256 REPLICAS, MIN.-ED DESIGN)
TABLE III
COMPARISON WITH STATE OF THE ART
Fig. 11. Die photo.
Fig. 12.
Fig. 13. Setup characteristic: CK-Q and D-Q delay versus CK-D of latches designed for (a) minimum ED (16× load) and (b) minimum ED³ (64× load).
Fig. 14.
Fig. 15.
Fig. 16. Histogram of CP³L D-Q delay for (a) minimum-ED design and (b) minimum-ED³ design (256 measurements).
From Tables II and III, CP³L and CSP³L have approximately the same variability as TGPL in setup time and leakage. On the other hand, CP³L and CSP³L have
similar or up to 2× worse variability in CK-Q delay compared
to TGPL. From the perspective of VLSI system timing,
the above-discussed D-Q delay variations matter more
than CK-Q delay variations. Indeed, from Tables II and III,
CK-Q variations are smaller than D-Q delay variations. In
addition, critical paths typically go through a D-Q delay rather
than a CK-Q delay (late computations are finished during the
transparency window). As expected, energy variations were
found to be extremely small (around 1% or less), hence the related results
are omitted for brevity. From Tables II and III, CP³L and
CSP³L also have 1.7× to 2.6× less variation in hold time,
which translates into a proportionally lower number of buffers
inserted by place-and-route tools during the timing-closure design
phase.
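The link between hold-time variation and buffer count can be illustrated with a minimal sketch. It assumes a simple guard-band model in which the worst-case hold time is the mean plus a fixed number of standard deviations, and each inserted buffer adds a fixed delay to the shortest path. The function name and all numerical values below are hypothetical and are not taken from the measurements.

```python
import math


def hold_buffers_needed(hold_mean_ps, hold_sigma_ps, n_sigma,
                        min_path_delay_ps, buffer_delay_ps):
    """Buffers needed so the shortest path meets the worst-case hold time.

    Assumed model (illustrative only): worst-case hold time is
    mean + n_sigma * sigma, and every buffer adds buffer_delay_ps.
    """
    worst_case_hold = hold_mean_ps + n_sigma * hold_sigma_ps
    shortfall = worst_case_hold - min_path_delay_ps
    if shortfall <= 0:
        return 0  # the shortest path is already slow enough
    return math.ceil(shortfall / buffer_delay_ps)


# Hypothetical numbers: same mean hold time, 2x smaller standard deviation.
baseline = hold_buffers_needed(40, 12, 3, 40, 6)
reduced = hold_buffers_needed(40, 6, 3, 40, 6)
print(baseline, reduced)
```

In this model, when the nominal hold time is already met, the buffer count is set entirely by the variation guard band, so halving the standard deviation roughly halves the number of buffers. When the mean hold time itself exceeds the shortest path delay, the reduction is less than proportional.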
For completeness, the proposed class of pulsed latches was
also compared to other existing topologies that cover a much
wider range of applications, from very high performance to
very low energy. In addition to TGPL, we thus considered
STFF for its very high performance [16], TGFF for its high
energy efficiency at moderate performance [17], and ACFF
for its high energy efficiency at low performance targets [18].
The results of the comparison are summarized in Table IV,
where data are normalized to the best, and the results from
TABLE IV
AVERAGE AND STANDARD DEVIATION OF MAIN PARAMETERS OF INTEREST (256 REPLICAS, MIN.-ED³ DESIGN)
Fig. 17.
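Fig. 18.
[equation garbled in extraction; surviving terms: W_L/(3W_2) and +1]   (A.2)
D_min,TGPL = [right-hand side garbled in extraction; surviving terms: 5W_L/(9W_1) and 34/9, and equivalently 5C_L/(3C_in) and 34/9]   (A.3)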
where we considered that the input capacitance C_in in Fig. 17
is equal to the gate capacitance of a transistor with width 3W_1,
and the load capacitance C_L is by definition the gate capacitance of a transistor with width W_L.
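As a quick consistency check on these definitions, the electrical effort of the latch can be rewritten in terms of transistor widths. The sketch below assumes, as is usual in logical-effort analysis, that gate capacitance is proportional to width with the same capacitance per unit width for all devices. The symbol c_g is introduced here only for illustration.

```latex
C_{in} = c_g \cdot 3W_1, \qquad C_L = c_g \cdot W_L
\quad\Longrightarrow\quad
\frac{C_L}{C_{in}} = \frac{W_L}{3W_1},
\qquad\text{so that}\qquad
\frac{5}{9}\,\frac{W_L}{W_1} = \frac{5}{3}\,\frac{C_L}{C_{in}} .
```

This appears to be the substitution that links the width-based load term and the capacitance-based load term in (A.3).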
Finally, the detailed pulse generator sizing is straightforward
and is omitted here: the transistors of the output NAND gate
in Fig. 2 need only be sized to ensure the targeted slope
(i.e., rise/fall time) of signal CP. Commonly adopted values
of the clock slope range from FO3 to FO4 [3], where FOX is
the slope of the output waveform of an inverter loaded by X
inverters of the same size. The inverters are then
sized to obtain the targeted transparency window.
g_1,FALL, h_1,FALL, p_1,FALL [right-hand sides lost in extraction]   (A.4a)-(A.4c)
g_1,RISE, h_1,RISE, p_1,RISE [right-hand sides lost in extraction]   (A.5a)-(A.5c)
For the second stage, analysis for the fall path leads to
g_2,FALL = 1/3   (A.6a)
h_2,FALL = (4 + W_L)/W_2   (A.6b)
p_2,FALL = 1   (A.6c)
g_2,RISE, h_2,RISE, p_2,RISE [right-hand sides lost in extraction]   (A.7a)-(A.7c)
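The stage-level quantities in (A.4)-(A.7) can be read with the usual logical-effort convention. This is an assumption here: g is taken as the logical effort, h as the electrical effort, and p as the parasitic delay, so that a stage contributes a normalized delay g·h + p. The sketch below is a generic two-stage illustration of how minimizing over the intermediate size gives a closed-form minimum delay. It is not the paper's exact model, and all function names and numbers are hypothetical.

```python
import math


def stage_delay(g, h, p):
    """Normalized delay of one stage: effort delay g*h plus parasitic delay p."""
    return g * h + p


def two_stage_delay(g1, p1, g2, p2, c_in, c_mid, c_load):
    """Delay of a two-stage path; the second stage has input capacitance c_mid."""
    return (stage_delay(g1, c_mid / c_in, p1)
            + stage_delay(g2, c_load / c_mid, p2))


def optimal_c_mid(g1, g2, c_in, c_load):
    """Intermediate capacitance that equalizes the two stage efforts."""
    return math.sqrt(g2 * c_load * c_in / g1)


def min_two_stage_delay(g1, p1, g2, p2, c_in, c_load):
    """Closed-form minimum: 2*sqrt(path effort) plus total parasitic delay."""
    path_effort = g1 * g2 * c_load / c_in
    return 2.0 * math.sqrt(path_effort) + p1 + p2


# Hypothetical values chosen only to exercise the formulas.
G1, P1, G2, P2 = 1.0, 1.0, 1.0, 1.0
C_IN, C_LOAD = 1.0, 16.0

c_mid_opt = optimal_c_mid(G1, G2, C_IN, C_LOAD)
d_sized = two_stage_delay(G1, P1, G2, P2, C_IN, c_mid_opt, C_LOAD)
d_closed = min_two_stage_delay(G1, P1, G2, P2, C_IN, C_LOAD)
print(c_mid_opt, d_sized, d_closed)
```

Sizing the intermediate stage away from the equal-effort point in either direction increases the delay. At light loads the square-root term shrinks and the parasitic terms dominate the total.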
Massimo Alioto (M'01-SM'07) was born in Brescia, Italy, in 1972. He received the Laurea (M.Sc.)
degree in electronics engineering and the Ph.D.
degree in electrical engineering from the University
of Catania, Catania, Italy, in 1997 and 2001, respectively.
He is an Associate Professor with the Department
of Electrical and Computer Engineering, National
University of Singapore, Singapore. He was an
Associate Professor with the Department of Information Engineering, University of Siena, Siena, Italy. In
2013, he was a Visiting Scientist with Intel Labs CRL, Hillsboro, OR, USA,
working on ultra-scalable microarchitectures. From 2011 to 2012, he was a Visiting
Professor with the University of Michigan, Ann Arbor, MI, USA, investigating
active techniques for resiliency in near-threshold processors, error-aware
VLSI design for wide energy scalability, and self-powered circuits. From 2009
to 2011, he was a Visiting Professor with BWRC, University of California,
Berkeley, CA, USA, investigating next-generation ultra-low power circuits
and wireless nodes. In 2007, he was a Visiting Professor with EPFL, Lausanne, Switzerland. He has authored or co-authored over 180
publications in journals (60+, mostly IEEE Transactions) and conference
proceedings. He is the co-author of two books, Flip-Flop Design in Nanometer
CMOS - from High Speed to Low Energy (Springer, 2013) and Model and
Design of Bipolar and MOS Current-Mode Logic: CML, ECL and SCL Digital
Circuits (Springer, 2005). His current research interests include ultra-low
power VLSI circuits, self-powered and wireless nodes, near-threshold circuits
for green computing, error-aware and widely energy-scalable VLSI circuits,
and circuit techniques for emerging technologies.
Prof. Alioto was a member of the HiPEAC Network of Excellence (EU)
and the MuSyC FCRP Center, USA. From 2010 to 2012, he was the Chair
of the VLSI Systems and Applications Technical Committee of the IEEE
Circuits and Systems Society, for which he was a Distinguished Lecturer
from 2009 to 2010 and a member of the DLP Coordinating Committee from
2011 to 2012. He currently serves as an Associate Editor-in-Chief of the
IEEE TRANSACTIONS ON VLSI SYSTEMS, and has served as a Guest Editor
of various journal special issues (including the issue on Ultra-Low Voltage
Circuits and Systems for Green Computing published in 2012 in IEEE
TRANSACTIONS ON CIRCUITS AND SYSTEMS PART II). He serves or has
served as an Associate Editor of a number of journals (IEEE TRANSACTIONS
ON VLSI SYSTEMS, ACM Transactions on Design Automation of Electronic
Systems, IEEE TRANSACTIONS ON CAS - PART I, Microelectronics Journal,
Integration, the VLSI Journal, Journal of Circuits, Systems, and Computers,
Journal of Low Power Electronics, and Journal of Low Power Electronics
and Applications). He was a Technical Program Chair of ICECS in 2013,
NEWCAS in 2012, and ICM in 2010, and a Track Chair at a
number of conferences (ICCD, ISCAS, ICECS, VLSI-SoC, APCCAS, ICM).