Prateek Mishra
A Dissertation
Presented to the Faculty
of
Princeton University
in
Doctor of Philosophy
Department of
Electrical Engineering
Adviser: Niraj K. Jha
November 2012
© Copyright
Abstract
Moore's law has enabled the scaling of CMOS technologies over the past several decades. However,
the scaling of conventional transistors beyond 22nm is limited by various factors, such as power
consumption and process variation effects. With every successive technology generation, leakage
current has been increasing exponentially due to various short-channel effects, such as threshold
voltage (Vth) roll-off, drain-induced barrier lowering (DIBL), and gate-induced drain leakage
(GIDL). Thus, the major challenge in continuing Moore scaling lies in controlling the short-channel
effects. Double-gate field-effect transistors (DGFETs) have been proposed as a promising
alternative to the conventional transistor technology. Due to the superior electrostatic integrity of
the channel provided by the double-gate structure, they can significantly mitigate short-channel
effects. Thus, they have been proposed as an attractive solution for scaling beyond
22nm. Among DGFETs, FinFETs have recently attracted a lot of attention due to their ease of
fabrication. The fabrication process of FinFETs is quite similar to that of conventional transistors.
FinFETs are quasiplanar structures in which the channel is made to stand up on its edge. FinFETs
consist of a thin silicon fin around which a gate electrode is wrapped. This results in a
dual/tri-gate structure, depending upon the thickness of the oxide at the top of the channel.
FinFETs have also been shown to have a superior ION/IOFF ratio as compared to the conventional
transistor at the same technology node. Hence, FinFETs can be used to increase performance and
reduce leakage current of a chip simultaneously. The two gates of the FinFET can be made independent
of each other by etching out the top portion of the FinFET. Such FinFETs have been exploited
by researchers to develop various innovative standard cell designs. Also, the Vth of the front gate
of the FinFET can be controlled by applying a bias to its back gate. Since Vth controls both the
subthreshold leakage and the delay of a logic gate, the back-gate bias can be used as an important
knob to optimize the delay and power of circuits that employ independent-gate FinFETs. Another
important property of FinFETs is that they can be easily fabricated along the <110> channel
orientation by rotating the fins by 45° from the <100> wafer plane. Since the electron mobility
is maximum along the <100> channel orientation and the hole mobility is maximum along the
<110> channel orientation, optimized logic gates can be built by fabricating the pull-up network
of the logic gates in the <110> channel orientation and the pull-down network in the <100>
channel orientation.
In this thesis, we first propose a methodology for low-power FinFET-based circuit synthesis
that uses multiple supply and threshold voltages. The scheme is quite different from the conventional
multiple supply voltage methods that target power optimization. We also propose a low-power
FinFET-based circuit synthesis methodology based on channel orientation optimization. We
investigate various logic design styles that depend on different channel orientations.
Though FinFETs are a promising alternative to conventional transistors, they are still likely to
suffer from the effects of process variations. Process variation can be either environmental or
lithographic in nature. Environmental variations can be attributed to both spatial and temporal changes
in temperature and supply voltage in a chip. Lithographic variations result from aberrations
in the optical lens used to create the mask in the fabrication process. They are manifested both as
systematic and random variations in chip parameters, such as gate length, gate-oxide thickness, and
fin thickness. Thus, it is imperative to study the effects of process variation on important FinFET
circuit metrics, such as delay and power.
In this thesis, we study the effects of lithographic variations on FinFET leakage power. We
investigate the leakage power of various standard cells under process variations in gate length and fin
thickness. Further, we propose a methodology to analyze the leakage power of the full chip under
process variations, as well as a leakage-variation-aware low-power FinFET circuit synthesis methodology.
We also perform a statistical delay characterization of FinFET standard cells under both
environmental and lithographic variations. We use a central composite rotatable design under the response
surface methodology to characterize the delay of various standard cells under varying lithographic
and environmental parameter values.
Acknowledgments
Ya devi sarvabhutesu sumati rupen samsthita
namastaseya namastaseya namastaseya namoh namah
First and foremost, I would like to thank the divine mother for all her inspiration, intellect, and
wisdom she bestowed upon me to complete this important task. Next, I would like to pay homage
to Prof. Niraj Jha. I have been very fortunate to have him as my advisor. He has treated me like his
own son. He kept encouraging me whenever I felt depressed or disappointed by events. He is one of
the most fascinating persons I have ever met in my life. His deep understanding and sharp acumen
of circuit design and electronic design automation have provided an excellent basis for this thesis.
I would also like to extend my sincere gratitude to my father, Dr. Ravindra Nath Mishra, and
my mother, Mrs. Sunita Mishra, who have been a tremendous source of love, encouragement, and
inspiration. The support from my parents has helped me finish this industrious work. I would
also like to thank my tauji, Dr. Virendra Nath Mishra, for encouraging me to pursue my dreams.
I would also like to thank my wife, Pallavi, for sticking with me in difficult times through the
course of this thesis. Next, I would like to extend my gratitude to the thesis readers, Prof. Saibal
Mukhopadhyay and Prof. Li-Shiuan Peh, for taking the time out of their busy schedules to go
through my thesis. They provided valuable feedback on my thesis that helped improve its quality. A
special thanks to Sarah Braude and Roelie Abdi for helping me with various non-academic issues.
My stay at Princeton would not have been exciting without the company of some good friends.
Shushobhan, Vaneet, and Aman helped me get acclimatized to Princeton during my initial days
here. Thereafter, they became really close friends and we shared some wonderful times together.
Abhishek, Arnab, Niket, Arun, CJ, and Varun became close friends right from the first year. I have
shared some wonderful times in Princeton with DJ. Our philosophical talks about life made my stay
at Princeton such a wonderful experience. Parthav and Harish were the best roommates one could
ask for. I would also like to thank my labmates, Muzaffer, Najwa, Wei, Sourindra, Meng, Aoxiang,
Chun-Yi, Maxwell, Joseph, and Ting, for fostering an atmosphere of creativity in the lab.
Contents
Abstract
Acknowledgments
1 Introduction
    1.3 FinFETs
    1.4 Thesis contributions
2 Related Work
    2.1 Introduction
    2.2 FinFET fabrication
    2.4 FinFET SRAM
    2.6 Chapter summary
3 Low-power FinFET Circuit Synthesis using Multiple Supply and Threshold Voltages
    3.1 Introduction
    3.2 Background work
    3.5.1 Optimization flow
    3.6 Experimental results
    3.7 Chapter summary
4
    4.1 Introduction
    4.3 Library design
    4.4.1 Optimization flow
    4.5 Experimental results
    4.6 Chapter summary
5
    5.1 Introduction
    5.2 Background work
    5.3 Simulation setup
    5.5 Methodology
    5.7 Chapter summary
6 Statistical Delay Characterization of FinFET Standard Cells Under Design of Experiments Using Response Surface Methodology
    6.1 Introduction
    6.2 Delay modeling
List of Figures
1.1 MEDICI-predicted DIBL and subthreshold swing for DGFETs and bulk silicon transistor at various channel lengths [1]
1.2 IDS-VGS characteristics for DGFETs and bulk-silicon transistors at equalized subthreshold current [1]
1.4 FinFET structure
1.7 Oriented FinFETs with nFinFETs along <100> sidewalls and pFinFETs along <110> sidewalls [3]
2.4 An SRAM cell
3.1 Multi-fin FinFET
3.6 Example circuit
4.4 Optimization flow
5.1 FinE simulation framework for double-gate circuit design space exploration [7]
5.4 Matching SG-mode FinFET TCAD simulations with the macromodel for different LG
5.5 Matching SG-mode FinFET TCAD simulations with the macromodel for different TSI
5.12 SG-mode NAND I00 distribution predicted by the model and TCAD QMC simulations
5.13 I10 distributions for SG-, LP-, and MT-mode NAND gates
5.16 Spreads in ITOT in the correlated and uncorrelated cases for benchmark circuit c880
5.17 Effect of mixing LP-mode gates into a pure SG-mode c880 benchmark circuit, normalized to the 100% SG-mode case at iso-delay
5.18 Effect of mixing LP-mode (MT-mode) gates into a pure SG-mode c880 benchmark circuit, normalized to the 100% SG-mode case with delay slacks
5.19 Cumulative distribution function of ITOT for 100% SG-mode vs. 40% SG + 60% LP-mode (MT-mode) gates at iso-delay for benchmark circuit c880
6.8 MC and RSM based delay distributions for SG-INV with n assumed to have a Gaussian distribution
List of Tables
3.2 Area savings
4.1 FinFET parameters
5.4 Mean and std. deviation of ITOT for ISCAS 85 benchmarks for TSI = 0 and LG = 0
Chapter 1
Introduction
The steady miniaturization of metal-oxide-semiconductor field-effect transistors (MOSFETs)
with each new generation of CMOS technology has provided us with improved circuit performance
and lower cost per function over several decades. Transistor scaling has been enabled in the past few years
with the aid of innovative methods, such as shallow junctions and the use of halo doping for channel
engineering. However, three obstacles, namely (a) subthreshold leakage current, (b) gate-dielectric leakage,
and (c) threshold voltage (Vth), have become the dominant barriers to further CMOS scaling, even
for highly leakage-tolerant integrated circuits, such as microprocessors. The main challenges for the
sub-22nm gate length regime are two-fold: (a) minimization of leakage current, and (b) reduction in
the device-to-device variability to increase yield [8]. Several innovative device structures, such as
ultra-thin body silicon-on-insulator (SOI) and double-gate field-effect transistors (DGFETs), have
been proposed to address these challenges. These devices have an increased surface-to-volume ratio,
which improves device electrostatics, resulting in better short-channel characteristics. FinFETs are
DGFETs in which the channel is made to stand up. Among DGFETs, FinFETs have emerged as a
suitable candidate owing to their ease of fabrication in terms of processing and gate alignment [1].
Figure 1.1: MEDICI-predicted DIBL and subthreshold swing for DGFETs and bulk silicon transistor at various channel lengths [1]
Figure 1.2: IDS-VGS characteristics for DGFETs and bulk-silicon transistors at equalized subthreshold current [1]
However, gate oxides cannot be scaled beyond a certain threshold because of the increasing tunneling current associated with smaller gate-oxide thicknesses. Another technique, which is used to
mitigate short-channel effects, is to reduce the depletion width below the channel to the substrate.
A reduced depletion width corresponds to shortened depletion regions and, hence, reduced parasitic
capacitances. This results in improved subthreshold slope in the leakage regime. However, a reduction in the depletion width corresponds to degraded gate influence on the channel, which leads to a
slower turn on/off of the channel region.
In DGFETs, the drain potential does not affect the channel potential because of the proximity of
the second gate. This results in reduced short-channel effects, such as drain-induced barrier lowering
(DIBL) and degraded subthreshold slope (S). Fig. 1.1 shows the MEDICI-predicted DIBL and S at
various effective channel lengths (LEFF), both for bulk silicon and DG devices [1]. It can be seen
that both DIBL and S are dramatically improved in DGFETs as compared to bulk silicon devices.
Thus, DGFETs can help us extend Moore's law beyond the 22nm technology node. Fig. 1.2 shows
the IDS-VGS characteristics of DGFETs and conventional bulk silicon transistors. DGFETs not
only reduce the leakage current, but they also have an improved ION/IOFF ratio.
1.2
DGFETs come in three different flavors, as illustrated in Fig. 1.3. In Type I DGFET, a second gate
is buried in the body of the planar conventional transistor. In Type II DGFET, the silicon body is
rotated to a vertical orientation with the drain and source being on the top and bottom boundaries of
the body, respectively. In Type III DGFET, commonly referred to as the FinFET, the body is made
to stand up, but the drain and source are on either side of the channel instead of being at the top
and bottom. There are four obstacles in manufacturing of DGFETs: (a) fabrication of both the gates
to be of the same size, (b) alignment of the top and bottom gates, (c) alignment of the source and
drain regions to both the gates, and (d) providing for an area-efficient means to connect to the two
gates. Clearly, Type I DGFETs require extra material to be introduced in buried silicon whenever
we need a separate contact to the back gate. Also, Type I DGFETs do not easily meet the first
three requirements of manufacturing [2]. Type II DGFETs have been shown to meet manufacturing
requirements (b) and (d) easily. However, fabricating the source and drain regions such that they are
aligned to the top and bottom gates is difficult [2]. Type III DGFETs (FinFETs) have emerged as the
most promising candidate due to their ease of fabrication, gate alignment and easy access to both
gates.
DGFETs can also be classified as symmetric or asymmetric. Symmetric DGFETs have the
same gate material and oxide thickness for the front and back gates. On the other hand, asymmetric
DGFETs have different strengths for the front and back gates. Different strengths can be obtained by
using different gate-oxide thickness for the front and back gates, or by using materials of different
workfunctions for them. The Vth of the DGFETs can be adjusted through workfunction engineering
of the metal gates. Thus, DGFETs obviate the need for doping of the channel to control the Vth of
the device. This results in no random dopant fluctuation effects in DGFETs. Further, in symmetric
DGFETs, two inversion channels are formed, one on each side of the transistor. However, due to
the thin size of the body, the two channels are effectively merged, providing a single channel. In
asymmetric DGFETs, the channel is only formed near the more conducting gate. The other gate
still contributes to controlling of the channel voltage, but acts as though it has a thicker effective
gate oxide.
1.3 FinFETs
FinFETs are quasiplanar field-effect transistors. The device physics governing the functionality of
FinFETs is exactly the same as that of planar MOSFETs. Fig. 1.4 shows the structure of a FinFET. A
silicon film of thickness TSI is patterned on an SOI wafer. The gate wraps around both sides of the
fin. The channel is formed perpendicular to the plane of the wafer, which is why the device is
termed quasiplanar. Its length is shown as LG. The effective width of a FinFET is 2nHFin, where
n is the number of fins and HFin is the fin height. Thus, wider transistors with higher on-currents
are made possible by using multiple fins. Fig. 1.5 shows the structure of a FinFET employing two
fins. It should be noted that FinFET width is quantized in terms of the number of fins. This leads to
important design considerations, such as functionality, performance, and power, which are sensitive
to the ratio [6].
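The width quantization described above can be illustrated with a short sketch. It uses the effective-width expression 2nHFin from the text; the function names and the 40 nm fin height are hypothetical values chosen only for illustration, not parameters from this thesis.

```python
import math


def effective_width(n_fins: int, h_fin_nm: float) -> float:
    """Effective channel width W = 2 * n * H_Fin (both fin sidewalls conduct)."""
    return 2 * n_fins * h_fin_nm


def fins_for_target_width(target_width_nm: float, h_fin_nm: float) -> int:
    """Smallest whole number of fins whose effective width meets the target."""
    return max(1, math.ceil(target_width_nm / (2 * h_fin_nm)))


H_FIN_NM = 40.0  # illustrative fin height (assumption, not from the thesis)

for target_nm in (60.0, 80.0, 100.0, 200.0):
    n = fins_for_target_width(target_nm, H_FIN_NM)
    achieved_nm = effective_width(n, H_FIN_NM)
    # The achieved width can only be a multiple of 2 * H_Fin, so it usually
    # overshoots the requested width.
    print(f"target {target_nm} nm -> {n} fin(s), achieved {achieved_nm} nm")
```

Because the achievable widths are multiples of 2HFin, a requested width of 100 nm in this sketch is rounded up to two fins (160 nm), which is the kind of sizing granularity a designer has to account for.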
Beyond the technology-driven benefits offered by FinFETs, circuits can also benefit from the
double-gate structure of FinFETs to further optimize power and performance. Etching out the top
part of the FinFET leads to some interesting designs that exploit its independent-gate structure.
Various innovative circuit structures have been suggested in the literature based on independent-gate
(IG) FinFETs. The FinFETs in which the two gates are shorted are referred to as shorted-gate
(SG). Figs. 1.6(a) and 1.6(b) show the structure of an SG- and IG-FinFET, respectively.
Figure 1.6: (a) SG-FinFET, (b) IG-FinFET
In IG-FinFETs, the Vth of the front gate can be controlled by applying a bias to the back gate.
Since Vth controls both the subthreshold leakage and the delay, its controllability can be a powerful tool for
circuit optimization. Another important characteristic of FinFETs is that they can be fabricated
along the <110> channel orientation easily by rotating the fins by 45° from the <100> plane.
Electron mobility is highest along the <100> plane while hole mobility is highest along
the <110> plane due to carrier mobility anisotropy in crystalline silicon [9]. Hence,
logic gates with pFinFETs along the <110> channel orientation and nFinFETs along the <100>
channel orientation are the fastest. Fig. 1.7 shows the nFinFETs and pFinFETs in a <100> wafer,
where the nFinFET sidewalls are oriented in the <100> direction while the pFinFET sidewalls
are oriented in the <110> direction [3]. Such a device orientation leads to non-Manhattan layouts,
which might pose a yield issue for sub-wavelength lithography.
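The 45° rotation mentioned above follows from the geometry of a cubic crystal. As a small check, the sketch below computes the angle between the [100] and [110] direction vectors; the helper name is hypothetical and the calculation is plain vector algebra, not a model of the fabrication process.

```python
import math


def angle_between_deg(u, v):
    """Angle in degrees between two direction vectors given as tuples."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return math.degrees(math.acos(dot / (norm_u * norm_v)))


# Representative members of the <100> and <110> direction families.
angle = angle_between_deg((1, 0, 0), (1, 1, 0))
print(round(angle, 6))
```

The two directions are 45° apart, which is why rotating a fin by 45° in the wafer plane switches its sidewalls between the two orientations.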
Figure 1.7: Oriented FinFETs with nFinFETs along <100> sidewalls and pFinFETs along <110> sidewalls [3]
Though FinFETs are expected to mitigate the effects of process variations, they are not immune to
them. FinFETs are generally patterned using direct or spacer lithography. Owing to the small
dimensions involved and various factors, such as line edge roughness, both techniques can result in
variations in the values of the chip parameters. Variations can also be environmental in nature.
Such variations are generally temporal and can occur on time scales ranging from nanoseconds to
years [10]. For example, effects such as negative bias temperature instability (NBTI) and positive
bias temperature instability (PBTI) lead to variations in Vth over the circuit lifetime. On the other
hand, varying computing workload leads to temporal variations in the chip temperature. Since
FinFETs are manufactured on an SOI wafer, heat dissipation becomes an important concern
for FinFETs. Process variations can be classified into different categories:
Systematic vs. random: systematic variations can be modeled using mathematical
functions. Random variations, on the other hand, are unpredictable and can only be
characterized statistically. Variations such as lithography proximity effects are
systematic, whereas dopant fluctuations in the channel are random.
Inter-die vs. intra-die: variations can be classified as inter-die or intra-die depending on the
spatial scale of the variation. Inter-die variations correspond to variation of a parameter value
across nominally identical dies. Such variations may be die-to-die, wafer-to-wafer, or even
lot-to-lot. Intra-die variations, on the other hand, correspond to spatially distributed parameter
variation inside a die. Intra-die variations are generally spatially correlated, i.e., devices in
close proximity are affected similarly.
Process vs. environmental: variations that occur at runtime are classified as environmental,
whereas variations that occur during the manufacturing of FinFETs are termed
process variations.
In this thesis, we develop variation-aware logic synthesis methodologies, which work across all
three categories of variations.
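The inter-die versus intra-die distinction can be made concrete with a small Monte Carlo sketch. It is only an illustration under simplifying assumptions: both components are taken to be Gaussian, the intra-die component is sampled independently per device (spatial correlation is ignored), and the function name and all numeric values are hypothetical.

```python
import random


def sample_gate_lengths(n_dies, devices_per_die, nominal_nm,
                        sigma_inter_nm, sigma_intra_nm, seed=0):
    """Return one list of gate lengths per die.

    Each die draws a single inter-die offset shared by all of its devices;
    each device then adds its own intra-die deviation.
    """
    rng = random.Random(seed)
    dies = []
    for _ in range(n_dies):
        die_offset = rng.gauss(0.0, sigma_inter_nm)  # shared across the die
        dies.append([
            nominal_nm + die_offset + rng.gauss(0.0, sigma_intra_nm)
            for _ in range(devices_per_die)
        ])
    return dies


# Illustrative values only (assumptions, not data from the thesis).
for die in sample_gate_lengths(3, 4, nominal_nm=22.0,
                               sigma_inter_nm=1.0, sigma_intra_nm=0.3):
    print([round(length, 2) for length in die])
```

Devices on the same die cluster around that die's shifted mean, while the means themselves differ from die to die. A more faithful model would draw the intra-die terms from a spatially correlated distribution.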
Recently, the semiconductor industry has shown a lot of interest in FinFETs. TSMC plans
to have 14nm FinFET chips in full production on 450mm wafers by 2015 or 2016 [11]. A fully
functional FinFET SRAM at the 45nm node was announced by Samsung in 2005 [12]. A research
team from IBM Research, GlobalFoundries, Toshiba and NEC produced an SRAM cell with an area
of 0.063 square microns using FinFETs and optical lithography at the 22nm technology node [13].
The team claimed that the cell is the smallest SRAM cell produced using
optical lithography. The cell was shown to be operational down to a supply voltage of 0.4V.
Infineon fabricated a fully functional chip employing 3000 FinFETs in 65nm SOI technology [14].
A fully functional SRAM at the 22nm technology node was demonstrated by Intel [15]. It uses a
variant of FinFETs called tri-gate transistors. Indeed, Intel has announced a complete transition to
tri-gate chips at the 22nm node. Thus, it can be seen that many major semiconductor companies
have taken interest in multi-gate transistors, most notably FinFETs, to address the challenges posed
by the scaling of the conventional transistor.
Chapter 3 discusses a mechanism called TCMS (Threshold Control through Multiple Supply Voltages) for
improving the power efficiency of FinFET logic circuits. This scheme diverges significantly
from conventional multiple supply voltage schemes and is shown
to be significantly better than schemes such as extended clustered voltage scaling.
Chapter 4 proposes a low-power FinFET-based circuit synthesis methodology that exploits
surface orientation optimization. It includes a study of various logic design styles, which
depend on different FinFET channel orientations, for synthesizing low-power circuits.
Chapter 5 proposes a variation-aware low-power FinFET circuit synthesis methodology. It
discusses leakage current macromodels for various standard cells implemented in different
logic styles. Further, it proposes a methodology to calculate full-chip leakage under process
variations.
Chapter 6 shows how to perform statistical delay characterization of FinFET standard cells using
design of experiments under the response surface methodology. It shows how the delay
of FinFET standard cells can be characterized statistically under spatial and environmental
variations, using a central composite rotatable design.
Chapter 7 concludes the thesis and discusses future research directions.
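For readers unfamiliar with the central composite rotatable design mentioned for Chapter 6, the following sketch generates its design points in coded units. It follows the standard textbook construction (a two-level full factorial, axial points at distance alpha, and center points, with alpha equal to the fourth root of the number of factorial points for rotatability). It is not the thesis's own characterization flow, and the function name is hypothetical.

```python
from itertools import product


def central_composite_rotatable(k: int, n_center: int = 1):
    """Return (alpha, points) for a rotatable CCD with k factors, coded units."""
    alpha = (2 ** k) ** 0.25  # rotatability condition for a full factorial core

    # 2^k factorial (corner) points at the coded levels -1 and +1.
    factorial = [list(p) for p in product((-1.0, 1.0), repeat=k)]

    # 2k axial (star) points: one factor at +/- alpha, all others at 0.
    axial = []
    for i in range(k):
        for sign in (-1.0, 1.0):
            point = [0.0] * k
            point[i] = sign * alpha
            axial.append(point)

    # Center points, usually replicated to estimate pure error.
    center = [[0.0] * k for _ in range(n_center)]
    return alpha, factorial + axial + center


alpha, points = central_composite_rotatable(k=2)
print(alpha)
for p in points:
    print(p)
```

In a delay characterization setting, each coded factor would be mapped to a physical parameter range (for example gate length or supply voltage), one simulation would be run per design point, and a second-order response surface would be fitted to the results.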
Chapter 2
Related Work
2.1 Introduction
The increase in chip power consumption with CMOS scaling has significantly affected the designs
of CMOS circuits. The semiconductor industry has been successfully scaling the gate length for
the past few decades. Transistor scaling has necessitated a decrease in gate length, gate dielectric
thickness and an increase in doping concentration [16]. This has resulted in an increase in leakage
current and increased reliability issues with each successive technology generation. Fig. 2.1 shows
the expected trend in the total power consumption of ICs. It can be clearly seen that contribution of
leakage power to the total power consumption is expected to be very significant in future technologies. In the figure, it is assumed that CMOS technology will be used until 2013 and then scaling
will be continued with the use of multi-gate CMOS technology.
FinFETs have been touted as the most promising DGFET technology. In this chapter, we review the work done in the area of FinFETs. Firstly, we study the work done in the area of FinFET
fabrication. We discuss the two most prominent techniques currently used to manufacture FinFETs.
Thereafter, we review the work done in the area of FinFET logic and physical synthesis. We analyze
various innovative standard cell designs proposed in the literature to reduce power consumption.
Next, we analyze the work done in the area of FinFET SRAM design. Various innovative techniques have been proposed to enhance SRAM metrics, such as read margin, write margin and cell
stability. We also review the work in the area of FinFET process variation. FinFETs generally have
a lightly doped channel and thus are unlikely to suffer from random dopant
fluctuation effects. However, lithographic variations in parameters such as gate length, gate-oxide
thickness, and fin thickness are likely to affect the FinFET manufacturing process, resulting in
leakage power and delay distributions. Also, FinFETs are likely to suffer from environmental
variations, such as those in temperature and supply voltage. Since FinFETs are usually built on an
SOI structure, they also suffer from the ill effects of self-heating.
Figure 2.2: Comparison of fin density in spacer and optical lithography [5]
current at a given lithography pitch. Fig. 2.2 shows the comparison of fin density achieved using
optical lithography and spacer lithography. A spacer lithography process technology uses a sacrificial layer and a chemical vapor deposition (CVD) technique to achieve uniform silicon fins. The
minimum-sized features are not decided by photolithography, but by the CVD film thickness [5].
For FinFETs, short-channel effects can be controlled easily when the fin thickness is approximately
half of the channel length [18]. This becomes impossible with standard lithographic techniques
when the gate length reaches the limit of the lithographic dimension. Further, standard lithographic
techniques produce silicon fins that are highly non-uniform. Uniformity of silicon fin thickness is
critical for FinFETs because line width roughness in silicon fins leads to large threshold voltage
variations [19, 20]. Also, the gate length-to-silicon fin thickness ratio should be greater than 1.5 to keep
short-channel effects under control in FinFETs [18]. All the above requirements can be met using
the spacer lithography technique. Further, since the spacer lithography technique doubles the drive
current in a given area because of the doubled fin density, it has emerged as the technique of choice
for fabrication of FinFET chips.
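The link between fin density and drive current per unit area can be sketched numerically. The example assumes that spacer lithography yields two fins per optically printed line (that is, half the fin pitch) and that drive current scales with the effective width 2nHFin at a fixed fin height. All dimensions are illustrative assumptions.

```python
def fins_in_layout_width(layout_width_nm: float, fin_pitch_nm: float) -> int:
    """Number of whole fins that fit across a given layout width."""
    return int(layout_width_nm // fin_pitch_nm)


# Illustrative numbers only (assumptions, not taken from the text).
LAYOUT_WIDTH_NM = 400.0
OPTICAL_PITCH_NM = 100.0                # fin pitch set directly by optical lithography
SPACER_PITCH_NM = OPTICAL_PITCH_NM / 2  # spacer lithography: two fins per printed line
FIN_HEIGHT_NM = 40.0

optical_fins = fins_in_layout_width(LAYOUT_WIDTH_NM, OPTICAL_PITCH_NM)
spacer_fins = fins_in_layout_width(LAYOUT_WIDTH_NM, SPACER_PITCH_NM)

# To first order, drive current scales with effective width 2 * n * H_Fin.
optical_width_nm = 2 * optical_fins * FIN_HEIGHT_NM
spacer_width_nm = 2 * spacer_fins * FIN_HEIGHT_NM
drive_gain = spacer_width_nm / optical_width_nm

print(optical_fins, spacer_fins, drive_gain)
```

With these numbers the same layout width holds twice as many fins under spacer lithography, so the first-order drive current in that area doubles.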
2.3
Various researchers have explored logic synthesis with FinFETs. The property that has been exploited the most is the use of a back-gate bias in IG-mode FinFETs to modulate the Vth of the front
gate. Various innovative standard cell designs have been proposed using different combinations of
SG- and IG-mode FinFETs. In [6], different logic gate styles are presented and thereafter a linear
programming based sizing algorithm is used to optimize the circuit for power.
Fig. 2.3 depicts the SG-, LP-(low power), IG- and IG/LP-mode NAND gates [6]. SG-NAND
gates have the lowest delay among the different logic styles since fast SG-FinFETs are employed
both in the pull-up and pull-down network of the NAND gates. LP-NAND gates have more than
double the delay of SG-NAND gates. However, the leakage power of LP-NAND gates, averaged
over all the input vectors, is reduced by over 90% when compared to SG-NAND gates. This is
because LP-NAND gates employ IG-mode transistors with reverse bias on the back gates. The
reverse bias increases the Vth of the front gate, thereby reducing the leakage but increasing the delay
of the LP-NAND gates. In IG-NAND gates, only one transistor is used in the pull-up network. To
achieve equal rise and fall delays, the size of the pull-up network needs to be scaled up. IG-NAND
gates can achieve a delay equal to that of SG-NAND gates. However, they occupy more area
than SG-NAND gates. The fourth design, the IG/LP-NAND gate, is a hybrid of the
IG- and LP-NAND gates. The leakage/delay characteristics of the IG/LP-NAND gate lie in between
those of LP-NAND and IG-NAND gates. It should be noted that sizing of NAND gates for equal
rise and fall delay is a challenge because of the fin width quantization. The design rules for sizing
the NAND gates are also specified in [6].
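The leakage benefit of raising Vth with a reverse back-gate bias can be estimated with a first-order subthreshold model, in which leakage falls by one decade for every S millivolts of Vth increase (S being the subthreshold swing). The sketch below uses that textbook relation only. It ignores DIBL, gate leakage, and input-vector dependence, and the 70 mV/decade swing is an illustrative assumption rather than a value from [6].

```python
import math


def leakage_ratio(delta_vth_mv: float, swing_mv_per_dec: float) -> float:
    """Subthreshold leakage after a Vth increase, relative to before it."""
    return 10 ** (-delta_vth_mv / swing_mv_per_dec)


def vth_shift_for_reduction(reduction_fraction: float,
                            swing_mv_per_dec: float) -> float:
    """Vth increase (mV) needed to cut subthreshold leakage by a given fraction."""
    return swing_mv_per_dec * math.log10(1.0 / (1.0 - reduction_fraction))


SWING_MV_PER_DEC = 70.0  # illustrative subthreshold swing (assumption)

shift_mv = vth_shift_for_reduction(0.90, SWING_MV_PER_DEC)
print(f"Vth shift for 90% leakage reduction: {shift_mv:.1f} mV")
print(f"Leakage ratio after that shift: {leakage_ratio(shift_mv, SWING_MV_PER_DEC):.3f}")
```

Under this simple model a 90% reduction corresponds to one decade of leakage, so the required Vth shift equals the subthreshold swing. The same shift lowers the gate overdrive, which is consistent with the delay penalty reported for LP-mode gates.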
Several low-power logic gate options using independent gates are presented in [21]. An efficient
circuit synthesis methodology based on the proposed low-power logic options has been developed. In [22], a genetic algorithm based power optimization framework for FinFET based circuits
is proposed. The authors exploit IG-mode FinFETs along with other low-power techniques, such
as multi-VDD and gate sizing, for power optimization. A novel look-up table based approach for
design of FinFET circuits is proposed in [23]. It is shown to be accurate by comparisons against
mixed-mode device simulations.
FinFET physical synthesis is still a nascent area of research. There is still a lack of FinFET
physical synthesis tools. However, researchers have looked into the layouts of various standard
cells employing SG- and IG-mode transistors. The layout structure of the FinFET depends upon
the type of process used. The increased fin density made possible by spacer lithography [5, 17, 24]
can be translated to increased layout densities. Another process knob that can be used to improve
layout density is fin height. An increase in fin height can translate to increased current in the same
area [25]. In [26, 27, 28], a comparative study of layout densities in SG- and IG-FinFET standard
cells is done. It is shown that SG-mode standard cells occupy the same area as the standard bulk
transistor cells at the same technology node. However, the IG-mode standard cells occupy almost
double the area of SG-mode standard cells.
also explored and demonstrated to be acceptable. In [30], both a forward bias to reduce Vth , while
performing Read/Write operations in an SRAM, and a reverse bias to reduce the leakage power in
the standby mode are used. In [31], the static noise margin (SNM) of FinFET SRAM cells operating in the subthreshold region is investigated. The 6T FinFET SRAM cell is also shown to be fully
functional in the subthreshold regime. Further, a stability analysis is performed for various novel
IG-mode SRAMs. A device optimization technique for robust and low-power FinFET SRAM is
presented in [32]. In this work, the gate sidewall spacer thickness is optimized to simultaneously
minimize leakage current and drain capacitance to on-current ratio. Further, it is shown that the
optimization reduces the sensitivity of the device Vth to fluctuations in gate length and fin thickness. In [33], a joint exploration of VDD -fin height-Vth design space is done for a 65nm FinFET
SRAM. It is shown that taller fins can accommodate lower VDD as well as a higher Vth to deliver
iso-performance at reduced leakage. An optimization study to improve cell stability in the design
space of silicon fin thickness and fin ratio is done in [34]. An alternative to sizing for stability in
FinFET memory cells is studied in [35]. It is shown how multiple workfunctions can be used to
control the Vth of the six transistors to improve stability at lower leakage power consumption. An
analysis of the impact of channel orientation on stability, performance and power of 6T and 8T
FinFET SRAMs is done in [36].
variations. In [45], the process parameters, which affect the leakage current of a device exponentially, are identified. Further, a process variation aware leakage current model is developed for a
single device. This model is validated against Monte Carlo simulations and is shown to be very
accurate. In [46], an analytical expression is given to calculate the probability density function of
leakage currents for stacked devices in CMOS gates. Then, these distributions of individual gate
leakage currents are combined to obtain the mean and variance of the leakage current of an entire circuit. Accurate estimation and modeling of the total circuit leakage distribution, considering both intra- and inter-die variations, are done in [47]. Leakage power and temperature variation are strongly
coupled. In fact, leakage power increases exponentially with temperature. Temperature
variations and electrothermal coupling between subthreshold leakage and junction temperature are
studied in [48]. It is shown that it is critical to consider die-to-die temperature variations for accurate leakage estimation. A novel framework for accurate estimation of subthreshold leakage in
process, temperature and supply voltage space, considering both inter-die and intra-die variations,
is presented in [49].
Though FinFET circuit synthesis and SRAM design have attracted a lot of attention from researchers, a few researchers have also worked in the area of FinFET process variations. One of
the major differences between a FinFET and a planar device is that the FinFET consists of multiple
small fins. Thus, previous analytical models for obtaining the leakage distribution of a gate or a chip
cannot be directly applied to FinFET circuits. New analytical methods need to be developed, which
take into account the width quantization property. In [50], statistical leakage estimation of FinFETs
is performed under the width quantization property. It is shown that conventional approaches can
underestimate leakage current by as much as 43%. The effect of process variation on
device temperature in FinFET circuits is studied in [51], where a Monte Carlo simulation methodology based on thermal models is used to solve temperature and leakage power self-consistently.
The influence of process variation on device performance of the optimized 10nm FinFET is studied in [52]. The sensitivity of on-current, leakage current, threshold voltage, drain-induced barrier
lowering and subthreshold swing to process variation is also studied. In [53], engineering the workfunction of the gate materials is shown to be effective in controlling Vth under variations. Further,
the sensitivity of the electrical parameters of the device to several important physical fluctuations,
such as gate length, fin thickness and gate dielectric thickness, is analyzed. Variability of FinFET
based devices and circuits considering quantum-mechanical effects and width quantization property
is studied in [54].
Chapter 3
[Figure: FinFET structure; labels: T_SI, H_Fin, source, drain]
reduce power consumption while maintaining circuit performance [57, 58]. In addition to multiple
supply voltage design techniques, lowering the Vth can maintain high performance while lowering
the supply voltage. Unfortunately, this leads to an exponential increase in the leakage current, which
has become an important concern in low-voltage high-performance designs [59].
Using our TCMS scheme, one can sharply diverge from the way circuits have been designed in
the past. This scheme does not make use of a lower supply voltage and thus no lowering of Vth is
required to maintain performance. It uses a nominal and a higher supply voltage. A possible consequence of an increased supply voltage is an increased Vth. Thus, the leakage power can be reduced
drastically. We employ a set of three supply voltages: a nominal supply voltage ($V_{dd}^{L}$), a slightly
higher supply voltage ($V_{dd}^{H}$), and a slightly negative supply voltage ($V_{ss}^{H}$). The scheme is based on
the principle that in an overdriven gate (a gate which is driven by an input voltage that is higher than
its supply voltage), the delay and subthreshold leakage can be reduced simultaneously [60, 61]. We
make the following contributions in this chapter:
• We propose the TCMS scheme for arbitrary logic circuits, which uses multiple supply and
  threshold voltages to reduce circuit power consumption.
• We discuss a library consisting of inverters and two-input NAND and NOR gates based on
  the TCMS scheme. The library consists of seven different types of inverters and 25 different
  types of NAND and NOR gates.
• We extend a linear programming based optimization methodology to implement the TCMS
  scheme for delay-constrained power optimization.
• Experimental results show that the application of TCMS to a set of benchmarks reduces power
  consumption, on an average, by 67.6% at 30% slack.
• We propose two variants of the TCMS scheme. The first uses dual supply and threshold
  voltages to reduce circuit power consumption. The second uses a TCMS scheme with a
  single Vth. These schemes also result in significant power savings.
• We also implement traditional extended clustered voltage scaling (ECVS) [58] using the linear programming framework and show that, even under an optimistic scenario, the power
  saving obtained by ECVS, on an average, is lower than that obtained by the TCMS scheme.
The chapter is organized as follows. In Section 3.2, we review the background work. In Section 3.3, we discuss the TCMS principle, which forms the basis for the scheme presented in this
chapter. In Section 3.4, we discuss gate library design using TCMS. In Section 3.5, we discuss the
power optimization methodology and the implementation of ECVS. In Section 3.6, we present the
experimental results and conclude in Section 3.7.
schemes, the gates on the critical path operate at the higher Vdd , i.e., nominal supply voltage, or
lower Vth to meet the performance requirements, and the gates on the non-critical paths operate at
the lower Vdd or higher Vth , thereby reducing the overall power consumption without performance
degradation. In contrast to the above, in our scheme, both a nominal Vdd and a higher Vdd as well
as a nominal Vth and a higher Vth are deployed on critical as well as non-critical paths.
$$V_{th}^{gf} = \begin{cases} V_{th}^{gf0} - \delta\,(\ldots) & \ldots \\ V_{th}^{gf0} & \text{otherwise,} \end{cases} \qquad (3.1)$$
where $\delta$ is a positive quantity whose value depends upon the ratio of gate and body capacitances. If
the FinFET is operated in SG mode, the Vth of both gates responds simultaneously to the change in
the voltage at the other gate. This happens because when the back gate is in depletion mode, charge
coupling occurs between the front and back gates. However, when the back gate is in strong inversion mode, the free carriers effectively screen the back-gate electric field, making $V_{th}^{gf}$ independent
of $V_{gbs}$.
Figure 3.2: The principle of TCMS
TCMS exploits the fact that in an overdriven FinFET, the delay and subthreshold leakage can
be reduced at the same time. Fig. 3.2 is used to further illustrate this point. In this figure, the
supply voltages for the inverters are $V_{dd}^{H}$ and $V_{ss}^{H}$, and for the NAND gate they are $V_{dd}^{L}$ and $V_{ss}^{L}$. A
possible set of values of $V_{dd}^{H}$, $V_{ss}^{H}$ and $V_{dd}^{L}$ is 1.08V, -0.08V and 1.0V. $V_{ss}^{L}$ is assumed to be tied to
ground. In the remainder of this chapter, it is assumed that any logic gate connected to $V_{dd}^{H}$ is also
connected to $V_{ss}^{H}$ and similarly for the lower supply voltages. Let V1 and V2 in Fig. 3.2 be held at
logic 0. This would lead to a logic 1 at V1' and V2'. Thus, the nFinFETs in the NAND gate will
be conducting and the pFinFETs will be leaking. Both the subthreshold leakage current and delay
can be controlled through the control of FinFET Vth . In this case, it can be seen that the nFinFETs
experience a bias voltage of 1.08V, which is higher than the normal gate drive of 1.0V. On the
other hand, pFinFETs are reverse biased by 0.08V. Thus, the nFinFETs experience an increased
gate-to-source voltage, compared to the case when they are driven by a supply voltage of 1.0V.
The increased drive strength of nFinFETs results in a reduction in the falling delay of the NAND
gate. The applied bias causes the Vth of the pFinFETs to be increased, thereby resulting in a lower
subthreshold leakage. In addition, a negative gate-to-source bias on the pFinFETs further brings
down their subthreshold leakage current. Similarly, the application of a logic 1 at the circuit inputs
leads to a reduction in the leakage current in the nFinFET and improvement in the drive strength of
the pFinFET. The use of TCMS-style logic gates in circuit synthesis is explained in greater detail in
the next section.
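The bias conditions described above can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original methodology: the helper name `gate_bias` is hypothetical, and the rail values are the example values quoted in this section, with the high ground rail assumed to be -0.08V (a slightly negative supply).

```python
# Illustrative sketch: bias seen by the FinFETs of a low-Vdd gate that is
# driven by a high-Vdd (TCMS) gate.
# Assumed rail values: VddH = 1.08 V, VssH = -0.08 V, VddL = 1.0 V, VssL = 0 V.
VDD_H, VSS_H = 1.08, -0.08
VDD_L, VSS_L = 1.00, 0.00


def gate_bias(v_in, vdd=VDD_L, vss=VSS_L):
    """Return (Vgs of the nFinFET, Vsg of the pFinFET) for input voltage v_in."""
    v_gs_n = v_in - vss   # nFinFET source sits on the receiving gate's ground rail
    v_sg_p = vdd - v_in   # pFinFET source sits on the receiving gate's supply rail
    return round(v_gs_n, 2), round(v_sg_p, 2)


for label, v_in in (("logic 1 from high-Vdd gate", VDD_H),
                    ("logic 0 from high-Vdd gate", VSS_H)):
    v_gs_n, v_sg_p = gate_bias(v_in)
    print(f"{label}: nFinFET Vgs = {v_gs_n:+.2f} V, pFinFET Vsg = {v_sg_p:+.2f} V")
```

With a logic 1 at the input, the nFinFET sees more than the nominal 1.0V drive while the pFinFET sees a negative source-to-gate voltage; with a logic 0 the roles are swapped.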
TCMS is based on the principle that an nFinFET (pFinFET) experiences an overdrive when it is
conducting, and simultaneously a pFinFET (nFinFET) experiences a reverse-biased voltage, which
leads to very low subthreshold currents. TCMS can provide considerable power savings despite
the use of an increased Vdd. A limitation of TCMS is the need to lay out an additional $V_{ss}^{H}$ line. This
limitation can be addressed by using the double-supply/double-ground grid suggested in [70]. Another way is to get
rid of the $V_{ss}^{H}$ supply and still apply the TCMS principle. This is explained in greater detail in Section 3.6.
In conventional multiple supply voltage schemes, power savings can be attributed to the use of a
lower Vdd on non-critical paths, which results in lower leakage and dynamic power dissipation.
However, in TCMS, power savings can mainly be attributed to the reduction in the leakage current.
Although the dynamic and leakage power may slightly increase for gates operating at the higher
supply voltage, this is far outweighed by the reduction in leakage power in overdriven gates.
We performed HSPICE simulations on an overdriven nFinFET using the predictive technology
model (PTM) for 32nm FinFETs. PTMs are available from [71] and have also been used for all
other HSPICE experiments reported in this chapter. These models have been verified against manufactured 32nm FinFETs [72] and have been widely used for circuit simulations [73, 74, 6]. Fig. 3.3
shows the simulation results. In the simulation, the drain of the nFinFET was tied to $V_{dd}^{L}$ and the
source terminal to ground. The gate voltage level varied between $V_{dd}^{H}$ and $V_{ss}^{H}$. Thus, the nFinFET
was forward-biased when $V_{dd}^{H}$ was applied at its gate and reverse-biased when $V_{ss}^{H}$ was applied. Let
$I_{on}^{L}$ ($I_{off}^{L}$) and $I_{on}^{H}$ ($I_{off}^{H}$) denote the on-currents (off-currents) through the FinFET at normal drive
($V_{gs} = V_{dd}^{L}$) and overdrive ($V_{gs} = V_{dd}^{H}$), respectively. As shown in Fig. 3.3, $I_{on}^{H}$ exceeds $I_{on}^{L}$ by 3.4%.
On the other hand, $I_{off}^{H}$ is almost 5X smaller than $I_{off}^{L}$. The large reduction in the subthreshold
current in an overdriven FinFET is the key to the large power savings in TCMS schemes.
Figure 3.3: Simulated Ids -Vgf s characteristics for an overdriven 32nm nFinFET
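As a rough illustration of what these two figures imply together, the sketch below combines them into a single on/off current ratio. It is plain arithmetic on the numbers quoted in the text (a 3.4% gain in on-current and an approximately 5X lower off-current), not a new simulation result, and the variable names are hypothetical.

```python
# Illustrative arithmetic only, using the ratios quoted in the text.
ion_gain = 1.034        # I_on^H / I_on^L (on-current up by 3.4%)
ioff_reduction = 5.0    # I_off^L / I_off^H ("almost 5X", so this is approximate)

# Improvement in the on/off current ratio of the overdriven FinFET.
on_off_ratio_gain = ion_gain * ioff_reduction
print(f"I_on/I_off improves by roughly {on_off_ratio_gain:.2f}x")
```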
During circuit synthesis, when this gate is embedded in a larger circuit, it might so happen that a is
the output of a high-Vdd gate and b comes from a low-Vdd gate or vice versa. Suppose the former is
true. Thus, the FinFETs connected to input a follow the TCMS principle explained above. FinFETs
connected to input b cannot employ the TCMS principle because there is no gate-to-source voltage
difference to exploit.
On the other hand, if the power supply voltages for the NAND gate are $V_{dd}^{H}$ and $V_{ss}^{H}$, then the
FinFETs connected to input a will not be able to take advantage of the TCMS principle. Also, input
b is from the output of a low-Vdd gate and is driving a high-Vdd gate. This results in an increased
leakage current because the pFinFET is forward-biased by $V_{dd}^{H} - V_{dd}^{L}$.
To avoid the above problem, a level-converter may be used to restore the signal to $V_{dd}^{H}$. These
level-converters may be combined with a flip-flop, as in the clustered voltage scaling (CVS) technique [57], to minimize the power for voltage level restoration. In an asynchronous approach,
such as ECVS [58], level-converters may be inserted between logic gates connected to $V_{dd}^{L}$ and $V_{dd}^{H}$.
In such schemes, the power and delay overheads for the level-converters are large.
In the case of TCMS, using level-converters is not an attractive option because power savings
are obtained through the use of overdriven gates, the frequent use of which necessitates frequent
level conversion. However, level conversion can be built into logic gates without requiring the use
of level-converters [75], through the use of a high-Vth FinFET at the inputs of high-Vdd gates that
need to be driven by a low-Vdd input voltage. FinFET Vth may be controlled through a number of
mechanisms. For example, there are several process-related options to statically control the Vth of
a FinFET, e.g., channel doping, gate workfunction engineering or asymmetrical double gates [76].
The first step towards evaluating the utility of the TCMS principle for arbitrary logic circuits
involves the design of technology libraries, consisting of high-Vdd cells, low-Vdd cells, low-Vdd cells
that are being driven by high-Vdd cells and high-Vdd cells that are being driven by low-Vdd cells.
All these cells have to be characterized both at high-Vth and low-Vth . Thus, the design variables
that need to be targeted are supply voltage, input gate voltage and threshold voltage. Hence, for
a two-input NAND gate of a given size, we have five design variables: supply voltage, gate input
voltage for input a, gate input voltage for input b, Vth for FinFETs connected to input a and Vth for
FinFETs connected to input b. If the Vth of a pFinFET connected to an input is high (low), then the
corresponding nFinFET connected to the same input also has a high (low) Vth . It can be easily seen
that 32 two-input NAND gates of a particular size are possible, because of the five design variables.
For example, one type of NAND gate may have a high supply voltage ($V_{dd}^{H}$, $V_{ss}^{H}$), a low input a gate
voltage, a high input b gate voltage, a high Vth for FinFETs connected to input a and a low Vth for
FinFETs connected to input b. Let 1 denote the case when either a high supply voltage or a high
input gate voltage or a high Vth is used. Similarly, let 0 denote when either a low supply voltage
or a low input gate voltage or a low Vth is used. Using this convention, the example NAND gate can
be termed nand10110. The first digit (1) in nand10110 denotes a high supply voltage, the second (0) denotes
a low input a gate voltage, the third (1) denotes a high input b gate voltage, the fourth (1) represents a
high Vth for input a and the fifth (0) represents a low Vth for input b. Thus, 32 NAND gate modes are
possible, ranging from nand00000 to nand11111. However, certain combinations of design variables
are not allowed: a logic gate with a high supply voltage and low input gate voltages cannot employ
low-Vth transistors as this will lead to a large leakage current, as explained earlier. Thus, nand10000,
nand10001, nand10010, nand10100, nand10101, nand11000 and nand11010 are not allowed. This
leads to 25 NAND gate modes instead of 32. Similarly, there are 25 NOR gate modes. Since the
inverter is a one-input gate, it has three design variables: supply voltage, input gate voltage and Vth .
This leads to seven valid modes for inverters. For each NAND, NOR and inverter mode, we include
five sizes: X1, X2, X4, X8 and X16. The library is characterized by simulating the delay, leakage
and short-circuit power consumption of each constituent cell in HSPICE. Transistor capacitance is
also measured using HSPICE. To model interconnect delay and load, fanout and size-dependent
wire load models were obtained by scaling the wire characteristics available as part of a 130nm
technology library, according to the method presented in [77].
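The mode counts quoted above can be reproduced by brute-force enumeration. The sketch below is illustrative only: the function name `valid_modes` is hypothetical, and the bit order (supply, input voltages, then threshold voltages) follows the naming convention described in the text.

```python
from itertools import product


def valid_modes(num_inputs):
    """Enumerate mode strings: supply bit, input-voltage bits, then Vth bits."""
    modes = []
    for bits in product((0, 1), repeat=1 + 2 * num_inputs):
        supply = bits[0]
        v_in = bits[1:1 + num_inputs]
        v_th = bits[1 + num_inputs:]
        # Disallowed: a high-supply gate whose low-voltage input drives low-Vth FinFETs.
        leaky = supply == 1 and any(v == 0 and t == 0 for v, t in zip(v_in, v_th))
        if not leaky:
            modes.append("".join(map(str, bits)))
    return modes


nand_modes = valid_modes(2)   # two-input NAND (the same count applies to NOR)
inv_modes = valid_modes(1)    # inverter
print(len(nand_modes), len(inv_modes))
```

Under this rule, the seven excluded two-input modes are exactly the ones listed in the text, and the example mode 10110 remains valid.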
[Figure: power optimization flow chart; boxes: SG library, Verilog netlist, delay-minimized netlist by Design Compiler, TCMS library, "Delay constraints met?", power-optimized netlist]
3.5.2
procedure, the circuit is levelized. The level of each primary input is defined to be 0. The level of
a gate G, denoted as l(G), can be calculated by $l(G) = 1 + \max_{i \in \{1,2,\ldots,FN\}} l(GI_i)$, where $GI_i$ is
the ith fanin of gate G and FN is the gate fanin. Next, all the gates located at an odd level in the
initial netlist are replaced by high-Vdd gates, with $V_{th}^{H}$ at FinFETs connected to input a if this input
arrives from an even level, i.e., input a is the output of a low-Vdd gate. On the other hand, if the
input arrives from an odd level, the threshold voltage of the FinFETs can, in general, be allowed to
be either $V_{th}^{L}$ or $V_{th}^{H}$. However, in our approach, we replace it by $V_{th}^{H}$ to reduce the initial leakage
power. Note that this $V_{th}^{H}$ assignment can be changed to $V_{th}^{L}$ in Phase II if the optimization algorithm
deems it necessary. The gates at an even level are replaced with other modes of low-Vdd gates to
maintain circuit consistency, as mentioned earlier. We next illustrate the initialization phase through
an example.
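The levelization rule above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the netlist representation (a dictionary from gate name to fanin names), the function name `levelize` and the five-gate example circuit are all hypothetical and are not taken from the original flow.

```python
def levelize(fanins, primary_inputs):
    """Return {node: level}; primary inputs are level 0, l(G) = 1 + max fanin level."""
    levels = {pi: 0 for pi in primary_inputs}

    def level_of(node):
        # Memoized recursion; adequate for a small illustrative netlist.
        if node not in levels:
            levels[node] = 1 + max(level_of(f) for f in fanins[node])
        return levels[node]

    for gate in fanins:
        level_of(gate)
    return levels


# Hypothetical netlist: gate name -> list of fanin names.
fanins = {
    "i1": ["x1"], "i2": ["x2"],
    "n1": ["i1", "x3"],
    "n2": ["n1", "i2"],
    "i3": ["n2"],
}
levels = levelize(fanins, ["x1", "x2", "x3"])
high_vdd = sorted(g for g in fanins if levels[g] % 2 == 1)  # odd levels -> high-Vdd gates
print(levels, high_vdd)
```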
Consider the circuit shown in Fig. 3.6. Initially, the circuit is synthesized using $V_{dd}^{L}$ and $V_{th}^{L}$, i.e.,
all the NAND gates and inverters are of the form nand00000 and inv000, respectively. Thereafter, as
explained earlier, the inverters of size X4 at level 1 are replaced with other inverters from the TCMS
cell library. The replaced cells have size X4, but their supply voltages are ($V_{dd}^{H}$, $V_{ss}^{H}$) and their
threshold voltages are $V_{th}^{H}$, i.e., the replaced inverters are of mode inv101. Similarly, the NAND
gate at level 3 is replaced with a high-Vdd NAND gate, which employs $V_{th}^{H}$ as the threshold voltage,
i.e., it has the nand10111 mode. The NAND gate at level 2 is replaced with nand01011, and the
inverter at level 4 is replaced with inv011. This is done so that modes of the gates at an even level
are consistent with the circuit topology.
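The mode names in this example follow from the level parity of each gate and of its fanins. The sketch below is one possible reading of that rule, not the author's implementation: the helper `initial_mode` is hypothetical, the fanin levels passed in the usage lines are assumptions chosen to be consistent with the modes quoted above, and every input is started at high Vth as in the example.

```python
def initial_mode(kind, gate_level, fanin_levels):
    """Build an initial mode name such as 'nand10111' from level parities."""
    supply = gate_level % 2                     # odd level -> high-Vdd gate
    v_in = [lvl % 2 for lvl in fanin_levels]    # odd-level driver -> high input voltage
    v_th = [1] * len(fanin_levels)              # start with high Vth on every input
    return kind + "".join(str(b) for b in [supply, *v_in, *v_th])


# Assumed fanin levels: primary inputs are level 0.
print(initial_mode("inv", 1, [0]))       # inverter at level 1 driven by a primary input
print(initial_mode("nand", 2, [1, 0]))   # NAND at level 2
print(initial_mode("nand", 3, [2, 1]))   # NAND at level 3
print(initial_mode("inv", 4, [3]))       # inverter at level 4
```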
When a gate is changed from a low-Vdd gate to a high-Vdd gate, it is not necessary that both of
its inputs will come from an even level and will thus be low-Vdd signals. It might so happen that one
of the inputs comes from an odd level and is the output of a high-Vdd gate. This explains the need
for 25 different NAND and NOR gate modes and seven different inverter modes in the cell library.
The circuit is divided into alternate levels of high-Vdd and low-Vdd gates to make use of the TCMS
scheme, which is based on the principle of a high-Vdd gate driving a low-Vdd gate. This also leads
to a low-Vdd gate driving a high-Vdd gate. However, as explained above, in such a situation, $V_{th}^{H}$ is
[Figure 3.6: example circuit with inputs x1, x2, x3 and gates of size X2, X4 and X8 at levels 1 to 4]
3.5.3
Circuit sizing algorithms often perform a search amongst the various candidate cells available for
each gate to select the cell with the best power-delay sensitivity. Let $\Delta P$ represent the reduction in
power and $\Delta D$ the degradation in delay, if an alternate cell is used. The ratio $\Delta P / \Delta D$ is the power-delay
sensitivity. The cell with the best sensitivity is then used to replace the gate. However, as shown in [78], such decisions
can be quite suboptimal. The major advantage of the linear programming approach is that it
analyzes how changing each gate affects the gates it has a path to. We next review the gate
sizing algorithm presented in [78] and discuss enhancements we have made to it for implementing
the TCMS scheme.
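For reference, the greedy sensitivity-based choice that the linear programming approach improves upon can be sketched as follows. This is an illustration only: the function name `best_by_sensitivity`, the candidate cell names and all numbers are made up, and ties or non-positive delay changes are handled in a simplistic way.

```python
def best_by_sensitivity(candidates):
    """Pick the candidate with the largest power saving per unit of added delay."""
    # Each candidate is (name, delta_p, delta_d): power reduction and delay increase.
    usable = [c for c in candidates if c[2] > 0]
    return max(usable, key=lambda c: c[1] / c[2], default=None)


# Hypothetical candidate cells for one gate (values are made up for illustration).
candidates = [
    ("X2_highVth", 4.0, 2.0),
    ("X1_lowVth", 3.0, 1.0),
    ("X1_highVth", 5.0, 4.0),
]
print(best_by_sensitivity(candidates))
```

A per-gate choice like this ignores how a change at one gate affects the timing of the gates it drives, which is the shortcoming the linear programming formulation addresses.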
The linear programming formulation is an iteration based algorithm. In each iteration, it selects
the best cells for any number of gates in the circuit, based on the power-delay sensitivity. To reduce
power, the cell with maximum reduction in power for a given increase in delay is chosen. To reduce
delay, the cell with maximum delay decrease for the corresponding increase in power is chosen.
When an alternative cell is chosen, the level of the existing gate and the input gate voltages play an
important role. If the existing gate is at an odd level and the voltage at input a (b) is high (low), it
can only be replaced by gates that have a high supply voltage and high (low) voltage at input a (b).
The same is true for gates at even levels. The free design variables are the threshold voltages and
gate sizes. The linear programming formulation is able to select alternative cells for any number of
gates in the circuit during each iteration. It uses a cell choice variable $\gamma_v$ for each gate v. $\gamma_v$ denotes
whether an alternative cell has been chosen or not. $\gamma_v$ varies continuously in the range [0, 1]. A value
of $\gamma_v$ greater than a threshold value indicates that an alternative cell should be used. In [78],
the threshold value chosen is 0.99. We found that such a high threshold value greatly reduces the
chances of a cell being replaced. We found empirically that a threshold value of 0.6 works better.
An alternate cell for gate v (Fig. 3.7) is then chosen by minimizing power among various candidate
cells for which $d'_v$ … $d_v + \gamma_v \Delta d_v$, where $d'_v$ is the delay through v after a cell change. At the end
of an iteration of the algorithm, all the gates whose alternative cells have a $\gamma_v$ value greater than 0.6
are replaced with alternative cells. Equation (3.2) gives the objective used to optimize power in the
linear programming formulation. Delay constraints at individual gates (for the circuit in Fig. 3.7)
and at the circuit outputs are given by Equations (3.3) and (3.4), respectively.
$$\min \sum_{v \in V} \gamma_v \, \Delta P_v \qquad (3.2)$$
[Equation (3.3): body not recoverable]
[Equation (3.4): constraint over $v \in \text{outputs}$; body not recoverable]
In the above equations, all timing arcs are assumed to have a negative polarity, i.e., a falling input
causes a rising transition at the output if the output changes. $\Delta P_v$ is the change in power due to
changing gate v. $t_{uv,fall}$ is the falling arrival time at gate v from gate u. $t_{vw,rise}$ is the rising
arrival time at gate w from gate v. $d_{uv,rise}$ is the delay from the signal on uv to the output of v
[Figure 3.7: circuit fragment with gates u, v, w and signals a, b, x, z]
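The effect of the acceptance threshold on the cell choice values can be illustrated with a small sketch. The function name `cells_to_replace` and the example values are hypothetical; only the two thresholds (0.6 here and 0.99 in [78]) come from the text, and the linear program itself is not solved in this example.

```python
def cells_to_replace(choice, threshold=0.6):
    """Return the gates whose cell choice value exceeds the acceptance threshold."""
    return sorted(g for g, value in choice.items() if value > threshold)


# Hypothetical LP solution: gate name -> cell choice value in [0, 1].
choice = {"g1": 1.0, "g2": 0.75, "g3": 0.60, "g4": 0.10}
print(cells_to_replace(choice))                  # threshold of 0.6 used in this chapter
print(cells_to_replace(choice, threshold=0.99))  # threshold reported for [78]
```

In this made-up example the lower threshold accepts one more cell change per iteration, which matches the motivation given above for relaxing the threshold.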
3.5.4
We synthesized the power-optimized netlist for ISCAS85 benchmark c17 at 130% ATC, i.e., with
a slack of 30% relative to the delay-minimized version, using the methodology illustrated earlier.
The set of high supply voltages used were 1.08V and -0.08V. The nominal set of supply voltages
were 1.0V and ground. These supply voltages were chosen by fixing the nominal set of supply
voltages and experimenting with various sets of high supply voltages. The two threshold voltages for
nFinFETs were 0.29V and 0.45V and those for pFinFETs were 0.25V and 0.40V. The switching
activity at each primary input was set to 0.1.
c17 is initially mapped to low-Vdd gates with low-Vth and the delay-minimized logic netlist
is obtained, as shown in Fig. 3.8. Thereafter, the power-optimized netlist is obtained at 130%
ATC. We achieved 50.3% power reduction for this circuit. The initial power consumption of the
delay-minimized netlist was 301.35W (leakage power: 28.02W, dynamic power: 273.33W). In
the power-optimized netlist, leakage power reduces by 92.8% and dynamic power by 45.9%, and,
hence, the total power consumption reduces to 149.87W. The cells chosen by our methodology
for c17 are shown in Fig. 3.9.
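The c17 figures quoted above can be cross-checked with simple arithmetic. The sketch below only recombines numbers stated in the text, in the units used by the source; small differences in the last digit are expected because the percentage reductions are rounded.

```python
# Values quoted in the text for the delay-minimized c17 netlist.
leak0, dyn0 = 28.02, 273.33
total0 = leak0 + dyn0

# Apply the reported reductions (92.8% leakage, 45.9% dynamic).
leak1 = leak0 * (1 - 0.928)
dyn1 = dyn0 * (1 - 0.459)
total1 = leak1 + dyn1

print(f"initial total   = {total0:.2f}")
print(f"optimized total = {total1:.2f} (reported: 149.87)")
print(f"total reduction = {100 * (1 - 149.87 / total0):.1f}%")
```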
[Figure 3.8: delay-minimized netlist for c17, annotated with gate sizes X1 to X16 by level]
[Figure 3.9: power-optimized netlist for c17, annotated with cell modes (e.g., inv101, nor10011, nor11011, nor01100, nor00111, nand01001, nand00110) and gate sizes by level]
to the delay-minimized netlist (as can be seen by the gate sizes shown in Figs. 3.8 and 3.9). The
cells at the odd levels are high-Vdd gates. The dynamic power consumption of these cells tends to
increase due to an increase in their supply voltage, but tends to decrease due to a reduction in the
area of the cells they drive. The power consumption of the cells at even levels decreases if there
is a reduction in the area of the cells they drive. In Fig. 3.9, there is a decrease in the size of all
except one (in which case the size is the same) cell in the netlist. Thus, the dynamic power of the
low-Vdd cells decreases. The dynamic power of high-Vdd cells also decreases in most cases because
the reduction in area outweighs the increase in supply voltage. The total number of fins in the
delay-minimized netlist is 538 while the total number of fins in the power-optimized netlist is only
216.
high input gate voltage is used when the LOW-Vdd library is characterized. When a HIGH-Vdd gate
feeds a LOW-Vdd gate, the rising delay value can be directly obtained from the library. However, the
falling delay value reduces because the input gate voltage increases from $V_{dd}^{L}$ to $V_{dd}^{H}$. To circumvent
this problem, we reduce the falling delay value by a fixed fraction [79] whenever a HIGH-Vdd gate
feeds a LOW-Vdd gate. Similarly, a HIGH-Vdd library was characterized with supply voltages (1.0V,
ground). Only these two libraries are required for the ECVS methodology.
Initially, the circuit is mapped to the HIGH-Vdd library to obtain the delay-minimized netlist,
and then the ECVS methodology is applied. We synthesized the power-optimized netlist for c17
at 130% ATC, using ECVS. The netlist is shown in Fig. 3.10. The cells marked as inv and nand
are LOW-Vdd gates, while the cells marked as inv_h and nor_h are HIGH-Vdd gates. The power
consumption of the ECVS power-optimized netlist is 176.69W. The dynamic power reduces by
40.2% and the leakage power by 53.1% in the power-optimized netlist. On the other hand, leakage
power reduces by 92.8% in the TCMS scheme. There is a larger reduction in leakage power in the
TCMS scheme because of the negative gate-to-source voltage resulting from the TCMS principle
and also due to the other set of high-Vth employed. Although ECVS employs LOW-Vdd gates,
which decrease the dynamic power consumption quadratically, still, dynamic power reduces by a
larger margin in the TCMS scheme because of the greater reduction in area obtained. The total
number of fins in ECVS power-optimized netlist is 282, which is 30.1% higher as compared to the
TCMS scheme. Out of 14 gates in the ECVS circuit, three are mapped to LOW-Vdd gates. Note that
the power-optimized netlist does not have any level-converters because there is no LOW-Vdd gate
driving a HIGH-Vdd gate. For larger circuits, however, ECVS would have to incur delay and power
overheads of level-converters.
[Figure 3.10: ECVS power-optimized netlist for c17, with cells marked inv_h, nor_h, nor and nand and gate sizes X1 to X8]
As we can see, the leakage power reduces by 95.8% and dynamic power by 53.3%, providing a
total power reduction of 67.6%, on an average, when compared to the delay-minimized netlist. The
Table 3.1: Power consumption (W)
Benchmark |        Delay-minimized        |    TCMS (1.08V and -0.08V)    |       TCMS (Single Vth)       |           Dual-Vdd
          |   Dynamic   Leakage     Total |   Dynamic   Leakage     Total |   Dynamic   Leakage     Total |   Dynamic   Leakage     Total
c432      |    679.15    546.38   1225.53 |    345.08     24.26    369.34 |    382.53     25.89    408.42 |    372.93     31.16    404.09
c499      |   9174.54   1757.26  10931.80 |   3973.30     95.37   4068.67 |   4357.34    100.63   4457.97 |   4181.26    118.74   4300.00
c880      |   1499.76    921.35   2421.11 |    768.27     33.90    802.17 |    827.37     35.08    862.45 |    802.10     27.14    829.24
c1355     |   8015.13   1583.94   9599.07 |   3414.07     80.98   3495.05 |   3805.77     88.01   3893.78 |   3564.44     94.86   3659.30
c1908     |   2810.42   1290.01   4100.43 |   1276.65     58.35   1335.00 |   1435.42     62.95   1498.37 |   1373.25     62.49   1435.74
c3540     |   2198.84   2073.06   4271.90 |   1198.03     53.25   1251.28 |   1273.48     90.67   1364.15 |   1262.22     69.07   1331.30
c5315     |   5545.22   2929.66   8474.88 |   2889.58    126.10   3015.68 |   3047.51    129.72   3177.23 |   2956.05    136.97   3093.02
c6288     |  11662.20   9749.32  21411.52 |   5479.86    394.28   5874.14 |   5806.66    397.43   6204.09 |   5792.84    362.09   6154.93
c7552     |  10836.80   5781.22  16618.02 |   5133.74    234.35   5368.09 |   5357.89    213.67   5571.56 |   5170.33    204.72   5375.05
Total     |  52422.06  26632.20  79054.26 |  24478.58   1100.84  25579.42 |  26293.97   1144.05  27438.02 |  25475.42   1107.24  26582.67
Savings   |         0         0         0 |     53.3%     95.8%     67.6% |     49.8%     95.7%     65.3% |     51.4%     95.8%     66.3%

Table 3.2: Fin count
Benchmark | Delay-minimized |  Dual-Vdd
c432      |           12731 |      4834
c499      |           39533 |     16127
c880      |           21483 |      7813
c1355     |           35994 |     13531
c1908     |           29507 |     11810
c3540     |           45283 |     19785
c5315     |           64807 |     26739
c6288     |          216800 |     85316
c7552     |          125762 |     41059
Total     |          591900 |    227014
Savings   |               0 |     61.6%
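The percentage savings in the power table can be recomputed from its column totals. The sketch below is a plain arithmetic check on those totals, with hypothetical variable names; the computed values may differ from the printed percentages by about 0.1 in the last digit, which is consistent with rounding or truncation in the source.

```python
# Column totals (dynamic, leakage, total) from the power table.
totals = {
    "delay-minimized": (52422.06, 26632.20, 79054.26),
    "TCMS":            (24478.58, 1100.84, 25579.42),
    "TCMS single-Vth": (26293.97, 1144.05, 27438.02),
    "dual-Vdd":        (25475.42, 1107.24, 26582.67),
}

base = totals["delay-minimized"]
savings = {
    name: tuple(100 * (1 - value / ref) for value, ref in zip(row, base))
    for name, row in totals.items() if name != "delay-minimized"
}
for name, (dyn, leak, total) in savings.items():
    print(f"{name:16s} dynamic {dyn:5.1f}%  leakage {leak:5.1f}%  total {total:5.1f}%")
```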
Figs. 3.11 and 3.12 present the leakage and dynamic power breakdown for delay-minimized
and power-optimized benchmarks, respectively. In the delay-minimized circuits, the leakage power
accounts, on an average, for 34% of the total power. After applying the TCMS scheme, the leakage
power accounts, on an average, for only 5% of the total power, as expected.
To account for a manufacturing process that allows only a single Vth, not dual Vth, we ran
the experiments again assuming that only Vth values of 0.45V for nFinFETs and 0.40V for pFinFETs
were available. The results are shown in major column 4 in Table 3.1. As expected, the overall
power reduction decreased slightly from 67.6% to 65.3%, since the power optimization algorithm
had less freedom to optimize. However, the negative impact of using a single Vth is marginal.
In general, the dynamic power consumption is slightly higher because the FinFET area (hence
capacitance) is higher at single-Vth . This can be seen from Table 3.2. Although the single-Vth
TCMS scheme employs only high-Vth , the leakage power consumption is marginally higher than
the dual-Vth TCMS scheme, because of the greater area reduction obtained in the latter. It was also
observed that in the power-optimized netlists obtained using the dual-Vth TCMS scheme, most of
the cells employed high-Vth. This further explains the similar reductions in leakage power achieved
by the two techniques.
[Figures 3.11 and 3.12: leakage and dynamic power breakdown (% of power) for the delay-minimized and power-optimized ISCAS'85 benchmarks c432 to c7552]
Even though TCMS leads to a substantial power reduction, one limitation is the need to lay out an additional VssH line. This limitation can be addressed by using the double-supply/double-ground grid suggested in [70]. Another option is to replace VssH with VssL, i.e., to use one Vss line instead of two. This decreases the achievable power reduction. However, since the TCMS principle still applies to the pFinFETs in the circuit, the power reduction remains appreciable. Therefore, we performed experiments with dual supply voltages VddL and VddH and a single ground line VssH. We refer to this as the dual-Vdd scheme. The results are shown in major column 5 of Table 3.1. As expected, the overall power savings decrease slightly, from 67.6% to 66.3%. The dynamic power consumption is slightly higher because the fin-count in the dual-Vdd scheme is higher than that in the TCMS scheme (see Table 3.2). However, the leakage power consumption is nearly the same across all the benchmarks. The reason is as follows. When a low-Vdd gate drives a high-Vdd gate in the TCMS scheme, the gate-to-source voltage difference increases the leakage current exponentially. This is counteracted by the use of a high-Vth in high-Vdd gates. In the dual-Vdd scheme, however, there is no gate-to-source voltage difference when a low-Vdd gate drives a high-Vdd gate and the output of the low-Vdd gate is low, because a single ground line is used. This leads to exponential savings in the leakage power consumption of high-Vdd gates in this case. On the other hand, when a high-Vdd gate drives a low-Vdd gate, the TCMS principle yields exponential power savings in the low-Vdd gates, and in the dual-Vdd scheme these savings can only come from pFinFETs. The two counteracting effects thus lead to power savings in the dual-Vdd scheme that are similar to those of the TCMS scheme.
Next, we consider trends in average power savings across the ISCAS'85 benchmarks at successively
relaxed ATCs. As expected, the average total power savings increase from 56% to 76% (Fig. 3.13).
This happens because at relaxed ATCs, the linear programming algorithm has more overall slack
to allocate to individual gates. This shows that the proposed TCMS-based optimization methodology can effectively utilize the increased slack to reduce power consumption in circuits. We also
performed simulation at 110% ATC to study the effectiveness of the technique at overall low slacks
[Figure 3.13: % reduction in power at ATCs of 110%, 130%, 150%, 170% and 190%]
[Figure: HIGH-Vdd gates and LOW-Vdd gates for the ISCAS'85 benchmarks c432 to c7552]
ECVS scheme
Benchmark    Dynamic (W)   Leakage (W)   Total (W)   Area (No. of fins)
c432             417.31         68.12       485.43          4987
c499            4002.47        294.11      4296.58         16567
c880             865.59         86.40       951.99          8173
c1355           3588.80        234.94      3823.74         14437
c1908           1380.27        147.06      1527.33         11758
c3540           1697.14        228.63      1925.77         19347
c5315           3439.03        328.07      3767.10         25840
c6288           5700.75       1055.93      6756.68         85328
c7552           4765.77        545.35      5311.12         42776
Total          25857.13       2988.61     28845.74        229213
Savings           50.7%         88.7%        63.5%         61.2%
Chapter 4
4.2.1
The FinFET device consists of a thin silicon body, whose thickness is denoted TSI, around which the gate electrodes are wrapped. The effective gate width of a FinFET is 2nHFin, where n is the number of fins
and HFin is the fin height. The fin-pitch (p) is the minimum pitch between adjacent fins allowed
by lithography at a particular technology node. Table 4.1 shows the symmetric-gate FinFET device
parameters used in our simulations for the 32nm FinFET technology. The parameters that have
a drastic effect on the leakage power of FinFETs are the gate-oxide thickness (TOX), TSI and the
effective channel length (Lch). The lateral doping profile in the source/drain region defines Lch. We
4.2.3
We next discuss how the best back-gate reverse bias can be derived. Fig. 4.2 shows the BSIM-simulated DC transfer characteristics for a 32nm nFinFET implemented in the <100> plane. The drain voltage is set to 1V. The front gate-to-source voltage (Vgfs) is varied from 0V to 1V. The transfer characteristics are shown for various back-gate biases (Vgbs). The top curve corresponds to the OSG mode and the bottom four curves correspond to the OLP modes of operation, as indicated. There is a noticeable difference in the Ion and Ioff currents for the different modes. Ion for the OSG mode is about 73% greater than Ion for the OLP mode (Vgbs = 0.2V). However, the subthreshold current decreases by an order of magnitude in the OLP mode as compared to the OSG mode. In the OLP mode, the leakage current decreases exponentially with an increase in reverse bias, whereas the percent decrease in Ion is marginal. Beyond a certain point, a further increase in reverse bias yields only a very marginal decrease in leakage current.
[Plot: Ids (A) vs. Vds (V) for the nFinFET and pFinFET in the <100> and <110> orientations, with annotated differences of 12% and 18%]
Figure 4.1: BSIM-simulated Ids vs. Vds characteristics for different orientations
The above discussion indicates that it is important to quantify the variation of leakage current
with transistor delay. Fig. 4.3 shows the delay and leakage current for an OLP-mode inverter at
various back-gate reverse bias magnitudes, ranging from 0 to 0.4V . It can be seen that the leakage
current is strongly dependent on the back gate bias. On the other hand, the degradation in delay
[Figure 4.2: Ids (A, logarithmic scale) for the OSG mode and for the OLP mode at Vgbs = 0V, 0.1V, 0.2V and 0.3V; the Ion difference is annotated as 73%]
4.3 Library design
In this section, we study the performance and power characteristics of FinFET logic gates in various
channel orientations. We show that optimally channel-oriented logic gates are considerably faster
than the corresponding logic gates that have all FinFETs in one plane. We also discuss the design of the
different kinds of cell libraries.
[Plot: leakage and delay (ps) vs. Vbgs (V), for Vbgs up to 0.4V]
Figure 4.3: Optimal back gate bias
4.3.1
We next discuss the various issues involved in the design of logic circuits in which the FinFETs have various surface orientations. We simulate the percent decrease in delay (with respect to <100>-oriented gates) for a minimum-sized inverter, NAND and NOR gate in the <110> orientation and in the optimized orientation, where the pull-up network is in the <110> plane and the pull-down network is in the <100> plane. The delay is the average of the rising and falling delays. For the <110> case, all the FinFETs in the logic gates are in the <110> plane. The reduction in delay when the inverter is switched from the <100> plane to the optimized orientation is 8%, whereas the reduction in delay for the NAND (NOR) gate is 10% (14%). The reduction in delay is largest for the NOR gate because the improvement in hole mobility has a maximal effect on stacked pFinFETs. There is also a reduction in delay when we move to the <110> orientation, despite the degradation in electron mobility, because the increased hole mobility in this plane reduces the rising delay. However, as expected, the delay reduction in the <110> orientation is smaller than in the optimal configuration because of the degraded electron mobility in the <110> plane. The delay reduction in the <110> plane for the INV is 7%, while the delay reduction for the NAND (NOR) gate is 7% (10%).
4.3.2
In this section, we describe how the three cell libraries (SG, OSG and OLP) are obtained. All three libraries contain INV, NAND and NOR gates. Each type of logic gate is implemented in five different sizes: X1, X2, X4, X8 and X16, where the number indicates its size relative to a minimum-sized gate. BSIM is used to characterize the libraries by simulating the delay, leakage and short-circuit power consumption of each constituent cell. The interconnect delay and load are modeled using the fanout- and size-dependent wire load models presented in [77].
We also laid out all the cells in the SG, OSG and OLP libraries to get an accurate estimate of the area occupied by these cells. The standard cell layouts and a gate-level netlist are given as input to the place-and-route tool, which provides the layout of the circuit. The area occupied by the SG-mode NAND gate is 7504λ², whereas the area occupied by the OSG-mode NAND gate is 9040λ². The OSG-mode gate occupies a larger area because of the presence of oriented pFinFETs in the OSG mode. The pFinFETs are in the <110> plane whereas the wafer is in the <100> plane. Hence, the pFinFETs are laid out at an angle of 45° with respect to the nFinFETs. The tilted layout results in an increase in area. Spacer lithography allows the minimum fin-pitch to be half of the lithography pitch. Thus, to be conservative, a distance of λ is assumed between adjacent fins in the OSG- and SG-mode layouts. In the OLP-mode NAND gate, the fin-pitch needs to be increased to 10λ. This is because the back gates of its FinFETs need to be reverse biased and, hence, a poly-to-metal contact must be placed between adjacent fins. Due to the large distance between adjacent fins in an OLP-mode NAND gate, its area is considerably larger than that of the SG-mode NAND gate. The area of the OLP-mode NAND gate is 13905λ². The heights of all cells are kept the same to enable standard cell design.
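As a quick arithmetic check of the overheads implied by the three NAND areas quoted above, the following minimal sketch computes the area increase of the OSG- and OLP-mode gates relative to the SG-mode gate. The dictionary and function names are illustrative only and are not part of the original flow.

```python
# NAND-gate layout areas quoted in the text (layout units squared).
NAND_AREA = {"SG": 7504, "OSG": 9040, "OLP": 13905}


def area_overhead(mode: str, baseline: str = "SG") -> float:
    """Fractional area increase of `mode` relative to `baseline`."""
    return NAND_AREA[mode] / NAND_AREA[baseline] - 1.0


for mode in ("OSG", "OLP"):
    print(f"{mode}-mode NAND: {100 * area_overhead(mode):.1f}% larger than SG-mode")
```

Under these numbers, the tilted pFinFET layout costs roughly a fifth more area, while the wider fin-pitch of the OLP mode nearly doubles it.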
4.4.1 Optimization flow
The power optimization flow is shown in Fig. 4.4. It starts by mapping a Verilog netlist to
SG-mode gates. Then its delay-minimized configuration is obtained. Thereafter, to evaluate the
utility of channel-oriented transistors, an iterative linear-programming-based algorithm is used to
map gates to cells of appropriate sizes and modes. The linear programming formulation can be
used to reduce both circuit delay and power consumption iteratively. The iteration terminates when
all the delay constraints are met and the reduction in power between successive iterations is less
than some preset fraction. Finally, the area of the power-optimized netlist is obtained through a
place-and-route tool.
4.4.2
The linear programming algorithm can size multiple logic gates at once and, hence, it avoids greedily sizing individual gates. Furthermore, while sizing a logic gate, the algorithm takes into account how sizing this gate affects other gates on the same circuit path. The algorithm iteratively improves the design. It selects one of the various candidate cells available for each logic gate based on its power-delay sensitivity ratio. Let ΔP represent the change in power and ΔD represent the change in delay if an alternative cell is chosen. The ratio ΔP/ΔD is used to rank the alternatives. If the goal is to reduce power, then the cell with the maximum power reduction for a given delay increase is chosen (min ΔP/ΔD, ΔP < 0, ΔD > 0). On the other hand, to reduce delay, the cell with the maximum delay reduction for a given power increase is chosen (ΔP/ΔD with ΔP > 0, ΔD < 0). The algorithm determines the best alternative cells for each logic gate in the circuit and then formulates a linear program. The solution obtained from this program indicates which logic gates are to be replaced with their alternatives. The rest of the algorithmic formulation is the same as the one presented in Section 3.5.3.
[Figure 4.4: Power optimization flow. Blocks: Verilog netlist; SG library; delay-minimized netlist by Design Compiler; OSG/OLP/SG library; linear programming formulation; decision "Delay constraints met?"; decision on the power reduction ΔP between iterations; power-optimized netlist; OSG/OLP/SG layout library; place-and-route tool]
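The candidate-selection step described above can be sketched as follows. This is a minimal illustration of ranking by the ΔP/ΔD ratio only; it does not reproduce the linear program itself, and the `Candidate` class, the function names and the numerical values are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Candidate:
    name: str        # hypothetical cell name
    d_power: float   # change in power if this cell replaces the current one
    d_delay: float   # change in delay if this cell replaces the current one


def best_for_power(cands: Iterable[Candidate]) -> Optional[Candidate]:
    """Among slower, lower-power cells, pick the most negative dP/dD."""
    usable = [c for c in cands if c.d_power < 0 and c.d_delay > 0]
    return min(usable, key=lambda c: c.d_power / c.d_delay, default=None)


def best_for_delay(cands: Iterable[Candidate]) -> Optional[Candidate]:
    """Among faster, higher-power cells, pick the least power spent per unit of delay saved."""
    usable = [c for c in cands if c.d_power > 0 and c.d_delay < 0]
    return max(usable, key=lambda c: c.d_power / c.d_delay, default=None)


candidates = [
    Candidate("cell_a", d_power=-10.0, d_delay=2.0),  # ratio -5
    Candidate("cell_b", d_power=-6.0, d_delay=1.0),   # ratio -6
    Candidate("cell_c", d_power=4.0, d_delay=-2.0),   # ratio -2
    Candidate("cell_d", d_power=3.0, d_delay=-3.0),   # ratio -1
]
print(best_for_power(candidates).name)
print(best_for_delay(candidates).name)
```

Cells that improve both power and delay, or worsen both, fall outside the two filters here; how the actual formulation treats them is not stated in the text.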
Power consumption (W)

Delay-minimized
Benchmark      Dynamic      Leakage        Total
c17             837.28       172.70      1009.98
c432           2190.22      1817.45      4007.67
c499          34067.60     12925.80     46993.40
c880           6401.05      6641.98     13043.03
c1908         11362.70      9130.46     20493.16
c3540         14058.50     19844.40     33902.90
c5315         30243.20     26174.20     56417.40
c7552         41568.80     39012.30     80581.10
Total        140729.35    115719.29    256448.64
Savings              0            0            0

SG-mode
Benchmark      Dynamic      Leakage        Total
c17             304.65        28.49       333.14
c432            886.45       383.12      1269.58
c499          11360.6       2381.63     13742.23
c880           2770.56      1488.68      4259.24
c1908          3948.41      1779.68      5728.09
c3540          5846.64      4653.07     10499.71
c5315         12296.90      5658.04     17954.94
c7552         16522.20      7484.49     24006.69
Total         53936.41     23857.21     77793.62
Savings          61.6%        79.3%        69.6%

SG/OSG/OLP-mode
Benchmark      Dynamic      Leakage        Total
c17             304.66        15.00       315.00
c432            917.50       249.89      1167.39
c499          11723.60      1561.45     13285.05
c880           2790.56       733.37      3523.93
c1908          4076.27       902.03      4978.28
c3540          5940.24      2330.60      8270.84
c5315         12372.10      2774.17     15146.27
c7552         16530.08      4024.39     20555.19
Total         54651.05     12590.90     67241.95
Savings          61.2%        90.0%        73.8%
Total area

Delay-minimized
Benchmark    X-span (λ)    Y-span (λ)     Area (λ²)
c17              1225          1162         1423450
c432             4852          5146        24968392
c499            10933         11122       121596826
c880             8063          8466        68261358
c1908            9335          9794        91426990
c3540           13705         14110       193377550
c5315           15786         16102       254186172
c7552           19154         19422       372008988
Total           83053         85324      1127249726
Savings             0             0               0

SG-mode
Benchmark    X-span (λ)    Y-span (λ)     Area (λ²)
c17              1283          1282         1644806
c432             2700          2822         7619400
c499             5476          5810        31815560
c880             4594          4814        22115516
c1908            4917          5146        25302882
c3540            7796          8134        63412664
c5315            8749          9130        79878370
c7552           10206         10126       103345956
Total           45721         47264       335135154
Savings         44.9%         44.6%           70.3%

SG/OSG/OLP-mode
Benchmark    X-span (λ)    Y-span (λ)     Area (λ²)
c17               480           830          398400
c432             2856          3154         9007824
c499             5955          6142        36575610
c880             5048          5478        27652944
c1908            5786          6142        35537612
c3540            9130          9462        86388060
c5315            9739          9794        95383766
c7552           11150         11454       127712100
Total           50144         52456       418656316
Savings         39.6%         38.5%           62.9%
Table 4.3 gives accurate area estimates of the delay-minimized as well as power-optimized netlists. A place-and-route tool [86] is used to find the area of the netlists. This tool takes the circuit netlist and the cell layouts as inputs and provides an accurate area estimate of the netlist. Major column I gives the length, width and area of the delay-minimized netlists. X-span (Y-span) denotes the length (width) of the layout in λ. Major column II gives the area estimates of the power-optimized SG-mode netlists at 130% ATC. On average, the length (width) of the circuit layout reduces by 44.9% (44.6%), and the area reduces by 70.3%. Major column III gives the area estimates of the power-optimized netlists comprising SG/OSG/OLP-mode gates at 130% ATC. The total area of these power-optimized netlists reduces by only 62.9% because of the oriented transistors.
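The savings percentages quoted above follow directly from the totals in Table 4.3. The sketch below recomputes them; the variable and function names are illustrative, and the SG/OSG/OLP-mode total area is obtained here by summing the per-benchmark areas listed in the table.

```python
# Totals over the eight benchmarks, taken from Table 4.3.
DELAY_MIN = {"x": 83053, "y": 85324, "area": 1127249726}
SG_MODE = {"x": 45721, "y": 47264, "area": 335135154}

# Per-benchmark areas of the SG/OSG/OLP-mode netlists (c17 ... c7552).
MIXED_AREAS = [398400, 9007824, 36575610, 27652944,
               35537612, 86388060, 95383766, 127712100]
MIXED_MODE = {"x": 50144, "y": 52456, "area": sum(MIXED_AREAS)}


def savings(baseline: dict, optimized: dict) -> dict:
    """Percentage reduction of each quantity relative to the baseline."""
    return {k: 100.0 * (1.0 - optimized[k] / baseline[k]) for k in baseline}


for label, netlist in (("SG-mode", SG_MODE), ("SG/OSG/OLP-mode", MIXED_MODE)):
    s = savings(DELAY_MIN, netlist)
    print(f"{label}: X {s['x']:.1f}%, Y {s['y']:.1f}%, area {s['area']:.1f}%")
```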
presented a FinFET cell library based circuit synthesis scheme. Such a scheme was not possible
in bulk CMOS due to the difficulty of fabricating transistors along the <110> plane. The efficacy
of the scheme was demonstrated with the help of the ISCAS'85 benchmarks. It was shown that significant power savings can be obtained at a relaxed delay constraint by using the suggested power
optimization methodology. We also developed standard cell FinFET libraries, which were used by
a place-and-route tool to give accurate area estimates of the logic circuits.
Chapter 5
• We develop leakage current macromodels for SG- and IG-mode FinFET devices, which are extracted from mixed-mode device simulations in Sentaurus TCAD.
• We extend the above to stacked devices in SG-, LP- and mixed-terminal (MT)-mode [89] NAND/NOR gates to obtain input vector dependent macromodels that can be used in FinFET circuit synthesis. Furthermore, we verify the distributions predicted by the macromodel with quasi-Monte Carlo (QMC) mixed-mode device simulations of NAND/NOR/INV gates.
• We implement a Latin hypercube sampling based methodology to capture leakage current variations under spatial correlations in ISCAS'85 benchmarks synthesized using FinFET standard cell libraries.
• We examine the leakage yield tradeoffs offered by substituting LP- and MT-mode gates in a 100% SG-mode circuit at iso-delay.
• We also show that by replacing an optimal percentage of SG-mode gates with LP- and MT-mode gates in a pure SG-mode circuit, with a reasonable delay slack, the mean and spread in leakage can be reduced dramatically.
The rest of the chapter is organized as follows. In Section 5.2, we review the background work.
In Section 5.3, we describe the setup used to simulate n/pFinFETs in FinE [7]. In Section 5.4, we
formulate leakage current macromodels and validate their distributions for various FinFET standard
cells. In Section 5.5, we describe the simulation flow and methodology used to obtain the leakage distribution of FinFET circuits. In Section 5.6, we present the experimental results and future
directions for synthesis strategies using FinFET standard cells. We conclude in Section 5.7.
5.2 Background work
In the past few years, FinFET research has gained a lot of traction amongst device and process
engineers as well as circuit designers. Logic styles leveraging the SG and IG modes of FinFET
operation have been explored in [65, 90, 91]. Power optimization in FinFET circuits has been
explored in [6, 60, 22, 56] using techniques like genetic algorithms/linear programming for gate
sizing, and multiple supply and threshold voltages.
Though FinFET circuit design and synthesis has attracted significant attention, few researchers
have explored the impact of process variations in FinFET devices and its effect at the circuit level.
In [53], engineering the workfunction of gate materials is shown to be effective in controlling Vth
under variations. Further, the sensitivity of the electrical parameters of the device to several important physical fluctuations, such as gate length, fin thickness and gate dielectric thickness is analyzed. Quantum effects are also shown to have a significant impact on FinFET device performance.
In [92, 50], a statistical estimation of leakage in SG-mode FinFET devices is performed under variations. The effect of process variations on device temperature in FinFET circuits is studied in [51],
where a Monte Carlo (MC) simulation based methodology using thermal models is used to solve
the temperature and leakage power self-consistently. Leakage current variability due to process
variation has been extensively studied in conventional CMOS. Models evaluating full-chip leakage
distributions under spatial correlation are presented in [93, 47, 94].
In this work, we perform die-level leakage analysis under process variations for FinFET circuits,
with the goal of leveraging the tradeoffs specific to FinFET standard cells during circuit synthesis.
In the next section, we deal with the simulation setup used to obtain various characteristics of
individual FinFET devices and logic gates.
[Blocks: MATLAB GUI; Sentaurus TCAD mixed-mode device simulation; QuasiMC process variation module; parameter extraction module; compact model (Spice3-UFDG); MATLAB post-processing]
Figure 5.1: FinE simulation framework for double-gate circuit design space exploration [7]
gate workfunction, source/drain doping and the operating voltage, respectively.
Table 5.1: FinFET device parameters
Parameter            Value
LGF, LGB (nm)        25
TOXF, TOXB (nm)      1
TSI (nm)             10
HFin (nm)            50
HGF, HGB (nm)        20
LSPF, LSPB (nm)      20
LUN (nm)             10
NBODY (cm^-3)        10^15
ΦG (eV)              nFinFET: 4.4, pFinFET: 4.8
NSD (cm^-3)          10^20
VDD (V)              1
Fig. 5.2 shows the two-dimensional (X-Y) FinFET cross-section of the 3D device structure that was simulated in TCAD. The heavily doped extended source and extended drain regions (HCON × LCON) aid in forming contacts to the device. They lead into the source/drain regions in the fin, where the dopant concentration gradually decreases towards the relatively undoped body region, causing an overlap (LOV) or underlap (LUN). The underlap (LUN = LUN,SOURCE = LUN,DRAIN) is defined as the distance from the physical gate edge to the point where the source/drain doping starts decreasing from its peak value. The Vth of FinFETs is typically tuned by directly adjusting the workfunction of the gate material. The workfunctions for the nFinFET (ΦG = 4.4eV) and pFinFET (ΦG = 4.8eV) are chosen corresponding to high-performance logic requirements.
In order to model the effect of process variations, we have incorporated into FinE a QMC tool [97] based on Sobol's sequence (with 2000 samples) to avoid the sample clustering problem encountered in MC simulation. QMC methods based on low-discrepancy sequences are known to produce samples that cover the sample space homogeneously, leading to quicker convergence with fewer samples. Using the above setup, in the following section, we extract simple leakage current macromodels for SG/IG-mode FinFET devices and individual SG/LP/MT-mode logic gates. We also verify their distributions with the QMC sampling described above, in order to obtain reliable models that can be utilized in circuit synthesis under process variations.
[Labels in the cross-section: GF, GB, CON, HGF, HGB, LSPF, LSPB, TOXF, TOXB, TSI, HCON, LUN (LOV)]
Figure 5.2: Two dimensional (X-Y) cross-section of an nFinFET simulated in Sentaurus TCAD
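To illustrate how a low-discrepancy sequence yields evenly spread process-parameter samples, the sketch below draws 2000 points for LG and TSI. It is only an illustration under stated assumptions: it uses a Halton sequence (a simpler low-discrepancy sequence that needs no library support) in place of the Sobol' sequence used in FinE, it takes the nominal values LG = 25nm and TSI = 10nm from Table 5.1, and it assumes a normal spread with 3σ/μ = 10%. The function names are hypothetical.

```python
from statistics import NormalDist


def van_der_corput(n: int, base: int) -> float:
    """n-th element (n >= 1) of the van der Corput sequence in the given base."""
    q, denom = 0.0, 1.0
    while n:
        n, digit = divmod(n, base)
        denom *= base
        q += digit / denom
    return q


def halton_normal_samples(count, means, three_sigma_over_mu=0.10, bases=(2, 3)):
    """Map a 2-D Halton sequence to independent normal samples.

    Each parameter gets mean m and standard deviation m * (3σ/μ) / 3.
    """
    dists = [NormalDist(m, m * three_sigma_over_mu / 3.0) for m in means]
    return [
        tuple(d.inv_cdf(van_der_corput(i, b)) for d, b in zip(dists, bases))
        for i in range(1, count + 1)
    ]


# (LG, TSI) samples in nm around the nominal values of Table 5.1.
samples = halton_normal_samples(2000, means=(25.0, 10.0))
lg_mean = sum(s[0] for s in samples) / len(samples)
print(f"sample mean of LG: {lg_mean:.3f} nm")
```

Because consecutive points fill the unit square evenly instead of clustering, sample statistics such as the mean settle with fewer points than with pseudo-random draws.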
5.4.1
We model leakage (ILEAK) as sub-threshold leakage, ignoring the negligible contributions from gate leakage (due to the undoped body) and gate-induced drain leakage (due to the choice of LUN). We identified the two main physical parameters that affect ILEAK using QMC simulation. Fig. 5.3 shows the ILEAK distribution for an nFinFET with inputs LUN, TOX, LG and TSI individually varying normally, such that 3σ/μ = 10%. The spread in leakage is more pronounced in the LG and TSI cases than in the TOX and LUN cases. LG and TSI primarily face lithographic variations. TOX, LUN and ΦG depend on the thermal effects of processing, which are controllable [92]. Hence, we focus on LG and TSI as the primary physical parameters determining leakage.
[Plot: probability of occurrence vs. Log(ILEAK / 1A) for variation in LUN, TOX, LG and TSI]
Figure 5.3: ILEAK spreads for LUN, TOX, LG and TSI, each varying independently
In [98], the Poisson and carrier continuity equations are solved without the charge sheet approximation to correctly predict volume inversion in a double-gate MOSFET, and ILEAK translates to

$$ I_{LEAK} = \mu \frac{H_{Fin}}{L_G} k_B T \, n_i \, T_{SI} \, e^{\left[\frac{q(V_{GS} - \Delta\phi_{ms})}{k_B T}\right]} \left(1 - e^{\left[-\frac{q V_{DS}}{k_B T}\right]}\right) \tag{5.1} $$

where μ, kBT and ni are the mobility, thermal energy and intrinsic concentration, respectively, and Δφms is the difference in Fermi levels between the metal gate and the semiconductor. Here, ILEAK ∝ TSI and is relatively independent of TOX (to the first order, ignoring gate leakage). However, for FinFETs in the short-channel regime with low TSI and LG, this is inaccurate as it fails to account for the short-channel effect (SCE) and the quantum confinement effect. ILEAK should then be obtained from the general expression for sub-threshold leakage [92]:

$$ I_{LEAK} = \frac{\mu H_{Fin} k_B T \left(1 - e^{\left[-\frac{q V_{DS}}{k_B T}\right]}\right)}{\displaystyle\int_0^{L_G} \frac{dy}{\int_{-T_{SI}/2}^{T_{SI}/2} n_c(x,y)\,dx}} \tag{5.2} $$

where nc(x, y) is the effective channel concentration. In [92], using a Taylor series expansion of log(nc(x, y)), an analytical model is developed for leakage in individual transistors and transistor stacks. The model correctly predicts an exponential loss in gate control over the channel with increasing TSI or decreasing LG, and hence an exponential increase in ILEAK. However, using the above approach to extract leakage distributions for large FinFET circuits is infeasible. Inspired by the above observations, we formulate a macromodel for leakage in SG-mode FinFETs as

$$ I_{LEAK} = I_{SG0} \, e^{\left[\frac{b_1}{L_G}\right]} \, e^{\left[a_1 T_{SI} + \frac{a_2}{T_{SI}}\right]} \tag{5.3} $$

where a1, a2 and b1 are coefficients that are extracted from TCAD simulations of the device. Figs. 5.4 and 5.5 show the variation in ILEAK over a wide range of values for LG and TSI simulated in TCAD. The macromodel parameters are obtained by fitting the data points with the lowest-degree polynomial yielding the least residual.
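A macromodel of this kind is cheap to evaluate. The sketch below assumes the form I_LEAK = I_SG0 · exp(b1/LG) · exp(a1·TSI + a2/TSI); the coefficient values are hypothetical placeholders chosen only to show the qualitative trend (leakage falling with LG and rising with TSI) and are not the fitted values from the TCAD data.

```python
import math


def sg_leakage(l_g, t_si, i_sg0, a1, a2, b1):
    """Evaluate I_SG0 * exp(b1 / L_G) * exp(a1 * T_SI + a2 / T_SI).

    Lengths are in micrometres; i_sg0 sets the current unit.
    """
    return i_sg0 * math.exp(b1 / l_g) * math.exp(a1 * t_si + a2 / t_si)


# Hypothetical coefficients, for illustration only.
COEFFS = dict(i_sg0=1e-9, a1=100.0, a2=0.0005, b1=0.079)

nominal = sg_leakage(l_g=0.025, t_si=0.010, **COEFFS)
longer_gate = sg_leakage(l_g=0.030, t_si=0.010, **COEFFS)
thicker_fin = sg_leakage(l_g=0.025, t_si=0.012, **COEFFS)
print(f"longer gate / nominal: {longer_gate / nominal:.3f}")
print(f"thicker fin / nominal: {thicker_fin / nominal:.3f}")
```

Multiplying log(I_LEAK) by LG (or by TSI) turns this form into a polynomial in LG (or TSI), which is consistent with the linear and quadratic fits referred to in the text.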
For IG-mode FinFETs, the back-gate bias (Vb) can alter Vth, and it is an effective knob to control leakage through the factor γ = ∂Vth/∂Vb, where

$$ \gamma \approx \frac{3T_{OXF}}{3T_{OXB} + T_{SI}} \tag{5.4} $$

From the TCAD simulation data shown in Fig. 5.6, the dependence of ILEAK on Vb is better approximated by a quadratic fit than by a linear fit around the nominal back bias. Hence, we incorporate the effect of Vb using

$$ I_{LEAK} = I_{IG0} \, e^{\left[\frac{b_1}{L_G}\right]} \, e^{\left[a_1 T_{SI} + \frac{a_2}{T_{SI}}\right]} \, e^{\left[k_1 (V_b - V_{b0})^2 + k_2 (V_b - V_{b0})\right]} \tag{5.5} $$

[Plot: LG × log of normalized ILEAK (µm) vs. LG (µm), TCAD data with linear fit; fit legend: y = 7.6*x + 0.079]
Figure 5.4: Matching SG-mode FinFET TCAD simulations with the macromodel for different LG
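$$ \cdots \; \frac{1}{2} \left.\frac{\partial^2 I_{LEAK}}{\partial L_G^2}\right|_{(\mu_{L_G},\,\mu_{T_{SI}})} \sigma_{L_G}^2 + \frac{1}{2} \left.\frac{\partial^2 I_{LEAK}}{\partial T_{SI}^2}\right|_{(\mu_{L_G},\,\mu_{T_{SI}})} \sigma_{T_{SI}}^2 \tag{5.6} $$

$$ \cdots \; \left(\left.\frac{\partial I_{LEAK}}{\partial L_G}\right|_{(\mu_{L_G},\,\mu_{T_{SI}})}\right)^2 \sigma_{L_G}^2 + \left(\left.\frac{\partial I_{LEAK}}{\partial T_{SI}}\right|_{(\mu_{L_G},\,\mu_{T_{SI}})}\right)^2 \sigma_{T_{SI}}^2 \tag{5.7} $$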
[Plot: TSI × log of normalized ILEAK (µm) vs. TSI (µm), TCAD data with quadratic fit]
Figure 5.5: Matching SG-mode FinFET TCAD simulations with the macromodel for different TSI
Fig. 5.7 shows that the spread in ILEAK predicted by the macromodel for an SG-mode nFinFET
with coefficients derived from Eqs. (5.4) and (5.5) is in good agreement with that obtained from
TCAD QMC simulations with inputs LG and TSI distributed normally with 3σ/μ = 10%.
5.4.2
[Plot: log of normalized ILEAK vs. Vb (V), TCAD data with linear and quadratic fits; lower panel: residuals of the linear and quadratic fits vs. Vb (V)]
Figure 5.6: Matching IG-mode TCAD simulations with the macromodel for different Vb
greatly reduces the computational burden for determining leakage in a NAND/NOR stack, unlike the solution of transcendental equations presented in [92], which compute the mid-point voltage of the stack. Figs. 5.10 and 5.11 show that the SG-mode NAND ILEAK for input vectors (00, 01, 10, 11) follows an identical form (with different coefficients) to that of a single SG-mode nFinFET device shown in Figs. 5.4 and 5.5, respectively. Fig. 5.12 shows that the mean and variance predicted by the macromodel using the coefficients extracted from Figs. 5.10 and 5.11 closely match those obtained from TCAD QMC simulations. A similar trend is observed for SG-mode NOR/INV gates and all the LP- and MT-mode gates as well [Vb = 1.2V (0.2V) for pFinFETs (nFinFETs)]. Fig. 5.13 shows the distributions in leakage for input vector 10 in SG-, LP- and MT-mode NAND gates. The LP-mode gates have the best leakage probability density function (PDF). While MT- and SG-mode gates have similar means, the spread is smaller in the MT-mode case. In the next section, we
[Figure: normalized occurrence vs. Log(ISG,LEAK / 1A), macromodel compared with QMC data]
5.5 Methodology
In this section, we describe the methodology adopted to examine leakage in large FinFET circuits using the macromodels developed in the previous section. Fig. 5.14 describes the simulation flow that was adopted to extract the die-level leakage distributions of circuits synthesized using SG/LP/MT-mode gates at various delay slacks, considering process variations with spatial correlations. First, we synthesized a pure SG-mode circuit using Synopsys Design Compiler [96]. Then we fed the output to the linear programming tool presented in [6], which produces a power-optimized netlist consisting of SG-, LP- and MT-mode standard cells for a particular delay constraint. The SG-, LP- and MT-mode libraries consisted of INV, NAND and NOR gates in five sizes: X1, X2, X4, X8 and X16. We obtained the standard cell layouts using Magic [85] and used Timberwolf [86] to place and route the cells.
[Figure: gate configurations (a) SG, (b) LP and (c) MT, with nodes labeled Vdd, Vhl and Vlow]
[Figure 5.10: LG × log of normalized ILEAK (µm) vs. LG (µm) for input vectors 00, 01, 10 and 11]
[Plot: TSI × log of normalized ILEAK (µm) vs. TSI (µm) for input vectors 00, 01, 10 and 11]
Figure 5.11: SG-mode NAND leakage from TCAD for different TSI
[Plot: normalized occurrence vs. Log(I00 / 1A), macromodel compared with QMC data]
Figure 5.12: SG-mode NAND I00 distribution predicted by the model and TCAD QMC simulations
[Plot: occurrence vs. Log(I10 / 1A) for SG-mode, LP-mode and MT-mode gates]
Figure 5.13: I10 distributions for SG-, LP-, and MT-mode NAND gates
[Figure 5.14: Simulation flow. Blocks: Verilog netlist; SG-mode library; Synopsys Design Compiler; LP/MT-mode library; linear programming tool; Timberwolf place & route; area extraction; spatial grid assignment for gates; FinE Sentaurus TCAD; FinE QMC simulation; model coefficients; decision "Residuals negligible?"; leakage spatial correlation matrix; decision "Positive semidefinite?"; Latin hypercube sampling with correlation; overall leakage distribution]
(5.8)
Here, LG0 represents the nominal gate length and ΔLG(2, 2), ΔLG(1, 1) and ΔLG(0, 1) are the zero-mean normal random variables in their corresponding regions. Consider the logic gates in regions (2,2), (2,4) and (2,16). Gates in regions (2,2) and (2,4) have two common parent regions, and thus LG(2, 2) and LG(2, 4) are tightly correlated, i.e., if the gate in (2,2) has less than nominal LG, then it is highly probable that the gate in (2,4) will also have less than nominal LG. On the other hand, the gate in region (2,16) only shares (0,1) as a parent region with the other two gates,

$$ Corr(L_G(2,2), L_G(2,4)) = Var(\Delta L_G(1,1)) + Var(\Delta L_G(0,1)) \tag{5.9} $$

where Var(ΔLG(1, 1)) and Var(ΔLG(0, 1)) are the variances of ΔLG(1, 1) and ΔLG(0, 1), respectively. On the other hand, Corr(LG(2, 2), LG(2, 16)) = Var(ΔLG(0, 1)). The number of levels into which the grid is partitioned typically depends on the processes affecting LG and TSI.
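The rule that two gates share exactly the variance components of their common parent regions can be sketched in a few lines. The sketch assumes a quadtree in which each region has four children numbered consecutively, so that the parent of region (level, k) is (level - 1, ceil(k / 4)); this numbering is an assumption, chosen because it reproduces the three examples in the text. The per-level variances are hypothetical.

```python
import math


def ancestors(level: int, index: int):
    """Regions that contain region (level, index), including itself."""
    chain = []
    while level >= 0:
        chain.append((level, index))
        level, index = level - 1, (index + 3) // 4  # ceil(index / 4)
    return chain


def shared_variance(region_a, region_b, var_by_level):
    """Sum of the variance components of all regions common to both gates."""
    shared = set(ancestors(*region_a)) & set(ancestors(*region_b))
    return sum(var_by_level[level] for level, _ in shared)


# Hypothetical variance of the zero-mean component at each grid level.
VAR_BY_LEVEL = {0: 0.5, 1: 0.3, 2: 0.2}

near = shared_variance((2, 2), (2, 4), VAR_BY_LEVEL)    # shares (1,1) and (0,1)
far = shared_variance((2, 2), (2, 16), VAR_BY_LEVEL)    # shares only (0,1)
print(f"(2,2) vs (2,4): {near:.2f}   (2,2) vs (2,16): {far:.2f}")
```

Evaluating `shared_variance` for every pair of gates gives the input matrix for LG (and likewise for TSI) from which the leakage correlation matrix is later built.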
Using the input correlation matrices generated for LG and TSI, we generated the spatial correlation matrix of leakage currents between the logic gates. Using the Taylor expansion of the model in Eq. (5.3), ILEAK can be approximated as

$$ I_{LEAK} = \alpha \, e^{c T_{SI} + d L_G} \tag{5.10} $$

Here, α, c and d are constants that depend on ISG0, a1, a2 and b1. Clearly, ILEAK is a lognormal random variable that can be expressed as α e^Y, where Y is a normal random variable with mean c μTSI + d μLG and variance c² σ²TSI + d² σ²LG.
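Because the exponent is normal, the mean and variance of the leakage follow in closed form from the standard lognormal moments. The sketch below assumes, as the variance expression in the text does, that TSI and LG are independent; the prefactor is called `alpha` here, and all numerical values are hypothetical.

```python
import math


def lognormal_leakage_moments(alpha, c, d, mu_tsi, sigma_tsi, mu_lg, sigma_lg):
    """Mean and variance of alpha * exp(c*T_SI + d*L_G) for independent normal inputs."""
    mu_y = c * mu_tsi + d * mu_lg
    var_y = c ** 2 * sigma_tsi ** 2 + d ** 2 * sigma_lg ** 2
    mean = alpha * math.exp(mu_y + var_y / 2.0)
    variance = alpha ** 2 * (math.exp(var_y) - 1.0) * math.exp(2.0 * mu_y + var_y)
    return mean, variance


# Hypothetical constants; lengths in nm with 3*sigma/mu = 10%.
mean, variance = lognormal_leakage_moments(
    alpha=1.0, c=0.3, d=-0.2,
    mu_tsi=10.0, sigma_tsi=10.0 / 30.0,
    mu_lg=25.0, sigma_lg=25.0 / 30.0,
)
print(f"mean = {mean:.4g}, std = {math.sqrt(variance):.4g}")
```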
As mentioned in Section 5.4, the leakage current for a gate is also input vector dependent. For each input vector, the basic template for ILEAK remains the same, but the constants a1, a2 and b1 change. Therefore, the average leakage current through a gate is given by:

$$ I_{LEAK}^{avg} = \sum_i p_i \, I_{LEAK}^{i} \tag{5.11} $$

Here, $p_i$ represents the probability of occurrence of the ith input vector state, and $I_{LEAK}^{i}$ is the leakage current in that state. The leakage probabilities corresponding to different input vector states are obtained using Synopsys Design Compiler [96].
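The probability-weighted average over input states is a one-line computation. The following minimal sketch uses hypothetical state probabilities and per-state leakage values for a two-input gate; in the flow described above, the probabilities would instead come from the synthesis tool.

```python
def average_leakage(state_probs: dict, state_leakage: dict) -> float:
    """Probability-weighted leakage over all input vector states."""
    if abs(sum(state_probs.values()) - 1.0) > 1e-9:
        raise ValueError("state probabilities must sum to 1")
    return sum(p * state_leakage[state] for state, p in state_probs.items())


# Hypothetical values for a two-input NAND gate (arbitrary current units).
probs = {"00": 0.25, "01": 0.25, "10": 0.25, "11": 0.25}
leak = {"00": 1.0, "01": 2.0, "10": 3.0, "11": 6.0}
print(average_leakage(probs, leak))
```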
Denoting the gates in regions (2,2) and (2,4) in Fig. 5.15 as gates A and B, respectively, we find

$$ Corr(I_{LEAK_A}^{avg}, I_{LEAK_B}^{avg}) = \sum_i \sum_j p_i \, p_j \, Corr(I_{LEAK_A}^{i}, I_{LEAK_B}^{j}) \tag{5.12} $$

where $I_{LEAK_A}^{i} = A_i e^{Y_i}$, $I_{LEAK_B}^{j} = B_j e^{Z_j}$, $Y_i = c_{A,i} T_{SI_A} + d_{A,i} L_{G_A}$, and $Z_j = c_{B,j} T_{SI_B} + d_{B,j} L_{G_B}$. The correlation between the two leakage currents is

$$ Corr(I_{LEAK_A}^{i}, I_{LEAK_B}^{j}) = \frac{Cov(I_{LEAK_A}^{i}, I_{LEAK_B}^{j})}{\sigma_{I_{LEAK_A}^{i}} \, \sigma_{I_{LEAK_B}^{j}}} \tag{5.13} $$

The covariance between $I_{LEAK_A}^{i}$ and $I_{LEAK_B}^{j}$ can be expressed as

$$ Cov(I_{LEAK_A}^{i}, I_{LEAK_B}^{j}) = A_i B_j \, Cov(e^{Y_i}, e^{Z_j}) \tag{5.14} $$

$$ Cov(e^{Y_i}, e^{Z_j}) = e^{\mu_{Y_i} + \mu_{Z_j} + \frac{\sigma_{Y_i}^2 + \sigma_{Z_j}^2}{2}} \left(e^{Cov(Y_i, Z_j)} - 1\right) \tag{5.15} $$

where Cov(TSIA, TSIB) and Cov(LGA, LGB) are obtained as described earlier. Under our assumptions, the grid-based method does not guarantee that the spatial correlation matrix will be positive semi-definite and, hence, we use the algorithm from [102] to generate the closest correlation matrix from a given symmetric matrix. Thereafter, we use Latin hypercube sampling on ILEAK for each gate, imposing the above correlation in leakage between the gates, to obtain the overall leakage distribution for the circuit. In the next section, we present the results of applying the above methodology to various benchmark circuits.
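The covariance and correlation of two lognormal leakage currents can be evaluated with the standard identities for exponentials of jointly normal variables. The sketch below is an illustration of those identities only; the function names and the numerical inputs are hypothetical, and the constant prefactors of the two currents are omitted because they cancel in the correlation.

```python
import math


def cov_exp_normals(mu_y, var_y, mu_z, var_z, cov_yz):
    """Cov(e^Y, e^Z) for jointly normal Y and Z."""
    return math.exp(mu_y + mu_z + (var_y + var_z) / 2.0) * (math.exp(cov_yz) - 1.0)


def corr_exp_normals(var_y, var_z, cov_yz):
    """Corr(e^Y, e^Z); the means drop out of the ratio."""
    return (math.exp(cov_yz) - 1.0) / math.sqrt(
        (math.exp(var_y) - 1.0) * (math.exp(var_z) - 1.0)
    )


# Hypothetical exponent statistics for two gates in neighbouring grid regions.
rho = corr_exp_normals(var_y=0.04, var_z=0.05, cov_yz=0.03)
print(f"leakage correlation: {rho:.3f}")
```

With Cov(Y, Z) = 0 the leakage currents are uncorrelated, and with Y = Z the correlation is exactly one, which are useful sanity checks before the matrix is passed on to the nearest-correlation-matrix step.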
Table 5.2: Comparison of SG-, SG + LP- and SG + MT-mode synthesis techniques for ISCAS'85 benchmarks at iso-delay

Total leakage current ITOT
Benchmark        SG-mode              SG + LP-mode         SG + MT-mode
circuit      Mean (A)  Std. (A)    Mean (A)  Std. (A)    Mean (A)  Std. (A)
c17             1.59      0.23        1.13      0.13        1.21      0.15
c432           25.60      0.79       12.79      0.38       16.98      0.61
c499           72.16      1.24       46.34      0.46       57.97      0.61
c880           72.11      1.68       30.96      0.49       45.83      0.64
c1908          64.86      1.49       35.57      0.60       52.02      0.83
c3540         160.15      2.50       85.53      0.89      128.12      1.42
c5315         192.30      2.60       89.73      1.89      148.66      1.02
c7552         259.46      2.08      143.54      2.04      204.61      1.22
Savings            0         0       47.5%     45.4%       22.7%     48.5%
mark circuits under process variations, synthesized with SG/LP/MT-mode logic gates using the
methodology described in Section 5.5. The parameters for $L_G$ and $T_{SI}$ were set to $L_G = 25$ nm, $T_{SI} = 10$ nm and $3\sigma/\mu = 10\%$.
Table 5.3 shows the results of synthesizing ISCAS85 benchmarks using SG-, SG + LP- and SG + MT-mode libraries at iso-delay. Major column I (II) gives $\mu_{I_{TOT}}$ and $\sigma_{I_{TOT}}$ for circuits synthesized using only SG-mode (SG + LP-mode) gates. The $\mu_{I_{TOT}}$ ($\sigma_{I_{TOT}}$) of circuits synthesized using SG + LP-mode gates is, on average, 47.5% (45.4%) lower than that of circuits synthesized using only SG-mode gates. On average, around 60% of the gates in circuits synthesized using SG + LP-mode gates are LP-mode gates. LP-mode gates are slower than SG-mode gates, so larger LP-mode gates are required to meet the same timing constraint. Even so, SG + LP-mode netlists have a superior leakage PDF because the $\mu_{I_{TOT}}$ and $\sigma_{I_{TOT}}$ of LP-mode gates are considerably lower than those of SG-mode gates. Major column III gives $\mu_{I_{TOT}}$ and $\sigma_{I_{TOT}}$ for circuits synthesized using SG + MT-mode gates. The $\mu_{I_{TOT}}$ ($\sigma_{I_{TOT}}$) for these circuits is, on average, 22.7% (48.5%) lower than that of circuits synthesized using only SG-mode gates. However, $\mu_{I_{TOT}}$ is larger than that of the circuits synthesized using SG + LP-mode gates. $\sigma_{I_{TOT}}$ of SG + MT-mode circuits is, in general, larger than that of SG + LP-mode circuits, except for the c5315 and c7552 benchmarks. This is because, as stated earlier, the mean and variance of the leakage of MT-mode gates are, on average, lower than those of SG-mode gates but higher than those of LP-mode gates.

Table 5.3: Comparison of SG-, SG + LP- and SG + MT-mode synthesis techniques for ISCAS 85 benchmarks at iso-delay (total leakage current $I_{TOT}$)

| Benchmark circuit | SG-mode Mean (A) | SG-mode Std. (A) | SG + LP-mode Mean (A) | SG + LP-mode Std. (A) | SG + MT-mode Mean (A) | SG + MT-mode Std. (A) |
|---|---|---|---|---|---|---|
| c17 | 1.59 | 0.23 | 1.13 | 0.13 | 1.21 | 0.15 |
| c432 | 25.60 | 0.79 | 12.79 | 0.38 | 16.98 | 0.61 |
| c499 | 72.16 | 1.24 | 46.34 | 0.46 | 57.97 | 0.61 |
| c880 | 72.11 | 1.68 | 30.96 | 0.49 | 45.83 | 0.64 |
| c1908 | 64.86 | 1.49 | 35.57 | 0.60 | 52.02 | 0.83 |
| c3540 | 160.15 | 2.50 | 85.53 | 0.89 | 128.12 | 1.42 |
| c5315 | 192.30 | 2.60 | 89.73 | 1.89 | 148.66 | 1.02 |
| c7552 | 259.46 | 2.08 | 143.54 | 2.04 | 204.61 | 1.22 |
| Savings | 0 | 0 | 47.5% | 45.4% | 22.7% | 48.5% |
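The "Savings" row of Table 5.3 can be cross-checked with a few lines of arithmetic. The sketch below copies the table entries and computes the percentage reduction in the column totals relative to the SG-mode columns. That the reported savings were obtained from column totals is an assumption on our part; it happens to reproduce the four reported percentages. The names `TABLE_5_3` and `savings` are illustrative only.

```python
# (mean, std) of I_TOT per benchmark, copied from Table 5.3 in row order
# c17, c432, c499, c880, c1908, c3540, c5315, c7552.
TABLE_5_3 = {
    "SG": [(1.59, 0.23), (25.60, 0.79), (72.16, 1.24), (72.11, 1.68),
           (64.86, 1.49), (160.15, 2.50), (192.30, 2.60), (259.46, 2.08)],
    "SG+LP": [(1.13, 0.13), (12.79, 0.38), (46.34, 0.46), (30.96, 0.49),
              (35.57, 0.60), (85.53, 0.89), (89.73, 1.89), (143.54, 2.04)],
    "SG+MT": [(1.21, 0.15), (16.98, 0.61), (57.97, 0.61), (45.83, 0.64),
              (52.02, 0.83), (128.12, 1.42), (148.66, 1.02), (204.61, 1.22)],
}

def savings(mode, column):
    """Percentage reduction of a column total relative to SG-mode.

    column = 0 selects the mean, column = 1 the standard deviation.
    """
    baseline = sum(row[column] for row in TABLE_5_3["SG"])
    candidate = sum(row[column] for row in TABLE_5_3[mode])
    return 100.0 * (1.0 - candidate / baseline)

for mode in ("SG+LP", "SG+MT"):
    print(mode, round(savings(mode, 0), 1), round(savings(mode, 1), 1))
```

Averaging the per-circuit percentage reductions instead gives somewhat different numbers, so the aggregate interpretation appears to be the one that matches the table.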
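Table 5.4: Mean and std. deviation of $I_{TOT}$ for ISCAS 85 benchmarks for $\sigma_{T_{SI}} = 0$ and $\sigma_{L_G} = 0$

| Benchmark circuit | $\sigma_{T_{SI}} = 0$: Mean (A) | $\sigma_{T_{SI}} = 0$: Std. (nA) | $\sigma_{L_G} = 0$: Mean (A) | $\sigma_{L_G} = 0$: Std. (A) |
|---|---|---|---|---|
| c17 | 0.04 | 0.04 | 1.67 | 0.23 |
| c432 | 0.69 | 0.16 | 28.06 | 0.62 |
| c499 | 1.97 | 0.25 | 82.63 | 0.99 |
| c880 | 1.99 | 0.29 | 81.35 | 1.14 |
| c1908 | 1.75 | 0.24 | 72.91 | 1.26 |
| c3540 | 4.46 | 0.39 | 180.80 | 2.03 |
| c5315 | 6.95 | 0.58 | 287.63 | 2.29 |
| c7552 | 7.12 | 0.48 | 294.86 | 1.96 |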
Table 5.4 presents the results for circuits synthesized using only SG-mode gates when $L_G$ and $T_{SI}$ vary individually, in major columns I and II, respectively. For $3\sigma/\mu = 10\%$, the variation in $I_{TOT}$ due to $L_G$ is not substantial. However, when $T_{SI}$ alone varies, $\mu_{I_{TOT}}$ and $\sigma_{I_{TOT}}$ are similar to the case where $L_G$ and $T_{SI}$ vary together. This observation is consistent with the leakage current trend of a single FinFET shown in Figs. 5.4 and 5.5: a small variation about the mean $L_G$ in Fig. 5.4 translates to a linear change in $\log(I_{LEAK}/I_{LEAK0})$, whereas a small variation about the mean $T_{SI}$ in Fig. 5.5 results in a quadratic change in $\log(I_{LEAK}/I_{LEAK0})$.
Fig. 5.16 shows the $I_{TOT}$ PDF for the c880 benchmark circuit synthesized using only SG-mode gates for the cases when $L_G$ and $T_{SI}$ are assumed to be correlated and uncorrelated. $\mu_{I_{TOT}}$ is similar in both cases; however, $\sigma_{I_{TOT}}$ doubles in the correlated case. This occurs because gates with correlated dimensions are likely to have large leakage currents simultaneously, leading to a wider spread.

Figure 5.16: Spreads in $I_{TOT}$ in the correlated and uncorrelated cases for benchmark circuit c880.
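The qualitative effect of correlation on the spread of the total leakage can be reproduced with a toy Monte Carlo experiment. The sketch below is not the dissertation's flow: it assumes every gate has a lognormal leakage $e^{Y}$ with the same $\sigma_Y$, and models correlation through a single shared Gaussian component with weight $\sqrt{\rho}$. The gate count, $\sigma_Y$ and $\rho$ are hypothetical values chosen only for illustration.

```python
import math
import random
import statistics

def total_leakage_samples(n_gates, n_samples, sigma, rho, seed=0):
    """Sample the sum of n_gates lognormal leakages with pairwise exponent correlation rho."""
    rng = random.Random(seed)
    totals = []
    for _ in range(n_samples):
        shared = rng.gauss(0.0, 1.0)  # die-level component common to all gates
        total = 0.0
        for _ in range(n_gates):
            local = rng.gauss(0.0, 1.0)  # independent per-gate component
            y = sigma * (math.sqrt(rho) * shared + math.sqrt(1.0 - rho) * local)
            total += math.exp(y)
        totals.append(total)
    return totals

uncorr = total_leakage_samples(n_gates=50, n_samples=2000, sigma=0.3, rho=0.0)
corr = total_leakage_samples(n_gates=50, n_samples=2000, sigma=0.3, rho=0.5, seed=1)
print("uncorrelated: mean, std =", statistics.mean(uncorr), statistics.pstdev(uncorr))
print("correlated:   mean, std =", statistics.mean(corr), statistics.pstdev(corr))
```

Because each exponent keeps the same variance in both cases, the mean of the total is essentially unchanged, while the standard deviation grows markedly once the gates share a common component. The exact ratio depends on the assumed $\rho$ and gate count and is not calibrated to Fig. 5.16.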
Fig. 5.17 shows the normalized area, $\mu_{I_{TOT}}$, and $\sigma_{I_{TOT}}$ obtained by increasing the fraction of LP-mode gates in a pure SG-mode netlist at iso-delay. Each point in the figure is obtained by normalizing the metric of the SG + LP-mode circuit to that of a 100% SG-mode circuit synthesized for the same delay. The normalized area increases sharply, while $\mu_{I_{TOT}}$ falls gradually and $\sigma_{I_{TOT}}$ drops sharply, as the percentage of LP-mode gates in the netlist is increased. Fig. 5.18 shows the effect of increasingly mixing LP-mode (MT-mode) gates into a pure SG-mode netlist for the c880 benchmark at an increasing output arrival time constraint. There is an 80% (87%) improvement in $\mu_{I_{TOT}}$ ($\sigma_{I_{TOT}}$) as we move from the 100% SG-mode circuit to the 80% SG-mode circuit. However, the gain decreases rapidly as we increase the fraction of LP-mode gates in the circuits further. This happens because initially all the large SG-mode gates get substituted with LP-mode gates and, hence, there is a large reduction in $\mu_{I_{TOT}}$ and $\sigma_{I_{TOT}}$. As we substitute more and more LP-mode gates, the smaller SG-mode gates are eventually replaced. Since the difference in $\mu_{I_{TOT}}$ ($\sigma_{I_{TOT}}$) between a small SG-mode gate and an LP-mode gate is small, the gain starts diminishing. A similar trend is observed for the MT case, where $\mu_{I_{TOT}}$ ($\sigma_{I_{TOT}}$) improves by 76% (83%) as we move towards

[Figure 5.17: normalized area, normalized $\mu_{I_{TOT}}$ and normalized $\sigma_{I_{TOT}}$ curves]

[Figure 5.18: normalized $\mu_{I_{TOT}}$ and normalized $\sigma_{I_{TOT}}$ curves for SG + LP-mode and SG + MT-mode]
Figure 5.19: Cumulative distribution function of $I_{TOT}$ for 100% SG-mode vs. 40% SG + 60% LP-mode (MT-mode) gates at iso-delay for benchmark circuit c880.
Fig. 5.19 shows the cumulative distribution function (CDF) for c880 synthesized using SG-, SG + LP- and SG + MT-mode logic gates at iso-delay. It can be seen that the slope of the SG CDF is smaller than the slope of the SG + LP (SG + MT) CDF, implying a larger variance for circuits synthesized using only SG-mode gates. The area of the SG-mode netlist is 31.2% (29.0%) smaller than the area of the SG + LP-mode (SG + MT-mode) netlist. The solid vertical lines show the different leakage current constraints. If the primary design objective is area along with a reasonable margin for leakage current, then circuits can be synthesized using SG-mode gates under a small failure probability. However, if the leakage constraints are tight, then the circuit needs to be synthesized using SG + LP-mode gates. The SG + LP-mode circuit can meet all three leakage constraints. On the other hand, the SG-mode circuit can meet only one leakage constraint. Again, c880 synthesized using SG + MT-mode gates lies in the middle of the spectrum and is able to meet two leakage current constraints. Thus, SG + MT-mode circuits offer greater yield as compared to circuits synthesized using SG-mode gates, but lower yield as compared to circuits synthesized using SG + LP-mode gates.
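The yield argument above can be illustrated by reading a CDF at a leakage limit. The sketch below is a simplification under stated assumptions: it approximates each $I_{TOT}$ distribution as Gaussian, uses the c880 mean and standard deviation from Table 5.3, and checks them against three hypothetical leakage limits (35, 50 and 75, in the units of the table). The actual constraint values shown in Fig. 5.19 are not given in the text, and the 0.9 yield threshold is likewise an arbitrary choice.

```python
import math

def normal_cdf(x, mean, std):
    """Probability that a Gaussian variable with the given mean and std is <= x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))

# (mean, std) of I_TOT for c880, taken from Table 5.3.
C880 = {"SG": (72.11, 1.68), "SG+LP": (30.96, 0.49), "SG+MT": (45.83, 0.64)}

# Hypothetical leakage limits, in the same units as the table.
LIMITS = [35.0, 50.0, 75.0]

def constraints_met(mode, min_yield=0.9):
    """Count the limits that the given netlist meets with at least min_yield probability."""
    mean, std = C880[mode]
    return sum(1 for limit in LIMITS if normal_cdf(limit, mean, std) >= min_yield)

for mode in C880:
    print(mode, "meets", constraints_met(mode), "of", len(LIMITS), "limits")
```

With these assumed limits, the ordering matches the discussion: the SG + LP-mode netlist meets all three, SG + MT-mode meets two, and SG-mode meets only the loosest one.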
Chapter 6
standard cell behavior under variations. In this chapter, we concentrate on delay. This needs to be
done for various FinFET logic styles. The delay model can also be very useful while performing
statistical static timing analysis (SSTA) of a FinFET circuit. To the best of our knowledge, this is
the first work to study the most critical process parameters affecting FinFET delay and then extend
the framework to develop variation-aware FinFET delay models.
The major contributions of this work can be summarized as follows:
• We identify the most critical parameters that affect the saturation current ($I_{ds}$) of the SG-FinFET and IG-FinFET. Since $I_{ds}$ directly influences the delay of a logic gate, these parameters impact the delay arcs of logic cells as well.
• We show that the dependence of $I_{ds}$ (hence, delay) on the process variation of gate length $L_G$ is remarkably different from the case of conventional bulk MOSFETs.
• We develop delay RSM models for FinFET standard cells using CCRD under environmental and lithographic variations.
• We show that the delay RSM models are in close agreement with MC simulations.
• We extend the delay RSM models to incorporate the effect of temperature on delay.
The rest of the chapter is organized as follows. In Section 6.2, we analyze the effects of process
and environmental variations on the delay of FinFETs. In Section 6.3, we describe the design of
experiments (DOEs). In Section 6.4, we demonstrate the efficacy of delay-based RSM models under
process variations. In Section 6.5, we enhance the delay RSM models to incorporate the effect of
temperature variation on delay. We conclude in Section 6.6.
6.2 Delay modeling
In this section, we study the effects of process and environmental variations on FinFET $I_{ds}$. The delay of a logic gate is directly correlated with the $I_{ds}$ of the transistor [103]. Thus, we analyze $I_{ds}$ to study the effects of variations on delay. We first analyze the effects of temperature variation on FinFET delay and thereafter identify the critical process parameters whose variations affect FinFET delay the most. We study the effect of variations on both SG- and IG-mode FinFETs. The FinFET device parameter values are shown in Table 6.1.
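Table 6.1: FinFET device parameters

| Parameter | Value |
|---|---|
| $L_{GF}$, $L_{GB}$ (nm) | 20 |
| $T_{OXF}$, $T_{OXB}$ (nm) | 1 |
| $T_{SI}$ (nm) | 10 |
| $H_{Fin}$ (nm) | 50 |
| $H_{GF}$, $H_{GB}$ (nm) | 20 |
| $L_{SPF}$, $L_{SPB}$ (nm) | 20 |
| $L_{UN}$ (nm) | 10 |
| $N_{BODY}$ (cm$^{-3}$) | $10^{15}$ |
| $\Phi_n$ (eV) | 4.4 |
| $\Phi_p$ (eV) | 4.8 |
| $N_{SD}$ (cm$^{-3}$) | $10^{20}$ |
| $V_{DD}$ (V) | 1 |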
6.2.1
There are two kinds of variations in integrated circuits: environmental and physical (or spatial). Most works on process variations analyze the physical variations in the fundamental process parameters. However, environment-based temporal variations may also arise from varying operating conditions. They can occur on time scales ranging from nanoseconds to years [10]. For example, effects such as negative or positive bias temperature instability lead to variations in $V_{th}$ over the circuit lifetime, whereas a varying computing workload leads to temporal variations in the chip temperature. Thermal packaging and heat dissipation become important concerns for FinFETs because of their SOI structure. FinFETs may attain a very high temperature at large input switching activity [51]. Therefore, it is very important to validate FinFET circuit designs at various temperature corners. We analyze nFinFETs. (A similar analysis is applicable to pFinFETs.)
The dependence of $I_{ds}$ on temperature can be understood through the following equation:

$$I_{ds} = \mu \, C_{OX} \, \frac{W_{eff}}{L_{eff}} \, (V_{gs} - V_{th}) \tag{6.1}$$

where $\mu$, $C_{OX}$, $W_{eff}$, $L_{eff}$ and $V_{gs}$ are the mobility, gate capacitance, effective width of the transistor, effective channel length and gate-source voltage, respectively [103]. $W_{eff}$ is equal to $2H_{Fin}$ for an SG-FinFET and $H_{Fin}$ for an IG-FinFET. $V_{th}$ incorporates the effects of $T_{SI}$ and $\Phi_n$. The temperature dependence of $I_{ds}$ originates from the dependence of $\mu$ and $V_{th}$ on temperature. As temperature $T$ increases, $V_{th}$ decreases because of the increased intrinsic carrier concentration at the channel surface. This increased concentration shifts the Fermi level towards the conduction band, thus lowering $V_{th}$. An analytical model for $V_{th}$ is presented in [104]:
$$V_{th} = \Phi_{MS} - \frac{KT}{q} \ln\left( \frac{q^2 \, n_i \, T_{SI} \, T_{OX}}{4 \, \epsilon_{OX} \, K T} \right) \tag{6.2}$$

where $\Phi_{MS}$ is the difference in the Fermi level of metal and semiconductor, and $K$, $q$, $n_i$, and $\epsilon_{OX}$ are the Boltzmann constant, electron charge, intrinsic carrier concentration, and permittivity of the oxide, respectively. It is evident from this equation that $V_{th}$ has a negative correlation with temperature $T$. Thus, with an increase in temperature, the gate drive $(V_{gs} - V_{th})$ increases. On the other hand, a temperature increase aggravates lattice scattering, thus reducing electron mobility. Hence, these two effects counteract each other, making the change in $I_{ds}$ dependent on the relative sensitivities of $\mu$ and $V_{th}$ at that temperature.
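The negative temperature dependence of $V_{th}$ can be checked numerically by evaluating the threshold voltage expression at two temperatures. The sketch below rests on several assumptions that are not part of the dissertation: the oxide is taken to be SiO2 (relative permittivity 3.9), $\Phi_{MS}$ is set to zero and treated as temperature independent, and the intrinsic carrier concentration of silicon is taken from a commonly used empirical fit, $n_i = 5.29 \times 10^{19} (T/300)^{2.54} e^{-6726/T}$ cm$^{-3}$. $T_{SI}$ and $T_{OX}$ use the nominal values of Table 6.1.

```python
import math

K = 1.380649e-23                  # Boltzmann constant (J/K)
Q = 1.602176634e-19               # electron charge (C)
EPS_OX = 3.9 * 8.8541878128e-12   # permittivity of SiO2 (F/m), assumed oxide

def n_i(temp_k):
    """Intrinsic carrier concentration of silicon in m^-3 (assumed empirical fit)."""
    per_cm3 = 5.29e19 * (temp_k / 300.0) ** 2.54 * math.exp(-6726.0 / temp_k)
    return per_cm3 * 1e6  # convert cm^-3 to m^-3

def v_th(temp_k, phi_ms=0.0, t_si=10e-9, t_ox=1e-9):
    """Threshold voltage (V) from the analytical expression, all lengths in meters."""
    arg = Q ** 2 * n_i(temp_k) * t_si * t_ox / (4.0 * EPS_OX * K * temp_k)
    return phi_ms - (K * temp_k / Q) * math.log(arg)

for celsius in (25.0, 125.0):
    kelvin = celsius + 273.15
    print(f"T = {celsius:5.1f} C: V_th - Phi_MS = {v_th(kelvin):.3f} V")
```

The logarithm's argument is much smaller than one, so the second term is positive; as $n_i$ rises steeply with temperature, that term shrinks and $V_{th}$ falls, in line with the discussion above.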
In order to investigate the temperature effect, we simulated an SG-nFinFET and an IG-nFinFET, with gate and drain tied to $V_{DD}$ and source shorted to ground, at two different temperatures: 25 °C and 125 °C. Fig. 6.1 shows $I_{ds}$ with varying voltages at the two temperatures. At $V_{DD} = 1.0$ V, $I_{ds}$ of an SG-nFinFET (IG-nFinFET) at $T = 25$ °C is 12% (3%) larger than $I_{ds}$ at $T = 125$ °C. However, at $V_{DD} = 0.5$ V, $I_{ds}$ of an SG-nFinFET (IG-nFinFET) at $T = 25$ °C is 3% (33%) lower than $I_{ds}$ at $T = 125$ °C. At $V_{DD} = 0.54$ V (0.9 V), $I_{ds}$ is independent of temperature for an SG-nFinFET (IG-nFinFET). Thus, IG-nFinFETs can be seen to be affected by temperature more than SG-nFinFETs. Hence, depending on the supply voltage, $I_{ds}$ may decrease, increase or remain constant with varying temperature. This is due to the counteracting effects of $\mu$ and $(V_{gs} - V_{th})$.
The $I_{ds}$ of IG-nFinFETs behaves differently with temperature than that of SG-nFinFETs because the back-gate bias controls the electron concentration of the channel, thus controlling the $V_{th}$ and $\mu$ of the electrons in the channel. Thus, it is extremely important to characterize both FinFET modes under temperature variations, not just the SG mode.
Figure 6.1: Variation of nFinFET saturation current with voltage and temperature

It should be noted that $I_{ds}$ has a strong dependence on $L_G$ and $T_{SI}$. Thus, the temperature-insensitive delay point will also have a strong dependence on $L_G$ and $T_{SI}$. Fig. 6.2 shows $I_{ds}$ at various voltages and $T_{SI}$, at 25 °C and 125 °C, for both SG- and IG-nFinFETs. The temperature-insensitive delay point for the SG-nFinFET (IG-nFinFET) is at 0.56 V (1.0 V) for $T_{SI} = 5$ nm. The point shifts towards the origin with increasing $T_{SI}$. This is because the electron mobility decreases with decreasing $T_{SI}$. A reduction in $T_{SI}$ leads to a narrow confinement of volume-inverted charge in the real space, increasing phonon scattering [105]. Also, surface roughness scattering increases with a reduction in $T_{SI}$, resulting in a lower $I_{ds}$.
6.2.2
In this section, we discuss the effect of fundamental process parameters on Ids . We also identify the
most critical parameters that affect Ids of both the SG- and IG-nFinFET.
Fig. 6.3(a) (6.3(b)) shows the variation of normalized $I_{ds}$ for the SG-nFinFET (IG-nFinFET) with normalized $L_G$, $T_{SI}$, $T_{OX}$ and $\Phi_n$. $L_{Go}$, $T_{SIo}$, $T_{OXo}$ and $\Phi_{no}$ denote the nominal device parameters shown in Table 6.1. $I_{dso}$ is the current at the nominal device parameter values. The polynomials were fitted to simulation data with a root mean square (RMS) error of less than 3%. It can be seen from the figures that $I_{ds}$ changes quadratically (quadratically) with $L_G$, cubically (quadratically) with $T_{SI}$, quadratically (quadratically) with $T_{OX}$, and linearly (linearly) with $\Phi_n$ for the SG-nFinFET (IG-nFinFET). Though $I_{ds}$ varies only linearly with $\Phi_n$, it varies the most when there is a 10% variation in $\Phi_n$. More precisely, in the case of the SG-nFinFET, $I_{ds}$ varies from $0.6 I_{dso}$ to $1.77 I_{dso}$ when there is a 10% variation in $\Phi_n$. On the other hand, $I_{ds}$ only varies from $0.83 I_{dso}$ to $1.18 I_{dso}$, $0.94 I_{dso}$ to $1.01 I_{dso}$, and $0.97 I_{dso}$ to $1.02 I_{dso}$, when there is a 10% variation in $L_G$, $T_{SI}$ and $T_{OX}$, respectively. The impact of variation in $\Phi_n$ is also the most profound in the case of the IG-nFinFET. Thus, though $I_{ds}$ varies quadratically, cubically and quadratically with $L_G$, $T_{SI}$ and $T_{OX}$, respectively, a small variation in $\Phi_n$ can manifest itself as a large variation in $I_{ds}$ because of the large coefficients in its linear model. However, if the process variation in $L_G$, $T_{SI}$ and $T_{OX}$ is larger than that in $\Phi_n$, it can result in a strong variation in $I_{ds}$ because of the corresponding polynomial dependence. This implies that all four parameters are critical for modeling the delay of standard cells under process variations.

[Figure 6.2: $I_{ds}$ (A) versus $V_{ds} = V_{gs}$ (V) at 25 °C and 125 °C for $T_{SI}$ = 5 nm, 12 nm and 20 nm: (a) SG-nFinFET, (b) IG-nFinFET]

Figure 6.3: Saturation current dependence on process parameters for SG- and IG-nFinFET
Another important point to note is that Ids monotonically increases with LG . This is in stark
contrast to bulk CMOS where Ids decreases with an increase in LG . This is owing to a slight
difference in the fabrication processes of FinFETs and conventional MOSFETs. To explain this
phenomenon, we first discuss the traditional planar MOSFET fabrication technology. We highlight
the part that determines Ids behavior controlled by LG of the device. Next, we show how FinFET
fabrication differs, resulting in a different Ids behavior with process variations in LG .
Figure 6.4: Effect of process variation on physical gate length of traditional planar MOSFETs
We explain the impact of $L_G$ on $I_{ds}$ with the help of a bulk nMOS transistor (a similar explanation is applicable to a pMOS transistor). First, an oxide layer is created on the silicon substrate, followed by deposition of a polysilicon layer, which is used as the gate material. Next, both layers are etched to create the channel for the device. Then, the open area is doped using ion implantation to create the source and the drain. Hence, any variation in the process of etching away the polysilicon gate layer determines the closeness of the source and drain regions. This can be seen from Fig. 6.4, which shows two nMOS transistors with different channel lengths: $L_{G1}$ and $L_{G2}$. With increasing $L_G$, the distance between source and drain increases, which leads to a reduction in $I_{ds}$ because the electrons now have to traverse a longer path.
(a) FinFET with small gate length (b) FinFET with large gate length
resistance offered by the undoped body. Thus, as LG increases the resistance between source and
drain decreases in a FinFET. This, in effect, gives rise to a larger Ids .
To quantitatively analyze the impact of process parameter variation on FinFET delay, we next perform a sensitivity analysis of $I_{ds}$ with respect to each process parameter ($Pr$). $I_{ds}$ is calculated at the point where $V_{ds} = V_{gs} = 1.0$ V. A simple three-point experiment is performed. For each $Pr$, the slope of $I_{ds}$ is first calculated between the $\mu$ and $\mu + 3\sigma$ points, and then between the $\mu$ and $\mu - 3\sigma$ points. The average of the two slopes defines the total sensitivity of $I_{ds}$ to $Pr$; averaging accounts for the inherent nonlinearity of $I_{ds}$ with respect to process parameter variation. To make the sensitivity dimensionless, it is divided by the nominal current, $I_{ds_{nom}}$, and multiplied by the nominal process parameter, $Pr_{nom}$. This makes it easier to compare sensitivities across various process parameters. Thus, the dimensionless sensitivity $S$ is given by:

$$S = \frac{\Delta I_{ds}}{I_{ds_{nom}}} \Big/ \frac{\Delta Pr}{Pr_{nom}} \tag{6.3}$$

Fig. 6.6 shows the absolute value of $S$ for various $Pr$, i.e., $L_G$, $T_{SI}$, $T_{OX}$ and $\Phi_n$, for both SG- and IG-nFinFETs. $I_{ds}$ can be seen to be most sensitive to variations in $\Phi_n$, followed by variations in $L_G$, $T_{SI}$ and $T_{OX}$. This corroborates the results presented in Fig. 6.3. It can be seen that IG-nFinFETs are more prone to $I_{ds}$ variations as compared to SG-nFinFETs. Since $I_{ds}$ is directly related to FinFET delay, it can be safely concluded that variations in $\Phi_n$ also impact delay the most. However, due to lithographic effects, such as line edge roughness, the variations in $L_G$, $T_{SI}$ and $T_{OX}$ can also be substantial. Thus, we consider all four process parameters for delay modeling.
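The three-point sensitivity computation can be sketched in a few lines. The example reuses the normalized SG-nFinFET current ranges quoted earlier in this section (0.83 to 1.18 for $L_G$, 0.6 to 1.77 for $\Phi_n$) and assumes, purely for illustration, that the 10% end points coincide with the $\mu \pm 3\sigma$ points and that $I_{ds}$ decreases with increasing $\Phi_n$. The function name and argument names are hypothetical.

```python
def sensitivity(i_minus, i_nom, i_plus, pr_nom, delta_pr):
    """Dimensionless three-point sensitivity of I_ds to a process parameter.

    i_minus, i_nom, i_plus are the currents at (mu - 3*sigma), mu and (mu + 3*sigma);
    delta_pr is the 3*sigma step of the parameter, in the same units as pr_nom.
    """
    slope_up = (i_plus - i_nom) / delta_pr      # slope between mu and mu + 3*sigma
    slope_down = (i_nom - i_minus) / delta_pr   # slope between mu - 3*sigma and mu
    average_slope = 0.5 * (slope_up + slope_down)
    return average_slope * pr_nom / i_nom

# Normalized quantities: nominal current and parameter are 1, the step is 0.1 (10%).
s_lg = sensitivity(i_minus=0.83, i_nom=1.0, i_plus=1.18, pr_nom=1.0, delta_pr=0.1)
s_phi_n = sensitivity(i_minus=1.77, i_nom=1.0, i_plus=0.6, pr_nom=1.0, delta_pr=0.1)
print(f"|S| for L_G   = {abs(s_lg):.2f}")
print(f"|S| for Phi_n = {abs(s_phi_n):.2f}")
```

Under these assumptions the magnitude of $S$ for $\Phi_n$ is roughly three times that for $L_G$, which agrees with the ranking described above.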
(6.4)
The goal is to optimize the response variable (y). It is assumed that the independent variables are
continuous.
To model the above equation, we need to evaluate f at various input vectors. Also, three distinct
input values are needed for each variable if one needs to model a quadratic-level response surface.
Hence, two-level factorial designs cannot be used. Also, full factorial designs are not preferred
because of the huge computation time associated with them. An effective alternative to factorial
designs is the CCRD, originally developed by Box and Wilson and later improved upon by Box and
Hunter [107].
Fig. 6.7 shows the geometrical representation of a CCRD for three variables. It consists of eight factorial cube points, six axial points, and a center point. In general, the number of tests required for a $k$-variable CCRD is $2^k$ factorial points, $2k$ axial points and a center point, for a total of $2^k + 2k + 1$ experiments. However, the factorial portion can also be a fractional factorial design [108]. The factorial points generate the coefficients for the linear terms and the axial points for the quadratic terms. The axial points are chosen such that they allow rotatability, which ensures that the variance of model prediction is constant at all points equidistant from the design center. After the range of input variables has been fixed, they are coded as $\pm 1$ for factorial points, $\pm\alpha$ for axial points, and 0 for the center point. The coded values are calculated as functions of the range of interest of each variable, as shown in Table 6.2 [109]. Here, $x_{max}$ ($x_{min}$) denotes the maximum (minimum) value of the variable and $\alpha = 2^{k/4}$. In our case, $k = 5$. However, we have chosen a fractional factorial design. Hence, the number of factorial points is 16 ($2^{5-1}$). Also, $\alpha = 2^{(5-1)/4} = 2$ for this fractional factorial design [108].
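The construction of the 27-run design can be sketched as follows. The generator used for the half fraction, $x_5 = x_1 x_2 x_3 x_4$, is an assumption; it appears consistent with the actual parameter values listed in Table 6.4. The center values and per-unit steps are likewise read off Tables 6.3 and 6.4, and the function names are illustrative.

```python
from itertools import product

def ccrd_half_fraction(k=5):
    """Coded runs of a k-factor CCRD whose factorial part is a 2^(k-1) half fraction."""
    alpha = 2 ** ((k - 1) / 4)  # rotatable axial distance for the half fraction
    runs = []
    # Factorial points: full factorial in the first k-1 factors, last factor generated.
    for levels in product((-1, 1), repeat=k - 1):
        last = 1
        for value in levels:
            last *= value  # assumed generator: x_k = x_1 * x_2 * ... * x_(k-1)
        runs.append(list(levels) + [last])
    # Axial points: one factor at -alpha or +alpha, all others at the center.
    for axis in range(k):
        for sign in (-1, 1):
            point = [0.0] * k
            point[axis] = sign * alpha
            runs.append(point)
    runs.append([0.0] * k)  # center point
    return runs, alpha

# Center value and step per coded unit for L_G, T_SI, T_OX (nm) and Phi_n, Phi_p (eV).
CENTER = [20.0, 10.0, 1.0, 4.4, 4.8]
STEP = [1.0, 0.5, 0.05, 0.05, 0.05]

def decode(run):
    """Map a coded run to actual parameter values."""
    return [c + s * x for c, s, x in zip(CENTER, STEP, run)]

runs, alpha = ccrd_half_fraction()
print(len(runs), "runs, alpha =", alpha)
print("first run:", decode(runs[0]))
```

The run count is 16 + 10 + 1 = 27, and the first decoded run corresponds to all of the first four factors at their low level with the fifth at its high level.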
[Table 6.2: coded levels $-\alpha$, $-1$, $0$, $+1$, $+\alpha$]
In order to simulate the delay response to the process variation parameters, a CCRD was simulated for five different parameters: $L_G$, $T_{SI}$, $T_{OX}$, $\Phi_n$, and $\Phi_p$. A five-factor, five-coded-level CCRD was used to determine the delay response for the standard cells in the library (factors are essentially the process parameters in the present context). The total number of tests required for the five-factor design is 27. Table 6.3 shows each of the process parameters along with its level for CCRD. Table 6.4 shows the coded and actual values of variables for each of the experiments conducted in the design space.

Table 6.3: Process parameters along with their levels for CCRD

| Process parameter | Lowest ($-\alpha$) | Highest ($+\alpha$) |
|---|---|---|
| $L_G$ (nm) | 18 | 22 |
| $T_{SI}$ (nm) | 9.0 | 11.0 |
| $T_{OX}$ (nm) | 0.9 | 1.1 |
| $\Phi_n$ (eV) | 4.30 | 4.50 |
| $\Phi_p$ (eV) | 4.70 | 4.90 |

Table 6.4: Coded process parameters along with their actual values

| Code $L_G$ | Code $T_{SI}$ | Code $T_{OX}$ | Code $\Phi_n$ | Code $\Phi_p$ | $L_G$ | $T_{SI}$ | $T_{OX}$ | $\Phi_n$ | $\Phi_p$ |
|---|---|---|---|---|---|---|---|---|---|
| -1 | -1 | -1 | -1 | +1 | 19 | 9.5 | 0.95 | 4.35 | 4.85 |
| -1 | -1 | -1 | +1 | -1 | 19 | 9.5 | 0.95 | 4.45 | 4.75 |
| -1 | -1 | +1 | -1 | -1 | 19 | 9.5 | 1.05 | 4.35 | 4.75 |
| -1 | -1 | +1 | +1 | +1 | 19 | 9.5 | 1.05 | 4.45 | 4.85 |
| -1 | +1 | -1 | -1 | -1 | 19 | 10.5 | 0.95 | 4.35 | 4.75 |
| -1 | +1 | -1 | +1 | +1 | 19 | 10.5 | 0.95 | 4.45 | 4.85 |
| -1 | +1 | +1 | -1 | +1 | 19 | 10.5 | 1.05 | 4.35 | 4.85 |
| -1 | +1 | +1 | +1 | -1 | 19 | 10.5 | 1.05 | 4.45 | 4.75 |
| +1 | -1 | -1 | -1 | -1 | 21 | 9.5 | 0.95 | 4.35 | 4.75 |
| +1 | -1 | -1 | +1 | +1 | 21 | 9.5 | 0.95 | 4.45 | 4.85 |
| +1 | -1 | +1 | -1 | +1 | 21 | 9.5 | 1.05 | 4.35 | 4.85 |
| +1 | -1 | +1 | +1 | -1 | 21 | 9.5 | 1.05 | 4.45 | 4.75 |
| +1 | +1 | -1 | -1 | +1 | 21 | 10.5 | 0.95 | 4.35 | 4.85 |
| +1 | +1 | -1 | +1 | -1 | 21 | 10.5 | 0.95 | 4.45 | 4.75 |
| +1 | +1 | +1 | -1 | -1 | 21 | 10.5 | 1.05 | 4.35 | 4.75 |
| +1 | +1 | +1 | +1 | +1 | 21 | 10.5 | 1.05 | 4.45 | 4.85 |
| -2 | 0 | 0 | 0 | 0 | 18 | 10.0 | 1.00 | 4.40 | 4.80 |
| +2 | 0 | 0 | 0 | 0 | 22 | 10.0 | 1.00 | 4.40 | 4.80 |
| 0 | -2 | 0 | 0 | 0 | 20 | 9.0 | 1.00 | 4.40 | 4.80 |
| 0 | +2 | 0 | 0 | 0 | 20 | 11.0 | 1.00 | 4.40 | 4.80 |
| 0 | 0 | -2 | 0 | 0 | 20 | 10.0 | 0.90 | 4.40 | 4.80 |
| 0 | 0 | +2 | 0 | 0 | 20 | 10.0 | 1.10 | 4.40 | 4.80 |
| 0 | 0 | 0 | -2 | 0 | 20 | 10.0 | 1.00 | 4.30 | 4.80 |
| 0 | 0 | 0 | +2 | 0 | 20 | 10.0 | 1.00 | 4.50 | 4.80 |
| 0 | 0 | 0 | 0 | -2 | 20 | 10.0 | 1.00 | 4.40 | 4.70 |
| 0 | 0 | 0 | 0 | +2 | 20 | 10.0 | 1.00 | 4.40 | 4.90 |
| 0 | 0 | 0 | 0 | 0 | 20 | 10.0 | 1.00 | 4.40 | 4.80 |

We next investigated response surface models, which are essentially quadratic models of the predictor variables. The RSM involves a group of statistical techniques for empirical model building and model utilization. RSM models seek to relate a response variable to the levels of the predictors. The most widely used RSM models are low-order polynomials. Second-order polynomials have a general form given by

$$Y_i = \beta_0 + \beta_1 X_i + \beta_2 X_i X_j + \epsilon \tag{6.5}$$

Here, $Y_i$ is the response variable and the $X_i$ are the predictor variables. $\beta_0$, $\beta_1$, and $\beta_2$ are the coefficients of regression, and $\epsilon$ is the predictor noise. The predictor coefficients can be obtained by minimizing $\epsilon$ across the whole sample space. Regression analysis aims to minimize the residual sum of squares to calculate $\beta_0$, $\beta_1$ and $\beta_2$. $\hat{Y}_i = \beta_0 + \beta_1 X_i + \beta_2 X_i X_j$ is the predicted value of the response variable through the regression equation. We used MATLAB 7.0 to minimize the residual sum of squares and calculate the regression coefficients. Table 6.5 gives the regression coefficients and the average error between simulation data and model predictions for SG-INV. The average absolute fitting error, i.e., the error encountered in fitting the 27 CCRD simulations to the quadratic model, is 0.2%. In the table, x1, x2, x3, x4 and x5 correspond to $L_G$, $T_{SI}$, $T_{OX}$, $\Phi_n$, and $\Phi_p$, respectively.

Table 6.5: RSM delay model coefficients for SG-INV

| Variable | Coeff. value (1e-09) |
|---|---|
| const | 0.6456 |
| x1 | -0.0006 |
| x2 | -0.0039 |
| x3 | -0.0134 |
| x4 | -0.1694 |
| x5 | -0.0977 |
| x1.x2 | 0.0 |
| x1.x3 | -0.0002 |
| x1.x4 | 0.0001 |
| x1.x5 | -0.0002 |
| x2.x3 | 0.0001 |
| x2.x4 | 0.0001 |
| x2.x5 | 0.0001 |
| x3.x4 | 0.0024 |
| x3.x5 | -0.0019 |
| x4.x5 | -0.0025 |
| x1^2 | 0.0 |
| x2^2 | 0.0002 |
| x3^2 | 0.0086 |
| x4^2 | 0.0207 |
| x5^2 | 0.0114 |
| Error | 0.2% |
| $R^2$ | 0.997 |

In order to determine the strength of the relationship between the response and predictor variables, the coefficient of determination $R^2$ is used [108]. The expression for $R^2$ is given by ($\bar{Y}_i$ refers to the average across all $Y_i$s)

$$R^2 = 1 - \frac{\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2}{\sum_{i=1}^{n} (Y_i - \bar{Y}_i)^2} \tag{6.6}$$

$R^2$ is a statistic that gives us information on the goodness of fit of the model. A value of $R^2 = 1$ corresponds to a perfect fit between the regression line and the data. $R^2$ is the proportion of variation in the dependent variable $Y_i$ that can be explained by predictors $X_i$ in the regression model. Table 6.5 lists the value of $R^2$ for the quadratic delay model of SG-INV, which indicates a very good fit.
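The fitting step can be sketched with ordinary least squares. The dissertation used MATLAB 7.0 for this; the block below is a plain-Python stand-in that solves the normal equations by Gaussian elimination, which is adequate for small, well-conditioned designs but is not claimed to be the author's procedure. The two-variable design and the coefficients of the synthetic response are hypothetical. The term order (constant, linear, pairwise products, squares) follows the row order of the variable list in Table 6.6.

```python
from itertools import combinations

def quadratic_terms(x):
    """Regressors [1, x_i, x_i*x_j (i < j), x_i^2] of a full second-order model."""
    cross = [x[i] * x[j] for i, j in combinations(range(len(x)), 2)]
    return [1.0] + list(x) + cross + [v * v for v in x]

def solve(a, b):
    """Solve a*z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        partial = sum(m[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (m[r][n] - partial) / m[r][r]
    return z

def fit_rsm(points, responses):
    """Least-squares coefficients of the quadratic model via the normal equations."""
    rows = [quadratic_terms(p) for p in points]
    n = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    aty = [sum(r[i] * y for r, y in zip(rows, responses)) for i in range(n)]
    return solve(ata, aty)

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean = sum(observed) / len(observed)
    ss_res = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    ss_tot = sum((y - mean) ** 2 for y in observed)
    return 1.0 - ss_res / ss_tot

# Hypothetical two-variable central composite design (alpha = sqrt(2)).
A = 2 ** 0.5
design = [(-1, -1), (-1, 1), (1, -1), (1, 1), (-A, 0), (A, 0), (0, -A), (0, A), (0, 0)]
true_beta = [1.0, 2.0, -3.0, 0.5, 0.25, 0.75]  # made-up coefficients
response = [sum(b * t for b, t in zip(true_beta, quadratic_terms(p))) for p in design]

beta = fit_rsm(design, response)
fitted = [sum(b * t for b, t in zip(beta, quadratic_terms(p))) for p in design]
print("recovered coefficients:", [round(b, 6) for b in beta])
print("R^2 =", r_squared(response, fitted))
```

Because the synthetic response is noise-free, the fit recovers the chosen coefficients and $R^2$ is essentially 1. With simulated delays in place of the synthetic response, the same routine would yield a coefficient table of the kind shown for SG-INV.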
RSM delay models were similarly developed for SG-NAND, LP-INV and LP-NAND, as shown
in Table 6.6. R2 values above 0.99 show that the model fits the data quite well. The methodology is
very general and can be used to characterize other standard cells as well.
6.4
In this section, we show that gate delays obtained through the RSM delay models developed using CCRD simulations closely approximate delays obtained through MC simulations.
We first use TCAD-based MC simulations to obtain the golden intrinsic delay values of SG-INV at 1000 random $\Phi_n$ values. The rise and fall times of the input were set to 5 ps. Fig. 6.8 shows the probability density function (PDF) for the inverter delay. As can be seen, there is a good match between RSM and MC based delays, with an average absolute testing error of only 0.9%. Similar results were obtained when $L_G$, $T_{SI}$, $T_{OX}$ and $\Phi_p$ were assumed to have Gaussian distributions. The absolute testing error was 1.0%, 0.8%, 0.3% and 0.4%, respectively, for these parameters.
Table 6.7 shows the average error obtained by using our RSM based delay models for SG-INV, SG-NAND, LP-INV, and LP-NAND with $\Phi_n$, $\Phi_p$, $L_G$, $T_{SI}$ and $T_{OX}$ assumed to have Gaussian distributions. All the above process parameters were varied simultaneously in MC simulations. The average absolute error ranged from 0.2% to 4.2%. The average speedup of RSM models across all cells was 40×, relative to MC simulations.

Table 6.6: RSM delay model coefficients for SG-NAND, LP-INV and LP-NAND
[Rows: const, x1 to x5, the ten pairwise products x1.x2 to x4.x5, x1^2 to x5^2, Error, $R^2$]

Table 6.7: Average testing error for SG-INV, SG-NAND, LP-INV, and LP-NAND

| Parameters | SG-INV Error | SG-NAND Error | LP-INV Error | LP-NAND Error |
|---|---|---|---|---|
| $\Phi_n$, $\Phi_p$, $L_G$, $T_{SI}$, $T_{OX}$ | 2.1% | 0.2% | 4.2% | 1.2% |
Figure 6.8: MC and RSM based delay distributions for SG-INV with $\Phi_n$ assumed to have a Gaussian distribution
$$Y_i = (\beta_0 + \beta_1 X_i + \beta_2 X_i X_j) \, \beta_T \tag{6.7}$$

Fig. 6.9(a) shows the SG-INV delay at varying temperatures for two supply voltages, 0.5 V and 1.0 V. At 1.0 V (0.5 V), the delay increases (decreases) with increasing temperature. These simulation results further support the results reported in Fig. 6.1. The explanation of this behavior was given in Section 6.2.1. $\beta_T$ can be calculated by fitting polynomials to the delay-temperature curve. For example, in Fig. 6.9(a), delay is linearly (quartically) dependent on $T$ at 1.0 V (0.5 V).

[Figure 6.9: Delay (ps) versus $T_1 = T/300$ (K) at 0.5 V and 1.0 V: (a) SG-INV, (b) LP-INV]

Fig. 6.9(b) shows the delay-temperature curves for LP-INV. The decrease in delay at 0.5 V with increasing temperature for LP-INV is much sharper than that observed for SG-INV. The decrease in delay for LP-INV (SG-INV) is 29.7% (0.3%) when the temperature changes from 300 K to 390 K. This is because of the sharp increase in $I_{ds}$ for LP-INV at 0.5 V (as explained in Section 6.2.1). However, at 1.0 V, the delay increases slightly with an increase in temperature.
Similarly, $\beta_T$ was obtained for SG-NAND and LP-NAND at the two voltages. The delay trends observed were similar to the delay trends of SG-INV and LP-INV.
Chapter 7
while the hole mobility is highest along the <110> channel orientation. Thus, using <110> transistors in the pull-up network of the logic gate and <100> transistors in its pull-down network can lead to better delay.
FinFETs still suffer from the effects of process variations due to factors such as line edge roughness and temperature variations. However, they do not suffer from the random dopant fluctuation effect encountered in bulk transistors, since their body is undoped. Lithographic variations can lead to deviation in FinFET parameters, such as $L_G$, $T_{SI}$ and $T_{OX}$. Further, these variations can be intra-die or inter-die in nature. $\Phi_G$ is heavily dependent on the processing temperature and, hence, temperature variations during processing can lead to deviations in the value of $\Phi_G$. Since both leakage and delay heavily depend on the above process parameters, it is extremely important to characterize variations in FinFET delay/leakage with variations in these parameters.
In Chapter 1, we outlined the obstacles in the scaling of conventional bulk MOSFETs. We
discussed several short-channel effects and how DGFETs can circumvent such problems. We also
discussed the different kinds of DGFETs proposed in the literature. Thereafter, we systematically
showed why FinFETs have emerged dominant among DGFETs.
In Chapter 2, we detailed related work in the field of FinFETs. We first discussed various
lithographic techniques used to fabricate FinFETs. We pointed out that spacer lithography provides
double the fin density when compared to optical lithographic techniques. Further, spacer lithography
produces uniform fins, which enables better short-channel control. Thereafter, we reviewed work
done in the area of FinFET logic synthesis. We discussed various innovative FinFET standard cells
along with logic synthesis algorithms specifically tailored to FinFETs. We also reviewed work done
in the area of FinFET SRAMs, specifically, how the dual-gate structure of FinFETs can be exploited
to improve various SRAM metrics, such as the read margin, write margin and cell stability. We also
studied how metal gate workfunction engineering can serve as a substitute for sizing in SRAM cells.
Finally, we reviewed work done in the area of FinFET process variations.
In Chapter 3, we proposed a low-power FinFET circuit synthesis methodology using multiple
supply and threshold voltages. We proposed a mechanism called TCMS for improving the power
efficiency of FinFET circuits. This scheme represents a significant divergence from conventional
multiple-supply voltage schemes. It also obviates the need for voltage level-converters. We employed accurate delay and power estimates using table look-up methods based on HSPICE simulations for supply voltage and threshold voltage optimization. Experimental results demonstrate
that TCMS can provide power savings of 67.6% and device area savings of 65.2% under relaxed
delay constraints. We also proposed two variants of TCMS that yield similar benefits. We compared our scheme to ECVS, a popular dual-Vdd scheme presented in the literature. ECVS makes
use of voltage level-converters. Even when it is assumed that these level-converters have zero delay,
thus significantly favoring ECVS in time-constrained power optimization, TCMS still outperforms
ECVS.
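The table look-up estimation mentioned above can be illustrated with a minimal sketch. It assumes a two-dimensional delay table indexed by input slew and load capacitance and uses bilinear interpolation between characterized points. The axes, the function name bilinear_lookup, and all numbers are hypothetical placeholders, not the tables or data used in this work.

```python
from bisect import bisect_right

def bilinear_lookup(xs, ys, table, x, y):
    """Interpolate table[i][j] (defined at xs[i], ys[j]) at the point (x, y)."""
    def bracket(axis, v):
        # Clamp to the table range, then locate the enclosing cell.
        v = min(max(v, axis[0]), axis[-1])
        i = min(bisect_right(axis, v) - 1, len(axis) - 2)
        t = (v - axis[i]) / (axis[i + 1] - axis[i])
        return i, t

    i, tx = bracket(xs, x)
    j, ty = bracket(ys, y)
    lo = table[i][j] * (1 - ty) + table[i][j + 1] * ty
    hi = table[i + 1][j] * (1 - ty) + table[i + 1][j + 1] * ty
    return lo * (1 - tx) + hi * tx

# Hypothetical characterization grid: input slew (ps) x load capacitance (fF).
SLEWS = [10.0, 20.0, 40.0]
LOADS = [1.0, 2.0, 4.0]
# Hypothetical delays in ps, one row per slew value (illustrative only).
DELAY = [
    [12.0, 18.0, 30.0],
    [14.0, 20.0, 32.0],
    [18.0, 24.0, 36.0],
]

# Query a point that lies between characterized grid points.
delay_ps = bilinear_lookup(SLEWS, LOADS, DELAY, 15.0, 1.5)
print(delay_ps)
```

Queries outside the characterized range are clamped to the nearest table edge rather than extrapolated, which is one possible design choice; a production flow might instead flag such queries.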
In Chapter 4, we proposed a low-power FinFET circuit synthesis methodology using surface
orientation optimization. FinFETs with channel surface along the <110> plane can be easily fabricated by rotating the fins by 45° from the <100> plane. By designing logic gates that have
pFinFETs in the <110> plane and nFinFETs in the <100> plane, the gate delay can be reduced by
as much as 14%, compared to the conventional <100> logic gates. The delay reduction depends
upon the type of logic gate, dielectric constant of the oxide, and the technology node. The reduction in delay can be traded off for reduced power in FinFET circuits. We proposed a low-power
FinFET-based circuit synthesis methodology based on surface orientation optimization. We studied
various logic design styles, which depend on different FinFET channel orientations, for synthesizing low-power circuits. We used BSIM, a process/physics based double-gate model in HSPICE, to
derive accurate delay and power estimates. We designed layouts of standard library cells containing
FinFETs in different orientations to obtain an accurate area estimate for the low-power synthesized
netlists after place-and-route. We used a linear programming based optimization methodology that
gives power-optimized netlists, consisting of oriented gates, at tight delay constraints. Experimental
results demonstrated the efficacy of our scheme.
In Chapter 5, we proposed a die-level leakage power analysis algorithm for FinFET circuits
under process variations. We modeled the leakage probability density function in SG-, IG/LP-, and
MT-mode FinFET standard logic cells, and examined the leakage trade-offs in benchmark circuits
synthesized using combinations of SG-, LP-, and MT-mode logic cells under the effect of process
variations. Using quasi-Monte Carlo mixed-mode device simulations in Sentaurus TCAD, we developed simple macromodels to capture the physical effects that influence the leakage spread in
SG- and IG-mode FinFET devices, and extended them to stacked devices in NAND/NOR gates. We
also implemented a methodology to obtain the overall leakage current distribution for large circuits
(synthesized using SG/LP/MT-mode logic cells) using Latin hypercube sampling, considering spatial correlation on a quad-tree based grid. Results indicated that, starting from a 100% SG-mode
circuit, the leakage spread/yield point can be improved considerably by suitably introducing LP-mode and MT-mode gates at iso-delay. We also showed that increasing the fraction of LP/MT-mode
gates (to reduce the mean and variance in leakage) in an SG-mode circuit, by permitting a delay
slack, yields diminishing returns. Mixing LP- and MT-mode gates with SG-mode gates appeared
to be a promising synthesis strategy that can leverage the leakage trade-offs offered by FinFET
standard cells.
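The role of Latin hypercube sampling in estimating a leakage distribution can be sketched as follows. This is a simplified illustration under stated assumptions: each gate's leakage follows a hypothetical exponential dependence on a standard-normal gate-length deviation, and the gates are treated as independent, so the quad-tree spatial correlation described above is deliberately omitted. The function names and constants are placeholders, not the macromodels developed in this work.

```python
import math
import random
from statistics import NormalDist, mean

def latin_hypercube(n_samples, n_dims, rng):
    """Return n_samples points in [0, 1)^n_dims, one per stratum in every dimension."""
    columns = []
    for _ in range(n_dims):
        # One uniform draw inside each of the n_samples equal-width strata.
        col = [(k + rng.random()) / n_samples for k in range(n_samples)]
        rng.shuffle(col)  # decouple the strata across dimensions
        columns.append(col)
    return [tuple(col[i] for col in columns) for i in range(n_samples)]

def gate_leakage(dl_sigma, i0=1.0, sensitivity=0.5):
    """Hypothetical model: leakage rises exponentially as gate length shrinks."""
    return i0 * math.exp(-sensitivity * dl_sigma)

rng = random.Random(0)          # fixed seed so the sketch is repeatable
std_normal = NormalDist()       # zero mean, unit standard deviation
n_gates = 4                     # hypothetical circuit size
points = latin_hypercube(200, n_gates, rng)

# Map each uniform coordinate to a normal deviation and sum leakage over gates.
totals = [
    sum(gate_leakage(std_normal.inv_cdf(max(u, 1e-12))) for u in point)
    for point in points
]
print(mean(totals), min(totals), max(totals))
```

Because every dimension is stratified, the sample covers the tails of each parameter distribution with far fewer points than plain Monte Carlo sampling would typically need for a comparable estimate of the mean.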
In Chapter 6, we proposed a statistical delay characterization of FinFET standard cells based on
design of experiments and response surface methodology (RSM). We statistically characterized
the delay of FinFET standard cells under spatial and environmental variations, using central composite rotatable design (CCRD) based on RSM. We identified the most critical parameters that affect
timing arcs of logic cells under lithographic process variations. We also showed that the delay trend
based on variations in a key process parameter is completely opposite of what one would expect in
conventional CMOS technology. These results formed the foundation of variation-aware (environmental and lithographic) delay models for FinFET standard cells (NAND and INV) implemented
in different logic styles, e.g., SG and LP. Results showed that the delays obtained from the RSM models
developed for various standard cells are in close agreement with the delays obtained from Monte
Carlo simulations of the logic cells.
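To make the structure of a central composite rotatable design concrete, the sketch below enumerates its coded design points for k factors: the 2^k factorial corners at ±1, the 2k axial points at ±α with α = (2^k)^(1/4), which is the standard rotatability condition, and one or more center points. The choice of three factors, their nominal values, and their step sizes are hypothetical, and the subsequent response surface fit is not shown.

```python
from itertools import product

def ccrd_points(k, n_center=1):
    """Coded design points of a central composite rotatable design for k factors."""
    alpha = (2 ** k) ** 0.25  # axial distance that makes the design rotatable
    factorial = [tuple(float(v) for v in p) for p in product((-1, 1), repeat=k)]
    axial = []
    for i in range(k):
        for sign in (-1.0, 1.0):
            pt = [0.0] * k
            pt[i] = sign * alpha
            axial.append(tuple(pt))
    center = [tuple([0.0] * k)] * n_center
    return factorial + axial + center

def to_physical(point, nominal, step):
    """Map coded levels to physical values: nominal + level * step per factor."""
    return tuple(n + c * s for c, n, s in zip(point, nominal, step))

# Hypothetical factors: gate length (nm), fin thickness (nm), supply voltage (V).
NOMINAL = (32.0, 10.0, 1.0)
STEP = (2.0, 1.0, 0.05)

# Each run is one simulation setting at which the cell delay would be measured.
runs = [to_physical(p, NOMINAL, STEP) for p in ccrd_points(3)]
for run in runs:
    print(run)
```

For three factors this yields 8 + 6 + 1 = 15 runs, compared with 27 for a full three-level factorial, which is the main appeal of the design when each run is an expensive circuit simulation.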
In summary, in this dissertation, we discussed some innovative low-power synthesis algorithms/tools
that exploit the unique characteristics of FinFETs. Further, we proposed a variation-aware synthesis algorithm that takes into account the subthreshold leakage of logic gates. We also proposed
a methodology to calculate the probability density function of die-level leakage power of FinFET
circuits. We statistically characterized the delay of various FinFET standard cells using CCRD.
There are several areas related to the present work that can be explored further in the future:
In Chapter 5, we proposed a method for calculating die-level leakage power of FinFET circuits under process variations. However, the scheme does not take into account spatial or
temporal variations in the die temperature. Since FinFETs are likely to suffer from the ill
effects of self-heating, it is important to analyze chip-level leakage distribution of FinFET
Bibliography
[1] E. J. Nowak, I. Aller, T. Ludwig, K. Kim, R. V. Joshi, C.-T. Chuang, K. Bernstein, and
R. Puri, Turning silicon on its edge, IEEE Circuits and Devices Magazine, vol. 20, no. 1,
pp. 2031, Jan.-Feb. 2004.
[2] H.-S. P. Wong, K. K. Chan, and Y. Taur, Self-aligned (top and bottom) double-gate MOSFET with a 25 nm thick silicon channel, in Proc. Int. Electronic Device Mtg., Dec. 1997, pp.
427430.
[3] T.-J. King, FinFETs for nanoscale CMOS digital integrated circuits, in Proc. Int. Conf.
Computer-Aided Design, Nov. 2005, pp. 207210.
[4] 2007
International
Technology
Roadmap
for
Semiconductors,
http://www.itrs.net/Links/2007ITRS/Home2007.htm.
[5] Y.-K. Choi, T.-J. King, and C. Hu, Nanoscale CMOS spacer FinFET for the terabit era,
IEEE Electronic Device Lett., vol. 23, no. 1, pp. 2527, Jan. 2002.
[6] A. Muttreja, N. Agarwal, and N. K. Jha, CMOS logic design with independent gate FinFETs, in Proc. Int. Conf. Computer Design, Oct. 2007, pp. 560567.
[7] A. N. Bhoj and N. K. Jha, Pragmatic design of gated-diode FinFET DRAMs, in Proc. Int.
Conf. Computer Design, Oct. 2009, pp. 747751.
[8] K. Bernstein, C.-T. Chuang, R. V. Joshi, and R. Puri, Design and CAD challenges in sub90nm CMOS technologies, in Proc. Int. Conf. Computer-Aided Design, Nov. 2003, pp. 129
136.
106
[9] L. Chang, M. Ieong, and M. Yang, CMOS circuit performance enhancement by surface
orientation optimization, IEEE Trans. Electron Devices, vol. 51, pp. 16211627, Oct. 2004.
[10] S. Ganapath et al., Circuit propagation delay estimation through multivariate regressionbased modeling under spatio-temporal variability, in Proc. Design Automation & Test Europe Conf., Mar. 2010, pp. 417422.
[11] TSMC, http://www.eetimes.com/electronics-news/4213622/TSMC-to-make-FinFETs-in450-mm-fab.
[12] J.-H. Yang, Y.-S. Jin, H.-R. Lee, K.-S. Rha, J.-A. Choi, S.-K. Bae, S. Maeda, Y.-W. Kim,
and K.-P. Suh, Fully working 1.25m2 6T-SRAM cell with 45nm gate length triple gate
transistors, in Proc. Int. Electronic Device Mtg., Dec. 2003, pp. 2.1.12.1.4.
[13] 22nm FinFET SRAM, http://www.eetimes.com/electronics-news/4199830/IBM-partnersto-report-22-nm-FinFET-SRAM.
[14] Infineon FinFET chip, http://www.dailytech.com/Infineon+Tests+3D/article5208.htm.
[15] B. Doyle, B. Boyanov, S. Datta, M. Doczy, S. Hareland, B. Jin, J. Kavalieros, T. Linton,
R. Rios, and R. Chau, Tri-gate fully-depleted CMOS transistors: Fabrication, design and
layout, in Proc. Int. Symp. VLSI Technology, June 2003, pp. 133134.
[16] R. Dennard, F. Gaensslen, V. Rideout, E. Bassous, and A. LeBlanc, Design of ion-implanted
MOSFETs with very small physical dimensions, IEEE J. Solid-State Circuits, vol. 9, no. 5,
pp. 256268, Oct. 1974.
[17] Y.-K. Choi, N. Lindert, P. Xuan, S. Tang, D. Ha, E. Anderson, T.-J. King, J. Bokor, and C. Hu,
Sub-20 nm CMOS FinFET technologies, in Proc. Int. Electronic Device Mtg., 2001, pp.
19.1.119.1.4.
[18] X. Huang, W.-C. Lee, C. Kuo, D. Hisamoto, L. Chang, J. Kedzierski, E. Anderson,
H. Takeuchi, Y.-K. Choi, K. Asano, V. Subramanian, T.-J. King, J. Bokor, and C. Hu, Sub50nm FinFET: PMOS, in Proc. Int. Electronic Device Mtg., 1999, pp. 6770.
107
[19] D. Frank, Y. Taur, and H.-S. P. Wong, Future prospects for Si CMOS technology, in Proc.
Device Research Conf., 1999, pp. 1821.
[20] Y.-K. Choi, D. Ha, T.-J. King, and C. Hu, Threshold voltage shift by quantum confinement
in ultra-thin body device, in Proc. Device Research Conf., 2001, pp. 8586.
[21] A. Datta, A. Goel, R. T. Cakici, H. Mahmoodi, D. Lakshmanan, and K. Roy, Modeling
and circuit synthesis for independently controlled double gate FinFET devices, IEEE Trans.
Computer-Aided Design, vol. 26, no. 11, pp. 19571966, Nov. 2007.
[22] J. Ouyang and Y. Xie, Power optimization for FinFET based circuits using genetic algorithms, in Proc. IEEE Int. SOC Conf., Sept. 2008, pp. 211214.
[23] R. A. Thakker, C. Sathe, A. B. Sachid, M. Shojaei-Baghini, V. R. Rao, and M. B. Patil,
A novel table-based approach for design of FinFET circuits, IEEE Trans. Computer-Aided
Design, vol. 28, no. 7, pp. 10611070, July 2009.
[24] T. Ludwig, I. Aller, V. Gernhoefer, J. Keinert, E. Nowak, R. Joshi, A. Mueller, and
S. Tomaschko, FinFET technology for future microprocessors, in Proc. Int. SOI Conf.,
Oct. 2003, pp. 3334.
[25] K. Anil, K. Henson, S. Biesemans, and N. Collaert, Layout density analysis of FinFETs, in
Proc. European Conf., Solid-State Device Research, 2003, pp. 139142.
[26] M. Alioto, Analysis and evaluation of layout density of FinFET logic gates, in Proc. Int.
Conf. Microelectronics, Dec. 2009, pp. 106109.
[27] , Analysis of layout density in FinFET standard cells and impact of fin technology, in
Proc. Int. Symp. Circuits & Systems, May/June 2010, pp. 32043207.
[28] , Comparative evaluation of layout density in 3T, 4T, and MT FinFET standard cells,
IEEE Trans. VLSI Systems, vol. 19, no. 5, pp. 751762, May 2011.
[29] R. Joshi, K. Kim, and R. Kanj, FinFET SRAM design, in Proc. Int. Conf. VLSI Design,
Jan. 2010, pp. 440445.
108
[40] A. Agarwal, D. Blaauw, V. Zolotov, and S. Vrudhula, Computation and refinement of statistical bounds on circuit delay, in Proc. Design Automation Conf., June 2003, pp. 348353.
[41] L. Scheffer, Explicit computation of performance as a function of process variation, in
Proc. ACM/IEEE Int. Wkshp. on Timing Issues in the Specification and Synthesis of Digital
Systems, 2002, pp. 18.
[42] A. Agarwal et al., Statistical timing analysis for intra-die process variations with spatial
correlation, in Proc. Int. Conf. Computer-Aided Design, Nov. 2003, pp. 900907.
[43] H. Chang and S. Sapatnekar, Statistical timing analysis considering spatial correlations using a single PERT-like traversal, in Proc. Int. Conf. Computer-Aided Design, Nov. 2003, pp.
621625.
[44] H. Chang, V. Zolotov, S. Narayan, and C. Visweswariah, Parameterized block-based statistical timing analysis with non-Gaussian parameters, nonlinear delay functions, in Proc.
Design Automation Conf., June 2005, pp. 7176.
[45] A. Srivastava, R. Bai, D. Blaauw, and D. Sylvester, Modeling and analysis of leakage power
considering within-die process variations, in Proc. Int. Symp. Low Power Electronics &
Design, 2002, pp. 6467.
[46] R. Rao, A. Srivastava, D. Blaauw, and D. Sylvester, Statistical estimation of leakage current
considering inter- and intra-die process variation, in Proc. Int. Symp. Low Power Electronics
& Design, Aug. 2003, pp. 8489.
[47] A. Agarwal, K. Kang, and K. Roy, Accurate estimation and modeling of total chip leakage considering inter-and intra-die process variations, in Proc. Int. Conf. Computer-Aided
Design, Nov. 2005, pp. 736741.
[48] V. W. S. Zhang and K. Banerjee, A probabilistic framework to estimate full-chip subthreshold leakage power distribution considering within-die and die-to-die P-V-T variations, in
Proc. Int. Symp. Low Power Electronics & Design, 2004, pp. 156161.
110
[49] H. Dadgour, S.-C. Lin, and K. Banerjee, A statistical framework for estimation of fullchip leakage-power distribution under parameter variations, IEEE Trans. Electron Devices,
vol. 54, no. 11, pp. 29302945, Nov. 2007.
[50] J. Gu, J. Keane, S. Sapatnekar, and C. H. Kim, Statistical leakage estimation of double gate
FinFET devices considering the width quantization property, IEEE Trans. VLSI Systems,
vol. 16, pp. 206209, Feb. 2008.
[51] J. H. Choi, J. Murthy, and K. Roy, The effect of process variation on device temperatures in
FinFET circuits, in Proc. Int. Conf. Computer-Aided Design, Nov. 2007, pp. 747751.
[52] H. Khan, D. Mamaluy, and D. Vasileska, Simulation of the impact of process variation on
the optimized 10-nm FinFET, IEEE Trans. Electron Devices, vol. 55, no. 8, pp. 21342141,
Aug. 2008.
[53] S. Xiong and J. Bokor, Sensitivity of double-gate and FinFET devices to process variations,
IEEE Trans. Electron Devices, vol. 50, pp. 22552261, Nov. 2003.
[54] S. Rasouli, K. Endo, and K. Banerjee, Variability analysis of FinFET-based devices and
circuits considering electrical confinement and width quantization, in Proc. Int. Conf.
Computer-Aided Design, Nov. 2009, pp. 505512.
[55] B. Yu et al., FinFET scaling to 10nm gate length, in Proc. Int. Electronic Device Mtg.,
2002, pp. 251254.
[56] B. Swahn and S. Hassoun, Gate sizing: FinFETs vs. 32nm bulk MOSFETs, in Proc. Design
Automation Conf., July 2006, pp. 528531.
[57] K. Usami and M. Horowitz, Clustered voltage scaling technique for low-power design, in
Proc. Int. Symp. Low Power Electronics & Design, Aug. 1995, pp. 38.
[58] K. Usami, M. Igarashi, F. Minami, T. Ishikawa, M. Kanzawa, M. Ichida, and K. Nogami,
Automated low-power technique exploiting multiple supply voltages applied to a media
processor, IEEE J. Solid-State Circuits, vol. 33, no. 3, pp. 463472, Mar. 1998.
111
[59] K. Roy, L. Wei, and Z. Chen, Multiple-Vdd and multiple-Vth CMOS (MVCMOS) for lowpower applications, in Proc. Int. Symp. Computer Architecture, Oct. 1999, pp. 366370.
[60] P. Mishra, A. Muttreja, and N. K. Jha, Evaluation of multiple supply and threshold voltages
for low-power circuit synthesis, in Proc. Int. Symp. Nanoscale Architectures, June 2008, pp.
7784.
[61] , Low-power FinFET circuit synthesis using multiple supply and threshold voltages,
ACM J. Emerging Technologies in Computing Systems, July 2009.
[62] H. Mahmoodi, S. Mukhopadhyay, and K. Roy, High performance and low power domino
logic using independent gate control in double-gate SOI MOSFETs, in Proc. Int. SOI Conf.,
Oct. 2004, pp. 6768.
[63] L. Wei, Z. Chen, and K. Roy, Double gate dynamic threshold voltage (DGDT) SOI MOSFETs for low power high performance designs, in Proc. Int. SOI Conf., Oct. 1997, pp. 8283.
[64] P. Beckett, Low-power circuits using dynamic threshold voltage devices, in Proc. Great
Lakes Symp. VLSI, Apr. 2005, pp. 213216.
[65] M.-H. Chiang, K. Kim, C. Tretz, and C.-T. Chuang, Novel high-density low-power logic
circuit techniques using DG devices, IEEE Electronic Device Lett., vol. 52, no. 10, pp.
23392342, Oct. 2005.
[66] W. Zhang, J. G. Fossum, L. Mathew, and Y. Du, Physical insights regarding design and
performance of independent-gate FinFETs, IEEE Electronic Device Lett., vol. 52, no. 10,
pp. 21892206, Oct. 2005.
[67] T. Cakici, H. Mahmoodi, S. Mukhopadhyay, and K. Roy, Independent gate skewed logic in
double-gate SOI technology, in Proc. Int. SOI Conf., Oct. 2005, pp. 8384.
[68] A. Muttreja, P. Mishra, and N. K. Jha, Threshold voltage control through multiple supply
voltages for power-efficient FinFET interconnects, in Proc. Int. Conf. VLSI Design, Jan.
2008.
112
[69] V. P. Trivedi, J. G. Fossum, and W. Zhang, Threshold voltage and bulk inversion effects in
nonclassical CMOS devices with undoped ultra-thin bodies, Solid-State Electronics, vol. 1,
pp. 170178, Dec. 2007.
[70] M. Popovich, E. G. Friedman, M. Sotman, and A. Kolodny, On-chip power distribution
grids with multiple supply voltages for high performance integrated circuits, in Proc. Great
Lakes Symp. VLSI, Apr. 2005, pp. 27.
[71] W. Zhao and Y. Cao, New generation of predictive technology model for sub-45nm design exploration, in Proc. Int. Symp. Quality of Electronic Design, May 2006, pp. 585590,
http://www.eas.asu.edu/ ptm.
[72] , Predictive technology model for nano-CMOS design exploration, ACM J. Emerging
Technologies in Computing Systems, vol. 3, no. 1, pp. 117, Apr. 2007.
[73] F. Wang, Y. Xie, K. Bernstein, and Y. Luo, Dependability analysis of FinFET circuits, in
Proc. Symp. Emerging VLSI Technologies and Architectures, Mar. 2006, pp. 399404.
[74] T. Sairam, W. Zhao, and Y. Cao, Optimizing FinFET technology for high-speed and lowpower design, in Proc. Great Lakes Symp. VLSI, Mar. 2007, pp. 7377.
[75] A. U. Diril, Y. S. Dhillon, A. Chatterjee, and A. D. Singh, Level-shifter free design of low
power dual supply voltage CMOS circuits using dual threshold voltages, IEEE Trans. VLSI
Systems, vol. 13, no. 9, pp. 11031107, Sept. 2005.
[76] L. Chang, S. Tang, T.-J. King, J. Bokor, and C. Hu, Gate length scaling and threshold
voltage control of double-gate MOSFETs, in Proc. Int. Electronic Device Mtg., Dec. 2000,
pp. 719722.
[77] D. Sylvester and K. Keutzer, Getting to the bottom of deep submicron, in Proc. Int. Conf.
Computer-Aided Design, Nov. 1998, pp. 203211.
[78] D. Chinnery and K. Keutzer, Linear programming for sizing, Vdd and Vth assignment, in
Proc. Int. Symp. Low Power Electronics & Design, Aug. 2005, pp. 149154.
113
[79] A. Srivastava and D. Sylvester, Minimizing total power by simultaneous Vdd /Vth assignment, in Proc. Asia South Pacific Design Automation Conf., Jan. 2003, pp. 400403.
[80] L. Su et al., Measurement and modelling of self-heating in SOI nMOSFETs, IEEE Electronic Device Lett., vol. 41, pp. 6975, Jan. 1994.
[81] S. Gangwal, S. Mukopadhyay, and K. Roy, Optimization for surface orientation for highperformance, low-power and robust FinFET SRAM, in Proc. Custom Integrated Circuits
Conf., Sept. 2006, pp. 433436.
[82] M. V. Dunga et al., BSIM-MG: A versatile multi-gate FET model for mixed-signal design,
in Proc. Int. Symp. VLSI Technology, June 2007, pp. 6061.
[83] D. D. Lu, M. V. Dunga, C. Lin, A. Niknejad, and C. Hu, A multi-gate MOSFET compact
model featuring independent gate-operation, in Proc. Int. Electronic Device Mtg., Dec. 2007,
pp. 565568.
[84] P. Mishra and N. K. Jha, Low-power FinFET circuit synthesis using surface orientation
optimization, in Proc. Design Automation & Test Europe Conf., Mar. 2010.
[85] J. K. Ousterhout, C. T. Hamachi, R. N. Mayo, W. S. Scott, and G. S. Taylor, Magic: A VLSI
layout system, in Proc. Design Automation Conf., June 1984, pp. 152159.
[86] C. Sechen and A. Sangiovanni-Vincentelli, The Timberwolf placement and routing package, in Proc. Custom Integrated Circuits Conf., May 1984, pp. 522527.
[87] J. Colinge, FinFETs and Other Multi-gate Transistors. Springer, New York, 2008.
[88] P. Mishra, A. Bhoj, and N. K. Jha, Die level leakage power analysis of FinFET circuits
considering process variations, in Proc. Int. Symp. Quality of Electronic Design, Mar. 2010,
pp. 347355.
[89] M. Agostinelli, M. Alioto, D. Esseni, and L. Selmi, Design and evaluation of mixed 3T4T FinFET stacks for leakage reduction, in Proc. Int. Wkshp. Power and Timing Modeling,
Optimization, and Simulation, Sept. 2008.
114
[90] A. Kumar, B. A. Minch, and S. Tiwari, Low voltage and performance tunable CMOS circuit
design using independently driven double gate MOSFETs, in Proc. Int. SOI Conf., Oct.
2004.
[91] S. A. Tawfik and V. Kursun, High speed FinFET domino logic circuits using independent
gate-biased double-gate keepers providing dynamically adjusted immunity to noise, in Proc.
Int. Conf. Microelectronics, Dec. 2007, pp. 175178.
[92] H. Ananthan and K. Roy, A fully physical model for leakage distribution under process
variations in nanoscale double-gate CMOS, in Proc. Design Automation Conf., July 2006,
pp. 413419.
[93] R. Rao, A. Srivastava, D. Blaauw, and D. Sylvester, Statistical estimation of leakage current
considering inter-and intra-die process variation, in Proc. Int. Symp. Low Power Electronics
& Design, Aug. 2003, pp. 8489.
[94] H. Chang and S. S. Sapatnekar, Full-chip analysis of leakage power under process variations,
including spatial correlations, in Proc. Design Automation Conf., June 2005, pp. 523528.
[95] J. Fossum et al., A process-physics based compact model for nanoclassical CMOS device
and circuit design, Solid-State Electronics, vol. 48, pp. 919926, June 2004.
[96] Sentaurus TCAD, HSPICE, Design Compiler manuals. http://www.synopsys.com.
[97] A. Singhee and R. A. Rutenbar, From finance to flip flops: A study of fast quasi-Monte
Carlo methods from computational finance applied to statistical circuit analysis, in Proc.
Int. Symp. Quality of Electronic Design, Mar. 2007, pp. 685692.
[98] Y. Taur et al., A continuous, analytic drain current model for DG MOSFETs, IEEE Electronic Device Lett., vol. 25, no. 2, pp. 107109, Feb. 2004.
[99] W. Zhang, J. G. Fossum, L. Mathew, and Y. Du, Physical insights regarding design and performance of independent-gate FinFETs, IEEE Trans. Electron Devices, vol. 52, pp. 2198
2206, Oct. 2005.
115
[100] S. Bhardwaj, S. Vrudhula, P. Ghanta, and Y. Cao, Modeling of intra-die process variations
for accurate analysis and optimization of nano-scale circuits, in Proc. Design Automation
Conf., July 2006, pp. 791796.
[101] J. Xiong, V. Zolotov, and L. He, Robust extraction of spatial correlation, in Proc. Int. Symp.
Quality of Electronic Design, Aug. 2007, pp. 619631.
[102] N. Higham, Computing the nearest correlation matrix - a problem from finance, IMA Journal of Numerical Analysis, pp. 329343, July 2002.
[103] J. Rabaey, A. Chandrakashan, and B. Nikolic, Digital Integrated Circuits, 2nd ed.
Prentice
116