

Prolegomena
Department Editors: Kevin Rudd and Kevin Skadron

Can Subthreshold and Near-Threshold Circuits Go Mainstream?
BENTON H. CALHOUN
University of Virginia

DAVID BROOKS
Harvard University

Published by the IEEE Computer Society. 0272-1732/10/$26.00 © 2010 IEEE

Editor's Note
Recent research has shown the potential benefits of subthreshold or near-threshold operation, which gives up a substantial degree of speed in order to reduce energy per operation. This is an excellent trade-off for many tasks, such as cyberphysical systems. This prolegomenon summarizes the benefits and challenges of subthreshold or near-threshold operation.

Many research teams have demonstrated the ability to operate digital complementary metal-oxide semiconductor (CMOS) chips in the subthreshold or near-threshold region in recent years, but no commercial applications have yet adopted this approach. Subthreshold operation refers to using a supply voltage (VDD) that is less than a single transistor's threshold voltage (VT). Near-threshold operation refers to using a VDD that is close to (that is, slightly above or below) VT. Lowering the supply voltage reduces power consumption and also increases energy efficiency by lowering the energy consumed per operation. Subthreshold operation involves using a supply voltage in the 0.2 V to 0.5 V range. This is substantially lower than nominal supply voltages, which fall in the 0.9 V to 1.2 V range for modern process technologies. Lowering the voltage so far can reduce energy consumption by more than 10 times because active energy is proportional to VDD².

By the traditional definition (for VGS < VT, ID = 0, where VGS is the MOSFET gate-to-source voltage and ID is the drain current), transistors in a digital circuit with a subthreshold VDD are always off. However, the transistor drain current does not immediately fall to zero when VDD drops below VT. Instead, it decreases exponentially (ID ∝ exp(VGS − VT)) in the subthreshold region. Thus, a nonzero gate voltage that is less than VT will still produce a drain current that is larger than the off current (IOFF = ID when VGS = 0). This positive ratio of ION/IOFF lets subthreshold digital gates behave statically in a similar fashion to strong inversion. Their transient behavior is much slower because the on-current in the subthreshold region is orders of magnitude less than in strong inversion.

Figure 1 shows circuit delay as a function of voltage. The exponential increase in delay resulting from decaying ION in subthreshold is obvious. The figure also shows the benefit of lower energy during subthreshold operation. The total energy per operation decreases quadratically (because EACTIVE ∝ VDD²) until leakage energy becomes dominant (due to the longer delay) and creates a minimum energy point. The voltage that minimizes energy per operation occurs in the subthreshold region for most digital designs, and, as we discuss later, minimizing energy consumption is the primary goal for a broad class of severely energy-constrained applications.

[Figure 1 plot: normalized energy (top panel, ticks 10⁰ to 10¹) and normalized delay (bottom panel, ticks 10⁰ to 10²) versus supply voltage VDD from 0.2 V to 1.2 V, spanning the sub-VT and strong-inversion regions. Annotations read "10x speedup", "12x e-cost", and "Only 1.3x e-cost".]

In addition to the lower speed, subthreshold circuits face a few other key challenges. First, the lower ION/IOFF ratio can lead to functional problems. For example, certain circuit structures with multiple parallel leaking paths (such as SRAM bitlines or wide NOR gates) degrade this ratio to the point that the circuit does not work properly. Second, although variations in transistor VT affect circuits at nominal VDD, their impact on current is exponential in the subthreshold regime. This means that process variation can severely degrade the already weakened ION/IOFF ratio. For example, circuits that depend on a ratio of transistor sizes to reliably resolve a fight between two transistors (such as a dynamic logic keeper or an SRAM cell during write) will fail in subthreshold because VT variation will alter those devices' strengths beyond the ability of size to compensate. Figure 2 demonstrates how VT variation increases the spread of the critical-path delay distribution at lower voltages.

Despite the challenges of subthreshold operation, researchers have successfully developed techniques to build robust low-voltage circuits for logic, memory, and processors. We believe that subthreshold and near-threshold circuits will make their way into commercial products. They will first appear in ASICs for ultralow power and low-energy applications as a means of meeting extremely limited energy budgets. Next, they will show up as integrated components or special modes in high-performance chips, including mainstream processors. Finally, near-threshold operation will emerge as a competitive choice to address the stringent power crisis for highly parallel peta- or exascale processing.
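The exponential drain current and the minimum energy point described in the introduction can be illustrated with a small numerical sketch. It assumes a simplified first-order model: current proportional to exp((VGS − VT) / (n·kT/q)), gate delay proportional to C·VDD/ION, active energy C·VDD², and leakage energy equal to IOFF·VDD integrated over one operation. The slope factor, the thermal voltage term, and every constant and function name below are hypothetical values chosen for illustration, not figures from the article or from any measured chip.

```python
import math

# Illustrative assumptions only; none of these values come from the article.
VT = 0.40               # threshold voltage (V)
SLOPE_FACTOR = 1.5      # subthreshold slope factor n
V_THERMAL = 0.026       # thermal voltage kT/q near room temperature (V)
I_AT_VT = 1e-6          # drain current when VGS = VT (A)
C_GATE = 2e-15          # load capacitance of one gate (F)
C_SWITCHED = 1e-12      # total capacitance switched per operation (F)
LOGIC_DEPTH = 20        # gate delays on the critical path
N_LEAKING = 20000       # gate-equivalents leaking during the operation


def drain_current(vgs):
    """Subthreshold current, proportional to exp((VGS - VT) / (n * kT/q))."""
    return I_AT_VT * math.exp((vgs - VT) / (SLOPE_FACTOR * V_THERMAL))


def on_off_ratio(vdd):
    """ION (VGS = VDD) divided by IOFF (VGS = 0)."""
    return drain_current(vdd) / drain_current(0.0)


def delay_per_op(vdd):
    """Critical-path delay: each gate takes roughly C * VDD / ION."""
    return LOGIC_DEPTH * C_GATE * vdd / drain_current(vdd)


def energy_per_op(vdd):
    """Active energy (C * VDD^2) plus leakage energy over one operation."""
    active = C_SWITCHED * vdd ** 2
    leakage = N_LEAKING * drain_current(0.0) * vdd * delay_per_op(vdd)
    return active + leakage


def minimum_energy_point(lo_mv=200, hi_mv=400, step_mv=5):
    """Sweep VDD (in millivolts) and return (vdd, energy) at the minimum."""
    sweep = [mv / 1000 for mv in range(lo_mv, hi_mv + 1, step_mv)]
    best = min(sweep, key=energy_per_op)
    return best, energy_per_op(best)


if __name__ == "__main__":
    for vdd in (0.2, 0.3, 0.4):
        print(f"VDD={vdd:.2f} V  ION/IOFF={on_off_ratio(vdd):8.0f}  "
              f"delay={delay_per_op(vdd):.2e} s  "
              f"energy={energy_per_op(vdd):.2e} J")
    vdd_min, e_min = minimum_energy_point()
    print(f"minimum energy point: {vdd_min:.3f} V, {e_min:.2e} J")
```

The sweep stops at VT because the exponential current expression only holds in the subthreshold region, and it starts at 0.2 V as an assumed functional lower bound. With these made-up constants the leakage term overtakes the quadratic active term partway through the sweep, which is what produces an interior minimum; different leakage or activity assumptions move that point.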
Figure 1. Normalized energy per operation (top) and normalized delay (bottom) of a digital circuit as a function of VDD. Subthreshold circuits minimize energy consumption at a cost of slower speed.

[Figure 2 plot: histograms of occurrences (0 to 250) versus delay in seconds (10⁻¹⁰ to 10⁻⁵) for VDD = 1 V, 800 mV, 500 mV, 300 mV, and 200 mV.]

Figure 2. Distribution of a critical path's delay at different VDD values showing the increase of variability at low voltage.

Subthreshold circuits for ultralow energy and ultralow power

Most applications of subthreshold circuits so far have focused on operating scenarios that are severely energy or power constrained. For example, wearable body sensors or wireless sensor nodes must be small (tiny battery with little energy stored) and long lasting (low energy), but they do not require high operating frequencies. These sorts of energy-constrained applications are perfect for subthreshold circuits, which can provide digital signal processing up to several tens of megahertz at extremely low energy per operation.

For example, we have demonstrated a 0.13-μm CMOS mixed-signal system on chip (SoC) that implements an electrocardiogram (ECG). The SoC includes an analog front end (instrumentation amp [IA] and analog-to-digital converter [ADC]) and a subthreshold microprocessor.[1] The processor consumes only 1.5 pJ at 280 mV (700 nW), and we use it for signal processing and for controlling the analog circuits. This chip is ideal for monitoring patients for cardiac arrhythmias, whose presence can be identified by abnormalities in the heart rate, making full ECG transmission only necessary when actual arrhythmic events occur. By extracting the heart rate interval on chip, we can reduce the wireless data rate by more than 500 times, allowing for reduced use of a power-hungry radio.[1] The heart rate algorithm can continue to operate accurately even when the analog components operate at lower voltages to save power. The SoC consumes only 2.6 μW while providing either computed heart rate or raw ECG data.

Other recent examples apply similar principles and rely on subthreshold circuits. These include the integration of an analog front end, ADC, and digital seizure classifier for a complete electroencephalogram channel at 120 μW[2]; and an 8.75 mm² multichip module that includes a lithium battery, solar cell, and intraocular pressure sensing chip with 7.7 μW of active power.[3]

Power-constrained systems differ subtly from energy-constrained applications, but they also benefit from subthreshold operation. For example, RFIDs or energy-harvesting systems, where power is limited but ample time is available, might require circuit operation at voltages below even the minimum energy point to reduce power consumption. Because the power source remains present, the fact that the operation takes more energy is less important than keeping the circuit power below the available power constraint.

We believe that subthreshold and near-threshold operations are critical for enabling new energy- or power-constrained systems. However, engineers might have to adjust their design philosophies to incorporate these operating modes. The most important realization to help with this adjustment is that electronic goods consumers want exciting functionality, novel products, and better performance. To consumers, better performance does not just mean frequency, which has been the dominant circuit-related metric (at least from a marketing viewpoint) for many years. In the context of portable computing (cell phones and so on) and emerging applications such as ubiquitous computing and healthcare, performance means longer lifetimes, less conspicuous form factors, or more functionality for the same size or lifetime. For example, a user might want his or her cell phone to provide personal health information, but the device can only do so if sensors on the user's shirt cuff, shoe insert, or chest bandage are sending it relevant data. In addition, the sensors should be so inconspicuous (small and long lasting) that the user is not even aware of them. As long as the system does what the user wants, the user will not care whether the sensor's processor achieves some target clock frequency. If designers prioritize correct system behavior over component-specific circuit metrics (such as frequency, data rate, or operating VDD), they will see that subthreshold operation is a valuable tool for healthcare sensors and other such systems.

For subthreshold circuits to help realize this vision, they will first be designed as tightly integrated pieces of a complete system, such as a sensor node. In this way, the designer can ensure that the chip provides the desired functionality without being tied to a specific VDD or frequency, and different chips can operate at different points in the energy-delay space based on their variation, workload, current operating mode, and so on. Making subthreshold components available (for example, as standalone chips, IP blocks, layers in a 3D integrated circuit, or system in package) will require a similar shift in component specifications. Traditional data sheets carefully specify operating parameters such as VDD and fCLK. However, a subthreshold chip's speed will vary dramatically at a single voltage.

Figures 1 and 2 reveal one option for solving this problem. For VDD well above VT, delay and energy are both roughly linear with VDD, such that a tenfold change in delay costs roughly 10 times more energy. In the subthreshold region, however, raising VDD slightly increases speed exponentially with only a minor energy penalty. Further, Figure 2 shows that raising VDD rapidly reduces the delay variability. The supply voltage is therefore a powerful knob for adjusting speed and variability in subthreshold circuits. Thus, rather than defining a subthreshold part by specifying a certain VDD and operating frequency, we should define it by its primary function and let each chip set its own VDD and fCLK to meet its functionality obligations.

Subthreshold circuits for high performance?

Although subthreshold circuits limit operating frequency, they can still play a role in high-performance circuits. When high-speed circuits enter a standby state to reduce idle leakage power, some circuits must remain active to monitor the system, decide when to wake up, and oversee power management. One existing example is a standby leakage-reduction system that uses a feedback loop and canary SRAM cells to set the standby VDD to a value that minimizes leakage while protecting data in the main SRAM.[4] Additionally, many high-performance systems spend a large fraction of time in which their processing load is much lower than the maximum. Subthreshold operation during these times can reduce the processor's energy consumption, which can lower on-die temperature and abate energy costs. One practical implementation combines multi-VDD with a small set of block-level power switches to implement a flexible form of dynamic voltage scaling that easily supports a subthreshold mode.[5,6]

Finally, the growing push toward large-scale chip multiprocessors combined with hard power constraints even for high-performance systems has opened a new research avenue for subthreshold and near-threshold computing. The design style's energy-efficiency advantages make subthreshold and near-threshold computing a natural fit for highly parallel architectures such as many-core or wide-SIMD processors.[7] Such architectures are amenable to high-throughput applications with abundant parallelism; however, given power-scaling trends for conventional design styles, subthreshold and near-threshold designs might be the only practical way to simultaneously power up all the cores on future many-core chips.

Such designs are not without a wide range of research challenges. These challenges include traditional issues such as programming models, scalable architecture design, and memory bandwidth. However, subthreshold and near-threshold design introduces additional challenges for high-performance systems, particularly with respect to the impact of design variations. As we mentioned earlier, reduced supply voltage exacerbates process, voltage, and temperature variations. Although low-performance systems might be relatively insensitive to these variations, the
impact on high-performance systems could be substantial. For example, Intel's 80-core TeraFlops processor reported an increase in the standard deviation (sigma) of critical path delay of 45 percent when moving from 1.2 V to 0.8 V.[8]

For high-performance subthreshold and near-threshold design, voltage noise might be one of the most difficult challenges to overcome. For a fixed power budget, current delivery requirements increase linearly with reduced supply voltage. Increased current draw results in higher voltage swings due to nonzero impedance of the power distribution network. The traditional approach of dealing with voltage noise has been to introduce voltage margins, effectively operating the processor at a higher VDD to accommodate droops, sacrificing energy efficiency.

Figure 3 shows a relatively simple example of how larger voltage swings can significantly impact performance even for nominal design styles. The figure shows peak FMAX while sweeping voltage margins across four predictive technology model (PTM)[9] technology nodes. These simulations, based on an 11-stage ring oscillator consisting of fanout-of-4 inverters, show that at today's 32-nm node, a 20 percent voltage margin translates to a 33 percent frequency degradation, and at future technology nodes the situation gets much worse.

[Figure 3 plot: peak frequency (%, 40 to 100) versus margin (%, 0 to 50) for four technology nodes: 45 nm (VDD = 1.0 V), 32 nm (VDD = 0.9 V), 22 nm (VDD = 0.8 V), and 16 nm (VDD = 0.7 V).]

Figure 3. Reduction in peak achievable frequency as a function of the voltage margin imposed to protect against functionality problems that might result from VDD droop.

The large amount of voltage noise present in high-performance processors will likely make margin-based design approaches impractical. However, alternative solutions might ameliorate the noise problem. Recent research suggests that voltage noise within a single processor core is highly predictable and heavily correlated with certain microarchitectural events and code paths.[10] Given that the root cause of noise events is high-level activity, current smoothing through voltage-noise-aware code scheduling could reduce the problem.[11] For highly parallel systems, voltage noise will likely be heavily correlated with core-to-core and core-to-memory activity patterns. Thus, high-level solutions at the architecture and software layers have significant potential to tackle this seemingly low-level design problem.

Highly parallel near-threshold high-performance multiprocessors will also require on-die tuning similar to what we described for the ultralow energy applications. In addition to those techniques, we need more research in methods to allow task assignment and task migration with an awareness of process variation, temperature, and so on.

Although challenges remain, subthreshold and near-threshold computing provide energy efficiency that is increasingly critical for modern integrated circuits, ranging from the most energy-constrained applications to massively multicore supercomputers. These operating regimes will inevitably become part of future products, but they will require designers at the circuit, architecture, system, and even software levels to modify traditional design practices to exploit their benefits.

References
1. S. Jocke et al., "A 2.6-μW Subthreshold Mixed-signal ECG SoC," Proc. Int'l Symp. Low Power Electronics and Design (ISLPED 09), ACM Press, 2009, pp. 117-118.
2. N. Verma et al., "A Micro-Power EEG Acquisition SoC With Integrated Feature Extraction Processor for a Chronic Seizure Detection System," IEEE J. Solid-State Circuits, vol. 45, no. 4, 2010, pp. 804-816.
3. G. Chen et al., "Millimeter-Scale Nearly Perpetual Sensor System with Stacked Battery and Solar Cells," Proc. IEEE Int'l Solid-State Circuits Conf. (ISSCC 10), IEEE Press, 2010, pp. 288-289.
4. J. Wang and B.H. Calhoun, "Techniques to Extend Canary-based Standby VDD Scaling for SRAMs to 45nm and Beyond," IEEE J. Solid-State Circuits, vol. 43, no. 11, 2008, pp. 2514-2523.
5. B.H. Calhoun and A. Chandrakasan, "Ultra-Dynamic Voltage Scaling (UDVS) Using Subthreshold Operation and Local Voltage Dithering," IEEE J. Solid-State Circuits, vol. 41, no. 1, 2006, pp. 238-245.
6. B.H. Calhoun et al., "Flexible Circuits and Architectures for Ultra Low Power," Proc. IEEE, vol. 98, no. 2, 2010, pp. 267-282.
7. B. Zhai et al., "Energy Efficient Near-Threshold Chip Multi-processing," Proc. Int'l Symp. Low Power Electronics and Design (ISLPED 07), ACM Press, 2007, pp. 32-37.
8. S. Dighe et al., "Within-Die Variation-Aware Dynamic Voltage-Frequency Scaling, Core Mapping and Thread Hopping for an 80-Core Processor," Proc. IEEE Int'l Solid-State Circuits Conf. (ISSCC 10), IEEE Press, 2010, pp. 174-175.
9. W. Zhao and Y. Cao, "New Generation of Predictive Technology Modeling for Sub-45nm Early Design Exploration," IEEE Trans. Electron Devices, vol. 53, no. 11, 2006, pp. 2816-2823.
10. V.J. Reddi et al., "Voltage Emergency Prediction: A Signature-Based Approach To Reducing Voltage Emergencies," Proc. Int'l Symp. High-Performance Computer Architecture (HPCA 09), IEEE CS Press, 2009, pp. 18-27.
11. V.J. Reddi et al., "Software-Assisted Hardware Reliability: Abstracting Circuit-level Challenges to the Software Stack," Proc. 46th Design Automation Conf. (DAC), ACM Press, 2009, pp. 788-793.

Benton H. Calhoun is an assistant professor in the Charles L. Brown Department of Electrical and Computer Engineering at the University of Virginia. His research interests include low-power digital circuit design, subthreshold digital circuits, SRAM design for end-of-the-roadmap silicon, variation-tolerant circuit design methodologies, and low-energy electronics for medical applications. Calhoun has a PhD in electrical engineering from the Massachusetts Institute of Technology. He is a member of IEEE.

David Brooks is a Gordon McKay Professor of Computer Science in the School of Engineering and Applied Sciences at Harvard University. His research interests include power-efficient computer system design, variation-tolerant computer architectures, and embedded system design. Brooks has a PhD in electrical engineering from Princeton University. He is a member of IEEE and ACM.

Direct questions or comments about this article to Benton Calhoun, 351 McCormick Road, PO Box 400743, Charlottesville, VA 22904-4743; bcalhoun@virginia.edu.

IEEE Micro, July/August 2010