
Power Estimation Techniques for Integrated Circuits

Farid N. Najm
ECE Dept. and Coordinated Science Lab.
University of Illinois at Urbana-Champaign
IEEE/ACM International Conference on Computer-Aided Design, 1995.

Abstract

With the advent of portable and high-density microelectronic devices, the power dissipation of very large scale integrated (VLSI) circuits is becoming a critical concern. Accurate and efficient power estimation during the design phase is required in order to meet the power specifications without a costly redesign process. Recently, a variety of power estimation techniques have been proposed, most of which are based on: 1) the use of simplified delay models, and 2) modeling the long-term behavior of logic signals with probabilities. The array of available techniques differ in subtle ways in the assumptions that they make, the accuracy that they provide, and the kinds of circuits that they apply to. In this tutorial, I will survey the many power estimation techniques that have been recently proposed and, in an attempt to make sense of all the variety, I will try to explain the different assumptions on which these techniques are based, and the impact of these assumptions on their accuracy and speed.

1. Introduction

The continuing decrease in feature size and the corresponding increase in chip density and operating frequency have made power consumption a major concern in VLSI design [1, 2]. Modern microprocessors are indeed hot, with typical power dissipation values ranging from 8 Watts to 60 Watts, for large chips. Excessive power dissipation in integrated circuits is undesirable for two reasons: 1) High power dissipation causes overheating, which degrades performance and reduces chip lifetime. To control their temperature levels, high power chips require specialized and costly packaging and heat-sink arrangements. 2) The demand for portable electronics has created a need for very low-power chips to help prolong the battery life of portable equipment. Thus, there is a need to limit the power consumption in many chip designs. Indeed, the Semiconductor Industry Association has identified low-power design techniques as a critical technological need [3].

Managing the power of an Integrated Circuit (IC) design adds to a growing list of problems that IC designers have to contend with. Computer-Aided Design (CAD) tools are needed to help with the power management tasks. Indeed, the overall design methodology needs to be modified to account for power during the design process by helping designers to make trade-offs that reduce the power dissipation. In the same way that testability became an up-front design concern in the 80s, power is now the up-front design concern in the 90s. In the same way that scan design became part of mainstream design methodologies to guarantee testability, we now need general, easy to apply, automatic design techniques for low-power design.

To address these needs, many researchers have responded in various ways, such as by proposing power estimation techniques, low-power library development, low-power optimization techniques, and low-power synthesis tools.

Power estimation is needed at different points in the design process. Ideally, one would like to estimate the power of the design very early on, such as when only a high level (behavioral) description of the design is available. Such a capability would save precious design time and would provide designers with power estimation at a time when the design is still sufficiently flexible that major design changes can be made rather cheaply. I refer to this as high-level power estimation. While power estimation from a truly behavioral description is not feasible today, some techniques have been proposed that work with a (moderately) high-level design description. Specifically, some proposed techniques work at the Register Transfer Level (RTL), i.e., when the circuit is described in terms of memory elements and combinational black boxes (described only with Boolean equations).

While it is highly desirable, high-level power estimation is also inevitably inaccurate, or approximate. Thus, it is also important to accurately estimate the power once the low-level details of the circuit become available, such as its gate-level or switch-level description. This low-level power estimation problem has received much more attention in the literature, and many estimation techniques at this level have been proposed.

After a more detailed description of the power estimation problem, in the next section, the rest of the paper will provide a discussion of many recently proposed estimation techniques.

2. Detailed problem description

By power estimation I will generally refer to the problem of estimating the average power dissipation of a digital circuit. This is different from estimating the worst case instantaneous power [4-6, 10], also referred to as the voltage drop problem. Another related problem is that of providing an upper bound on the average power without necessarily bounding the instantaneous power [20]. These techniques will not be discussed in this paper. Instead, I will focus on average power estimation, which is directly related to chip heating and temperature and to battery lifetime.

A simple and straight-forward method of average power estimation is to simulate the circuit, say using a circuit simulator, to obtain the power supply voltage and current waveforms, from which the average power can be computed. Techniques of this kind were the first to be proposed [11, 12]. Since they are based on circuit simulation, these techniques can be quite expensive. In order to improve computational efficiency, several
other simulation-based techniques were also proposed using various kinds of RTL, gate-, switch-, and circuit-level simulation [13-18, 46-50]. Given a set of input patterns or waveforms, the circuit is simulated, and a power value is reported based on the simulation results. Almost all of these techniques assume that the supply and ground voltages are fixed, and only the supply current waveform is estimated.

Even though these simulation-based techniques can be efficient, their utility in practice is limited because the estimate of the power which they provide corresponds directly to the input patterns that were used to drive the simulation. This points to the central problem in power estimation, namely that the power dissipation is input pattern-dependent. Indeed, in most modern logic styles, the chip components (gates, cells) draw power supply current only during a logic transition (if we ignore the small leakage current). While this is considered an attractive low-power feature of these technologies, it makes the power dissipation highly dependent on the switching activity inside these circuits. Simply put, a more active circuit will consume more power. Since internal activity is determined by the input signals, then the circuit power is input pattern-dependent.

In practice, the pattern-dependence problem is a serious limitation. Often, the power dissipation of a circuit block may need to be estimated when the rest of the chip has not yet been designed, or even completely specified. In such a case, very little may be known about the inputs to this block, and exact information about its inputs would be impossible to obtain. Furthermore, for a microprocessor or a Digital Signal Processing (DSP) chip, the exact data inputs can not be determined a priori, because they depend on how the chip is deployed in the field.

Recently, several techniques have been proposed to overcome this problem by using probabilities as a compact way to describe a large set of possible logic signals, and then studying the power resulting from the collective influence of all these signals. In order to use these techniques, the user only specifies typical behavior at the circuit inputs, in the form of transition probability, or average frequency. If typical input pattern sets are available, then the required input probability or frequency information can be easily obtained by a simple averaging procedure. The rest of this paper is devoted to discussing these techniques.

I will classify power estimation techniques as being either probabilistic or statistical. I call an approach probabilistic when it is based on propagating a probability measure directly through the logic. To perform this, special models for circuit blocks (gates) must be developed and stored in the cell library. In contrast, other techniques, that I will refer to as statistical, do not require specialized circuit models. Instead, they use traditional simulation models and simulate the circuit, using existing simulation capabilities, for a limited number of randomly generated input vectors while monitoring the power. These vectors are generated from user-specified probability information about the circuit inputs. Essentially, these techniques are based on statistical mean estimation resulting from a Monte Carlo procedure. Using statistical estimation techniques, one can determine when to stop the simulation in order to obtain certain user-specified accuracy and confidence.

In the next section, I will discuss power estimation techniques that operate at the gate level, while section 4 presents techniques that work at a higher level of abstraction. Whenever possible, I will comment on the accuracy and speed of the different approaches. However, accuracy comparisons are often hard to do because the published techniques were not all tested on a common set of benchmark designs.

3. Low-level power estimation

Most techniques in this class simplify the problem by assuming a simplified circuit model, as follows:
(1) It is assumed that the power supply and ground voltage levels throughout the chip are fixed.
(2) It is assumed that the circuit is built of CMOS logic gates and edge-triggered flip-flops (FFs), and is synchronous, as shown in Fig. 1.
(3) Only the charging/discharging current is considered, so that the short-circuit current during switching is neglected [7].

Therefore, the average power dissipation of a circuit can be broken down into (a) the power consumed by the flip-flops and (b) that consumed by the combinational logic blocks. Correspondingly, one way to estimate the power is to use the following two-step approach:
1. Solve for the FF power by examining the behavior of the whole circuit as a finite state machine (FSM), and measure statistics of the FF outputs.
2. Use the statistics at the FF outputs, resulting from the FSM analysis, to compute the power for the combinational circuit block.

[Figure 1. A combinational circuit embedded in a synchronous sequential design. Diagram: a combinational circuit block with primary inputs u1, ..., um and primary outputs; next state lines are fed back through clocked flip-flops to the present state lines x1, ..., xn.]

This process is easily formulated using probabilities, by defining probability measures that characterize the transitions made by a logic signal. We start with the following two:
Definition 1. (signal probability): The signal probability P_s(x) at a node x is defined as the average fraction of clock cycles in which the steady state value of x is a logic high.

Definition 2. (transition probability): The transition probability P_t(x) at a node x is defined as the average fraction of clock cycles in which the value of x at the end of the cycle is different from its initial value.

The signal probability is a relatively old concept that was first introduced to study circuit testability [9]. In what follows, we will see how probabilities are relevant to power estimation as we consider separately the computation of the FF power and the combinational circuit power.

3.1. Flip-flop power

Whenever the clock triggers the FFs, some of them will make a transition and will draw power. Thus FF power is drawn in synchrony with the clock. If the transition probabilities P_t(x) at the FF outputs are known, then the average power consumed by one flip-flop is simply:

    (V_dd^2 / 2 T_c) C_x P_t(x)

where T_c is the clock period and C_x is the total capacitance at the FF output.

Thus the computation of the FF power reduces to finding the FF transition probabilities. However, computing the probabilities P_t(x_i) from the FSM input signal and/or transition probabilities is not trivial. In fact, it can be shown that finding these probabilities exactly is NP-hard. Even finding them approximately is not easy, because the feedback creates the difficult situation where future signal values are related to their past and present values.

3.1.1. Probabilistic techniques

In trying to compute the probabilities at the state bits of the FSM, it is tempting to consider finding the state occupation probability, for every state in the FSM's state transition graph (STG), or the state transition probability associated with every edge in the STG. This, however, gets very expensive due to the exponential explosion in the number of states, even for FSMs of moderate size. One technique, given in [28], completely ignores this problem and assumes that all states (of the FSM) are equally probable, which is not true in practice.

Other techniques [40-43] have been proposed that are based on the simplifying assumption that the FSM is Markov [34] (so that its future is independent of its past once its present state is specified). This assumption is somewhat restrictive because it is only true when the sequence of input vectors at the FSM primary inputs are independent. Some of these techniques compute only the probabilities (signal and transition) at the FF outputs, while others also compute the power. The approach in [40] solves directly for the transition probabilities on the present state lines using the Chapman-Kolmogorov equations [33, 34], which is computationally too expensive. Another approach that also attempts a direct solution of the Chapman-Kolmogorov equations was given in [41]. While it is more efficient, it remains quite expensive, so that the largest test case presented contains less than 30 FFs.

Better solutions are offered by two recent papers [42, 43], which assume the FSM primary inputs are independent, and which are based on solving a non-linear system that gives the present state line probabilities, as follows. Let a vector of present state signal probabilities P_in = [p_1, p_2, ..., p_n] be applied to the combinational logic block and let the n present state signals be independent. At the outputs of the combinational logic, let the corresponding next state node probability vector be P_out. The mapping from P_in to P_out is some non-linear function that is determined by the Boolean function implemented by the logic. We denote this vector-valued function by F(), so that P_out = F(P_in) (assuming, for now, the FSM primary input probabilities are fixed).

If we now assume that P_in = P is the vector of present state probabilities, then we should also have P_out = P, because the state line probabilities are constant in steady-state. If we assume that the state lines are independent, this translates to P = F(P). The solution of this non-linear system gives the required state line probability vector P. It is solved using the Newton-Raphson method in [42], and using the Picard-Peano iteration method in [43].

Both techniques also try to correct for the state line independence assumption. In [42], this is done by accounting for m-wise correlations between state bits when computing their probabilities. This requires 2^m additional gates and can get very expensive. Nevertheless, they show good experimental results. The approach in [43] is to unroll the combinational logic block k times. This is less expensive than [42], and the authors observe that with k = 3 or so, good results can be obtained.

In order to avoid the problem of assuming that the FSM inputs are independent, the technique in [21] makes use of a user-specified input sequence. A new FSM is constructed that automatically generates the user input sequence, called an Input Modeling FSM (IMFSM). The combination of IMFSM and the original FSM are solved together as one autonomous FSM, using [42].

3.1.2. Statistical techniques

An alternative type of method was proposed in [27] that eliminates many of the shortcomings of the above probabilistic techniques. This is a statistical method in which the circuit is simulated repeatedly under randomly generated input vectors while monitoring the FF outputs, essentially a Monte Carlo approach. The simulation is stopped when the required FF probabilities have converged with user-specified accuracy and confidence.

The technique has many advantages: 1) it makes no assumptions about the FSM behavior (Markov or otherwise), 2) it makes no independence assumptions about the state lines, 3) it allows the user to specify the desired accuracy and confidence, and 4) it does not use large Binary Decision Diagrams [35] (BDDs), so that memory usage is not a problem.

For nodes inside the combinational block, only the steady state values (inside a clock cycle) are required. Therefore, a logic simulation using a zero-delay tim-
ing model may be safely used for the combinational block. In fact, the combinational block may be simulated at a higher level of abstraction, say as a single Boolean black box. The advantage of this is that the simulation can proceed much faster, which is important because the number of cycles to be simulated can be large. As a result, this approach has good speed and takes 4.5 hours to solve a 1500-latch/20k-gate circuit, on a SUN sparc-10, with 5% accuracy and 95% confidence. On small/moderate FSMs, the time required is in the seconds or minutes.

3.2. Combinational circuit power

Whereas flip-flop power is drawn in synchrony with the clock, the same is not true for gates inside the combinational logic. Even though the inputs to a combinational logic block are updated by the FFs (in synchrony with the clock), the internal gates of the block may make several transitions before settling to their steady state values for that clock period.

These additional transitions have been called hazards or glitches. Although unplanned for by the designer, they are not necessarily design errors. Only in the context of low-power design do they become a nuisance, because of the additional power that they dissipate. It has been observed [8] that this additional power dissipation is typically 20% of the total power, but can be as high as 70% of the total power in some cases such as combinational adders. We have observed that in a 16-bit parallel multiplier circuit, some nodes make as many as 20 transitions before reaching steady state. This component of the power dissipation is computationally expensive to estimate, because it depends on the timing relationships between signals inside the circuit. Consequently, many proposed power estimation techniques have ignored this issue. We will refer to this elusive component of power as the glitch power. Computing the glitch power is one main challenge in power estimation. This and other challenging problems that are specific to combinational circuit power estimation will be discussed below. In the second and third sub-sections, a survey of probabilistic and statistical techniques will be given.

3.2.1. Challenges

Recall the signal and transition probabilities, defined above, and suppose they are computed for every gate output node in the combinational block. It is important to note that the resulting values are unaffected by the circuit internal delays. This is because, by definition, they depend only on steady state signal values in a clock cycle. Indeed, these values would remain the same even if a zero-delay timing model were used. If this is done, however, the glitch power would be automatically excluded from the analysis. This is a serious shortcoming of techniques that are based on these measures, as we will point out below.

If a zero-delay model is assumed and the transition probabilities are computed, then the power can be computed as:

    P_av = (V_dd^2 / 2 T_c) * sum_{i=1..n} C_i P_t(x_i)        (1)

where T_c is the clock period, C_i is the total capacitance at node x_i, and n is the total number of circuit nodes that are outputs of logic gates or cells. Since this assumes at most a single transition per clock cycle, then this is actually a lower bound on the true average power. Nevertheless, the results of a zero delay analysis may be useful as a rough technology-independent indication of the power requirements of a circuit, using estimated or nominal gate capacitances.

In order to compute the internal transition probabilities, it is common to start by finding the signal probabilities. This, by itself, is not easy and can be shown to be NP-hard. The problem has to do with whether the input signals to a logic gate (viewed as random variables) are independent or not. In practice, logic signals may be correlated so that, for instance, two of them may never be simultaneously high, or they may never (or always) switch together. Primary inputs to the combinational block may be correlated due to the feedback. And even if these inputs are assumed independent, other internal signals may be correlated due to reconvergent fanout (a gate fans out into two signals that eventually recombine as the inputs of some gate downstream). However, it is computationally too expensive to compute these correlations.

Some have argued that the correlations do not seriously affect the final result, so that circuit input and internal nodes may be assumed to be independent. We refer to this as a spatial independence assumption. It leads to a significant simplification in computing the internal signal probabilities. If y = ab is an AND gate output, and a and b are independent, then P_s(y) = P_s(a) P_s(b). For an OR gate, we have P_s(y) = P_s(a) + P_s(b) - P_s(a) P_s(b). Thus the internal node probabilities are simply computed from those of the input nodes. The primary input node probabilities can be obtained as results of the analysis of the FSM, carried out previously.

To find the internal transition probabilities, we must deal with another independence issue of whether the values of the same signal in two consecutive clock cycles are independent or not. If assumed independent, then the transition probability can be easily obtained from the signal probability according to:

    P_t(x) = 2 P_s(x) [1 - P_s(x)]        (2)

We refer to this as a temporal independence assumption. If this assumption is not made, then one must somehow represent the correlation between successive input vectors and internal signals. Given our formulation of power estimation as a two-step process, the correlation between two consecutive primary input bit values (on the same input line) can be obtained as transition probabilities computed during the FSM analysis. But that does not account for all input correlations. Correlations across more than one clock edge are not available, and correlations between one signal and previous values of other signals are also not available. In principle, the required correlation information is infinite, and only a finite amount of correlation can be considered in practice. Not only is computing the (limited) correlations too expensive, but making use of them during the computation of the combinational circuit power is also difficult.

The above problems become even worse in the case of non-zero delays. In this case, more detailed
probability measures are required to properly formulate the power dissipation problem. One such measure is the transition density [25]. The transition density at node x is the average number of transitions per second at node x, denoted D(x). Formally:

Definition 3. (transition density): If a logic signal x(t) makes n_x(T) transitions in a time interval of length T, then the transition density of x(t) is defined as:

    D(x) := lim_{T -> infinity} n_x(T) / T        (3)

The density provides an effective measure of switching activity in logic circuits in the presence of any delay model. If the density at every circuit node is made available, the overall average power dissipation in the circuit can be computed as:

    P_av = (V_dd^2 / 2) * sum_{i=1..n} C_i D(x_i)        (4)

In a synchronous circuit, with a clock period T_c, the relationship between transition density and transition probability is:

    D(x) >= P_t(x) / T_c        (5)

where equality occurs in the zero-delay case. Thus the transition probability gives a lower bound on the transition density.

In order to complete the density formulation, another measure is required: Let P(x) denote the equilibrium probability [25] of a logic signal x(t), defined as the average fraction of time that the signal is high. Formally:

Definition 4. (equilibrium probability): If x(t) is a logic signal (switching between 0 and 1), then its equilibrium probability is defined as:

    P(x) := lim_{T -> infinity} (1/T) * integral_{-T/2}^{+T/2} x(t) dt        (6)

In contrast to the signal probability, the equilibrium probability depends on the circuit internal delays since it describes the signal behavior over time, not only its steady state behavior per clock cycle. In the zero-delay case, the equilibrium probability reduces to the signal probability.

If all correlations are completely ignored, so that any two signals are completely independent both in space and time, we say that we have a spatio-temporal independence assumption. If this is assumed, then the transition density at the output y of a Boolean logic cell (gate) can be easily computed [25] from the density at its inputs, x_1, ..., x_n, according to:

    D(y) = sum_{i=1..n} P(dy/dx_i) D(x_i)        (7)

where dy/dx is the Boolean difference of y with respect to x, defined as dy/dx := y|_{x=1} XOR y|_{x=0}, where XOR denotes the exclusive-or operation.

3.2.2. Probabilistic techniques

Recently, several probabilistic power estimation techniques have been proposed for combinational circuits. These techniques all use simplified delay models for the circuit components and require user-supplied information about typical input behavior. Thus, their accuracy is limited by the quality of the delay models and of the input specification. Throughout the discussion below, primary inputs and primary outputs will refer to inputs and outputs of the combinational circuit block.

3.2.2.1. Using signal probability

In [19], a zero-delay model is used and temporal as well as spatial independence is assumed. The user is expected to provide signal probabilities at the primary inputs. These are then propagated into the circuit to provide the probabilities at every node. In the paper, the propagation of probabilities is performed at the switch-level, but this is not essential to the approach. It is easier to propagate probabilities by working with a gate-level description of the circuit. Once the signal probabilities are computed at every node in the circuit, the power is computed by making use of (1) and (2), based on the temporal independence assumption.

In general, if the circuit is built from Boolean components that are not part of a pre-defined gate library, the signal probability can be computed on the fly by using a BDD [35] to represent the Boolean functions, as proposed in [25] and [37]. Since it uses a zero-delay timing model, this method does not account for the glitch power.

3.2.2.2. Probabilistic simulation

A probabilistic power estimation approach that does compute the glitch power and does not make the zero-delay or temporal independence assumptions, called probabilistic simulation, was proposed in [22]. This approach requires the user to specify typical signal behavior at the circuit inputs using probability waveforms. A probability waveform is a sequence of values indicating the probability that the signal is high for certain time intervals, and the probability that it makes low-to-high transitions at specific time points. The transition times themselves are not random. This allows the computation of the average, as well as the variance, of the current waveforms drawn by the individual gates in the design in one simulation run. The average current waveforms can then be used to compute the average power dissipated in each gate and the total average power of the circuit. Improvements on this technique were proposed in [23, 24], where the accuracy and the correlation handling were improved upon.

3.2.2.3. Transition density

In [25, 26], an efficient algorithm is presented to propagate the density values from the inputs throughout the circuit, according to (7). The required input specification is a pair of numbers for every input node, namely the equilibrium probability and transition density. In this case, both signal values and signal transition times are random.

BDDs can be used [25] to compute the Boolean difference probabilities, which are required in order to use the propagation algorithm (7). Recently, special-
ized BDD-based techniques have been proposed to facilitate this [39]. Improvements on this basic technique have also been proposed in [31, 51], providing more accurate gate models and improved correlation handling.

3.2.2.4. A symbolic technique

The technique proposed in [28] attempts to handle both spatial and temporal correlations by using a BDD to represent the successive Boolean functions at every node in terms of the primary inputs, as follows. The circuit topology defines a Boolean function corresponding to every node that gives the steady state value of that node in terms of the primary inputs. The intermediate values that the node takes before reaching steady state are not represented by this function. Nevertheless, one can construct Boolean functions for them by making use of the circuit delay information, assuming the delay of every gate is a specified fixed constant. As a result, the Boolean value at internal nodes is symbolically represented in terms of the primary inputs at all time points inside a clock cycle. In order to compute the probabilities of internal transitions, one can use the BDD [36] to construct the exclusive-OR function of two consecutive intermediate states.

One disadvantage of this technique is that it is computationally expensive. Since the BDD is built for the whole circuit, there will be cases where the technique breaks down because the required BDD may be too big.

3.2.2.5. Using correlation coefficients

Another probabilistic approach that is similar to probabilistic simulation was proposed in [24] whereby the correlation coefficients between steady state signal values (inside a clock cycle) are used as approximations to the correlation coefficients between the intermediate signal values (at any time during the clock cycle). This allows spatial correlation to be handled approximately, and is much more efficient than trying to estimate the dynamic correlations between intermediate states. The steady state correlations are estimated from the BDD by constructing the function for the AND of two signals. The reported results have good accuracy, but the technique does require building the BDD for the whole circuit, which may not always be feasible.

3.2.2.6. Handling spatio-temporal correlation

Finally, a probabilistic technique [52, 53] has been recently proposed that improves on the basic signal probability propagation method. This is done by using transition probabilities to account for temporal cor-

power being consumed. Eventually, the power will converge to the average power, based on (3) and (4). The issues are how to select the input patterns to be applied in the simulations and how to decide when the measured power has converged close enough to the true average power. Normally, the inputs are randomly generated and statistical mean estimation techniques [38] are used to decide when to stop - essentially a Monte Carlo method.

3.2.3.1. Total power

This approach [29, 30] uses Monte Carlo simulation to estimate the total average power of the circuit. It consists of applying randomly-generated input patterns at the primary inputs and monitoring the energy dissipated per clock cycle using a simulator, until the cumulative power measured has converged to the true average power. In practice, this technique was found to be very efficient. Typically, as few as 10 vectors may be enough to estimate the power of a large circuit with thousands of gates. But perhaps the most useful feature of this technique is that the user can specify the required accuracy and confidence level up-front. It also does not require an independence assumption for internal nodes. It only requires the primary inputs to be independent, but the approach can be extended to model and take into account the correlations between input nodes.

Perhaps the only disadvantage of this approach is that, while it provides an accurate estimate of the total power, it does not provide the power consumed by individual gates or small groups of gates. It would take many more transitions to estimate (with the same accuracy) the power of individual gates, because some gates may switch very infrequently.

3.2.3.2. Power of individual gates

This technique [32] is a modification of the above approach that provides both the total and individual-gate power estimates, with user-specified accuracy and confidence. One reason why one may want to estimate the power consumed by individual gates is to be able to diagnose a high power problem, and find out which part of the circuit consumes the most power. Other reasons have to do with the fact that estimating the individual gate power values is essentially equivalent to estimating the transition density values at all the nodes, which can then be used to estimate circuit reliability, considering a variety of failure mechanisms [25].

A weakness of this approach may be its moderate speed. For a circuit with 16000 gates, about 2 cpu hours are required on a SUN sparc ELC.
relation (across one clock edge) and also using corre-
lation coecients in order to handle (approximately) 4. High-level power estimation
the spatial and temporal correlation inside the circuit As pointed out in the introduction, it would be
and improve the accuracy. The delay model remains very advantageous if one could estimate power dis-
zero delay, so that the glitch power is not included. sipation from a design description at a high level of
Although this method uses BDDs, they only have to abstraction. This would provide designers with an
construct local BDDs in terms of the immediate fan- early measure of power dissipation, before much de-
in inputs of every gate, so that there are no speed or sign e ort has been spent. To date, some techniques
memory problems. have been proposed that work at the structural RTL
3.2.3. Statistical techniques level. At this level of abstraction, the memory ele-
The idea behind these techniques is quite simple ments (register les, ip- ops, etc) are assumed to
and appealing: simulate the circuit repeatedly, using have been completely speci ed, but all other (com-
some timing or logic simulator, while monitoring the binational) logic remains at the (Boolean) functional
level. Essentially, the design description consists of flip-flops and Boolean black boxes.

In some cases, the high-level description of a design may be in terms of major circuit blocks that were used in previous designs. In this case, the detailed implementation of the combinational black boxes will be completely known. This is, for example, the case in DSP designs, where the circuit blocks come from a library of well-characterized adders, multipliers, etc., and where the design task might be to determine which type of adder or multiplier to use in a given chip design. In this case, it is still advantageous to carry out the analysis at a high level of abstraction, because the analysis can be done much faster. I refer to techniques of this kind as bottom-up approaches - the low-level details are known, but we choose to ignore them and use instead a simplified high-level model of the block behavior. This is essentially a macro-modeling for power approach. Bottom-up techniques have been proposed in [54, 55], where black-box models (macro-models) are built for circuit blocks by a process of characterization that models the block power as a function of the input/output signal statistics (probabilities) of the block. Other details are also included, such as the bus width, average capacitance, etc.

In other cases, the low-level details of the circuit blocks may be truly unknown, because such a circuit block may never have been designed before. This presents the harder problem of extracting power out of pure (Boolean) functionality. I refer to such techniques as top-down. Top-down techniques have been proposed in [44, 45], and make use of the entropy of a logic signal as a measure of the amount of information that can be carried by that signal. The rationale for this is that the power requirements of a circuit must be related to the amount of computational work that the circuit performs, which has traditionally been modeled with the entropy measure. The entropy is directly related to the signal probability, so the probabilities at the flip-flop outputs are used to compute the entropy input to the combinational blocks.

All these techniques are fairly recent, and it is not clear yet how useful they will be in practice, or how their performances will compare in a practical setting.

5. Summary

Most proposed power estimation techniques operate at a low level of abstraction and use simplified delay models, so they do not provide the same accuracy as, say, circuit simulation. But they are fast, which is very important because VLSI designers are interested in the power dissipation of large designs. Within the limitations of the simplified delay models, some of these techniques, e.g., the statistical techniques, can be very accurate. In fact, the desired accuracy can be specified up-front. The other class of techniques, i.e., the probabilistic techniques, are not as accurate but can be faster.

From an implementation standpoint, one major difference between statistical and probabilistic techniques is that statistical techniques can be built around existing simulation tools and libraries, while probabilistic techniques cannot. Typically, probabilistic techniques require specialized simulation models.

Other, more recent, techniques have been proposed for high-level power estimation. Given a description of the design at the structural RTL level, these methods try to predict the power requirements that an implementation of this design would have.

References

[1] R. W. Brodersen, A. Chandrakasan, S. Sheng, "Technologies for personal communications," Symp. on VLSI Circuits, pp. 5-9, 1991.
[2] A. P. Chandrakasan, S. Sheng, and R. W. Brodersen, "Low-power CMOS digital design," IEEE Journal of Solid-State Circuits, vol. 27, no. 4, pp. 473-484, April 1992.
[3] Workshop Working Group Reports, Semiconductor Industry Association, pp. 22-23, Nov. 1992.
[4] S. Chowdhury and J. S. Barkatullah, "Estimation of maximum currents in MOS IC logic circuits," IEEE Transactions on Computer-Aided Design, vol. 9, no. 6, pp. 642-654, June 1990.
[5] S. Devadas, K. Keutzer, and J. White, "Estimation of power dissipation in CMOS combinational circuits using Boolean function manipulation," IEEE Transactions on Computer-Aided Design, vol. 11, no. 3, pp. 373-383, March 1992.
[6] H. Kriplani, F. N. Najm, and I. Hajj, "Pattern independent maximum current estimation in power and ground buses of CMOS VLSI circuits: algorithms, signal correlations, and their resolution," IEEE Transactions on Computer-Aided Design, vol. 14, no. 8, pp. 998-1012, August 1995.
[7] H. J. M. Veendrick, "Short-circuit dissipation of static CMOS circuitry and its impact on the design of buffer circuits," IEEE Journal of Solid-State Circuits, vol. SC-19, no. 4, pp. 468-473, Aug. 1984.
[8] A. Shen, A. Ghosh, S. Devadas, and K. Keutzer, "On average power dissipation and random pattern testability of CMOS combinational logic networks," IEEE/ACM International Conference on Computer-Aided Design, pp. 402-407, November 1992.
[9] K. P. Parker and E. J. McCluskey, "Probabilistic treatment of general combinational networks," IEEE Transactions on Computers, vol. C-24, pp. 668-670, June 1975.
[10] S. Manne, A. Pardo, R. I. Bahar, G. D. Hachtel, F. Somenzi, E. Macii, and M. Poncino, "Computing the maximum power cycles of a sequential circuit," 32nd Design Automation Conference, pp. 23-28, June 1995.
[11] S. M. Kang, "Accurate simulation of power dissipation in VLSI circuits," IEEE Journal of Solid-State Circuits, vol. SC-21, no. 5, pp. 889-891, Oct. 1986.
[12] G. Y. Yacoub and W. H. Ku, "An accurate simulation technique for short-circuit power dissipation based on current component isolation," IEEE International Symposium on Circuits and Systems, pp. 1157-1161, 1989.
[13] A-C. Deng, Y-C. Shiau, and K-H. Loh, "Time domain current waveform simulation of CMOS circuits," IEEE International Conference on Computer-Aided Design, pp. 208-211, Nov. 1988.
[14] R. Tjarnstrom, "Power dissipation estimate by switch level simulation," IEEE International Symposium on Circuits and Systems, pp. 881-884, May 1989.
[15] U. Jagau, "SIMCURRENT - an efficient program for the estimation of the current flow of complex CMOS circuits," IEEE International Conference on Computer-Aided Design, pp. 396-399, Nov. 1990.
[16] T. H. Krodel, "PowerPlay - fast dynamic power estimation based on logic simulation," IEEE International Conference on Computer Design, pp. 96-100, October 1991.
[17] L. Benini, M. Favalli, P. Olivo, and B. Ricco, "A novel approach to cost-effective estimate of power dissipation in CMOS ICs," European Design Automation Conference, pp. 354-360, 1993.
[18] F. Dresig, Ph. Lanches, O. Rettig, and U. G. Baitinger, "Simulation and reduction of CMOS power dissipation at logic level," European Design Automation Conference, pp. 341-346, 1993.
[19] M. A. Cirit, "Estimating dynamic power consumption of CMOS circuits," IEEE International Conference on Computer-Aided Design, pp. 534-537, Nov. 1987.
[20] F. N. Najm and M. Y. Zhang, "Extreme delay sensitivity and the worst-case switching activity in VLSI circuits," 32nd Design Automation Conference, pp. 623-627, June 1995.
[21] J. Monteiro and S. Devadas, "Techniques for the power estimation of sequential logic circuits under user-specified input sequences and programs," ACM/IEEE International Symposium on Low Power Design, pp. 33-38, April 1995.
[22] F. Najm, R. Burch, P. Yang, and I. Hajj, "Probabilistic simulation for reliability analysis of CMOS VLSI circuits," IEEE Transactions on Computer-Aided Design, vol. 9, no. 4, pp. 439-450, April 1990 (Errata in July 1990).
[23] G. I. Stamoulis and I. N. Hajj, "Improved techniques for probabilistic simulation including signal correlation effects," 30th ACM/IEEE Design Automation Conference, pp. 379-383, 1993.
[24] C-Y. Tsui, M. Pedram, and A. M. Despain, "Efficient estimation of dynamic power consumption under a real delay model," IEEE International Conference on Computer-Aided Design, pp. 224-228, November 1993.
[25] F. Najm, "Transition density: a new measure of activity in digital circuits," IEEE Transactions on Computer-Aided Design, vol. 12, no. 2, pp. 310-323, February 1993.
[26] F. Najm, "Low-pass filter for computing the transition density in digital circuits," IEEE Transactions on Computer-Aided Design, vol. 13, no. 9, pp. 1123-1131, September 1994.
[27] F. N. Najm, S. Goel, and I. N. Hajj, "Power estimation in sequential circuits," 32nd Design Automation Conference, pp. 635-640, June 1995.
[28] A. Ghosh, S. Devadas, K. Keutzer, and J. White, "Estimation of average switching activity in combinational and sequential circuits," 29th ACM/IEEE Design Automation Conference, pp. 253-259, June 1992.
[29] C. M. Huizer, "Power dissipation analysis of CMOS VLSI circuits by means of switch-level simulation," IEEE European Solid State Circuits Conference, pp. 61-64, 1990.
[30] R. Burch, F. Najm, P. Yang, and T. Trick, "A Monte Carlo approach for power estimation," IEEE Transactions on VLSI Systems, vol. 1, no. 1, pp. 63-71, March 1993.
[31] J-Y. Lin, T-C. Liu, and W-Z. Shen, "A cell-based power estimation in CMOS combinational circuits," IEEE/ACM International Conference on Computer-Aided Design, pp. 304-309, November 1994.
[32] M. Xakellis and F. Najm, "Statistical estimation of the switching activity in digital circuits," 31st ACM/IEEE Design Automation Conference, pp. 728-733, 1994.
[33] S. M. Ross, Stochastic Processes. New York, NY: John Wiley & Sons, 1983.
[34] A. Papoulis, Probability, Random Variables, and Stochastic Processes, 2nd edition. New York, NY: McGraw-Hill Book Co., 1984.
[35] R. E. Bryant, "Graph-based algorithms for Boolean function manipulation," IEEE Transactions on Computers, vol. C-35, no. 8, pp. 677-691, August 1986.
[36] K. S. Brace, R. L. Rudell, and R. E. Bryant, "Efficient implementation of a BDD package," 27th ACM/IEEE Design Automation Conference, pp. 40-45, June 1990.
[37] S. Chakravarty, "On the complexity of using BDDs for the synthesis and analysis of Boolean circuits," 27th Annual Allerton Conference on Communication, Control, and Computing, pp. 730-739, September 1989.
[38] I. Miller and J. Freund, Probability and Statistics for Engineers, 3rd edition. Englewood Cliffs, NJ: Prentice-Hall, Inc., 1985.
[39] B. Kapoor, "Improving the accuracy of circuit activity measurement," 31st ACM/IEEE Design Automation Conference, pp. 734-739, June 1994.
[40] A. A. Ismaeel and M. A. Breuer, "The probability of error detection in sequential circuits using random test vectors," Journal of Electronic Testing, vol. 1, pp. 245-256, January 1991.
[41] G. D. Hachtel, E. Macii, A. Pardo, and F. Somenzi, "Probabilistic analysis of large finite state machines," 31st ACM/IEEE Design Automation Conference, pp. 270-275, June 1994.
[42] J. Monteiro and S. Devadas, "A methodology for efficient estimation of switching activity in sequential logic circuits," ACM/IEEE 31st Design Automation Conference, pp. 12-17, June 1994.
[43] C-Y. Tsui, M. Pedram, and A. M. Despain, "Exact and approximate methods for calculating signal and transition probabilities in FSMs," ACM/IEEE 31st Design Automation Conference, pp. 18-23, June 1994.
[44] F. N. Najm, "Towards a high-level power estimation capability," ACM/IEEE International Symposium on Low Power Design, pp. 87-92, April 1995.
[45] D. Marculescu, R. Marculescu, and M. Pedram, "Information theoretic measures of energy consumption at register transfer level," ACM/IEEE International Symposium on Low Power Design, pp. 81-86, April 1995.
[46] W. T. Eisenmann and H. E. Graeb, "Fast transient power and noise estimation for VLSI circuits," IEEE/ACM International Conference on Computer-Aided Design, pp. 252-257, November 1994.
[47] H. K. Sarin and A. J. McNelly, "A power modeling and characterization method for logic simulation," IEEE Custom Integrated Circuits Conference, pp. 363-366, May 1995.
[48] A. Pardo, R. I. Bahar, S. Manne, P. Feldmann, G. D. Hachtel, and F. Somenzi, "CMOS dynamic power estimation based on collapsible current source transistor modeling," ACM/IEEE International Symposium on Low Power Design, pp. 111-116, April 1995.
[49] C. X. Huang, B. Zhang, A-C. Deng, and B. Swirski, "The design and implementation of PowerMill," ACM/IEEE International Symposium on Low Power Design, pp. 105-109, April 1995.
[50] V. Tiwari, S. Malik, and A. Wolfe, "Power analysis of embedded software: a first step towards software power minimization," IEEE/ACM International Conference on Computer-Aided Design, pp. 384-390, November 1994.
[51] T-L. Chou, K. Roy, and S. Prasad, "Estimation of circuit activity considering signal correlations and simultaneous switching," IEEE/ACM International Conference on Computer-Aided Design, pp. 300-303, November 1994.
[52] R. Marculescu, D. Marculescu, and M. Pedram, "Switching activity analysis considering spatiotemporal correlations," IEEE/ACM International Conference on Computer-Aided Design, pp. 294-299, November 1994.
[53] R. Marculescu, D. Marculescu, and M. Pedram, "Efficient power estimation for highly correlated input streams," 32nd Design Automation Conference, pp. 628-634, June 1995.
[54] P. E. Landman and J. M. Rabaey, "Activity-sensitive architectural power analysis for the control path," ACM/IEEE International Symposium on Low Power Design, pp. 93-98, April 1995.
[55] P. E. Landman and J. M. Rabaey, "Architectural power analysis: the dual bit type method," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 3, no. 2, pp. 173-187, June 1995.