Documenti di Didattica
Documenti di Professioni
Documenti di Cultura
1. INTRODUCTION
With the advent of portable and high-density
micro-electronic devices, the power dissipation of very
large scale integrated (VLSI) circuits is becoming a
critical concern. Modern microprocessors are hot, and
their power consumption can exceed 30 or 50 Watts.
Due to limited battery life, reliability issues, and packaging/cooling costs, power consumption has become
a more critical design concern than speed and area
in some applications. Hence to avoid problems associated with excessive power consumption, there is a
need for CAD tools to help in estimating the power
consumption of VLSI designs.
A number of CAD techniques have been proposed
This work was supported by Rockwell, by Intel Corp.,
and by the National Science Foundation (MIP 96-23237
& 97-10235).
than input statistics. Since the number of possible input transitions for an n-input combinational circuit is
22n , they present a clustering algorithm to compress
the input transitions into clusters of input transitions
that have the same power values (approximately).
They use heuristics to implement the clustering algorithm, but it is not clear how efficient the method
would be on large circuits.
0.5
0.0
0.0
0.5
Pin
1.0
Din = 0.3
0.20
0.15
0.10
0.05
0.0
0.2
0.4
0.6
0.8
1.0
Pin
Din = 0.8
0.30
0.28
0.26
0.24
0.40
0.45
0.50
0.55
Pin
0.60
0.65
0.70
c432
c880
c1908
c2670
c3540
c5315
c6288
c7552
0.4
0.4
0.4
0.4
0.4
0.4
0.4
0.4
0.4
0.4
0.4
0.4
0.4
0.4
0.4
0.4
1.61%
1.77%
1.74%
2.43%
2.96%
1.76%
16.6%
3.37%
34.88%
40.46%
16.80%
-31.61%
35.77%
20.94%
-40.04%
19.02%
Pin
0.4
0.4
0.4
0.4
0.4
0.4
0.4
0.4
(1)
Pin
0.4
0.4
0.4
0.4
0.4
0.4
0.4
0.4
Din
0.4
0.4
0.4
0.4
0.4
0.4
0.4
0.4
Pin
0.4
0.4
0.4
0.4
0.4
0.4
0.4
0.4
Din
0.4
0.4
0.4
0.4
0.4
0.4
0.4
0.4
the power when primary inputs are correlated. Table 6 gives the RMS, average and maximum error,
when the inputs are correlated and the total power is
estimated using the 3-d table-based macromodel, over
a wide range of Pin , Din , and Dout values. It can be
seen from the table that the error is quite high. This
led us to consider other parameters to be included in
the macromodel.
Power, from 3d Macromodel (uW/MHz/gate)
2.0
1.5
1.0
0.5
0.0
0.0
0.5
1.0
1.5
2.0
Power, from Correlated Input Vector Stream (uW/MHz/gate)
0.40
0.30
0.20
0.10
T Ci = P (xi )
0.00
0.00
0.10
0.20
0.30
0.40
Power, from Correlated Input Vector Stream (uW/MHz/gate)
(3)
D(xi )
2
(4)
where T Ci , P (xi ) and D(xi ) are the temporal correlation, signal probability and transition density, respectively.
Proof: Let us denote the probability of a low-to-high
transition by Plh , and the probability of a high-to-low
transition by Phl . Since a low-to-high transition is
eventually followed by a high-to-low transition, then:
Plh = Phl
(5)
D(xi )
2
(6)
(7)
Hence proved.
Therefore, temporal correlation at the primary
inputs is taken care by P (xi ) and D(xi ) and we do
not need an additional parameter to represent it.
(8)
(9)
(10)
P (x) =
1
0 + 1
(12)
D (x) =
2
0 + 1
(13)
(14)
2 (1 P (x))
D (x)
(15)
1 =
0 =
(16)
3. CHARACTERIZATION
We assume that the combinational circuit is embedded in a larger sequential circuit, so that its input
nodes are the outputs of latches or flip-flops and that
they make at most one transition per clock cycle. We
assume that the sequential design is a single clock system and ignore clock skew, so that the combinational
circuit inputs x1 , x2, . . . , xn switch only at time 0.
At this point it is helpful to recall some definitions. The signal probability P (xi ) at an input node
xi is defined as the average fraction of clock cycles in
which the final value of xi is a logic high. The transition density D(xi ) at an input node xi is defined as
the average fraction of cycles in which the node makes
a logic transition (its final value is different from its
initial value). For brevity, in this section we will write
Pi and Di to represent P (xi ) and D(xi ). Both Pi and
Di are real numbers between 0 and 1.
Because the input signals xi make at most a single transition per cycle, there is a special relationship
between probability and density, given by:
Di
Di
Pi 1
2
2
Din
Din
Pin 1
2
2
(11)
0
0
0.5
P(x)
Pin =
1X
Pi
n i=1
Din =
1X
Di
n i=1
(17)
(18)
Similarly we can derive a special relationship between SCin and Pin , i.e., given Pin we can find lower
and upper bounds for SCin . Because SCin is a probability it can take values only between 0 and 1. Before
describing the bounds, we first recall the definition of
SCin :
SCin
n
X
n
X
2
=
P {xi = 1, xj = 1} (20)
n (n 1) i=1 j=i+1
where:
N
SCin
n
N
n
X
X
1X
2
=
xi,k xj,k
N
n (n 1)
i=1 j=i+1
k=1
2
=
n (n 1) N
N X
n
X
n
X
xi,k xj,k
(22)
and where
i,k is the ith bit in the kth vector. Notice
P
Pn x
n
that i=1 j=i+1 xi,k xj,k = number of bit pairs, in
kth vector, that are (1, 1). Therefore,
n
n X
X
xi,k xj,k =
i=1 j=i+1
2
N n (n 1)
N
X
n1 (k) (n1 (k) 1)
k=1
(23)
where:
1 X n1 (k)
N
n
k=1
N
n1 (k) = nPin
N
Therefore, a lower bound on SCin
is given by:
N
N
nPin
nPin
1
N
SCin
n (n 1)
(28)
(29)
2
nPin
Pin
(n 1)
(30)
N
To compute an upper bound on SCin
, we start
N
with the observation that the maximum value of SCin
in (24) will occur when as many n1 (k)s as possible
are set to their maximum value n, because of the
quadratic term. Since not all n1 (k)s can be set to
N
is achieved by having
n due to (26), the largest SCin
m < N vectors have n1 (k) = n 1s, one vector contain
the remaining r < n 1s, and the remaining vectors
contain all 0s. In other P
words, m is the largest inteN
ger for which mn + r = N
k=1 n1 (k) = N nPin , where
N
c, and
0 < r < n is an integer. With this, m = bN Pin
N
the largest possible value of SCin is given by:
m
r(r 1)
mn (n 1) + r (r 1)
=
+
N n (n 1)
N
N n(n 1)
(31)
From this, it follows that:
N
(24)
(25)
N
Pin
=
N
SCin = lim SCin
Pin
N
At this point, it will be helpful to define Pin
. For
a block of N vectors, Pin can be written as:
N
Pin = lim Pin
(27)
k=1
N
SCin
1X
n1 (k) nPin
N
(19)
(26)
(32)
N
(r/N n) so that
due to the fact that m/N = Pin
limN m/N = Pin .
Combining the lower and upper bounds gives:
2
nPin
Pin
SCin Pin
(n 1)
(33)
The shaded region in Fig. 7 shows the feasible region for Pin and SCin . Shown in Fig. 8 is the threedimensional plot showing the relationship between
Pin , Din , and SCin . The two shaded surfaces are the
lower and upper bounds for SCin for different values
of Pin and Din . It is evident from the figure that Din
1.0 Pin
1
0.8
SCin
0.6
0.4
0.2
0
1
0.8
N
SCin
1
0.6
0.8
0.6
0.4
0.4
0.2
0.2
0
Din
Pin
Dout
1
SCin
(34)
V1
X1 X2 X3
1 1 0
Xn-1 Xn
0 1
V2
V3
VN-1 0
VN
Din
Pin
1
Pin
15.0
10.0
5.0
0.0
0.00
0.02
0.04
0.06
0.08
Distance between actual and calculated values
0.10
Monte Carlo simulation also provides accurate estimation of Dout . The power values predicted by the
look-up table were compared to those from simulation, and the RMS, absolute average and maximum
errors were computed.
The results are summarized in Table 7 for the
case when total power is estimated. It is seen that
the RMS error is very good, under about 5%. The
largest maximum error is at 22.56% for c432, because
the estimated power value is very small and a slight
difference in power value causes a lot of error. The
average error in all cases is less than 6%, which shows
the accuracy of our macromodeling approach. The
combined scatter plot of all ISCAS-85 circuits showing
the accuracy of this approach is shown in Fig. 12. An
enlarged view of the lower section of this plot is given
in Fig. 13. Both these plots report normalized power
values, so that the results for all the circuits can be
examined on the same plot.
Table 7. Accuracy of the 4-d look-up tables,
when total power is estimated.
Circuit RMS.Error Average Error Max.Error Time
c432
0.868%
5.56%
22.56% 11.75hrs
c880
0.647%
3.73%
14.64%
6.24hrs
c1908
0.729%
3.85%
16.89%
4.59hrs
c2670
0.738%
3.08%
11.52% 14.23hrs
c3540
0.802%
3.61%
16.53% 21.32hrs
c5315
0.612%
2.48%
-14.58% 16.21hrs
c6288
4.14%
3.75%
18.23%
34.4hrs
c7552
0.847%
3.03%
-16.58%
58.4hrs
c499
0.497%
4.05%
16.4%
2.3hrs
c1355
0.5167%
4.19%
15.6%
2.09hrs
2.0
Power, from Simulation (uW/MHz/gate)
1.5
1.0
0.5
0.0
0.0
0.5
1.0
1.5
Power, from Macromodel (uW/MHz/gate)
2.0
0.40
0.30
0.20
0.10
0.00
0.00
0.10
0.20
0.30
Power, from Macromodel (uW/MHz/gate)
0.40
0.30
k=1
n21 (k)
N
X
N
n1 (k) = n (n 1) N SCin
(A.1)
k=1
0.25
0.20
N
X
0.15
0.10
N
n1 (k) = nN Pin
(A.2)
k=1
0.05
0.00
0.00
0.05
0.10
0.15
0.20
0.25
Power, from Macromodel (uW/MHz/gate)
0.30
which is a constant.
problem becomes:
minimize
N
X
n21 (k)
k=1
s.t.
N
X
(A.3)
N
n1 (k) = nN Pin
k=1
(A.4)
it can be solved by converting it (by introducing a Lagrangian) into an unconstrained problem [16], leading
to:
!
N
N
X
X
N
minimize
n21 (k) nN Pin
n1 (k)
k=1
k=1
(A.5)
2
n
(k)
where
is
a
constant.
Differentiating
k=1 1
PN
N
nN Pin k=1 n1 (k) with respect to n1 (k) and
setting it equal to 0 we get:
PN
n1 (k) =
(A.6)
n1 (k) =
N
nPin
(A.7)
(A.8)
Hence proved.
[1]
[2]
[3]
[4]
[5]
[6]
REFERENCES
F. Najm, A survey of power estimation techniques in VLSI circuits, IEEE Transactions on
VLSI Systems, pp. 446-455, Dec. 1994.
P. Landman, High-level power estimation, International Symposium on Low Power Electronics and Design, pp. 2935, Monterey, CA, August
1214, 1996.
M. Nemani and F. Najm, Towards a High-Level
Power Estimation Capability, IEEE Transactions on CAD , vol. 15 pp. 588-598, June 1996.
D. Marculescu, R. Marculescu and M. Pedram,
Information Theoretic Measures of Energy Consumption at Register
Transfer Level,
ACM/IEEE International Symposium on Low
Power Design, pp. 87-92, April 1995.
S. R. Powell and P. M. Chau, Estimating Power
Dissipation of VLSI signal Processing Chips: The
PFA technique, VLSI Signal Processing IV, pp.
250-259, 1990.
P. E. Landman and J. M. Rabaey, Architectural
Power Analysis: The Dual Bit Type Method,