Documenti di Didattica
Documenti di Professioni
Documenti di Cultura
1
and performance. Experiments show that in our case it is where is the propagation delay of the CMOS transistor,
possible to lower the processor’s input voltage from 1.5 V VT the threshold voltage, and V G the input gate voltage [1].
to as low as 0.8 V, dramatically reducing power consump- The propagation delay restricts the clock frequency in a mi-
tion. This power reduction comes at the price of reduced croprocessor. From Equations (1) and (2) we can see that
performance, because proper operation is only guaranteed there is a fundamental trade-off between switching speed
at a reduced clock frequency. We anticipate a considerable and supply voltage. Processors can operate at a lower sup-
reduction in power consumption since typical Ubicom ap- ply voltage, but only if the clock frequency is reduced to
plications cause a fluctuating processor demand; wearable tolerate the increased propagation delay. When we assume
devices are used interactively by a single user and activi- the dynamic power is the most dominant one, and the gates
ties like word processing, reading e-mail, and web browsing gk of the microprocessor form a collective switching capac-
generate bursts of processing. itance C with a common switching frequency f , we obtain
This paper investigates the trade-off between power con-
sumption and performance of processors supporting voltage P = C f VDD
2
(3)
scaling in detail. Unlike previous studies in voltage scaling Equation (3) shows that clock frequency reduction linearly
that rely on simulations we have realized an actual imple- decreases power and voltage reduction results in a quadratic
mentation being part of a wearable computer. This allows power reduction. The critical path of a processor is the
us to present raw measurements of the performance/power longest path a signal can travel. The implicit constraint is
trade-off in voltage scaling. From these numbers we are that the propagation delay of the critical path must be
able to derive the potential gain in a system where applica- smaller than f1 . In fact, the processor ceases to function
tions assist the operating system in adjusting the voltage by when VDD is lowered and the propagation delay becomes
specifying their (bursty) requirements. too large to satisfy internal timings at frequency f .
Pdynamic =
XM Ck fk VDD2
(1)
First, the processor is only a part of the total system.
Consequently, the benefits of voltage scaling are to be
weighted against the power consumed by the remainder of
k=1 the system. This issue will be addressed in Section 4.2.
where M is the number of gates in the circuit, C k the load Second, application performance is a combination of pro-
capacitance of gate g k , fk the switching frequency of g k , cessor speed and memory access latency. The performance
and VDD the supply voltage. It is clear from Equation (1) of the memory subsystem is not linearly related to clock fre-
that reduction of V DD is the most effective mean to lower quency, so application performance does not scale linearly
the power consumption. Lowering V DD , however, creates too. Third, certain applications like video players have real-
the problem of increased circuit delay. An estimation of time deadlines to meet. This limits the possibilities to apply
circuit delay is given by voltage scaling by reducing the clock frequency, because
the time needed to complete a task increases proportionally.
/ (V VDDV (2)
A simple heuristic suggested by Weiser et al. [2] is to dis-
G T )2 tinguish foreground tasks, background tasks, and periodic
2
Frequency f Voltage V DD Relative power
(MHz) (V) (%)
700 1.65 100
600 1.60 80.59
500 1.50 59.03
400 1.40 41.14
300 1.25 24.60
200 1.10 12.70
3
2.5
LART
SA1100 Destructive
voltage regulator A processor
+ A 0.6 - 2.3 V V 2
unregulated
Processor voltage in V
power supply V A
- voltage regulator DRAM 1.5 Specified operation area
3.3 V V FLASH
other
1
0
4 Results 74 103 133 162 192 221 251
CPU Clock Frequency in MHz
This section describes the effects of frequency and volt- Figure 3. Processor envelope.
age scaling on the power consumption, processor perfor-
mance, and memory performance of the LART. We ran sev-
eral micro benchmarks (dhrystone, lmbench) as well as a 1000 CPU intensive,
voltage scaling
full-blown H.263 decoder. During each run the clock fre-
quency and voltage were kept constant; dynamic voltage 800
scaling is discussed in Section 5. Power Consumption in mW
CPU intensive,
fixed voltage
600
4.1 Required voltage
400
The first experiments determine for each clock frequency
the minimum required voltage at which the SA-1100 pro-
cessor still functions proper. We used the H.263 decoder to 200 Idle mode
4
700 Memory intensive 7
Total cost
600 6
500 5
300 3
200 2
CPU power CPU only
100 1
0 0
74 103 133 162 192 221 251 44 74 103 133 162 192 221 251
CPU Clock Frequency in MHz CPU Clock Frequency in MHz
Figure 5. Power consumption for read. Figure 7. Energy breakdown for memory read.
5
1200
questionable predictions. The drawback of this approach
Total power
is that it requires modifications of the application source.
1000
This is, however, only required for bursty applications and,
more importantly, applications running on a resource-scarce
Power Consumption in mW
6
6 Related work
7
put voltage. Simulations have pointed out the potentials of for dynamic speed-setting of a low-power CPU”, Mobicom,
voltage scaling. Nov. 1995.
Within the UbiCom project we have assembled a wear- [4] Y. Lee, C.M. Krishna, “Voltage-clock scaling for low energy
able system that supports dynamic voltage scaling. It is consumption in real-time embedded systems”, 6th Int. Conf.
on Real-Time Computing Systems and Applications, 1998.
based upon the low-power embedded SA-1100 processor,
[5] T. Pering, T. Burd, R. Brodersen, “The simulation and evalu-
whose frequency can be varied from 59 MHz to 251 MHz.
ation of dynamic voltage scaling algorithms”, ISLPED, Aug.
The required input voltage varies from 0.8 V to 2.0 V. Mea- 1998.
surements show that the power consumption of the proces- [6] T. Pering, T. Burd, R. Brodersen, “Dynamic voltage scal-
sor at 59 MHz is 33.1 mW, while it consumes 696.7 mW at ing and the design of a low-power microprocessor system”,
251 MHz. It follows that the energy per instruction at min- ISCA, 1998.
imal speed is 1/5 of the energy required at full speed. This [7] T. Burd, R. Brodersen, "Processor design for portable sys-
result confirms the importance of voltage scaling. tems" Journal of VLSI Signal Processing, Aug/Sept 1996.
We have added kernel support to Linux that allows run- [8] T. Kuroda, et.al., “Variable supply-voltage scheme for low-
ning applications to scale the frequency and voltage from power high-speed CMOS digital design”, IEEE Journal of
user space in only 140 s. This allows power-aware ap- Solid-State Circuits, Vol. 33, No. 3, March 1998.
[9] O. Angin, A.T. Campbell, M.E. Kounavis, R.R.F. Liao, “The
plications to quickly adjust the performance level of the
MobiWare Toolkit: Programmable Support for Adaptive
processor whenever the workload changes. For example, a Mobile Networking”, IEEE Personal Communications Mag-
bursty application like an MPEG decoder can use a model to azine, Aug. 1998.
estimate the processing requirements for each frame. Cal- [10] J. Frenkil, “Tools and Methodologies for Low Power De-
culations indicate that with the small delay (140 s) to set sign”, Proc. Design Automation Conference, 1997
the performance level, the overhead can be less than 1%. [11] J. Lorch, “A complete picture of the energy consumption of a
Our future plans are to specify a processor-independent portable computer”, Masters Thesis, University of California
dynamic voltage-scaling framework as an enhancement of at Berkeley, Dec. 1995.
the ACPI standard, so that the voltage scaling technique [12] J.R. Lorch, A.J. Smith, “Scheduling techniques for reducing
processor energy use in MacOS”, Wireless Networks, No. 3,
can become available to a broad audience. Within Ubi-
1997.
Com we are developing a QoS framework such that the
[13] J.R. Lorch, A.J. Smith, “Software strategies for portable
processor can export its performance/power-consumption computer energy management”, IEEE Personal Communi-
trade-off to higher layers. When other components, such as cations, Jun 1998.
the radio transceiver, also export their performance/power- [14] B.D. Noble, et.al., “Agile application-aware adaptation for
consumption trade-offs, intelligent decisions can be made at mobility”, 16th ACM Symposium on OS principles, Saint-
the application level. For example, an audio-streaming ap- Malo, France, Oct. 1997.
plication can choose between little compression (low pro- [15] D. Raychaudhuri, D. Reininger, M. Ott, “Multimedia pro-
cessor load) and a large data stream over the radio, and the cessing and transport for the wireless personal terminal sce-
alternative scenario of heavy compression and low bit rate. nario”, VCIP 95, SPIE Int. Society for Optical Engineering,
Which is best depends on the current external circumstances Taipei, Taiwan, May. 1995
[16] Transmeta corporation, “TM5400 processor specifications”,
that influence the radio.
Jan. 2000.
[17] Intel, “Intel strongarm 1100 microprocessor developers man-
Acknowledgements ual”, Aug 1999.
[18] Compaq research, “Itsy V2 overview slides”, Jan 2000.
[19] L. McVoy, “Lmbench: portable tools for performance analy-
We would like to thank Jan-Derk Bakker and Erik Mouw sis”, USENIX, Jan. 1996.
for providing us with an excellent low-power platform, and [20] J.A. Pouwelse, K. Langendoen, H.J. Sips, “A feasible low-
assisting us with the measurements and their interpretation. power augmented-reality terminal”, 2nd International Work-
We thank Hylke van Dijk for commenting on the draft ver- shop on Augmented Reality, Oct. 1999.
sion of this paper. [21] J.D. Bakker, J.A.K. Mouw, “Linux Embedded Radio Termi-
nal design page”, http://www.lart.tudelft.nl/.
[22] Intel, microsoft, Toshiba, “Advanced Configuration and
References Power Interface Specifications”, revision 1.0b, Feb. 1999.
[23] T. Martin, “Balancing batteries, power, and performance:
[1] T. Ishihara, H Yasuura, “Voltage scheduling problem for dy- system issues in CPU speed-setting for mobile computing”,
namically variable voltage processors”, ISLPED, Aug. 1998. PhD. dissertation, Carnegie Mellon University, Aug. 1999.
[2] M. Weiser, B. Welch, A. Demers, S. Shenker, “Scheduling [24] R.L. Lagendijk, “UbiCom Technical Annex”, Delft Univer-
for reduced CPU energy”, OSDI, Nov. 1994. sity of Technology, Jan. 2000, http://www.ubicom.tudelft.nl.
[3] K. Govil, E. Chan, H. Wasserman, “Comparing algorithms