Chang Sik 2000

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II: ANALOG AND DIGITAL SIGNAL PROCESSING, VOL. 47, NO. 9, SEPTEMBER 2000
In summary, the MAXLINE2 algorithm is the clear winner in terms of
data buffer requirements and average execution time on i.i.d.
data. However, for applications involving correlated input data or for
real-time applications where worst-case execution time is important,
the MAXTREE and MAXTREE2 algorithms are the preferred choice.
Finally, it should be noted that although the Pitas algorithm from [5]
is, as noted above, uncompetitive as a software algorithm, it remains a
good choice for a parallel hardware implementation since it requires
only
registers and log2 ( ) comparators.
ACKNOWLEDGMENT
The author would like to thank the anonymous reviewers and Dr.
J. Chambers for their helpful comments which substantially improved
this paper.
REFERENCES
[1] G. R. Arce and M. P. McLoughlin, "Theoretical analysis of the max/median filter," IEEE Trans. Acoust., Speech, Signal Processing, vol.
ASSP-35, pp. 60-69, Jan. 1987.
[2] P. A. Maragos and R. W. Schafer, "Morphological filters, Part I: Their
set theoretic analysis and relations to linear shift invariant filters,"
IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-35, pp.
1153-1184, Aug. 1987.
[3] R. Martin, "Spectral subtraction based on minimum statistics," in Proc.
EUSIPCO-94, Edinburgh, Scotland, Sept. 1994, pp. 1182-1185.
[4] I. Pitas, "Fast algorithms for running ordering and max/min calculation,"
IEEE Trans. Circuits Syst., vol. 36, pp. 795-804, 1989.
[5] D. Coltuc and I. Pitas, "On fast running max-min filtering," IEEE Trans.
Circuits Syst. II, vol. 44, pp. 660-664, Aug. 1997.
[6] J. Garofolo et al., "DARPA TIMIT acoustic-phonetic continuous speech
corpus (CD-ROM)," National Institute of Standards and Technology,
1990.
Fig. 1. (a) Tapered CMOS buffer and (b) its timing diagram.
A CMOS Buffer Without Short-Circuit Power Consumption
Changsik Yoo
Abstract: A new CMOS buffer without short-circuit power consumption is proposed. The gate-driving signal of the output pull-up (pull-down)
transistor is fed back to the output pull-down (pull-up) transistor to tri-state the output
momentarily, eliminating the short-circuit power consumption. HSPICE simulation results verify the operation of the proposed
buffer and show that its power-delay product is about 15% smaller than that of a conventional tapered CMOS buffer.
Index Terms: CMOS buffer, short-circuit power consumption.
I. INTRODUCTION
With the high integration level of CMOS very large scale integration (VLSI), the capacitive load of periodic signals such as the clock has
become very large. With such a large capacitive load, driving circuits
consume a relatively large portion of the total power of a VLSI chip. The
power consumption of a CMOS buffer driving a capacitive load consists of dynamic switching power and short-circuit power. While the
switching power consumption is unavoidable when driving a capacitive load,
short-circuit power is wasted and should be minimized or
even eliminated for low-power operation.
Manuscript received June 1999; revised June 2000. This paper was recommended by Associate Editor M. Bayoumi.
The author was with Integrated Systems Laboratory (IIS), Swiss Federal Institute of Technology, Zurich, Switzerland. He is now with Samsung Electronics,
Kiheung, Korea.
Publisher Item Identifier S 1057-7130(00)07752-1.
Fig. 2. (a) Feedback-controlled split-path CMOS buffer and (b) its timing diagram.
A conventional tapered CMOS buffer, shown in Fig. 1(a), consumes
both dynamic switching power and short-circuit power due to simultaneous turn-on of the pull-up and pull-down transistors, as illustrated
in Fig. 1(b) [1]. Short-circuit power consumption can be eliminated by
tri-stating the output node momentarily before every output signal transition. In [2], asymmetric inverters were used as a waveform shaper to
obtain a momentary tri-state output period, but the asymmetric inverters increase the propagation delay. As an alternative, a feedback-controlled split-path (FS) CMOS buffer was proposed, where the output
signal is fed back to control the output pull-up and pull-down transistors, as shown in Fig. 2, tri-stating the output momentarily and thereby
eliminating the short-circuit power consumption [3]. In the FS
CMOS buffer, however, the logic states of the split output-stage drivers change
twice for every output signal transition, increasing the power consumption. In the charge-transfer feedback-controlled split-path (CFS) CMOS
buffer, this additional power consumption is minimized by transferring
the large charge stored in the output-stage driver to the output node
[4]. In both the FS and CFS buffers, the feedback delays t1 and t2 must
be controlled very well because, if t1 and t2 are too small, the output
transistors can be turned off before the output transition is complete. The
feedback delays t1 and t2 depend on the load capacitance, which
makes their control complicated.
1057-7130/00$10.00 © 2000 IEEE
TABLE I. TRANSISTOR SIZES OF THE PROPOSED CMOS BUFFER IN FIG. 3
Fig. 3. (a) Proposed CMOS buffer without short-circuit power consumption. (b) Timing diagram of the buffer.
Fig. 4. Simulated waveform of the proposed CMOS buffer.
Fig. 5. (a) Total power consumption and (b) propagation delay of CMOS buffers as a function of load capacitance.
TABLE II. TOTAL ACTIVE AREA OF CMOS BUFFERS
In this brief, a new CMOS buffer without short-circuit power consumption is proposed. The output pull-up and pull-down transistors
are driven by separate driving signals, generated so that the pull-up and
pull-down transistors never turn on simultaneously.
II. A CMOS BUFFER WITHOUT SHORT-CIRCUIT POWER CONSUMPTION
The schematic and timing diagram of the proposed CMOS buffer
are shown in Fig. 3(a) and (b), respectively. Whereas the output signal
itself is fed back in the FS and CFS buffers, here the gate driving
signal N1 (N2) of the output pull-up (pull-down) transistor is fed back
to the output pull-down (pull-up) transistor to tri-state the output momentarily, eliminating the short-circuit power consumption. The logic
states of the output-stage driver change only once for each output transition in the proposed buffer, as opposed to twice in the FS and CFS
buffers. Since the gate driving signals are fed back instead of the output
signal itself, the feedback delay is independent of the output capacitive
load, making the optimization of the circuit much easier. The pull-up
and pull-down operations are explained in Sections II-A and II-B, respectively.
IV. CONCLUSION
A new CMOS buffer that has no short-circuit power consumption has been
proposed. The output pull-up and pull-down transistors are
driven by separate driving stages, which ensure that the pull-up and pull-down
transistors do not turn on simultaneously. The HSPICE simulation results show about a 15% improvement in the power-delay product compared to a conventional tapered CMOS buffer, and thus the proposed
buffer is suitable for low-power operation.
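For reference, the figure of merit quoted above can be written out explicitly. The symbols below are notation assumed here for illustration, not taken from the paper: P_total is the total power consumption and t_pd the propagation delay at a given load, as plotted in Fig. 5.

```latex
% Power-delay product (standard definition; symbol names are assumptions)
\mathrm{PDP} = P_{\mathrm{total}} \, t_{\mathrm{pd}}

% "About 15% smaller" than the conventional tapered buffer then reads
\frac{\mathrm{PDP}_{\mathrm{proposed}}}{\mathrm{PDP}_{\mathrm{tapered}}} \approx 0.85
```

Because the product combines both quantities, a 15% reduction does not by itself say how much of the gain comes from lower power and how much from shorter delay.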
REFERENCES
[1] N. Li, F. Haviland, and A. Tuszynski, "A CMOS tapered buffer," IEEE
J. Solid-State Circuits, vol. 25, pp. 1005-1008, Aug. 1990.
[2] K. Y. Khoo and A. N. Wilson Jr., "Low power CMOS clock buffer,"
Proc. Int. Symp. Circuits and Systems, vol. 4, pp. 355-358, 1994.
[3] H.-Y. Huang and Y.-H. Chu, "Feedback-controlled split-path CMOS
clock buffer," Proc. Int. Symp. Circuits and Systems, vol. 4, pp. 300-303,
1996.
[4] K.-H. Cheng, W.-B. Yang, and H.-Y. Huang, "The charge-transfer feedback-controlled split-path CMOS buffer," IEEE Trans. Circuits Syst. II,
vol. 46, pp. 346-348, Mar. 1999.
A. Output Pull-Up Operation

When the input signal IN rises from 0 to VDD, the internal node
N2 falls from VDD to zero, turning off the output pull-down transistor
M2. Then, the node N4 rises from zero to VDD and, after some delay,
the node N1 falls from VDD to zero. Now, the output pull-up transistor
M1 is turned on and the output voltage begins to rise from zero to VDD.
Since the node N2 is driven to zero before the node N1, the pull-down
transistor M2 is turned off before the pull-up transistor M1 is turned
on. Therefore, there is no period when both the pull-up and pull-down
transistors are turned on simultaneously, and thus no short-circuit power
consumption.
B. Output Pull-Down Operation
When the input signal IN falls from VDD to 0, the node N1 is driven
to VDD, turning off the output pull-up transistor M1. Then, the node
N3 falls from VDD to zero and, after some delay, the node N2 rises
from zero to VDD. Now, the output pull-down transistor M2 is turned
on and the output voltage begins to fall from VDD to zero. Since the
node N1 is driven to VDD before the node N2, there is no period when
both the pull-up and pull-down transistors are turned on simultaneously,
and thus no short-circuit power consumption in this case either.
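The turn-off-before-turn-on ordering described in Sections II-A and II-B can be checked with a small behavioural model. The sketch below is not the author's circuit or simulation: it reduces the buffer to ideal step events on N1 and N2, and the delays `t_off` and `t_on`, like the function names, are hypothetical values in arbitrary time units. It only assumes what the text states: M1 conducts when N1 is low, M2 conducts when N2 is high, and the gate signal that turns a transistor off always switches before the one that turns the other transistor on.

```python
# Behavioural sketch only; delays are hypothetical, not taken from the paper.

def gate_events(edges, t_off=1, t_on=3):
    """Return time-sorted (time, node, level) events for gate nodes N1 and N2.

    edges: list of (time, new_level_of_IN), where 1 means VDD and 0 means ground.
    t_off: assumed delay from an IN edge to the gate signal that turns a transistor off.
    t_on:  assumed delay from an IN edge to the gate signal that turns a transistor on.
    """
    events = []
    for t, level in edges:
        if level == 1:                              # IN rises: output pull-up
            events.append((t + t_off, "N2", 0))     # M2 (pull-down) turns off first
            events.append((t + t_on, "N1", 0))      # M1 (pull-up) turns on later
        else:                                       # IN falls: output pull-down
            events.append((t + t_off, "N1", 1))     # M1 turns off first
            events.append((t + t_on, "N2", 1))      # M2 turns on later
    return sorted(events)


def overlap_time(events, n1=1, n2=1):
    """Total time with M1 (on when N1 == 0) and M2 (on when N2 == 1) both on.

    n1, n2: initial gate levels; the defaults correspond to IN held low.
    """
    state = {"N1": n1, "N2": n2}
    overlap, last_t = 0, None
    for t, node, level in events:
        if last_t is not None and state["N1"] == 0 and state["N2"] == 1:
            overlap += t - last_t
        state[node] = level
        last_t = t
    return overlap


edges = [(0, 1), (10, 0), (20, 1), (30, 0)]         # IN toggling with period 20
print(overlap_time(gate_events(edges)))             # ordering from the text
print(overlap_time(gate_events(edges[:2], t_off=3, t_on=1)))  # reversed ordering
```

With the ordering from the text the two output transistors never conduct at the same time in this model, whereas swapping the two delays produces a nonzero overlap, which is the short-circuit condition the proposed buffer is designed to avoid.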
III. SIMULATION RESULTS

The proposed CMOS buffer has been simulated by HSPICE with
0.25-µm 2.5-V CMOS parameters. The sizes of the transistors are listed
in Table I. The simulated waveforms of the proposed buffer are shown
in Fig. 4 for an input clock frequency of 200 MHz and an output
load capacitance of 50 pF. It can be seen that the gate driving signals N1
and N2 are generated such that the transistors M1 and M2 do not turn on
simultaneously, eliminating the short-circuit power consumption. The
total power consumption and the propagation delay of the proposed
buffer are compared with those of the conventional tapered CMOS buffer
and the FS buffer in Fig. 5. The figure shows that the power consumption of the proposed buffer is smaller than that of the earlier designs. The
power-delay product of the proposed buffer is about 15% smaller
than that of the conventional tapered CMOS buffer, although the proposed buffer
occupies a larger silicon area than a conventional tapered CMOS buffer
because of the separate control of the pull-up and pull-down paths, as compared in Table II.
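To put the simulated operating point in perspective, the unavoidable dynamic switching power of the load alone can be estimated with the standard first-order relation P = C V^2 f. The sketch below is a back-of-the-envelope estimate, not a result from the paper: it assumes a full rail-to-rail output swing and one charge/discharge cycle of the load per clock period, and it ignores the power drawn by the internal driver stages. The function name is hypothetical; the numerical values are the ones quoted in this section.

```python
def dynamic_switching_power(c_load, vdd, freq):
    """First-order dynamic power (W) of a load switched rail to rail once per cycle.

    Assumes the supply delivers c_load * vdd**2 joules per clock period.
    """
    return c_load * vdd ** 2 * freq


C_LOAD = 50e-12   # output load capacitance from Section III: 50 pF
VDD = 2.5         # supply voltage from Section III: 2.5 V
F_CLK = 200e6     # input clock frequency from Section III: 200 MHz

p_dyn = dynamic_switching_power(C_LOAD, VDD, F_CLK)
energy_per_cycle = C_LOAD * VDD ** 2

print(f"{p_dyn * 1e3:.1f} mW")               # about 62.5 mW
print(f"{energy_per_cycle * 1e12:.1f} pJ")   # about 312.5 pJ per clock period
```

Under these assumptions the load alone accounts for roughly 62.5 mW, which is the floor that any buffer must supply; the short-circuit component removed by the proposed buffer comes on top of this figure.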
RNS Arithmetic Multiplier for Medium and Large Moduli

Ahmad A. Hiasat
Abstract: In implementing Residue Number System (RNS) arithmetic
multipliers, ROM-based structures are very efficient for small moduli.
However, due to their exponential growth, ROM implementations are
not suitable for medium and large moduli. This paper introduces an
architecture for an RNS-based multiplier which combines the use of
small-size ROMs and arithmetic components. The design is most suitable
for medium and large moduli. Compared with other implementations, the
VLSI layout implementation of this new approach is shown to be more
efficient in terms of area and delay requirements.
Index Terms: Area and time complexity, modular multiplication, multioperand modular adder, residue number system, VLSI.
I. INTRODUCTION
The Residue Number System (RNS) has the advantage of carry-free
arithmetic operations. Therefore, using residue arithmetic would, in
principle, increase the speed of computations. Specifically, addition,
subtraction, and multiplication can be carried out on each residue digit
concurrently and independently. RNS has demonstrated high efficiency in implementing different types of digital filters, which depend
mainly on the above-mentioned operations. It has been successfully
used in applications involving the design of fast number
theoretic transforms, discrete Fourier transforms, and many other areas
[1]. Therefore, designing an efficient modular multiplier has been
an important task in realizing different RNS-based applications and
processors. Modular multiplication is defined as evaluating
$|XY|_m$, where $X, Y \in [0, m)$. Defining $Z$ as $Z = |XY|_m$, then
$Z$ is the least nonnegative remainder when dividing the product
Manuscript received July 1998; revised May 2000. This paper was recommended by Associate Editor W. Liu.
The author is with Electronics Engineering Department, Princess Sumaya
University, Amman 11941, Jordan.
Publisher Item Identifier S 1057-7130(00)07746-6.