5.1 Introduction
Addition is the most commonly used arithmetic operation in microprocessors, digital signal processors (DSPs), and similar systems, so binary adders are crucial building blocks in very large scale integrated (VLSI) circuits. The adder is often the speed-limiting element as well, which makes its careful optimization of the utmost importance. This optimization can proceed at either the logic level or the circuit level. Typical logic-level optimizations rearrange the Boolean equations so that a faster or smaller circuit is obtained, while circuit-level optimizations concern transistor sizing and the choice of logic style. Six 1-bit full adder circuits based on different logic styles are chosen for extensive evaluation.

Previous work has concluded that no single cell is ideal for all types of applications. Many different circuits for binary addition have therefore been proposed over the last several decades, covering a wide range of performance characteristics to satisfy the constraints imposed by different applications.

The logic style used in an adder cell influences the speed, size, and power dissipation of the circuit. Circuit delay is determined by the number of inversion levels, the number of transistors in series, and the transistor sizes. Circuit size depends on the number of transistors and their sizes. Power dissipation is determined by the switching activity and the node capacitance, which is made up of gate, diffusion, and wire capacitance. All these characteristics may vary considerably from one logic style to another, which makes the proper choice of logic style crucial for the circuit designer.

In this chapter, complementary CMOS logic, complementary pass-transistor logic, double pass-transistor logic, transmission gates, pseudo-NMOS, and a combination of XOR and transmission gates, all of which belong to the class of static logic, are used as the basis for comparison. Different design requirements such as area, speed, and power consumption generally translate into the use of different logic styles.
A proper choice of logic style can considerably improve different aspects of the performance of a 1-bit full adder cell. A major distinction is made between static and dynamic logic styles. In static logic, each gate output assumes at all times the value of the Boolean function implemented by the circuit. This means that at every point in time each output is connected to either VDD or VSS via a low-resistance path. Static logic is a viable candidate for low-power circuit design because it eliminates precharging and avoids the extra power dissipated by clocking. The best-known static logic styles are complementary CMOS, the pass-transistor logic styles (complementary pass-transistor logic (CPL), double pass-transistor logic (DPL), and single-rail pass-transistor logic (LEAP)), and pseudo-NMOS.
Figure 5.1 Complementary CMOS full adder schematic

This is in contrast to the dynamic circuit class, which relies on temporary storage of signal values on the capacitance of high-impedance circuit nodes. The dynamic approach results in simpler and faster gates. Its design and operation are, however, more involved and prone to failure because of an increased sensitivity to noise.

One of the advantages of the complementary CMOS full adder cell is its high noise margins and thus reliable operation at low voltages and arbitrary transistor sizes (ratioless logic). The layout of CMOS gates is straightforward because of the complementary transistor pairs. The figure above shows a 3-input logic gate in which all inputs are distributed to both the pull-up network (PUN) and the pull-down network (PDN). The function of the PUN is to provide a connection between the output and VDD whenever the output of the logic gate is meant to be 1 (based on the inputs). Similarly, the function of the PDN is to connect the output to VSS when the output of the logic gate is meant to be 0. The PUN and PDN are constructed in a mutually exclusive fashion such that one and only one of the networks is conducting in steady state. In this way, once the transients have settled, a path always exists between VDD and the output F, realizing a high output (one), or between VSS and F, realizing a low output (zero). This is equivalent to stating that the output node is always a low-impedance node in steady state.

A transistor can be thought of as a switch controlled by its gate signal. An NMOS switch is on when the controlling signal is high and off when the controlling signal is low. A PMOS transistor acts as an inverse switch that is on when the controlling signal is low and off when the controlling signal is high. The PDN is constructed using NMOS devices, while PMOS transistors are used in the PUN. The primary reason for this choice is that NMOS transistors produce strong zeros and PMOS devices generate strong ones.

Consider an output capacitance that is initially charged to VDD, and two possible ways of discharging it. An NMOS device pulls the output all the way down to GND, while a PMOS device lowers the output no further than |VTP|; the PMOS turns off at that point and stops contributing discharge current. NMOS transistors are hence the preferred devices in the PDN. Similarly, there are two alternative ways to charge a capacitor whose output is initially at GND. A PMOS switch succeeds in charging the output all the way to VDD, while an NMOS device fails to raise the output above VDD − VTN. This explains why PMOS transistors are preferentially used in the PUN. Complementary CMOS gates inherit all the nice properties of the basic CMOS inverter. They exhibit rail-to-rail swing with VOH = VDD and VOL = GND.
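The strong and weak levels described above can be illustrated with a very small switch-level sketch. It is only a first-order model: the threshold values and the function names are hypothetical, the body effect is ignored, and each switch is assumed to be fully turned on (NMOS gate at VDD, PMOS gate at GND).

```python
VDD = 3.0       # supply voltage, matching the 3 V used in the simulations
VTN = 0.7       # hypothetical NMOS threshold voltage
VTP_ABS = 0.7   # hypothetical magnitude of the PMOS threshold voltage

def nmos_switch(v_drive, vdd=VDD, vtn=VTN):
    """Final node voltage behind a fully-on NMOS switch driven toward v_drive.

    The NMOS turns off once the node rises to VDD - VTN, so it passes
    a strong 0 but only a weak 1.
    """
    return min(v_drive, vdd - vtn)

def pmos_switch(v_drive, vtp_abs=VTP_ABS):
    """Final node voltage behind a fully-on PMOS switch driven toward v_drive.

    The PMOS turns off once the node falls to |VTP|, so it passes
    a strong 1 but only a weak 0.
    """
    return max(v_drive, vtp_abs)

if __name__ == "__main__":
    print("NMOS pulling down:", nmos_switch(0.0))   # reaches GND
    print("NMOS pulling up:  ", nmos_switch(VDD))   # stops at VDD - VTN
    print("PMOS pulling up:  ", pmos_switch(VDD))   # reaches VDD
    print("PMOS pulling down:", pmos_switch(0.0))   # stops at |VTP|
```

With these assumed thresholds, only the NMOS pull-down and the PMOS pull-up reach the rails, which is the reason for the device assignment in the PDN and PUN.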
The circuits also have no static power dissipation, because they are designed so that the pull-down and pull-up networks are mutually exclusive. While complementary CMOS is a very robust and simple approach for implementing logic gates, two major problems arise as the complexity of the gate (i.e., the fan-in) increases. First, the number of transistors required to implement an N fan-in gate is 2N, which can result in a significant implementation area. Second, the propagation delay of a complementary CMOS gate deteriorates rapidly as a function of the fan-in. The large number of transistors (2N) increases the overall capacitance of the gate. The output capacitance increases linearly with the fan-in, since the number of PMOS devices connected to the output node increases linearly with the fan-in. A series connection of transistors in either the PUN or the PDN slows the gate as well, because the effective (dis)charging resistance is increased. For the same gate, the effective resistance of the PDN path increases linearly with the fan-in. Since both the output capacitance and the pull-down resistance increase linearly, the high-to-low delay can increase quadratically. For the N-input NAND gate, the low-to-high delay increases only linearly, since the pull-up resistance remains unchanged and only the capacitance increases linearly.

The fan-out has a large impact on the delay of complementary CMOS logic as well. Each input to a CMOS gate connects to both an NMOS and a PMOS device and presents a load to the driving gate equal to the sum of the gate capacitances. At first glance, it would appear that the increase in resistance for larger fan-in can be solved by making the devices in the transistor chain wider. Unfortunately, this does not improve the performance as much as expected, since widening a device also increases its gate and diffusion capacitances, which has an adverse effect on the gate performance.

An often-mentioned disadvantage of the complementary CMOS full adder cell is the substantial number of large PMOS transistors, resulting in high input loads, more power consumption, and larger silicon area. This adder cell uses the carry-out signal (Co) to generate Sum, which produces an unwanted additional delay. Another drawback of CMOS is the relatively weak output driving capability caused by the series transistors in the output stage.
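The dependence of Sum on the carry-out can be checked at the Boolean level. The sketch below assumes the formulation commonly used for this kind of cell, Co = A·B + Ci·(A + B) and Sum = A·B·Ci + (A + B + Ci)·NOT(Co); the text does not spell these equations out, so they are an assumption here, and the function name is illustrative only.

```python
from itertools import product

def full_adder_via_cout(a, b, ci):
    """Compute (sum, carry_out) with Sum derived from the inverted carry-out."""
    co = (a & b) | (ci & (a | b))
    # Sum reuses the complement of Co, so it can only settle after Co has.
    s = (a & b & ci) | ((a | b | ci) & (1 - co))
    return s, co

if __name__ == "__main__":
    for a, b, ci in product((0, 1), repeat=3):
        s, co = full_adder_via_cout(a, b, ci)
        assert s == a ^ b ^ ci            # matches the XOR definition of Sum
        assert 2 * co + s == a + b + ci   # matches ordinary binary addition
        print(a, b, ci, "->", "Sum", s, "Co", co)
```

Because Sum is computed from the complement of Co, the carry logic lies on the path to the Sum output, which is consistent with the additional delay noted above.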
In a pseudo-NMOS gate, the nominal high output voltage (VOH) is VDD, since the pull-down devices are turned off when the output is pulled high (assuming that VOL is below VTN). On the other hand, the nominal low output voltage is not 0 V, because the devices in the PDN fight against the grounded PMOS load device. This results in reduced noise margins and, more importantly, static power dissipation.
The sizing of the load device relative to the pull-down devices can be used to trade off parameters such as noise margin, propagation delay, and power dissipation. Since the voltage swing on the output and the overall functionality of the gate depend on the ratio between the NMOS and PMOS sizes, the circuit is called ratioed. This is in contrast to the ratioless logic styles, such as complementary CMOS, where the low and high levels do not depend on transistor sizes. The value of VOL is obtained by equating the currents through the driver and load devices for VIN = VDD. At this operating point, it is reasonable to assume that the NMOS device is in linear mode (since the output should ideally be close to 0 V), while the PMOS load is saturated.

On the negative side are the static power consumption of the pull-up transistor and the reduced output voltage swing, which makes this cell more susceptible to noise. To increase the output swing, two CMOS inverters are added to this circuit, which increases the total transistor count of the cell to 18. A major disadvantage of the pseudo-NMOS gate is the static power that is dissipated when the output is low, through the direct current path that exists between VDD and GND. The trade-off between the static and dynamic properties is apparent: a larger pull-up device improves performance, but it increases static power dissipation and lowers noise margins (i.e., it increases VOL).
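The current-matching condition for VOL can be sketched numerically. The example below assumes the simple long-channel square-law model, which the text does not specify: the NMOS driver in linear mode carries kn·((VDD − VTN)·VOL − VOL²/2) and the saturated PMOS load carries (kp/2)·(VDD − |VTP|)². The gain factors kn and kp, the thresholds, and the function name are hypothetical illustration values, not taken from the 250 nm process used in this work.

```python
import math

def pseudo_nmos_vol(vdd, vtn, vtp_abs, kn, kp):
    """Solve kn*((vdd-vtn)*vol - vol**2/2) == (kp/2)*(vdd-vtp_abs)**2 for vol.

    Returns the smaller root, which corresponds to the NMOS in linear mode.
    The saturated-load assumption additionally requires vol <= vtp_abs.
    """
    overdrive_n = vdd - vtn
    disc = overdrive_n ** 2 - (kp / kn) * (vdd - vtp_abs) ** 2
    if disc < 0:
        raise ValueError("load too strong: no linear-mode solution for VOL")
    return overdrive_n - math.sqrt(disc)

if __name__ == "__main__":
    vdd, vtn, vtp_abs = 3.0, 0.7, 0.7     # hypothetical values
    kn = 200e-6                           # hypothetical driver gain factor (A/V^2)
    for kp in (10e-6, 20e-6, 40e-6):      # progressively stronger load device
        vol = pseudo_nmos_vol(vdd, vtn, vtp_abs, kn, kp)
        i_static = 0.5 * kp * (vdd - vtp_abs) ** 2
        print(f"kp/kn = {kp / kn:.2f}: VOL = {vol:.3f} V, "
              f"static power = {vdd * i_static:.2e} W")
```

Under these assumptions, a stronger load (larger kp/kn) raises both VOL and the static power drawn while the output is low, which is the trade-off described above.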
The static power dissipation of pseudo-NMOS limits its use. However, pseudo-NMOS still finds use in large fan-in circuits. When area is most important, its reduced transistor count compared to complementary CMOS is quite attractive.
The most widely used solution to the voltage-drop problem is the transmission gate. It builds on the complementary properties of NMOS and PMOS transistors: NMOS devices pass a strong 0 but a weak 1, while PMOS transistors pass a strong 1 but a weak 0. Ideally, an NMOS device is used to pull down and a PMOS device to pull up. The transmission gate combines the best of both device flavors by placing an NMOS device in parallel with a PMOS device (Fig. 3.5). The two control signals of the transmission gate (C and its complement) are complementary. The transmission gate acts as a bidirectional switch controlled by the gate signal C. When C = 1, both MOSFETs are on, allowing the signal to pass through the gate.
Figure 3.5 16-transistor full adder

The 16-transistor design is implemented using S-Edit, and its corresponding layout is shown in Figure 3.5.1.
This cell occupies less area than the complementary CMOS full adder cell. It is also superior in terms of power dissipation, owing to its low activity factor and because a strong signal passes through fewer logic stages than in the other cells.
Figure 3.5 XOR and Transmission Gate Full Adder schematic

The pass transistor and the transmission gate are, unfortunately, not ideal switches; each has a series resistance associated with it. To quantify the resistance, consider the circuit in Figure 3.6, which charges a node from 0 V to VDD. In this discussion, we use the large-signal definition of resistance, which divides the voltage across the switch by the drain current. The effective resistance of the switch is modeled as the parallel connection of the resistances RN and RP of the NMOS and PMOS devices, defined as (VDD − Vout)/IN and (VDD − Vout)/IP, respectively. The currents through the devices depend on the value of Vout and on the operating mode of the transistors. During the low-to-high transition, the pass transistors traverse a number of operating modes.

An effective use of transmission gates is the popular XOR circuit shown in Figure 3.5. The complete implementation of this gate requires only six transistors (including the inverter used to generate the complement of B), compared with the twelve transistors required for a complementary CMOS implementation. To understand the operation of this circuit, we analyze the B = 0 and B = 1 cases separately. For B = 1, transistors M1 and M2 act as an inverter while the transmission gate M3/M4 is off; hence $F = \bar{A}B$. In the opposite case, M1 and M2 are disabled and the transmission gate is operational, so $F = A\bar{B}$. The combination of both results in the XOR function. Notice that, regardless of the values of A and B, node F always has a connection to either VDD or GND and is hence a low-impedance node. When designing static pass-transistor networks, it is essential to adhere to the low-impedance rule under all circumstances. Other examples where transmission-gate logic is used effectively are fast adder circuits and registers.
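The case analysis of the transmission-gate XOR can be written down as a behavioral sketch. The function below only models the two cases described in the text (inverter path for B = 1, transmission-gate path for B = 0); it is not a transistor-level model, and the function name is illustrative.

```python
def transmission_gate_xor(a, b):
    """Behavioral model of the six-transistor XOR, one branch per case of B."""
    if b == 1:
        # M1/M2 act as an inverter on A; the transmission gate M3/M4 is off.
        return 1 - a
    # M1/M2 are disabled; the transmission gate M3/M4 passes A unchanged.
    return a

if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            f = transmission_gate_xor(a, b)
            assert f == a ^ b   # the two cases together realize XOR
            print(f"A={a} B={b} -> F={f}")
```

In both branches the output is actively driven, which mirrors the low-impedance rule stated above.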
Fig1:XNOR
The 10-transistor cell consists of 5 PMOS and 5 NMOS devices, each with its own width and length, and the circuit is designed around these dimensions. The cell has three inputs (A, B, and C) and two outputs (SUM and CARRY), whose values are determined by the inputs.
[Figure: 4-bit ripple-carry adder with operand inputs A0 to A3 and B0 to B3, carry input Ci, carry output Co, and sum outputs S0 to S3]
The propagation delay of such a structure (also called the critical path) is defined as the worst-case delay over all possible input patterns. The worst-case delay of a ripple-carry adder occurs when a carry generated at the least significant bit position propagates all the way to the most significant bit position, where it is consumed in the last stage to produce the sum. The delay is then proportional to the number of bits N in the input words and is approximated by

$$t_{adder} \approx (N-1)\,t_{carry} + t_{sum} \qquad (3.4)$$

where $t_{carry}$ and $t_{sum}$ equal the propagation delays from Ci to Co and from Ci to S, respectively [20].
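A bit-level sketch makes the carry chain and its linear delay concrete. It assumes the usual first-order estimate (N − 1)·t_carry + t_sum for the worst-case delay; the function names are illustrative and the delay values in the usage example are hypothetical, not measured.

```python
def full_adder(a, b, ci):
    """One full adder stage: returns (sum, carry_out)."""
    s = a ^ b ^ ci
    co = (a & b) | (ci & (a ^ b))
    return s, co

def ripple_carry_add(a_bits, b_bits, ci=0):
    """Add two equal-length bit lists (least significant bit first)."""
    sum_bits = []
    carry = ci
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)   # each stage waits for the previous carry
        sum_bits.append(s)
    return sum_bits, carry

def ripple_delay(n_bits, t_carry, t_sum):
    """First-order worst-case delay: N-1 carry stages plus the final sum stage."""
    return (n_bits - 1) * t_carry + t_sum

if __name__ == "__main__":
    # 1111 + 0001: the carry generated at bit 0 ripples through every stage.
    s, co = ripple_carry_add([1, 1, 1, 1], [1, 0, 0, 0])
    print("sum bits (LSB first):", s, "carry out:", co)
    # Hypothetical stage delays in picoseconds.
    print("4-bit worst-case delay:", ripple_delay(4, t_carry=100.0, t_sum=150.0), "ps")
```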
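$$C_{i+1} = G_i + P_i C_i$$

$$C_1 = G_0 + P_0 C_0 \qquad (3.6)$$

$$C_2 = G_1 + P_1 C_1 = G_1 + P_1 G_0 + P_1 P_0 C_0 = G_{1:0} + P_{1:0} C_0 \qquad (3.7)$$

$$C_3 = G_2 + P_2 C_2 = G_2 + P_2 G_1 + P_2 P_1 G_0 + P_2 P_1 P_0 C_0 = G_{2:0} + P_{2:0} C_0 \qquad (3.8)$$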
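The expanded carry expressions can be checked against the carry recurrence C(i+1) = G(i) + P(i)·C(i) by exhaustive enumeration. The sketch assumes the standard bit-level definitions G(i) = A(i)·B(i) and P(i) = A(i) XOR B(i), which are not stated explicitly in this section; the function names are illustrative.

```python
from itertools import product

def lookahead_carries(g, p, c0):
    """C1, C2, C3 written out directly in terms of G, P, and C0."""
    c1 = g[0] | (p[0] & c0)
    c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c0)
    c3 = (g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0])
          | (p[2] & p[1] & p[0] & c0))
    return c1, c2, c3

def recurrence_carries(g, p, c0):
    """The same carries obtained stage by stage from the recurrence."""
    carries = []
    c = c0
    for gi, pi in zip(g, p):
        c = gi | (pi & c)
        carries.append(c)
    return tuple(carries)

if __name__ == "__main__":
    for bits in product((0, 1), repeat=7):
        a, b, c0 = bits[0:3], bits[3:6], bits[6]
        g = [x & y for x, y in zip(a, b)]   # assumed definition of generate
        p = [x ^ y for x, y in zip(a, b)]   # assumed definition of propagate
        assert lookahead_carries(g, p, c0) == recurrence_carries(g, p, c0)
    print("expanded carry equations agree with the recurrence for all inputs")
```

The expanded form depends only on the G, P, and C0 signals, so all three carries can be evaluated in parallel rather than one after another.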
[Figure: n-bit adder with bit inputs (A0, B0) to (An-1, Bn-1), per-stage carry inputs Ci,0 to Ci,n-1, propagate signals P0 to Pn-1, and sum outputs S0 to Sn-1]
The carry-propagation process is decomposed into subgroups of two bits. $G_{i:j}$ and $P_{i:j}$ denote the generate and propagate functions, respectively, for a group of bits (from bit position i to j); we therefore call them block generate and block propagate signals. $G_{i:j}$ equals 1 if the group generates a carry, independent of the incoming carry. The block propagate $P_{i:j}$ is true if an incoming carry propagates through the complete group. This condition is equivalent to the carry bypass. A further generalization is possible by treating the generate and propagate functions as a pair $(G_{i:j}, P_{i:j})$ rather than as separate functions. A new Boolean operator, called the dot operator (·), can be introduced. This operator works on pairs and allows for the combination and manipulation of blocks of bits:

$$(G, P)\cdot(G', P') = (G + P G',\; P P') \qquad (3.11)$$

Using this operator, we can now decompose, for example,

$$(G_{3:2}, P_{3:2}) = (G_3, P_3)\cdot(G_2, P_2)$$

The dot operator obeys the associative property, but it is not commutative.
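The two algebraic properties claimed for the dot operator can be verified by brute force over all Boolean pairs. The sketch below implements the operator as (G, P)·(G', P') = (G + P·G', P·P'), with the more significant block as the left operand; the function and variable names are illustrative.

```python
from itertools import product

def dot(hi, lo):
    """Combine a more significant (G, P) block with a less significant one."""
    g_hi, p_hi = hi
    g_lo, p_lo = lo
    return (g_hi | (p_hi & g_lo), p_hi & p_lo)

# Every possible (G, P) pair of Boolean values.
PAIRS = list(product((0, 1), repeat=2))

associative = all(
    dot(dot(x, y), z) == dot(x, dot(y, z))
    for x in PAIRS for y in PAIRS for z in PAIRS
)
commutative = all(dot(x, y) == dot(y, x) for x in PAIRS for y in PAIRS)

if __name__ == "__main__":
    print("associative:", associative)
    print("commutative:", commutative)
    # A counterexample to commutativity: a generating block placed above
    # or below a block that neither generates nor propagates.
    print(dot((1, 0), (0, 0)), "vs", dot((0, 0), (1, 0)))
```

Associativity is what allows the blocks to be grouped in any tree shape, while the lack of commutativity means the bit order of the operands must be preserved.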
By exploiting the associative property of the dot operator, a tree can be constructed that effectively computes the carries at all $2^i - 1$ positions (that is, 1, 3, 7, 15, 31, etc.) for $i = 1, \ldots, \log_2 N$. The crucial advantage is that the computation of the carry at position $2^i - 1$ takes only $\log_2 N$ time. This is an improvement over the previously described adders. For example, for a 32-bit adder, the propagation delay of a linear adder is proportional to 32. For a square-root select adder, the factor is reduced to about 6 ($\sqrt{32} \approx 5.7$), while for a logarithmic adder the proportionality constant is 5 ($\log_2 32$).

This structure, which is frequently referred to as a Kogge-Stone tree, is a member of the radix-2 class of trees. Radix-2 means that the tree is binary: it combines two carry words at a time at each level of the hierarchy. The complete 16-bit adder requires 49 complex logic gates, each implementing the dot operator. In addition, 16 logic modules are needed for the generation of the propagate and generate signals at the first level (Pi and Gi), as well as 16 sum-generation gates.
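The gate count quoted for the radix-2 tree can be reproduced with a short behavioral sketch of a Kogge-Stone style prefix computation. It assumes a carry-in of 0 and the standard bit-level definitions G(i) = A(i)·B(i), P(i) = A(i) XOR B(i); the function names are illustrative, and the code models the logic only, not the circuit.

```python
def dot(hi, lo):
    """Dot operator: (G, P).(G', P') = (G + P G', P P')."""
    return (hi[0] | (hi[1] & lo[0]), hi[1] & lo[1])

def kogge_stone_carries(a_bits, b_bits):
    """Return (carries, dot_count) for bit lists given LSB first.

    carries[i] is the carry out of bit position i, assuming carry-in = 0.
    """
    n = len(a_bits)
    gp = [(a & b, a ^ b) for a, b in zip(a_bits, b_bits)]
    dot_count = 0
    span = 1
    while span < n:
        nxt = list(gp)
        for i in range(span, n):
            # Merge the block ending at i with the block ending at i - span.
            nxt[i] = dot(gp[i], gp[i - span])
            dot_count += 1
        gp = nxt
        span *= 2   # the covered block width doubles at every level
    return [g for g, _ in gp], dot_count

if __name__ == "__main__":
    a = [(0xFFFF >> i) & 1 for i in range(16)]
    b = [(0x0001 >> i) & 1 for i in range(16)]
    carries, dots = kogge_stone_carries(a, b)
    print("carries:", carries)
    print("dot operators used for 16 bits:", dots)   # 15 + 14 + 12 + 8 = 49
```

For 16 bits the four levels use 15, 14, 12, and 8 dot operators, which adds up to the 49 gates mentioned in the text.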
Designers sometimes trade off some delay for area and power by choosing less complex trees. A simpler tree structure computes only the carries at the power-of-two bit positions [brent82], that is, the signals at positions $2^i - 1$ only, as shown in the figure.
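$$(C_{o,0}, 0) = (G_0, P_0)\cdot(C_{i,0}, 0) \qquad (3.12)$$

$$(C_{o,1}, 0) = [(G_1, P_1)\cdot(G_0, P_0)]\cdot(C_{i,0}, 0) = (G_{1:0}, P_{1:0})\cdot(C_{i,0}, 0) \qquad (3.13)$$

$$(C_{o,3}, 0) = [(G_{3:2}, P_{3:2})\cdot(G_{1:0}, P_{1:0})]\cdot(C_{i,0}, 0) = (G_{3:0}, P_{3:0})\cdot(C_{i,0}, 0) \qquad (3.14)$$

$$(C_{o,7}, 0) = [(G_{7:4}, P_{7:4})\cdot(G_{3:0}, P_{3:0})]\cdot(C_{i,0}, 0) = (G_{7:0}, P_{7:0})\cdot(C_{i,0}, 0) \qquad (3.15)$$

$$(C_{o,11}, 0) = [(G_{11:8}, P_{11:8})\cdot(G_{7:0}, P_{7:0})]\cdot(C_{i,0}, 0) = (G_{11:0}, P_{11:0})\cdot(C_{i,0}, 0) \qquad (3.16)$$

$$(C_{o,15}, 0) = [(G_{15:12}, P_{15:12})\cdot(G_{11:0}, P_{11:0})]\cdot(C_{i,0}, 0) = (G_{15:0}, P_{15:0})\cdot(C_{i,0}, 0) \qquad (3.17)$$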
An option to reduce the depth of the tree is to combine four signals at a time at each level of the hierarchy. The resulting tree is of class radix-4, because it uses building blocks of order 4. As shown in the figure, a 16-bit addition then needs only two stages of carry logic. Be aware that each gate is more complex and that having fewer stages may not always result in faster operation.
A look-ahead adder is several times larger than a ripple adder, but it has dramatic speed advantages for large operands. Its logarithmic behavior makes it preferable to bypass or select adders for large values of N. The exact value of the crossover point depends heavily on technology and circuit design factors.
RESULTS:
The 16-transistor full adder is operated at a supply voltage of 3 V. Input A is the sequence 0000111100 (period 80 ns), B is the sequence 0011001100 (period 40 ns), and Cin is the sequence 0101010101 (period 20 ns), with one bit every 10 ns. The SUM output is 0110100101 from 0 ns to 100 ns and the CARRY output is 0001011100. In Figure 5.1 the top two waveforms represent SUM and CARRY.

The 14-transistor full adder is operated at the same supply voltage of 3 V with the same A, B, and Cin sequences. The SUM output is again 0110100101 from 0 ns to 100 ns and the CARRY output is 0001011100, shown as the top two waveforms.

Fig. 5.2 14-transistor full adder

Fig. 5.3 10-transistor full adder

The 8-transistor full adder is operated at a supply voltage of 3 V with the same A, B, and Cin sequences. The SUM output is 0110100101 from 0 ns to 100 ns and the CARRY output is 0001011100, shown as the top two waveforms.
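The reported output sequences can be cross-checked against the ideal full adder function. The sketch below applies the stimulus bit strings quoted above, one bit per 10 ns slot, and computes the expected SUM and CARRY; it checks logical correctness only and says nothing about signal levels or timing. The function name is illustrative.

```python
A   = "0000111100"   # input A, one bit per 10 ns slot
B   = "0011001100"   # input B
CIN = "0101010101"   # input Cin

def full_adder_stream(a, b, cin):
    """Apply an ideal full adder to three equal-length bit strings."""
    sum_bits, carry_bits = [], []
    for x, y, z in zip(a, b, cin):
        total = int(x) + int(y) + int(z)
        sum_bits.append(str(total & 1))     # low bit of the total is SUM
        carry_bits.append(str(total >> 1))  # high bit of the total is CARRY
    return "".join(sum_bits), "".join(carry_bits)

if __name__ == "__main__":
    sum_out, carry_out = full_adder_stream(A, B, CIN)
    print("SUM  :", sum_out)     # 0110100101
    print("CARRY:", carry_out)   # 0001011100
```

Both strings match the SUM and CARRY sequences reported for the simulated cells.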
Delay of 16 Transistors
Fig. 5.5 below shows the delay of the 16-transistor adder. The two lines in the middle mark the 50% point of the input voltage and the 50% point of the output voltage, between which the delay is measured.

Delay of 16-transistor full adder

The delay of the 16-transistor adder is 158.42 ps. The average power consumed by the 16-transistor adder is 5.933604e-006 W.
Delay of 14Transistor
Fig. 5.6 below shows the delay of the 14-transistor adder. The two lines in the middle mark the 50% point of the input voltage and the 50% point of the output voltage, between which the delay is measured.
Delay of 10Transistor
Fig. 5.7 below shows the delay of the 10-transistor adder. The two lines in the middle mark the 50% point of the input voltage and the 50% point of the output voltage, between which the delay is measured.

Delay of 10-transistor full adder

The delay of the 10-transistor adder is 196.28 ps. The average power consumed by the 10-transistor adder is 2.472365e-007 W.
Delay of 8 Transistor
Fig. 5.8 below shows the delay of the 8-transistor adder. The two lines in the middle mark the 50% point of the input voltage and the 50% point of the output voltage, between which the delay is measured.

The delay of the 8-transistor adder is 195.48 ps. The average power consumed by the 8-transistor adder is 3.854102e-007 W.
The power consumption and delay of the different full adder cells were measured in a 250 nm CMOS process at a supply voltage of 3.0 V. From the analysis, the lowest delay is observed for the 14-transistor cell and the lowest power consumption for the 10-transistor cell.
8 Transistors
Power Results:
vdd from time 1e-009 to 1e-007
Average power consumed -> 9.605227e-003 watts
Max power 2.300812e-002 at time 7.0001e-008
Min power 6.478764e-012 at time 1e-009
10 Transistor : Waveform
Power Results:
vdd from time 1e-009 to 1e-007
Average power consumed -> 4.006624e-008 watts
Max power 5.492911e-003 at time 8.00005e-008
Min power 8.088337e-012 at time 5.111e-009
14 Transistor:
Waveform:
Power Results
vdd from time 1e-009 to 1e-007
Average power consumed -> 1.121040e-004 watts
Max power 1.117429e-002 at time 8.00005e-008
Min power 1.233788e-007 at time 1.50802e-008

Full Adder    8T               10T              14T              16T
Power (W)     9.605227e-003    4.006624e-008    1.121040e-004    4.83e-005
Delay (ps)    163.12           180.2            120.23           140.06
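The summary values can be ranked with a few lines of code. The sketch below uses the power and delay figures from the summary table as given and additionally computes a power-delay product (PDP); the PDP is a derived figure of merit that is not reported in this work and is included only as an illustration. The variable and function names are illustrative.

```python
# (average power in watts, delay in picoseconds), taken from the summary table
CELLS = {
    "8T":  (9.605227e-3, 163.12),
    "10T": (4.006624e-8, 180.2),
    "14T": (1.121040e-4, 120.23),
    "16T": (4.83e-5,     140.06),
}

def power_delay_product(power_w, delay_ps):
    """Power-delay product in joules (delay converted from ps to s)."""
    return power_w * delay_ps * 1e-12

lowest_power = min(CELLS, key=lambda name: CELLS[name][0])
lowest_delay = min(CELLS, key=lambda name: CELLS[name][1])
pdp = {name: power_delay_product(p, d) for name, (p, d) in CELLS.items()}
lowest_pdp = min(pdp, key=pdp.get)

if __name__ == "__main__":
    print("lowest power:", lowest_power)
    print("lowest delay:", lowest_delay)
    for name, value in pdp.items():
        print(f"{name}: PDP = {value:.3e} J")
    print("lowest PDP:", lowest_pdp)
```

With the tabulated values, the 10-transistor cell has the lowest power and the 14-transistor cell the lowest delay, in line with the observation stated earlier in this section.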
Conclusion:
In this project, 1-bit full adder circuits are studied using various CMOS circuit styles and designs, and are compared for power, speed, and area. The circuits designed and compared are the 16-transistor, 14-transistor, 10-transistor, and 8-transistor full adders, using Tanner tools in 250 nm and 45 nm technologies. The main problem with low-power full adders is threshold voltage loss. It is observed that performance is degraded in the case of the 45 nm technology.