5.1 Advantages of CMOS Over nMOS
5.2 CMOS Technologies
5.2.1 CMOS/SOI Technology
5.2.1.1 The CMOS/SOS Technology
5.2.2 CMOS/bulk Technology
5.2.2.1 p-well CMOS/Bulk process
5.2.2.2 n-well CMOS/Bulk process
5.2.2.3 Twin-tub CMOS/Bulk process
5.2.3 Latch-up in Bulk CMOS
5.2.3.1 Parasitic SCR structure
5.3 Static CMOS Design
5.4 Domino CMOS Structures
5.4.1 Domino CMOS logic examples
5.4.2 Cascaded domino CMOS logic gates
5.5 Charge Sharing
5.5.1 Solutions for charge sharing
5.6 Clocking
5.6.1 Clock generation
5.6.2 Clock distribution
5.6.3 Clocked storage elements
Figure 5.2: SOI process
The SOI CMOS process is considerably more costly than the standard p-well and n-well CMOS processes. Yet the improved device performance and the absence of latch-up problems can justify its use, especially for deep-submicron devices.
5.2.1.2 The CMOS/SOS Technology
Silicon-on-sapphire (SOS) is the highest-performance SOI technology today. In this approach, silicon is grown on a sapphire substrate, and islands are formed by implant or diffusion. The n-channel and p-channel transistors are built on the islands, as shown in figure 5.3. The significant reduction in parasitic capacitance gives high performance, and high gate density is also achieved.
Figure 5.3: SOS process
Sapphire (Al2O3) is a good insulator, and the lattice constants of silicon and sapphire match well. When sapphire is used as the substrate, the epitaxial growth of silicon yields monocrystalline material. Sapphire is not affected by radiation as bulk silicon is, which makes it a preferred material for military applications that require radiation-hardened devices.
Disadvantages are:
- Manufacturing difficulty.
- High cost of sapphire wafers.
- Not competitive in high-volume, low-cost markets.
5.2.2 CMOS/bulk Technology
The CMOS/Bulk technologies are classified as follows:
a. p-well CMOS/Bulk process
b. n-well CMOS/Bulk process
c. twin-tub CMOS/Bulk process
5.2.2.1 p-well CMOS/Bulk process
The p-well CMOS/bulk process uses p-type diffusion into an n-type bulk silicon substrate to form a p-well for the n-channel transistors. The p-channel transistors are built directly into the n-substrate, as shown in figure 5.4.
Figure 5.4: p-well process
5.2.2.2 n-well CMOS/Bulk process
The n-well CMOS/bulk process uses n-type diffusion into a p-type bulk silicon substrate to form an n-well for the p-channel transistors. The n-channel devices are built directly into the bulk p-substrate, as shown in figure 5.5; hence the nMOS devices perform better than the pMOS devices. This process provides faster circuits than the p-well CMOS process.
Figure 5.5: n-well process
Both the p-well and n-well processes need well contacts and leave a minimum spacing (dead space) between the edges of their wells.
5.2.2.3 Twin-tub CMOS/Bulk process
The starting material is an n+ or p+ substrate with a lightly doped epitaxial layer on top. This epitaxial layer provides the actual substrate on which the n-well and the p-well are formed. Two independent doping steps are performed to create the well regions, so the dopant concentrations can be carefully optimized to produce the desired device characteristics. In the p-well and n-well CMOS processes, the doping density of the well region is higher than that of the substrate, which, among other effects, results in unbalanced drain parasitics. The twin-tub process avoids this problem, at the price of being costlier and more complex. The twin-tub process combines the n-well and p-well technologies, as shown in figure 5.6.
Figure 5.6: twin-well process
The twin-tub process has the highest overall performance compared to the n-well and p-well processes; it gives the designer full freedom to optimize the performance of both the n-channel and p-channel devices. This technology provides the basis for separate optimization of the nMOS and pMOS transistors, making it possible to tune the threshold voltage, body effect, and channel transconductance of both types of transistors independently.
5.2.3 Latch-up in Bulk CMOS
CMOS devices have parasitic bipolar transistors which can cause latch-up. Latch-up is a condition in which a high current flows between VDD and GND. In latch-up, the collector of each parasitic BJT feeds the base of the other parasitic BJT in a positive-feedback configuration, forming a silicon-controlled rectifier (SCR). CMOS ICs therefore contain parasitic SCRs. When the chip is powered up, these SCRs can turn on, creating a low-resistance path from power to ground. Latch-up can cause malfunctioning and can even destroy devices. Latch-up is terminated when power to the SCR is interrupted. Latch-up can occur in both p-well and n-well CMOS processes. Causes of latch-up are internal transient currents or voltages during power-up, external glitches on I/O pads, and external radiation. The triggering mechanisms for latch-up are current injected into the npn emitter, current injected into the pnp emitter, and drastic current or voltage changes on any node.
5.2.3.1 Parasitic SCR structure
Parasitic bipolar transistors (npn and pnp) exist in a CMOS structure, as shown in figure 5.7. The well and the substrate have resistances Rw and Rs respectively.
Figure 5.7: Parasitic SCR structure
Latch-up Prevention
Two basic concepts (for reducing loop gain):
- Reduce Rwell and Rsubstrate.
- Weaken the parasitic npn and pnp transistors (i.e., reduce Ic1 and Ic2) by decreasing their current gains.
Two basic ways:
- Latch-up resistant CMOS process
- Layout techniques
Internal latch-up prevention techniques:
- Every well must have a substrate contact of the appropriate type.
- Every substrate contact should be connected by metal directly to a supply pad (i.e., no diffusion or polysilicon underpasses in the supply rails).
- Use guard rings around the p- and/or n-wells, and make frequent contacts to the rings.
- Place substrate contacts as close as possible to the source connection of transistors connected to the supply rails (i.e., Vss for n-devices, Vdd for p-devices). This reduces the values of Rsubstrate and Rwell. A very conservative rule is to place one substrate contact for every supply (Vss or Vdd) connection. A less conservative rule is to place a substrate contact for every 5-10 transistors or every 25-100 µm.
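As a rough illustration of why lowering Rwell and Rsubstrate helps, the sketch below uses a first-order model that is an assumption of this example rather than part of the text: a parasitic transistor starts to turn on once the current through its shunt resistance drops about 0.7 V across it, and the SCR can hold itself on only if the product of the two current gains is at least 1. The function names, the 0.7 V figure, and the sample values are all illustrative.

```python
# Simplified first-order latch-up model (illustrative assumptions only).
VBE_ON = 0.7  # assumed base-emitter turn-on voltage, in volts


def trigger_current(resistance_ohms, vbe_on=VBE_ON):
    """Current through Rwell or Rsubstrate needed to drop vbe_on across it."""
    return vbe_on / resistance_ohms


def can_sustain_latchup(beta_npn, beta_pnp):
    """Idealized loop-gain test; ignores the shunting effect of Rw and Rs."""
    return beta_npn * beta_pnp >= 1.0


if __name__ == "__main__":
    # A smaller shunt resistance needs a larger injected current to trigger.
    for r in (2000.0, 200.0):
        i_ma = trigger_current(r) * 1e3
        print(f"R = {r:6.0f} ohm -> trigger current = {i_ma:.2f} mA")
    # Lower current gains push the loop gain below 1.
    print(can_sustain_latchup(50.0, 0.1))
    print(can_sustain_latchup(5.0, 0.1))
```

In this model, a tenfold reduction of the shunt resistance raises the trigger current tenfold, which is the effect that frequent substrate and well contacts aim for.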
Figure 5.8: Domino CMOS gate
During the precharge phase (when φ = 0), the output node of the dynamic CMOS stage is precharged to a high level, and the output of the CMOS inverter becomes low. During the evaluation phase (when φ = 1), there are two possibilities:
- The output node is discharged to a low level through the nMOS circuitry, or
- It remains high.
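The precharge and evaluation behaviour described above can be sketched as a small logic-level model. This is a hedged illustration, not a circuit simulation: the names `domino_stage` and `and2`, the use of `phi` for the clock, and the two-input series pull-down network are assumptions made for the example.

```python
def domino_stage(phi, pulldown_conducts, node_prev):
    """Return (dynamic_node, output) for one clock phase of a domino stage.

    phi: 0 = precharge, 1 = evaluate.
    pulldown_conducts: True if the nMOS network forms a path to ground.
    node_prev: value stored on the dynamic node before this phase.
    """
    if phi == 0:
        node = 1               # pMOS precharge transistor pulls the node high
    elif pulldown_conducts:
        node = 0               # node discharges through the nMOS network
    else:
        node = node_prev       # node keeps its stored charge
    return node, 1 - node      # the static inverter drives the gate output


def and2(a, b):
    """Pull-down condition of two series nMOS transistors (hypothetical gate)."""
    return bool(a and b)


if __name__ == "__main__":
    node = 0
    for phi, a, b in [(0, 1, 1), (1, 1, 1), (0, 1, 0), (1, 1, 0)]:
        node, out = domino_stage(phi, and2(a, b), node)
        print(f"phi={phi} a={a} b={b} -> node={node} out={out}")
```

Because the output is low after every precharge, it can only make a 0 to 1 transition (or stay low) during evaluation in this model.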
5.4.1 Domino CMOS logic examples
Domino CMOS logic examples are given in figure 5.9. In each, a dynamic CMOS logic gate stage is cascaded with a static CMOS inverter stage.
5.4.2 Cascaded domino CMOS logic gates
Cascaded domino CMOS logic stages are shown in figure 5.10.
Cascading domino CMOS logic gates with static CMOS logic gates is shown in figure 5.11.
Figure 5.11: Cascading domino CMOS logic gates with static CMOS logic gates
Dynamic domino circuits are fast, draw no quiescent power, and produce no glitches on the output, but they require a reasonable clock rate. Their limitations are:
- The number of inverting static logic stages in cascade must be even, so that the inputs of the next domino CMOS stage experience only 0 to 1 transitions during evaluation.
- Only non-inverting structures can be implemented.
- They have potential charge-sharing problems.
Figure 5.12: Charge sharing
5.5.1 Solutions for charge sharing
A weak pMOS device (with a small W/L ratio) added at the output of the dynamic CMOS stage compensates for charge loss due to charge sharing and to leakage at low clock frequencies, as shown in figure 5.13 (a). Since this weak pMOS device is always on, the static power dissipation increases. Alternatively, a weak pMOS pull-up device can be placed in a feedback loop to prevent the loss of output voltage level due to charge sharing, as shown in figure 5.13 (b). In this case the weak pMOS device conducts only when the output of the static gate is low, i.e., while the precharge node voltage is high.
Another possible solution for charge sharing is to use separate pMOS transistors to precharge all intermediate nodes of the nMOS network high, as shown in figure 5.14.
Figure 5.14: Precharge-high all intermediate nodes of nMOS transistors
Another solution is graded sizing of the nMOS transistors in series structures, where the nMOS transistor closest to the output node has the smallest (W/L) ratio and the nMOS transistor closest to ground has the largest (W/L) ratio.
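To see how much voltage charge sharing can cost, and why precharging the intermediate nodes helps, the following sketch applies charge conservation to two capacitors that become connected during evaluation. The capacitor values, the 5 V supply, and the function name `shared_voltage` are assumptions for illustration; the model ignores leakage, threshold drops, and any keeper device.

```python
def shared_voltage(c_out, v_out, c_int, v_int):
    """Final voltage after the output and internal capacitors are connected.

    Charge conservation: (c_out + c_int) * v_final = c_out*v_out + c_int*v_int
    """
    return (c_out * v_out + c_int * v_int) / (c_out + c_int)


if __name__ == "__main__":
    vdd = 5.0          # assumed supply voltage, volts
    c_out = 50e-15     # assumed precharged output node capacitance, farads
    c_int = 25e-15     # assumed intermediate node capacitance, farads

    # Intermediate node initially discharged: the output node loses charge.
    v_drop = shared_voltage(c_out, vdd, c_int, 0.0)
    # Intermediate node precharged high: no redistribution, no drop.
    v_keep = shared_voltage(c_out, vdd, c_int, vdd)

    print(f"internal node at 0 V  -> output settles at {v_drop:.2f} V")
    print(f"internal node at VDD  -> output settles at {v_keep:.2f} V")
```

With these assumed values the output falls to two thirds of the supply when the internal node starts discharged, which may be low enough to disturb the following inverter; with the internal node precharged, the output stays at the supply level.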
5.6 CLOCKING
Synchronous systems use a clock to keep operations in sequence. The clock distinguishes the current operation from the previous and next ones, and it determines the speed at which the machine operates. The clock must be distributed to all sequencing elements, such as flip-flops and latches, and also to other clocked elements such as domino circuits and memories.
There are three requirements of the clocking system:
- Signals must occur at the correct time.
- The clock must be able to drive the fan-out.
- Rise and fall times of the clock pulses must be as short as possible.
Long transition times not only slow the circuit but also increase power consumption. Clocks must be laid out such that the delays from the source of each clock to the clocked bistable elements are identical. Clock signals switch between VDD and GND. Two-phase, non-overlapping clocking has no timing errors due to races or hazards.
Clock Skew
- Absolute clock skew: the difference in arrival time of the edge of a clock phase at a destination in the circuit, with respect to the clock edge at the source of the clock signal.
- Relative clock skew: the difference in local clock lag between two points in the circuit.
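The two skew definitions can be made concrete with a few lines of arithmetic. In this sketch the destination names and arrival times (in nanoseconds) are invented for illustration, and the clock edge at the source is assumed to occur at time 0.

```python
# Hypothetical arrival times of one clock edge at three destinations (ns).
ARRIVALS_NS = {"ff_a": 0.30, "ff_b": 0.45, "ff_c": 0.38}


def absolute_skew(arrivals, dest, source_time=0.0):
    """Lag of the clock edge at dest with respect to the edge at the source."""
    return arrivals[dest] - source_time


def relative_skew(arrivals, dest_1, dest_2):
    """Difference in local clock lag between two destinations."""
    return arrivals[dest_1] - arrivals[dest_2]


def worst_relative_skew(arrivals):
    """Largest relative skew between any two destinations."""
    return max(arrivals.values()) - min(arrivals.values())


if __name__ == "__main__":
    print(f"absolute skew at ff_b: {absolute_skew(ARRIVALS_NS, 'ff_b'):.2f} ns")
    print(f"relative skew ff_b vs ff_a: "
          f"{relative_skew(ARRIVALS_NS, 'ff_b', 'ff_a'):.2f} ns")
    print(f"worst relative skew: {worst_relative_skew(ARRIVALS_NS):.2f} ns")
```

Note that two destinations can both have a large absolute skew and still have a small relative skew, as long as their clock paths are delayed by similar amounts.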
The clock skew for rising and falling clock edges need not be the same. Careful layout design is required to avoid skew problems. Set-up time, hold time, and minimum pulse width are very important for the clocks. Clock delays can be treated like any bus delay problem: the fastest clocking is obtained by using suitable super-buffers (clock drivers) to drive the clock bus, or by scaling the clock-driver loads by a factor of 2.7. The bus must be kept as short as possible and routed in metal as much as possible.
5.6.1 Clock Generation
All clock signals can be derived from a system clock signal, which is a square wave. Multiphase clocks can be generated from a single square-wave input with two toggle flip-flops and two AND gates, as shown in figure 5.15.
Figure 5.15: Generation of two-phase clocking from a primary clock
Another way of generating the clocks is shown in figure 5.16.
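As a logic-level illustration of deriving two non-overlapping phases from one square wave, the sketch below uses a simplified scheme that is an assumption of this example and does not reproduce the circuits of figures 5.15 or 5.16: a single toggle flip-flop divides the primary clock by two, and two AND gates steer alternate clock pulses to phase 1 and phase 2. The function name `two_phase` is hypothetical.

```python
def two_phase(clk_samples):
    """Derive two non-overlapping phases from sampled primary clock values.

    A toggle flip-flop (state q) changes on each falling clock edge; the
    AND gates form phi1 = clk AND (NOT q) and phi2 = clk AND q.
    """
    q = 0
    prev = 0
    phi1, phi2 = [], []
    for clk in clk_samples:
        if prev == 1 and clk == 0:   # falling edge toggles the flip-flop
            q ^= 1
        phi1.append(clk & (q ^ 1))
        phi2.append(clk & q)
        prev = clk
    return phi1, phi2


if __name__ == "__main__":
    clk = [1, 0, 1, 0, 1, 0, 1, 0]
    phi1, phi2 = two_phase(clk)
    print("clk :", clk)
    print("phi1:", phi1)
    print("phi2:", phi2)
```

Toggling on the falling edge means the flip-flop state only changes while the clock is low, so in this idealized model the two phases are never high at the same time.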
5.6.2 Clock distribution
On a small chip, the clock distribution network is just a wire and possibly an inverter for clkb. On practical chips, the RC delay due to the wire resistance and the gate load is very long. Variations in this delay cause the clock to reach different elements at different times; this is called clock skew. Clock skew can be minimized by placing all gates of a tree on the same chip. Most chips use repeaters to buffer the clock and equalize the delay, which reduces skew, as shown in figure 5.17. The physical layout of the clock network must conform to design rules that ensure the integrity of the clock signal by minimizing electrical coupling, switching currents, and impedance mismatches. Equalizing path delays also helps to reduce the skew.
The clock signals can cross under power lines using diffusion as shown in figure 5.18.
To reduce the clock skew, a clock distribution network is required, which in turn requires plenty of metal wiring resources. Local clock gaters receive the global clock and produce the physical clocks required by the clocked elements. Clock gaters are often used to stop, or gate, the clock to unused blocks of logic to save power. Different clock gaters are:
- Enabled or gated clock
- Stretched clocks
- Non-overlapping clocks
- Complementary clocks
- Delayed, pulsed clocks
- Clock doubler
- Clock buffer
Some of the clock gaters are as shown in figure 5.19, with output waveforms.
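The enabled (gated) clock is the simplest gater to model. The sketch below is a hedged logic-level illustration and is not taken from figure 5.19: it assumes one common arrangement in which the enable is captured by a latch that is transparent while the clock is low and is then ANDed with the clock, so that the enable cannot change the output in the middle of a high pulse. The function name `gated_clock` is hypothetical.

```python
def gated_clock(clk_samples, enable_samples):
    """Return the gated clock for sampled clock and enable values.

    The enable passes through a latch that is transparent only while the
    clock is low; the gated clock is clk AND latched_enable.
    """
    latched = 0
    out = []
    for clk, en in zip(clk_samples, enable_samples):
        if clk == 0:
            latched = en          # latch follows enable while clock is low
        out.append(clk & latched)
    return out


if __name__ == "__main__":
    clk = [0, 1, 0, 1, 0, 1, 0, 1]
    en = [1, 1, 0, 1, 0, 0, 1, 0]
    print("clk  :", clk)
    print("en   :", en)
    print("gated:", gated_clock(clk, en))
```

In the sample run, the enable value seen during each low half-cycle decides whether the next clock pulse is passed or suppressed, which is how an unused block can be kept idle to save power.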
5.6.3 Clocked storage elements
A two-phase clocking scheme with combinational logic inserted between every pair of registers yields a simple pipelined structure. A feedback path added around a cascade of two combinational-logic blocks is shown in figure 5.20.