Second International Conference on Emerging Trends in Engineering and Technology, ICETET-09
Design and Analysis of a New Loadless 4T SRAM Cell in Deep Submicron CMOS Technologies
Sandeep R¹, Narayan T Deshpande²
Dept of ECE, BMSCE, Bangalore-560019, India
¹sandeepr90@gmail.com, ²ntd.bms@gmail.com

A R Aswatha
Dept of ECE, DSCE, Bangalore-560078, India
aswath.ar@gmail.com

Abstract - The goal of this paper is to reduce the power and area of the Static Random Access Memory (SRAM) array while maintaining competitive performance. Various configurations of SRAM arrays are designed using both the six-transistor (6T) SRAM cell and a new loadless four-transistor (4T) SRAM cell in deep submicron (130nm, 90nm and 65nm) CMOS technologies. The arrays are then simulated using HSPICE to check functionality, Static Noise Margin (SNM), power dissipation, area occupancy and access time. Except for the precharge circuits and the basic storage cells, the circuitry is the same for the 6T SRAM array and the new loadless 4T SRAM array. Compared to the conventional 6T SRAM array, the new loadless 4T SRAM array consumes less power and occupies less area in deep submicron CMOS technologies. The SNM of the new loadless 4T SRAM cell is as good as that of the 6T SRAM cell for higher values of Cell Ratio (CR).

Keywords - 6T SRAM cell, new loadless 4T SRAM cell, SNM, low power, low area.

I. INTRODUCTION

The on-chip caches in embedded microprocessors are implemented using arrays of densely packed Static Random Access Memory (SRAM) cells [1]. The number of transistors devoted to the on-chip caches is often a significant fraction of the total number of transistors on the chip. According to the International Technology Roadmap for Semiconductors (ITRS)-2005, SRAM is going to occupy more than 60% of System-on-Chips (SoCs) in the future. Reducing the number of transistors in the basic cell therefore reduces the number of transistors in the SRAM array, and with it the area occupied by the array. Moreover, the caches consume a large fraction of the total power in embedded microprocessors. Improving the power efficiency of caches is therefore critical to the overall system power efficiency [2]. A few critical circuits in a system not only affect the design metrics but may also fail to operate in deep submicron technology. Hence the SRAM arrays are designed, analysed and checked for their design metrics in deep submicron CMOS technologies.

Two types of SRAM cells are considered in this paper: (i) the conventional six-transistor (6T) SRAM cell, as shown in Figure 1(a), and (ii) the new loadless four-transistor (4T) SRAM cell [3], as shown in Figure 1(b). They are designed and analysed in various configurations with respect to functionality, power dissipation, area occupancy, stability and access time.

Figure 1. SRAM Cell. (a) Conventional 6T SRAM Cell. (b) New Loadless 4T SRAM Cell.

A 6T SRAM cell consists of two cross-coupled inverters (M1-M3 and M2-M4) forming a latch, and the access transistors (M5 and M6). In the new loadless 4T SRAM cell, two NMOS transistors (M3 and M4) are used as pass transistors to access the cell and two PMOS transistors (M1 and M2) are used as drivers for the cell. An SRAM cell must be designed such that it provides a non-destructive read operation and a reliable write operation. The working of the new loadless 4T SRAM cell is described in [3] and that of the conventional 6T SRAM cell in [4-9].

This paper is organized as follows. Section II deals with the Static Noise Margin (SNM) of both SRAM cells. The precharge circuits for both SRAM arrays are presented in section III. The Sense Amplifier (SA) for the SRAM arrays is presented in section IV. The decoder and the write driver circuits are presented in section V. The simulation environment and the results are discussed in section VI. Finally, the conclusion is given in section VII.
978-0-7695-3884-6/09 $26.00 2009 IEEE
II. SNM OF SRAM CELLS

Data stability has been a prominent topic in SRAM cell design, as it describes the ability of the cell to retain its data. SNM is the metric used in this paper to characterize the stability of the SRAM cells. The SNM is defined as the minimum dc noise voltage necessary to flip the state of an SRAM cell [3, 10]. An SRAM cell is most vulnerable during a read, and hence the read SNM is given more importance than the write SNM. The same method has been used to evaluate the SNM of both types of SRAM cells [10].

SNM is typically measured while holding the bitlines at the precharge value with the Word Line (WL) asserted. The schematic used for the SNM simulation of the 6T SRAM cell is shown in Figure 2(a), and that of the new loadless 4T SRAM cell in Figure 2(b). The noise sources in the simulation were swept from 0 to 500mV in 200ns, which can be considered slow. The two storage nodes initially remain stable, but as the noise increases, the margin between the nodes diminishes. At some point the storage nodes flip and the cell settles in the new stable state. The noise voltage at which the storage nodes flip gives the value of SNM. The results from these simulations are presented in section VI.
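The flip-point criterion described above can be illustrated with a short post-processing sketch. This is not the authors' HSPICE setup: the function names and the sample waveform are hypothetical, and the sketch assumes that the noise sources ramp linearly from 0 to 500mV over 200ns, so that the time at which the storage nodes swap polarity maps directly to a noise voltage.

```python
def noise_at(t_ns, v_max_mv=500.0, t_sweep_ns=200.0):
    """Noise-source voltage in mV at time t_ns, assuming a linear 0..v_max ramp."""
    t = min(max(t_ns, 0.0), t_sweep_ns)  # clamp to the sweep window
    return v_max_mv * t / t_sweep_ns


def snm_from_transient(times_ns, q, qb):
    """Return the noise voltage (mV) at the first sample where the storage
    nodes q and qb have swapped polarity, or None if the cell never flips."""
    initially_high = q[0] > qb[0]
    for t, v_q, v_qb in zip(times_ns, q, qb):
        if (v_q > v_qb) != initially_high:
            return noise_at(t)
    return None


# Synthetic waveform (illustrative only, not simulation data from the paper).
times_ns = [0, 40, 80, 120, 160, 200]
q = [1.2, 1.2, 1.1, 0.9, 0.1, 0.0]
qb = [0.0, 0.0, 0.1, 0.3, 1.1, 1.2]
snm_mv = snm_from_transient(times_ns, q, qb)
print(snm_mv)
```

The resolution of such an estimate is limited by the sample spacing; a finer time step in the transient analysis would be needed to resolve the SNM to the 10mV granularity reported in the tables.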
Figure 2. SNM Simulation Setup. (a) For Conventional 6T SRAM Cell. (b) For New Loadless 4T SRAM Cell.

III. PRECHARGE CIRCUITS

The precharge circuit used for the new loadless 4T SRAM array is different from that of the 6T SRAM array. The function of the precharge circuit in the 6T SRAM array is to charge the Bit Line (BL) and Bit Line Bar (BLB) to VDD. In the new loadless 4T SRAM array the bitlines are precharged to ground instead of VDD, and the array therefore consumes less power than the 6T SRAM array. The schematic of the precharge circuit [4] for the 6T SRAM array is shown in Figure 3(a) and that of the new loadless 4T SRAM array in Figure 3(b). The Precharge (PC) signal enables the bitlines to be precharged at all times except during the write and read cycles. The transistors M1 and M2 precharge the bitlines, while the transistor M3 equalizes them to ensure that both bitlines within a pair are at the same potential before the cell is read. The same circuit topology is used for local precharge in combination with the SA in the corresponding SRAM arrays (refer to section IV for more details).

Figure 3. Precharge Circuits. (a) Precharge Circuit for 6T SRAM Array. (b) Precharge Circuit for New Loadless 4T SRAM Array.

IV. SENSE AMPLIFIER
One of the major issues in the design of SRAMs is the speed of the read operation. For high-performance SRAMs, it is essential to take care of the read speed both in the cell-level design and in the design of the SA. The primary function of an SA in SRAMs is to amplify the small analog differential voltage developed on the bitlines by a read-accessed cell to a full-swing digital output signal, thus greatly reducing the time required for a read operation. The choice and design of an SA define the robustness of bitline sensing and affect the read speed and power. High-density memories commonly come with increased bitline parasitic capacitances. These large capacitances slow down voltage sensing and make bitline voltage swings energy-consuming, which results in slower and more power-hungry memories. The need for larger memory capacity, higher speed and lower power dissipation imposes trade-offs in the design of the SA. Also, since SRAMs do not feature data refresh after sensing, the sensing operation must be non-destructive.

There are many types of SA. The one used in this paper is the latch-type SA [4, 8]. An SA is present in every column of the SRAM array. Except for the local precharge circuits, both versions of the SRAM array use the same type of SA. Figure 4(a) shows the latch-type SA with local precharge circuit for the 6T SRAM array and Figure 4(b) shows the latch-type SA with local precharge circuit for the new loadless 4T SRAM array. The SA has a cross-coupled latch in its configuration, which relaxes the gain requirement of the amplifier. The sizing of the transistors is done using the same methodology as that of the 6T SRAM cell. Using the approximate values, the simulations were run and the widths were optimized to get the best output.

The read operation begins by precharging and equalizing both bitlines, while simultaneously biasing the latch-type SA in the high-gain metastable region by precharging and equalizing its inputs. Then, to read a particular word from the SRAM array, the corresponding row is selected by enabling the WL. Once a sufficient voltage difference has built up between the bitlines, the SA is enabled by the read enable (RE) signal. The SA senses which bitline is heading towards the high voltage and which bitline is heading towards ground potential, and a full voltage swing is then obtained at the output.

Figure 4. Sense Amplifiers. (a) Latch-Type SA with Local Precharge Circuit for 6T SRAM Array. (b) Latch-Type SA with Local Precharge Circuit for New Loadless 4T SRAM Array.

V. DECODER AND WRITE DRIVER CIRCUITS

The decoder and the write driver circuits are the same for both types of SRAM arrays. The decoder circuit is presented in section V-A and the write driver circuit in section V-B.

A. Decoder Circuit

A decoder is used to decode the given input address and to enable a particular WL. There are various types of decoders available. The one used in this paper is the dynamic decoder. Dynamic decoders [6] have the following advantages when compared to the other types of decoders. (a) Fewer transistors are used. (b) The layout of the decoder is simple and less time consuming. (c) The power consumption is lower. (d) The speed of the decoder is also good. In particular, a dynamic NAND decoder is used in this paper rather than a dynamic NOR decoder, as the former consumes less area and less power than the latter. For an n-word memory, an m:n dynamic NAND decoder is used, where m = log2(n). The schematic of a 2:4 dynamic NAND decoder is shown in Figure 5(a). Here all the outputs of the array are high by default, with the exception of the selected row, which is low. Since the interface between decoder and memory often includes a buffer, it can be made inverting to enable the WL.

B. Write Driver Circuit

The function of the SRAM write driver is to write the input data to the bitlines when the Write Enable (WE) signal is enabled; otherwise the data is not written onto the bitlines. Only one write driver is needed for each SRAM column. Thus the area impact of a larger write driver is not multiplied by the number of cells in the column, and hence the write driver can be sized up if necessary. The schematic of the write driver circuit is shown in Figure 5(b).

Figure 5. Decoder and Write Driver Circuits. (a) 2:4 Dynamic NAND Decoder. (b) Write Driver.
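The logical behaviour of the decoder described in section V-A can be sketched as follows. This is only a truth-table model under stated assumptions: the function names are hypothetical, and the precharge and evaluate phases of the dynamic circuit are not modelled. It shows the relation m = log2(n), the active-low outputs of the NAND decoder, and the inverting buffer that produces the active-high WL.

```python
def address_bits(n_words):
    """Return m = log2(n) for an n-word memory; n must be a power of two."""
    if n_words < 1 or n_words & (n_words - 1):
        raise ValueError("n_words must be a power of two")
    return n_words.bit_length() - 1


def nand_decoder(address, m):
    """Active-low outputs of an m:2**m decoder: every row is 1 except the selected one."""
    n = 1 << m
    if not 0 <= address < n:
        raise ValueError("address out of range")
    return [0 if row == address else 1 for row in range(n)]


def word_lines(decoder_outputs):
    """Inverting buffer: turns the active-low decoder outputs into active-high WLs."""
    return [1 - out for out in decoder_outputs]


m = address_bits(4)            # 2:4 decoder
outputs = nand_decoder(2, m)   # select row 2
wl = word_lines(outputs)
print(m, outputs, wl)
```

For the 32-word arrays used later in the paper, the same relation gives a 5:32 decoder.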
VI. SIMULATION ENVIRONMENT AND RESULTS

The following configurations of SRAM arrays were designed and analysed using the conventional 6T SRAM cell and the new loadless 4T SRAM cell: (a) 1*1 (b) 16*16 (c) 32*32. The configurations were simulated using HSPICE [11] with the nominal Predictive Technology Model (PTM) in 130nm, 90nm and 65nm CMOS technologies [12]. The functionality of the 1*1 6T SRAM cell is shown in Figure 6 and that of the 1*1 new loadless 4T SRAM cell in Figure 7. The functionality of the 32*32 6T SRAM array is shown in Figure 8 and that of the 32*32 new loadless 4T SRAM array in Figure 9. For the 1K-bit (32*32) configuration, along with the relevant input control signals, only the signals for three input data bits (0th, 16th and 31st), three output data bits (0th, 16th and 31st), and the corresponding storage nodes of the appropriate cells are presented.

The following order is used to check the functionality of both types of SRAM cells: write 1, read 1, write 0, read 0. Both cells operated correctly in the different array configurations at a temperature of 27°C, with VDD=1.5V (in 130nm CMOS technology), VDD=1.2V (in 90nm CMOS technology) and VDD=1.1V (in 65nm CMOS technology). Both cells were operated at a frequency of 333.33MHz. Each bitline was assumed to have a capacitance of 20fF, and a load of 20fF was connected to each output line of the SA. A similar procedure was used to check the functionality of the other configurations of both types of SRAM arrays in all the CMOS technologies.

Figure 6. Write-Read Cycle of 1-Bit 6T SRAM. (a) In 130nm CMOS Technology. (b) In 90nm CMOS Technology. (c) In 65nm CMOS Technology.

Figure 7. Write-Read Cycle of 1-Bit New Loadless 4T SRAM. (a) In 130nm CMOS Technology. (b) In 90nm CMOS Technology. (c) In 65nm CMOS Technology.

Figure 8. Write-Read Cycle of 1K-Bit 6T SRAM. (a) In 130nm CMOS Technology. (b) In 90nm CMOS Technology. (c) In 65nm CMOS Technology.

Figure 9. Write-Read Cycle of 1K-Bit New Loadless 4T SRAM. (a) In 130nm CMOS Technology. (b) In 90nm CMOS Technology. (c) In 65nm CMOS Technology.
The following signals are shown in the simulation results:
pc: the PC signal given to the precharge circuits.
dpc: the clock signal given to the decoder circuits.
wl0: the WL signal of row 0 in the SRAM array. This is the output of the inverting buffer circuit.
we: the write enable signal given to the write driver circuits.
di0, di16 and di31: the 0th, 16th and 31st input data bits.
lpc: the precharge signal given to the local precharge circuits.
re: the read enable signal given to the sense amplifier circuits.
xm0_0.q, xm0_16.q and xm0_31.q: the 0th, 16th and 31st storage nodes (true node q) of the SRAM cells in row 0.
dt0, dt16 and dt31: the 0th, 16th and 31st output data bits of the sense amplifier circuits.

The SNM of both types of SRAM cells was obtained for different values of Cell Ratio (CR) using the setup given in section II. The results are tabulated for 130nm CMOS technology in Table I, for 90nm CMOS technology in Table II and for 65nm CMOS technology in Table III. It is observed that even for a low value of CR the 6T SRAM cell is more stable than the new loadless 4T SRAM cell. To match the stability of the 6T SRAM cell, the CR of the new loadless 4T SRAM cell must be made high.

The access time was measured for both types of 1Kb SRAM array and the results are tabulated. The read access time is the time measured from the point at which the RE signal reaches 10% of VDD to the point at which the output signal comes within +/- 10% of VDD of the required logic value. The write access time is the time measured from the point at which the WE signal reaches 50% of VDD to the point at which the storage node of the cell reaches 50% of VDD.
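The write access time definition above amounts to finding two threshold crossings in the simulated waveforms. The following is a minimal sketch of that measurement on sampled data, not the authors' measurement script: the function names and the sample waveform are hypothetical, WE is assumed to be active high, and a write of logic 1 (rising storage node) is assumed.

```python
def crossing_time(times, values, level, rising=True):
    """Time at which a sampled waveform first crosses `level`,
    using linear interpolation between samples; None if it never does."""
    for i in range(1, len(times)):
        v0, v1 = values[i - 1], values[i]
        crossed = (v0 < level <= v1) if rising else (v0 > level >= v1)
        if crossed:
            frac = (level - v0) / (v1 - v0)
            return times[i - 1] + frac * (times[i] - times[i - 1])
    return None


def write_access_time(times, we, q, vdd):
    """Delay from WE rising through 50% of VDD to the storage node
    rising through 50% of VDD (a write of logic 1 is assumed)."""
    t_we = crossing_time(times, we, 0.5 * vdd)
    t_q = crossing_time(times, q, 0.5 * vdd)
    if t_we is None or t_q is None:
        return None
    return t_q - t_we


# Synthetic samples in ps and V (illustrative only, not data from the paper).
times_ps = [0.0, 100.0, 200.0, 300.0]
we = [0.0, 1.2, 1.2, 1.2]
q = [0.0, 0.0, 0.0, 1.2]
t_write = write_access_time(times_ps, we, q, vdd=1.2)
print(t_write)
```

The read access time could be measured with the same helper by using the 10% level on RE as the start point and the 10% tolerance band around the final logic value on the output as the end point.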
The access times for both types of SRAM arrays, with CR=3 for the 6T SRAM and CR=4 for the new loadless 4T SRAM, are shown in Table IV for 130nm CMOS technology, in Table V for 90nm CMOS technology and in Table VI for 65nm CMOS technology. It is observed that the read access time of the 1Kb 6T SRAM array is less than that of the 1Kb new loadless 4T SRAM array, while the write access time of the 1Kb 6T SRAM array is more than that of the 1Kb new loadless 4T SRAM array.

The total power dissipation (TPD) of the various configurations of SRAM arrays using both types of cells was measured and the results are tabulated. The comparison of TPD for different array configurations, with CR=3 for the 6T SRAM and CR=4 for the new loadless 4T SRAM, is shown in Table VII for 130nm CMOS technology, in Table VIII for 90nm CMOS technology and in Table IX for 65nm CMOS technology. It is observed that for all configurations the new loadless 4T SRAM arrays consume less power than the 6T SRAM arrays.

The total number of transistors used in the various configurations of SRAM arrays using both types of SRAM cells is tabulated in Table X. It is observed that for all configurations the new loadless 4T SRAM arrays use fewer transistors, and hence occupy less area, than the 6T SRAM arrays.
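The reduction figures reported in Tables VII and X follow from the ratio of the 4T value to the 6T value. As a quick check, the sketch below recomputes them from the tabulated numbers; the helper name and the dictionary layout are illustrative choices, while the values themselves are copied from the tables.

```python
def reduction_percent(baseline, improved):
    """Percentage by which `improved` is lower than `baseline`."""
    return 100.0 * (baseline - improved) / baseline


# (6T value, 4T value) pairs copied from Tables VII and X.
tpd_130nm_mw = {"1*1": (0.089408, 0.05309),
                "16*16": (1.4833, 0.8738),
                "32*32": (3.1688, 1.8397)}
transistors = {"1*1": (31, 29),
               "16*16": (2064, 1552),
               "32*32": (7232, 5184)}

tpd_reduction = {k: round(reduction_percent(*v), 2) for k, v in tpd_130nm_mw.items()}
transistor_reduction = {k: round(reduction_percent(*v), 2) for k, v in transistors.items()}
print(tpd_reduction)
print(transistor_reduction)
```

The transistor savings in Table X equal two transistors per cell (2, 512 and 2048 for the three configurations), so the relative reduction grows with array size as the fixed peripheral circuitry becomes a smaller share of the total.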
TABLE I. COMPARISON OF SNM FOR DIFFERENT CRS IN 130NM CMOS TECHNOLOGY

Cell Ratio    SNM-6T (in mV)    SNM-4T (in mV)
3             290               70
4             310               250
5             320               370

TABLE II. COMPARISON OF SNM FOR DIFFERENT CRS IN 90NM CMOS TECHNOLOGY

Cell Ratio    SNM-6T (in mV)    SNM-4T (in mV)
3             260               40
4             270               170
5             280               270

TABLE III. COMPARISON OF SNM FOR DIFFERENT CRS IN 65NM CMOS TECHNOLOGY

Cell Ratio    SNM-6T (in mV)    SNM-4T (in mV)
3             240               30
4             250               150
5             260               230

TABLE IV. ACCESS TIMES FOR BOTH THE TYPES OF SRAM ARRAYS WITH CR=3 FOR 6T SRAM AND CR=4 FOR 4T SRAM IN 130NM CMOS TECHNOLOGY

Metric               1K-Bit 6T SRAM    1K-Bit New Loadless 4T SRAM
Read Access Time     608ps             996ps
Write Access Time    145ps             118ps

TABLE V. ACCESS TIMES FOR BOTH THE TYPES OF SRAM ARRAYS WITH CR=3 FOR 6T SRAM AND CR=4 FOR 4T SRAM IN 90NM CMOS TECHNOLOGY

Metric               1K-Bit 6T SRAM    1K-Bit New Loadless 4T SRAM
Read Access Time     671ps             1290ps
Write Access Time    145ps             92.2ps

TABLE VI. ACCESS TIMES FOR BOTH THE TYPES OF SRAM ARRAYS WITH CR=3 FOR 6T SRAM AND CR=4 FOR 4T SRAM IN 65NM CMOS TECHNOLOGY

Metric               1K-Bit 6T SRAM    1K-Bit New Loadless 4T SRAM
Read Access Time     781ps             1990ps
Write Access Time    134ps             87.6ps

TABLE VII. COMPARISON OF TPD FOR DIFFERENT ARRAY CONFIGURATIONS WITH CR=3 FOR 6T SRAM AND CR=4 FOR 4T SRAM IN 130NM CMOS TECHNOLOGY

Configuration    TPD (6T) (in mW)    TPD (4T) (in mW)    Reduction in TPD
1*1              0.089408            0.05309             40.62%
16*16            1.4833              0.8738              41.09%
32*32            3.1688              1.8397              41.94%
TABLE VIII. COMPARISON OF TPD FOR DIFFERENT ARRAY CONFIGURATIONS WITH CR=3 FOR 6T SRAM AND CR=4 FOR 4T SRAM IN 90NM CMOS TECHNOLOGY

Configuration    TPD (6T) (in mW)    TPD (4T) (in mW)    Reduction in TPD
1*1              0.048379            0.026362            45.51%
16*16            0.82611             0.44497             46.14%
32*32            1.7484              0.91411             47.72%

TABLE IX. COMPARISON OF TPD FOR DIFFERENT ARRAY CONFIGURATIONS WITH CR=3 FOR 6T SRAM AND CR=4 FOR 4T SRAM IN 65NM CMOS TECHNOLOGY

Configuration    TPD (6T) (in mW)    TPD (4T) (in mW)    Reduction in TPD
1*1              0.036               0.0189              47.50%
16*16            0.5918              0.30994             47.63%
32*32            1.2478              0.6497              47.93%

TABLE X. COMPARISON OF TOTAL NUMBER OF TRANSISTORS FOR DIFFERENT ARRAY CONFIGURATIONS

                 Total Number of Transistors
Configuration    6T SRAM array    New Loadless 4T SRAM array    Reduction
1*1              31               29                            6.45%
16*16            2064             1552                          24.81%
32*32            7232             5184                          28.32%

VII. CONCLUSION

The new loadless 4T SRAM cell has been designed and analyzed in deep submicron (130nm, 90nm and 65nm) CMOS technologies, which establishes the technology independence of the new loadless 4T SRAM cell and its consistent performance with respect to the conventional 6T SRAM cell in the deep submicron regime. The new loadless 4T SRAM array consumes less power and occupies less area than the conventional 6T SRAM array. The new loadless 4T SRAM cell operates with high stability for higher values of CR. The most significant feature of the new loadless 4T SRAM cell is that there is no need to modify the fabrication process. Thus it can be used for on-chip caches in embedded microprocessors, for high-density SRAMs embedded in any logic devices, as well as for stand-alone SRAM applications.

ACKNOWLEDGMENT

Sandeep R would like to thank I. K. Ravish Kumar, Project Manager, Intel Technology India Private Limited, Bangalore, for helpful discussions and for providing the tool.

REFERENCES

[1] James S. Caravella, A Low Voltage SRAM for Embedded Applications, IEEE Journal of Solid-State Circuits, vol. 32, no. 3, pp. 428-432, March 1997.
[2] Yen-Jen Chang, Shanq-Jang Ruan, and Feipei Lai, Design and Analysis of Low-Power Cache using Two-Level Filter Scheme, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 11, no. 4, pp. 568-580, August 2003.
[3] Jinshen Yang and Li Chen, A New loadless 4-transistor SRAM cell with a 0.18µm CMOS technology, Electrical and Computer Engineering, CCECE Canadian Conference, pp. 538-541, April 2007.
[4] Andrei Pavlov and Manoj Sachdev, CMOS SRAM Circuit Design and Parametric Test in Nano-Scaled Technologies, Springer, 2008.
[5] Nestoras Tzartzanis, High Performance Energy-Efficient Design, pp. 89-119, Springer, 2006.
[6] Jan M. Rabaey, Anantha P. Chandrakasan, and Borivoje Nikolic, Digital Integrated Circuits, PHI, 2003.
[7] Sung-Mo Kang and Yusuf Leblebici, CMOS Digital Integrated Circuits, TMH, 2003.
[8] Mohammad Sharifkhani, Design and Analysis of Low-Power SRAMs, PhD Thesis, University of Waterloo, 2006.
[9] Tegze P. Haraszti, CMOS Memory Circuits, Kluwer Academic Publishers, 2002.
[10] Ingvar Carlson, Design and Evaluation of High Density 5T SRAM Cache for Advanced Microprocessors, Masters Thesis, Linkopings University, 2004.
[11] HSPICE for Windows (Version: Z-2007.03), Inc, 2007 and StarHspice Manual, Release 2007.3.
[12] Berkeley Predictive Technology Model website, http://www.eas.asu.edu/%7Eptm/
Sandeep R received his B.E. Degree (Electronics and Communication) from Visvesvaraya Technological University in 2006. He is currently pursuing an M.Tech Degree (Electronics) at Visvesvaraya Technological University. His main research interests include analysis and design of low power memories and adders.

Narayan T Deshpande received his B.E. Degree from Bangalore University in 1990 and his M.E. Degree from Bangalore University in 1996. His main research interests include signal processing.

A R Aswatha received his B.E. Degree from Mysore University in 1991, his M.Tech Degree from M.I.T. Manipal in 1996 and his M.S. Degree from B.I.T.S. Pilani in 2002. He has submitted his thesis for the PhD degree at Dr. M.G.R University. His main research interests include analysis and design of low power VLSI circuits and image processing.