Static Random Access Memory (SRAM) Design

Abhinandan Majumdar MS. Computer Engineering am2993@columbia.edu

Srinivas Satish MS. Computer Engineering ssn2111@columbia.edu

December 10, 2007

Final Project EE 4321 VLSI Circuits Prof. Azeez Bhavnagarwala

INDEX
1. INTRODUCTION
   1.1 Design
   1.2 SRAM Operation
   1.3 Applications and Uses
2. DESIGN
   2.1 Block Diagram
   2.2 Decoder
      2.2.1 2 Input AND Gate Design
      2.2.2 3 Input AND Gate Design
      2.2.3 3x8 Decoder
      2.2.4 6x64 Decoder
      2.2.5 Decoder Resizing
   2.3 SRAM Cell and Array Design
      2.3.1 Precharge Circuitry
      2.3.2 SRAM Cell
      2.3.3 Read Sensing Circuit
      2.3.4 Write Driver
      2.3.5 SRAM Array
      2.3.6 SRAM Cell with Decoder
      2.3.7 Read Stability
   2.4 DC Simulation
      2.4.1 Static Noise Margin (SNM)
      2.4.2 Cell Read Current
      2.4.3 Effect of Threshold Voltage (Vt)
3. LAYOUT
   3.1 Decoder
      3.1.1 AND2 Gate
      3.1.2 AND3 Gate
      3.1.3 3x8 Decoder
      3.1.4 6x64 Decoder
   3.2 SRAM
      3.2.1 Precharge
      3.2.2 Read Sensing Circuit
      3.2.3 SRAM 64x64 Array
4. RESULTS
   4.1 Simulation Results
      4.1.1 Simulation of One SRAM Cell
      4.1.2 Simulation of 64x64 SRAM Array
   4.2 DRC & LVS Results
5. CONCLUSION
6. REFERENCES

1. INTRODUCTION
Static random access memory (SRAM) is a type of semiconductor memory. The word "static" indicates that the memory retains its contents as long as power remains applied, unlike dynamic RAM (DRAM) that needs to be periodically refreshed.

1.1 Design

Fig 1.1 A six-transistor CMOS SRAM cell.

Random access means that locations in the memory can be written to or read from in any order, regardless of the memory location that was last accessed. Each bit in an SRAM is stored on four transistors that form two cross-coupled inverters. This storage cell has two stable states, which are used to denote 0 and 1. Two additional access transistors control access to the storage cell during read and write operations. It thus typically takes six MOSFETs to store one memory bit.

Access to the cell is enabled by the word line (WL in the figure), which controls the two access transistors M5 and M6. These, in turn, control whether the cell is connected to the bit lines BL and BL_B (the complement of BL). The bit lines are used to transfer data for both read and write operations. While it is not strictly necessary to have two bit lines, both the signal and its inverse are typically provided since this improves noise margins.

During read accesses, the bit lines are actively driven high and low by the inverters in the SRAM cell. This improves SRAM speed compared to DRAM: in a DRAM, the bit line is connected to storage capacitors, and charge sharing causes the bit line to swing upwards or downwards. The symmetric structure of SRAMs also allows for differential signaling, which makes small voltage swings more easily detectable. Another difference from DRAM that contributes to making SRAM faster is that commercial chips accept all address bits at a time. By comparison, commodity DRAMs have the address multiplexed in two halves, i.e. higher bits followed by lower bits, over the same package pins in order to keep their size and cost down.

The size of an SRAM with m address lines and n data lines is 2^m words, or 2^m x n bits.

1.2 SRAM Operation

An SRAM cell can be in three different states: standby, where the circuit is idle; reading, when the data has been requested; and writing, when the contents are being updated. The three states work as follows:

a) Standby
If the word line is not asserted, the access transistors M5 and M6 disconnect the cell from the bit lines. The two cross-coupled inverters formed by M1-M4 will continue to reinforce each other as long as they are disconnected from the outside world.

b) Reading
Assume that the content of the memory is a 1, stored at Q. The read cycle is started by precharging both bit lines to a logical 1, then asserting the word line WL, enabling both access transistors. The second step occurs when the values stored in Q and Q_B are transferred to the bit lines by leaving BL at its precharged value and discharging BL_B through M1 and M5 to a logical 0. On the BL side, the transistors M4 and M6 pull the bit line toward VDD, a logical 1. If the content of the memory was a 0, the opposite would happen: BL_B would be pulled toward 1 and BL toward 0.

c) Writing
A write cycle begins by applying the value to be written to the bit lines. If we wish to write a 0, we apply a 0 to the bit lines, i.e. we set BL_B to 1 and BL to 0. This is similar to applying a reset pulse to an SR latch, which causes the flip-flop to change state. A 1 is written by inverting the values of the bit lines.
WL is then asserted and the value to be stored is latched in. This works because the bit line input drivers are designed to be much stronger than the relatively weak transistors in the cell itself, so that they can easily override the previous state of the cross-coupled inverters. Careful sizing of the transistors in an SRAM cell is needed to ensure proper operation.

1.3 Applications and Uses

a) Characteristics
SRAM is a little more expensive, but faster and significantly less power hungry (especially when idle) than DRAM. It is therefore used where speed, low power, or both are of prime interest. SRAM is also easier to control (interface to) and generally more truly random access than modern types of DRAM. Due to a more complex internal structure, SRAM is less dense than DRAM and is therefore not used for high-capacity, low-cost applications such as the main memory in personal computers.

b) Clock speed and power
The power consumption of SRAM varies widely depending on how frequently it is accessed. It can be as power hungry as dynamic RAM when used at high frequencies, and some ICs can consume many watts at full speed. On the other hand, static RAM used at a somewhat slower pace, such as in applications with moderately clocked microprocessors, draws very little power and can have nearly negligible power consumption when sitting idle, in the region of a few microwatts.

Static RAM exists primarily as general purpose products:
(i) with asynchronous interface, such as the 28 pin 32Kx8 chips (usually named XXC256), and similar products up to 16 Mb per chip
(ii) with synchronous interface, usually used for caches and other applications requiring burst transfers, up to 18 Mb (256Kx72) per chip

and integrated on chip:
(i) as RAM or cache memory in microcontrollers (usually from around 32 bytes up to 128 kilobytes)
(ii) as the primary caches in powerful microprocessors, such as the x86 family, and many others (from 8 KB up to several megabytes)
(iii) on application specific ICs, or ASICs (usually in the order of kilobytes)
(iv) in FPGAs and CPLDs (usually in the order of a few kilobytes or less)

c) Uses

(i) Embedded Use

Many categories of industrial and scientific subsystems, automotive electronics and similar equipment contain static RAM. A small amount (kilobytes or less) is also embedded in practically all modern appliances, toys, etc. that implement an electronic user interface. Several megabytes may be used in complex products such as digital cameras, cell phones, synthesizers, etc. SRAM in its dual-ported form is sometimes used for real-time digital signal processing circuits.

(ii) In computers

SRAM is also used in personal computers, workstations, routers and peripheral equipment: internal CPU caches and external burst mode SRAM caches, hard disk buffers, router buffers, etc. LCD screens and printers also normally employ static RAM to hold the image displayed (or to be printed). Small SRAM buffers are also found in CD-ROM and CD-RW drives; usually 256 KB or more is used to buffer track data, which is transferred in blocks instead of as single values. The same applies to cable modems and similar equipment connected to computers. The so-called "CMOS RAM" on PC motherboards was originally a battery-powered SRAM chip, but is today more often implemented using EEPROM or Flash.

2. DESIGN

2.1 Block Diagram

The block diagram of the 64x64 bit SRAM is given below.

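[Block diagram: address lines A5-A0 enter the 6x64 Decoder, whose outputs WL0, WL1, ..., WL63 drive the 64x64 bit SRAM Array; the array data lines are D0, D1, ..., D63.]

Fig 2.1: 64x64 bit SRAM Cell Block Diagram

There are two major blocks to be designed: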

Address decoder: The address decoder takes in the 6 address lines A5:A0 coming from the latch and decodes them to generate the 64 wordlines WL0-WL63 for the SRAM array.

SRAM array: Consists of an array of 64 x 64 one-bit SRAM cells. In addition to these blocks, the array also contains circuitry for writing data into the array and for precharging the bitlines to VDD before a read operation; these circuits are not shown in the figure.
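The interplay of the two blocks can be sketched with a small behavioral model. This is only an illustration of the addressing scheme, not of the circuit: the class name `SramArray` and its methods are hypothetical, and timing, precharge and sensing are ignored.

```python
class SramArray:
    """Behavioral model of a 2**m word by n bit SRAM (no timing, no circuit effects)."""

    def __init__(self, address_bits=6, word_bits=64):
        self.address_bits = address_bits
        self.word_bits = word_bits
        # One list of bits per row; every cell starts at 0.
        self.rows = [[0] * word_bits for _ in range(2 ** address_bits)]

    def capacity_bits(self):
        # 2**m words of n bits each.
        return (2 ** self.address_bits) * self.word_bits

    def decode(self, address):
        """Return the one-hot wordline vector [WL0, WL1, ...] for an address."""
        if not 0 <= address < 2 ** self.address_bits:
            raise ValueError("address out of range")
        return [1 if row == address else 0 for row in range(2 ** self.address_bits)]

    def write(self, address, bits):
        """Store a full word in the row whose wordline is asserted."""
        if len(bits) != self.word_bits:
            raise ValueError("word has the wrong number of bits")
        selected_row = self.decode(address).index(1)
        self.rows[selected_row] = list(bits)

    def read(self, address):
        """Return a copy of the word in the row whose wordline is asserted."""
        selected_row = self.decode(address).index(1)
        return list(self.rows[selected_row])


sram = SramArray()                # 6 address bits, 64-bit words
word = [1, 0, 1] + [0] * 61       # an arbitrary 64-bit test pattern
sram.write(0, word)
```

With the default parameters the model holds 64 words of 64 bits, and exactly one wordline is asserted for any valid address.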

2.2 DECODER

To construct a 64x64 bit SRAM, we need a 6x64 address decoder to select one of the 64 word lines, each row containing 64 one-bit SRAM cells. The decoder logic must be as fast as possible so that it does not become the bottleneck of the whole design. Considering speed and layout issues, we use domino logic for all the intermediate stages.

A 6x64 decoder can be built either from three 2x4 decoders in the first stage, ANDing the corresponding outputs, or from two 3x8 decoders. The former needs 64 three-input AND gates and 12 two-input AND gates, all designed in domino logic, while the latter needs 64 two-input AND gates and 16 three-input AND gates. Since a three-input AND gate takes much more area and presents a higher gate capacitance, we chose the latter design for the 6x64 decoder.

Fig 2.2 (a): 6x64 Decoder using three 2x4 decoders (requires 64 three-input and 12 two-input AND gates)

Fig 2.2 (b): 6x64 Decoder design using two 3x8 decoders (requires 64 two-input and 16 three-input AND gates)
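The gate counts quoted for the two options follow from simple counting, sketched below. The function name `decoder_gate_counts` is hypothetical; it assumes equal-size predecoders whose outputs are each one AND gate, followed by one final AND gate per wordline.

```python
def decoder_gate_counts(address_bits, predecoder_bits):
    """Count AND gates in a two-level decoder built from equal-size predecoders."""
    if address_bits % predecoder_bits != 0:
        raise ValueError("address bits must split evenly across the predecoders")
    num_predecoders = address_bits // predecoder_bits
    return {
        # Each predecoder output is an AND of predecoder_bits inputs.
        "predecode_fan_in": predecoder_bits,
        "predecode_gates": num_predecoders * 2 ** predecoder_bits,
        # Each wordline is an AND of one output from every predecoder.
        "final_fan_in": num_predecoders,
        "final_gates": 2 ** address_bits,
    }


three_2x4 = decoder_gate_counts(6, 2)   # three 2x4 predecoders
two_3x8 = decoder_gate_counts(6, 3)     # two 3x8 predecoders
```

For the 2x4 option this gives 12 two-input gates plus 64 three-input gates; for the 3x8 option it gives 16 three-input gates plus 64 two-input gates, so the chosen design has far fewer three-input gates.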

2.2.1 2 Input AND Gate Design

We designed the 2 input AND gate using domino logic. Here is the schematic of the design.

Fig 2.3: Schematic Design of AND2 Gate

i) Frequency Calculation. We kept inputs A and B at 1.2V and checked how high a clock frequency the gate could operate at. We found that it needs a clock period of at least 0.4ns, i.e. a maximum frequency of 2.5GHz.

Fig 2.4: Frequency Variation for AND2 Gate

ii) PFET size calculation. We simulated the gate for varying pfet sizes and found that the pfet should be kept as small as possible while still precharging quickly enough at the given frequency of 2.5GHz. We decided on a pfet width of 715nm so that the gate precharges at a faster rate.

Fig 2.5: Pfet width variation for AND2 Gate

iii) Sizing of nfets. We scaled the widths of the nfet array by a constant ratio a so that the propagation delay is minimized. Increasing the scaling ratio decreases the propagation delay, hence we decided on a = 1.3.

Fig 2.6: NFET size variation for AND2 Gate

iv) Keeper PFET sizing. The keeper PFET is the one whose gate is driven by the output of the inverter. It prevents the voltage on the intermediate capacitance from dropping below the switching threshold VM of the inverter during the evaluation stage. The first graph is the clock. The second graph shows that without a keeper pfet the output voltage rises by millivolts. If we connect a keeper pfet and set its width to b*(sum of the widths of the nfet array), the output stays stable at 0, and the fluctuation decreases as b increases. Hence we chose b = 0.15.
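The two sizing rules above (nfet widths scaled by the ratio a, keeper width equal to b times the summed nfet widths) can be written out as a short sketch. The starting width of 1um and the stack of three nfets are illustrative assumptions, the helper names are hypothetical, and the sketch does not say which end of the stack gets the smallest device.

```python
def stack_widths(first_width_um, ratio, count):
    """Widths of an nfet stack in which each device is `ratio` times the previous one."""
    return [first_width_um * ratio ** i for i in range(count)]


def keeper_width(nfet_widths_um, b):
    """Keeper pfet width as a fraction b of the summed nfet widths."""
    return b * sum(nfet_widths_um)


# Illustrative values only: a = 1.3 and b = 0.15 come from the text,
# the 1 um starting width and the three-device stack are assumptions.
widths = stack_widths(1.0, 1.3, 3)
keeper = keeper_width(widths, 0.15)
```

Under these assumptions the stack comes out at roughly 1um, 1.3um and 1.69um, and the keeper at about 0.6um.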

Fig 2.7: Keeper PFET sizing for AND2 gate

v) Inverter Sizing. Ideally the nfet should be made stronger than the pfet so that the voltage drop across the intermediate capacitance is greater than the VM of the inverter. However, making the nfet stronger adds delay. Since the keeper pfet keeps the intermediate capacitance charged, we can instead enlarge the pfet to obtain equal rise and fall times. Hence we chose a beta ratio of 2.45.

Fig 2.8: Inverter size variation for AND2 Gate

2.2.2 3 INPUT AND GATE

The ratios obtained for the 2 input AND gate are kept the same for the 3 input gate. The open question was whether to build the 3 input AND from two cascaded AND2 gates or as a single AND3 gate. We therefore computed the propagation delay for each option. AND2_1 and AND2_2 denote the cascade of two AND2 gates with the switching input applied to the first and the second AND2 respectively.

Gate                High to Low   Low to High   Propagation Delay
AND2                0             1.15ns        0.575ns
AND2_1 (cascaded)   0             1.18ns        0.59ns
AND2_2 (cascaded)   0             1.19ns        0.595ns
AND3                0             1.46ns        0.73ns

Hence cascaded AND2 gates would make our design faster but could make it asymmetrical, so we chose AND3.
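The propagation delay figures in the comparison are consistent with taking the average of the high-to-low and low-to-high delays. A minimal sketch of that calculation, using the measured values reported above (the dictionary and function names are hypothetical):

```python
# (high-to-low, low-to-high) delays in ns, as reported in the comparison above.
delays_ns = {
    "AND2": (0.0, 1.15),
    "AND2_1 (cascaded)": (0.0, 1.18),
    "AND2_2 (cascaded)": (0.0, 1.19),
    "AND3": (0.0, 1.46),
}


def propagation_delay(t_hl_ns, t_lh_ns):
    """Average of the high-to-low and low-to-high delays."""
    return (t_hl_ns + t_lh_ns) / 2.0


propagation_ns = {gate: propagation_delay(*pair) for gate, pair in delays_ns.items()}
slowest_gate = max(propagation_ns, key=propagation_ns.get)
```

The averages match the reported column, and the single AND3 gate is the slowest of the four options.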

[Simulation waveforms: AND2 (only one 2 Input AND), AND2_1 (cascaded 2 Input AND), AND2_2 (cascaded 2 Input AND), AND3 (3 Input AND)]

2.2.3 3x8 DECODER

Here is the schematic for the decoder.

Fig 2.9: 3x8 Decoder Schematic

Here is the simulation graph:

Fig 2.10: Simulation of 3x8 Decoder

2.2.4 6x64 Decoder

We used two 3x8 decoders and 64 two-input AND gates to form the 6x64 decoder logic. Here is the schematic.

Fig 2.11: Schematic of 6x64 Decoder

We kept inputs A1-A5 at 0 and swept A0 from 0 to 1.2V, and saw Y0 fall while Y1 rose to 1.2V.
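The logical behavior of this structure can be checked with a small functional sketch. It models logic values only, not domino timing. The assignment of A2:A0 to one 3x8 decoder and A5:A3 to the other is an assumption, and the function names are hypothetical.

```python
def and_gate(*inputs):
    """Logical AND of any number of 0/1 inputs."""
    return int(all(inputs))


def decode_3x8(a2, a1, a0):
    """3x8 decoder: output k is the AND3 of the true or complemented inputs matching k."""
    outputs = []
    for index in range(8):
        terms = []
        for position, bit in ((2, a2), (1, a1), (0, a0)):
            wants_one = (index >> position) & 1
            terms.append(bit if wants_one else 1 - bit)
        outputs.append(and_gate(*terms))
    return outputs


def decode_6x64(address):
    """6x64 decoder from two 3x8 decoders and 64 two-input AND gates."""
    bits = [(address >> k) & 1 for k in range(6)]        # bits[k] is A_k
    low = decode_3x8(bits[2], bits[1], bits[0])          # assumed: A2..A0
    high = decode_3x8(bits[5], bits[4], bits[3])         # assumed: A5..A3
    return [and_gate(high[row // 8], low[row % 8]) for row in range(64)]


y_before = decode_6x64(0b000000)   # A0 = 0: Y0 selected
y_after = decode_6x64(0b000001)    # A0 = 1: Y1 selected
```

Toggling A0 with all other inputs at 0 moves the single asserted output from Y0 to Y1, which is the transition observed in the simulation.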

Fig 2.12: Propagation Delay at the Critical Path for 6x64 Decoder

2.2.5 Decoder Resizing

The delay we obtained after the initial design was 5.177ns - 5.025ns = 0.152ns when running at 1GHz and driving a capacitance of 39.931fF. We computed this load capacitance using a gate capacitance of 1fF/um and a wire capacitance of 0.2fF/um. In this initial design the AND3 nfets start at W1 = 1um with the rest sized by the ratio 1.3, its inverter nfet has W2 = 1um, the AND2 nfets start at W3 = 1um and are sized by the same ratio 1.3, and its inverter nfet has W4 = 1um.

To minimize the delay while keeping the rise and fall times equal, we optimized the sizes as follows.

For AND3:
NFET Array: 2u, 2.6u, 3.38u, 4.395u
PFET: 3u
Keeper PFET: 800nm
Inverter: NFET 3u, PFET 2.9u

For AND2:
NFET Array: 5.8u, 7.54u, 9.8u
PFET: 3.2u
Keeper PFET: 2.2u
Inverter: NFET 3u, PFET 2.9u

Here is the critical path.

Fig 2.13: Schematic of Critical Path in 6x64 Decoder

We obtained fall and rise times for the four stages of 33.94ps, 34.94ps, 33.23ps and 34.99ps. With this sizing, our propagation delay was reduced from 152ps to 89ps (1.594ns - 1.505ns = 89ps). Hence we kept these sizes.
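The delay figures before and after resizing are differences of two time stamps read from the simulation. A short sketch of the arithmetic follows; it assumes the earlier time stamp is the input edge and the later one is the output edge, and the function name is hypothetical.

```python
def delay_ps(input_edge_ns, output_edge_ns):
    """Propagation delay in picoseconds from two edge time stamps in nanoseconds."""
    return (output_edge_ns - input_edge_ns) * 1000.0


before_ps = delay_ps(5.025, 5.177)   # initial sizing
after_ps = delay_ps(1.505, 1.594)    # optimized sizing
reduction_percent = 100.0 * (before_ps - after_ps) / before_ps
```

This reproduces the 152ps and 89ps figures, a reduction of a little over 40 percent.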

Fig 2.14: Propagation of Critical Path in 6x64 Decoder after Optimization

2.3 SRAM cell and array design

2.3.1 Precharge circuitry

The schematic of the precharge circuit is shown below. The pfets are 1um wide. This large width is required to charge the bitlines quickly during the precharge phase: it ensures that the bitlines BIT and BIT_B are charged to VDD within half a clock cycle.

Fig 2.15: Schematic of Precharge Circuit

2.3.2 SRAM Cell

The schematic of the cell is shown below. The widths of the access transistors and of the inverter nfets and pfets are the same as those given in the layout.

Fig 2.16: Schematic of SRAM Cell

2.3.3 Read Sense Circuit

The schematic of the read sense circuit is shown below. The basic NAND gate is sized with an nfet width of 280nm and a pfet width of 560nm, a ratio of 4.8:1. This is the required ratio in the 90nm process with channel length = 80nm for achieving ideal rise and fall times.

Fig 2.17: Schematic of Read Sense Circuit

2.3.4 Write driver

The write driver is enabled by a Write_enable line. The schematic is shown below.

Fig 2.18: Schematic of Write Circuit

2.3.5 The complete SRAM Array

The following is the schematic of the 64x64 bit SRAM array.

Fig 2.19: Schematic of SRAM Array

2.3.6 SRAM Array with Decoder

Here is the schematic of the complete SRAM with decoder.

Fig 2.20: Schematic of SRAM Array with 6X64 Decoder

2.3.7 Read Stability

Read stability is an important characteristic of the SRAM cell. During a read operation, one of the bitlines, either BIT or BIT_B, is discharged through the access transistor and an nfet of the inverter. During this discharge, a large current flows through node A (shown below). Read stability is a measure of the potential at node A: this potential should not exceed the switching threshold of the other inverter. If it does, the state of the SRAM cell changes. An analogous analysis was done to identify the tradeoffs between read current and static noise margin. The read stability graph follows.

Fig 2.21: Simulation of Read Stability

2.4 DC SIMULATION

2.4.1 STATIC NOISE MARGIN

Here is the schematic of the SRAM cell for static noise margin measurement. We sweep the voltage on the left node and measure the voltage on the right node, then do the reverse, and plot both curves together as the butterfly curve. The SNM is the smallest edge of the largest boxes that fit inside the lobes of the butterfly curve.
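The box-fitting step can be sketched numerically. The sketch below uses the common 45-degree diagonal approach: for each line y = x + c it finds where the line crosses the two curves and takes the horizontal distance between the crossings as the side of a candidate square. The tanh-shaped transfer curve, its gain and its switching point are stand-ins for simulated data, and all function names are hypothetical; this is not the procedure used to obtain the values in this report.

```python
import math


def inverter_vtc(vin, vdd=1.2, vm=0.6, gain=50.0):
    """Idealized inverter transfer curve (tanh shape), a stand-in for simulated data."""
    return 0.5 * vdd * (1.0 - math.tanh(gain * (vin - vm)))


def root_of_decreasing(func, lo, hi, iterations=60):
    """Bisection for a monotonically decreasing function; clamps to [lo, hi]."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if func(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def static_noise_margin(vtc_left, vtc_right, vdd=1.2, steps=600):
    """Side of the largest square in the smaller lobe of the butterfly curve.

    Curve A is y = vtc_left(x); curve B is x = vtc_right(y).
    """
    best_upper = 0.0   # lobe above the line y = x
    best_lower = 0.0   # lobe below the line y = x
    for k in range(1, steps):
        c = vdd * k / steps
        for offset in (c, -c):
            # Crossing of y = x + offset with curve A, solved in x.
            x_a = root_of_decreasing(lambda x: vtc_left(x) - x - offset, 0.0, vdd)
            # Crossing of y = x + offset with curve B, solved in y.
            y_b = root_of_decreasing(lambda y: vtc_right(y) - y + offset, 0.0, vdd)
            x_b = y_b - offset
            if offset > 0:
                best_upper = max(best_upper, x_a - x_b)
            else:
                best_lower = max(best_lower, x_b - x_a)
    return min(best_upper, best_lower)


snm = static_noise_margin(inverter_vtc, inverter_vtc)
```

For two identical inverters the lobes are equal, so either one gives the SNM; with mismatched curves the smaller lobe sets the result.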

Fig 2.22: Schematic of SRAM Array with 6X64 Decoder

(i) HOLD operation. We keep the gates of the pass transistors at GND and get the following curve. The SNM for this case is 0.4604V.

Fig 2.23: Hold operation

(ii) READ operation. The SNM we got was 0.1616V. The graph is as follows.

Fig 2.24: Static Noise Margin estimation of SRAM Cell

2.4.2 Cell Read Current

Cell read current is the current that flows through the pass gate nfet connected to BL, draining charge from BL into the cell ground terminal. The larger the current, the faster BL is discharged and develops a signal for the sensing circuit to detect. However, a very large read current flowing through the discharge path from the bit line to ground could cause the read stability threshold to be exceeded. This can be avoided by choosing appropriate sizes for the access nfet and the discharging nfet of the corresponding inverter during a read operation cycle.
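The link between read current and sensing speed can be made concrete with the constant-current approximation t = C * dV / I. The sketch below is a first-order estimate only; the bitline capacitance, required swing and read current are hypothetical placeholders, not values measured in this project.

```python
def bitline_discharge_time(c_bitline_f, delta_v, i_read_a):
    """Time for a constant read current to pull the bitline down by delta_v volts."""
    if i_read_a <= 0:
        raise ValueError("read current must be positive")
    return c_bitline_f * delta_v / i_read_a


# Hypothetical values for illustration: 100 fF bitline, 0.1 V swing, 20 uA read current.
t_slow = bitline_discharge_time(100e-15, 0.1, 20e-6)
t_fast = bitline_discharge_time(100e-15, 0.1, 40e-6)   # doubling the current
```

Doubling the read current halves the time needed to develop the same bitline signal, which is the speed side of the tradeoff against read stability.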

Fig 2.25: Cell Read Current Simulation

2.4.3 Effect of Threshold Voltage (Vt)

We changed Vt by 25mV, 50mV, 100mV and 200mV by adding a negative (-ve) voltage to the gate, and got the following SNM values.

Vt shift   Pass nfet   Pull down nfet   Pfet
25mV       0.1638      0.1626           0.1518
50mV       0.1725      0.1655           0.1483
100mV      0.1900      0.1732           0.1422
200mV      0.2246      0.1778           0.1252
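To make the trend easier to read, the values can be expressed as a change relative to an unshifted cell. The sketch below assumes that the values are read SNM in volts and that the read SNM of 0.1616V reported in section 2.4.1 is the baseline; both are assumptions, and the names are hypothetical.

```python
NOMINAL_READ_SNM_V = 0.1616   # assumed baseline: read SNM from section 2.4.1

# SNM for each Vt shift in mV, copied from the values above (assumed to be in volts).
snm_by_device = {
    "pass nfet": {25: 0.1638, 50: 0.1725, 100: 0.1900, 200: 0.2246},
    "pull down nfet": {25: 0.1626, 50: 0.1655, 100: 0.1732, 200: 0.1778},
    "pfet": {25: 0.1518, 50: 0.1483, 100: 0.1422, 200: 0.1252},
}


def snm_change(device):
    """Change in SNM relative to the assumed baseline, per Vt shift."""
    return {
        shift_mv: round(snm - NOMINAL_READ_SNM_V, 4)
        for shift_mv, snm in snm_by_device[device].items()
    }


changes = {device: snm_change(device) for device in snm_by_device}
```

Under these assumptions, raising Vt on the pass nfet or the pull down nfet increases the SNM, while raising it on the pfet decreases it.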

Fig 2.26 - Effect on SNM of increasing Vt at pass nfet

Fig 2.27 - Effect on SNM of increasing Vt at pull down nfet

Fig 2.28 - Effect of increasing Vt at one end of pfet and measuring other side.

3. LAYOUT
3.1 DECODER

3.1.1 AND2 Gate

Here is the layout of AND2 gate which passes both DRC and LVS

Fig 3.1- DRC and LVS results for AND2 Gate along with layout.

3.1.2 AND3 Gate

Here is the layout of AND3 gate which passes both DRC and LVS

Fig 3.2- DRC and LVS results for AND3 Gate along with layout.

3.1.3 3x8 DECODER

Here is the layout of 3x8 Decoder which passes both DRC and LVS

Fig 3.3- DRC and LVS results for 3x8 Decoder along with layout.

3.1.4 6x64 DECODER

Here is the layout of the 6x64 Decoder, which passes both DRC and LVS.

Fig 3.4- DRC and LVS results for 6x64 Decoder along with layout.

3.2 SRAM

3.2.1 Precharge circuit layout

The width of the entire precharge circuit layout should be equal to the width between the two bit lines BIT and BIT_B. Below is an image of our layout of this circuit with its DRC and LVS results.

Fig 3.5- DRC and LVS results for Precharge Circuit along with layout

3.2.2 Read Sense Amp Circuit

In the layout of the read circuit, care has to be taken to ensure that it fits exactly between the two bitlines. The laterally mirrored layout of the SRAM cells adds some complexity, because the bitlines now run in the repeating order BIT, BIT_B, BIT_B, BIT. For a read it is sufficient to sense one of the bit lines, either BIT or BIT_B. Two read sense amps therefore have to fit between the two BIT lines. The LVS results and the layout of the read sense amp are shown in the image below.

Fig 3.6 DRC and LVS results for Read Sense Amplifier along with layout

3.2.3 SRAM 64 X 64 array

Using the SRAM cell provided in the standard library, we created a symmetrical, laterally mirrored 2 X 2 network of SRAM cells. This was done to share the power rails well and to reduce bit line noise. Although not done in our layout, cross coupling the bit lines would reduce the bit line noise considerably.

Using instances of the 2 X 2 network, the entire array was laid out as a 64 X 32 top half and a 64 X 32 bottom half, as shown in the schematic of phase two. The read sense amplifiers were then inserted between the top and bottom halves of the SRAM array layout. The image below shows the layout of the 2 X 2 network of SRAM cells on the left and the 64 X 64 layout of SRAM cells on the right.

Fig 3.7- Array of SRAM Cells, 2 X 2 and 64 X 64 arrays.

Image below shows the DRC test results:

Fig 3.8: DRC results for the 64 X 64 SRAM array

Here is the complete layout of the SRAM array with decoder.

Fig 3.9: 64 X 64 SRAM array along with 6x64 Decoder

4. RESULTS
4.1 Simulation Results

4.1.1 Simulation for One Cell SRAM

We simulated a single cell SRAM with the following schematic.

Fig 4.1 One Cell SRAM Schematic

Below is a graph showing the Write 1, Read 1, Write 0 simulation on a single SRAM cell.

Fig 4.2 One Cell SRAM Simulation

4.1.2 Simulation for 64x64 bit SRAM Array

Here is the schematic used for 64x64 bit SRAM Array

Fig 4.3 64x64 SRAM Array

Here are the simulation results for din<0> = 1, din<1> = 0 and din<2> = 1, with the address lines at 000000 and the clock running at 1GHz.

Fig 4.4 Simulation for complete 64x64 SRAM cell Array

4.2 DRC and LVS Results

The DRC and LVS were checked for each component individually. The following is a summary of the results:

Functional Component   DRC      LVS
6 X 64 Decoder         Passed   Passed
Precharge              Passed   Passed
Read Sense Amp         Passed   Passed
64 X 64 SRAM array     Errors   Errors

All reports for these tests can be found at the following locations on http://vlsi2.cisl.columbia.edu
/home/user5/fall07/ssn2111/LVS_FinalReports
/home/user5/fall07/ssn2111/DRC_FinalReports

5. CONCLUSION
As the SRAM project for the EE 4321 VLSI course, we designed a 64x64 bit SRAM at both the schematic and the layout level. We designed the 6x64 decoder from 3x8 decoders, using two and three input AND gates in domino logic. We successfully simulated and verified the functionality of the components we set out to design. We could not pass DRC and LVS for the entire unit, primarily because the unit cell provided to us itself failed DRC and LVS, but we did pass DRC and LVS for the other individual components, including the precharge circuit, the read sensing circuit and the 6x64 decoder.

Working on such a design-oriented project gave us thorough insight into the critical issues that need to be considered when designing even a simple unit. It also made us familiar with the different approaches to implementing the same design and with the tradeoffs between the alternatives. It made us aware of the critical physical implementation issues, which have to be considered not only during the actual layout but also during schematic level design. It also gave us hands-on experience with CAD tools such as Cadence, Virtuoso, Spice and Spectre, which are widely used in both industry and academia for circuit design. Overall, it was a valuable experience in learning, practicing and designing a critical part of the processor, widely used in computer architecture.

6. REFERENCES
1. http://en.wikipedia.org/wiki/Static_random_access_memory
2. CMOS Logic, Uyemura
3. CMOS VLSI Design, Weste & Harris
4. Static-Noise Margin Analysis of CMOS SRAM Cells, Evert Seevinck, Frans J. List and Jan Lohstroh
5. Analyzing Static Noise Margin for Subthreshold SRAM in 65nm CMOS, Benton H. Calhoun and Anantha Chandrakasan
6. Transistor Sizing for Reliable Domino Logic Design in Dual Threshold Voltage Technologies, Seong-Ook Jung, Ki-Wook Kim and Sung-Mo (Steve) Kang
