
B.Tech Seminar Report

PHASE CHANGE MEMORY


Submitted in partial fulfillment for the award of the Degree of Bachelor of Technology in Computer Science and Engineering

Submitted by EJAKHAN.N.M (Reg. No.09422008)

Under the guidance of

Mrs. SOORYA SURENDRAN

Department of Computer Science and Engineering


HINDUSTAN COLLEGE OF ENGINEERING, ARIPPA, KOLLAM KERALA OCTOBER 2012

CERTIFICATE

This is to certify that the thesis entitled PHASE CHANGE MEMORY is a bonafide record of the seminar done by EJAKHAN.N.M (Reg. No.09422008) under my supervision and guidance, in partial fulfillment for the award of Degree of Bachelor of Technology in Computer Science and Engineering from the University of Kerala for the year 2012.

Mrs. BEENA.V.R (Coordinator)
Asst. Professor, Dept. of CSE

Mrs. SOORYA SURENDRAN (Guide)
Asst. Professor, Dept. of CSE

Prof. FRAHAD MUSADEEKH
Head of Department, Dept. of Computer Science & Engineering

Place: Arippa Date:

ACKNOWLEDGEMENT
First and foremost, I wish to place on record my ardent and earnest gratitude to my seminar guide, Mrs. SOORYA SURENDRAN, Assistant Professor, Dept. of Computer Science and Engineering. Her tutelage and guidance were the leading factor in translating my efforts to fruition. Her prudent and perceptive vision has shed light on my trail to triumph. I am extremely happy to express my gratitude to Prof. FRAHAD MUSADEEKH, Head of the Department of Computer Science and Engineering, for providing me with all the facilities for the completion of this work. Last but not least, I would like to express my gratitude to my seminar coordinator, Mrs. BEENA.V.R, for her valuable assistance during the course of the seminar. I would also extend my gratitude to all the staff members in the Department. I also thank all my friends and well-wishers who greatly helped me in my endeavour.

EJAKHAN.N.M

ABSTRACT
Phase-change memory (also known as PRAM, Ovonic Unified Memory, Chalcogenide RAM and C-RAM) is a type of non-volatile random-access memory. PRAMs exploit the unique behavior of chalcogenide glass. Heat produced by the passage of an electric current switches this material between two states, crystalline and amorphous. Recent versions can achieve two additional distinct states, in effect doubling their storage capacity. PRAM is one of several new memory technologies competing in the non-volatile role with the almost universal flash memory. The latter technology has a number of practical problems that these replacements hope to address. Leon Chua, who is considered to be the father of non-linear circuit theory, has argued that all 2-terminal non-volatile memory devices, including phase change memory, should be considered memristors. Stan Williams of HP Labs has also argued that phase change memory should be considered a memristor. The crystalline and amorphous states of chalcogenide glass have dramatically different electrical resistivity. The amorphous, high-resistance state represents a binary 0, while the crystalline, low-resistance state represents a 1. Chalcogenide is the same material used in re-writable optical media. In those instances, the material's optical properties are manipulated, rather than its electrical resistivity, as the refractive index of chalcogenide also changes with the state of the material.

CONTENTS

Chapter No.   TITLE

              List of Figures
              List of Tables
              List of Abbreviations
1             INTRODUCTION
2             WHAT IS PCM?
3             CHALCOGENIDE MATERIALS
4             THEORY OF OPERATIONS
4.1           Writing
4.2           Reading
5             MAIN FEATURES
6             DISADVANTAGES
7             COMPARISON
8             PCM BASED MEMORY MODELS
9             HYBRID MAIN MEMORY ORGANIZATION
9.1           Lazy Write Organization
9.2           Line Level Writes
9.3           Fine-Grained Wear-Leveling for PCM
9.4           Page Level Bypass for Write Filtering
9.5           Impact of These Techniques
10            CONCLUSION

LIST OF ABBREVIATIONS

DRAM    Dynamic Random Access Memory
FGWL    Fine Grained Wear Leveling
HDD     Hard Disk Drive
LLWB    Line Level Write Back
OS      Operating System
PCM     Phase Change Memory
PLB     Page Level Bypass

LIST OF FIGURES

Figure No.   Title

2.1          Typical PCM cell
2.2          PCM storage cell and its implementation
3.1          Periodic table Chalcogenide
4.1          Amorphous and Polycrystalline
4.2          Current-Voltage characteristic
4.3          Set Pulse and Reset Pulse
4.4          Reading
7.1          Typical Access Latencies
8.1          Main Memory Organizations
9.1          Lazy write Organization
9.2          Fine Grained Wear Leveling

LIST OF TABLES

Table No.   Title

4.1         Set operation VS Reset Operation
7.1         Comparison
7.2         PCM VS Flash
9.1         Impact of the different techniques on performance
9.2         Comparison

CHAPTER 1

INTRODUCTION

Current computer systems consist of several cores on a chip, and sometimes several chips in a system. As the number of cores in the system increases, the number of concurrently running applications (or threads) increases, which in turn increases the combined working set of the system. The memory system must be capable of supporting this growth in the total working set. For several decades, DRAM has been the building block of the main memories of computer systems. However, with the increasing size of the memory system, a significant portion of the total system power and the total system cost is spent in the memory system. Current DRAM-based main memory systems are starting to hit the power and cost limit. Therefore, technology researchers have been studying new memory technologies that can provide more memory capacity than DRAM while still being competitive in terms of performance, cost, and power.

An alternative memory technology that uses resistance contrast in phase-change materials is being actively investigated in the circuits community; this is known as Phase Change Memory (PCM). These devices offer more density relative to DRAM, and can help increase the main memory capacity of future systems while remaining within the cost and power constraints.

There are several challenges to overcome before PCM can become a part of the main memory system. First, PCM is much slower than DRAM, so a memory system consisting exclusively of PCM would have a much higher memory access latency, adversely impacting system performance. Second, PCM devices are likely to sustain a significantly smaller number of writes than DRAM, so the write traffic to these devices must be reduced. Otherwise, the short lifetime may significantly limit the usefulness of PCM for commercial systems.

CHAPTER 2

WHAT IS PCM

PCM is a type of non-volatile memory that exploits the property of chalcogenide glass to switch between two states, amorphous and crystalline, with the application of heat using electrical pulses. The phase change material can be switched from one phase to another reliably, quickly, and a large number of times. The amorphous phase has low optical reflectivity and high electrical resistivity, whereas the crystalline phase has high reflectivity and low resistance. PCM exploits the difference in the electrical resistivity of the material in these different phases. The difference in resistance between the two states is typically about five orders of magnitude and can be used to infer the logical states of binary data, namely 1 (high bit) and 0 (low bit).

Figure 2.1: Typical PCM cell

The figure shows a graphical representation of a basic PCM storage element. As shown on the left, a layer of chalcogenide is sandwiched between a top electrode and a bottom electrode. A resistive heating element extends from the bottom electrode and contacts a layer of the chalcogenide material. Current injected into the junction of the chalcogenide and the heater induces the phase change through Joule heating.

Figure 2.2: PCM storage cell and its implementation

The figure at right shows an actual implementation of the concept, in which an amorphous bit is formed in a layer of polycrystalline chalcogenide. Because of the change in reflectivity, the amorphous bit appears as a mushroom-cap-shaped structure in the layer of polycrystalline chalcogenide.

CHAPTER 3

CHALCOGENIDE MATERIALS

The PCM technology uses a class of materials known as chalcogenides. Chalcogenides are alloys that contain an element in the Oxygen/Sulphur family of the Periodic Table, i.e. Group 16 in the new style or Group VIa in the old style Periodic Table (usually combined with group IV and V elements).

Figure 3.1: Periodic table-Chalcogenide

The history of phase-change materials can be traced back to work starting in the 1950s by Dr. Stanford Ovshinsky, who was researching the properties of a class of glassy materials that exhibited the ability to change easily and stably between two phases. By the late 1960s, he had reported that certain of these materials exhibited a reversible change in both resistivity and reflectivity when changing between an ordered (polycrystalline) state and a disordered (amorphous) state. It was recognized that this effect could be exploited for optical memory as well as electronic memory. Phase-change materials have been in use for many years in high-volume rewritable CDs and DVDs, which make use of the difference in optical properties.

Numonyx PCM uses an alloy of Germanium, Antimony and Tellurium (Ge2Sb2Te5), known more commonly as GST. Most companies performing research and development in PCM today are using GST or closely related alloys. Other alloys being used for PCM research are Nitrogen-doped GST, Sb2Te3 with N-doping (STN), and AgInSbTe (silver-indium-antimony-tellurium).

CHAPTER 4

THEORY OF OPERATION

Phase-change chalcogenides exhibit a reversible phase change between the amorphous phase and the crystalline phase. As illustrated in Figure 4.1, in the amorphous phase the material lacks the regular order of a crystalline lattice. In this phase, the material demonstrates high resistivity and low reflectivity. In contrast, in the polycrystalline phase, the material has a regular crystalline structure and exhibits high reflectivity and low resistivity.

Figure 4.1: Amorphous and Polycrystalline

In PCM, we are exploiting the difference in resistivity between the two phases of the material. This phase change is induced in the material through localized Joule heating caused by current injection. The final phase of the material is modulated by the magnitude of the injected current and the time of the operation.

4.1 WRITING

The PCM material lies between a top and a bottom electrode, with a heating element that extends from the bottom electrode and establishes contact with the PCM material. When current is injected into the junction of the material and the heating element, it induces the phase change.

Crystallizing the phase-change material by heating it above the crystallization temperature (but below the melting temperature) is called the SET operation. The SET operation uses electrical pulses of moderate power and long duration; it returns the cell to a low-resistance state and logically stores a 1. Melt-quenching the material is called the RESET operation, and it makes the material amorphous. The RESET operation uses high-power pulses, which place the memory cell in a high-resistance state and logically store a 0.

In phase-change memory, threshold switching provides a means to deliver, at low voltage, the programming current needed to program a bit that is in the high-resistance state. From a high-resistance (RESET) state, a PCM bit is programmed into a low-resistance (SET) state by applying a programming voltage in excess of Vth, allowing the bit to enter the dynamic ON state. Current is then allowed to flow for a length of time sufficient to ensure crystallization. The device can then be programmed to the RESET state by applying a short, somewhat larger current pulse to a bit in the polycrystalline state. The reset pulse only needs to be of sufficient magnitude and duration to melt the programmed volume of chalcogenide alloy, and to have a falling edge fast enough to let the molten programmed volume of material cool quickly enough to vitrify. The duration of the reset pulse can be short, since the material in the programmed volume can be heated to the melting point in a few nanoseconds.

Figure 4.2: current-voltage characteristic

Figure 4.2 shows the current-voltage characteristics for an Ovonic Unified Memory (OUM) cell element in both the RESET (amorphous, high-resistance) and SET (crystalline, low-resistance) states, with the key device parameters: the Read, SET and RESET regimes and the SET and RESET states. Vh is the holding voltage, and Vth is the switching threshold voltage.

Figure 4.3: Set Pulse and Reset Pulse

Table 4.1: Set operation vs Reset operation

                   SET OPERATION                    RESET OPERATION
Action             Crystallizing the PCM            Melt-quenching to make it amorphous
Pulse              Moderate power, long duration    Higher power, short duration
Resulting state    Low resistance                   High resistance
Logical value      1                                0
Pulse amplitude    ~150 µA, 1.2 V                   ~300 µA, 1.6 V
Energy             90 µW for 150 ns, ~13.5 pJ       480 µW for 40 ns, ~19.2 pJ
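The energy figures in Table 4.1 follow from multiplying the pulse power by the pulse duration. The short Python sketch below reproduces that arithmetic; the function and variable names are illustrative only and are not taken from the report.

```python
# Pulse parameters as quoted in Table 4.1
# (power in microwatts, duration in nanoseconds).
PULSES = {
    "SET": {"power_uW": 90.0, "duration_ns": 150.0},
    "RESET": {"power_uW": 480.0, "duration_ns": 40.0},
}


def pulse_energy_pj(power_uW, duration_ns):
    """Energy of one pulse in picojoules.

    1 uW * 1 ns = 1e-6 W * 1e-9 s = 1e-15 J = 1e-3 pJ,
    so the product is divided by 1000 to get picojoules.
    """
    return power_uW * duration_ns / 1000.0


if __name__ == "__main__":
    for name, pulse in PULSES.items():
        energy = pulse_energy_pj(pulse["power_uW"], pulse["duration_ns"])
        print(f"{name}: {energy:.1f} pJ")
```

Although a RESET pulse draws more power, it is much shorter than a SET pulse, so the two operations cost a similar amount of energy per bit.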

4.2 READING

To read the data, the chips use a smaller current to determine which state the chalcogenide is in. Information stored in the cell is read out by measuring the cell's resistance. In read mode, the cell resistance is sensed at a voltage less than Vth, typically 0.4 V. This ensures that reading does not affect the state of the cell and that no writing can take place. Prior to reading the cell, the bit line is precharged to the read voltage. The word line is active low when using a BJT access transistor. If the selected cell is in the crystalline state, which has low resistance, the bit line is discharged, with current flowing through the storage element and the access transistor. Otherwise, if the cell is in the amorphous state, it prevents or limits the bit line current, since in this state the material has high resistance.
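The read decision described above can be sketched as a comparison of the bit line current against a reference. In the Python sketch below only the 0.4 V read voltage comes from the text; the two cell resistances and the reference current are hypothetical values chosen for illustration (five orders of magnitude apart, as stated in Chapter 2), and `read_bit` is an illustrative name.

```python
READ_VOLTAGE_V = 0.4        # read voltage quoted in the text (below Vth)

# Hypothetical values for illustration only, not taken from the report.
R_CRYSTALLINE_OHM = 1e4     # assumed low-resistance (SET) cell
R_AMORPHOUS_OHM = 1e9       # assumed high-resistance (RESET) cell
REFERENCE_CURRENT_A = 1e-6  # assumed sense reference current


def read_bit(cell_resistance_ohm,
             read_voltage_v=READ_VOLTAGE_V,
             reference_current_a=REFERENCE_CURRENT_A):
    """Return the stored bit inferred from the bit line current.

    A crystalline (low-resistance) cell lets enough current flow to
    discharge the bit line and reads as 1; an amorphous cell limits the
    current and reads as 0.
    """
    bit_line_current_a = read_voltage_v / cell_resistance_ohm
    return 1 if bit_line_current_a > reference_current_a else 0


if __name__ == "__main__":
    print("crystalline cell reads", read_bit(R_CRYSTALLINE_OHM))
    print("amorphous cell reads", read_bit(R_AMORPHOUS_OHM))
```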

CHAPTER 5

MAIN FEATURES

5.1 BIT-ALTERABLE

Like RAM or EEPROM, PCM is bit-alterable. Flash technology requires a separate erase step in order to change information. Information stored in bit-alterable memory can be switched from a one to a zero or from a zero to a one without a separate erase step.

5.2 SCALING

Both NOR and NAND rely on memory structures which are difficult to shrink at small lithography nodes. This is due to the gate thickness remaining constant and the need for an operating voltage of more than 10 V, while the operation of CMOS logic has been scaled to 1 V or even less. This scaling trend is often referred to as Moore's Law, where memory densities double with each smaller generation. Flash relies on floating-gate memory structures, which are difficult to shrink. With PCM, as the memory cell shrinks, the volume of GST material shrinks as well, providing a truly scalable solution. Chalcogenide films have already been proven to have stable characteristics down to a 5 nm node. As the PCM memory cell shrinks, the volume of GST material involved in the state change shrinks, resulting in reduced power consumption or higher write performance. This unique feature of PCM technology supports the promise of scalability beyond that of other memory technologies.

5.3 DENSITY

PCM is a dense technology with a feature size comparable to DRAM cells. Furthermore, a PCM cell can be in different degrees of partial crystallization, thereby enabling more than one bit to be stored in each cell. Recently, a prototype with two logical bits in each physical cell has been demonstrated. This means four states with different degrees of partial crystallization are possible, which allows twice as many bits to be stored in the same physical area. Hence the density of PCM is almost four times that of DRAM, and more information can be stored in a PCM than in a DRAM of a given size.
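The relation between the number of distinguishable crystallization states and the bits stored per cell is a base-2 logarithm. A minimal Python sketch, with `bits_per_cell` as an illustrative name that does not come from the report:

```python
def bits_per_cell(resistance_levels):
    """Bits stored by a cell with the given number of distinct states.

    The number of levels is assumed to be a power of two (2, 4, 8, ...),
    so the result is log2(levels) computed with integer arithmetic.
    """
    if resistance_levels < 2 or resistance_levels & (resistance_levels - 1):
        raise ValueError("levels must be a power of two, at least 2")
    return resistance_levels.bit_length() - 1


if __name__ == "__main__":
    print("2 states:", bits_per_cell(2), "bit per cell")
    print("4 states:", bits_per_cell(4), "bits per cell")
```

Going from two states to four states doubles the bits per cell, which is the doubling of storage described above.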

5.4 NON-VOLATILE

PCM is non-volatile. It does not require a constant power supply to retain information, while DRAM does. Hence there is no need for refresh circuits to maintain the data in a PCM.

5.5 READ PERFORMANCE

Like RAM and NOR-type flash, the technology features fast random access times. This enables the execution of code directly from the memory, without an intermediate copy to RAM. The read latency of PCM is comparable to single-bit-per-cell NOR flash, while the read bandwidth can match DRAM. In contrast, NAND flash suffers from long random access times, on the order of tens of microseconds, that prevent direct code execution.

5.6 WRITE/ERASE PERFORMANCE

PCM is capable of achieving write speeds like NAND, but with lower latency and with no separate erase step required. NOR flash features moderate write speeds but long erase times. As with RAM, no separate erase step is required with PCM, but the write speed (bandwidth and latency) does not match the capability of RAM today. The capability of PCM is expected, however, to improve with each process generation as the PCM cell area decreases.

CHAPTER 6

DISADVANTAGES

6.1 LIMITED LIFETIME

The number of writes to a PCM cell is limited to about 10^9, after which the memory cell begins to wear out. This is because the operation is temperature dependent, and the repeated heating causes expansion and contraction of the material.

6.2 HIGH ACCESS LATENCIES

PCM also suffers from high access latencies compared to DRAM. The latency is around 250 ns for PCM, whereas it is 60 ns for DRAM.

6.3 HIGH ENERGY CONSUMPTION

Though PCM enjoys the advantage of having almost zero leakage power, it suffers from higher dynamic power consumption. This is mainly because the read and write operations are temperature dependent.

CHAPTER 7

COMPARISON

Figure 7.1: Typical Access Latencies

Figure 7.1 shows the typical access latency (in cycles, assuming a 4 GHz machine) of different memory technologies, and their relative place in the overall memory hierarchy. Hard disk drive (HDD) latency is typically about four to five orders of magnitude higher than DRAM. A technology that is denser than DRAM and has an access latency between DRAM and hard disk can bridge this speed gap. Flash-based disk caches have already been proposed to bridge the gap between DRAM and hard disk, and to reduce the power consumed in HDD. However, with Flash being 2^8 times slower than DRAM, it is still important to increase DRAM capacity to reduce the accesses to the Flash-based disk cache. The access latency of PCM is much closer to DRAM, and coupled with its density advantage, PCM is an attractive technology to increase memory capacity while remaining within the system cost and power budget. Furthermore, PCM cells can sustain 1000x more writes than Flash cells, which puts the lifetime of a PCM-based memory system in the range of years, as opposed to days for a Flash-based main memory system.

Write endurance is the maximum number of writes for each cell. Data retention is the duration for which the non-volatile technologies can retain data. It can be seen from Table 7.1 that PCM has a density similar to NAND flash and a read latency similar to NOR flash. The density advantage means more main memory capacity for the same chip area, and the read latency is only about 4X slower compared to DRAM.
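Latencies quoted in nanoseconds can be converted to the cycle counts used in Figure 7.1 by multiplying by the clock frequency in GHz. The Python sketch below applies this to the DRAM and PCM latencies quoted in Section 6.2; the function name is illustrative and the result is only as accurate as those quoted figures.

```python
CLOCK_GHZ = 4  # clock frequency assumed in Figure 7.1

# Access latencies in nanoseconds, as quoted in Section 6.2.
LATENCY_NS = {"DRAM": 60, "PCM": 250}


def latency_cycles(latency_ns, clock_ghz=CLOCK_GHZ):
    """Convert a latency in nanoseconds to processor cycles.

    A clock of f GHz completes f cycles per nanosecond.
    """
    return latency_ns * clock_ghz


if __name__ == "__main__":
    for name, ns in LATENCY_NS.items():
        print(f"{name}: {ns} ns = {latency_cycles(ns)} cycles")
```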

Table 7.1: Comparison

Parameter       DRAM       NAND FLASH   NOR FLASH   PCM
Density         1X         4X           0.25X       2X to 4X
Read latency    60 ns      25 µs        300 ns      200 to 300 ns
Write speed     1 Gbps     2.4 MB/s     0.5 MB/s    100 MB/s
Endurance       N/A        10^4         10^4        10^6 to 10^8
Retention       Refresh    10 years     10 years    10 years

The write latency of PCM is about an order of magnitude slower than its read latency. However, write latency is typically not in the critical path and can be tolerated using buffers. Finally, PCM is also expected to have higher write endurance (10^6 to 10^8 writes) relative to Flash (10^4 writes).

Table 7.2: PCM vs Flash

Property       PCM                     FLASH
Performance    Fast                    Slow
Voltage        Low (0.4-2 V)           High (10-15 V)
Scaling        Good                    Bad
Endurance      Medium (~10^9)          Short (~10^5)
Current        Medium (50-300 nA)      Low (~nA)

CHAPTER 8

PCM BASED MEMORY MODEL

There are several challenges to overcome before PCM can become a part of the main memory system. First, PCM is much slower than DRAM, so a memory system consisting exclusively of PCM would have a much higher memory access latency, adversely impacting system performance. Second, PCM devices are likely to sustain a significantly smaller number of writes than DRAM, so the write traffic to these devices must be reduced. Otherwise, the short lifetime may significantly limit the usefulness of PCM for commercial systems. There is active research on PCM, and several PCM prototypes have been proposed, each optimizing for some important device characteristic (such as density, latency, bandwidth, or lifetime). While the PCM technology matures and becomes ready to be used as a complement to DRAM, it is believed that system architecture solutions can be explored to make these memories part of the main memory to improve system performance.

Figure 8.1: Main Memory Organizations

Figure 8.1(a) shows a traditional system in which DRAM main memory is backed by a disk. Flash memory is finding widespread use to reduce the latency and power requirement of disks. In fact, some systems have only Flash-based storage without hard disks; for example, the MacBook Air laptop has DRAM backed by a 64GB Flash drive. It is therefore reasonable to expect future high-performance systems to have Flash-based disk caches such as shown in Figure 8.1(b). However, because there is still a difference of two orders of magnitude in access latency between DRAM memories and the next level of storage, a large amount of DRAM main memory is still needed to avoid going to the disks. PCM can be used instead of DRAM to increase main memory capacity as shown in Figure 8.1(c). However, the relatively higher latency of PCM compared to DRAM will significantly decrease the system performance. Therefore, to get the best capacity and latency, Figure 8.1(d) shows the hybrid system we foresee emerging for future high-performance systems. The larger PCM storage will have the capacity to hold most of the pages needed during program execution, thereby reducing disk accesses due to paging. The fast DRAM memory will act as both a buffer for main memory and an interface between the PCM main memory and the processor system. We show that a relatively small DRAM buffer (3% of the size of the PCM storage) can bridge most of the latency gap between DRAM and PCM.

CHAPTER 9

HYBRID MAIN MEMORY ORGANIZATION

In a hybrid main memory organization, the PCM storage is managed by the Operating System (OS) using a Page Table, in a manner similar to current DRAM main memory systems. The DRAM buffer is organized similar to a hardware cache that is not visible to the OS, and is managed by the DRAM controller. Although the DRAM buffer can be organized at any granularity, it can be assumed that both the DRAM buffer and the PCM storage are organized at a page granularity. The DRAM memory acts as a buffer as well as an interface between the PCM and the processor.

The different techniques used in this hybrid main memory organization are:
1. Lazy write organization
2. Line level writes
3. Fine grained wear leveling
4. Page level bypass

9.1 LAZY WRITE ORGANIZATION

The Lazy-Write organization reduces the number of writes to the PCM and overcomes the slow write speed of the PCM, both without incurring any performance overhead. When a page fault is serviced, the page fetched from the hard disk (HDD) is written only to the DRAM cache. Although allocating a page table entry at the time of the page fetch from HDD automatically allocates the space for this page in the PCM, the allocated PCM page is not written with the data brought from the HDD. This eliminates the overhead of writing the PCM. To track the pages present only in the DRAM, and not in the PCM, the DRAM tag directory is extended with a presence (P) bit. When the page from HDD is stored in the DRAM cache, the P bit in the DRAM tag directory is set to 0. In the lazy write organization, a page is written to the PCM only when it is evicted from the DRAM storage and either the P bit is 0 or the dirty bit is set.

Figure 9.1: Lazy write Organization

If, on a DRAM miss, the page is fetched from the PCM, then the P bit in the DRAM tag directory entry of that page is set to 1. When a page with the P bit set is evicted from the DRAM, it is not written back to the PCM unless it is dirty. Furthermore, to account for the larger write latency of the PCM, a write queue is associated with the PCM. We assume that the tags of both the write queue and the DRAM buffer are made of SRAM, so that these structures can be probed with low latency. Given the PCM write latency, a write queue of 100 pages is sufficient to avoid stalls due to the write queue being full.
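The presence-bit rules of the lazy write organization can be sketched as a small software model. The Python sketch below is a simplified illustration under stated assumptions, not the hardware design: the class and method names (`LazyWriteBuffer`, `fill_from_disk`, `fill_from_pcm`, `evict`) are hypothetical, DRAM capacity and replacement policy are not modelled, and the write queue is an unbounded list.

```python
from dataclasses import dataclass


@dataclass
class DramEntry:
    """Tag-directory state kept for one page held in the DRAM buffer."""
    present_in_pcm: bool   # the P bit
    dirty: bool = False    # set when the page is modified in DRAM


class LazyWriteBuffer:
    """Simplified model of the DRAM buffer with lazy writes to PCM."""

    def __init__(self):
        self.dram = {}          # page number -> DramEntry
        self.write_queue = []   # pages waiting to be written to PCM

    def fill_from_disk(self, page):
        # Page fault: the page goes only into DRAM, so P = 0.
        self.dram[page] = DramEntry(present_in_pcm=False)

    def fill_from_pcm(self, page):
        # DRAM miss served by PCM: PCM already holds the data, so P = 1.
        self.dram[page] = DramEntry(present_in_pcm=True)

    def write(self, page):
        # A store to a buffered page marks it dirty.
        self.dram[page].dirty = True

    def evict(self, page):
        """Evict a page; return True if it had to be written to PCM."""
        entry = self.dram.pop(page)
        if not entry.present_in_pcm or entry.dirty:
            self.write_queue.append(page)
            return True
        return False   # clean copy already in PCM, nothing to write


if __name__ == "__main__":
    buf = LazyWriteBuffer()
    buf.fill_from_disk(1)
    buf.fill_from_pcm(2)
    print("page 1 written on eviction:", buf.evict(1))
    print("page 2 written on eviction:", buf.evict(2))
```

A page that came from disk is written to PCM once, at eviction time, and a clean page that came from PCM is never rewritten.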

9.2 LINE LEVEL WRITES

Typically, the main memory is read and written in pages. However, the endurance limits of the PCM require exploring mechanisms to reduce the number of writes to the PCM. We propose writing to the PCM memory in smaller chunks instead of a whole page. For example, if writes to a page can be tracked at the granularity of a processor's cache line, the number of writes to the PCM page can be minimized by writing only the dirty lines within a page. We propose Line Level Write Back (LLWB), which tracks the writes to pages held in the DRAM on the basis of the processor's cache lines. To do so, the DRAM tag directory shown in Figure 9.1 is extended to hold a dirty bit for each cache line in the page. In this organization, when a dirty page is evicted from the DRAM, if the P bit is 1 (i.e., the page is already present in the PCM), only the dirty lines of the page are written to the PCM. When the P bit of a dirty page chosen for eviction is 0, all the lines of the page have to be written to the PCM. LLWB significantly reduces wasteful writes from DRAM to PCM for workloads which write to very few lines in a dirty page. To support LLWB we need dirty bits per line of a page. For example, for the baseline system with a 4096B page and a 256B line size, we need 16 dirty bits per page in the tag store of the DRAM buffer.
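The per-line dirty bits of LLWB can be sketched as a 16-bit mask per page. The Python sketch below uses the 4096B page and 256B line sizes quoted in the text; the function names `mark_dirty` and `lines_to_write` are hypothetical and only illustrate the eviction rule.

```python
PAGE_BYTES = 4096                           # baseline page size
LINE_BYTES = 256                            # baseline line size
LINES_PER_PAGE = PAGE_BYTES // LINE_BYTES   # 16 dirty bits per page


def mark_dirty(dirty_mask, byte_offset):
    """Set the dirty bit of the line that contains byte_offset."""
    line = byte_offset // LINE_BYTES
    return dirty_mask | (1 << line)


def lines_to_write(dirty_mask, present_in_pcm):
    """Line indices that must be written to PCM when a dirty page is evicted.

    If the page is not yet in PCM (P bit 0), every line is written.
    Otherwise only the lines whose dirty bit is set are written.
    """
    if not present_in_pcm:
        return list(range(LINES_PER_PAGE))
    return [i for i in range(LINES_PER_PAGE) if (dirty_mask >> i) & 1]


if __name__ == "__main__":
    mask = 0
    mask = mark_dirty(mask, 0)      # touches line 0
    mask = mark_dirty(mask, 300)    # touches line 1
    print("P = 1:", lines_to_write(mask, present_in_pcm=True))
    print("P = 0:", len(lines_to_write(mask, present_in_pcm=False)), "lines")
```

In this example only 2 of the 16 lines are written when the page is already present in the PCM.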

9.3 FINE-GRAINED WEAR-LEVELING FOR PCM

Memories with limited endurance typically employ wear-leveling algorithms to extend their life expectancy. For example, in Flash memories, wear-leveling algorithms arrange data in a manner so that sector erasures are distributed more evenly across the Flash cell array and single sector failures due to a high concentration of erase cycles are minimized. LLWB reduces write traffic to PCM. However, if only some cache lines within a page are written to frequently, they will wear out sooner than the other lines in that page. We analyze the distribution of write traffic to each line in a PCM page. Figure 9.2 shows the total write back traffic per dirty page for the two database applications, db1 and db2. The average number of writes per line is also shown. The page size is 4KB and the line size is 256B, giving a total of 16 lines per page, numbered from 0 to 15. The lifetime of PCM can be increased if the writes can be made uniform across all lines in the page. This can be done by tracking the number of writes on a per-line basis; however, this would incur a huge tracking overhead.

Figure 9.2: Fine Grained Wear Leveling

Fine Grained Wear-Leveling (FGWL) is used to make the writes uniform (in the average case) while avoiding per-line storage. In FGWL, the lines in each page are stored in the PCM in a rotated manner. For a system with 16 lines per page, the rotate amount is between 0 and 15 lines. If the rotate value is 0, the page is stored in the traditional manner. If it is 1, then Line 0 of the address space is stored in Line 1 of the physical PCM page, each line is stored shifted, and Line 15 of the address space is stored in Line 0. When a PCM page is read, it is realigned. The pages are written from the Write Queue to the PCM in a line-shifted format. On a page fault, when the page is fetched from the hard disk, a Pseudo Random Number Generator (PRNG) is consulted to get a random 4-bit rotate value, and this value is stored in the WearLevelShift (W) field associated with the PCM page as shown in Figure 9.1. This value remains constant until the page is replaced, at which point the PRNG is consulted again for the new page allocated in the same physical space of the PCM.
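The line rotation used by FGWL amounts to modular arithmetic on the line index. The Python sketch below illustrates it for 16 lines per page; the function names are hypothetical, and Python's `random` module merely stands in for the hardware PRNG, whose design the text does not specify.

```python
import random

LINES_PER_PAGE = 16   # 4KB page with 256B lines


def physical_line(logical_line, rotate):
    """Physical line slot that holds a logical line for a given rotate value."""
    return (logical_line + rotate) % LINES_PER_PAGE


def store_rotated(lines, rotate):
    """Lay out the logical lines of a page in line-shifted order."""
    physical = [None] * LINES_PER_PAGE
    for logical, data in enumerate(lines):
        physical[physical_line(logical, rotate)] = data
    return physical


def read_realigned(physical, rotate):
    """Undo the rotation when the page is read back from PCM."""
    return [physical[physical_line(logical, rotate)]
            for logical in range(LINES_PER_PAGE)]


def new_rotate_value(rng=random):
    """Pick a random 4-bit rotate value (0..15) for a newly allocated page."""
    return rng.randrange(LINES_PER_PAGE)


if __name__ == "__main__":
    page = [f"line{i}" for i in range(LINES_PER_PAGE)]
    w = new_rotate_value()          # value kept in the W field of the page
    stored = store_rotated(page, w)
    print("rotate value:", w)
    print("round trip ok:", read_realigned(stored, w) == page)
```

Because each page gets its own random rotate value, frequently written logical lines land on different physical lines across pages, which evens out wear on average without per-line counters.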

9.4 PAGE LEVEL BYPASS FOR WRITE FILTERING

Not all applications benefit from more memory capacity. For example, streaming applications typically access a large amount of data but have poor reuse. Such applications do not benefit from the capacity boost provided by PCM. In fact, storing pages of such applications only accelerates the endurance-related wear-out of PCM. As PCM serves as the main memory, it is necessary to allocate space in PCM when a page table entry is allocated for a page. However, the actual writing of such pages to the PCM can be avoided by leveraging the lazy write architecture. We call this Page Level Bypass (PLB). When a page is evicted from DRAM, PLB invalidates the Page Table Entry associated with the page, and does not install the page in PCM. We assume that the OS enables/disables PLB for each application using a configuration bit. If the PLB bit is turned on, all pages of that application bypass the PCM storage.
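The PLB eviction decision can be sketched as a single branch on the per-application configuration bit. In the Python sketch below, the page table is modelled as a plain dictionary and the PCM write queue as a list; these structures and the name `evict_page` are assumptions made for illustration, and the non-bypass path is simplified to always queue the page rather than applying the P-bit and dirty-bit checks.

```python
def evict_page(page, plb_enabled, page_table, pcm_write_queue):
    """Handle eviction of a page from the DRAM buffer.

    With PLB enabled for the owning application, the page table entry is
    invalidated and the page is never installed in PCM. Otherwise the page
    is queued for writing to PCM (simplified: no P-bit or dirty check).
    """
    if plb_enabled:
        page_table.pop(page, None)   # invalidate the Page Table Entry
        return "bypassed"
    pcm_write_queue.append(page)
    return "queued"


if __name__ == "__main__":
    page_table = {10: "pcm-frame-3", 11: "pcm-frame-4"}
    queue = []
    print(evict_page(10, True, page_table, queue))    # streaming application
    print(evict_page(11, False, page_table, queue))   # ordinary application
    print("page table:", page_table, "queue:", queue)
```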

9.5 IMPACT OF THESE TECHNIQUES

Table 9.1: Impact of the different techniques on performance

Configuration    No. of bytes per cycle    Average lifetime
PCM 32GB         0.317                     7.6 years
+1GB DRAM        0.807                     3.0 years
+LAZY WRITE      0.725                     3.4 years
+LLWB            0.316                     7.6 years
+PLB             0.247                     9.7 years
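The combined effect of the techniques can be read off Table 9.1 by comparing the hybrid baseline (+1GB DRAM) with the final configuration (+PLB). The Python sketch below does this arithmetic, assuming the bytes-per-cycle column measures write traffic to the PCM; the function name is illustrative.

```python
# Rows of Table 9.1: (configuration, bytes per cycle, average lifetime in years).
TABLE_9_1 = [
    ("PCM 32GB", 0.317, 7.6),
    ("+1GB DRAM", 0.807, 3.0),
    ("+LAZY WRITE", 0.725, 3.4),
    ("+LLWB", 0.316, 7.6),
    ("+PLB", 0.247, 9.7),
]


def improvement(baseline, final):
    """Return (write-traffic reduction factor, lifetime gain factor)."""
    _, base_traffic, base_life = baseline
    _, final_traffic, final_life = final
    return base_traffic / final_traffic, final_life / base_life


traffic_factor, lifetime_factor = improvement(TABLE_9_1[1], TABLE_9_1[4])

if __name__ == "__main__":
    print(f"write traffic reduced by {traffic_factor:.2f}x")
    print(f"average lifetime extended by {lifetime_factor:.2f}x")
```

Both factors come out a little above 3, which suggests that the lifetime scales roughly inversely with the write traffic.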

Table 9.2: Comparison

No.   Parameter          DRAM    PCM     HYBRID
1     Scalability        Less    High    Limited
2     Density            Less    High    High
3     Latency (read)     Less    High    Medium
4     Write speed        High    Low     Medium
5     Dynamic power      Less    High    Medium
6     Static power       High    Nil     Medium
7     Crosstalk effect   High    Nil     Less

CHAPTER 10

CONCLUSION

Phase change memory can be exploited by the memory system and by the convergence of consumer, computer and communication electronic systems. The caching of the existing memory technologies, which reduces the overall system cost and system complexity, will be the compelling motivation for PCM adoption. Bandwidth will drive the sustaining side of PCM in code and data transfer applications, while the reduction in power dissipation will represent a further added value of this technology. However, PCM comes with the drawbacks of increased access latency and a limited number of writes. In order to overcome these disadvantages we can use it in conjunction with a DRAM buffer and make use of three techniques: Lazy Write, LLWB, and PLB. These simple techniques can reduce the write traffic by 3X and increase the average lifetime of PCM from 3 years to 9.7 years. The Fine Grained Wear Leveling (FGWL) technique can be used to make the wear-out of PCM storage uniform across all lines in a page.

PCM is today's memory breakthrough. Like flash, PCM is a non-volatile memory that can store bits even without a power supply. But unlike flash, data can be written to cells much faster, at rates comparable to the dynamic and static random-access memory (DRAM and SRAM) used in all computers and cell phones today. Quite simply, PCM blends together the best attributes of NOR flash, NAND flash, EEPROM and RAM, delivering a new category of memory for new usage models.

