
7-Memory management in C#

Something remaining?

INDEXERS

Indexers are properties with parameters that allow you to treat an object like an array. Indexers are defined like other properties, with a few differences: the property is always named this, and its parameters are declared in square brackets.

Syntax:
<access> <data_type> this[<param_type> <name>] { .. }

Indexing the object like an array then invokes the indexer property, e.g.
public bool this[int bitPos] { get { <get_logic> } set { <set_logic> } }
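A minimal sketch of the bitPos indexer above, assuming a hypothetical BitFlags class that stores 32 flags in a single int. The class name, the backing field, and the range check are illustrative choices, not part of the original slides.

```csharp
using System;

// Hypothetical type for illustration: 32 boolean flags packed into one int.
public class BitFlags
{
    private int bits; // backing storage for the flags

    // Indexer: lets callers write flags[3] as if BitFlags were an array.
    public bool this[int bitPos]
    {
        get
        {
            CheckRange(bitPos);
            return (bits & (1 << bitPos)) != 0;
        }
        set
        {
            CheckRange(bitPos);
            if (value)
                bits |= 1 << bitPos;     // turn the bit on
            else
                bits &= ~(1 << bitPos);  // turn the bit off
        }
    }

    private static void CheckRange(int bitPos)
    {
        if (bitPos < 0 || bitPos > 31)
            throw new ArgumentOutOfRangeException(nameof(bitPos));
    }
}

public static class IndexerDemo
{
    public static void Main()
    {
        var flags = new BitFlags();
        flags[3] = true;               // invokes the set accessor
        Console.WriteLine(flags[3]);   // invokes the get accessor, prints True
        Console.WriteLine(flags[4]);   // never set, prints False
    }
}
```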

Indexers

Program time!

What is Memory Management?


In unmanaged programming languages, the programmer is responsible for tracking the lifetimes of allocated objects and must release the resources once an object is no longer in use. Programmers often forget to release memory (a memory leak) or attempt to release resources that have already been released. The second case can cause the program to behave unpredictably and eventually terminate prematurely. Resource management is tedious and distracts the developer from the real problems at hand.

C#, a managed programming language, frees the programmer of this task by using the garbage collector (GC). Garbage collection is an automatic memory management mechanism. However, the programmer is still responsible for writing the code that releases any resources an object holds. The GC ensures this code is called when the object is no longer required.

CLR memory allocation


When a process is initialized, the CLR reserves a contiguous region of address space, called the managed heap. Objects in the CLR are allocated on this managed heap. The heap also maintains a pointer, which we will call NextObjPtr. Initially, NextObjPtr is set to the base address of the managed heap.

When a new object is created using the new operator, the compiler emits a newobj CIL instruction. A newobj instruction causes the runtime to do the following:
1. The CLR calculates the number of bytes required for the object (including overhead members).
2. The CLR checks whether the bytes required for the allocation are available in the reserved region.
3. If there is enough space for the object, the CLR allocates the object starting at the address pointed to by NextObjPtr.
4. The CLR increments NextObjPtr by the number of bytes of the object, so that NextObjPtr points just past the end of the object.
5. However, if enough space isn't available for the object, the GC is called to free up memory.
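The allocation steps above can be sketched as a toy bump-pointer allocator. This is only a model under stated assumptions: ToyHeap, its byte counts, and the -1 return value are hypothetical, and the real CLR allocator is considerably more involved.

```csharp
using System;

// Toy model of the managed heap: a fixed region plus a NextObjPtr offset.
// All names here are hypothetical and exist only for illustration.
public class ToyHeap
{
    private readonly int size;                    // bytes reserved for the region
    public int NextObjPtr { get; private set; }   // offset of the next free byte

    public ToyHeap(int size)
    {
        this.size = size;
        NextObjPtr = 0;                           // starts at the base of the heap
    }

    // Returns the start offset of the new object, or -1 when the region is
    // full (the point at which the real runtime would call the GC).
    public int Allocate(int objectBytes, int overheadBytes)
    {
        int needed = objectBytes + overheadBytes; // step 1: total size
        if (NextObjPtr + needed > size)           // step 2: does it fit?
            return -1;                            // step 5: no room
        int address = NextObjPtr;                 // step 3: allocate here
        NextObjPtr += needed;                     // step 4: bump the pointer
        return address;
    }
}

public static class ToyHeapDemo
{
    public static void Main()
    {
        var heap = new ToyHeap(64);
        Console.WriteLine(heap.Allocate(16, 8));  // 0
        Console.WriteLine(heap.Allocate(16, 8));  // 24
        Console.WriteLine(heap.Allocate(16, 8));  // -1 (48 + 24 exceeds 64)
    }
}
```

Because allocation is just a comparison and an addition, allocating from this kind of heap is very cheap as long as free space remains.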

GC roots
The GC checks to see if any objects in the heap are no longer being used by the application. If such objects exist, the memory used by these objects can be reclaimed. (If no more memory is available in the heap after a garbage collection, new throws an OutOfMemoryException.)

But how does the garbage collector know whether the application is using an object? This isn't simple to answer. Every application has a set of roots. A single root is a storage location containing a memory pointer to a reference type object. This pointer either refers to an object in the managed heap or is set to null.
- For example, a static field (defined within a type) is considered a root.
- In addition, any method parameter or local variable is also considered a root.
- Only variables that are of a reference type are considered roots; value type variables are never considered roots.

When a garbage collection is about to begin, the GC iterates through all the type objects (for static roots) and walks up the call stack (for local roots) to identify the current set of roots in the program.
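A small sketch of the kinds of roots listed above. RootsDemo and its members are hypothetical names. WeakReference is used here only as a probe, since it lets the code observe an object without keeping it alive. The first check is reliable because the object is still rooted; once the root is cleared the object merely becomes eligible for collection, so no particular result is claimed for the second check.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class RootsDemo
{
    // A static field of a reference type is a root.
    private static object staticRoot;

    [MethodImpl(MethodImplOptions.NoInlining)]
    public static WeakReference CreateRooted()
    {
        var obj = new object();         // the local 'obj' is a root while this method runs
        staticRoot = obj;               // the object is now also reachable from a static root
        return new WeakReference(obj);  // a weak reference does not keep the object alive
    }

    public static void ClearStaticRoot()
    {
        staticRoot = null;              // the root now holds null instead of the object
    }

    public static bool IsAliveAfterCollect(WeakReference probe)
    {
        GC.Collect();
        return probe.IsAlive;
    }

    public static void Main()
    {
        WeakReference probe = CreateRooted();
        Console.WriteLine(IsAliveAfterCollect(probe)); // True: still referenced by staticRoot

        ClearStaticRoot();
        // No root refers to the object any more, so it may now be collected.
        Console.WriteLine(IsAliveAfterCollect(probe));
    }
}
```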

GC Algorithm
1. The Marking Phase
When a garbage collection begins, the GC assumes all objects are garbage. It then starts the marking phase, in which it walks up the call stack and identifies all the roots in the program. If an object is found to be referenced by a root, it is marked by turning on a bit in the sync block of the object. The GC continues to walk through all the reachable objects recursively. If the GC comes across an object that is already marked, it stops walking down that path (because it has already looked at these objects). This helps performance and also prevents infinite loops on circular references.

Once this is done, the heap contains a set of marked objects and a set of unmarked objects. The unmarked objects are considered garbage, and their memory is reclaimed.

Problem: reclaiming this memory may leave the managed heap fragmented. If the GC determines that this is the case, it begins the compact phase, in which it removes the gaps by moving the surviving objects together.

The Marking phase


When the GC starts running, it assumes that all objects in the heap are garbage. The GC starts traversing the roots and building a graph of all objects reachable from the roots.

The figure shows:
1. The application's roots refer directly to objects A, C, D, and F.
2. All of these objects become part of the graph.
3. When adding object D, the GC notices that this object refers to object H, so object H is also added to the graph.
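The marking walk for the figure can be sketched as a small graph traversal. This is a simulation, not the CLR's implementation: Node, its Marked flag, and the explicit stack are hypothetical stand-ins, and the heap is assumed to hold objects A through H with B, E, and G unreferenced.

```csharp
using System;
using System.Collections.Generic;

public class Node
{
    public readonly string Name;
    public readonly List<Node> References = new List<Node>();
    public bool Marked;   // stands in for the mark bit on a real object

    public Node(string name) { Name = name; }
}

public static class MarkPhase
{
    // Marks every node reachable from the roots.
    public static void Mark(IEnumerable<Node> roots)
    {
        var pending = new Stack<Node>(roots);
        while (pending.Count > 0)
        {
            Node current = pending.Pop();
            if (current.Marked) continue;   // already visited: stop here, which also prevents looping
            current.Marked = true;
            foreach (Node child in current.References)
                pending.Push(child);
        }
    }

    // Assumed heap layout: objects A..H, where D refers to H.
    public static Node[] BuildFigureHeap()
    {
        string[] names = { "A", "B", "C", "D", "E", "F", "G", "H" };
        var heap = new Node[names.Length];
        for (int i = 0; i < names.Length; i++)
            heap[i] = new Node(names[i]);
        heap[3].References.Add(heap[7]);    // D -> H
        return heap;
    }

    public static string MarkedNames(Node[] heap)
    {
        var marked = new List<string>();
        foreach (Node n in heap)
            if (n.Marked) marked.Add(n.Name);
        return string.Join(" ", marked);
    }

    public static void Main()
    {
        Node[] heap = BuildFigureHeap();
        Node[] roots = { heap[0], heap[2], heap[3], heap[5] };  // A, C, D, F
        Mark(roots);
        Console.WriteLine(MarkedNames(heap));   // A C D F H
    }
}
```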

GC Algorithm
2. The Compact Phase
Once all of the roots have been checked, the heap contains a set of marked and unmarked objects. The marked objects are reachable via the application's code, and the unmarked objects are unreachable.

The GC traverses the heap linearly, looking for contiguous blocks of unmarked (garbage) objects. If only small blocks are found, the GC leaves them alone. For large blocks of free contiguous memory, the GC shifts objects around to create a single contiguous free block. Moving the objects in memory invalidates the references to them, so the GC revisits all the references to surviving objects (including the roots) and updates them to point to the new object locations.

Can we decide when garbage collection should occur? Yes: the GC.Collect() method forces a garbage collection to occur.
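A minimal sketch of forcing a collection. GC.Collect(), GC.CollectionCount(int), and GC.GetTotalMemory(bool) are real System.GC members; the ForcedCollection wrapper is a hypothetical name. The byte count printed varies from run to run, so no output is shown for it.

```csharp
using System;

public static class ForcedCollection
{
    // Forces a collection of all generations and returns how many
    // generation 0 collections were recorded as a result.
    public static int CollectAndCount()
    {
        int before = GC.CollectionCount(0);
        GC.Collect();
        return GC.CollectionCount(0) - before;
    }

    public static void Main()
    {
        Console.WriteLine("Collections triggered: " + CollectAndCount());
        Console.WriteLine("Bytes currently allocated: " + GC.GetTotalMemory(false));
    }
}
```

Forcing collections is rarely needed in application code, because the collector already runs when an allocation cannot be satisfied.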

Disadvantage of GC
Performance hit. However, the advantages of the GC outweigh this disadvantage.

Once all the roots have been checked, the garbage collector's graph contains the set of all objects that are somehow reachable from the application's roots; any objects that are not in the graph are not accessible by the application, and are therefore considered garbage. The GC walks through the heap linearly, looking for contiguous blocks of garbage objects. It then shifts the non-garbage objects down in memory removing all of the gaps in the heap. Also, it modifies the roots to point to the new locations of the objects. The NextObjPtr is positioned just after the last non-garbage object. At this point, the new operation is tried again and the resource requested by the application is successfully created.
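The compaction walk can be sketched on a simulated heap. This is an illustrative model only: HeapObject, its Address and Size fields, and the sample layout are hypothetical. Since the simulated roots would hold the HeapObject instances themselves, updating Address plays the role of fixing up references.

```csharp
using System;
using System.Collections.Generic;

public class HeapObject
{
    public string Name;
    public int Address;   // simulated start address
    public int Size;      // simulated size in bytes
    public bool Marked;   // set by a previous marking phase

    public HeapObject(string name, int address, int size, bool marked)
    {
        Name = name; Address = address; Size = size; Marked = marked;
    }
}

public static class CompactPhase
{
    // Slides marked objects down toward address 0, drops unmarked ones,
    // and returns the new NextObjPtr (just after the last survivor).
    public static int Compact(List<HeapObject> heap)
    {
        int nextObjPtr = 0;
        var survivors = new List<HeapObject>();
        foreach (HeapObject obj in heap)      // linear walk in address order
        {
            if (!obj.Marked) continue;        // garbage: its space is reused
            obj.Address = nextObjPtr;         // relocate the survivor
            nextObjPtr += obj.Size;
            survivors.Add(obj);
        }
        heap.Clear();
        heap.AddRange(survivors);
        return nextObjPtr;
    }

    public static void Main()
    {
        var heap = new List<HeapObject>
        {
            new HeapObject("A", 0, 10, true),
            new HeapObject("B", 10, 20, false),   // garbage
            new HeapObject("C", 30, 10, true),
        };
        int nextObjPtr = Compact(heap);
        Console.WriteLine(heap[1].Name + " now at " + heap[1].Address); // C now at 10
        Console.WriteLine("NextObjPtr = " + nextObjPtr);                // NextObjPtr = 20
    }
}
```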

Finalization
Finalization is a mechanism offered by the CLR that allows an object to perform cleanup before being garbage collected, e.g. closing a file, releasing a native resource, or closing a database connection. A type that wishes to use finalization overrides the Finalize method (defined in System.Object) and places the cleanup code inside it. The GC calls the Finalize method on the object before releasing its memory.

In C#, the Finalize method is written using a special syntax:
~<Name of type>() { // Cleanup code goes here }
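A minimal sketch of the finalizer syntax. TempResource and its FinalizedCount counter are hypothetical and exist only to make the finalizer observable. When finalizers actually run is up to the runtime, so the printed count is not guaranteed and no expected output is shown.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading;

// Hypothetical resource wrapper for illustration.
public class TempResource
{
    public static int FinalizedCount;   // incremented each time a finalizer runs

    // Finalizer, written with the special ~TypeName() syntax.
    ~TempResource()
    {
        // Cleanup code goes here
        Interlocked.Increment(ref FinalizedCount);
    }
}

public static class FinalizerDemo
{
    // Kept out of line so the temporary object is unreachable once this returns.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static void CreateGarbage()
    {
        new TempResource();
    }

    public static void Run()
    {
        CreateGarbage();
        GC.Collect();                    // garbage with a finalizer is queued for finalization
        GC.WaitForPendingFinalizers();   // block until the finalizer thread has drained its queue
    }

    public static void Main()
    {
        Run();
        Console.WriteLine("Finalizers run so far: " + TempResource.FinalizedCount);
    }
}
```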

Disadvantages
- Finalizable objects take longer to allocate, because pointers to them must be placed on the finalization list.
- Finalizable objects are not immediately garbage collected and get promoted to older generations, which adds to memory pressure.
- Finalizable objects cause your application to run slower, since extra processing must occur for each object before it's collected.

Finalization Process
In order to implement finalization, the runtime implements two data structures, finalization list and freachable queue. Both these data structures are internally managed by the runtime and are transparent to the application developer.

Finalization List
An internal data structure managed by the garbage collector. Each entry in the finalization list points to an object that should have its Finalize method called before the object's memory can be reclaimed. That is, when a new object is created, if the type of the object defines a Finalize method, a pointer to the newly created object is placed in the finalization list.

Freachable queue
The freachable (F-reachable) queue is another of the garbage collector's internal data structures. Each pointer in the freachable queue identifies an object that is ready to have its Finalize method called.

Freachable queue
The "f" stands for finalization, and "reachable" means that the objects in the queue are actually reachable. Since objects in the freachable queue are reachable, they cannot be garbage collected. Also, any objects referred to by objects in the queue cannot be garbage collected. These objects move from a non-reachable state to a reachable state and hence are not reclaimed by the first collection.

After the GC finishes, the finalizer thread removes each object from the queue and calls its Finalize method. The next time a garbage collection occurs, these objects are truly considered garbage and their memory is reclaimed. In short, it takes at least two garbage collections to collect a finalizable object. In practice it may take more than two, because the object may be promoted to the next generation of objects. (Generations are covered shortly.)

Finalization Process
When a garbage collection occurs, for all the objects that are determined to be garbage, the GC scans the finalization list, looking for pointers to these objects. When such a pointer is found, it is removed from the finalization list and appended to the freachable queue. A special high-priority thread is dedicated to calling the Finalize method for each object in the freachable queue. This thread removes each entry from the queue and then calls the object's Finalize method. Once the freachable queue is empty, this thread sleeps. It is woken again when there are items in the queue.

Generations
The CLR GC is a generational (ephemeral) garbage collector. It makes the following assumptions:
- The newer an object is, the shorter its lifetime will be.
- The older an object is, the longer its lifetime will be.
- Collecting a portion of the heap is faster than collecting the whole heap.

Finalization Pictorial representation


Some of the objects on the heap are reachable from the application's roots, and some are not. Objects C, E, F, I and J have Finalize methods, so pointers to them are placed in the finalization list.

Finalization Pictorial representation


When a GC occurs, objects B, E, G, H, I, and J are determined to be garbage. The garbage collector scans the finalization list looking for pointers to these objects. When a pointer is found, it is removed from the finalization list and appended to the freachable queue.

Memory occupied by B, G, and H has been reclaimed because they didn't have a Finalize method. Memory occupied by E, I, and J couldn't be reclaimed because their Finalize method has not been called yet.
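The bookkeeping in this example can be sketched with two plain collections. This is a simulation only: FinalizationModel and the use of string names for objects are hypothetical, and the runtime's real data structures are internal.

```csharp
using System;
using System.Collections.Generic;

public static class FinalizationModel
{
    // For each garbage object: if it is on the finalization list, move it to
    // the freachable queue; otherwise its memory can be reclaimed right away.
    // Returns the names of the objects reclaimed by this collection.
    public static List<string> Collect(
        IEnumerable<string> garbage,
        List<string> finalizationList,
        Queue<string> freachable)
    {
        var reclaimed = new List<string>();
        foreach (string obj in garbage)
        {
            if (finalizationList.Remove(obj))
                freachable.Enqueue(obj);   // kept alive until Finalize has run
            else
                reclaimed.Add(obj);        // no Finalize method: reclaim now
        }
        return reclaimed;
    }

    public static void Main()
    {
        var finalizationList = new List<string> { "C", "E", "F", "I", "J" };
        var freachable = new Queue<string>();
        string[] garbage = { "B", "E", "G", "H", "I", "J" };

        List<string> reclaimed = Collect(garbage, finalizationList, freachable);

        Console.WriteLine("Reclaimed: " + string.Join(" ", reclaimed));                 // Reclaimed: B G H
        Console.WriteLine("Freachable: " + string.Join(" ", freachable));               // Freachable: E I J
        Console.WriteLine("Finalization list: " + string.Join(" ", finalizationList));  // Finalization list: C F
    }
}
```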

Finalization Pictorial representation


The next time the GC is invoked, it sees that the finalized objects are truly garbage, since neither the application's roots nor the freachable queue point to them any longer. The memory for these objects is reclaimed.

Important: two GCs are required to reclaim memory used by objects that require finalization. More than two collections may be necessary, since the objects could get promoted to an older generation.

Generations: How they work


When initialized, the managed heap contains no objects. Objects added to the heap are said to be in Generation 0. In summary, Generation 0 contains newly created objects that the GC has never examined.

On initialization, the CLR sets a limit on the size of Generation 0. If, while trying to allocate a new object, the CLR detects that Generation 0 has reached its limit, it triggers a garbage collection. Any objects that survive this collection are said to be in Generation 1. At the end of this collection, Generation 0 is empty. Like Generation 0, Generation 1 also has a limit (which is higher than that of Generation 0).

The CLR now continues executing normally, triggering a GC whenever Generation 0 becomes full. Eventually (if the application runs long enough), both Generation 0 and Generation 1 will be full. In this case, the GC also examines Generation 1 objects to determine which are garbage. Any objects determined to be garbage are reclaimed. Any objects that survive are promoted to the next generation: Generation 1 survivors move to Generation 2, and Generation 0 survivors move to Generation 1. Generation 2 is handled in a similar way when it fills up.
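Promotion can be observed with GC.GetGeneration and GC.MaxGeneration, both real System.GC members. GenerationDemo and Observe are hypothetical names. A freshly allocated small object starts in Generation 0; a typical run then shows it moving up one generation per forced collection, but the exact sequence depends on the runtime, so no output is claimed.

```csharp
using System;

public static class GenerationDemo
{
    // Records the generation of obj before and after each forced collection.
    public static int[] Observe(object obj, int collections)
    {
        var generations = new int[collections + 1];
        generations[0] = GC.GetGeneration(obj);
        for (int i = 1; i <= collections; i++)
        {
            GC.Collect();                             // obj survives because the caller still references it
            generations[i] = GC.GetGeneration(obj);
        }
        return generations;
    }

    public static void Main()
    {
        var survivor = new object();
        Console.WriteLine("Highest generation: " + GC.MaxGeneration);
        Console.WriteLine("Generations seen: " + string.Join(" -> ", Observe(survivor, 2)));
        GC.KeepAlive(survivor);                       // keep the object rooted until this point
    }
}
```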

Generations: Advantages
Because objects are divided into generations, the GC doesn't have to examine each and every object in the heap every time. This speeds up garbage collection. If all the objects in Generation 0 are determined to be garbage, then all the CLR has to do is reset NextObjPtr to the start of Generation 0 (the beginning of the managed heap when no older generations exist).

The CLR garbage collector is a self-tuning collector. It modifies the limits for each generation based on observations from previous collections, e.g. if it notices that very few objects survive Generation 0, it may lower the limit for Generation 0.

Question: If the GC is so great, why isn't it implemented in C++? C++ allows casting a pointer from one type to another, so there's no way to know what a pointer refers to. In the CLR, the managed heap always knows the actual type of an object, and the metadata information is used to determine which members of an object refer to other objects.
