
Memory Management

Every Windows administrator has to field user complaints about client performance. Client-system performance can be affected by factors such as memory, CPU, disk and network. Of these, the most confusing is memory management, which admins need to understand to make informed decisions and troubleshoot effectively.

Users typically equate adding memory to resolving performance bottlenecks, and it's
relatively cheap and easy to add memory. But does adding memory really improve
performance? This article boils down memory management to simple terms. Part 2 in this
series will tie these concepts to the information shown in Windows 7 Task Manager
features and Resource Monitor and identify common memory-related issues.

Figure 1

Figure 1 shows the memory component of the Windows XP and Windows 7 Task Manager.
Note that there are fundamental differences between Windows XP, Vista and Windows 7
Task Manager versions.

It's important to know the difference between physical and virtual memory. Physical memory is the amount of physical RAM available in the computer. It can be visualized as the table shown in Figure 2; each cell in the table is a unique "address" where data is stored.
Figure 2

Virtual memory essentially allows each process -- applications, dynamic link libraries (DLLs),
etc. -- to operate in a protected environment where it thinks it has its own private address
space. Figure 1 shows the virtual memory table for a process on a computer with 2 GB of
RAM. The CPU translates or maps the virtual addresses into physical addresses in RAM using
page table entries (PTEs).

Virtual memory limits

The virtual address space for the 32-bit architecture is limited to 4 GB (2^32 bytes), regardless of the amount of RAM in the computer. Windows divides this into two sections, as shown in Figure 2: user space and kernel space. The addresses in the kernel space are reserved for system processes; only those in the user space are accessible to applications. So, each application has a virtual memory limit of 2 GB, again regardless of physical RAM. That means no process can ever address more than 2 GB of virtual address space by default. Exceeding this limit produces an "out of virtual memory" error, which can occur even when plenty of physical memory is available.
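You can observe this limit directly: the documented Win32 call GlobalMemoryStatusEx reports both the installed physical RAM and the virtual address space available to the calling process. A minimal sketch, assuming nothing beyond the standard Win32 headers:

    #include <windows.h>
    #include <stdio.h>

    int main(void)
    {
        MEMORYSTATUSEX ms = { 0 };
        ms.dwLength = sizeof(ms);

        if (GlobalMemoryStatusEx(&ms)) {
            /* Installed physical RAM vs. the virtual address space this
               process can use. On 32-bit Windows, ullTotalVirtual reports
               roughly 2 GB no matter how much RAM is present. */
            printf("Physical RAM:          %llu MB\n",
                   (unsigned long long)(ms.ullTotalPhys / (1024 * 1024)));
            printf("Process virtual space: %llu MB\n",
                   (unsigned long long)(ms.ullTotalVirtual / (1024 * 1024)));
        }
        return 0;
    }

Run on a 32-bit system, the second line prints about 2,048 MB regardless of installed RAM, which is exactly the per-process limit described above.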

In a simple example, consider having several applications open at the same time on a 32-bit
platform. Each application or process has 2 GB of virtual address space and has no idea that
it is sharing physical RAM with other processes. Memory management will then use PTEs to
translate the virtual address space for each process to physical RAM.
Note that, as shown in Figure 2, the use of virtual memory allows the three applications,
each with 2 GB of virtual address space, to share the 2 GB RAM in the computer. This is
accomplished by paging infrequently used data to disk, then paging it back to RAM when
needed.

Figure 3

Processes run faster if their data resides in memory, as opposed to requiring the memory manager to page it in from disk. Thus, more memory in the system allows more processes to reside in memory and reduces paging from disk.

However, since x86 processors can address only a 4 GB address space, if the computer has 12 GB of RAM you may wonder whether 8 GB of RAM is wasted. The answer is the Physical Address Extension (PAE). This is a processor feature -- supported by Intel and AMD -- that extends the physical address space to about 64 GB. It requires a PAE-capable chipset and applications written to take advantage of PAE, which is enabled by default on recent versions of Windows.

We can also steal 1 GB from kernel space and add it to user space by using the /3GB switch in Boot.ini (or, on Vista and later, the increaseuserva setting in the Boot Configuration Data). This is commonly done for server applications such as 32-bit Microsoft Exchange Server, which uses almost 3 GB of virtual address space. It is potentially dangerous, though, because we're reducing the memory that the kernel may need.
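For reference, the boot settings mentioned above look roughly like this (the Boot.ini ARC path is illustrative; adjust it for your system):

    ; Boot.ini (Windows XP/2003) -- switches appended to the OS entry:
    multi(0)disk(0)rdisk(0)partition(1)\WINDOWS="Windows" /3GB /PAE

    rem Boot Configuration Data (Vista and later) -- the equivalents:
    bcdedit /set increaseuserva 3072
    bcdedit /set pae ForceEnable

The increaseuserva value is the user-space size in megabytes, so 3072 reproduces the /3GB split.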
Figure 4

Figure 1 conceptually shows how multiple processes map their virtual address space to physical RAM. Note that in order to do this above 4 GB, you must use PAE, as shown in Figure 3. The 64-bit architecture (x64) permits 2^64 addresses, or about 16 EB of address space. Because of limitations in the first x64 processors, however, Windows' implementation is limited to 16 TB -- 8 TB in user space and 8 TB in kernel space, as shown in Figure 4.

This makes a huge difference in memory addressing. Rather than having to use the PAE and /3GB options and another layer of complexity to address more memory, x64 addresses it in a flat model. This allows up to 1 TB of RAM to be addressed and more processes to run in memory without paging, significantly increasing performance for memory-intensive processes.

Making reservations

When a process starts up, it reserves a certain amount of memory -- as dictated by the
developer in the code -- but it requires few physical resources. This is analogous to making a
hotel reservation. Making the reservation just takes some time to enter it in the system and
talk to the customer. It does not block off the room until you arrive.


Committed memory occurs when the process requests the memory manager to back the
reservation with memory (RAM and page file). In the hotel example, this would be checking
into the hotel.

Figure 5

You have now blocked off a room, but you haven't used any resources -- no power, no water,
etc. Similarly, at this point, nothing has been written to memory. When the memory is
written to, virtual memory pages will then be mapped into RAM, and memory will be
consumed. In the hotel, this is where you enter the room and consume water, electricity,
housekeeping services, etc. Each process performs these actions as they run.
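The reserve/commit/write sequence maps directly onto the Win32 VirtualAlloc API. A minimal sketch of the hotel analogy in code (the sizes are arbitrary, chosen for this example):

    #include <windows.h>
    #include <string.h>

    int main(void)
    {
        /* Reserve: make the reservation. 64 MB of address space is
           booked, but no RAM or page file space is consumed yet. */
        SIZE_T size = 64 * 1024 * 1024;
        char *p = VirtualAlloc(NULL, size, MEM_RESERVE, PAGE_NOACCESS);
        if (!p) return 1;

        /* Commit: check in. The first 1 MB now counts against the
           system commit charge (backed by RAM and the page file). */
        if (!VirtualAlloc(p, 1024 * 1024, MEM_COMMIT, PAGE_READWRITE))
            return 1;

        /* Write: consume resources. Touching the pages is what finally
           brings them into the process's working set. */
        memset(p, 0xAB, 1024 * 1024);

        VirtualFree(p, 0, MEM_RELEASE);
        return 0;
    }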
Committing to memory

The sum of all the memory used by active processes in the system is referred to as
the system commit charge and is displayed in Task Manager along with the system commit
limit. In Figure 5, the Windows 7 Task Manager shows the committed memory values in the
"Commit (MB)" line.

Figure 6

The first number (3,528 in the figure) is the commit charge, or currently committed memory, while the second number (7,990) is the system commit limit (RAM + page file). In the Physical Memory section, the total RAM is 3,996 MB (4 GB), so the page file is about 4 GB. XP also displayed the Commit Peak -- the peak amount committed in a given time period -- to aid in planning.

Figure 6 demonstrates the memory management process described above. Here we see the
8 GB virtual address space (4 GB RAM, 4 GB page file) for a process that has reserved 2 GB
and committed 500 MB but has only written 50 MB to memory. The PTEs map this virtual
address space to physical space. These pages show up in the Working Set and may or may
not be mapped contiguously in physical RAM.

Figure 7 shows a graphical representation of the Windows page frames, which will help
explain how this works. On system boot, all memory resides in the Free Page List. The zero
page thread places pages in the Zero Page List.
Figure 7

When a process starts, it takes only the pages it needs from the Zero Page List and creates a
working set. Each process has its own working set, and pages are added as needed. Note
that a program such as a DLL or executable file does not completely "load" into memory. Only the portion of the program being used is loaded, with the remainder staying on disk. So if only 50 KB of a 500 KB .exe is required, only 50 KB is loaded into memory.

As demand grows, additional pages are added to the working set. At some point, Windows Memory Management (WMM) decides that a process's working set is large enough and starts trimming the pages that have been idle the longest. Rather than clearing them and then having to create and write them again when needed, it moves them to the standby list or the modified page list.

Pages that are modified -- such as text entered in a Word document -- will be moved to the
modified page list. Periodically, the modified page writer takes a bunch of pages from the
modified page list, writes the data to disk and then sends the pages to the standby list.
Pages on the standby list can be reused because the WMM knows where the data is (on disk
in the page file).

Pages that are not modified are sent directly to the standby list, since they are already on disk. The advantage of the standby list is that if a page is required again -- such as when you start working on a document that has been idle for a long time -- it can easily be retrieved from the modified page list or the standby list and written back into the working set very quickly. Note that before a modified page can be reused for another purpose, it must first be written to disk and moved to the standby list.

Figure 8

Windows will only load shareable data such as programs or DLLs into memory once. For
instance, I opened Word three times to edit three documents. Figure 8 shows the Task
Manager view, listing Winword.exe once. If the working set becomes large enough or if I
leave the Word docs idle long enough, the pages holding Winword.exe will be moved to the
standby list and may be reused.

The pages holding my documents were modified, so they would go to the modified list and
eventually be written to the page file (on disk) and then to the standby list, unless the
document was saved (in which case, it is on disk). Thus, the standby list is a file cache
because it is a copy of something already on disk.


Page faulting

The process of paging between the process working sets and the modified and standby lists is called "soft page faulting" and is fairly inexpensive because it happens entirely in physical memory. A fault that requires disk I/O -- reading the page back in from the page file or a mapped file -- is called "hard page faulting" and is far more expensive in performance terms because it goes to disk.

When a program terminates or a process willingly releases private memory, its working set pages are released back to the Free Page List; the zero page thread then zeroes them and moves them to the Zero Page List. The purpose of zeroing pages is security: it ensures that no process can reuse the pages and expose sensitive data such as a password.

Once you understand memory management, you can diagnose memory-related performance problems. Part 2 of this series will explain how to interpret the information provided in Windows 7 Task Manager features and Resource Monitor, along with several examples. Remember, though, just adding memory won't necessarily solve a problem.

Thanks to Clint Huffman at Microsoft for his technical contribution to this article and content
for Figure 6. Figure 7 was created using concepts and ideas from Clint Huffman
and Windows Internals 5th Edition by David A. Solomon and Mark Russinovich.

ABOUT THE AUTHOR:


Gary Olsen is a solution architect in Hewlett-Packard's Technology Services organization and
lives in Roswell, Ga. He has worked in the IT industry since 1981 and holds an M.S. in
computer-aided manufacturing from Brigham Young University. Olsen has authored
numerous technical articles for TechTarget, Redmond Magazine and TechNet magazine, and
he has presented numerous times at the HP Technology Forum. He is a Microsoft MVP for
Directory Services and is the founder and president of the Atlanta Active Directory Users
Group.
Windows Memory Management

Introduction
What is Windows Memory Management? - Overview

Microsoft has, as of Vista SP1 and Windows Server 2008, implemented new technologies for both resource allocation and security. These include dynamic allocation of the kernel virtual address space (including the paged and nonpaged pools), kernel-mode stack jumping, and Address Space Layout Randomization (ASLR). Basically, the allocation of resources is no longer fixed but is dynamically adjusted according to operational requirements. Technologies such as ASLR were implemented mostly in response to the threat posed by attackers' advance knowledge of the location of key system components (such as kernel32.dll, ntdll.dll, etc.), and partly in pursuit of Windows' goal of using memory more efficiently by allocating it on an as-needed basis. To better understand these new technologies and be able to use them as a developer, device driver writer, or systems administrator, this paper will focus on the Windows Memory Manager prior to Vista SP1.

How Does the Windows Memory Manager Work?

The purpose of this paper is therefore to give a conceptual understanding to those who have struggled with memory management as a whole, and to explain why these newer technologies have evolved. It starts with a general view of the Windows Memory Manager and then gets more specific about how Windows manages used and unused memory. To illustrate how memory works, tools from the Sysinternals site on TechNet are used to demonstrate memory leaks. The paper concludes with a brief description of the paging lists.

The OS Maps Virtual Addresses to Physical Addresses.

Because the virtual address space might be larger or smaller than the physical memory on the machine, the Windows Memory Manager has two primary responsibilities. The first is to translate, or map, a process's virtual address space into physical memory so that when a thread running in the context of that process reads or writes the virtual address space, the correct physical address is referenced.

The second is to page some of the contents of memory to disk when memory becomes overcommitted -- that is, when running threads or system code try to use more physical memory than is currently available -- and to bring the contents back into physical memory as needed.

One vital service provided by the Memory Manager is memory-mapped files. Memory mapping can speed up sequential file processing because the data is not sought randomly, and it provides a mechanism for memory sharing between processes (for example, when they reference the same DLL, only one copy of the DLL's code needs to be resident in memory at a time).

Most virtual pages will not be in physical memory (RAM), so the OS responds to page faults (references to pages not in memory) and loads the data from disk, either from the system paging file or from a normal file. Page faults, while transparent to the programmer, have an important impact on performance, and programs should be designed to minimize them. The wizards at Sysinternals contend that the concern is not one process hard page faulting, but rather a collection of processes hard page faulting. Such collective hard page faulting causes the system to thrash and is a clear indication that the system needs more memory.

Dynamic memory allocated from heaps is backed by the paging file. The OS's memory management controls page movement between physical memory and the paging file and also maps the process's virtual addresses to the paging file. When the process terminates, the physical space in the file is deallocated.

Windows provides the illusion of a flat 4 GB virtual address space when, in reality, there is a much smaller amount of physical memory. The hardware memory management unit of today's microprocessors provides a way for the OS to map virtual addresses to physical addresses, and it does this at the granularity of a page. The Windows Memory Manager implements a demand-paged virtual memory subsystem, which is another way of saying that it is a lazy allocator. In other words, if you launch an application such as Notepad, Windows does not load the entire application and its DLLs into physical memory. It loads them as the application demands: as Notepad touches code pages and data pages, it is at that point that the memory manager makes the connection between virtual memory and physical memory, reading in contents from disk as needed. In short, it is a common misconception that the memory manager reads the entire executable image off the disk. This can be illustrated using Process Monitor with a filter set to an application that has not run since the last reboot -- say, Solitaire. As Solitaire starts up, it causes page faults, reading pieces of its own executable off the disk on demand. When you stop logging the trace and examine it, you will see the process, sol.exe, reading sol.exe: it is reading itself, faulting pieces of itself in from disk.

As features of Solitaire are used, you will see sol.exe reading various DLLs as those DLLs are virtually loaded -- only the pieces being read are loaded. Another component of the Windows Memory Manager is memory sharing. For instance, if you have two instances of Notepad, the common misconception is that there are two copies of Notepad and its associated DLLs loaded into physical memory. The Windows Memory Manager recognizes that the second instance of Notepad is an image that already has pieces of itself in physical memory and automatically connects the two virtual images to the same underlying physical pages. This speeds up process startup, and applications can take advantage of it to share memory.
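The sharing mechanism is also exposed to applications through named sections. A hedged sketch: two processes running this code map the same pagefile-backed section and therefore the same physical pages, much as two instances of Notepad share one copy of the executable's code (the section name is arbitrary, chosen for this example):

    #include <windows.h>
    #include <string.h>

    int main(void)
    {
        HANDLE h = CreateFileMappingW(INVALID_HANDLE_VALUE, NULL,
                                      PAGE_READWRITE, 0, 4096,
                                      L"Local\\DemoSharedSection");
        if (!h) return 1;

        char *view = MapViewOfFile(h, FILE_MAP_ALL_ACCESS, 0, 0, 4096);
        if (view) {
            /* Every process that maps this section sees this write. */
            strcpy(view, "shared between processes");
            UnmapViewOfFile(view);
        }
        CloseHandle(h);
        return 0;
    }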
On 32-bit Windows, the split is 2 GB for each process (user space) and 2 GB for the system. Just as applications need virtual memory to store code and data, the operating system also needs virtual memory to map itself, the device drivers that are configured to load, and the data maintained by the drivers and the OS (the kernel memory heaps).

Tools that indicate memory usage often show virtual memory, physical memory, and the working set. The virtual memory counter does not offer much help when troubleshooting memory leaks: virtual memory is used to map the code and data of an application plus an amount kept on reserve, and most virtual pages will not be in physical memory anyway, with the OS responding to page faults by loading data from disk. The private bytes counter, by contrast, indicates the number of bytes of memory that are private to a process -- memory that cannot be shared with another process. For instance, if you launch Notepad and start typing in text, no other process is interested in that data, so it is private to that process.
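Private bytes can also be read programmatically. A small sketch using the documented GetProcessMemoryInfo call from psapi (link with psapi.lib); PrivateUsage corresponds to the private bytes counter discussed above:

    #include <windows.h>
    #include <psapi.h>
    #include <stdio.h>

    int main(void)
    {
        PROCESS_MEMORY_COUNTERS_EX pmc = { 0 };
        pmc.cb = sizeof(pmc);

        if (GetProcessMemoryInfo(GetCurrentProcess(),
                                 (PROCESS_MEMORY_COUNTERS *)&pmc,
                                 sizeof(pmc))) {
            /* Memory no other process can share vs. resident pages. */
            printf("Private bytes: %llu KB\n",
                   (unsigned long long)(pmc.PrivateUsage / 1024));
            printf("Working set:   %llu KB\n",
                   (unsigned long long)(pmc.WorkingSetSize / 1024));
        }
        return 0;
    }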

What About Memory Leaks?

How do we determine whether we have a memory leak, and if so, how do we further determine whether a process is leaking the memory or whether the leak is in kernel mode? In Task Manager, there is a memusage counter that is often used to trace the source of a leak. But the memusage counter does not actually indicate the private virtual memory of a process. What Task Manager labels the virtual memory size is actually the private bytes counter, which misleads many into assuming that it indicates the amount of virtual address space allocated. For reasons such as this, it is better to gather data using Process Explorer, a freeware utility written by Mark Russinovich. This tool uses a device driver to extract all of the relevant system information applicable to the version of Windows you are running, and contains a colorful, itemized display of processes, thread activity, CPU usage (perhaps by running threads, not otherwise accounted for, that are consuming CPU cycles), and the other counters available in the actual Windows Performance Monitor. Three columns needed in this context -- private bytes, private bytes delta, and private bytes history -- can be found in the "Select Columns" choice of the View menu. Process Explorer shows a "process tree" view that reveals which processes are running as child processes under the control of a parent process. The differences are reflected in the colors used in the user interface. Pink indicates service host processes (svchost.exe) that run under the Services.exe process. Light blue shows processes running under the same account as the user, as opposed to a SYSTEM or NETWORK account. Brown shows processes that are in jobs, which are simply collections of processes. These counters can be dragged and dropped to show the private bytes column next to the private bytes delta (where a negative number means a process is releasing memory) and the private bytes history.

If there is a process leak, the Task Manager memusage counter will not reveal it. The private bytes, private bytes delta, and private bytes history counters can be used to examine private virtual memory and determine whether a process is leaking memory. A case in point: a process can be using an enormous amount of virtual memory while most of it is not actually in use, merely kept on reserve. The private bytes history column, by contrast, shows a relative comparison of private bytes usage in a process with respect to all other processes running in the system. To examine this, download the Sysinternals tool TestLimit.exe (or TestLimit64.exe if you are running a 64-bit system). The -m switch on this tool leaks the specified number of megabytes of private bytes every half second; that is, if you type c:\windows\system32> testlimit -m 5 you are leaking 10 MB of private bytes per second. With Process Explorer open and the private bytes, private bytes delta, and private bytes history (a weighted graph indicated by the width of the yellow band) columns in view, you will see growth in the testlimit.exe process, departing from a flat yellow line toward a thick yellow band in the private bytes history column. The private bytes delta column will show only positive numbers, never a negative sign, indicating that the process is not releasing any memory. Press Ctrl+C to stop the testlimit program and the memory is returned to the machine.
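If you cannot download TestLimit, the same behavior is easy to reproduce. A deliberate leak in the spirit of testlimit -m 5 (a sketch, not the Sysinternals tool itself): commit 5 MB of private memory every half second and never free it, then watch the private bytes columns climb in Process Explorer:

    #include <windows.h>
    #include <string.h>
    #include <stdio.h>

    int main(void)
    {
        for (;;) {
            void *p = VirtualAlloc(NULL, 5 * 1024 * 1024,
                                   MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE);
            if (!p) {
                puts("Allocation failed -- commit limit reached?");
                break;
            }
            memset(p, 1, 5 * 1024 * 1024);  /* touch so the pages are used */
            Sleep(500);                     /* 5 MB per half second */
        }
        return 0;
    }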

What would happen if we did not press Ctrl+C to terminate the process? Would it exhaust its allocated virtual address space? It would, in fact, be stopped sooner than that by reaching an important private-bytes limit called the "commit limit." The system commit limit is the total amount of private virtual memory, across all of the processes in the system plus the operating system itself, that the system can keep track of at any one time. It is a function of two sizes: the page file size(s) (you can have more than one) plus (most of) physical memory.
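Both numbers are visible programmatically via GetPerformanceInfo from psapi (link with psapi.lib). The counters are reported in pages, so multiply by the page size; this sketch prints the commit charge and commit limit described above:

    #include <windows.h>
    #include <psapi.h>
    #include <stdio.h>

    int main(void)
    {
        PERFORMANCE_INFORMATION pi = { 0 };
        pi.cb = sizeof(pi);

        if (GetPerformanceInfo(&pi, sizeof(pi))) {
            /* CommitLimit is roughly page file size(s) + most of RAM. */
            printf("Commit charge: %llu MB\n",
                   (unsigned long long)(pi.CommitTotal * pi.PageSize
                                        / (1024 * 1024)));
            printf("Commit limit:  %llu MB\n",
                   (unsigned long long)(pi.CommitLimit * pi.PageSize
                                        / (1024 * 1024)));
        }
        return 0;
    }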

The amount of physical memory that the operating system assigns to each process is called its working set. Every process starts out with an empty, zero-sized working set. As the threads of the process begin to touch virtual memory addresses, the working set begins to grow. When the operating system boots, it has to decide how much physical memory to give each process, as well as how much physical memory it needs to keep for itself, both to store cached data and to keep free. So the sizes of the working sets of individual processes are ultimately determined by the Windows Memory Manager, which monitors the behavior of each process and sets the amount of physical memory based on its memory demands and paging rates. In effect, the Windows Memory Manager decides whether each working set should grow or shrink, while trying to satisfy all of the processes' demands as well as those of the operating system itself.

The above might suggest that application launching is a time-consuming operation. As of Windows XP, Windows includes a mechanism to speed up application launching called the logical prefetcher. Windows monitors the page faults during application start-up (recall that during start-up, a process faults pieces of its own executable in from disk on demand), defining start-up as the first ten seconds of an application's activity. It saves a record of this information in a Prefetch folder that resides in the Windows directory. Deleting these files would only harm system performance, because the .pf files were written by a system process from data extracted from the kernel. As for the working set, Task Manager shows it in the memusage counter, while Process Explorer shows the current and peak working set in separate counters. The peak working set is the most physical memory ever assigned to a process.
Windows automatically shares any memory that is shareable. This means code: pieces of any executable or DLL. Only one copy of an executable or a DLL is in memory at any one time. This also applies to a Terminal Server instance: if several users are logged on and using Outlook, one copy of Outlook (or the demanded pieces of it) is read off disk to be resident in memory. If one user starts using other features of Outlook, they are read in on demand; if a second user then uses those same features, they are already resident in memory. Apart from pieces of executables and DLLs, file data is also shared. To reiterate, the memusage and working set counters are not of help with memory leaks: the working set will grow, and then the Memory Manager will decide it has gotten too big and shrink it. Adding more memory is sometimes a quick fix. If one process is hard page faulting, that is not an indication that the system needs more memory, although there is a performance impact. If a collection of processes begins to hard page fault excessively, that is a clear indication that the system needs more memory.

Why the memusage and Working Set Columns Are Not Memory Leak Indicators

If the working set keeps growing, at a certain point the Windows Memory Manager will block that growth, because it has decided that this working set is too big and there are other consumers of physical memory. If, at that point, the process keeps leaking virtual memory without physically using any more of it, the Memory Manager begins to reuse the process's physical memory to store new data that the process references through the newly allocated virtual memory.

The working set grows as the threads touch virtual addresses and different pages are brought in; at some point the Memory Manager says enough to that process -- there are others that need memory just as much. So as the process requests another page, the Memory Manager takes a page away, naturally taking the oldest pages first: the pieces of the working set that have not been accessed for the longest time are pulled out. When those pages are pulled out, they are not overwritten, zeroed, or destroyed, because they still represent a copy of data that was recently used by the process. Instead, Windows keeps them on one of several paging lists.

To understand the performance counters well enough to determine whether your system needs more physical memory, it is necessary to delve into how Windows organizes the memory that is not currently owned by any process -- memory that is not in a working set. The Windows Memory Manager tracks this unassigned memory on one of four paging lists, with the unowned pages organized by type:

 Free page list
 Modified page list
 Standby page list
 Zero page list

It is easiest to start with the modified and standby page lists. When the Memory Manager pulls a page out of a process's working set, it is pulling out a page that the process may still need: it may be reused by that process, or (being on the standby or modified page list) it may hold code from an executable or DLL image and be reused by another process. The list the page goes to depends on whether the page has been modified. If the page has been written to, the Memory Manager has to ensure that it gets written back to the file it came from. That might be a file on disk, such as a data file mapped into the process's address space: if the process modifies that page and the page is removed from the process's working set, the Memory Manager has to make sure the page makes it back to that file on disk. If the page has been modified but does not represent data mapped in from a file in the virtual address space, it holds private data of the process, which the process might want to use again; such pages are backed by the paging file.

Pages that have not been modified go to the standby list. The modified page list is called the
"dirty" list and the standby page list is called the “clean" list. After pages have been written
to disk, those pages move from the modified list to the standby list.

Bringing pages on the modified or standby list back into a working set is called a soft fault -- not a paging file read or a mapped file read -- because no disk I/O occurs. If the data being referenced is no longer in memory, because it is back in the file on disk or in the paging file, the system incurs a hard fault and has to do a paging read operation to bring it back into memory.

The free page list doesn't exist when the system boots and grows only as private memory is returned to the system. Private memory is a piece of a process's address space, such as the buffer that contains the text you have typed into Notepad. For example, if you launch Notepad and start typing text, that data is not usable by any other process; the keystrokes are buffered, and no other process is interested in that data, saved or not. So when Notepad exits, whether you saved the data or not, the memory inside the Notepad process address space that held it is returned to the free page list. Private process memory is never reused without first being zeroed. The free page list is where the Memory Manager goes when it needs to perform a page read: because the resulting I/O will overwrite the contents of the page completely, a page that has not yet been zeroed is acceptable. So when the Memory Manager takes a page fault and needs a free page into which to read a piece of a file from disk, it goes to the free list first (if there is anything there).

When, however, the free page list reaches a certain size, a kernel thread called the zero page thread is awakened (it is the only thread in the system that runs at priority 0). Its job is to zero out those free pages so that when Windows needs zeroed pages, it has them at hand.

References
 The Sysinternals Video Library: Troubleshooting Memory Problems, by Mark Russinovich and David Solomon
 Windows Internals, 4th Edition, by Mark Russinovich and David Solomon
 Windows System Programming, 2nd Edition, by Johnson M. Hart

Source: https://www.codeproject.com/Articles/29449/Windows-Memory-Management
The Virtual-Memory Manager in
Windows NT
Randy Kath
Microsoft Developer Network Technology Group

Created: December 21, 1992

Abstract
This article provides an in-depth survey of the memory management system in Windows
NT™. Specifically, these topics are explored in detail:

 Virtual memory in Windows NT

 32-bit virtual addresses

 Page directory, page tables, and page frames

 Translating a virtual address

 Process integrity

 Reserved and committed memory

 Translation lookaside buffers

 Page-table entry structure

 Page faults

 Sharing pages across process boundaries

 Prototype page-table entries

 Copy-on-write page optimization

 Characteristics of the virtual-memory manager

 The page-frame database

 Managing a working set of pages for each process

This article does not discuss the Win32™ memory management application programming
interface (API). Instead, several other technical articles on the Microsoft Developer
Network CD should be referenced for issues related to understanding how to manage
memory with the Win32 API. Those articles provide both insight into the system and
understanding of the functions themselves. While this article primarily deals with
Windows NT-specific memory management issues, it does refer to some of the memory
objects in the Win32 subsystem (like memory-mapped files and dynamic heaps) in an
attempt to shed some light on the age-old dilemma of performance vs. resource usage
as it applies to applications written for the Win32 subsystem in Windows NT.
Introduction
As the size of applications and the operating systems that run them grow larger and
larger, so do their demands on memory. Consequently, all modern operating systems
provide a form of virtual memory to applications. Being the newest of the operating
systems to hit the mainstream, Windows NT™ will likely have applications ported to it
that will evolve into larger monstrosities that require even more memory than they did
on the last operating system on which they ran. Even applications being written
exclusively for Windows NT will be written with the future in mind and will no doubt take
advantage of all the memory that is available to them.

Fortunately, Windows NT does, in fact, offer virtual memory to its applications (or
processes) and subsystems. Windows NT provides a page-based virtual memory
management scheme that allows applications to realize a 32-bit linear address space for
4 gigabytes (GB) of memory. As a result, each application has its own private address
space from which it can use the lower 2 GB—the system reserves the upper 2 GB of
every process's address space for its own use.

Figure 1. A process in Windows NT has a 4-GB linear address space, of which the lower 2 GB is available for applications to use.

As illustrated in Figure 1, each process can address up to 4 GB of memory using 32-bit linear addresses. The upper half of the address space is reserved for use by the system.
Because the system is the same for each process, regardless of the subsystem it runs on,
similar pages of system memory are typically mapped to each process in the same
relative location for efficiency.

Note The Win32™ subsystem provides user services that are loaded as dynamic-link
libraries (DLLs) into the lower portion of the address space of a process. These DLLs
exist in addition to the system DLLs that occupy the upper portion of the address space.
Depending on which DLLs an application links or loads, the DLLs that are mapped into
the lower portion of a process's address space will vary from one application to the next
within a subsystem.
If only we had PCs with similar memory capacities. . . . Actually, a computer doesn't
really need 4 GB of physical memory for Windows NT to operate effectively—though the
general rule of virtual memory systems is the more physical memory, the better the
performance. Windows NT's memory management system virtualizes memory such that
to each application it appears as though there is 2 GB of memory available, regardless of
how much physical memory actually exists. In order to do this, Windows NT must
manage memory in the background without regard to the instantaneous requests that
each application makes. In fact, the memory manager in Windows NT is a completely
independent process consisting of several threads that constantly manage available
resources.

Windows version 3.x has realizable limitations to the maximum amount of memory
available to it and all of its applications; these are often barriers to large applications for
this environment. Windows NT's limits are far more theoretical. Windows NT employs the
PC's hard disk as the memory-backing store and, as such, has a practical limit imposed
only by available disk space. So, it is reasonable to assume that a Windows NT system
could have an extremely large hard disk or array of disks amounting to 2 GB or more of
physical memory and provide that much virtual memory to each of its applications
(minus the portions used by the system, occupied by the file system, and allocated by
files stored within the file system). In short, Windows NT provides a seemingly endless
supply of memory to all of the applications running on it.

Virtual Memory in Windows NT


The virtual-memory manager (VMM) in Windows NT is nothing like the memory managers used in previous versions of the Windows operating system. Relying on a 32-bit address model, Windows NT is able to drop the segmented architecture of previous versions of Windows. Instead, the VMM employs 32-bit virtual addresses for directly manipulating the entire 4-GB process address space. At first this appears to be a restriction because,
without segment selectors for relative addressing, there is no way to move a chunk of
memory without having to change the address that references it. In reality, the VMM is
able to do exactly that by implementing virtual addresses. Each application is able to
reference a physical chunk of memory, at a specific virtual address, throughout the life
of the application. The VMM takes care of whether the memory should be moved to a
new location or swapped to disk completely independently of the application, much like
updating a selector entry in the local descriptor table (LDT).

Windows versions 3.1 and earlier employed a scheme for moving segments of memory
to other locations in memory both to maximize the amount of available contiguous
memory and to place executable segments in the location where they could be executed.
An equivalent operation is unnecessary in Windows NT's virtual memory management
system for three reasons. One, code segments are no longer required to reside in the 0-
640K range of memory in order for Windows NT to execute them. Windows NT does
require that the hardware have at least a 32-bit address bus, so it is able to address all
of physical memory, regardless of location. Two, the VMM virtualizes the address space
such that two processes can use the same virtual address to refer to distinct locations in
physical memory. Virtual address locations are not a commodity, especially considering
that a process has 2 GB available for the application. So, each process may use any or
all of its virtual addresses without regard to other processes in the system. Three,
contiguous virtual memory in Windows NT can be allocated discontiguously in physical
memory. So, there is no need to move chunks to make room for a large allocation.

The foundation for the system provides the answer to how VMM is able to perform these
seemingly miraculous functions. VMM is constructed upon a page-based memory
management scheme that divides all of memory into equal chunks called pages. Each
page is 4096 bytes (4K) in size with no discrimination applied as to how a page is used.
Everything in Windows NT—code, data, resources, files, dynamic memory, and so forth—
is implemented using pages of physical memory.

Because everything in the system is realized via pages of physical memory, it is easy to
see that pages of memory become scarce rather quickly. VMM employs the use of the
hard disk to store unneeded pages of memory in one or more files called pagefiles.
Pagefiles represent pages of data that are not currently being used, but may be needed
spontaneously at any time. By swapping pages to and from pagefiles, the VMM is able to
make pages of memory available to applications on demand and provide much more
virtual memory than the available physical memory. Also, pagefiles in Windows NT are
dynamic in size, allowing them to grow as the demands for pages of memory grow. In
this way, Windows NT is able to provide virtually unlimited memory to the system.

Note A detailed discussion on how the virtual-memory manager performs the functions
mentioned here is presented later in this article in the section "The Virtual-Memory
Manager (VMM)."

32-Bit Virtual Addresses


One of the conveniences of the 32-bit linear address space is its continuity. Applications
are free to use basic arithmetic on pointers based on 32-bit unsigned integers, which
makes manipulating memory in the address space relatively easy. Though this is how
addresses are viewed by an application, Windows NT translates addresses in a slightly
different manner.

To Windows NT, the 32-bit virtual address is nothing more than a placeholder of
information used to find the actual physical address. Windows NT separates each 32-bit
virtual address into three groups. Each group of bits is then used independently as an
offset into a specific page of memory. Figure 2 shows how the 32-bit virtual address is
divided into three offsets, two containing 10 bits and one containing 12.

Figure 2. A 32-bit virtual address in Windows NT is divided into page offsets that are used for translating the address into a physical location in memory.

Page Directory, Page Tables, and Page Frames


The first step in translating the virtual address is to extract the higher-order 10 bits to
serve as the first offset. This offset is used to index a 4-byte value in a page of memory
called the page directory. Each process has a single, unique page directory in the Win32
subsystem. The page directory is itself a 4K page, segmented into 1024 4-byte values
called page-directory entries (PDEs). The 10 bits provide exactly the number of bits necessary to index each PDE in the page directory (2^10 = 1024 possible combinations).

Each PDE is then used to identify another page of memory called a page table. The
second 10-bit offset is subsequently used to index a 4-byte page-table entry (PTE) in
exactly the same way as the page directory does. PTEs identify pages of memory
called page frames. The remaining 12-bit offset in the virtual address is used to address
a specific byte of memory in the page frame identified by the PTE. With 12 bits, the final
offset can index all 4096 bytes in the page frame.

Through three layers of indirection, Windows NT is able to offer virtual memory that is
unique to each process and relatively independent of available physical resources. Also,
embedded within this structure is the basis for managing all of memory based on 4K
pages. Every page in the system can be categorized as either a page directory, page
table, or page frame.

Realizing 4 GB of Address Space


Translating a virtual address from page directory to page frame is similar to traversing a
b-tree structure, where the page directory is the root; page tables are the immediate
descendants of the root; and page frames are the page table's descendants. Figure 3
illustrates this organization.

Figure 3. Translating a virtual address is similar to traversing a b-tree structure.

A page directory has up to 1024 PDEs or a maximum of 1024 page tables. Each page
table contains up to 1024 PTEs with a maximum of 1024 page frames per page table.
Each page frame has its own 4096 one-byte locations of actual data. All totaled, the 32-
bit virtual address can be translated into 4 GB of address space (1024 * 1024 * 4096).
Yet, there is still the question of the pages that are used to represent the page tables
and page directory.

Looking closely at Figure 3 reveals that a considerable amount of overhead is required to completely realize all of the page frames in memory. In fact, to address each location in the 4-GB address space would require one page directory and 1024 page tables. Because each page is 4K of memory, approximately 4 MB of memory would be needed just to represent the address space ([1024 page tables + 1 page directory] * 4096 bytes/page).

Although that may seem like a high price to pay, it really isn't, for two reasons: First, 4
MB is less than 0.1 percent of the entire 4-GB address space, which is a reasonably small
amount of overhead when you consider comparable operating systems. Second,
Windows NT realizes the address space as it is needed by the application, rather than all
at once, so page tables are not created until the addresses they are used to translate are
needed.

Translating a Virtual Address


Specifically, how does Windows NT translate a 32-bit virtual address into a specific
memory location? As an example, take a look at an address in the process of any Win32-
based application. This is easily performed by running the Windows NT debugger,
WINDBG.EXE. Simply load an application, step into the WinMain code for the application,
and choose to view local variables. A typical address is 0x043612FF, or 0000 0100 0011 0110 0001 0010 1111 1111 in binary.

The first 10-bit offset is 00 0001 0000 in binary, or 0x010. Shift the bits left two places to form a 12-bit value whose low-order bits are padded with zeros, resulting in 0000 0100 0000 in binary, or 0x040. The bit shifting provides an easy mechanism for indexing the page on 4-byte boundaries. Use this value to index into the 4K page directory. The 4-byte PDE at this location identifies the page table for this address.

Repeat this method for the second 10-bit sequence, using the page table instead of the page directory. The second offset is 11 0110 0001 in binary (0x361); shifted left two places it becomes 1101 1000 0100 in binary, or 0xD84. The PTE at this index identifies the specific page frame. To index a one-byte location in the page frame, use the final 12 bits just as they are. Figure 4 demonstrates this process pictorially.

Figure 4. Pages of memory are used to represent the page directory, page
tables, and page frames for each process.
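The decomposition is simple bit arithmetic. A sketch that extracts the three offsets from the example address used above (the shifts and masks follow the 10/10/12 split shown in Figure 2):

    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
        uint32_t va = 0x043612FF;             /* the example address */

        uint32_t pdi    = (va >> 22) & 0x3FF; /* page-directory index */
        uint32_t pti    = (va >> 12) & 0x3FF; /* page-table index     */
        uint32_t offset =  va        & 0xFFF; /* byte within the page */

        /* Shifting each index left 2 bits gives the 4-byte-aligned
           byte offset into the directory or table page. */
        printf("PDE byte offset: 0x%03X\n", pdi << 2);  /* 0x040 */
        printf("PTE byte offset: 0x%03X\n", pti << 2);  /* 0xD84 */
        printf("Page offset:     0x%03X\n", offset);    /* 0x2FF */
        return 0;
    }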

Individual Process Integrity


Because every process has its own page directory, Windows NT is able to preserve the
integrity of each process's address space. This is both good news and bad news for those
programming applications destined for the Windows NT operating system. The good
news is that an application is secure against unwarranted, perhaps accidental, intrusion.
Also, as stated earlier, you are free to use as much of the address space as you need
without regard to the impact it has on other processes.

Now for the bad news. Because two processes can use the same virtual address to refer
to different locations in memory, processes are not able to communicate addresses to
one another. This makes sharing memory much more difficult than in the past and
requires that you use specific mechanisms to share memory with other processes. For
more information on sharing memory in Windows NT, refer to the section "Sharing Pages
Across Process Boundaries" later in this article.

There is more bad news when it comes to stray pointers. Having exclusive access to your
own process means that you own the entire space (that is, the lower 2 GB).
Consequently, if you have a stray pointer, it is much more difficult to detect. For
example, a pointer that runs past the bounds of a designated array only exists in your
own address space. Whether you actually committed that memory is another question,
but you can be certain that it won't pounce on another process's memory. A stray
pointer can point to two things: either a memory location you have committed for some
other purpose or an invalid address (one that memory has not been committed to). The
latter case generates an access violation exception, which you can handle easily enough
through structured exception handling. The former case will simply result in a successful
read or write operation. So, there is no way of knowing that a problem even occurred.
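The access-violation case can be demonstrated directly with structured exception handling. A sketch (Microsoft C compiler for the __try/__except keywords): touch a page that was reserved but never committed and catch the resulting fault:

    #include <windows.h>
    #include <stdio.h>

    int main(void)
    {
        /* Reserve only -- no physical storage is ever committed. */
        char *p = VirtualAlloc(NULL, 4096, MEM_RESERVE, PAGE_NOACCESS);
        if (!p) return 1;

        __try {
            p[0] = 1;  /* reserved but not committed: faults */
        }
        __except (GetExceptionCode() == EXCEPTION_ACCESS_VIOLATION
                  ? EXCEPTION_EXECUTE_HANDLER : EXCEPTION_CONTINUE_SEARCH) {
            puts("Access violation caught: the page was never committed.");
        }

        VirtualFree(p, 0, MEM_RELEASE);
        return 0;
    }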

Reserved vs. Committed Memory


In Windows NT, a distinction exists between memory and address space. Although each
process has a 4-GB address space, rarely if ever will it realize anywhere near that
amount of physical memory. Consequently, the virtual-memory manager must keep
track of the used and unused addresses of a process, independent of the pages of
memory it is actually using. In actuality this amounts to having a structure for
representing all of the physical memory in the system and a structure for representing
each process's address space.

As part of the process object (the overhead associated with every process in Windows
NT), the VMM stores a structure called the virtual address descriptor (VAD) tree to
represent the address space of a process. As address space gets used for a process, the
VMM updates the VAD tree to reflect which addresses are used and which are not.
Fortunately, Windows NT recognizes the value of managing address space independent
of memory and extends this capability to subsystems. Further, the Win32 subsystem provides this capability through the VirtualAlloc API. With this function, applications can reserve a range of addresses for use at a later time.

Many parts of the Win32 subsystem make use of the reserved memory feature. Take
stack space, for example. A Win32-based application can reserve up to 1 MB of memory
for the stack while actually committing as little as 4K of space initially. Memory-mapped
files represent another example of reserving a range of addresses for use at a later time.
In this case, the address range is reserved with the function CreateFileMapping until
portions are requested via a call to function MapViewOfFile. This permits applications
to map a large file (it is possible to load a file 1 GB in size in Windows NT) to a specific
range of addresses without having to load the entire file into memory. Instead, portions
(views) of the file can be loaded on demand directly to the reserved address space.
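In code, the pattern looks like this sketch ("example.dat" is a placeholder; the file must exist and be at least as large as the requested view):

    #include <windows.h>
    #include <stdio.h>

    int main(void)
    {
        HANDLE file = CreateFileW(L"example.dat", GENERIC_READ,
                                  FILE_SHARE_READ, NULL, OPEN_EXISTING,
                                  FILE_ATTRIBUTE_NORMAL, NULL);
        if (file == INVALID_HANDLE_VALUE) return 1;

        /* Reserve the mapping over the whole file... */
        HANDLE mapping = CreateFileMappingW(file, NULL, PAGE_READONLY,
                                            0, 0, NULL);
        if (mapping) {
            /* ...but map (and fault in) only the first 64 KB view. */
            const char *view = MapViewOfFile(mapping, FILE_MAP_READ,
                                             0, 0, 65536);
            if (view) {
                printf("First byte: 0x%02X\n", (unsigned char)view[0]);
                UnmapViewOfFile(view);
            }
            CloseHandle(mapping);
        }
        CloseHandle(file);
        return 0;
    }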

One other benefit of reserving address space resides in the fact that the addresses are
contiguous by default. Windows NT uses this technique for mapping each application's
code and DLL files to specific addresses. Because the content of code is executed
sequentially, the address space that references it must also be contiguous. By reserving
address space, Windows NT need load only the pages that are being used; the rest are
loaded on demand.

Another nice feature of reserved memory is that Windows NT can reserve address space
for page tables, as well as other pages. From a more global perspective, this means that
the 4 MB of memory required to realize a process's 4-GB address space is also not
committed until needed. A possible drawback to this feature is that a process takes a
performance hit on page faults. When translating virtual addresses that have only been
reserved, the process generates a page fault for every reserved page it accesses.
Because of this, a process could generate a page fault more than once on a single
address translation. More discussion on page faults is presented below in the section
"Page Faults."

Translation Lookaside Buffers (TLBs)


Considering all the work Windows NT does to retrieve the physical address of a page,
addressing in general seems a little on the inefficient side. After all, to translate a single
virtual address, the VMM must access memory in three physical pages of memory.
However, Windows NT uses another addressing scheme in parallel with the virtual
address translation technique described above.

Windows NT exploits the capability of modern CPUs by putting the translation lookaside
buffer (TLB) to use. The TLB (often referred to as the internal or on-chip cache) is
nothing more than a 64K buffer that is used by hardware to access the location of a
physical address. Specifically, Windows NT uses the TLB to provide a direct connection
between frequently used virtual addresses and their corresponding page frames. Using it,
the VMM is able to go from a virtual address directly to the page frame, thereby avoiding
translation through both the page directory and page table.

Because the TLB is a hardware component, its contents can be searched and compared
completely in parallel with the standard address translation performed in software. So,
the time saved in translating a virtual address comes without a tradeoff. Also, being a
hardware mechanism, it is extremely fast in comparison to the software translation
described earlier. A big win all the way around. Too bad there isn't room for more than
32 entries! Being a hardware component also has its limitations.

Each entry in the TLB consists of a virtual address and a corresponding PTE to identify
the page frame. Consecutively addressing two locations in the same page of memory
generates only one entry in the TLB in order to reduce redundancy and save precious
TLB space. Every time an address is translated in software and references a new page
frame, an entry is added to the TLB. Once the TLB is full, every new entry requires a
previous entry to be dropped from the buffer. The algorithm for dropping entries from
the TLB is simple: drop the least recently used page from the list.

Flushing the Buffer on Context Switches


Although Windows NT appears to run all threads concurrently, in actuality only one
thread can execute at any given time. So, Windows NT schedules each thread to execute
for a small amount of time, often referred to as a time slice. When the time slice expires,
Windows NT schedules the next thread to run during its slice. This continues until all
threads have had a chance to execute during their slice. At that time, Windows NT
schedules the first thread again, repeating this process indefinitely. The act of stopping a
thread at the end of its time slice and scheduling another thread is called context
switching.

When a thread is running, it spends much of its time translating addresses and relies on
the TLB to make this as fast as possible. Yet, a single context switch can render the TLB
useless. The address translations of one thread will be incorrect for another thread
unless the second thread is a thread of the same process. The chances of the thread
being from the same process are remote at best, and even if it were from the same
process, addresses used by the second thread are likely to be entirely different from
those of the first.

Consequently, the buffer is automatically flushed when context switching between threads on the Intel platform -- a hardware feature. The MIPS implementation of Windows NT does not flush the buffer during context switches; instead, the MIPS hardware provides a 36-bit address that Windows NT puts to use for this purpose, using the extra four address bits to identify which process is responsible for each TLB entry.

When you consider the rate at which Windows NT performs context switches, it seems at
first that this action would nullify any gains the TLB could offer. In actuality, though,
consider that the duration between context switches is on the order of 17 milliseconds.
On a machine capable of 5 million instructions per second (a typical Intel-based 80386
33MHz is on the order of 10-15 MIPS), this will amount to 85,000 instructions per
context switch (17 milliseconds * 5 MIPS). Then, take into account the fact that on
average an application attributes 30–50 percent of its time addressing memory, and you
get 25,500–42,500 address instructions per context switch. Also, the TLB will become
completely full after executing 32 address instructions that reference a unique page of
memory. So, the TLB will likely be refilled very soon after the context switch occurs so
that it again becomes useful for nearly all address instructions.

Associative Buffer Limitation


One limitation exists in the translation lookaside buffer due to its tight integration with
hardware. Because there are only 32 entries in the TLB, and each of these entries must
be able to map to physical addresses, there is some restriction as to which address each
entry can contain. Because of this, some overlap exists, creating the possibility that all of the TLB entries capable of containing a specific address could be in use at one time. In that case, accessing a page of memory that can only be represented in one of these entries forces the least recently used of them to be dropped from the list.

The worst-case scenario is repetitively executing code at five different addresses that can be located in only four TLB entries, which forces complete address translation for all five addresses: each new translation replaces the least recently used entry in the TLB, and if the next of the five addresses is the one just replaced, it must be translated all over again. This repeats for each of the five addresses in a cyclical fashion, rendering the TLB useless. Although this type of occurrence is possible, it has proven extremely rare.

Page-Table Entry Structure


Address translation has two aspects: breaking a virtual address into three offsets for
indexing into pages of memory and locating an actual physical page of memory. The
page directory and page table entries mentioned earlier are used for this purpose.

Once memory is committed for a range of reserved addresses, it exists as either a page
in random access memory (RAM) or in a pagefile on disk. A page-table entry identifies
the location of the page, its protection, its backing pagefile, and the state of the page, as
shown in Figure 5.
Figure 5. Page-table entries are used to provide access to physical pages of
memory.

The first 5 bits are dedicated to page protection for each page of memory. The Win32
API exposes PAGE_NOACCESS, PAGE_READONLY, and PAGE_READWRITE protection to
applications written for the Win32 subsystem.

Following the protection bits are 20 bits that represent the physical address of the page
in memory if it is resident. Note that these 20 bits can address any 4K page of memory
in the 4-GB address space (2^20 * 4,096 bytes = 4 GB). If the page of memory is paged
to disk, the twenty address lines are instead used as an offset into the appropriate
pagefile to locate the page.

The next four bits are used to indicate which pagefile backs this page of memory. Each
of the 16 possible pagefiles can be uniquely identified with these four bits.

The final three bits indicate the state of the page in memory. The first bit is a flag
indicating pages in transition (T); the second indicates dirty pages, that is, pages that
have been written to but not yet written to disk (D); and the third indicates whether the
page is present in memory (P). The state table below represents the possible states of a
page.

Table 1. Page-Table Entry Page States

T D P   Page state
0 - 0   Invalid page
- 0 1   Valid page
- 1 1   Valid dirty page
1 0 0   Invalid page in transition
1 1 0   Invalid dirty page in transition

When a page is not present and not in transition, the dirty bit is ignored. Likewise, when
a page is present, the transition bit is ignored.

The above description of a PTE applies to all pages of memory that are backed by one of
the 16 pagefiles. Yet, in Windows NT, not all pages of memory are backed by these
pagefiles. Instead, Windows NT backs pages of memory that represent either code or
memory-mapped files with the actual file they represent. This provides a substantial
savings of disk space by eliminating redundant information. When a page
of this type is present in memory, the PTE is structured just as described above for
present pages and pages in transition. When a page is not present in memory, the PTE
structure changes to provide 28 bits that can be used for addressing an entry in a
system data structure. This entry references the name of a file and a location within the
file for the page of memory. To get 28 bits in the PTE, the four pagefile bits and four of
the protection bits are sacrificed, while the three state bits remain intact.
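
As a rough illustration, the PTE layout just described could be modeled in C as a bitfield (a sketch based solely on this article's description; the field names and ordering are illustrative, not an actual Windows NT definition):

/* sketch of the described PTE layout: 5 + 20 + 4 + 3 = 32 bits */
typedef struct {
    unsigned protection : 5;   /* page protection (no access, read-only, read/write) */
    unsigned address    : 20;  /* physical page address, or offset into a pagefile   */
    unsigned pagefile   : 4;   /* which of the 16 pagefiles backs this page          */
    unsigned transition : 1;   /* T: page is in transition                           */
    unsigned dirty      : 1;   /* D: written to, but not yet written to disk         */
    unsigned present    : 1;   /* P: page is resident in physical memory             */
} nt_pte;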

Page Faults
When Windows NT addresses an invalid page (that is, during the course of address
translation one of the PTEs identifies a page as not present), a page-fault exception is
raised by the processor. A page fault results in switching immediately to the pager. The
pager then loads the page into memory and, upon return, the processor re-executes the
original instruction that generated the page fault. This is a relatively fast process, but
accumulating many page faults can have a drastic impact on performance.

It is possible that during translation of a single virtual address, as many as three page
faults can occur. This is due to the fact that, in the worst case, a virtual address may be
realized only by accessing a page directory, page table, and page frame where none of
these pages is present in memory and each generates a separate page fault when
accessed.

This is one instance in which the TLB can really improve performance of memory
addressing by avoiding the page fault associated with loading a page table and page
directory, reducing multiple fault possibilities to at most a single page fault. A similar
reduction of page faults from two to one occurs when, during translation, a page
directory is present but the page table and page frame are not. In both cases, the page
frame is not present, so one page fault is required to retrieve the page frame. On the
other hand, if the page frame is present but the page table and page directory are not,
two page faults are avoided.

Sharing Pages Across Process Boundaries


Given that each process has its own page directory, sharing memory between processes
is anything but trivial. Because the page directory is the root of an address as shown in
Figure 4, an address in one process is essentially meaningless in the context of another
process. If, in fact, an address in one process is a valid address in a second process, it is,
at best, a coincidence. Yet on the other hand, there is no physical barrier preventing
page tables from two processes from having identical PTEs that point to the same page
frame. So, it is feasible for Windows NT to provide memory sharing in this way.

However, there is one glaring inefficiency in this scheme: What happens when the state
of a shared page is changed? Say, for example, that a shared page is written to by one
of four processes sharing that page. The system would then have to update four PTEs,
one entry in the page table of each of the four processes. Not only would this be an
expensive performance hit on a single write, but there is also no way of determining
which page tables reference a specific page frame. Some type of overhead would have to
be put in place to reverse-reference all PTEs that reference a shared page frame.

Prototype PTEs
For Windows NT, a better implementation than the scheme described above was chosen
for sharing memory. Rather than having multiple PTEs point to the same physical page,
another layer of page tables was put in place exclusively for shared memory. When two
or more processes share a page of memory, an additional structure called a prototype
page-table entry is used to reference the shared page. Each process's PTE points to the
prototype PTE, which, in turn, points to the actual shared page. The prototype PTE is
also a 32-bit quantity that directly references the page frame of the shared page. Figure
6 illustrates this new indirect approach.

Figure 6. Prototype page-table entries are used to share pages of memory between
processes.
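
In rough C terms, the extra level of indirection looks like this (a conceptual sketch of the data flow only, not actual Windows NT structures):

#include <stdint.h>

typedef struct { uint32_t frame; } proto_pte;       /* references the shared page frame */
typedef struct { proto_pte *proto; } process_pte;   /* each process's PTE references it */

/* resolving a shared page takes one extra hop through the prototype PTE */
uint32_t shared_page_frame (process_pte *pte) {
    return pte->proto->frame;
}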

Performance Hit on Prototype PTEs


Prototype page-table entries are not without their own performance hit. They are
implemented as a global system resource mapped into the upper address space of all
processes. A maximum of 8 MB of space is reserved for use by the system to support a
prototype PTE data structure for all shared pages. Prototype PTEs are allocated
dynamically as they are needed by the system, so no memory is wasted on supporting
nonexistent shared pages. The biggest performance hit occurs in accessing a shared page
that is not present: the additional layer of indirection means that translating a
virtual address to a shared page could mean as many as four page faults, instead of
three.

Shared memory in Windows NT has several uses, the most common of which is code
sharing. Code is shared by default so that running multiple instances of an application
reuses as much of the existing resources as possible. Also, memory-mapped files are
implemented as shared memory. Finally, the Win32 subsystem provides a feature that
enables processes to share data via a DLL. For more information on the memory sharing
capabilities provided in the Win32 subsystem, refer to the article "Managing Memory
Mapped Files in Win32" on the Developer Network CD (Technical Articles, Win32).

Copy-on-Write Optimization
By default, all code pages have PAGE_READWRITE protection in Windows NT. This
characteristic makes life easy for applications like debuggers. Because a debugger can
write code pages, it is relatively easy to embed break points and single-step execution
instructions in the code itself. Yet, this also raises another issue. What if the code being
debugged was also being executed by another process simultaneously? The act of
writing a break-point instruction to the code page would affect both the process being
debugged and the other process. On the other hand, having duplicate copies of code,
one for each instance of a process, would be redundant and wasteful.

The solution to this problem is an optimization called copy-on-write. I have already
discussed how a prototype PTE is used for all code pages to make them capable of being
shared among different processes. In addition to the shareability of code pages,
Windows NT gives code pages another special characteristic that enables them to be
copied, if necessary, and backed by the pagefile. Copying would only occur if and when a
write ever occurred to a code page. The optimization resides in the fact that copying
does not occur unless necessary, as determined by the act of writing to a page.
Consequently, only pages that are written to are copied, saving precious memory
resources.

The Virtual-Memory Manager (VMM)


The virtual-memory manager in Windows NT is the system component primarily
responsible for managing the use of physical memory and pagefiles. To do this, it must
track each page of physical memory, tune the working set of all active processes, and
swap pages to and from disk both on demand and routinely. The VM manager is an
executive component of Windows NT that runs exclusively in kernel mode. Because of
the time-critical nature of the code that is executed by the virtual-memory manager, the
VMM code resides in a section of memory called the nonpaged pool. This memory is
never paged to disk.

The Page-Frame Database


The virtual-memory manager uses a private data structure for maintaining the status of
every physical page of memory in the system. The structure is called the page-frame
database. The database contains an entry for every page in the system, as well as a
status for each page. The status of each page falls into one of the following categories:

Valid: A page in use by an active process in the system. Its PTE is marked as valid.

Modified: A page that has been written to but not yet written to disk. Its PTE is marked
as invalid and in transition.

Standby: A page that has been removed from a process's working set. Its PTE is marked
as invalid and in transition.

Free: A page with no corresponding PTE, available for use. It must first be zeroed
before being used unless it is used as a read-only page.

Zeroed: A free page that has already been zeroed and is immediately available for use
by any process.

Bad: A page that has generated a hardware error and cannot be used by any process in
the system.

Most of the status types are common to most paged operating systems, but the two
transitional page status types are unique to Windows NT. If a process addresses a
location in one of these pages, a page fault is still generated, but very little work is
required of the VMM. Transitional pages are marked as invalid, but they are still resident
in memory, and their location is still valid in the PTE. The VMM merely has to change the
status on this page to reflect that it is valid in both the PTE and the page-frame database,
and let the process continue.

The page-frame database associates similar pages based on each page's status. All the
pages of a given type are linked together via a linked list within the database; see Figure
7. These lists are then traversed directly according to status. This enables the VM
manager to locate three pages marked Free, for example, without having to search the
entire database independently for each Free page. Another way of thinking of the
database entries is to consider them as existing in six independent lists, one for each
type of page status.

Figure 7. The page-frame database records the status of pages of physical memory.
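
Given that description, a page-frame database entry might be sketched in C like this (illustrative only; the real structure is internal to Windows NT):

enum page_status { VALID, MODIFIED, STANDBY, FREE, ZEROED, BAD };

struct pfn_entry {
    enum page_status status;   /* which of the six states the page is in     */
    struct pfn_entry *next;    /* links pages of the same status into a list */
    unsigned *pte;             /* reverse reference to the corresponding PTE */
};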

Each page-frame entry in the database also reverse-references its corresponding PTE.
This is necessary so that the VM manager can quickly return to the PTE to update its
status bits when the status of a page changes. The VM manager is also able to reverse-
reference prototype PTEs to update their status changes, but note that the prototype
PTE does not reverse-reference any of its corresponding PTEs.

The VMM uses the page-frame database any time a page of memory is moved in or out
of memory or its state changes. Take, for example, a process that attempts to address a
specific memory location in a page that had been paged to disk. The translation for this
virtual address would generate a page fault as soon as an attempt to access the page
referenced by the PTE occurred. The VMM would then allocate a physical page of
memory to satisfy the request. Depending on the current state of the system, allocating
a page may be as easy as changing the PTE for the page to Valid and updating the page-
frame database for that page; such is the case for transitional pages as described above.
On the other hand, the VMM may be required to steal a Modified page from another
process; write the page to disk; update the PTE in the page table of the other process as
not in transition; zero the page; read in the new page from a pagefile; update its PTE to
indicate a valid page; and update the page-frame database to represent the physical
page as Valid.
Periodically, the VMM updates the page-frame database and the state of transitional
pages in memory. In an effort to keep a minimum number of pages available to the
system at all times, the VMM moves pages (figuratively speaking) from either the
Modified or Standby list to the Free list. Modified pages must be written to disk first and
then marked as Free. Standby pages do not need to be written because they are not
dirty. Free pages are eventually zeroed and moved to the Zeroed list. Pages in the Free
and Zeroed lists are immediately available to processes that request pages of memory.
Each time a page is moved from one list to the next, the VMM updates the page-frame
database and the PTE for the page. It is important to note that pages in either of the
transition states are literally in transition from Valid pages to Free pages.

Managing a Working Set of Pages for Each Process


Another part of the VMM gets pages into the transitional state. The thread that gets
transitional pages must continually decide what data is most deserving of replacement
on a process-by-process basis. The algorithm for deciding which page to replace is
typically based on predicting the page that is least likely to be needed next. This
prediction is influenced by factors such as what page was accessed least often and what
page was accessed the longest time ago. In Windows NT, the component responsible for
making these predictions is called the working-set manager.

When a process starts, the VMM assigns it a default working set that indicates the
minimum number of pages necessary for the process to operate efficiently (that is, the
least amount of paging possible to fulfill the needs of the process without starving the
needs of other processes). The working-set manager periodically tests this quota by
stealing Valid pages of memory from a process. If the process continues to execute
without generating a page fault for this page, the working set is reduced by one, and the
page is made available to the system. This test is performed indiscriminately on all
processes in the system, providing the basis for the free pool of pages described above.
All processes benefit from this pool by being able to allocate from it on demand.

The act of stealing a page from a process actually occurs in two stages. First, the
working-set manager changes the PTE for the page to indicate an invalid page in
transition. Second, the working-set manager also updates the page-frame database
entry for the physical page, marking it as either Modified or Standby, depending on
whether the page is dirty or not.
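
In pseudo-C, the two-stage steal could be sketched as follows (hypothetical bit masks, reusing the pfn_entry sketch from earlier; purely illustrative):

#define PTE_PRESENT    0x1   /* hypothetical mask values for illustration */
#define PTE_TRANSITION 0x2
#define PTE_DIRTY      0x4

void steal_page (unsigned *pte, struct pfn_entry *frame) {
    /* stage 1: mark the PTE invalid but in transition */
    *pte &= ~PTE_PRESENT;
    *pte |= PTE_TRANSITION;

    /* stage 2: move the physical page to the Modified or Standby list */
    frame->status = (*pte & PTE_DIRTY) ? MODIFIED : STANDBY;
}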

Conclusion
Developers for Windows NT face many new challenges on their way to becoming
proficient with the operating system. Understanding how to manage memory effectively
is likely to be one of the more difficult challenges—and probably the most important one.
Figuring out whether to use virtual memory, memory-mapped files, heap memory, or
space on a thread's stack for implementing specific types of data is representative of the
kind of decision that developers routinely face when developing applications for Windows
NT. Such a decision depends on knowing how and when the operating system allocates
specific resources—such as the virtual address descriptor (VAD) tree, the translation
lookaside buffer (TLB), page tables and page-table entries (PTEs), prototype PTEs, the
page-frame database, a process's virtual address space, system pagefiles, and so on. A
thorough knowledge of each of these system resources and how they are affected by
specific application programming interface (API) functions is the key to mastering
Windows NT.

This technical article identifies each of the components of the virtual memory
management system in Windows NT, focusing on answers to these questions:
 How does the system go about its business?

 How do the components work together to get things done?

 When do things get done?

Understanding how and when the system does things and what the system resources are
is only half the challenge. The other half lies in understanding how each of the API
functions affects these resources. With hundreds of memory management functions
available, the task is especially challenging. This technical article provides a foundation
for three other technical articles on this disc that specifically address the memory
functions available in the Win32 API and explain the impact each has on system
resources: "Managing Virtual Memory in Win32," "Managing Memory Mapped Files in
Win32," and "Managing Heap Memory in Win32" (MSDN Library, Technical Articles). You
should examine this article first and then read the other three when you are faced with
memory issues in a specific area of the Win32 API.

© 1998 Microsoft Corporation. All rights reserved.

Operating Systems Development Series

Operating Systems Development - Virtual Memory
by Mike, 2008

This series is intended to demonstrate and teach operating system development from
the ground up.

Introduction
Welcome back! Jeez, I can't believe we are already going on tutorial eighteen. See? OS
development isn't too bad ;)

In the last tutorial we looked at physical memory management and even developed a
full working physical memory manager. In this tutorial, we will take it to a new level by
introducing paging and virtual memory. We will learn how we can mimic a full virtual
address space for our programs and learn how we can manage virtual memory.

Here's the list for this chapter:

 Virtual Memory
 Memory Management Unit (MMU)
 Translation Lookaside Buffer (TLB)
 PAE and PSE
 Paging Methods
 Pages and Page Faults
 The Page Table
 The Page Directory Table
 Implementing Paging

...And a whole lot more!

This tutorial builds off of the physical memory manager we developed in the last
chapter. This may also be the last chapter on memory management!

With that in mind, let's get started!

Virtual Memory Concepts


The need for Virtualization
You might be curious as to why we should worry about this "virtual memory" thing.
After all, we already have a nice and effective way of managing memory, right? Well,
sort of. While it manages blocks of memory well, thats all our physical memory
manager does. This alone is pretty useless, don't you think?

There are alot of very important concepts that we should look at to better understand
virtual memory and the need for it.

Right now all we have is a way to directly and indirectly work with physical memory.
There are a lot of big problems with this that you may already know (or even have
experience with yourself ;) ). One that we have just seen occurs when we access a
block of memory that does not exist. Knowing that both programs and data are in
memory, it is also possible for programs to access each other's memory, or even
corrupt and overwrite themselves or other programs without knowing it. After all, there
is no memory protection.

Also, it is not always possible to load a file or program into a sequential area of
memory. This is when fragmentation happens. For example, let's say we have two
programs loaded, one at 0x0 and the other at 0x900. Both of these programs requested
to load files, so we load the data files:

Notice what is happening here. There is a lot of unused memory between all of these
programs and files. Okay... What happens if we add a bigger file that is unable to fit in
the space above? This is when big problems arise with the current scheme. We cannot
directly manipulate memory in any specific way, as doing so will corrupt the currently
executing programs and loaded files.

As you can see, there are a lot of problems that arise when working with physical
memory. If your operating system is single-tasking (where only one ring 0 program
runs at a time), then this might be fine. For anything more complex, we will need
more control over how memory works within the system. What we need is a way to
abstract physical memory in such a way that we do not need to worry about these
details anymore. I think you know where I am going with this -- this is where
virtualization comes in. Let's take a look!

Virtual Memory
Concepts
Understanding what virtual memory is can be a little tricky. Virtual Memory is a special
Memory Addressing Scheme implemented by both the hardware and software. It allows
noncontiguous physical memory to act as if it were contiguous memory.

Notice that I said "Memory Addressing Scheme". What this means is that virtual
memory allows us to control what a Memory Address refers to.

Virtual Address Space (VAS)


A Virtual Address Space is a program's address space. One needs to take note
that this does not necessarily have anything to do with Physical Memory. The idea
is that each program has its own independent address space. This ensures that one
program cannot access another program, because they are using different address
spaces.

Because a VAS is virtual and not tied directly to physical memory, it allows the
use of other sources, such as disk drives, as if they were memory. That is, it allows us
to use more "memory" than what is physically installed in the system.
This fixes the "Not enough memory" problem.

Also, as each program uses its own VAS, we can have each program always begin at
base 0x0000:0000. This solves the relocation problems discussed earlier, as well as
memory fragmentation--as we no longer need to worry about allocating contiguous
physical blocks of memory for each program.

Virtual addresses are mapped by the kernel through the MMU. More on this a
little later.

Memory Management Unit (MMU)


The Memory Management Unit (MMU) (also known as the Paged Memory
Management Unit (PMMU)) sits between (or is part of) the microprocessor and
the memory controller. While the memory controller's primary function is the
translation of memory addresses into a physical memory location, the MMU's purpose is
the translation of virtual memory addresses into a memory address for use by
the memory controller.

This means that when paging is enabled, all of our memory references go through
the MMU first!

Translation Lookaside Buffer (TLB)


This is a cache stored within the processor used to improve the speed of virtual address
translation. It is usually a type of content-addressable memory (CAM), where the
search key is the virtual address to translate and the result is the physical frame
address. If the address is not in the TLB (a TLB miss), the MMU searches through the
page table to find it. If it is found in the TLB, it is a TLB hit. If the page is not found or
is invalid inside of the page table during a TLB miss, the processor will raise a Page
Fault exception for us.

Think of a TLB as a table of pages stored in a cache instead of in RAM--as that is
basically what it is.

This is important! The pages are stored in page tables. We set up these page tables
to describe how virtual addresses translate to physical addresses. In other words: the
TLB translates virtual addresses into physical addresses using the page tables
*we* set up for it to use! Yes, that's right--we set up what virtual addresses map to
what. We will look at how to do this a little later, cool? Don't worry--it's not that bad ;)

Paged Virtual Memory


Virtual memory also provides a way to indirectly use more memory than we actually
have within the system. One common way of approaching this is by using page files,
stored on a hard drive or a swap partition.

Virtual memory needs to be mapped through a hardware device controller in order to
work, as it is handled at the hardware level. This is normally done through the MMU,
which we will look at later.

For an example of seeing virtual memory in use, let's look at it in action:

Notice what is going on here. Each memory block within the virtual addresses is
linear. Each memory block is mapped to either its location within the real physical RAM
or another device, such as a hard disk. The blocks are swapped between these devices
on an as-needed basis. This might seem slow, but it is very fast thanks to the MMU.

Remember: each program will have its own Virtual Address Space--shown
above. Because each address space is linear and begins at 0x0000:0000, this
immediately fixes a lot of the problems relating to memory fragmentation and program
relocation issues.

Also, because virtual memory uses different devices for its memory blocks, it can
easily manage more than the amount of memory within the system. That is, if there is
no more system memory, we can allocate blocks on the hard drive instead. If we run out
of memory, we can either increase this page file on an as-needed basis or display a
warning/error message.

Each memory "block" is known as a Page, which is usually 4096 bytes in size. We will
cover pages a little later.

Okay, so a page is a memory block. This memory block can either be mapped to a
location in memory or to another device location, such as a hard disk. In the latter case
the page is an unmapped page. If software accesses an unmapped page (that is, a page
not currently in memory), it needs to be loaded somehow. This is done by our page fault
handler.

We will cover everything later, so do not worry if this sounds hard :)

Because we are talking about paging in general, I think now would be a good time to
look at some extensions that may be used with paging. Let's have a look!

PAE and PSE


Physical Address Extension (PAE)
PAE is a feature of x86 microprocessors that allows 32-bit systems to access up to 64
GB of physical memory. Motherboards that support PAE use a 36-line address bus to
achieve this. Paging support with PAE enabled (bit 5 in the cr4 register) is a little
different than what we have looked at so far. I might decide to cover this a little later;
however, to keep this tutorial from getting even more complex, we will not look at it
now. However, I do encourage readers to look into it if you are interested. ;)

Page Size Extension (PSE)


PSE is a feature of x86 microprocessors that allows pages larger than 4KB in size. It
allows the x86 architecture to support 4MB page sizes (also called "huge pages" or
"large pages") alongside 4KB pages.

The World of Paging


Let the madness begin :)

Introduction
Woo-hoo! Welcome to the wonderful and twisted-minded world of paging! With all of
the fundamental concepts that we have gone over already, you should have a nice and
good grasp of what paging and virtual memory are all about. This is a great start, don't
you think?

Okay, cool... but how do we actually implement it? How does paging work on the x86
architecture? Let's take a look!

Pages
A Page (also known as a memory page or virtual page) is a fixed-length block of
memory. This block of memory can reside in physical memory. Think of it like this: a
page describes a memory block and where it is located. This allows us to "map" or
"find" the location of that memory block. We will look at mapping pages and
how to implement paging a little later :)

The i86 architecture uses a specific format for just this. It allows us to keep track of a
single page and where it is currently located. Let's take a look..

Page Table Entries (PTE)


A page table entry is what represents a page. We will not cover the page table until a
little later, so don't worry too much about it. However, we do need to look at what an
entry in the table looks like now. The x86 architecture defines a specific bit format for
working with pages, so let's take a look at it.

 Bit 0 (P): Present flag


o 0: Page is not in memory
o 1: Page is present (in memory)
 Bit 1 (R/W): Read/Write flag
o 0: Page is read only
o 1: Page is writable
 Bit 2 (U/S):User mode/Supervisor mode flag
o 0: Page is kernel (supervisor) mode
o 1: Page is user mode. Cannot read or write supervisor pages
 Bit 3 (PWT): Write-through flag
 Bit 4 (PCD): Cache disable flag
 Bit 5 (A): Access flag. Set by processor
o 0: Page has not been accessed
o 1: Page has been accessed
 Bit 6 (D): Dirty flag. Set by processor
o 0: Page has not been written to
o 1: Page has been written to
 Bit 7 (PAT): Page Attribute Table index
 Bit 8 (G): Global page
 Bits 9-11 (AVAIL): Available for use
 Bits 12-31 (FRAME): Frame address

Cool, huh? That's all? Well... I never said it was hard ;)

Quite possibly the most important thing here is the frame address. The frame
address represents the 4KB physical memory location that the page
manages. This is vital to know when understanding paging, though it is hard to
describe why just yet. For now, just remember that each and every page
manages a block of memory. If the page is present, it manages a 4KB address
space in physical memory.

The Dirty flag and Access flag are set by the processor, not software. You might
wonder how the processor knows which bits to set, i.e., where they are located in
memory. We will look at that a little later. Just remember that this allows the
software or executive to test whether a page has been accessed or not.

The present flag is an important one. This one single bit is used to determine if a page
is currently in physical memory or not. If it is currently in physical memory, the frame
address is the 4KB-aligned physical base address of where the page is located. If it is
not in physical memory, the page must reside at another location--such as on a hard disk.

If the present flag is not set, the processor will ignore the rest of the bits in the
structure. This allows us to use the rest of the bits for whatever purpose we like...
perhaps where the page is located on disk? This allows us--when our page fault handler
gets called--to locate the page on disk and swap the page into memory when needed.
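
For instance, an OS could stash a swap location in those unused bits. Here is one hypothetical encoding of my own (not an x86-defined format):

#include <stdint.h>

/* bit 0 (present) stays clear; the remaining 31 bits hold a swap slot number */
uint32_t make_swapped_pte (uint32_t swap_slot) {
	return swap_slot << 1;
}

uint32_t swap_slot_of (uint32_t pte) {
	return pte >> 1;
}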

Let's work through a simple example. Let's say that we want this page to manage the 4KB
address space beginning at physical location 1MB (0x100000). What this means--to put
it in other words--is that this page is "mapped" to address 1MB.

To create this page, simply set 0x100000 in bits 12-31 (the frame address) of the page
and set the present bit. Voila--the page is mapped to 1MB. :) For example:

%define PRIV 3

mov ebx, 0x100000 | PRIV ; this page is mapped to 1MB

Notice that 0x100000 is 4KB aligned. We OR it with PRIV (3, or 11 in binary), which sets
the first two bits. Looking at the above table, we can see that this sets the present and
read/write flags, making this page present (meaning it is in physical memory, which is
true as it is mapped to physical address 0x100000) and writable.

That's it! You will see this example expand further in the next few sections so that you
can start seeing how everything fits in, so don't worry too much if you still do not
understand.
Also notice that there is nothing special about PTEs--they are simply 32-bit data. What
is special about them is how they are used. We will look at that a little later...

pte.h and pte.cpp - Abstracting page table entries and pages


The demo hides all of the code that sets and gets the individual properties of page table
entries inside of these two files. All these do is set and get the bits and frame address
from the 32-bit pattern that we looked at in the list above. This interface does
have a little overhead but greatly improves readability and makes the entries easier to
work with.

The first thing we do is abstract the bit pattern used by page table entries. This is
easy:

enum PAGE_PTE_FLAGS {

	I86_PTE_PRESENT       = 1,          //00000000000000000000000000000001
	I86_PTE_WRITABLE      = 2,          //00000000000000000000000000000010
	I86_PTE_USER          = 4,          //00000000000000000000000000000100
	I86_PTE_WRITETHOUGH   = 8,          //00000000000000000000000000001000
	I86_PTE_NOT_CACHEABLE = 0x10,       //00000000000000000000000000010000
	I86_PTE_ACCESSED      = 0x20,       //00000000000000000000000000100000
	I86_PTE_DIRTY         = 0x40,       //00000000000000000000000001000000
	I86_PTE_PAT           = 0x80,       //00000000000000000000000010000000
	I86_PTE_CPU_GLOBAL    = 0x100,      //00000000000000000000000100000000
	I86_PTE_LV4_GLOBAL    = 0x200,      //00000000000000000000001000000000
	I86_PTE_FRAME         = 0x7FFFF000  //01111111111111111111000000000000
};
Notice how this matches up with the bit format that we looked at in the list above. What
we want is a way to abstract the setting and getting of these properties (i.e., bits)
behind the interface.

To do this, we first abstract the data type used to store a page table entry. In our case
it's a simple uint32_t:

//! page table entry
typedef uint32_t pt_entry;

Simple enough. Next up are the interface routines that are used to set and get these
bits. I don't want to look at their implementation here, as all they do is (literally) set or
get individual bits within a pt_entry. So instead I want to focus on the interface:
extern void pt_entry_add_attrib (pt_entry* e, uint32_t attrib);
extern void pt_entry_del_attrib (pt_entry* e, uint32_t attrib);
extern void pt_entry_set_frame (pt_entry*, physical_addr);
extern bool pt_entry_is_present (pt_entry e);
extern bool pt_entry_is_writable (pt_entry e);
extern physical_addr pt_entry_pfn (pt_entry e);
pt_entry_add_attrib() sets a single bit within the pt_entry. We pass it a mask (like
our I86_PTE_PRESENT bit mask) to set it. pt_entry_del_attrib() does the same but
clears the bit.

pt_entry_set_frame() masks out the frame address (the I86_PTE_FRAME mask) and sets
our frame address into it. pt_entry_pfn() returns this address.

There is nothing special about these routines--we can easily set and get these attributes
manually if we wanted to via bit masks or (if you want) bit fields. I personally feel
this setup makes it much easier to work with them, though ;)
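
For reference, here is a minimal sketch of how these routines could be implemented (my own illustration of the idea, not necessarily the demo's actual pte.cpp):

//! a sketch of pte.cpp, assuming the pt_entry type and flags above
void pt_entry_add_attrib (pt_entry* e, uint32_t attrib) {
	*e |= attrib;                       // set the requested flag bits
}

void pt_entry_del_attrib (pt_entry* e, uint32_t attrib) {
	*e &= ~attrib;                      // clear the requested flag bits
}

void pt_entry_set_frame (pt_entry* e, physical_addr addr) {
	*e = (*e & ~I86_PTE_FRAME) | (addr & I86_PTE_FRAME);   // replace only the frame bits
}

bool pt_entry_is_present (pt_entry e) {
	return (e & I86_PTE_PRESENT) != 0;
}

bool pt_entry_is_writable (pt_entry e) {
	return (e & I86_PTE_WRITABLE) != 0;
}

physical_addr pt_entry_pfn (pt_entry e) {
	return e & I86_PTE_FRAME;           // the 4K-aligned physical frame address
}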

Okay, this is great, as this setup allows us to keep track of a single page. However, it is
useless by itself, as a typical system will need to have a lot of pages. This is where a
page table comes in.

Page Tables
The page table...hm...where oh where did we hear that term before? *looks one line
up*. Oh, right ;)

A Page Table is..well..a table of pages. (Surprised?) A page table allows us to keep
track of how the pages are mapped between physical and virtual addresses. Each page
entry in this table follows the format shown in the previous section. In other
words, a page table is an array of page table entries (PTEs).

While it is a very simple structure, it has a very important purpose. The page table
contains a list of pages and how they are mapped. By "mapping", we refer to how a
virtual address "maps" to a physical frame address. The page table also manages the
pages: whether they are present, how they are stored, or even what process they
belong to (this can be set by using the AVAIL bits of a page; it may not be needed,
depending on the implementation of the system).

Let's stop for a moment. Remember that a page manages 4KB of physical address
space? By itself, a page is nothing more than a 32-bit data structure that describes the
properties of a specific 4KB region of physical memory (remember this from before?).
Because each page "manages" 4KB of physical memory, putting 1024 pages together
gives us 1024 * 4KB = 4MB of managed virtual memory. Let's take a look at how it's set
up:

That's an example of a page table. Notice how it is nothing more than an array of 1024
page entries. Knowing that each page manages 4KB of physical memory, we can
actually turn this little table into its own virtual address space. How can we do this?
Simple: by deciding the format of a virtual address.

Here's an example: let's say we have designed a new virtual address format like this:

AAAAAAAAAA BBBBBBBBBBBB
page table index offset into page
This is our format for a virtual address. So, when paging is enabled, all memory
addresses will now follow the above format. For example, let's say we have the following
instruction:
mov ecx, [0xc0000]

Here, 0xc0000 will be treated like a virtual address. Let's break it apart:

0011000000 000000000000 ; 0xc0000 in binary form
AAAAAAAAAA BBBBBBBBBBBB
page table index offset into page

What we are now doing is an example of address translation. We are actually
translating this virtual address to see what physical location it refers to. The page table
index, 0011000000b = 192, is the page entry inside of our page table. We can now
get the base physical address of the 4KB that this page manages. If this page is present
(the page's present flag is set), all we need to do is access the page's frame address to
access the memory. If this page is NOT present, we generate a page fault--the page
data might be somewhere on disk. The page fault handler will allow us to copy the 4KB
of data for the page into memory somewhere, set the page to present, and update
its frame address to point to this new 4KB block of physical memory.
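
Here is the same two-part translation written out in C (my own sketch; the table and names are hypothetical, but the bit math matches the example format above):

#include <stdint.h>

/* a hypothetical single page table: 1024 entries, bit 0 = present,
   bits 12-31 = 4K-aligned frame address */
uint32_t page_table[1024];

/* translate a virtual address in the example format (10-bit index, 12-bit offset) */
uint32_t translate (uint32_t vaddr) {
	uint32_t index  = (vaddr >> 12) & 0x3ff;   /* AAAAAAAAAA: page table index  */
	uint32_t offset = vaddr & 0xfff;           /* BBBBBBBBBBBB: offset in page  */
	uint32_t entry  = page_table[index];
	if (!(entry & 1))
		return 0;                          /* not present: would page fault */
	return (entry & 0xfffff000) + offset;      /* frame address plus offset     */
}

For 0xc0000, index comes out to 192 and offset to 0, exactly as we worked out by hand.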

Okay okay, I know. This little example of creating a fake "virtual address" might seem
silly, but guess what? This is how its actually done! The actual format of a virtual
address is a little bit more complex in that there are three sections instead of 2.
However, if we omit the first section of the real virtual address format then it would
be exactally the same as our above example.

I hope by now you are starting to see how everything fits together, and the importance
of page tables.

Page Size
A system with smaller page sizes will require more pages than a system with larger
page sizes. Because the table keeps track of all pages, a system with smaller page sizes
will also require a larger page table, because there are more pages to keep track of.
Simple enough, huh?

The i86 architecture supports 4MB pages (2MB pages if using Physical Address
Extension (PAE)) and 4KB pages.

The important thing to note is how page size affects the size of page tables.

The Page Directory Table (PDT)


Okay... We are almost done! A page table is a very powerful structure, as you have
seen. Remember our previous virtual address example? I gave an example of a virtual
addressing system where each virtual address was composed of two parts: a page table
entry and an offset into that page.

On the x86 architecture, the virtual address format actually uses three sections instead
of two: the entry number in a page directory table, the page table index, and the
offset into that page.

A Page Directory Table is nothing more than an array of Page Directory Entries. I
know, I know... How useless and non-informative was that last sentence? ;)

So, anyways, let's first look at a page directory entry. Then we will start looking at the
directory table and where it all fits in...

Page Directory Entries (PDEs)


Page directory entries provide a way to manage a single page table. Not only do
they contain the address of a page table, but they provide properties that we can use to
manage it. You will see how all of this fits in within the next section, so don't worry if
you don't understand it yet.

Page directory tables are structured very similarly to page tables. They are an array of
1024 entries, where the entries follow a specific bit format. The nice thing about the
format of page directory entries (PDEs) is that they follow almost the exact same
format that page table entries (PTEs) do (in fact they can be interchangeable). There
are only a few little bits of difference (pun intended ;) ).

Here is the format of a page directory entry:

 Bit 0 (P): Present flag


o 0: Page is not in memory
o 1: Page is present (in memory)
 Bit 1 (R/W): Read/Write flag
o 0: Page is read only
o 1: Page is writable
 Bit 2 (U/S):User mode/Supervisor mode flag
o 0: Page is kernel (supervisor) mode
o 1: Page is user mode. Cannot read or write supervisor pages
 Bit 3 (PWT):Write-through flag
o 0: Write back caching is enabled
o 1: Write through caching is enabled
 Bit 4 (PCD): Cache disabled
o 0: Page table will be cached
o 1: Page table will not be cached
 Bit 5 (A): Access flag. Set by processor
o 0: Page has not been accessed
o 1: Page has been accessed
 Bit 6 (D): Reserved by Intel
 Bit 7 (PS): Page Size
o 0: 4 KB pages
o 1: 4 MB pages
 Bit 8 (G): Global Page (Ignored)
 Bits 9-11 (AVAIL): Available for use
 Bits 12-31 (FRAME): Page Table Base address

A lot of the members here should look familiar from the page table entry (PTE) list that
we looked at earlier.

The Present, Read/Write, and access flags are the same as they were for PTEs;
however, they apply to a page table rather than a page.

The page size flag determines whether the pages inside of the page table are 4KB or 4MB.

The Page Table Base address bits contain the 4K-aligned address of a page table.
pde.h and pde.cpp - Abstracting Page Directory Entries
Similar to what we did with PTEs, we have created an interface to abstract PDEs in the
same manner.
enum PAGE_PDE_FLAGS {

	I86_PDE_PRESENT    = 1,          //00000000000000000000000000000001
	I86_PDE_WRITABLE   = 2,          //00000000000000000000000000000010
	I86_PDE_USER       = 4,          //00000000000000000000000000000100
	I86_PDE_PWT        = 8,          //00000000000000000000000000001000
	I86_PDE_PCD        = 0x10,       //00000000000000000000000000010000
	I86_PDE_ACCESSED   = 0x20,       //00000000000000000000000000100000
	I86_PDE_DIRTY      = 0x40,       //00000000000000000000000001000000
	I86_PDE_4MB        = 0x80,       //00000000000000000000000010000000
	I86_PDE_CPU_GLOBAL = 0x100,      //00000000000000000000000100000000
	I86_PDE_LV4_GLOBAL = 0x200,      //00000000000000000000001000000000
	I86_PDE_FRAME      = 0x7FFFF000  //01111111111111111111000000000000
};

//! a page directory entry
typedef uint32_t pd_entry;

Not too hard. We use the new type pd_entry to represent a page directory entry. As
with the PTE interface, we provide a small set of routines used to provide a nice way of
setting and getting the bits within a page directory entry:
extern void pd_entry_add_attrib (pd_entry* e, uint32_t attrib);
extern void pd_entry_del_attrib (pd_entry* e, uint32_t attrib);
extern void pd_entry_set_frame (pd_entry*, physical_addr);
extern bool pd_entry_is_present (pd_entry e);
extern bool pd_entry_is_user (pd_entry);
extern bool pd_entry_is_4mb (pd_entry);
extern bool pd_entry_is_writable (pd_entry e);
extern physical_addr pd_entry_pfn (pd_entry e);
extern void pd_entry_enable_global (pd_entry* e);

Understanding the Page Directory Table


The Page Directory Table is sort of like an array of 1024 page tables. Remember that
each page table manages 4MB of virtual address space? Well... putting 1024 page
tables together, we can manage a full 4GB of virtual addresses. Sweet, huh?

Okay, it's a little more complex than that, but not by much. The Page Directory
Table is actually an array of 1024 page directory entries that follow the format
above. Look back at the format of an entry and notice the Page Table Base
address bits. This is the address of the page table this directory entry manages.

It may be easier to see it visually, so here you go:


Notice what is happening here. Each page directory entry points to a page table.
Remember that each page manages 4KB of physical (and hence virtual) memory? Also,
remember that a page table is nothing more than an array of 1024 pages? 1024 * 4KB =
4MB. This means that each page table manages its own 4MB of address space.

Each page directory entry provides us a way to manage each page table much more
easily. Because the complete page directory table is an array of 1024 directory entries,
and each entry manages its own table, we effectively have 1024 page tables. From our
previous calculation we know each page table manages 4MB of address space, so 1024
page tables * 4MB = 4GB of virtual address space.

I guess that's it for... believe it or not... everything. See, it's not that hard, is it? In the
next section, we will be revisiting the real format of an x86 virtual address, and you will
get to see how everything works together!

Use in Multitasking
We run into a small problem here. Remember that a page directory table represents a
4GB address space? How can we allow multiple programs a 4GB address space if we can
only have one page directory at a time?

We can't. Not natively, anyway. A lot of multitasking operating systems map the high 2
GB of the address space for their own use as "kernel space" and the low 2 GB as "user
space". The user space cannot touch kernel space. With the kernel address space being
mapped into every process's 4GB virtual address space, we can simply switch the current
page directory without error using the kernel, no matter what process is currently running.
This is possible due to the kernel always being located at the same place in each process's
address space. This also makes scheduling possible. More on that later, though...

Virtual Memory Management


We have covered everything we need to develop a good virtual memory manager. A
virtual memory manager must provide methods to allocate and manage pages, page
tables, and page directory tables. We have looked at each of these separately, but
have not looked at how they work together.

Higher Half Kernels


Abstract

A Higher Half Kernel is a kernel that has a virtual base address of 2GB or above. A lot
of operating systems have a higher half kernel; examples include the Windows and
Linux kernels. The Windows kernel gets mapped to either the 2GB or 3GB virtual
address (depending on whether the /3gb kernel switch is used), and the Linux kernel
gets mapped to the 3GB virtual address. This series uses a higher half kernel mapped to
3GB. Higher half kernels must be mapped properly into the virtual address space. There
are several methods to achieve this, some of which are listed here.

You might wonder why we would want a higher half kernel. We could very well
run our kernel at some lower virtual address. One reason has to do with v86 tasks. If
you want to support v86 tasks, they can only run in user mode and within the real
mode address limits (0xffff:0xffff, or about 1MB + 64k linear address). It is also typical
to run user mode programs in the first 2GB (or 3GB on some OSs), as software typically
never has a need to access high memory locations.

Method 1

The first design option is to have the boot loader set up a temporary page directory.
With this, the base address of the kernel can be 3GB. The boot loader maps a physical
address (typically 1MB) to this base address and calls the kernel's entry point.

This method works, but it creates the problem of how the kernel is going to manage
virtual memory. The kernel can either try to work with the page directory and
tables set up by the boot loader, or create a new page directory to manage. If we create
a new page directory, the kernel will need to remap itself (1MB physical to the base
virtual address of the kernel) or clone the existing temporary page directory into the
new page directory.

At this time, this is the method the series uses. The series boot loader sets up a
temporary page directory and maps the kernel to 3GB virtual. The kernel then creates a
new page directory during VMM initialization and remaps itself. The kernel must remain
position-independent during this setup phase.

Method 2

Another possible design is for the boot loader to load the kernel into a physical memory
location and keep paging disabled. The kernel's virtual base address would be the virtual
address it is supposed to execute at. For example, the boot loader can load and execute
the kernel at 1MB physical, although the kernel's base address is 3GB.

This method is a little tricky. There has to be a way for the boot loader to know what
physical address to load and execute the kernel at, and the kernel has to map itself to
its real base virtual address. This is usually done during kernel startup in position-
independent code. It can also be done in position-dependent code, but then the kernel
must be able to fix up the addresses when accessing data or calling functions. This is
the method used in our in-house OS.

Method 3

This method uses Tim Robinson's GDT trick, which can be found in his documentation
located here (*.pdf). This allows your kernel to run at a higher address (its base
address) even though it is not loaded there. The trick works due to address wraparound.
For example, let's say our kernel is loaded at the 1MB physical address, but we want it
to appear to be running at 3GB virtual. The base that we want is X, where X + 3GB =
1MB. Let's look closer.

Remember that the GDT descriptor base address is a DWORD. If the value becomes
greater than 0xffffffff, it will wrap around back to 0. 3GB = 0xC0000000, and 0xffffffff -
0xc0000000 = 0x3FFFFFFF bytes are left until it wraps. We need a base that makes
addresses point to our physical location (1MB). Knowing we have 0x3FFFFFFF bytes left
until our DWORD wraps back to 0, we can add 0x100000 (1MB) + 0x3FFFFFFF
= 0x400FFFFF, + 1 = 0x40100000.
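
A quick sanity check of that arithmetic in C (my own snippet; 32-bit unsigned addition wraps modulo 2^32 exactly like the descriptor base does):

#include <stdint.h>
#include <stdio.h>

int main (void) {
	uint32_t base = 0x40100000;    /* the selector base computed above */
	uint32_t virt = 0xC0000000;    /* 3GB virtual address              */
	uint32_t phys = base + virt;   /* wraps around modulo 2^32         */
	printf ("0x%08X\n", phys);     /* prints 0x00100000, i.e. 1MB      */
	return 0;
}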

So, by using the above example, if our kernel is loaded at the 1MB physical address but
has a real base address of 3GB virtual, we can create a temporary GDT with a base of
0x40100000 for the code and data selectors. The processor automatically adds the
selector base address to every address it accesses. After using LGDT to install this new
GDT, we are now running at 3GB. This works because the processor adds the cs and ds
selector base (0x40100000) to whatever address is being referenced. For example, 3GB
would be translated by the processor as 3GB + the selector base (0x40100000), which
wraps around to 1MB physical.

This trick is fairly easy to implement and works well, but it won't work in 64-bit long
mode. After the kernel performs this trick, it can set up its page directory and map itself
with ease, after which it can enable paging.

Virtual Addressing and Mapping Addresses


When we enable paging, all memory references will be treated as virtual addresses.
This is very important to know. It means we must set up the paging structures properly
before enabling paging. If we do not, we can run into an immediate triple fault--with or
without valid exception handlers.

Remember the format of a virtual address? This is the format of an x86 virtual
address:

AAAAAAAAAA BBBBBBBBBB CCCCCCCCCCCC
directory index page table index offset into page

This is very important! It tells the processor (and *us*) a lot of information.

The directory index portion tells us what index into the current page directory to
look in. Look back at the directory entry structure format in the previous
section. Notice that each directory table entry contains a pointer to a page
table. You can also see this within the image in that section.

Because each index within the directory table points to a page table, this tells us what
page table we are accessing.

The page table index portion tells us what page entry within this page table we are
accessing.

...And remember that each page entry manages a full 4KB of physical address space?
The offset into page portion tells us what byte within this page's physical address
space we are referencing.

Notice what happened here. We have just translated a virtual address into a physical
address using our page tables. Yes, it's that easy. No trickery involved.

Let's look at another example. Let's assume that virtual address 0xC0000000 is to be
mapped to physical address 0x100000. How do we do this? We need to find the page in
our structures that 0xC0000000 refers to -- just like we did above. In this case
0xC0000000 is the virtual address, so let's look at its format:

1100000000 0000000000 000000000000 ; 0xC0000000 in binary form
AAAAAAAAAA BBBBBBBBBB CCCCCCCCCCCC
directory index page table index offset into page

Remember that the directory index tells us what page table we are accessing within the
page directory table? So 1100000000b (the directory index) = 768, the 768th page table.

Remember that the page table index is the page we are accessing within this page
table? That is 0, so it's the first page. Also note that the offset byte in this page is 0.

Now, all we need to do is set the frame address of the first page in the 768th page
table to 0x100000 and voila! You have just mapped the 3GB virtual address to 1MB
physical! Knowing that each page is 4KB aligned, we can keep doing this in increments
of 4KB physical addresses.
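
In code, that mapping might look like this minimal sketch (assuming two statically allocated, 4K-aligned tables on a 32-bit target; the names are my own, not the series' API):

#include <stdint.h>

/* hypothetical 4K-aligned tables (the alignment attribute is omitted for brevity) */
uint32_t page_directory[1024];
uint32_t page_table_768[1024];

void map_3gb_to_1mb (void) {
	/* first page of the 768th table: frame 0x100000, present | writable */
	page_table_768[0] = 0x100000 | 3;

	/* the directory index of 0xC0000000 is its top 10 bits: 0xC0000000 >> 22 = 768 */
	page_directory[0xC0000000 >> 22] = (uint32_t) &page_table_768[0] | 3;
}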

Identity Mapping
Identity mapping is nothing more than mapping a virtual address to the same physical
address. For example, virtual address 0x100000 is mapped to physical address
0x100000. Yep--that's all there is to it. The only time this is really required is when first
setting up paging. It ensures that the memory addresses of your currently running code
stay the same when paging is enabled. Not doing this will result in an immediate triple
fault. You will see an example of this in our Virtual Memory Manager initialization
routine.

Memory Management: Implementation


Implementation
I suppose that is everything. What we will look at next is the virtual memory manager
(VMM) itself that has been developed for this tutorial. This will bring everything that we
have looked at together so that you can see how everything works.

I have tried to keep the routines small so that we can focus on one topic at a time, as
there are a couple of new things that we still need to look at.

Alrighty... First let's take a look at the page table and directory table structures themselves:

//! virtual address
typedef uint32_t virtual_addr;

//! i86 architecture defines 1024 entries per table--do not change
#define PAGES_PER_TABLE 1024
#define PAGES_PER_DIR 1024

#define PAGE_DIRECTORY_INDEX(x) (((x) >> 22) & 0x3ff)
#define PAGE_TABLE_INDEX(x) (((x) >> 12) & 0x3ff)
#define PAGE_GET_PHYSICAL_ADDRESS(x) (*x & ~0xfff)

//! page table represents 4mb address space
#define PTABLE_ADDR_SPACE_SIZE 0x400000

//! directory table represents 4gb address space
#define DTABLE_ADDR_SPACE_SIZE 0x100000000

//! page sizes are 4k
#define PAGE_SIZE 4096

//! page table
struct ptable {

	pt_entry m_entries[PAGES_PER_TABLE];
};

//! page directory
struct pdirectory {

	pd_entry m_entries[PAGES_PER_DIR];
};
Similar to our physical_addr type, I created a new address type for virtual memory--
virtual_addr. Notice that a page table is nothing more than an array of 1024 page
table entries? Same thing with the page directory table, but it's an array of page
directory entries instead. Nothing special yet ;)

PAGE_DIRECTORY_INDEX, PAGE_TABLE_INDEX, and
PAGE_GET_PHYSICAL_ADDRESS are macros that just return the respective portion
of a virtual address. Remember that a virtual address has a specific format; these
macros allow us to extract that information from the virtual address.

PTABLE_ADDR_SPACE_SIZE represents the size (in bytes) of the address space that a
page table represents. A page table is 1024 pages, where a page is 4K in size, so it is
1024 * 4K = 4MB. DTABLE_ADDR_SPACE_SIZE represents the number of bytes a page
directory manages, which is the size of the virtual address space. Knowing a page table
represents 4MB of the address space, and that a page directory contains 1024 page
tables, 4MB * 1024 = 4GB.

The virtual memory manager presented here does not handle large pages. Instead, it
only manages 4K pages.

The Virtual Memory Manager (VMM) we use relies on these structures heavily. Let's take
a look at some of the routines in the VMM to learn how they work.

vmmngr_alloc_page () - allocates a page in physical memory

To allocate a page, all we need to do is allocate a 4K block of physical memory for the
page to refer to, then simply create a page table entry from it:
bool vmmngr_alloc_page (pt_entry* e) {

	//! allocate a free physical frame
	void* p = pmmngr_alloc_block ();
	if (!p)
		return false;

	//! map it to the page
	pt_entry_set_frame (e, (physical_addr)p);
	pt_entry_add_attrib (e, I86_PTE_PRESENT);

	return true;
}
Notice how our PTE routines make this much easier to do? The above sets the PRESENT
bit in the page table entry and sets its FRAME address to point to our allocated block of
memory. Thus the page is present and points to a valid block of physical memory and is
ready for use. Cool, huh?

Also, notice how we "map" the physical address to the page. All this means is that we
set the page to point to a physical address. Thus the page is "mapped" to that address.
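
As a quick illustration (hypothetical usage, not part of the VMM itself):

pt_entry entry = 0;

if (!vmmngr_alloc_page (&entry)) {
	//! out of physical memory--handle the failure
}

//! entry now has its PRESENT bit set and its frame pointing at a
//! valid 4K block of physical memory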

vmmngr_free_page () - frees a page in physical memory

To free a page is even easier. Simply free the block of memory using our physical
memory manager, and clear the page table entry's PRESENT bit (marking it NOT
PRESENT):
void vmmngr_free_page (pt_entry* e) {

	void* p = (void*)pt_entry_pfn (*e);
	if (p)
		pmmngr_free_block (p);

	pt_entry_del_attrib (e, I86_PTE_PRESENT);
}
That's it! Now that we have a way to allocate and free a single page, let's see if we can
put them together in full page tables...

vmmngr_ptable_lookup_entry () - get page table entry from page table by address

Now that we have a way of obtaining the page table entry index from a virtual
address, we need a way to get the entry from the page table. This routine does just that! It
uses the PAGE_TABLE_INDEX macro to convert the virtual address into an index into the page table
array, and returns the page table entry from it.
inline pt_entry* vmmngr_ptable_lookup_entry (ptable* p, virtual_addr addr) {

	if (p)
		return &p->m_entries[ PAGE_TABLE_INDEX (addr) ];
	return 0;
}
Because this routine returns a pointer, we can modify the entry as much as we need to
as well. Cool?

That's it for the page table routines. See how easy paging is? ;)

Next up...The page directory routines!

vmmngr_pdirectory_lookup_entry () - get directory entry from directory table by address

Now that we have a way to convert a virtual address into a page directory table index,
we need to provide a way to get the page directory entry from it. This is exactly the
same as its page table counterpart, except that it uses the page directory index:

inline pd_entry* vmmngr_pdirectory_lookup_entry (pdirectory* p, virtual_addr addr) {

	if (p)
		return &p->m_entries[ PAGE_DIRECTORY_INDEX (addr) ];
	return 0;
}

vmmngr_switch_pdirectory () - switch to a new page directory

Notice how small all of these routines are. They provide a minimal but very effective
interface for easily working with page tables and directories. When we set up a page
directory, we need to provide a way to install it for our use.

In the previous tutorial, we added two routines, pmmngr_load_PDBR() and
pmmngr_get_PDBR(), to set and get the Page Directory Base Register (PDBR). This is
the register that stores the current page directory table. On the x86 architecture, the
PDBR is the cr3 processor register. Thus, these routines simply set and get the cr3
register.
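
If you don't have the previous tutorial handy, here is a minimal sketch of what those
two routines might look like, assuming the MSVC inline assembly style used elsewhere
in this chapter:

void pmmngr_load_PDBR (physical_addr addr) {

#ifdef _MSC_VER
	_asm {
		mov eax, addr
		mov cr3, eax	// the PDBR is cr3 on i86
	}
#endif
}

physical_addr pmmngr_get_PDBR () {

#ifdef _MSC_VER
	_asm {
		mov eax, cr3	// return value travels back in eax
	}
#endif
}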

vmmngr_switch_pdirectory () uses these routines to load the PDBR and set the current
directory:

//! current directory table (global)
pdirectory* _cur_directory = 0;

inline bool vmmngr_switch_pdirectory (pdirectory* dir) {

	if (!dir)
		return false;

	_cur_directory = dir;

	//! _cur_pdbr holds the physical address of the directory table;
	//! it is set by our initialization routine before this is called
	pmmngr_load_PDBR (_cur_pdbr);
	return true;
}

pdirectory* vmmngr_get_directory () {

	return _cur_directory;
}

vmmngr_flush_tlb_entry () - flushes a TLB entry

Remember how the TLB caches the current page table? Sometimes it may be necessary
to flush (invalidate) the TLB or individual entries so that it gets updated to the
current value. This may be done automatically by the processor (like during a mov
instruction involving a control register).

The processor also provides a method for us to manually flush individual TLB entries
ourselves. This is done using the INVLPG instruction.

We simply pass it the virtual address and the resulting page entry will be invalidated:

void vmmngr_flush_tlb_entry (virtual_addr addr) {

#ifdef _MSC_VER
	_asm {
		cli
		mov eax, addr
		invlpg [eax]	// invalidate the page containing the target address
		sti
	}
#endif
}
Keep in mind that INVLPG is a privileged instruction. Thus you must be running
in supervisor mode to use it.

vmmngr_map_page () - maps pages

This is one of the most important routines. This routine allows us to map any physical
address to a virtual address. It's a little complicated, so let's break it down:
void vmmngr_map_page (void* phys, void* virt) {

	//! get page directory
	pdirectory* pageDirectory = vmmngr_get_directory ();

	//! get page table
	pd_entry* e = &pageDirectory->m_entries [PAGE_DIRECTORY_INDEX ((uint32_t) virt) ];
	if ( (*e & I86_PDE_PRESENT) != I86_PDE_PRESENT) {

We are given a physical and virtual address as parameters. The first thing that must be
done is to verify that the page directory entry that this virtual address is located in is
valid (that is, it has been allocated before and its PRESENT bit is set).

The page directory index is part of the virtual address itself, so we use
PAGE_DIRECTORY_INDEX() to obtain it. Then we just index into the page directory
array to obtain a pointer to the page directory entry. Then we test whether its
I86_PDE_PRESENT bit is set or not. If it is not set, then the page directory entry
does not exist, so we must create it...

		//! page table not present, allocate it
		ptable* table = (ptable*) pmmngr_alloc_block ();
		if (!table)
			return;

		//! clear page table
		memset (table, 0, sizeof(ptable));

		//! create a new entry
		pd_entry* entry = &pageDirectory->m_entries [PAGE_DIRECTORY_INDEX ( (uint32_t) virt) ];

		//! map in the table (can also just do *entry |= 3 to enable these bits)
		pd_entry_add_attrib (entry, I86_PDE_PRESENT);
		pd_entry_add_attrib (entry, I86_PDE_WRITABLE);
		pd_entry_set_frame (entry, (physical_addr)table);
	}

The first thing the above does is allocate a new page for the new page table and
clear it. Afterwards, it uses PAGE_DIRECTORY_INDEX() again to get the directory
index from the virtual address, and indexes into the page directory to get a pointer to
the page directory entry. Then it sets that entry to point to our newly allocated page
table, and sets its PRESENT and WRITABLE bits so that it can be used.

At this point, the page table is guaranteed to be valid at that virtual address. So the
routine now just needs to map the address...

	//! get table
	ptable* table = (ptable*) PAGE_GET_PHYSICAL_ADDRESS ( e );

	//! get page
	pt_entry* page = &table->m_entries [ PAGE_TABLE_INDEX ( (uint32_t) virt) ];

	//! map it in (can also do *page |= 3 to enable these bits)
	pt_entry_set_frame ( page, (physical_addr) phys);
	pt_entry_add_attrib ( page, I86_PTE_PRESENT);
}

The above calls PAGE_GET_PHYSICAL_ADDRESS() to get the physical frame that the
page directory entry points to, which is the page table. Then, using PAGE_TABLE_INDEX
to get the page table index from the virtual address, it indexes into the page table to
obtain the page table entry. Finally, it sets the page to point to the physical address and
sets the page's PRESENT bit.
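
As a quick, hypothetical usage example, mapping the physical frame at 2MB to the
virtual address 2MB above the kernel's 3GB base is a one-liner:

vmmngr_map_page ((void*) 0x200000, (void*) 0xc0200000);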

vmmngr_initialize () - initialize the VMM

This is an important routine. It uses all of the above routines (well, most of them ;) )
to set up the default page directory, install it, and enable paging. We can also use it
as an example of how everything works and fits together. Because this routine creates a
new page directory, we also need to map 1MB physical to 3GB virtual in order for the
kernel to keep running where it expects to be.

This is a fairly big routine, so let's break it down and see what's going on:

void vmmngr_initialize () {

	//! allocate page table for the kernel at 3gb
	ptable* table = (ptable*) pmmngr_alloc_block ();
	if (!table)
		return;

	//! allocate page table for the first 4mb (identity mapped)
	ptable* table2 = (ptable*) pmmngr_alloc_block ();
	if (!table2)
		return;

	//! clear both page tables
	vmmngr_ptable_clear (table);
	vmmngr_ptable_clear (table2);
Remember how page tables must be located at 4K aligned addresses? Thanks to our
physical memory manager (PMM), pmmngr_alloc_block() already does just this,
so we do not need to worry about it. Because a single allocated block is already 4K in
size, the page table has enough storage space for all of its entries as well (1024 page table
entries * 4 bytes per entry (the size of a page table entry) = 4K), so all we need is a single
block.

Afterwards we clear out the page tables to clean them up for our use.

	//! 1st 4mb are identity mapped
	for (int i=0, frame=0x0, virt=0x00000000; i<1024; i++, frame+=4096, virt+=4096) {

		//! create a new page
		pt_entry page=0;
		pt_entry_add_attrib (&page, I86_PTE_PRESENT);
		pt_entry_set_frame (&page, frame);

		//! ...and add it to the page table
		table2->m_entries [PAGE_TABLE_INDEX (virt) ] = page;
	}
This part's a little tricky. Remember that as soon as paging is enabled, all addresses
become virtual? This poses a problem. To fix this, we must map the virtual addresses
to the same physical addresses so they refer to the same thing. This is identity
mapping.
The above code identity maps the first 4MB of physical memory (an entire page table).
It creates a new page, sets its PRESENT bit, and sets the frame address we want the
page to refer to. Afterwards it converts the current virtual address we are mapping
(stored in "virt") to a page table index to set that page table entry.

We increment "frame" and "virt" for each page in the page table (indexed by "i") by 4K
(4096), as that is the block of memory each page references. (Remember that page
table index 0 references addresses 0 - 4095, index 1 references address 4096, and so
on?)

Here we run into a problem. Because the boot loader maps and loads the kernel directly
to 3GB virtual, we also need to map the area where the kernel resides:

	//! map 1mb to 3gb (where we are at)
	for (int i=0, frame=0x100000, virt=0xc0000000; i<1024; i++, frame+=4096, virt+=4096) {

		//! create a new page
		pt_entry page=0;
		pt_entry_add_attrib (&page, I86_PTE_PRESENT);
		pt_entry_set_frame (&page, frame);

		//! ...and add it to the page table
		table->m_entries [PAGE_TABLE_INDEX (virt) ] = page;
	}

This code is pretty much the same as the above loop and maps 1MB physical to 3GB
virtual. This is what maps the kernel into the address space and allows the kernel to
continue running at its 3GB virtual address.

	//! create default directory table
	pdirectory* dir = (pdirectory*) pmmngr_alloc_blocks (3);
	if (!dir)
		return;

	//! clear directory table and set it as current
	memset (dir, 0, sizeof (pdirectory));
The above creates a new page directory and clears it for our use.
	pd_entry* entry = &dir->m_entries [PAGE_DIRECTORY_INDEX (0xc0000000) ];
	pd_entry_add_attrib (entry, I86_PDE_PRESENT);
	pd_entry_add_attrib (entry, I86_PDE_WRITABLE);
	pd_entry_set_frame (entry, (physical_addr)table);

	pd_entry* entry2 = &dir->m_entries [PAGE_DIRECTORY_INDEX (0x00000000) ];
	pd_entry_add_attrib (entry2, I86_PDE_PRESENT);
	pd_entry_add_attrib (entry2, I86_PDE_WRITABLE);
	pd_entry_set_frame (entry2, (physical_addr)table2);
Remember that each page table represents a full 4MB virtual address space? Knowing
that each page directory entry points to a page table, we can safely say that each page
directory entry represents the same 4MB address space inside the 4GB virtual
address space of the entire directory table. The first entry in the page directory is for
the first 4MB, the second is for the next 4MB, and so on. Because we are only mapping
the first 4MB right now, all we need to do is set the first entry to point to our page
table.
In a similar way, we set up a page directory entry for 3GB. This is needed so we can
map the kernel in.

Notice that we also set each page directory entry's PRESENT and WRITABLE bits as
well. This will tell the processor that the page table is present and writable.

	//! store current PDBR
	_cur_pdbr = (physical_addr) &dir->m_entries;

	//! switch to our page directory
	vmmngr_switch_pdirectory (dir);

	//! enable paging
	pmmngr_paging_enable (true);
}
Now that the page directory is set up, we install the page directory and enable paging.
If everything worked as expected, your program should not crash. If it does not work, it
will probably triple fault.

Page Faults
As you know, as soon as we enable paging all addresses become virtual. All of these
virtual addresses rely heavily on the page table and page directory data structures.
This is fine, but there will be a lot of times when a virtual address requires the CPU to
access a page that is not yet valid. This is when a page fault exception (#PF) is
raised by the processor. A #PF occurs when a page is marked not present. A #PF is
also raised when a page is marked present but the access violates its protection--for
example, a write to a read-only page, or a user-mode access to a supervisor page.

A page fault is CPU interrupt 14, which also pushes an error code so that we can obtain
more information. The error code pushed by the processor has the following format:

- Bit 0:
  - 0: #PF occurred because the page was not present
  - 1: #PF occurred on a present page (a page-protection violation)
- Bit 1:
  - 0: Operation that caused the #PF was a read
  - 1: Operation that caused the #PF was a write
- Bit 2:
  - 0: Processor was running in ring 0 (kernel mode)
  - 1: Processor was running in ring 3 (user mode)
- Bit 3:
  - 0: #PF did not occur because reserved bits were overwritten
  - 1: #PF occurred because reserved bits were overwritten
- Bit 4:
  - 0: #PF did not occur during an instruction fetch
  - 1: #PF occurred during an instruction fetch

All other bits are 0.

When a #PF occurs, the processor also stores the address that caused the fault in
the CR2 register.
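
To make this concrete, here is a minimal sketch of what a bare-bones #PF handler
could look like at this stage. The handler name, the way the error code reaches us,
and the hang at the end are all assumptions that depend on your interrupt plumbing:

//! minimal #PF handler sketch--assumes an interrupt stub that passes
//! the error code in as a parameter
void page_fault_handler (uint32_t err_code) {

	//! the faulting address is stored in cr2
	uint32_t faulting_addr = 0;

#ifdef _MSC_VER
	_asm {
		mov eax, cr2
		mov [faulting_addr], eax
	}
#endif

	int present = err_code & 1;	//! 1: protection violation on a present page
	int write   = err_code & 2;	//! 1: the faulting operation was a write
	int user    = err_code & 4;	//! 1: the fault happened in user mode

	//! we cannot page in from disk yet, so just hang
	for (;;);
}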

Normally when a #PF occurs, an operating system will need to fetch the faulting page
of the currently running program from disk. This requires several different components
of an OS (disk driver, file system driver, volume/mount point management) that we do
not yet have. Because of this, we will return to page fault handling a little later, when
we have a more evolved OS.

Demo
This demo includes all of the source code in this tutorial, and more. It includes
paging code inside of the bootloader and kernel, the complete virtual memory
manager (VMM), and the code to map the kernel to the 3GB mark within its own virtual
address space.

There is nothing new visually with this demo, so there are no new pictures.
However, it does demonstrate the concepts described in this chapter in both assembly
language source (the bootloader's Paging.asm file) and C source (the VMM that we
have developed in this chapter).

DEMO DOWNLOAD

Conclusion
I am very glad to get this one done! We have covered a lot of information and ground in
this tutorial: virtual memory, virtual addressing and translation, paging methods, and
more. With this tutorial, we are not out of the paging woods yet! However, we can all
safely go to bed tonight knowing that we have a better understanding of what it is, how
it works, and how to work with it. See? It's not so bad :)

Inside of the next tutorial I am thinking about going back to the fun stuff with
developing a keyboard driver. Because we already have a form of output, and we will be
able to retrieve input, we may even make a simple command line as well ;)

Until next time,

~Mike
BrokenThorn Entertainment. Currently developing DoE and the Neptune Operating
System

Questions or comments? Feel free to Contact me.

Would you like to contribute and help improve the articles? If so, please let me know!


Managing Virtual Memory

Randy Kath
Microsoft Developer Network Technology Group
Created: January 20, 1993

Abstract
Determining which function or set of functions to use for managing memory in your application is
difficult without a solid understanding of how each group of functions works and the overall impact
they each have on the operating system. In an effort to simplify these decisions, this technical article
focuses on the virtual memory management functions: which ones are available, how they are used,
and how their use affects the operating system. The following topics are discussed in this article:

- Reserving, committing, and freeing virtual memory
- Changing protection on pages of virtual memory
- Locking pages of virtual memory
- Querying a process's virtual memory

A sample application called ProcessWalker accompanies this technical article on the Microsoft
Developer Network CD. This sample application is useful for exploring the virtual address space of a
process. It also employs virtual memory functions to implement a linked-list structure.

Introduction
This is one of three related technical articles—"Managing Virtual Memory," "Managing Memory-
Mapped Files," and "Managing Heap Memory"—that explain how to manage memory in applications
for Windows. In each article, this introduction identifies the basic memory components in the
Windows programming model and indicates which article to reference for specific areas of interest.
The first version of the Microsoft Windows operating system introduced a method of managing
dynamic memory based on a single global heap, which all applications and the system share, and
multiple, private local heaps, one for each application. Local and global memory management
functions were also provided, offering extended features for this new memory management system.
More recently, the Microsoft C run-time (CRT) libraries were modified to include capabilities for
managing these heaps in Windows using native CRT functions such as malloc and free. Consequently,
developers are now left with a choice—learn the new application programming interface (API)
provided as part of Windows or stick to the portable, and typically familiar, CRT functions for
managing memory in applications written for Windows.
The Windows API offers three groups of functions for managing memory in applications: memory-
mapped file functions, heap memory functions, and virtual memory functions.
Figure 1. The Windows API provides different levels of memory management for versatility in
application programming.
In all, six sets of memory management functions exist in Windows, as shown in Figure 1, all of which
were designed to be used independently of one another. So, which set of functions should you use?
The answer to this question depends greatly on two things: the type of memory management you
want and how the functions relevant to it are implemented in the operating system. In other words,
are you building a large database application where you plan to manipulate subsets of a large
memory structure? Or maybe you're planning some simple dynamic memory structures, such as linked
lists or binary trees? In both cases, you need to know which functions offer the features best suited to
your intention and exactly how much of a resource hit occurs when using each function.
Table 1 categorizes the memory management function groups and indicates which of the three
technical articles in this series describes each group's behavior. Each technical article emphasizes the
impact these functions have on the system by describing the behavior of the system in response to
using the functions.
Table 1. Memory Management Functions

Memory set: Virtual memory functions
System resources affected: a process's virtual address space, system pagefile, system memory, hard disk space
Related technical article: "Managing Virtual Memory"

Memory set: Memory-mapped file functions
System resources affected: a process's virtual address space, system pagefile, standard file I/O, system memory, hard disk space
Related technical article: "Managing Memory-Mapped Files"

Memory set: Heap memory functions
System resources affected: a process's virtual address space, system memory, process heap resource structure
Related technical article: "Managing Heap Memory"

Memory set: Global heap memory functions
System resources affected: a process's heap resource structure
Related technical article: "Managing Heap Memory"

Memory set: Local heap memory functions
System resources affected: a process's heap resource structure
Related technical article: "Managing Heap Memory"

Memory set: C run-time reference library
System resources affected: a process's heap resource structure
Related technical article: "Managing Heap Memory"

Windows Memory System Overview
Windows employs a page-based virtual memory system that uses linear addressing. Internally, the
system manages all memory in segments called pages. Each page of physical memory is backed by
either a pagefile for volatile pages of memory or a disk file for read-only memory pages. There can be
as many as 16 separate pagefiles at a time. Code, resources, and other read-only data are backed
directly by the files from which they originated.
Windows NT provides an independent, 2 gigabyte (GB) user address space for each application
(process) in the system. To the application, it appears that there is 2 GB of memory available,
regardless of the amount of physical memory that is actually available. When an application requests
more memory than is available, Windows NT satisfies the request by paging noncritical pages of
memory—from this and/or other processes—to a pagefile and freeing those physical pages of
memory. Conceptually, the global heap no longer exists in Windows NT. Instead, each process has a
private 32-bit address space from which all of the memory for the process is allocated—including
code, resources, data, DLLs (dynamic-link libraries), and dynamic memory. Realistically, the system is
still limited by whatever hardware resources are available, but the management of available resources
is performed independently of the applications in the system.

Virtual Memory
Windows NT makes a distinction between memory and address space. Each process is attributed 2 GB
of user address space no matter how much physical memory is actually available for the process. Also,
all processes use the same range of linear 32-bit addresses, from 0x00000000 through 0x7FFFFFFF,
regardless of what memory is available. Windows NT takes care of paging memory to and from disk at
appropriate times so that each process is sure to be able to address the memory it needs. Although
two processes may attempt to access memory at the same virtual address simultaneously, the
Windows NT virtual memory manager actually represents these two memory locations at different
physical locations where neither is likely to coincide with the original virtual address. This is virtual
memory.
Because of virtual memory, an application is able to manage its own address space without having to
consider the impact on other processes in the system. The memory manager in Windows NT is
responsible for seeing that all applications have enough physical memory to operate effectively at any
given moment. Applications for the Windows NT operating system do not have to be concerned with
sharing system memory with other applications as they did in Windows version 3.1 or earlier. Yet even
with their own address space, applications still have the ability to share memory with other
applications.
One benefit of distinguishing between memory and address space is the capability it provides to
applications for loading extremely large files into memory. Instead of having to read a large file into
memory, Windows NT provides support for the application to reserve the range of addresses that the
file needs. Then, sections of the file can be viewed (physically read into memory) as needed. The same
can be done for large allocations of dynamic memory through virtual memory support.
In previous versions of Windows, an application had to allocate memory before being able to
manipulate the addresses in that memory. In Windows NT, the address space of each process is
already allocated; whether there is any memory associated with the addresses in the address space is
a different issue. The virtual memory management functions provide low-level support for
independently managing both the addresses and memory of a process.
The key virtual memory functions are:

- VirtualAlloc and VirtualFree
- VirtualLock and VirtualUnlock
- VirtualQuery or VirtualQueryEx
- VirtualProtect or VirtualProtectEx

Each function is grouped with its counterpart if it has one. Memory is allocated using VirtualAlloc and,
once allocated, must be freed with VirtualFree. Similarly, pages that have been locked
with VirtualLock must be unlocked with VirtualUnlock when no longer
needed. VirtualQuery and VirtualProtect have no counterparts, but they both have complementary
functions (indicated by the Ex extension on the function names) that allow them to be used on
processes other than the calling process, if the calling process has the appropriate privilege to do so.
These functions are explained below in their appropriate context.

Free, Reserved, and Committed Virtual Memory
Every address in a process can be thought of as either free, reserved, or committed at any given time.
A process begins with all addresses free, meaning they are free to be committed to memory or
reserved for future use. Before any free address may be used, it must first be allocated as reserved or
committed. Attempting to access an address that is either reserved or free generates an access
violation exception.
The entire 2 GB of addresses in a process are either free for use, reserved for future use, or committed
to specific memory (in use). Figure 2 represents a hypothetical process consisting of free, reserved,
and committed addresses.

Figure 2. A process's 2 GB of virtual address space is divided into regions of free, reserved, and
committed memory locations.
Reserved Addresses
When reserving addresses in a process, no pages of physical memory are committed, and perhaps
more importantly, no space is reserved in the pagefile for backing the memory. Also, reserving a range
of addresses is no guarantee that at a later time there will be physical memory available to commit to
those addresses. Rather, it is simply saving a specific free address range until needed, protecting the
addresses from other allocation requests. Without this type of protection, routine operations such as
loading a DLL or resource could occupy specific addresses and jeopardize their availability for later use.
Reserving addresses is a quick operation, completely independent of the size of the address range
being reserved. Whether reserving a 1 GB or a 4K range of addresses, the function is relatively speedy.
This is not surprising considering that no resources are allocated during the operation. The function
merely makes an entry into the process's virtual address descriptor (VAD) tree.
To reserve a range of addresses, invoke the VirtualAlloc function as shown in the following code
fragment:
/* Reserve a 10 MB range of addresses */
lpBase = VirtualAlloc (NULL,
10485760,
MEM_RESERVE,
PAGE_NOACCESS);

As shown here, a value of NULL used for the first parameter, lpAddress, directs the function to reserve
the range of addresses at whichever location is most convenient. Alternatively, a specific address could
have been passed indicating a precise starting address for the reserved range. Either way, the return
value to this function indicates the address at the beginning of the reserved range of addresses,
unless the function is unable to complete the request, in which case VirtualAlloc returns
NULL (extended error information is available from GetLastError).
The second parameter indicates the range of addresses the function should allocate. This value can be
anywhere from one page to 2 GB in size, but VirtualAlloc is actually constrained to a smaller range
than that. The minimum size that can be reserved is 64K, and the maximum that can be reserved is the
largest contiguous range of free addresses in the process. Requesting one page of reserved addresses
results in a 64K address range. Conversely, requesting 2 GB will certainly fail because it is not possible
to have that much address space free at any given time. (Remember that the act of loading an
application consumes part of the initial 2 GB address space.)
Note

Windows NT builds a safeguard into every process's address space. Both the upper and lower 65,536
bytes of each process are permanently reserved by the system. These portions of the address space
are reserved to trap stray pointers—pointers that attempt to address memory in the range
0x00000000-0x0000FFFF or 0x7FFF0000-0x7FFFFFFF. Not coincidentally, it is easy to detect pointers in
these ranges by simply ignoring the lower four nibbles (the rightmost two bytes) in these addresses.
Essentially, a pointer is invalid if the upper four nibbles are 0x0000 or 0x7FFF; all other values
represent valid addresses.
The final two parameters in the VirtualAlloc function, dwAllocationType and dwProtect, are used to
determine how to allocate the addresses and the protection to associate with them. Addresses can be
allocated as either type MEM_COMMIT or MEM_RESERVE. PAGE_READONLY, PAGE_READWRITE, and
PAGE_NOACCESS are the three protections that can be applied to virtual memory. Reserved addresses
are always PAGE_NOACCESS, a default enforced by the system no matter what value is passed to the
function. Committed pages can be either read-only, read-write, or no-access.
Committed Memory
To use reserved addresses, memory must first be committed to the addresses. Committing memory to
addresses is similar to reserving it—call VirtualAlloc with the dwAllocationType parameter equal to
MEM_COMMIT. At this point, resources become committed to addresses. Memory can be committed
as little as one page at a time. The maximum amount of memory that can be committed is based
solely on the maximum range of contiguous free or reserved addresses (but not a combination of
both), regardless of the amount of physical memory available to the system.
When memory is committed, physical pages of memory are allocated and space is reserved in a
pagefile. That is, pages of committed memory always exist as either physical pages of memory or as
pages that have been paged to the pagefile on disk. It is also possible that, while committing a chunk
of memory, part or all of that memory will not reside in physical memory initially. Some pages of
memory reside initially in the pagefile until accessed. Once pages of memory are committed, the
virtual memory manager treats them like all other pages of memory in the system.
In the Windows NT virtual memory system, page tables are used to access physical pages of memory.
Each page table is itself a page of memory, like committed pages. Occasionally, when committing
memory, additional pages must be allocated for page tables at the same time. So a request to commit
a page of memory can require one page commitment for a page table, one page for the requested
page, and two pages of space in the pagefile to back each of these pages. Consequently, the time it
takes VirtualAlloc to complete a memory-commit request varies widely, depending on the state of
the system and the size of the request.
The following example demonstrates how to commit a specific page of reserved addresses from the
previous example to a page of memory.
/* Commit memory for 3rd page of addresses. */
lpPage3 = VirtualAlloc (lpBase + (2 * 4096),
4096,
MEM_COMMIT,
PAGE_READWRITE);

Notice that instead of specifying NULL for lpAddress, a specific address is given to indicate exactly
which page of reserved addresses becomes committed to memory. Also, this page of memory is
initially given PAGE_READWRITE protection instead of PAGE_NOACCESS as in the previous example.
The return address from the function is the virtual address of the first page of committed addresses.
Freeing Virtual Memory
Once addresses have been allocated as either reserved or committed, VirtualFree is the only way to
release them—that is, return them to free addresses. VirtualFree can also be used to decommit
committed pages and, at the same time, return the addresses to reserved status. When decommitting
addresses, all physical memory and pagefile space associated with the addresses is released. The
following example demonstrates how to decommit the page of memory committed in the previous
example.
/* Decommit memory for 3rd page of addresses. */
VirtualFree (lpBase + (2 * 4096),
             4096,
             MEM_DECOMMIT);

Only addresses that are committed can be decommitted. This is important to remember when you
need to decommit a large range of addresses. Say, for example, you have a range of addresses where
several subsets of the addresses are committed and others are reserved. The only way to make the
entire range reserved is to independently decommit each subset of committed addresses one by one.
Attempting to decommit the entire range of addresses will fail because reserved addresses cannot be
decommitted.
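
One way to do that (a sketch, not code from the original article; lpBase and dwTotal
are assumed to describe the mixed range) is to walk the range with VirtualQuery and
decommit each committed subregion as it is found:

/* Decommit every committed subregion within a larger range. */
MEMORY_BASIC_INFORMATION mbi;
LPBYTE lpAddr = (LPBYTE)lpBase;

while (lpAddr < (LPBYTE)lpBase + dwTotal) {
    VirtualQuery (lpAddr, &mbi, sizeof (mbi));
    if (mbi.State == MEM_COMMIT)
        VirtualFree (mbi.BaseAddress, mbi.RegionSize, MEM_DECOMMIT);
    lpAddr = (LPBYTE)mbi.BaseAddress + mbi.RegionSize;
}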
Conversely, the same range of addresses can be freed in one fell swoop. It doesn't matter what the
state of an address is when the address is freed. The following example demonstrates freeing the 10
MB range of addresses reserved in the first example.
/* Free entire 10 MB range of addresses. */
VirtualFree (lpBase,
             0,            /* size must be 0 when releasing */
             MEM_RELEASE);
Changing Protection on Pages of Virtual Memory
Use the VirtualProtect function as a method for changing the protection on committed pages of
memory. An application can, for example, commit a page of addresses as PAGE_READWRITE and
immediately fill the page with data. Then, the protection on the page could be changed to
PAGE_READONLY, effectively protecting the data from being overwritten by any thread in the process.
The following example uses the VirtualProtect function to make an inaccessible page available.
/* Change page protection to read/write. */
VirtualProtect (lpStack + 4096,
4096,
PAGE_READWRITE,
lpdwOldProt);

Consider the following as a context for using this function. A data-buffering application receives a
varying flow of data. Depending on specific hardware configurations and other software applications
competing for CPU time, the flow of data may at times exceed the capability of the process. To
prevent this from happening, the application designs a memory system that initially commits some
pages of memory for a buffer. The application then protects the upper page of memory with
PAGE_NOACCESS protection so that any attempt to access this memory generates an exception. The
application also surrounds this code with an exception handler to handle access violations.
When an access violation exception occurs, the application is able to determine that the buffer is
approaching its upper limit. It responds by changing the protection on the page to PAGE_READWRITE,
allowing the buffer to receive any additional data and continue uninterrupted. At the same time, the
application spawns another thread to slow the data flow until the buffer is back down to a reasonable
operating range. When things are back to normal, the upper page is returned to PAGE_NOACCESS
and the additional thread goes away. This scenario describes how combining page protection and
exception handling can be used to provide unique memory management opportunities.
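
Here is a minimal sketch of that scenario using structured exception handling; all
names (lpUpperPage, BufferFilter, and so on) are illustrative assumptions rather than
code from the article:

LPVOID lpUpperPage;    /* buffer page protected with PAGE_NOACCESS */

/* Exception filter: reopen the guard page, then resume the write. */
int BufferFilter (DWORD dwCode)
{
    DWORD dwOldProt;

    if (dwCode != EXCEPTION_ACCESS_VIOLATION)
        return EXCEPTION_CONTINUE_SEARCH;

    VirtualProtect (lpUpperPage, 4096, PAGE_READWRITE, &dwOldProt);
    /* ...signal another thread to slow the data flow... */
    return EXCEPTION_CONTINUE_EXECUTION;  /* retry the faulting write */
}

void WriteToBuffer (LPBYTE lpDest, BYTE data)
{
    __try {
        *lpDest = data;
    }
    __except (BufferFilter (GetExceptionCode ())) {
        /* not reached; the filter resumes execution instead */
    }
}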

Locking Pages of Virtual Memory
Processes in Windows NT have a minimal set of pages called a working set that, in order for the
process to run properly, must be present in memory when running. Windows NT assigns a default
number of pages to a process at startup and gradually tunes that number to achieve a balanced
optimum performance among all active processes in the system. When a process is running (actually,
when the threads of a process are running), Windows NT works hard at making sure that the process
has its working set of pages resident in physical memory at all times.
Processes in Windows NT are granted subtle influence into this system behavior with
the VirtualLock and VirtualUnlock functions. Essentially, a process can establish specific pages to
lock into its working set. However, this does not give the process free rein over its working set. It
cannot affect the number of pages that make up its working set (the system adjusts the working set
for each process routinely), and it cannot control when the working set is in memory and when it is
not. The maximum number of pages that can be locked into a process's working set at one time is
limited to 32. An application could do more harm than good by locking pages of committed memory
into the working set because doing so may force other critical pages in the process to become
replaced. In that case, the pages could become paged to disk, causing page faults to occur whenever
they were accessed. Then the process would spend much of its CPU allotment just paging critical
pages in and out of memory.
Below is an example that locks a range of addresses into memory when the process is running.
/* Lock critical addresses into memory. */
VirtualLock (lpCriticalData, 1024);

Notice the range of addresses being locked into memory in this example is less than one page. It is
not necessary for the entire range to be in a single page of memory. The net result is that the entire
page of memory containing the data for the addresses, not just the data for the addresses indicated, is
locked into memory. If the data straddles a page boundary, both pages are locked.

Querying a Process's Virtual Memory
Given a process's 2 GB of address space, managing the entire range of addresses would be difficult
without the ability to query address information. Because the addresses themselves are represented
independent of the memory that may or may not be committed to them, querying them is simply a
matter of accessing the data structure that maintains their state. In Windows NT, this structure is the
virtual address descriptor tree mentioned earlier. Windows exposes the capability of "walking the VAD
structure" in the VirtualQuery and VirtualQueryExfunctions. Again, the Ex suffix indicates which
function can be called from one process to query another—if the calling process has the security
privilege necessary to perform this function. The following example is extracted from the
ProcessWalker sample:
/* Query next region of memory in child process. */
VirtualQueryEx (hChildProcess,
lpMem,
lpList,
sizeof (MEMORY_BASIC_INFORMATION));

The ProcessWalker application's primary function is to walk a process's address space, identifying each
of its distinct address regions and representing specific state information about each region. It does
this by enumerating each region one at a time from the bottom of the process to the top. lpMem is
used to indicate the location of each region. Initially it is set to 0, and after returning from each query
of a new region, it is incremented by the size of the region it queried. This process is repeated
until lpMem reaches the upper system reserved area.
lpList is a pointer to a MEMORY_BASIC_INFORMATION structure to be filled in by
the VirtualQueryEx function. When the function returns, this structure represents information about
the region queried. The structure has the following members:
typedef struct _MEMORY_BASIC_INFORMATION { /* mbi */
PVOID BaseAddress; /* Base address of region */
PVOID AllocationBase; /* Allocation base address */
DWORD AllocationProtect; /* Initial access protection */
DWORD RegionSize; /* Size in bytes of region */
DWORD State; /* Committed, reserved, free */
DWORD Protect; /* Current access protection */
DWORD Type; /* Type of pages */
} MEMORY_BASIC_INFORMATION;

The VirtualQuery function returns this state information for any contiguous address region. The
function determines the lower bound of the region and the size of the region, along with the exact
state of the addresses in the region. The address it uses to determine the region can be any address in
the region. So, if you wish to determine how much stack space has been committed at any given time,
follow these steps:

1. Get the thread context for the thread in question.

2. Call the VirtualQuery function, supplying the address of the stack pointer in the thread
context information as the lpMem parameter in the function.

The query returns the size of the committed memory and the address of the base of the stack
in the MEMORY_BASIC_INFORMATION structure, in the form of
the RegionSize and BaseAddress fields, respectively.
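
In code, those two steps might look like this (x86 register names; hThread is assumed
to be a thread handle opened with the necessary access):

/* Determine committed stack space for a thread. */
CONTEXT ctx;
MEMORY_BASIC_INFORMATION mbi;

ctx.ContextFlags = CONTEXT_CONTROL;
GetThreadContext (hThread, &ctx);

VirtualQuery ((LPVOID)ctx.Esp, &mbi, sizeof (mbi));
/* mbi.RegionSize is the committed size; mbi.BaseAddress is the base. */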
Regions of memory, as defined by VirtualQuery, are a contiguous range of addresses whose
protection, type, and base allocation are the same. The type and protection values are described
earlier in this technical article. The base allocation is the lpAddress parameter value that is used when
the entire region of memory was first allocated via the VirtualAlloc function. It is represented in
the MEMORY_BASIC_INFORMATION structure as the AllocationBase field.
When free addresses become either reserved or committed, their base allocation is determined at that
time. A region of memory is not static by any means. Once a single page in a region of reserved
addresses becomes committed, the region is broken into one or more reserved regions and one
committed region. This continues as pages of memory change state. Similarly, when one of several
PAGE_READWRITE committed pages is changed to PAGE_READONLY protection, the region is broken
into multiple, smaller regions.

Conclusion
The virtual memory management functions in Windows offer direct management of virtual memory in
Windows NT. Each process's 2 GB user address space is divided into regions of memory that are either
reserved, committed, or free virtual addresses. A region is defined as a contiguous range of addresses
in which the protection, type, and base allocation of each address is the same. Within each region are
one or more pages of addresses that also carry protection and pagelock flag status bits.
The virtual memory management functions provide capabilities for applications to alter the state of
pages in the virtual address space. An application can change the type of memory from committed to
reserved or change the protection from PAGE_READWRITE to PAGE_READONLY to prevent access to a
region of addresses. An application can lock a page into the working set for a process to minimize
paging for a critical page of memory. The virtual memory functions are considered low-level functions,
meaning they are relatively fast but they lack many high-level features.

Windows Process Memory Usage Demystified

Wondering about Windows process memory use? Here's a breakdown of everything
you've ever questioned about Windows memory use.

by Sasha Goldshtein

“How much memory is your process using?” — I bet you were asked that question, or asked it
yourself, more times than you can remember. But what do you really mean by memory?

I never thought it would be hard to find a definitive resource for what the various memory usage
counters mean for a Windows process. But try it: Google “Windows Task Manager memory
columns,”and you’ll see confusing, conflicting, inconsistent, unclear explanations of what the
different metrics represent. If we can’t even agree on what “working set” or “commit size” means,
how can we ever monitor our Windows applications successfully?

First, we will need a sample application that will allocate various kinds of memory for our
experiments. I’ve written one for this blog post: it is simply called Memory. You can find it on GitHub.
Currently, it supports multiple kinds of allocations: reserve, commit, shareable memory, and more.

To monitor application memory usage, we will use Sysinternals VMMap, a long-time favorite on my
blog. It offers unparalleled insight into what your application is doing in terms of memory. Simply
choose a process when launching VMMap, and view memory utilization categorized by type (private,
shared, reserved, committed) and purpose (image, heap, stack, mapped file). You can also run it
from the command line, for example:

VMMap.exe -p MyApp output.csv

Armed with these tools, let’s get to business and try to characterize the various kinds of memory
usage in Windows processes. We must begin with the virtual memory size of the process — the
amount of address space that is in use.

Virtual Memory
Windows applications do not access physical memory directly. Any address in your application is a
virtual address that is translated by the CPU to a physical address when accessed. Although it is
often the case that there is more virtual memory available than RAM to back it up, virtual memory is
still limited. On 32-bit Windows with a default configuration, each process can allocate up to 2GB of
virtual memory. On 64-bit Windows, each 64-bit process can allocate up to 128TB of virtual memory
(this limit used to be 8TB until Windows 8.1).
Each page of virtual memory can be in one of three states: free, reserved, and committed:

Free pages are available for subsequent allocations (excluding unusable pages, discussed later).

Reserved pages are not available for subsequent allocations, but they are not backed by physical
memory. In other words, you may not access reserved pages, and you may not assume that at some
point the system will have sufficient physical memory to back them up. For example, try
running Memory.exe reserve 10000000 to allocate approximately 10TB of reserved memory. This
should work just fine (on a 64-bit system, of course), although you probably don’t have enough
physical memory to back up 10TB of virtual addresses.

Committed pages may be accessed by your application. The system guarantees that when you
access a committed page, there will be physical memory to back it up. The physical memory is
allocated on-demand, when you first access the page. Even though the system doesn’t allocate
physical memory immediately, this guarantee implies that there is a system-wide limit on how much
memory can be committed by all processes. This limit is called the commit limit. If an allocation
would exceed the commit limit, the system does not satisfy it. Go ahead and try it: Memory.exe
commit 10000000.
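
Under the hood, these states map onto the VirtualAlloc flags. Here's an illustrative
sketch of reserving versus committing (the real Memory.exe is on GitHub and may do it
differently):

#include <windows.h>

int main(void)
{
    // Reserve only: consumes address space, no commit charge.
    void* reserved = VirtualAlloc(NULL, 1024 * 1024 * 1024,
                                  MEM_RESERVE, PAGE_NOACCESS);

    // Reserve and commit in one call: charged against the commit limit,
    // but physical pages are only allocated when first touched.
    void* committed = VirtualAlloc(NULL, 100 * 1024 * 1024,
                                   MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE);

    if (committed)
        ((char*)committed)[0] = 1;  // first touch: a physical page appears now

    return 0;
}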

To further complicate things, committed memory can be shared with other processes. If two
processes share 100MB of physical memory, the 100MB virtual region is committed in both
processes, but it only counts once towards the commit limit.

It makes sense to examine the following aspects of a process’ virtual memory usage:

Committed bytes. This information is available in VMMap under Total > Committed and the Process
> Page File Bytes performance counter.

Reserved bytes. This information is available in VMMap as the delta between Total > Size and Total >
Committed. It can be calculated as the difference between non-free bytes and committed bytes.

Non-free bytes. This information is available in VMMap under Total > Size, or as the Process > Virtual
Bytes performance counter.

Free bytes. This information is available in VMMap under Free > Size. It can also be deduced from
the size of the virtual address space (2GB, 3GB, 4GB, 8TB, or 128TB — depending on the system
configuration), and the non-free bytes value.

This tells almost the whole story. Here’s a statement that at this point might sound fairly accurate:

The free bytes value is exactly the amount of virtual memory that
is available for subsequent allocations.
Unfortunately, it is not entirely accurate. The Windows memory manager guarantees (for historical
reasons) that new allocations are aligned on a 64KB boundary. Therefore, if your allocations are not
all divisible by 64KB, some memory regions might be lost for future allocations. VMMap calls
them Unusable, and it is the only tool that can reliably display them. To experiment,
run Memory.exe unusable 100. VMMap will report around 100MB of unusable virtual memory,
which is theoretically free and invisible to any other tool. However, that memory cannot be used to
satisfy future allocations, so it is as good as dead.
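
You can reproduce the effect directly with VirtualAlloc (an illustrative sketch): every
small reservation below lands on its own 64KB allocation granule, so roughly 60KB per
iteration becomes unusable:

#include <windows.h>

int main(void)
{
    // ~1,600 one-page reservations, each on its own 64KB boundary:
    // about 60KB per allocation (~100MB total) becomes unusable.
    for (int i = 0; i < 1600; i++)
        VirtualAlloc(NULL, 4096, MEM_RESERVE, PAGE_NOACCESS);

    Sleep(INFINITE);  // keep the process alive so VMMap can inspect it
    return 0;
}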
Shareable Memory
As I noted earlier, physical memory can be shared across multiple processes: more than one process
may have a virtual page mapped to a certain physical page. Some of these shared pages are not
under your direct control, e.g. DLL code is shared across processes; some other shared pages can be
allocated directly by your code. The reason it’s important to understand shared memory usage is
that a page of shared memory might be mistakenly attributed to all processes sharing that page.
Although it definitely occupies a range of virtual addresses in each process, it’s not duplicated in
physical memory.

There’s also a matter of terminology to clarify here. All shared pages are committed, but not all
committed pages can be shared. A shareable page must be allocated in advance as part of a section
object, which is the kernel abstraction for memory-mapped files and for sharing memory pages
across processes. So, to be precise, we can speak of two kinds of shareable memory:

Shareable memory that is shared: memory pages that are currently mapped into the virtual address
space of at least two processes.

Shareable memory that is not currently shared: memory pages that may be shared in the future, but
are currently mapped into the virtual address space of fewer than two processes.

NOTE: The terms “private” and “shared” (or “shareable”) memory refer only to committed memory.
Reserved pages cannot be shared, so it makes no sense to ask whether they are private or shareable.
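
For reference, here is what allocating shareable memory through a section object might
look like; this is an illustrative sketch, not necessarily how Memory.exe implements it:

#include <windows.h>

int main(void)
{
    // Create a 100MB pagefile-backed section object (shareable memory).
    HANDLE hSection = CreateFileMapping(INVALID_HANDLE_VALUE, NULL,
                                        PAGE_READWRITE, 0,
                                        100 * 1024 * 1024, NULL);

    // Mapping a view commits the pages as shareable; they only become
    // "shared" once a second process maps the same section.
    void* pView = MapViewOfFile(hSection, FILE_MAP_ALL_ACCESS, 0, 0, 0);

    Sleep(INFINITE);  // keep alive for inspection with VMMap
    return 0;
}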

It makes sense to look at the following per-process data points, to understand which part of its
virtual memory is shared (or shareable) with other processes:

Private bytes (memory that is not shared or shareable with other processes). This information is
available in VMMap under Total > Private. It is also available as a performance counter Process >
Private Bytes. Note that some of this committed memory may be backed by the page file, and not
currently resident in physical memory.

Shareable bytes. This information is available in VMMap under Shareable > Committed. You can’t
tell which of these bytes are actually shared with other processes, unless you settle for the following
two data points:

Shareable bytes currently resident. This information is available in VMMap under Total > Shareable
WS, but only includes pages that are resident in physical memory. It doesn’t include potentially-
shareable pages that happen to be paged out to disk, or that weren’t accessed yet after being
committed.

Shared bytes currently resident. This information is available in VMMap under Total > Shared WS,
but again only includes pages that are resident in physical memory.

Also note that VMMap’s Shareable category doesn’t include certain kinds of shareable memory,
such as images (DLLs). These are represented separately by the Image category.

Try it out: run Memory.exe shareable_touch 100. You’ll see private bytes unchanged, and shareable
bytes go up — even though the allocated memory isn’t currently shared with any other process.
Shared bytes, on the other hand, should remain the same. You can also try Memory.exe shareable
100 — you’ll see the Shareable/Shared WS values unchanged because physical memory is not
allocated unless the committed memory is also accessed.
Physical Memory
So far, we only discussed the state of virtual memory pages. Indeed, free, unusable, and reserved
pages have no effect on the amount of physical memory used by the system (other than the data
structures that must track reserved memory regions). But committed memory may have the effect
of consuming physical memory, too. Windows tracks physical memory on a system-wide basis, but
there is also information maintained on a per-process level that concerns that process’ individual
physical memory usage through its set of committed virtual memory pages, also known as
the working set.

Windows manages physical memory in a set of lists: active, standby, modified, free, and zero — to
name a few. These lists are global to all processes on the system. They can be very important from a
monitoring standpoint, but I’ll leave them for another time. If you’re really curious, there’s a great
Sysinternals tool called RAMMap that you can explore.

We need to add to our monitoring toolbox the following data points related to process physical
memory:

Private physical bytes. This refers to the physical pages that are mapped to private committed pages
in our process, and is often called the process’ private working set. This information is available in
VMMap under Total > Private WS. It is also available in Task Manager as Memory (private working
set).

Shareable or shared physical bytes. Similarly, these are the physical pages that are mapped to
shareable committed pages in our process. We discussed these metrics before when talking about
shareable/shared memory (in VMMap, these are under Total > Shared/Shareable WS).

Total physical bytes. Simply the sum of the previous two metrics. You might be tempted to say that
this is the amount of physical memory consumed by our process, which would be accurate if it
wasn’t for sharing. This information is available in VMMap under Total > Total WS, as the Process >
Working Set performance counter, and in Task Manager as Working set (memory).

Committed bytes not mapped yet to any backing storage (RAM or page file). Like I said before,
Windows doesn’t allocate any physical memory when you commit a page of virtual memory. Only
when the virtual page is first accessed, Windows handles the hardware exception by lazily allocating
a page of physical memory. So, you could have committed pages in your process that aren’t
currently backed by either RAM or the page file — simply because they were never accessed until
now. Unfortunately, there is no easy way that I know of to get this information.

You can experiment with the on-demand physical memory allocation by running Memory.exe
commit 1000. Even though the system-wide commit size was charged 1000MB, you won’t see any
change in physical memory usage (e.g. in Task Manager). But now try Memory.exe commit_touch
1000, which commits memory and makes sure to touch every single page. This time, both the
commit size and physical memory usage should go up by 1000MB.

Committed bytes not currently resident. These are pages of committed memory that were paged out
to disk. If you’re willing to ignore committed pages that weren’t accessed yet, then this metric can
be calculated as the difference between VMMap’s Total > Committed and Total > Total WS values
(or as the difference between the Process > Page File Bytes and Process > Working Set Bytes
performance counters — recall that Process > Page File Bytes is really the commit size of the
process).
Kernel Memory
Finally, your process can indirectly affect the system’s memory usage, too. Kernel data structures
like files, sockets, and mutexes are created and destroyed when your process requests it. Page tables
that map virtual to physical addresses are allocated and populated when you commit and access
memory pages.

Although it is rarely the case that your process would make a significant dent in the kernel’s memory
usage, it’s important to monitor the following metrics:

Pool bytes. This refers to kernel memory directly attributable to your process, such as data
structures for files or synchronization objects. Pool memory is further subdivided to paged
pool and non-paged pool. The system-wide pool utilization values are available in Task Manager
(under the Memory tab), or as the Memory > Pool Paged Bytes and Memory > Pool Nonpaged Bytes
performance counters.

For some kernel objects, the pool allocation is also charged to the owning process. This is the case
with I/O completion packets queued to an I/O completion port, which is what you can experiment
with by running Memory.exe nppool 10000 and inspecting the value of the Process > Pool
Nonpaged Bytes performance counter. (To quickly inspect performance counters, run typeperf from
a command prompt window. For example: typeperf “Process(Memory)\Pool Nonpaged Bytes” will
show you the counter’s value every second.)

Page table bytes. Mapping virtual addresses to physical addresses requires book-keeping, provided
by data structures called page tables. These data structures are allocated in kernel memory. At a
high level, mapping a small page of virtual memory (4KB on both x86 and x86_64) requires 4 or 8
bytes of page table space, plus some additional small overhead. Because Windows is lazy, it doesn’t
construct page tables in advance when you reserve or even commit memory. Only when you actively
access a page, Windows will fill the page table entry. Page table usage is available in VMMap as Page
Table > Size. It would typically be a fairly small value, even if you allocate a lot of memory.

Experiment by running Memory.exe commit_touch 2000 (committing and touching almost 2GB of
memory). On my Windows 10 x64 system, the resulting increase in page table bytes was
approximately 4MB.
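The arithmetic checks out: 2000MB spans 512,000 pages of 4KB each, and at 8 bytes of page table
space per page on x64, 512,000 × 8 bytes ≈ 4MB -- matching the observed increase.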

NOTE: Because any virtual memory allocation has the potential of requiring page table space
eventually, Windows used to charge reserved memory against the system commit charge,
anticipating that these reserved pages would eventually become committed and require actual page
table space. In Windows 8.1 x64 and Windows 10 x64, a security mechanism called CFG (Control Flow
Guard) requires a 2TB chunk of reserved memory for each process. Charging commit for that many
pages would be impractical. Therefore, on newer versions of Windows, reserving memory does not
charge commit. You can verify this by running Memory.exe reserve 1000000 (to reserve almost 1TB
of memory) and noting that the system-wide commit charge (typeperf "Memory\Committed Bytes")
doesn't go up considerably.

Summary
Hopefully, this post explained the key memory monitoring metrics for Windows processes. There’s a
lot more to say about the internals of Windows memory management, and I’m happy to refer you
to Windows Internals, 6th Edition for more details. You might also find the Testlimit tool useful to
check just how far the memory manager is willing to stretch for you.
Don't forget about memory
How to monitor your Java applications' Windows memory usage
Emma Shepherd, Martin Trotter, Caroline Maynard, and Matthew Peters
Published on November 16, 2004

One of Java technology's well-known advantages is that the Java programmer --
unlike, say, a C programmer -- doesn't have the daunting responsibility of allocating
and freeing required memory. The Java runtime simply manages these tasks for you.
Memory is allocated automatically in the heap area for each instantiated object, and
the garbage collector periodically reclaims memory occupied by objects that are no
longer needed. But you're not completely off the hook. You still need to monitor your
program's memory usage, because a Java process's memory is made up of more
than objects floating around in the heap. It also consists of the bytecode for the
program (instructions the JVM interprets at runtime), JIT code (code that's already
been compiled for the target processor), any native code, and some metadata that
the JVM uses (exception tables, line number tables and so on). To complicate
matters, certain types of memory, such as native libraries, can be shared between
processes, so determining a Java application's true footprint can be a difficult task.
Tools for monitoring memory usage under Windows abound, but unfortunately no
single tool gives you all the information you need. What's worse, the variety of tools
out there don't even share a common vocabulary. But help has arrived. This article
introduces some of the most useful freely available tools and provides some tips on
how to use them.
Windows memory: A whirlwind tour
Before you can understand the tools we discuss in this article, you need a basic
understanding of how Windows manages memory. Windows uses a demand-paged
virtual memory system. Read on for a crash course.
The virtual address space
The concept of virtual memory was born in the 1950s as a solution to the complex
problem of how to deal with a program that won't all fit into real memory at once. In a
virtual-memory system, programs are given access to a larger set of addresses than
is physically available, and a dedicated memory manager maps these logical
addresses to actual locations, using temporary storage on disc to hold the overflow.
In the modern implementation of virtual memory that Windows uses, virtual storage
is organized into equal-sized units known as pages. Each operating-system process
is allocated its own virtual address space -- a set of virtual-memory pages that it can
read from and write to. Each page can be in one of three states:
 Free: The process is not yet using that area of the address space. Any attempt to
access that area for reading or writing causes some kind of runtime failure. A
Windows dialog box will probably pop up to say that an access violation has
occurred. (Your Java program can't make this kind of mistake; only a program written
in a language that supports pointers can.)
 Reserved: This area of the address space has been reserved for future use by the
process but can't be accessed until it has been committed. A lot of the Java heap
starts off as reserved.
 Committed: Memory that can be accessed by the program and is fully backed,
which means that space has been allocated for it in the paging file.
Committed pages are loaded into main memory only when the process first
references them. Hence the name on-demand paging.
Figure 1 illustrates how virtual pages in a process's address space are mapped to
physical page frames in memory.
Figure 1. Mapping of virtual pages in a process's address space to physical
page frames

If you're running on a 32-bit machine (a normal Intel processor, for example), the
total virtual address space for a process is 4GB, because 4GB is the largest number
you can address with just 32 bits. Windows doesn't usually let you access all of the
memory in this address space; your process gets just under half for its own private
use, and Windows uses the rest. The 2GB private area contains most of the memory
the JVM needs in order to execute your program: the Java heap, the C heap for the
JVM itself, stacks for program threads, memory for holding bytecode and JITted
methods, memory that native methods allocate, and more. We'll identify some of
these in an address-space map later on in this article.
Programs that want to allocate a large and contiguous area of memory but don't
need it all immediately often use a combination of reserved and committed memory.
The JVM allocates the Java heap in this way. The -mx parameter to the JVM tells it
what the maximum size of the heap should be, but the JVM often doesn't allocate all
that memory at the start. It reserves the amount specified by -mx, marking the full
range of addresses as available to be committed. It then commits just a part of the
memory, and it is only for this part that the memory manager needs to allocate pages
in real memory and in the paging file to back them up. Later, when the amount of live
data grows and the heap needs to be expanded, the JVM can commit a little more,
adjacent to the currently committed area. This way the JVM can maintain a single
contiguous heap while growing it as needed (see Related topics for an article on how
to use the JVM heap parameters).
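To make the reserve-then-commit pattern concrete, here is a small illustrative C sketch of the same technique, using the Windows VirtualAlloc call. The 1000MB maximum and 4MB growth step are just example figures echoing the text, not the JVM's actual code:

#include <windows.h>

#define HEAP_MAX  (1000u * 1024 * 1024)  /* like -mx1000m */
#define HEAP_STEP (4u * 1024 * 1024)     /* grow the heap 4MB at a time */

int main(void)
{
    /* Reserve the full contiguous range up front; no storage is allocated */
    unsigned char *heap = VirtualAlloc(NULL, HEAP_MAX,
                                       MEM_RESERVE, PAGE_NOACCESS);
    if (heap == NULL) return 1;

    /* Commit an initial slice at the bottom of the reservation */
    SIZE_T committed = 0;
    if (VirtualAlloc(heap + committed, HEAP_STEP,
                     MEM_COMMIT, PAGE_READWRITE) == NULL)
        return 1;
    committed += HEAP_STEP;

    /* Later, when live data grows, commit the next adjacent slice;
       the heap remains a single contiguous range of addresses */
    if (VirtualAlloc(heap + committed, HEAP_STEP,
                     MEM_COMMIT, PAGE_READWRITE) != NULL)
        committed += HEAP_STEP;

    VirtualFree(heap, 0, MEM_RELEASE);
    return 0;
}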
Real memory
Physical storage is also organized into equal-sized units, most commonly known
as page frames. An operating-system data structure called the page table maps the
virtual pages accessed by applications to real page frames in main memory. The
pages that can't fit are kept in temporary paging files on disc. When a process tries
to access a page that's not currently in memory, a page fault occurs that causes the
memory manager to retrieve it from the paging file and put it back into main memory
-- a task known as paging. The precise algorithm used to decide which page should
be swapped out depends on the version of Windows you are using; it's probably a
variation of the least-recently-accessed method. It is also important to note that
Windows allows page frames to be shared between processes, for example for DLLs,
which are often used by several applications at once. Windows does this by mapping
multiple virtual pages from different address spaces to the same physical location.
An application is blissfully unaware of all this activity. All it knows about is its own
virtual address space. However, an application soon begins to suffer a marked
degradation in performance if the set of its pages currently in main memory, known
as the resident set, is smaller than the set of pages it actually needs to use, known
as the working set. (Unfortunately, as you'll see throughout this article, the tools we'll
discuss often use these two terms interchangeably, even though they refer to quite
different things.)
Task Manager and PerfMon
We'll take a look first at the two most common tools, Task Manager and PerfMon.
They're both bundled with Windows, so it'll be easy for you to get started with them.
Task Manager
Task Manager is a fairly simple Windows process monitor. You can access it with
the familiar Ctrl-Alt-Delete key combination, or by right-clicking on the Taskbar. The
Processes tab shows the most detailed information, as illustrated in Figure 2.
Figure 2. The Task Manager Processes tab

The columns that Figure 2 displays have been customized by selecting View >
Select Columns. Some of the column headings have fairly cryptic names, but you
can find a definition of each one in the Task Manager help. The most relevant
counters for a process's memory usage are:
 Mem Usage: The online help calls this the working set of the process (although
many would call it the resident set) -- the set of pages currently in main memory.
However, the catch is that it includes pages that can be shared by other processes,
so you must be careful not to double-count. If you're trying to find out the combined
memory usage of two processes that use a common DLL, for example, you can't
simply add their Mem Usage values.
 Peak Mem Usage: The highest value of the Mem Usage field since the process
started.
 Page Faults: The total number of times since the process started that a page was
accessed that wasn't in main memory.
 VM Size: The online help describes this as the "total private virtual memory allocated
by the process." To be clear, this is the private memory that has been committed by
the process. This can be quite different from the size of the total address space if the
process reserves memory but doesn't commit it.
Although the Windows documentation refers to Mem Usage as the working set, it
might be helpful to understand that in this context it really means what many people
call the resident set. You'll find definitions of these terms in the Memory Management
Reference's glossary (see Related topics). Working set more commonly means the
logical concept of which pages the process would need to have in memory at that
point to avoid any paging.
PerfMon
Another tool Microsoft ships with Windows is PerfMon, which monitors a wide variety
of counters, from print queues to telephony. PerfMon is normally on the system path,
so you can start it by entering perfmon from a command line. The advantage of this
tool is that it displays the counters graphically, so you can easily see how they
change over time.
To get going, click on the + button on the toolbar at the top of the PerfMon screen.
This brings up a dialog that lets you choose the counters you want to monitor, as in
Figure 3a. The counters are grouped into categories known as performance objects.
The two that are relevant for memory usage are Memory and Process. You can get
a definition of a counter by highlighting it and clicking on the Explain button. The
explanation then appears in a separate window that pops up below the main dialog
box, as shown in Figure 3b.
Figure 3a. PerfMon counters window

Figure 3b. Explanation

Select the counters you're interested in (using Ctrl to highlight multiple rows) and
also the instance you want to monitor -- the Java process for the application you're
examining -- and then click on Add. The tool immediately begins to display the
values of all the counters you've chosen. You can show them as a report, as a graph
over time, or as a histogram. Figure 4 shows the histogram display.
Figure 4. PerfMon histogram

If you don't see anything being graphed, you might need to change the scale by
right-clicking on the graph area, selecting Properties, and then navigating to the
Graph tab. Or to change the scale of a particular counter, go to its Data tab.
Counters to watch out for
Unfortunately, PerfMon uses different terminology from Task Manager. Table 1 gives
you a quick summary of the most useful counters and, if applicable, each one's Task
Manager equivalent:
Table 1. Useful PerfMon memory counters

Counter name      Category  Description                                      Task Manager equivalent
Working Set       Process   Resident set -- how many pages are currently    Mem Usage
                            in real memory
Private Bytes     Process   Total private virtual memory allocated,          VM Size
                            meaning committed memory
Virtual Bytes     Process   Total size of the virtual address space,         --
                            including shared pages; can be much larger
                            than either of the previous two values
                            because it includes reserved memory
Page Faults/sec   Process   Average number of page faults that have          Linked to Page Faults,
                            occurred per second                              which shows the total
Committed Bytes   Memory    Total number of virtual bytes in the             --
                            "committed" state

Try an example
You can explore how these quantities appear in Task Manager and in PerfMon by
downloading and running a small program we've written in C (see
the Download section). The program uses the Windows VirtualAlloc call first to
reserve, then to commit, memory. Finally, it starts to touch some of the memory,
writing a value into it every 4,096 bytes, to bring pages into the working set. If you
run the program and watch it with Task Manager or PerfMon, you'll see the values
change.
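If you'd rather capture the same counters from a script while the program runs, typeperf (bundled with Windows) can log them once a second. The instance name experiment below is hypothetical -- substitute whatever name your process shows under the Process performance object:

typeperf "\Process(experiment)\Working Set" "\Process(experiment)\Private Bytes" "\Process(experiment)\Virtual Bytes"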
Useful tools on the Web
Now that you know how much memory your application is using, it's time to dive into
the details of the actual memory contents. This section introduces some slightly
more sophisticated tools, discusses when it's appropriate to use them, and explains
how to interpret their output.
PrcView
PrcView is the first tool we'll show you that lets you inspect the contents of a
process's address space (see Related topics). You can use it to do more than look at
footprint. It can set priorities and kill processes, and it also exists in a useful
command-line version that lists properties of the processes on your machine. But
we'll show you how to use it to look at footprint.
When you start PrcView, it shows a Task Manager-like view of the processes in the
system. If you scroll to and highlight a Java process, the screen looks like the
example in Figure 5.
Figure 5. Initial PrcView screen

Right-clicking on the Java process to bring up a pop-up menu, or


choosing Process from the top menu bar, lets you inspect a few things about the
process -- which threads it owns and what DLLs it has loaded -- and lets you kill it or
set its priority. The option we are interested in is to inspect its memory, which brings
up a screen like the one in Figure 6.
Figure 6. Inspecting a process's memory

Now you can examine the first few lines of the address-space map that PrcView
displays. The first line tells you that from address 0, for a length of 65,536 (64K), the
memory is free. Nothing is allocated, nor can the addresses be used. The second
line tells you that starting immediately after, at address 0x00010000, is an 8,192-byte
stretch (two 4K pages) of committed memory -- memory that can be addressed and
that is backed by page frames in the paging file. Then there's another free stretch,
then another committed stretch, and so on.
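An address-space map like this is built by calling the Windows VirtualQuery API region by region. The following minimal C sketch (our illustration, not PrcView's actual code) walks the current process's own address space and prints each region's state:

#include <windows.h>
#include <stdio.h>

int main(void)
{
    MEMORY_BASIC_INFORMATION mbi;
    unsigned char *addr = NULL;

    /* Walk regions from address 0 until VirtualQuery fails at the
       top of the user address space */
    while (VirtualQuery(addr, &mbi, sizeof(mbi)) == sizeof(mbi)) {
        const char *state = (mbi.State == MEM_FREE)    ? "Free"
                          : (mbi.State == MEM_RESERVE) ? "Reserved"
                          :                              "Committed";
        printf("%p  %12llu bytes  %s\n", mbi.BaseAddress,
               (unsigned long long)mbi.RegionSize, state);
        addr = (unsigned char *)mbi.BaseAddress + mbi.RegionSize;
    }
    return 0;
}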
The chances are that none of this area of the address space means anything to you,
because it's used by Windows. Microsoft's documentation describing the Windows
address space says that these are various areas reserved for MS-DOS compatibility,
and that the area for user data and code begins at 4MB (see Related topics).
If you scroll down, you eventually come to something in the address space that you
can clearly recognize, as shown in Figure 7.
Figure 7. Java heap values

The highlighted line and the one immediately below it in Figure 7 correspond to the
Java heap. The Java process we started here was given a 1000MB heap (using -
mx1000m) -- extravagantly large for the program in question, but chosen so that the
heap would show up clearly in the PrcView map. The highlighted line shows the committed
part of the heap as only 4MB, starting at address 0x10180000. Immediately after the
highlighted line comes a large reserved stretch, which is the remainder of the heap
that hasn't yet been committed. During startup, the JVM initially reserved the full
1000MB (making the address range 0x10180000 to 0x4e980000 unavailable) and
then committed just what it needed to get started, in this case 4MB. To verify that
this value really does correspond to the current heap size, you can invoke the Java
program with the -verbosegc JVM option, which prints out detailed information from
the garbage collector. From the second line of the second GC in the following -
verbosegc output, you can see that the current heap size is approximately 4MB:
>java -mx1000m -verbosegc Hello
[ JVMST080: verbosegc is enabled ]
[ JVMST082: -verbose:gc output will be written to stderr ]
<GC[0]: Expanded System Heap by 65536 bytes

<GC(1): GC cycle started Wed Sep 29 16:55:44 2004
<GC(1): freed 417928 bytes, 72% free (3057160/4192768), in 104 ms>
<GC(1): mark: 2 ms, sweep: 0 ms, compact: 102 ms>
<GC(1): refs: soft 0 (age >= 32), weak 0, final 2, phantom 0>
<GC(1): moved 8205 objects, 642080 bytes, reason=4>
The format of the -verbosegc output depends on the JVM implementation you use;
see Related topics for an article on the IBM JVMs or consult your own provider's
documentation.
In the event that the amount of live data grows and the JVM needs to expand the
heap beyond 4MB, it commits a bit more of the reserved area. This means that the
new area can start at 0x10580000, contiguous with the heap memory that's already
committed.
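(You can check the arithmetic in the addresses: 4MB is 0x400000, and 0x10180000 + 0x400000 =
0x10580000.)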
The three totals in the bottom line of the PrcView screen in Figure 7 give you the
total committed memory for your process, totalled based on the seventh column,
headed Type. The totals are:
 Private: Committed memory backed by the paging file
 Mapped: Committed memory mapped directly into the file system
 Image: Committed memory belonging to executable code, both the starting
executable and DLLs
So far, you've located the heap within the address space, based only on its size. In
order to understand better what some of the other areas of memory are, it would be
helpful to be able to peer inside the memory. This is what the next tool we'll discuss,
TopToBottom, lets you do.
TopToBottom
TopToBottom is available free from smidgeonsoft.com (see Related topics). It comes
without any documentation but provides a comprehensive set of views into the
currently executing processes. You can sort the processes not only by name and
process ID, but also by startup time, which can be very useful if you need to
understand the sequence in which programs are started on your computer.
Figure 8 shows TopToBottom with the list of processes sorted by creation time (View
> Sort > Creation Time).
Figure 8. TopToBottom processes sorted by creation time

The StartUp tab displays the process that created our Java process, the time and
date at which it was started, and the actual command line used to invoke it, as well
as the full path to the executable and the current directory. You can also click on the
Environment tab to display the values of all the environment variables that were
passed into the process at startup. The Modules tab shows the DLLs in use by our
Java process, as in Figure 9.
Figure 9. TopToBottom Modules tab

Again, you can sort the list in a variety of ways. In Figure 9 they're sorted by
initialization order. If you double-click on a row, you'll see detailed information about
the DLL: its address and size, the date and time it was written, a list of other DLLs on
which it depends, and a list of all the running processes that have loaded the DLL. If
you explore the list, you'll see that some DLLs, for example NTDLL.DLL, are required
by every running process; some, such as JVM.DLL, are shared among all Java
processes; and others might be used by only a single process.
You could try working out the total size of the DLLs the process uses by adding up
the individual DLLs' sizes. However, the resulting number could be misleading
because it doesn't mean that the process is consuming all that footprint. The true
size depends on which parts of the DLL the process is actually using. Those parts
contribute to the process's working set. It may seem obvious, but it's also worth
noting that the DLLs are read-only and shared. If lots of processes all use a given
DLL, only one set of real memory pages holds the DLL data at any one time. Those
real pages might then be mapped at a number of different addresses into the
processes that are using them. Tools such as Task Manager report the working set
as the total of shared and nonshared pages, so it can be quite hard to determine the
true effect of DLL use on footprint. The modules information is a useful way to get a
"worst case" view of the footprint that's due to DLLs, which you can further refine by
more detailed analysis using other tools if necessary.
You're interested in the memory footprint, so click on the Memory tab. Figure 10
shows a small subset of all the memory our Java program uses.
Figure 10. TopToBottom Memory tab

This display is similar to PrcView, but it shows only the committed memory in the
virtual address space, not the reserved memory. However, it has a couple of
advantages. First, it can characterize the pages in more detail. For example, in
Figure 10 it has identified the Thread 3760 stack area specifically, not just as some
read/write data. Additional data areas it recognizes include Environment, Process
Parameters, Process Heap, Thread Stack, and Thread Environment Block (TEB).
Second, you can browse or even search the memory directly from within
TopToBottom itself. You can search for a text string, or you can do a hex search for
a sequence of up to 16 bytes. You can restrict the hex search to a specified
alignment, which is useful when you're searching for a reference to an address.
TopToBottom also has a snapshot facility that dumps all the information it has about
the process to the clipboard.
VADump
VADump is a convenient command-line tool that's part of the Microsoft® Platform
SDK package (see Related topics). Its purpose is to dump an overview of the virtual
address space and resident set for a particular process. The simplest way to use
VADump is to enter this command at the command line:
vadump -p process_id
The process_id is the number of the process you're interested in. You can invoke
VADump with no arguments to display the full usage information. We also
recommend that you pipe the output to a file (for example, vadump -p 1234 >
output.txt), because VADump produces too much information to fit on the screen.
The output begins by showing an index into the virtual address space for the
process:
>vadump -p 3904

Address: 00000000 Size: 00010000
    State     Free

Address: 00010000 Size: 00002000
    State     Committed
    Protect   Read/Write
    Type      Private

Address: 00012000 Size: 0000E000
    State     Free

Address: 00020000 Size: 00001000
    State     Committed
    Protect   Read/Write
    Type      Private

Address: 00021000 Size: 0000F000
    State     Free

Address: 00030000 Size: 00010000
    State     Committed
    Protect   Read/Write
    Type      Private

Address: 00040000 Size: 0003B000 RegionSize: 40000
    State     Reserved
    Type      Private
................................
(We've truncated the output at the dotted line for readability.)
For each block, you can see the following information:
 Address: In hexadecimal format, relative to the start of the process's virtual address
space
 Size: In bytes, also in hexadecimal
 State: Free, reserved, or committed
 Protection status: Either read-only or read/write
 Type: Either private (not accessible by other processes), mapped (directly from the
file system), or image (the executable code)
It then lists all of the DLLs in use by the process, with their sizes, followed by a
summary of statistics about the working set and page-file usage.
So far, this information is also available from other tools. But you can generate much
more revealing output by using VADump's -o option. It produces a snapshot of the
current working set (the pages actually in main memory at a given point in time). This
option is poorly documented, but it can be extremely useful for determining what the
most significant components of the resident set are -- and therefore what the most
promising candidates for memory optimization are. You can also use it to identify if
you have a memory leak, by taking snapshots at regular intervals over a period of
time. In this mode, the output begins with a more-detailed dump of the committed
pages in the virtual address space, whether they are currently in main memory or
not:
>vadump -o -p 3904

0x00010000 (0) PRIVATE Base 0x00010000
0x00011000 (0) PRIVATE Base 0x00010000
0x00020000 (0) PRIVATE Base 0x00020000
0x00030000 (0) PRIVATE Base 0x00030000
0x00031000 (0) Private Heap 2
0x00032000 (0) Private Heap 2
0x00033000 (0) Private Heap 2
0x00034000 (0) Private Heap 2
0x00035000 (0) Private Heap 2
0x00036000 (0) Private Heap 2
0x00037000 (0) Private Heap 2
0x00038000 (0) Private Heap 2
0x00039000 (0) Private Heap 2
0x0003A000 (0) Private Heap 2
0x0003B000 (0) Private Heap 2
0x0003C000 (0) Private Heap 2
0x0003D000 (0) Private Heap 2
0x0003E000 (0) Private Heap 2
0x0003F000 (0) Private Heap 2
0x0007C000 (0) Stack for ThreadID 00000F64
0x0007D000 (0) Stack for ThreadID 00000F64
0x0007E000 (0) Stack for ThreadID 00000F64
0x0007F000 (0) Stack for ThreadID 00000F64
0x00080000 (7) UNKNOWN_MAPPED Base 0x00080000
0x00090000 (0) PRIVATE Base 0x00090000
0x00091000 (0) Process Heap
0x00092000 (0) Process Heap
0x00093000 (0) Process Heap
...........................
If you scroll down to the end of this long listing you'll come to the more interesting
information: a list of the page-table mappings for the process's pages that are
currently resident in main memory:
0xC0000000 > (0x00000000 : 0x003FFFFF) 132 Resident Pages
    (0x00280000 : 0x00286000) > jsig.dll
    (0x00290000 : 0x00297000) > xhpi.dll
    (0x002A0000 : 0x002AF000) > hpi.dll
    (0x003C0000 : 0x003D8000) > java.dll
    (0x003E0000 : 0x003F7000) > core.dll
    (0x00090000 : 0x00190000) > Process Heap segment 0
    (0x00190000 : 0x001A0000) > Private Heap 0 segment 0
    (0x001A0000 : 0x001B0000) > UNKNOWN Heap 1 segment 0
    (0x00380000 : 0x00390000) > Process Heap segment 0
    (0x00030000 : 0x00040000) > Private Heap 2 segment 0
    (0x00390000 : 0x003A0000) > Private Heap 3 segment 0
    (0x00040000 : 0x00080000) > Stack for thread 0

0xC0001000 > (0x00400000 : 0x007FFFFF) 13 Resident Pages
    (0x00400000 : 0x00409000) > java.exe
.................................................................
Each of these mappings corresponds to a single entry in the page table, and so
contributes an additional 4KB to the working set for the process. It can still be quite
difficult to work out from these mappings what parts of your application are using the
most memory, but luckily the next part of the output is a useful summary:
Category                   Pages    Total   Private  Shareable   Shared
                                   KBytes    KBytes     KBytes   KBytes
Page Table Pages              20       80        80          0        0
Other System                  10       40        40          0        0
Code/StaticData             1539     6156      3988       1200      968
Heap                         732     2928      2928          0        0
Stack                          9       36        36          0        0
Teb                            5       20        20          0        0
Mapped Data                   30      120         0          0      120
Other Data                  1314     5256      5252          4        0

Total Modules               1539     6156      3988       1200      968
Total Dynamic Data          2090     8360      8236          4      120
Total System                  30      120       120          0        0
Grand Total Working Set     3659    14636     12344       1204     1088
The two most interesting values are normally Heap, which is the Windows process
heap, and Other Data. Memory allocated directly through Windows API calls forms
part of the process heap, and Other Data includes the Java heap. The Grand Total
Working Set correlates to Task Manager's Mem Usage plus the Teb field, which is
the memory needed for the process's Thread Environment Block, an internal
Windows structure.
Finally, the bottom of the VADump -o output is a summary of the relative contributions
to the working set from DLLs, heaps, and thread stacks:
Module Working Set Contributions in pages
Total  Private  Shareable  Shared  Module
    9        2          7       0  java.exe
   85        5          0      80  ntdll.dll
   43        2          0      41  kernel32.dll
   15        2          0      13  ADVAPI32.dll
   11        2          0       9  RPCRT4.dll
   53        6          0      47  MSVCRT.dll
  253       31        222       0  jvm.dll
    6        3          3       0  jsig.dll
    7        4          3       0  xhpi.dll
   15       12          3       0  hpi.dll
   12        2          0      10  WINMM.dll
   21        2          0      19  USER32.dll
   14        2          0      12  GDI32.dll
    6        2          0       4  LPK.DLL
   10        3          0       7  USP10.dll
   24       18          6       0  java.dll
   22       16          6       0  core.dll
   18       14          4       0  zip.dll
  915      869         46       0  jitc.dll

Heap Working Set Contributions
6 pages from Process Heap (class 0x00000000)
    0x00090000 - 0x00190000 6 pages
2 pages from Private Heap 0 (class 0x00001000)
    0x00190000 - 0x001A0000 2 pages
0 pages from UNKNOWN Heap 1 (class 0x00008000)
    0x001A0000 - 0x001B0000 0 pages
1 pages from Process Heap (class 0x00000000)
    0x00380000 - 0x00390000 1 pages
715 pages from Private Heap 2 (class 0x00001000)
    0x00030000 - 0x00040000 15 pages
    0x008A0000 - 0x009A0000 241 pages
    0x04A60000 - 0x04C60000 450 pages
    0x054E0000 - 0x058E0000 9 pages
1 pages from Private Heap 3 (class 0x00001000)
    0x00390000 - 0x003A0000 1 pages
7 pages from Private Heap 4 (class 0x00001000)
    0x051A0000 - 0x051B0000 7 pages

Stack Working Set Contributions
4 pages from stack for thread 00000F64
1 pages from stack for thread 00000F68
1 pages from stack for thread 00000F78
1 pages from stack for thread 00000F7C
2 pages from stack for thread 00000EB0
You can also use VADump in this mode to get an accurate view of the combined
footprint of two or more Java processes (see Tips and tricks, later in this article).
Sysinternals Process Explorer
Yet more useful tools for analyzing memory come from a company called
Sysinternals (see Related topics). One is a graphical process explorer, shown in
Figure 11, that you can use as an advanced replacement for Task Manager.
Figure 11. Process Explorer process tree

Process Explorer has all the same features as Task Manager. For example, you can
get dynamic graphs of total system performance (with View > System
Information...), and you can configure the columns in the main process view in a
similar way. Under Process > Properties..., Process Explorer offers a lot more
information about the process, such as full path and command line, threads, and
dynamic graphs of CPU usage and private bytes. Its user interface is superior, as
you can see in Figure 11. It can also inspect information on DLLs and handles for a
process. You can use Options > Replace Task Manager to launch Process
Explorer instead of Task Manager by default.
Sysinternals ListDLLs
You can also download a couple of Sysinternals command-line utilities -- ListDLLs
and Handle. They're particularly useful if you want to incorporate some form of
memory monitoring into either scripts or programs.
ListDLLs lets you look at DLLs, which can make a significant contribution to memory
footprint. To start using it, add it to your path and invoke it with the help option to get
the usage information. You can invoke it with either the process ID or the name.
Here is the list of DLLs our Java program uses:
>listdlls -r 3904

ListDLLs V2.23 - DLL lister for Win9x/NT
Copyright (C) 1997-2000 Mark Russinovich
http://www.sysinternals.com

---------------------------------------------------------------------
java.exe pid: 3904
Command line: java -mx1000m -verbosegc Hello

Base        Size      Version             Path
0x00400000  0x9000    141.2003.0005.0022  C:\WINDOWS\system32\java.exe
0x77f50000  0xa7000   5.01.2600.1217      C:\WINDOWS\System32\ntdll.dll
0x77e60000  0xe6000   5.01.2600.1106      C:\WINDOWS\system32\kernel32.dll
0x77dd0000  0x8d000   5.01.2600.1106      C:\WINDOWS\system32\ADVAPI32.dll
0x78000000  0x87000   5.01.2600.1361      C:\WINDOWS\system32\RPCRT4.dll
0x77c10000  0x53000   7.00.2600.1106      C:\WINDOWS\system32\MSVCRT.dll
0x10000000  0x178000  141.2004.0003.0001  C:\Java141\jre\bin\jvm.dll
### Relocated from base of 0x10000000:
0x00280000  0x6000    141.2004.0003.0001  C:\Java141\jre\bin\jsig.dll
### Relocated from base of 0x10000000:
0x00290000  0x7000    141.2004.0003.0001  C:\Java141\jre\bin\xhpi.dll
### Relocated from base of 0x10000000:
0x002a0000  0xf000    141.2004.0003.0001  C:\Java141\jre\bin\hpi.dll
0x76b40000  0x2c000   5.01.2600.1106      C:\WINDOWS\system32\WINMM.dll
0x77d40000  0x8c000   5.01.2600.1255      C:\WINDOWS\system32\USER32.dll
0x7e090000  0x41000   5.01.2600.1346      C:\WINDOWS\system32\GDI32.dll
0x629c0000  0x8000    5.01.2600.1126      C:\WINDOWS\system32\LPK.DLL
0x72fa0000  0x5a000   1.409.2600.1106     C:\WINDOWS\system32\USP10.dll
### Relocated from base of 0x10000000:
0x003c0000  0x18000   141.2004.0003.0001  C:\Java141\jre\bin\java.dll
### Relocated from base of 0x10000000:
0x003e0000  0x17000   141.2004.0003.0001  C:\Java141\jre\bin\core.dll
### Relocated from base of 0x10000000:
0x04a40000  0x12000   141.2004.0003.0001  C:\Java141\jre\bin\zip.dll
### Relocated from base of 0x10000000:
0x04df0000  0x3a1000  141.2004.0003.0001  C:\Java141\jre\bin\jitc.dll
Alternatively, the listdlls -r java command shows all the running Java processes
and the DLLs they're using.
Sysinternals Handle
Handle shows the list of handles (to files, sockets, and so on) that a process is using.
Unzip the Handle download file and add it to your path to try it out. It produces this
output if you try it on our Java program:
>handle -p 3904

Handle v2.2
Copyright (C) 1997-2004 Mark Russinovich
Sysinternals - www.sysinternals.com

------------------------------------------------------------------
java.exe pid: 3904 99VXW67\cem
    c: File  C:\wsappdev51\workspace\Scratch
   4c: File  C:\wsappdev51\workspace\Scratch\verbosegc.out
   50: File  C:\wsappdev51\workspace\Scratch\verbosegc.out
  728: File  C:\WebSphere MQ\Java\lib\com.ibm.mq.jar
  72c: File  C:\WebSphere MQ\Java\lib\fscontext.jar
  730: File  C:\WebSphere MQ\Java\lib\connector.jar
  734: File  C:\WebSphere MQ\Java\lib\jms.jar
  738: File  C:\WebSphere MQ\Java\lib\jndi.jar
  73c: File  C:\WebSphere MQ\Java\lib\jta.jar
  740: File  C:\WebSphere MQ\Java\lib\ldap.jar
  744: File  C:\WebSphere MQ\Java\lib\com.ibm.mqjms.jar
  748: File  C:\WebSphere MQ\Java\lib\providerutil.jar
  74c: File  C:\Java141\jre\lib\ext\oldcertpath.jar
  750: File  C:\Java141\jre\lib\ext\ldapsec.jar
  754: File  C:\Java141\jre\lib\ext\JawBridge.jar
  758: File  C:\Java141\jre\lib\ext\jaccess.jar
  75c: File  C:\Java141\jre\lib\ext\indicim.jar
  760: File  C:\Java141\jre\lib\ext\ibmjceprovider.jar
  764: File  C:\Java141\jre\lib\ext\ibmjcefips.jar
  768: File  C:\Java141\jre\lib\ext\gskikm.jar
  794: File  C:\Java141\jre\lib\charsets.jar
  798: File  C:\Java141\jre\lib\xml.jar
  79c: File  C:\Java141\jre\lib\server.jar
  7a0: File  C:\Java141\jre\lib\ibmjssefips.jar
  7a4: File  C:\Java141\jre\lib\security.jar
  7a8: File  C:\Java141\jre\lib\graphics.jar
  7ac: File  C:\Java141\jre\lib\core.jar
You can see that our process has a handle to the directory on our classpath and to
several JAR files. In fact the process has a lot more handles, but by default the utility
only shows handles that refer to files. You can display the others by using the
-a parameter:
>handle -a -p 3904

Handle v2.2
Copyright (C) 1997-2004 Mark Russinovich
Sysinternals - www.sysinternals.com

------------------------------------------------------------------
java.exe pid: 3904 99VXW67\cem
    c: File           C:\wsappdev51\workspace\Scratch
   4c: File           C:\wsappdev51\workspace\Scratch\verbosegc.out
   50: File           C:\wsappdev51\workspace\Scratch\verbosegc.out
  71c: Semaphore
  720: Thread         java.exe(3904): 3760
  724: Event
  728: File           C:\WebSphere MQ\Java\lib\com.ibm.mq.jar
  72c: File           C:\WebSphere MQ\Java\lib\fscontext.jar
  730: File           C:\WebSphere MQ\Java\lib\connector.jar
  734: File           C:\WebSphere MQ\Java\lib\jms.jar
  738: File           C:\WebSphere MQ\Java\lib\jndi.jar
  73c: File           C:\WebSphere MQ\Java\lib\jta.jar
  740: File           C:\WebSphere MQ\Java\lib\ldap.jar
  744: File           C:\WebSphere MQ\Java\lib\com.ibm.mqjms.jar
  748: File           C:\WebSphere MQ\Java\lib\providerutil.jar
  74c: File           C:\Java141\jre\lib\ext\oldcertpath.jar
  750: File           C:\Java141\jre\lib\ext\ldapsec.jar
  754: File           C:\Java141\jre\lib\ext\JawBridge.jar
  758: File           C:\Java141\jre\lib\ext\jaccess.jar
  75c: File           C:\Java141\jre\lib\ext\indicim.jar
  760: File           C:\Java141\jre\lib\ext\ibmjceprovider.jar
  764: File           C:\Java141\jre\lib\ext\ibmjcefips.jar
  768: File           C:\Java141\jre\lib\ext\gskikm.jar
  76c: Key            HKCU
  770: Semaphore
  774: Thread         java.exe(3904): 3964
  778: Event
  77c: Semaphore
  780: Semaphore
  784: Thread         java.exe(3904): 3960
  788: Event
  78c: Thread         java.exe(3904): 3944
  790: Event
  794: File           C:\Java141\jre\lib\charsets.jar
  798: File           C:\Java141\jre\lib\xml.jar
  79c: File           C:\Java141\jre\lib\server.jar
  7a0: File           C:\Java141\jre\lib\ibmjssefips.jar
  7a4: File           C:\Java141\jre\lib\security.jar
  7a8: File           C:\Java141\jre\lib\graphics.jar
  7ac: File           C:\Java141\jre\lib\core.jar
  7b0: Event
  7b4: Thread         java.exe(3904): 3940
  7b8: Event
  7bc: Semaphore
  7c0: Directory      \BaseNamedObjects
  7c4: Key            HKLM\SOFTWARE\Windows NT\Drivers32
  7c8: Semaphore
  7cc: Semaphore
  7d0: Event
  7d4: Desktop        \Default
  7d8: WindowStation  \Windows\WindowStations\WinSta0
  7dc: Event
  7e0: WindowStation  \Windows\WindowStations\WinSta0
  7e4: Event
  7e8: Section
  7ec: Port
  7f0: Directory      \Windows
  7f4: Key            HKLM
  7f8: Directory      \KnownDlls
  7fc: KeyedEvent     \KernelObjects\CritSecOutOfMemoryEvent
If you're interested in memory, handles are important because each one consumes
some space. The exact amount depends on the operating-system version and the
type of handle. In general, handles should not make a significant contribution to
footprint. By simply counting the number of lines this utility outputs, you can quickly
see if the number of handles is significant or -- worse still -- growing. Either scenario
is a possible cause for concern and suggests some more-detailed investigation.
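If you want that count from a script rather than by eye, one rough approach is to pipe the output
through the Windows find command's /c switch, which counts matching lines (the colon appears on
every handle line, so the total is only off by a header line or two):

handle -a -p 3904 | find /c ":"

Run it periodically; a steadily climbing number is the signature of a handle leak.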
Tips and tricks
Now that you have a handle (no pun intended) on all the tools we've shown you,
here are some ways you can use them individually or together to improve your
memory monitoring.
Finding a process ID
To find the process ID of an application so you can use it in a command-line tool
such as VADump, open the Applications tab in Task Manager and right-click on the
process you're interested in. Select Go To Process. This takes you to the
corresponding ID in the Processes tab.
Identifying a Java process
Have you ever puzzled over a list of processes all named java or javaw, trying to
work out which is the one you want to investigate? If the Java process was launched
from within an IDE or a script, it can be difficult to determine which JVM is in use and
which command-line parameters were sent to the Java process. This information is
readily available on the TopToBottom Startup tab. You'll see the full command line
used to invoke the JVM and the time the process was started.
Identifying a handle hog
Ever tried to save a file only to be told that it's in use by another process? And even
when you close the program you think is responsible, you still get the error
message? You can use the SysInternals Process Explorer tool's Handle Search
facility to find the culprit. Just open the Search dialog and type in the name of the file.
ProcExp looks through all the open handles and identifies the process. Often it turns
out to be a small stub process left running by an editor or Web browser after you've
closed the user interface.
Investigating how much memory is being shared
You can use VADump with the -o option to get a detailed view of what is in a
process's current working set and how much of it is shared. Take a dump of a single
Java program running on the system, and then start up another one and take the
dump again. If you compare the Code/StaticData values in each of the summaries,
you'll see that the "Shareable" bytes have become "Shared," thereby reducing the
incremental footprint by a small amount.
Trimming the resident set
Windows implements a policy of "trimming" a process's resident set when it does not
appear to be in use. To demonstrate this, open up the Processes tab in Task
Manager so that you can see the process of the application you're monitoring, then
minimize the application window. Watch what happens to the Mem Usage field!
Determining the minimum amount of memory your application needs
For Windows Server 2003 and Windows NT, Microsoft provides an interesting utility
called ClearMem that might be useful if you wish to explore further how applications
use memory under Windows (see Related topics). This tool determines the size of
real memory, allocates enough memory to consume it, touches the allocated
memory as quickly as possible, and then releases it. This presses hard on the
memory consumption of other applications, and the net effect of running ClearMem
many times is to force the amount of memory an application is using down to its
minimum value.
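ClearMem itself is Microsoft's tool, but the basic idea is easy to sketch in C. This illustrative version
of ours (on 32-bit systems the single allocation would need to be split into chunks) queries the size
of real memory, commits and touches that much memory, then releases it:

#include <windows.h>

int main(void)
{
    MEMORYSTATUSEX ms = { sizeof(ms) };      /* dwLength must be set first */
    if (!GlobalMemoryStatusEx(&ms)) return 1;

    SIZE_T size = (SIZE_T)ms.ullTotalPhys;   /* size of real memory */
    unsigned char *p = VirtualAlloc(NULL, size, MEM_RESERVE | MEM_COMMIT,
                                    PAGE_READWRITE);
    if (p == NULL) return 1;

    /* Touch every page as quickly as possible, pressing other
       processes' pages out of real memory */
    for (SIZE_T i = 0; i < size; i += 4096)
        p[i] = 1;

    VirtualFree(p, 0, MEM_RELEASE);          /* then give it all back */
    return 0;
}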
Conclusion
In this article, we've outlined how Windows manages memory and surveyed some of
the most useful freely available tools that you can use to monitor your Java
programs' memory usage. You'll no doubt find and use other tools, be they free
downloads from the Web or commercial offerings, but we hope that we have armed
you with enough knowledge to fight your way through the contradictory terminology.
Often the only way to be confident that you know exactly what is being measured is
to perform an experiment, such as the little C program we used to demonstrate the
meaning of Task Manager's VM Size and Mem Usage.
Of course, these tools can only help you to identify the problem. It's then up to you to
solve it. Most of the time you'll find that the Java heap is gobbling up the biggest slice
of the memory, and you'll need to delve into your code's details to make sure that
object references aren't being held longer than necessary. Many more tools and
articles can help you with this effort, and some useful links in the Related
topics section should point you in the right direction.
Downloadable resources
 A C program to demonstrate how Windows uses memory (experiment.c | 3KB)
Related topics
 Visit the Memory Management Reference for an extensive glossary, FAQ, and
articles on memory management.
 Download the latest version of PrcView and learn more about its features.
 You can download TopToBottom and other Windows tools
from www.smidgeonsoft.com.
 Process Explorer, ListDLLs, and Handle are freely available at the SysInternals Web
site.
 Learn about ClearMem from the Microsoft Windows Server 2003 documentation site.
 The article "Handling Memory Leaks in Java programs" (developerWorks, February
2001) provides more information on heap-related memory issues.
 The article "Whose object is it, anyway?" (developerWorks, June 2003) discusses
the need to track object ownership to avoid memory leaks.

Introduction to memory management


 1. Overview
o 1.1. Hardware memory management
o 1.2. Operating system memory management
o 1.3. Application memory management
o 1.4. Memory management problems
o 1.5. Manual memory management
o 1.6. Automatic memory management
o 1.7. More information
 2. Allocation techniques
o 2.1. First fit
o 2.2. Buddy system
o 2.3. Suballocators
 3. Recycling techniques
o 3.1. Tracing collectors
o 3.2. Reference counts
 4. Memory management in various languages
1. Overview
Memory management is a complex field of computer science and there are many
techniques being developed to make it more efficient. This guide is designed to introduce
you to some of the basic memory management issues that programmers face.
This guide attempts to explain any terms it uses as it introduces them. In addition, there is
a Memory Management Glossary of memory management terms that gives fuller
information; some terms are linked to the relevant entries.
Memory management is usually divided into three areas, although the distinctions are a
little fuzzy:
 Hardware memory management
 Operating system memory management
 Application memory management
These are described in more detail below. In most computer systems, all three are present
to some extent, forming layers between the user’s program and the actual memory
hardware. The Memory Management Reference is mostly concerned with application
memory management.
1.1. Hardware memory management
Memory management at the hardware level is concerned with the electronic devices that
actually store data. This includes things like RAM and memory caches.
1.2. Operating system memory management
In the operating system, memory must be allocated to user programs, and reused by other
programs when it is no longer required. The operating system can pretend that the
computer has more memory than it actually does, and also that each program has the
machine’s memory to itself; both of these are features of virtual memory systems.
1.3. Application memory management
Application memory management involves supplying the memory needed for a program’s
objects and data structures from the limited resources available, and recycling that memory
for reuse when it is no longer required. Because application programs cannot in general
predict in advance how much memory they are going to require, they need additional code
to handle their changing memory requirements.
Application memory management combines two related tasks:
Allocation
When the program requests a block of memory, the memory manager must allocate that
block out of the larger blocks it has received from the operating system. The part of the
memory manager that does this is known as the allocator. There are many ways to perform
allocation, a few of which are discussed in Allocation techniques.
Recycling
When memory blocks have been allocated, but the data they contain is no longer required
by the program, then the blocks can be recycled for reuse. There are two approaches to
recycling memory: either the programmer must decide when memory can be reused
(known as manual memory management); or the memory manager must be able to work it
out (known as automatic memory management). These are both described in more detail
below.
An application memory manager must usually work to several constraints, such as:
CPU overhead
The additional time taken by the memory manager while the program is running.
Pause times
The time it takes for the memory manager to complete an operation and return control to
the program.
This affects the program’s ability to respond promptly to interactive events, and also to any
asynchronous event such as a network connection.
Memory overhead
How much space is wasted for administration, rounding (known as internal fragmentation),
and poor layout (known as external fragmentation).
Some of the common problems encountered in application memory management are
considered in the next section.
1.4. Memory management problems
The basic problem in managing memory is knowing when to keep the data it contains, and
when to throw it away so that the memory can be reused. This sounds easy, but is, in fact,
such a hard problem that it is an entire field of study in its own right. In an ideal world, most
programmers wouldn’t have to worry about memory management issues. Unfortunately,
there are many ways in which poor memory management practice can affect the robustness
and speed of programs, both in manual and in automatic memory management.
Typical problems include:
Premature frees and dangling pointers
Many programs give up memory, but attempt to access it later and crash or behave
randomly. This condition is known as a premature free, and the surviving reference to the
memory is known as a dangling pointer. This is usually confined to manual memory
management.
Memory leak
Some programs continually allocate memory without ever giving it up and eventually run
out of memory. This condition is known as a memory leak.
External fragmentation
A poor allocator can do its job of giving out and receiving blocks of memory so badly that it
can no longer give out big enough blocks despite having enough spare memory. This is
because the free memory can become split into many small blocks, separated by blocks still
in use. This condition is known as external fragmentation.
Poor locality of reference
Another problem with the layout of allocated blocks comes from the way that modern
hardware and operating system memory managers handle memory: successive memory
accesses are faster if they are to nearby memory locations. If the memory manager places
far apart the blocks a program will use together, then this will cause performance problems.
This condition is known as poor locality of reference.
Inflexible design
Memory managers can also cause severe performance problems if they have been designed
with one use in mind, but are used in a different way. These problems occur because any
memory management solution tends to make assumptions about the way in which the
program is going to use memory, such as typical block sizes, reference patterns, or lifetimes
of objects. If these assumptions are wrong, then the memory manager may spend a lot
more time doing bookkeeping work to keep up with what’s happening.
Interface complexity
If objects are passed between modules, then the interface design must consider the
management of their memory.
A well-designed memory manager can make it easier to write debugging tools, because
much of the code can be shared. Such tools could display objects, navigate links, validate
objects, or detect abnormal accumulations of certain object types or block sizes.
1.5. Manual memory management
Manual memory management is where the programmer has direct control over when
memory may be recycled. Usually this is either by explicit calls to heap management
functions (for example, malloc and free in C), or by language constructs that affect
the control stack (such as local variables). The key feature of a manual memory manager is
that it provides a way for the program to say, “Have this memory back; I’ve finished with it”;
the memory manager does not recycle any memory without such an instruction.
The advantages of manual memory management are:
 it can be easier for the programmer to understand exactly what is going on;
 some manual memory managers perform better when there is a shortage of
memory.
The disadvantages of manual memory management are:
 the programmer must write a lot of code to do repetitive bookkeeping of memory;
 memory management must form a significant part of any module interface;
 manual memory management typically requires more memory overhead per object;
 memory management bugs are common.
It is very common for programmers, faced with an inefficient or inadequate manual memory
manager, to write code to duplicate the behavior of a memory manager, either by allocating
large blocks and splitting them for use, or by recycling blocks internally. Such code is known
as a suballocator. Suballocators can take advantage of special knowledge of program
behavior, but are less efficient in general than fixing the underlying allocator. Unless written
by a memory management expert, suballocators may be inefficient or unreliable.
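As a concrete illustration of the idea (a sketch of our own devising, not code from any particular
allocator), here is about the simplest possible suballocator in C: it grabs one large block
from malloc, hands out pieces by bumping a pointer, and recycles everything at once:

#include <stdlib.h>
#include <stddef.h>

typedef struct {
    char  *base;    /* one large block obtained from the underlying allocator */
    size_t size;
    size_t used;
} Arena;

int arena_init(Arena *a, size_t size)
{
    a->base = malloc(size);
    a->size = size;
    a->used = 0;
    return a->base != NULL;
}

void *arena_alloc(Arena *a, size_t n)
{
    n = (n + 7) & ~(size_t)7;            /* round up to 8-byte alignment */
    if (a->used + n > a->size) return NULL;
    void *p = a->base + a->used;         /* split the big block: bump pointer */
    a->used += n;
    return p;
}

void arena_free_all(Arena *a)
{
    a->used = 0;                         /* recycle the whole block at once */
}

Note the trade-off the text describes: allocation is a few instructions, but individual blocks cannot
be freed -- the program must know that everything in the arena dies together.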
The following languages use mainly manual memory management in most implementations,
although many have conservative garbage
collection extensions: Algol; C; C++; COBOL; Fortran; Pascal.
1.6. Automatic memory management
Automatic memory management is a service, either as a part of the language or as an
extension, that automatically recycles memory that a program would not otherwise use
again. Automatic memory managers (often known as garbage collectors, or simply
collectors) usually do their job by recycling blocks that are unreachable from the program
variables (that is, blocks that cannot be reached by following pointers).
The advantages of automatic memory management are:
 the programmer is freed to work on the actual problem;
 module interfaces are cleaner;
 there are fewer memory management bugs;
 memory management is often more efficient.
The disadvantages of automatic memory management are:
 memory may be retained because it is reachable, but won’t be used again;
 automatic memory managers (currently) have limited availability.
There are many ways of performing automatic recycling of memory, a few of which are
discussed in Recycling techniques.
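One of those techniques, reference counting, is simple enough to sketch in a few lines of C. This is
an illustrative fragment, assuming objects are only ever linked in ways that keep the counts
accurate -- cycles of references defeat this scheme:

#include <stdlib.h>

typedef struct Obj {
    int refcount;    /* how many references currently point at this object */
    /* ... payload ... */
} Obj;

Obj *obj_new(void)
{
    Obj *o = malloc(sizeof(Obj));
    if (o) o->refcount = 1;      /* the creator holds the first reference */
    return o;
}

void obj_retain(Obj *o)  { o->refcount++; }

void obj_release(Obj *o)
{
    if (--o->refcount == 0)      /* no references remain: unreachable */
        free(o);
}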
Most modern languages use mainly automatic memory management: BASIC, Dylan, Erlang,
Haskell, Java, JavaScript, Lisp, ML, Modula-3, Perl, PostScript, Prolog,
Python, Scheme, Smalltalk, etc.
1.7. More information
For more detailed information on the topics covered briefly above, please have a look at
the Memory Management Glossary. Books and research papers are available on many
specific techniques, and can be found via our Bibliography; particularly recommended
are: Wilson (1994), which is a survey of garbage collection techniques; Wilson et al. (1995),
which is a survey of allocation techniques; and Jones et al. (2012), which is a handbook
covering all aspects of garbage collection.
http://www.memorymanagement.org/mmref/index.html
