Sei sulla pagina 1di 6

Identifying High CPU Usage with vmstat

Oracle Tips by Burleson Consulting

We can also easily detect when we are experiencing a busy CPU on the Oracle database server.
Whenever the us (user) plus sy (system) times approach 100, the CPUs are operating at full
capacity as shown in the next listing.

Also see tips for 100% CPU utilization.

Please note that it is not uncommon to see the CPU approach 100 percent even when the server is
not overwhelmed with work. This is because the UNIX internal dispatchers will always attempt
to keep the CPUs as busy as possible. This maximizes task throughput, but it can be misleading
for a neophyte.

Remember, it is not a cause for concern when the user + system CPU values approach 100
percent. This just means that the CPUs are working to their full potential. The only metric that
identifies a CPU bottleneck is when the run queue (r value) exceeds the number of CPUs on the
server.
vmstat 5 1

kthr memory page faults cpu


----- ----------- ------------------------ ------------ -----------
r b avm fre re pi po fr sr cy in sy cs us sy id wa
0 0 217485 386 0 0 0 4 14 0 202 300 210 20 75 3 2

Now let's take a look at measuring the memory utilization on the server.

Identifying RAM Memory Bottlenecks


Contention for random access memory (RAM) has always been a problem for Oracle. All
database servers have a limited amount of available RAM, and this RAM must be shared
between the Oracle database and all external sessions that connect to the server and consume
RAM in their Program Global Area (PGA).

Understanding UNIX RAM memory paging


As we have noted in chapter 2, most Oracle DBA’s rely on the pi column in vmstat to signal
when the server is swapping RAM memory. However, there is more to the story.

There are times when the pi column will be non-zero, even though there is no real RAM
swapping. To illustrate this let’s take a simple example. Suppose that we invoke a 20 Megabyte
Oracle executable program, such as a Pro*C program. We don't need to load all 20MB of the
executable into RAM all at once. Rather, we just want to load those pieces of the executable
code that require immediate execution. Hence, UNIX will use memory frames as necessary later
and rely on the principle of spatial locality to minimize the amount of pages in our RAM
working set.

To manage the memory segments, the UNIX kernel builds a memory map of the entire program
when it starts. Included in this map is a note on whether the storage is “in memory” or “on swap
disk.”

As the program starts, it begins accessing some of its pages that have never been loaded into
RAM memory. Hence, you may see vmstat page-in’s when a large number of programs are
starting and allocating their RAM memory.

During normal operation we may see various points in time when paging in happens a lot and
this is not always a cause for concern. Remember, a UNIX process may page-in when the UNIX
program is starting or is accessing parts of its code that it had not used before.

Paging out (the po column in vmstat) happens frequently as UNIX prepares for the possibility of
a page-in. With UNIX virtual memory we are always anticipating running out of RAM memory,
and a page-out is a method for being ready for a subsequent page-in. Also, as UNIX processes
end they call the free() system call to free the RAM pages so they can be used by new processes.

Internals of RAM memory paging

So if RAM paging in pi may be acceptable and paging out po may be acceptable, how do we tell
when the RAM on a server is overstressed and swapping? One answer is to correlate the UNIX
scan rate with page-in operations. When an Oracle server begins to run low on RAM memory,
the page stealing daemon process awakens and UNIX begins to treat the RAM memory as a
sharable resource, moving memory frames to the swap disk with paging operations.

The page-stealing daemon operates in two modes. When RAM memory shortages are not
critical, the daemon will steal small chunks of least-recently-used RAM memory from a
program. As RAM resource demands continue to increase, the page-stealing daemon escalates
and begins to page-out entire programs RAM regions. In short, we cannot always tell if the page-
in operations that we see are normal housekeeping or a serious memory shortage unless we
correlate the activity of the page stealing daemon with the page-in output.

To aid in this, the vmstat utility gives the sr column to designate the memory page scan rate. If
we see the scan rate rising steadily, we will have hit the page-stealing daemon’s first threshold,
indicating that entire program’s RAM memory regions are being paged-out to the swap disk. As
the process continues, we will begin to see high page-in numbers as the entire process in paged-
back into RAM memory.

Carefully review the list below from HP/UX vmstat. The scan rate is the furthest right column,
and here we see the value of sr rising steadily as the page stealing daemon prepares for a page-
in. As the sr value peaks, we see the page-in operation (pi) as the real RAM memory on the
Oracle server is exceeded.
root> vmstat 2
procs memory page
r b w avm free re at pi po fr de sr
3 0 0 144020 12778 17 9 0 14 29 0 3
3 0 0 144020 12737 15 0 1 34 4 0 8
3 0 0 144020 12360 9 0 1 46 2 0 13
1 0 0 142084 12360 5 0 3 17 0 0 21
1 0 0 142084 12360 3 0 8 0 0 0 8
1 0 0 140900 12360 1 0 10 0 0 0 0
1 0 0 140900 12360 0 0 9 0 0 0 0
1 0 0 140900 12204 0 0 3 0 0 0 0
1 0 0 137654 12204 0 0 0 0 0 0 0

However, before we go into detecting if your server memory is exceeded, we must first give you
a tool for determining how much memory you have available on your server. Next is a command
that you can issue to see how much RAM exists on your server.

Display RAM Size on DEC-UNIX

In DEC-UNIX, you can use the uerf command in conjunction with grep to display memory size.
For example:

L 5-11

uerf -r 300 | grep -i mem

Here, the output of the uerf command is piped to grep to filter out and display the segments
relating to memory. The –i option causes grep to find both uppercase and lowercase strings. With
respect to the example shown here, grep –i mem looks for both “MEM” and “mem.”

Display RAM Size on HP/UX

In HP/UX, the dmesg command can display memory information:

L 5-12
dmesg
Memory Information:
physical page size = 4096 bytes, logical page size = 4096 bytes
Physical: 5242880 Kbytes,lockable: 4051216 Kbytes,available: 4651796 Kbytes

Display RAM Size on AIX

In IBM's AIX dialect of UNIX, you must issue two separate commands. You start with the lsdev
command followed by the lsattr command to display the amount of memory on a server. First,
execute lsdev to list all devices. Then pipe that output through grep to filter out everything not
related to memory. That will get you the name of the memory devices that are installed. For
example:

L 5-13
>lsdev -C|grep mem

mem0 Available 00-00 Memory


Here you can see that mem0 is the name of the memory device. Now that you have the name,
you can issue the lsattr –El command to see the amount of memory on the server. In the
following example, the server has 3GB of RAM installed.

L 5-14
>lsattr -El mem0

size 3064 Total amount of physical memory …

You must issue the lsattr –El command separately for each memory device.

Display RAM Size on Solaris

The prtconf command can also be used on all Solaris servers to quickly see the amount of
available memory:

L 5-15
>prtconf|grep -i mem

Memory size: 2048 Megabytes


memory (driver not attached)
virtual-memory (driver not attached)

Display RAM Size in Linux

In Linux, the free command can be used to quickly display the amount of RAM memory on the
server:

L 5-16
>free
total used free shared buffers cached
Mem: 3728668 504688 3223980 41316 430072 29440
-/+ buffers/cache: 45176 3683492
Swap: 265032 608 264424

Using the Top Utility for Displaying RAM

While each dialect has unique mechanisms and commands to display RAM, most UNIX servers
support the top utility. The top utility is invoked by issuing the top command from the UNIX
prompt. This will display the total amount of RAM available, and will also show the usage of
both RAM and virtual memory:

L 5-17
root> top

9:43am up 16 days, 22:33, 24 users, load average: 0.00, 0.00, 0.00


123 processes: 122 sleeping, 1 running, 0 zombie, 0 stopped
CPU states: 0.0% user, 0.9% system, 0.0% nice, 99.0% idle
Mem: 257568K av, 244988K used, 12580K free, 88732K shrd, 179772K buff
Swap: 530104K av, 9972K used, 520132K free 15452K cached
PID USER PRI NI SIZE RSS SHARE STAT LIB %CPU %MEM TIME COMMAND
22417 oracle 14 0 904 904 668 R 0 0.9 0.3 0:00 top
1 root 1 0 160 120 88 S 0 0.0 0.0 0:11 init
2 root 0 0 0 0 0 SW 0 0.0 0.0 0:00 kflushd
3 root 0 0 0 0 0 SW 0 0.0 0.0 0:08 kupdate
4 root 0 0 0 0 0 SW 0 0.0 0.0 0:00 kpiod

In this example, we see from the top utility that we have 257,568 kilobytes of RAM on the
server. The top utility is also great for seeing the top CPU consumer tasks on your server.

Using Glance to See Memory

In HP/UX and Solaris, you can enter the glance utility in order to see the amount of RAM
available. The glance utility displays a screen showing CPU and memory utilization for the
system as a whole, and for individual processes (see Figure 5-3). For more information on
glance, look to the man pages on your UNIX server. Glance is started by entering glance from
the UNIX prompt, and exited by entering CTRL-C.

Figure 5-25: A sample glance screen

Glance is a great tool for reactive tuning because it shows the current status of your Oracle
server, along with consumption details for the CPU and RAM.

RAM Memory and the Swap Disk

Now that we know the amount of RAM on our server, we are ready to investigate the RAM and
swap disk usage for the Oracle server. Whenever the memory demands of the server exceed that
amount of RAM, the virtual memory facility is invoked. With virtual memory, segments of
RAM are moved onto a special disk segment called the swap disk. The swap disk is a special
segment of disk defined by the systems administrator to hold excess RAM memory contents. The
virtual memory system commonly pages-out memory segments, and this is not an indicator of a
memory problem. However, a page-in operation indicates that the server has exceeded the
amount of available RAM and is recalling memory segments from the swap disk (see Figure 5-
4).

Figure 5-26: RAM memory paging from the swap disk

Swapping (pi) slows down a server because it takes a large amount of time to copy memory
segments from the swap disk back into RAM. On an Oracle database server, the solution to a
page-in problem involves:

 Smaller SGAReduce the demand for RAM by making the SGA smaller. The SGA can
be made smaller by reducing the db_block_buffers on Oracle8i and earlier versions,
db_cache_size, sga_max_size, db_xK_cache_size on Oracle9i or shared_pool_size or
java_pool_size init.ora parameters.
 More RAM memoryAdd additional RAM memory to the server. (Remember that
some 32-bit versions of Oracle cannot use more than 1.7GB of RAM.)
 Reduce RAM demandReduce RAM consumption of a database server by reducing
the demands on Program Global Area (PGA) memory. Oracle parameters such as
sort_area_size can greatly increase the amount of RAM allocated to each connected
user's PGA.

A memory-bound database server always experiences paging from the swap disk. This is
displayed in the vmstat utility as the po (page-out) and pi (page-in) columns of vmstat. Here we
see that the database server is experiencing five page-in and nine page-out operations. You can
tell that a server is having excessive memory requests by looking at page-in operations.

L 5-18
Root> vmstat 1 2

kthr memory page faults cpu


----- ----------- ------------------------ ------------ -----------
r b avm fre re pi po fr sr cy in sy cs us sy id wa
0 0 218094 166 0 4 0 4 16 0 202 301 211 14 19 45 22
0 0 218094 166 0 5 9 4 14 0 202 301 211 14 19 45 22

In sum, page-out (po) operations are a normal part of virtual memory operation, but page-in (pi)
operations indicate that the server has excessive RAM demands. Now let's take a look at a vmstat
metric that is sometimes associated with a disk I/O bottleneck.

Detecting Potential I/O Bottlenecks in AIX

Using IBM's AIX dialect of UNIX, an I/O bound database server is usually evidenced by a high
value in the wa (wait) column of vmstat. In the next listing, we see that 22 percent of the CPU
time is being used waiting for database I/O.

L 5-19
root> vmstat
kthr memory page faults cpu
----- ----------- ------------------------ ------------ -----------
r b avm fre re pi po fr sr cy in sy cs us sy id wa
0 0 217485 386 0 0 0 4 14 0 202 300 210 14 19 45 22

It is important to note that a high wait value does not always signal an I/O bottleneck. The CPU
wait could be due to other factors, and this metric simply means that the CPU is waiting for
external OS services.

Now that we understand the basics of the vmstat utility, let's explore an easy method for
capturing vmstat data, storing it inside an Oracle table, and producing server performance
reports.

Potrebbero piacerti anche