The DB2 for Linux, UNIX, and Windows DBA Checklist
10 August 2006
(First published 19 April 2004)
Just like a high performance sports car, a database requires some checks to keep it running
optimally. This article is broken down into tasks or checks that can be run at different intervals
on your DB2 for Linux, UNIX, and Windows database, to do just that. Learn when to
monitor and what you should be doing daily, weekly, and monthly. Updated for DB2 9.
Introduction
While databases are becoming more and more self-aware and self-healing, they still require some
monitoring to keep them running as efficiently as possible. Just like your car, a database requires
some checks to keep it running optimally. This document is broken down into tasks or checks that
you can run at different time intervals to ensure that your DB2 databases are running optimally,
and detect potential issues before they happen.
The first set of checks or tasks should be run every day to make sure there are no current or
imminent problems. The second set should be run weekly to check for issues or problems that
may have occurred during the week or are likely to occur in the coming week. The final set of
checks or tasks need not be run every day or week, but should be run monthly to keep the system
running without problems, and to prevent further issues in the event that a problem does occur.
developerWorks
ibm.com/developerWorks/
When capturing the information for analysis, make sure that the DB2 and operating system
information is captured at the same time, as you cannot correlate information captured at different
times.
NOTE: the -tx option on iostat is not supported on all UNIX/Linux versions, but is useful since it
embeds the timestamp for when the snapshot was taken.
Also make sure to capture the snapshots at normal/average workload times as well as at peak
workload times. While it is important to ensure the normal workloads are handled efficiently, it is
also important to ensure that the system can handle the peak workloads without overloading the
server.
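One way to keep the operating system and DB2 captures aligned is to start them together and give every output file the same timestamp. The following is a minimal sketch under stated assumptions, not a definitive script: the helper names capture_name and capture_all, the output file naming, and the interval of 5 seconds with 3 samples are all made up for illustration.

```shell
#!/bin/sh
# Build a capture file name that embeds a shared timestamp.
# $1 = tool name, $2 = timestamp string (naming scheme is an assumption).
capture_name() {
    printf '%s.%s.out\n' "$1" "$2"
}

# Capture OS and DB2 information over the same interval (hypothetical helper).
# $1 = database alias
capture_all() {
    db=$1
    stamp=$(date +%Y%m%d.%H%M%S)
    # Start the OS samplers in the background so they overlap the snapshot.
    vmstat 5 3 > "$(capture_name vmstat "$stamp")" &
    iostat 5 3 > "$(capture_name iostat "$stamp")" &
    db2 get snapshot for all on "$db" > "$(capture_name db2snap "$stamp")"
    # Wait for the background samplers to finish before returning.
    wait
}
```

Running capture_all sample once during a normal period and once during a peak period produces two sets of files that can be compared side by side.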
Windows Tools
On Windows, you can look at the CPU usage and memory usage in the Task Manager, but you
cannot capture this information into a file as you can with vmstat and iostat.
DB2 tools
DB2 has a number of tools that can be used to monitor the activity of the databases and instances.
These include:
The Health Monitor / Health Center
Snapshot Monitors / SQL Snapshot Functions
Event Monitors
There are also other tools and logs available that provide information about the databases and
instances including:
The administration notification log (a separate file on Linux and UNIX, and incorporated into
the Event Log on Windows)
DB2DIAG.LOG
Memory Visualizer
1. Health Monitor
In Version 8, DB2 introduced two new features to help you monitor the health of your DB2
systems: the Health Monitor and the Health Center. These tools add a management by
exception capability to DB2 9 by alerting you to potential system health issues. This enables you to
address health issues before they become real problems that affect your system's performance.
The Health Monitor runs on the DB2 server and continually monitors the health of the DB2
instance and databases. If the Health Monitor detects that a user-defined threshold has been
exceeded (for example, the available log space has dropped below a set percentage of the total
space available), or if it detects an abnormal state for an object (for example, the DB2 instance is
no longer running), the Health Monitor will raise an alert.
When an alert is raised, two things can occur:
An alert notification is sent, either by e-mail or to a pager.
Pre-configured actions are taken: a CLP script or a Task Center task can be executed.
A health indicator is a system characteristic that the Health Monitor checks. The Health Monitor
comes with a set of predefined thresholds for these health indicators. The Health Monitor checks
the state of your system against these health-indicator thresholds when determining whether to
issue an alert. Using the Health Center, commands, or APIs, you can customize the threshold
settings of these health indicators, and define who should be notified and what script or task
should be run if an alert is issued.
The Health Center provides the graphical interface to the Health Monitor. You use it to configure
the Health Monitor, and to see the rolled up alert state of your instances and database objects.
Using the Health Center's drill-down capability, you can access details about current alerts and
obtain a list of recommended actions that describe how to resolve the alert. You can also choose
to follow a recommended action right inside the tool. The Health Center is easily configured to
show status line health beacons and/or pop up a dialog box telling you that the Health Center has
an object in alert status.
2. Snapshot Monitors / SQL Snapshot Functions
DB2 maintains data about its operation, its performance, and the applications that are accessing
it. This data is maintained as the database manager runs, and can provide important performance
and troubleshooting information. For example, you can find out:
The number of applications connected to a database, their status, and which SQL statements
each application is executing, if any.
Information that shows how well the database manager and database are configured, and
helps you to tune them.
When deadlocks occurred for a specified database, which applications were involved, and
which locks were in contention.
The list of locks held by an application or a database. If the application cannot proceed
because it is waiting for a lock, there is additional information on the lock, including which
application is holding it.
The list of SQL statements executed against a particular database, how many times they
were executed, how many sorts were performed on behalf of the statement and the total
amount of CPU time used by each statement.
The number of sorts that have occurred and the number currently in progress.
Because the monitors do add some overhead to the system, the monitor switches can be
enabled or disabled independently. They can also be set for the entire instance, and all databases
in the instance, or can be set within a database session. If the monitor switches are enabled within
a session, they are only "active" for that session, and a snapshot taken from another session
will not capture the monitor information. If the switches are enabled using the DB2 instance
configuration parameters, they are enabled for all sessions, unless explicitly turned off within a
session.
To set the monitor switches within a session, use the UPDATE MONITOR SWITCHES command
or the sqlmon() API.
For example, to enable buffer pool, lock, and dynamic SQL statement monitoring, turn on the
monitor switches using the following command:
update monitor switches using bufferpool on lock on statement on
NOTE: You must have SYSADM, SYSCTRL, SYSMAINT, or SYSMON (new in DB2 9) authority to
update the monitor switches or to take a DB2 snapshot.
You can access the data that the database manager maintains either by taking a snapshot or by
using an event monitor. You can take a snapshot in one of the following ways:
3. Event Monitors
Once an event monitor has been created and activated, it will collect information about the
database and any database applications when the specified event occurs. An event is a change in
database activity which can be caused by one of the following:
An event monitor is created based on the type of event that you want it to detect and record. For
example, a deadlock event monitor waits for a deadlock to occur; and when one does occur
it will collect and record information about the applications and locks involved in the deadlock
condition.
Event monitors are created using the CREATE EVENT MONITOR statement and will collect event
information only when they are active. An event monitor is activated and deactivated using the
SET EVENT MONITOR STATE statement. The EVENT_MON_STATE function will return the
current state of the specified event monitor.
When the CREATE EVENT MONITOR statement is executed, the definition of the event monitor is
created and stored in the system catalog tables.
SYSCAT.EVENTMONITORS: Event monitors defined for the database.
SYSCAT.EVENTS: Event types being monitored for the database.
SYSCAT.EVENTTABLES: The names of the target tables for table event monitors.
Daily procedures
Verify that all instances are up and running
This can be done in a number of ways:
1. Use the Health Center.
2. Set DB2INSTANCE to the instance name (export on Linux and UNIX, set on Windows)
and run db2start.
3. Attach to all instances.
4. On UNIX or Linux, run ps -ef | grep db2sysc
Verify there is one db2sysc process for each instance.
5. On Windows, check that the service for each DB2 instance is started.
The attach method can be easily scripted as long as all of the instances (that is, NODEs) are
cataloged on your workstation.
To use the ps command on UNIX and Linux you first need to telnet into each of the servers.
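The ps check can be scripted once you are logged in to a server. The sketch below assumes that the first column of the ps -ef output shows the instance owner and that each instance is named after its owner; count_db2sysc and check_instances are hypothetical helper names, and on some systems ps shortens or replaces long user names, so treat this as a starting point.

```shell
#!/bin/sh
# Count db2sysc processes owned by one instance in `ps -ef` output read from stdin.
# $1 = instance owner (assumed to be the first column of ps -ef).
count_db2sysc() {
    awk -v inst="$1" '$1 == inst && /db2sysc/ && !/grep/ { n++ } END { print n + 0 }'
}

# Report each named instance as up only if exactly one db2sysc process is found.
check_instances() {
    for inst in "$@"; do
        n=$(ps -ef | count_db2sysc "$inst")
        if [ "$n" -eq 1 ]; then
            echo "$inst: up"
        else
            echo "$inst: expected 1 db2sysc process, found $n"
        fi
    done
}
```

For example, check_instances db2inst1 db2inst2 prints one status line per instance.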
Connecting successfully to all databases is a good method, because it also makes inconsistent
databases consistent and therefore reduces the time needed for future connect requests. This can
also be easily scripted, as long as all of the databases are cataloged on your workstation.
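A connection check over all cataloged databases might look like the following sketch. It assumes that the output of list database directory contains lines of the form "Database alias = NAME" (as on an English-language installation); db_aliases and connect_all are hypothetical helper names.

```shell
#!/bin/sh
# Extract database aliases from `db2 list database directory` output on stdin.
# Assumes lines such as " Database alias                       = SAMPLE".
db_aliases() {
    awk -F= '/Database alias/ { gsub(/[ \t]/, "", $2); print $2 }'
}

# Try to connect to every cataloged database and report the result.
connect_all() {
    for db in $(db2 list database directory | db_aliases); do
        if db2 connect to "$db" > /dev/null; then
            echo "$db: connect ok"
            db2 connect reset > /dev/null
        else
            echo "$db: CONNECT FAILED"
        fi
    done
}
```

The connect and connect reset commands run in the same shell so that they share one CLP session.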
On Linux and UNIX, the log is written to a file named <instance_ID>.nfy that is located in
the directory specified by the DIAGPATH instance level configuration parameter. To view the
notification log you can:
Connect to each of the servers using telnet or remote terminal services.
For each instance, go to the DIAGPATH directory.
At the command prompt, do one of the following:
Run the tail command on the notification log to display the last 100 lines.
Open the file in an editor and look at the most recent entries at the bottom of the file.
The first step is to ensure that the backups were successful. This is done using the List History
command as follows:
list history backup all for db_name
This can be scripted so that it is run for all databases after the backups complete, and the report
emailed to you. You can then simply verify the report each morning.
In the event that the whole server goes down for a sustained period of time, you may need to
fall back on your disaster recovery plan and restore the database to another server, possibly in
another location. Therefore it is important that the backup images be stored in a safe site, not only
on the server where the backup is taken. This can be easily accomplished by copying the backup image
to a LAN drive, an NFS mounted drive, or a tape device.
Make sure that you capture the output of the commands to a file, and name the file so that it has
the date as part of the name, like DB_DBM_CFG.07152006.out. Then you can use a tool like diff
to compare the current output with the previous day's output as follows:
diff DB_DBM_CFG.07142006.out DB_DBM_CFG.07152006.out
This way, if there is a change you will see something like the following:
< Degree of parallelism                       (DFT_DEGREE) = 1
---
> Degree of parallelism                       (DFT_DEGREE) = 4
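The capture and comparison can be wrapped in a small script. This is a sketch only: cfg_name, capture_cfg, and compare_cfg are hypothetical helper names, and it assumes that the captured commands are get dbm cfg and get db cfg for a single database.

```shell
#!/bin/sh
# Build the dated file name used above, e.g. DB_DBM_CFG.07152006.out.
# $1 = date stamp in MMDDYYYY form.
cfg_name() {
    printf 'DB_DBM_CFG.%s.out\n' "$1"
}

# Capture today's instance and database configuration (assumed commands).
# $1 = database alias
capture_cfg() {
    db=$1
    out=$(cfg_name "$(date +%m%d%Y)")
    { db2 get dbm cfg; db2 get db cfg for "$db"; } > "$out"
}

# Compare two captured files and print any differences.
# $1 = older file, $2 = newer file
compare_cfg() {
    if changes=$(diff "$1" "$2"); then
        echo "no configuration changes"
    else
        echo "configuration changed:"
        printf '%s\n' "$changes"
    fi
}
```

The output of compare_cfg can be mailed or written to a daily report so that an unexpected parameter change stands out.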
Note: The nullif function is used in the query above to return a null when the number inside the
brackets (i.e. pool_data_l_reads or pool_index_l_reads) is zero (0); otherwise the calculation would
cause a divide by zero error and the statement would fail.
Examine the usage patterns for the tables in your database using the query below. This query
examines how many rows were read and written, and the number of overflow records accessed.
select
substr(table_schema,1,8) as Schema,
substr(table_name,1,30) as Table_Name,
rows_read,
rows_written,
overflow_accesses
from table (snapshot_table ('sample', -1) ) as snapshot_table;
Examine the overall database usage patterns using the query below. This query examines:
How many rows were read vs. selected
How many lock waits occurred, the total lock wait time and the average lock wait time
How many deadlocks and lock escalations were detected
How many sorts occurred, the total sort time, and the average sort time, the percent of sorts
that overflowed
select
db_name,
SNAPSHOT_TIMESTAMP,
rows_read,
rows_selected,
lock_waits,
lock_wait_time,
lock_wait_time/nullif(lock_waits,0) as avg_wt_time,
deadlocks,
lock_escals,
total_sorts,
total_sort_time,
total_sort_time/nullif(total_sorts,0) as avg_sort_time,
sort_overflows,
sort_overflows * 100 / nullif(total_sorts,0) as pct_ovflow_sorts
from table (snapshot_database (' ', -1) ) as snapshot_database;
                         used       free     shared    buffers     cached
Mem:                   717240     319008          0      60200     430736
-/+ buffers/cache:     226304     809944
Swap:                       0    1048784
Study DB2
Nothing is more valuable in the long run than a DBA who is widely experienced and as widely
read as possible. This study should include DBA manuals, magazines, newsgroups, and mailing
lists.
The comp.databases.ibm-db2 news group is a great place to learn from, and share information
with, your fellow DBAs.
For more detailed information you should also look for our DB2 Certification Guide series, as these
books are very informative.
Weekly procedures
Look for new objects
It is important to know if people are creating new tables, indexes, stored procedures, etc. in your
production database. New objects typically indicate that a new application has been installed on
the server and any new applications and/or objects will impact the operational characteristics of
your system.
In addition, new objects will consume space within the database, so it is important to identify these
objects before they grow too large and could potentially fill a table space. If these objects are not
created by a DBA, they very likely may have been created in the wrong table space, which can
cause space and/or performance issues.
There are a few alternatives available to check for any new objects within the system:
1. Run db2look and write the report to a file every week.
Check for differences between the new output and the previous week's output.
2. Select object names from SYSCAT.TABLES, SYSCAT.INDEXES, SYSCAT.PROCEDURES
Check for differences between the new output and the previous week's output.
For any differences, you can determine the CREATOR of the object from the catalog table and track
the information back to the person who created the object.
You can then retrieve the SQL statements from the current package cache and insert them into a
table for analysis using the following statement:
You can then examine this table for any SQL statements that have not been executed previously
using the statement:
select distinct stmt,
count(stmt),
tstamp from sqlstmts
group by stmt,
tstamp
In the output of this statement, any statement with a count of 1, and the timestamp column
showing the current date is one that has not been run previously.
You should redirect the output of this command to a file for further analysis.
When viewing the output of the reorgchk tool, find the F1, F2, and F3 columns for your tables, and
the F4, F5, F6, F7, and F8 columns for your indexes. An asterisk (*) in any one of these
columns indicates that DB2 has calculated that the table or index currently breaches that
threshold.
It is important to note that for tables, if you see an asterisk in any of the columns, then you typically
need to reorg the table. However, since many tables have more than one index, by definition
if one of them is 100% clustered, the other indices will not be clustered. Therefore you need to
investigate the index portion of the reorgchk output in more detail and consider all of the indexes
on the table when determining whether or not to reorg the index.
The calculations for the measures used by reorgchk are:
F1: the percentage of rows that are overflow records. When this is greater than 5% there will be an
asterisk (*) in the F1 column of the output.
F2: the percentage of used space on the data pages. When this is less than 70% there will be an
asterisk (*) in the F2 column of the output.
F3: the percentage of pages that contain some records. When this is less than
80% there will be an asterisk (*) in the F3 column of the output.
F4: the cluster ratio, i.e. the percentage of rows in the table that are in the same order as the
index. When this is less than 80% there will be an asterisk (*) in the F4 column of the output.
F5: the percentage of space on each index page that is used for index keys. When this is
less than 50% there will be an asterisk (*) in the F5 column of the output.
F6: the number of keys that can be stored on each index level. When this is less than 100 there
will be an asterisk (*) in the F6 column of the output.
F7: the percentage of record IDs (keys) on a page that have been marked as deleted. When this is
more than 20% there will be an asterisk (*) in the F7 column of the output.
F8: the percentage of empty leaf pages in the index. When this is more than 20% there will be an
asterisk (*) in the F8 column of the output.
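The table thresholds above (F1, F2, and F3) can be expressed as a small check. This sketch is only an illustration of the three comparisons: the helper name table_flags is hypothetical, the three-character flag string is loosely modeled on how reorgchk marks its results, and the input values are assumed to be percentages that you have already obtained.

```shell
#!/bin/sh
# Print one flag per table measure: "*" if the threshold is breached, "-" if not.
# $1 = F1 (percent overflow rows), $2 = F2 (percent used space on data pages),
# $3 = F3 (percent of pages that contain some records).
table_flags() {
    awk -v f1="$1" -v f2="$2" -v f3="$3" 'BEGIN {
        printf "%s%s%s\n", (f1 > 5 ? "*" : "-"), (f2 < 70 ? "*" : "-"), (f3 < 80 ? "*" : "-")
    }'
}
```

For example, table_flags 2 65 90 flags only the second measure, because 65% used space is below the 70% threshold.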
When reorganizing a table you can optionally specify on which index DB2 should cluster the
data. To reorg the ORG table based on the ORGX index, use the command
reorg table org index orgx
The DB2 optimizer uses database statistics to determine the optimal access plans for your SQL
statements. When you make significant changes to the amount of data, or to the data organization
in your tables you should use the runstats tool to capture new statistics and store them in the
system catalogs. You should also be sure to capture statistics for any new table or index.
To capture statistics for the ORG table, and its indexes you can use the command
runstats on table <schema>.org with distribution and detailed indexes all
NOTE: You must specify the schema for the table when using the runstats command.
select substr(name,1,30),substr(creator,1,10),stats_time
from sysibm.systables
where stats_time < ((current timestamp) - 7 days)
or stats_time is null;

select substr(name,1,30),substr(creator,1,10),stats_time
from sysibm.sysindexes
where stats_time < ((current timestamp) - 7 days)
or stats_time is null;
To find the 10 most updated tables, based on the number of rows written, use the following
statement:
select substr(table_schema,1,10) as tbschema,
substr(table_name,1,30) as tbname,
rows_read,
rows_written,
overflow_accesses,
page_reorgs
from table (SNAPSHOT_TABLE(' ',-1)) as snapshot_table
order by rows_written desc
fetch first 10 rows only
These tables are also likely candidates for at least a runstats, if not a reorg and a runstats.
For the DB2DIAG.LOG file as well as the administration notification log file on Linux and UNIX, you
should compress these files, and name them with the current date in the file name as well.
On Linux or UNIX, you can tar the *.nfy and db2diag.log files together, and then use either gzip or
compress to reduce the size of the resulting file.
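The tar and gzip steps can be combined as in the following sketch. The helper name archive_diag and the archive name db2logs.<date>.tar.gz are assumptions for illustration; the sketch only creates the archive and leaves the original files in place.

```shell
#!/bin/sh
# Archive the notification logs and db2diag.log from one diagnostic directory.
# $1 = directory given by DIAGPATH, $2 = date stamp to embed in the archive name.
archive_diag() {
    diagpath=$1
    stamp=$2
    archive="db2logs.$stamp.tar"
    (
        # Work inside the diagnostic directory so the archive holds relative paths.
        cd "$diagpath" || exit 1
        tar -cf "$archive" ./*.nfy db2diag.log && gzip "$archive"
    )
}
```

For example, archive_diag /path/to/diagpath "$(date +%m%d%Y)" leaves a file such as db2logs.07152006.tar.gz in that directory, where /path/to/diagpath stands for your own DIAGPATH value.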
Monthly procedures
Look for indicators of exceptional growth
Review your tables and table spaces to see how much they have grown in the past month. By
knowing how fast the tables and table spaces are growing, and how much space is still available,
you can detect potential space issues before they happen.
You can retrieve the size of the table space and the amount of space available using the statement
below.
select substr(tablespace_name,1,120) as TBSPC_NAME,
used_pages,
free_pages
from table (snapshot_tbs_cfg ('sample', -1) ) as snapshot_tbs_cfg
You can see how big each of your tables is by looking at the system catalog tables. As long as
your statistics are current, this information will be accurate. To get the size of your tables use the
statement
select tabname,
npages
from syscat.tables
where tabname not like 'SYS%'
NOTE: If statistics have not been captured for a table, it will have a value of -1 for npages.
Create a history table or a spreadsheet to store this information so that you can scrutinize the
space usage for your tables and table spaces over time. An easy way to do this is to create an
export statement using the select statements above, and create a delimited ASCII (DEL) file which
you can then import directly into a spreadsheet.
-- Insert the snapshot info into the tablespaceinfo table to be stored for analysis.
insert into tablespaceinfo
select
  current timestamp,
  substr(tablespace_name,1,120) as TBSPC_NAME,
  (case
     -- We can calculate pct free for DMS table spaces only, as total_pages is
     -- set to 0 for SMS by this stmt...
     -- Therefore, check if DMS, and then calculate pct_free as (1 - (used/total)) * 100%
     when tablespace_type = 0 then (int( (1 - (decimal(used_pages) /
          decimal(total_pages))) * 100) )
     -- For SMS set pct_free to 100... Could set to any numeric value.
     else 100
   end) as pct_free,
  (case
     -- Display the table space type, i.e. DMS or SMS, as a string, not the numeric
     -- value in the info.
     when tablespace_type = 0 then 'DMS'
     when tablespace_type = 1 then 'SMS'
     -- Only 0 and 1 are VALID, therefore return an error for anything else.
     else 'Error'
   end) as Managed_By,
  (case
     -- Display the type of data that can be stored in the table space, i.e. TEMP,
     -- LARGE/LOB or ALL, not the numeric value in the info.
     when tbs_contents_type = 2 then 'TEMP'
     when tbs_contents_type = 1 then 'LARGE'
     when tbs_contents_type = 0 then 'ALL' end) as Data_Type,
  -- Also return the total_pages using the heading ALLOCATED PAGES,
  total_pages as allocated_pages,
  usable_pages,
  used_pages,
  free_pages,
  page_size
from table (snapshot_tbs_cfg ('sample', -1) ) as snapshot_tbs_cfg
order by pct_free;
select tablespace_name,
date(timestmp) as dte,
pct_free
from tablespaceinfo
group by tablespace_name, pct_free, timestmp ;
Resources
Learn
Visit developerWorks DBA Central to sharpen your skills on installation, migration,
administration, problem determination, monitoring, availability, security, and performance.
Visit the developerWorks resource page for DB2 for Linux, UNIX, and Windows to read
articles and tutorials and connect to other resources to expand your DB2 skills.
Learn about DB2 Express-C, the no-charge version of DB2 Express Edition for the
community.
Stay current with developerWorks technical events and Webcasts.
Get products and technologies
Download a free trial version of DB2 Enterprise 9.
Now you can use DB2 for free. Download DB2 Express-C, a no-charge version of DB2
Express Edition for the community that offers the same core data features as DB2 Express
Edition and provides a solid base to build and deploy applications.
Discuss
Participate in the discussion forum for this content.
Participate in developerWorks blogs and get involved in the developerWorks community.