How it Works
Cary Millsap
Hotsos Enterprises, Ltd.
Conventional Oracle “tuning” methods are outrageously complex, and they don’t
work. For more than three decades, good software developers have known how
to optimize computer-executed tasks using a simple method that works every
time. This presentation describes how to use this method in an Oracle
environment, and it demonstrates the method’s use with an example.
In your Oracle life, though, you’ve been trained to view things differently.
Professionals have taught you that looking directly at how long an individual
task takes is either unproductive or impossible. You’re taught that the only way
to analyze Oracle performance is by studying a complex network of statistics that
don’t directly answer the question you’re actually asking.
Asking Statspack how long something took is rather like asking how long a flight
from Dallas to Chicago should take, and then having someone give you an
answer like this:
“The average fuel consumption per flight for all flights conducted upon the
planet in the past year is x.”
There are only a couple of things you can do with information like that:
• You could try to extract the information you really wanted from the
information you got. For some, it’s a fun technical challenge. For example,
you might use the relationship average-flight-duration = total-fuel ÷
average-fuel-consumption-rate to try to compute some approximation of a
world-wide average flight duration. But even if you got this far, you would
need to realize that this “average flight duration” you’ve found includes
thousands of different kinds of flights. Such a number would include some
18½-hour flights from Los Angeles to Singapore, some 35-minute flights
from Honolulu to Maui, and everything in-between. What can an average
like that possibly tell you about the duration of a flight from Dallas to
Chicago? If you’re lucky, the duration of a Dallas-to-Chicago flight might
resemble the worldwide average. But, realize… You’ve got to be lucky for
the answer to do you any good at all.
• Or you could insist upon a direct answer to your question: How long does a
flight from Dallas to Chicago take? Such information is, of course,
attainable through other means.
Why doesn’t a Statspack report tell you how long your program takes? One good
reason is that such information cannot be derived from the kinds of operational
data that Statspack uses as its input (at least not from the stuff you’re taught to
give it as its input).2 Perhaps the biggest problem with Statspack is that when
1 Statspack is “a set of SQL, PL/SQL, and SQL*Plus scripts that allow the collection, automation, storage, and viewing of performance data. This feature has been replaced by the Automatic Workload Repository.” From Oracle® Database Performance Tuning Guide 10g Release 1 (10.1), available at otn.oracle.com.
2 Chapter 8 of Optimizing Oracle Performance, by Cary Millsap and Jeff Holt, contains more than 40 pages of detail describing why this is so.
Traditional Oracle performance analysis methods just don’t work for several
classes of commonly occurring performance problems.3 The problem is that
there’s no direct correspondence between the metric values you’re “supposed to”
study and the goodness or badness of performance in your Oracle system.4
The traditional performance analysis methods that the Oracle culture teaches its
newcomers are medieval in comparison to the tools and methods that
professional software developers use.5 Why not just look at what you’re trying to
see (what took so long)? The answer “because it’s impossible” is simply not
acceptable, because developers do it all the time. The way we do it is by using
software profilers.
3 One might argue that they’re so “commonly occurring” specifically because traditional methods leave them undiagnosed.
4 For details and more examples, see “Why a 99%+ database buffer cache hit ratio is not ok” and “Why you should focus on LIOs instead of PIOs,” available at www.hotsos.com.
5 That’s not to say that the tools that Oracle performance analysts use don’t look nice enough. Most traditional Oracle performance analysis tools are tremendously attractive. They just don’t tell you where your code is spending its time.
This particular program’s response time was 3.12 seconds (you can see this in
the bottom row of the cumul. seconds column). From the output, you can see
exactly where the time went: 68.59% of the program’s response time was
consumed by 62,135,400 executions of the subroutine called step.
6 Knuth, D. E. 1971. “An empirical study of FORTRAN programs.” Software—Practice and Experience, Vol. 1, No. 2 (Apr/Jun 1971), pp. 105–133. Mainframers in the audience might remember the famous mainframe profiling tool called STROBE, or the Candle Omegamon feature called INSPECT.
7 Fenlason, J.; Stallman, R. 1988. “The GNU profiler,” available at http://www.gnu.org/software/binutils/manual/gprof-2.9.1/html_mono/gprof.html.
8 Gough, B. J. 2004. An Introduction to GCC—for the GNU compilers gcc and g++. Bristol, UK: Network Theory Ltd.
But you, the Oracle performance analyst, can’t use gprof to debug and optimize
the code in the Oracle kernel (I’ll show you why in a minute). You have to rely
upon the operational data that the Oracle kernel developers have given you. In
the 1980s, you were literally condemned to not being able to find out where your
response time was going. But in late 1992, Oracle Corporation made a
significant step toward solving the problem. If you’re using any version of
Oracle at least as new as 7.0.12, then you automatically possess an essential key
to being able to profile your Oracle application programs.
Once you have your TIMED_STATISTICS setting in proper shape, you can
activate and deactivate tracing using standard Oracle packages, as shown in
Exhibits 2 and 3.
9 Software profiling is a momentous rediscovery of the twenty-first century. Newer profiling tools like the DTrace tool that is built into Sun Microsystems’s Solaris 10 kernel provide spectacular diagnostic capabilities. See http://www.sun.com/2004-0518/feature/ for more information.
Exhibit 2. Activating and deactivating extended SQL trace with dbms_monitor (Oracle Database 10g and later).
/* Otherwise, do this */
dbms_monitor.session_trace_enable(:sid, :serial, true, true)
/* Code to be profiled runs during this time. */
dbms_monitor.session_trace_disable(:sid, :serial) /* …or just disconnect */

Exhibit 3. Activating and deactivating extended SQL trace with dbms_support (earlier releases).
/* Otherwise, do this */
dbms_support.start_trace_in_session(:sid, :serial, true, true)
/* Code to be profiled runs during this time. */
dbms_support.stop_trace_in_session(:sid, :serial) /* …or just disconnect */
4 Profiling Oracle
The problem with profiling Oracle prior to 1992 was that Oracle customers
couldn’t do it. The Oracle kernel simply couldn’t emit the operational data
required to produce a profile. This is why traditional methods of looking at
system-wide resource consumption statistics were invented in the first place (like
the ones implemented in Statspack).
Since the year 2000, a few colleagues and I have devoted our careers to
determining the minimal necessary sequence of steps required to solve an
Oracle performance problem.
After you learn how to extricate meaning from Oracle extended SQL trace data,
there remains only one problem. To manage the information contained within the
very large extended SQL trace files that the Oracle kernel emits, you need
software assistance. This used to be a difficult technical problem. Now it’s
merely a question of economics, because today you can buy textbooks and
education that teach you how to extract what you need from your trace data, and
you can buy prepackaged software or services if you prefer to have someone else
do the work for you.10
10 Further information about one means of obtaining such courses, books, software, and services is rendered in the About Hotsos section at the end of this paper.
Another type of flat profile that is valuable in an Oracle context is the profile by
database call. For example, it is interesting to know how much time was
consumed by parsing for a given SQL statement. How much time by executing?
And how much time by fetching? For such information to constitute a true
profile, the total response time rendered in the table needs to be the total
response time contributed by the statement. This requires the addition of a row
denoting time spent between database calls. Sometimes the time spent between
calls dominates a statement’s total response time.
Exhibit 5 shows an example of a database call profile. In this example, you can
see that the dominant response time consumer for the statement was the time
spent between database calls, followed distantly by the time spent executing and,
finally, parsing the statement. If you’re familiar with the behavior that causes
most between-call event time that is attributable to SQL statements, the greatest
benefit of the database call profile in Exhibit 5 is the information about how
11 The problem and its solution are described in their entirety in Optimizing Oracle Performance, pages 327–332.
Exhibit 5. A flat profile that decomposes Oracle response time by database call.
-----Duration (seconds)-----
Database call Elapsed CPU Other Calls Rows LIOs PIOs
-------------------- ------------- ----- ------ ------ ----- ------ ----
Between-call events 23.850 93.2% 0.000 23.850 0 0 0 0
EXEC 0.880 3.4% 0.890 -0.010 348 348 3,859 351
PARSE 0.870 3.4% 0.820 0.050 696 0 0 0
-------------------- ------------- ----- ------ ------ ----- ------ ----
Total 25.600 100.0% 1.710 23.890 1,044 348 3,859 351
-------------------- ------------- ----- ------ ------ ----- ------ ----
Total per EXEC 0.074 0.3% 0.005 0.069 3 1 11 1
Total per row 0.074 0.3% 0.005 0.069 3 1 11 1
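The arithmetic behind a true profile like Exhibit 5 is worth making explicit: each row’s Elapsed column must equal CPU plus Other, the rows must sum to the total response time, and each percentage is simply that row’s share of the total. A quick check, with the figures copied from Exhibit 5:

```python
# Verify the accounting identities behind Exhibit 5 (figures copied
# from the exhibit): Elapsed = CPU + Other for each row, and the rows
# must sum to the total for the table to be a true profile.
rows = {
    # name: (elapsed, cpu, other)
    "Between-call events": (23.850, 0.000, 23.850),
    "EXEC":                ( 0.880, 0.890, -0.010),
    "PARSE":               ( 0.870, 0.820,  0.050),
}
total_elapsed = 25.600

for name, (elapsed, cpu, other) in rows.items():
    assert abs(elapsed - (cpu + other)) < 0.001, name

contributed = sum(e for e, _, _ in rows.values())
assert abs(contributed - total_elapsed) < 0.001

# Each percentage is the row's share of the total response time.
pct = {name: round(100 * e / total_elapsed, 1) for name, (e, _, _) in rows.items()}
print(pct)  # Between-call events 93.2, EXEC 3.4, PARSE 3.4
```

This closure property, where every second of response time is accounted for somewhere, is what distinguishes a profile from an arbitrary pile of statistics.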
A call graph is interesting in the Oracle context because of the way that Oracle
database calls can nest (Oracle Corporation uses the term recursive SQL to refer
to nested database calls). For example, the following statement produces such
nested database calls:
select object_name from dba_objects where object_id=:id
Specifically, the Oracle parse call for this statement motivates three database
call “children”: a parse call, an exec call, and a fetch call upon a new SQL
statement that is never mentioned in the top-level application code. The SQL
statement that the Oracle kernel must parse, execute, and fetch before it can
finish parsing our application’s select statement is this one:
select text from view$ where rowid=:1
A call graph for the relationship between our application SQL and its recursive
SQL might look like this:
------Duration (seconds)------
Including
Statement Self children
-------------------------------------- --------------- -------------
select object_name from dba_objects... 0.200s 40.0% 0.500 100%
│ select text from view$ where ro... 0.300s 60.0% 0.300 60%
-------------------------------------- --------------- -------------
Total 0.500s 100.0%
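The “including children” column above can be derived mechanically from self times: a statement’s inclusive time is its own time plus the inclusive time of each recursive child. A small sketch, using the figures from the call graph above (the tree structure here is a simplified stand-in for real recursive-SQL parent/child data):

```python
# Deriving "including children" (inclusive) time from self times in a
# call graph. Figures come from the recursive SQL example above; the
# dict structure is an illustrative stand-in for real trace data.
calls = {
    "select object_name from dba_objects...": {
        "self": 0.200,
        "children": ["select text from view$ where rowid=:1"],
    },
    "select text from view$ where rowid=:1": {
        "self": 0.300,
        "children": [],
    },
}

def inclusive(name):
    # Inclusive time = a call's own time plus its children's inclusive time.
    node = calls[name]
    return node["self"] + sum(inclusive(c) for c in node["children"])

parent = "select object_name from dba_objects..."
total = sum(node["self"] for node in calls.values())
print(round(inclusive(parent), 3), round(total, 3))  # → 0.5 0.5
```

Note that the parent’s inclusive time (0.500 s) equals the task total, while its self time is only 0.200 s: that difference is exactly the blame that the call graph lets you assign to the recursive child.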
A true call graph for any task that had an n-second duration would show a total
response time of n seconds, even if it had to add a line item labeled “time spent
between database calls.”
The great benefit of having an Oracle call graph is that it enables you to affix the
blame for poor performance accurately upon the specific SQL statement that is
causing your pain. For example, if a recursive SQL statement that reloads your
dictionary cache (part of your shared pool) is a dominant contributor to your
response time, a call graph shows you which SQL statements the recursive
statement is a child of. Perhaps it’s possible to parse those particular statements
less often. With the call graph, you can target those specific statements that will
produce the greatest result for the least effort. Without the call graph, you might
mistakenly believe that you had to reduce parse calls system-wide, or that your
best recourse is to increase the size of your shared pool.
5 Case Study
When you use a profiler, your work follows a repeatable plan. This section
shows a case that illustrates the process. The root cause of the performance
problem in this case is like dozens we see each year in the field.
5.1 Baseline
The case begins with a batch job that takes too long to run. The database server
is a single-CPU Windows XP machine running Oracle version 9.2.0.4. There are
two concurrent processes on the system, each running the same kind of batch
job, but manipulating completely different tables in a single database. Each
client process inserts 5,000 rows into its own table. The application client
process runs on the same host as the Oracle kernel. The client connects via the
following alias defined in the system’s tnsnames.ora file:
v92 =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = CVM-LAP02)(PORT = 1521))
    )
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = v92.hotsos)
    )
  )
Exhibit 6 shows the baseline flat profile for the slow job. The job ran for
11.6 seconds, and the flat profile shows why.
The profile is clear: 70.5% of the slow program’s response time is due to a
response time component called instrumentation gap. Whether you know what
that means or not, the profile says undeniably where you need to invest your
effort. But what does instrumentation gap mean? There’s no “official” Oracle
timed event named this (if you review the content of Oracle’s V$EVENT_NAME
fixed view, you’ll notice that there’s no Oracle timed event named
instrumentation gap).
The instrumentation gap timed event is an event that a profiler must synthesize
to recognize otherwise unexplained jumps in the timer values rendered within an
Oracle trace file. It’s all part of the effort to account for exactly the response
time that the user has experienced. In the Oracle case, when instrumentation gap
time dominates a profile, it usually indicates that a process has spent a lot of its
time in the operating system’s preempted state.13 The most effective way to
reduce the amount of time that a program spends in this state is to reduce
competition for the system’s CPU. (An operating system scheduler will not
preempt a process if there’s no other process waiting in the run queue.)
The second timed event listed in the profile is SQL*Net message from client.
Most Oracle instruction teaches you that you’re supposed to ignore this
particular event, because it represents time that the Oracle kernel spends “idle,”
awaiting some command. However, this particular “idle” event accounts for
20.4% of our program’s response time. To ignore the time attributed to this
event would be an exceptionally bad policy, as you’ll see in a moment.
12 All the profiles for this case study were computed using a pre-beta version of a software product called Hotsos Profiler P5. The profiles for this particular case may vary slightly in later, more thoroughly tested versions of the software. However, the gist of the case—the flow of the diagnosis—is exactly correct.
13 For details about why it works this way, see pages 170–175 of Optimizing Oracle Performance.
The fourth row in the profile (CPU service, EXEC calls) shows that the program
consumed 0.411 seconds of its total response time actually inserting the rows
into its table. Contemplation of this row puts things into perspective: of the
11.632 seconds it took to run this thing, the program spent less than half a
second (only 3.5% of the job’s run time) doing the work I had asked it to do. An
efficient program would spend less time doing anything other than inserting the
rows we need inserted.
5.2 Drill-down
So, which SQL is it that’s being parsed too much? The answer to this question
lies in the job’s profile by SQL statement (the job’s call graph), shown in
Exhibit 7. From this profile, it’s easy to tell that the offending SQL statement is
an insert statement, which single-handedly accounts for 97.7% of the job’s
11.601-second total response time.
The database call profile for this SQL statement tells the story of where the
statement spent all this time. This profile is shown in Exhibit 8.
-----Duration (seconds)-----
Database call Elapsed CPU Other Calls Rows LIOs PIOs
-------------------- ------------- ----- ------ ------ ----- ------ ----
Between-call events 6.541 57.7% 0.000 6.541 0 0 0 0
PARSE 3.801 33.5% 1.843 1.958 5,000 0 0 0
EXEC 0.991 8.7% 0.411 0.580 5,000 5,000 30,851 0
-------------------- ------------- ----- ------ ------ ----- ------ ----
Total 11.333 100.0% 2.253 9.079 10,000 5,000 30,851 0
-------------------- ------------- ----- ------ ------ ----- ------ ----
Total per EXEC 0.002 0.0% 0.000 0.002 2 1 6 0
Total per row 0.002 0.0% 0.000 0.002 2 1 6 0
Most of the job’s time was consumed between database calls. What’s the best
way to reduce that time? Reduce the number of database calls to begin with.
Exhibit 8 shows that this job’s insert “statement” executed 10,000 database calls
altogether: 5,000 parse calls and 5,000 exec calls. Parsing consumed far more
time than executing.
So, why does the job execute 5,000 Oracle parse calls? The highlighted text in
the profile heading tells the answer: there are 5,000 similar but distinct versions
of this SQL text. A quick look either at the raw trace data or the application code
itself shows the following sequence of SQL statements that were processed
during the problem job:
insert into parse2 values (1, lpad('1',20))
insert into parse2 values (2, lpad('2',20))
insert into parse2 values (3, lpad('3',20))
…
insert into parse2 values (5000, lpad('5000',20))
Each distinct SQL statement is, in turn, parsed and then executed. That’s a big
waste of time and energy, because with Oracle, an application doesn’t need to do
so much work to insert 5,000 rows into a table. A look at
the application source code reveals the problem. The source code says,
essentially:14
for each row, varying $n over some range of values
$result = do("insert into t values ($n, …)")
14 Each year, we see this problem manifested in many languages including Java, C, Visual Basic, and others. In this paper, I use a pseudocode that’s similar in structure to Unix shell or Perl programming, using the convention that scalar variables are denoted with a $ prefix. For example, the statement $x=1 assigns the value 1 to the variable named $x, and f($x) executes the function called f upon the value of $x.
One rule of thumb that every Oracle application developer ought to know is this
one:
It’s almost never appropriate to execute Oracle parse calls inside a loop.
The repaired code differs from the original in three important ways:
1. The new code doesn’t use do() at all. Instead, it separates the parse and
exec calls into two distinct function calls, so that the parse may be executed
once and only once, outside the loop.
2. In the parse call, it uses a placeholder (in this case, the ? symbol) instead of
some actual value to be inserted. This allows for one parse call to create a
reusable cursor handle that can accommodate many different exec calls with
different values.
3. In the exec call, the argument list includes the actual values that are to be
bound into the original SQL statement in the position denoted by the
placeholder.
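The three steps above can be sketched in a few lines of DB-API-style Python, with sqlite3 standing in for Oracle (the table name parse2 comes from the trace above; everything else here is illustrative, and with a real Oracle driver the parse would happen once against a reusable cursor):

```python
# Parse once, bind per row: one SQL text with a ? placeholder replaces
# 5,000 distinct SQL texts. sqlite3 stands in for Oracle here; the
# table name "parse2" comes from the trace data in the paper.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table parse2 (n integer, pad text)")

# One SQL text, prepared conceptually once; only bound values vary.
sql = "insert into parse2 values (?, ?)"
for n in range(1, 5001):
    conn.execute(sql, (n, str(n).rjust(20)))  # bind, don't concatenate
conn.commit()

count = conn.execute("select count(*) from parse2").fetchone()[0]
print(count)  # → 5000
```

Because every iteration submits the identical SQL text, the database (and the driver’s statement cache) can reuse one cursor for all 5,000 rows instead of parsing 5,000 distinct statements.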
Writing the code this way should eliminate 4,999 parse calls and of course some
amount of time that is attributable to those eliminated parse calls.
It is reasonable to expect that the proposed elimination of parse calls will result
in the improvements depicted in Exhibit 10. The highlighted figures show the
estimated result of the call reduction caused by rewriting the code. I expect for
the code change to impact more than just the CPU service, PARSE calls
component. In fact, there are four timed events for which I expect an elimination
of parse calls to create a response time improvement.
I have derived each “estimated time after” figure using the following relation:

    time_after = time_before × (calls_after ÷ calls_before)
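Applied to the PARSE row of Exhibit 8, for example, the relation forecasts that collapsing 5,000 parse calls into 1 eliminates nearly all of the 3.801 seconds of parse elapsed time:

```python
# Forecast: time_after = time_before * (calls_after / calls_before),
# applied to the PARSE row of Exhibit 8 (5,000 parse calls become 1).
time_before  = 3.801   # seconds of PARSE elapsed time (Exhibit 8)
calls_before = 5000
calls_after  = 1

time_after = time_before * (calls_after / calls_before)
print(round(time_after, 4))  # → 0.0008
```

Repeating this computation for each timed event that scales with parse calls yields the “estimated time after” column of Exhibit 10.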
It is therefore reasonable to suppose that response time for the statement will
drop from 11.333 seconds (before) to 4.843 seconds (after). Plugging this
estimated new response time value back into the SQL statement profile for the
job (Exhibit 7) shows that we should expect the job that presently takes
11.601 seconds to consume only 5.116 seconds after the proposed change.
That’s not a bad day’s work: it’s a 56% reduction in response time. As I’ll reveal
shortly, a forecast computed like I’ve shown here is almost always a
conservative estimate, which is additional good news.
Exhibit 11. The profile for the job formerly known as the “problem job.”
Oracle subroutine Duration (secs) # Calls Dur/call
--------------------------------- ----------------- --------- -----------
instrumentation gap 0.974 63.7% 4,993 0.000195
SQL*Net message from client 0.393 25.7% 5,010 0.000078
CPU service, EXEC calls 0.381 24.9% 5,011 0.000076
CPU service, PARSE calls 0.090 5.9% 11 0.008194
log file sync 0.027 1.8% 1 0.027396
SQL*Net message to client 0.008 0.5% 5,010 0.000002
latch free 0.000 0.0% 9 0.000000
instrumentation overlap -0.343 -22.4% 38 0.009035
--------------------------------- ----------------- --------- -----------
Total 1.530 100.0%
I hope that two thoughts are swirling through your head at this point:
• Wow. This program runs even faster than I thought it was going to!
So, what happened!? Why were the results so much better than the forecast?
15 In my test, I used Perl with the DBI module version 1.37 and the DBD::Oracle module version 1.12.
16 See chapter 9, “Queueing Theory for the Oracle Practitioner,” in Optimizing Oracle Performance, by Cary Millsap and Jeff Holt.
That is a question for your business to decide. The answer comes from
answering the following questions:
One strategy for eliminating even more database calls plus undoubtedly some of
the SQL*Net message from client duration is to use Oracle’s array insertion
feature. So the question becomes:
Is the performance you might gain by using Oracle’s array insertion feature
within the application worth the cost of learning how to use it?
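In DB-API terms, array insertion means replacing the per-row exec loop with a single executemany call. A sketch, again with sqlite3 standing in for Oracle (Oracle drivers such as python-oracledb map executemany onto true array binds, so one client round trip can carry many rows):

```python
# Array insertion sketched with the DB-API executemany call; sqlite3
# stands in for Oracle. All 5,000 rows are submitted in one call
# instead of 5,000 separate exec calls.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table parse2 (n integer, pad text)")

rows = [(n, str(n).rjust(20)) for n in range(1, 5001)]
conn.executemany("insert into parse2 values (?, ?)", rows)
conn.commit()

count = conn.execute("select count(*) from parse2").fetchone()[0]
print(count)  # → 5000
```

With a real Oracle client, this is what attacks the remaining SQL*Net message from client time: fewer calls means fewer round trips between the application and the kernel.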
6 Conclusion
A profiler should be the backbone of any performance measurement strategy. A
good Oracle profiler gives you the first information you need in any performance
problem diagnosis project: exactly where all of your program’s time has gone.
A profiler gives you two capabilities that you can’t find anywhere else:
Measuring
The core benefit of using a profiler is that you can see exactly what impact a
given problem creates for exactly the program you care about. This
capability is indispensable in problem diagnosis situations, and even in
preemptive performance management tasks.
Forecasting
A profiler aids forecasting in two distinct ways. First, it allows you to play
what if? for each component of your program’s response time. It shows you
the details of what impact a proposed change will have upon your program,
with a minimum of mathematical effort. Second, it allows you to see the
exact result of a change after you make it. When you use a profiler in a test
environment, you get a much more detailed idea of how the change you are
testing will behave once you implement it in production.
By selecting the particular case study that I’ve shown in this paper, I’ve meant to
illustrate several lessons, including the following.
• Some common performance problems, like the one described here, are much
easier to diagnose with a good profiler than with any other tool. Other
metrics (low system-wide parse-to-exec ratios, for example) don’t provide
the measurement and forecasting capabilities that a profiler provides.
Acknowledgments
Thank you to the people who have made Oracle profiling possible: Anjo Kolk,
the father of response-based Oracle performance analysis; Mogens Nørgaard, the
man who introduced me to the Oracle features that make profiling possible;
Virag Saksena, who showed me the true potential of extended SQL trace data;
Jeff Holt, the father of our Hotsos Profiler product line; Jon Bentley, the author
of More Programming Pearls, which planted the profiling seed in my mind;
Jonathan Gennick, my O’Reilly editor who helped me learn; Gary Goodman and
the staff at Hotsos Enterprises, who make it possible for me to feed my family
while having fun at work. Thank you to my contributors and proofreaders: Larry
Klein, Karen Morton, and James Steel. And thank you to my beautiful and
loving family—Mindy, Alex, and Nik, and my parents Van and Shirle—who
support me every day.
About Hotsos
Hotsos Enterprises, Ltd. is a Dallas-based company with more than
600 customers and clients worldwide that use the company’s software products,
classroom and on-site education, and consulting services to improve the
performance of their Oracle systems. Hotsos’s revolutionary methods and tools
help database administrators and application developers solve performance
problems faster and more permanently than ever before. For more information
on Hotsos products and solutions, visit www.hotsos.com or call
+1.817.488.6200.
You can find more information about Hotsos products and services related to
this paper by visiting the following resources:
http://www.oreilly.com/catalog/optoraclep/index.html
This is the home page for the book Optimizing Oracle Performance. It
includes a free download of the first chapter.
http://www.hotsos.com/courses/PD101.php
This page describes the Hotsos educational course that covers collection and
interpretation of Oracle performance diagnostic data.
http://www.hotsos.com/products/profiler.html
This page describes the Hotsos Profiler, a software profiler for Oracle
extended SQL trace data. The Hotsos Profiler P5 (scheduled for release in
early 2005) is the only profiler for Oracle that possesses all of the profiling
features discussed in this paper.
http://www.hotsos.com/services/index.html
This page describes Hotsos performance on-site and remote services for
helping you optimize your Oracle systems.