Sei sulla pagina 1di 42

Basics Of SQL Tuning

Introduction to SQL
What is SQL
SQL is structured Query Language which is a computer
language for storing, manipulating and retrieving
data stored in relational database.

what happen when you fired the SQL Statement


 Parsing
 Binding
 Executing.
 Fetching (For Select Statement)

SQL Tuning Constitute of Three parts


 SQL Writing (50 %)
 SQL Tuning (30 %)
 Advance SQL Tuning(20 %)
Tips on Writing Tuned SQL

 Know your indexes


 Know the required response time
 Have sufficient data volumes
 Have the optimizers, statistics and Oracle Parameters the
same as production
 Know how to obtain and interpret an execution plan
 Know when and when not to use indexes
 Know what prevents an index from being used
 Understand how Oracle joins tables and get the Driving
Table right
 Understand the behavior of the Oracle Optimizers
 Know which hints are available to you and which ones to
use
 Having an architecture and data model that will assist
performance
Efficient Sql Writing Tips-

Never use >SELECT * from Statement:


Each query should access only the columns needed for the function that will be
performed. But if all the columns are required then, there is no harm in using
SELECT *.

Never use a Cursor for fetching a single row:


Because for using cursors you will have to DECLARE, OPEN, FETCH and CLOSE
them. A single SELECT is much faster.

Avoid using DISTINCT: The DISTINCT verb removes the duplicates from the result
table. If duplicates are not a problem and if the chance of having a duplicate
does not exist, then avoid using it because it adds to the overheads.

Use Equivalent Data Types: Use the same data type and lengths when
comparing column values to host variables or literals. This eliminates the need
for data conversion. For examples, to compare a CHAR (6) column with a CHAR
(5) column, Oracle has to do the data conversion and when Oracle has to convert
data the available indexes are not used. The same is the case when there is a
mismatch in lengths or data type between a column and a host variable that are
being compared.
Use BETWEEN instead of <= and >=:
The BETWEEN predicate is more efficient that the `greater/less than or
equal to’ predicates because the optimizer select the most efficient path for
the BETWEEN predicate.

Use IN instead of LIKE:


Use IN or BETWEEN instead of LIKE whenever possible. If you know that
only a certain number of occurrences exist, using IN with the specific list will
be much more efficient than LIKE. You can imitate the functionality of LIKE
by using a range of values.

Avoid using NOT:


Whenever possible avoid the use of NOT (<>). Instead reword the SQL so
that the same results are obtained. For example use ‘WHERE sum > 100’
instead of ‘WHERE sum <> 100’.

Code the most restrictive predicate first:


Choose the condition that will eliminate most of the rows as the first
predicate. For example, if you want to get the names of all the person whose
name starts with the letter ‘M’ in ‘Mumbai’ city, instead of coding ‘WHERE
name BETWEEN ‘NNN’ AND ‘MMM’ and city=’Mumbai’, code city=’Mumbai’
AND name BETWEEN ‘NNN’ AND ‘MMM’.
Use UNION ALL instead of UNION:
The UNION operator always results in a sort. So in cases where there is no
possibility of duplicates or in case where elimination of duplicates is
unnecessary use UNION ALL instead of UNION.

Use NOT EXISTS instead of NOT IN:


When you are coding a sub query using negation logic, use NOT EXISTS
instead of NOT IN to increase the efficiency of the SQL.

Use joins instead of sub queries:


A join can be efficient than a sub query or a correlated sub query using IN.

Join on indexed columns:


The efficiency of your programs improves when tables are joined based on
indexed columns rather than non-indexed ones.

Limit the columns Grouped:


When you are using the GROUP BY clause, specify only those columns that
need to be grouped.

Use same cases for all the tables, Views accessed in select statements,
which will allow Oracle to use shared pool efficiently
Limit updating indexed columns:
When columns in indexes are updated, a corresponding update is applied to
all indexes in which the column participates. This will reduce the
performance considerably because of additional I/O overhead.

Avoid Bitmap Index in OLTP DB:


Do not use Bitmap Index on columns which are part of OLTP database ,as
these OLTP tables are getting updated, deleted and inserted frequently. As
To update one value in Bitmap index, Oracle has to rearrange all the Bitmap
index elements

Prefer Truncate in place of Delete:


As Delete statements doesn't reset the High water mark of a table, it leaves
behind empty blocks which may hamper FULL table access of such tables.
Truncate deletes all the data and reset the high-water mark ,which results in
no empty blocks.

Use /*+ APPEND */ Hint while inserting data in Global Temporary Or


Temporary tables used for data manipulation, this will insert data faster as
data will be stored straight way after the High water mark.
Tips on Writing Tuned SQL

 Know your indexes


 Know the required response time
 Have sufficient data volumes
 Have the optimizers, statistics and Oracle Parameters the
same as production
 Know how to obtain and interpret an execution plan
 Know when and when not to use indexes
 Know what prevents an index from being used
 Understand how Oracle joins tables and get the Driving
Table right
 Understand the behavior of the Oracle Optimizers
 Know which hints are available to you and which ones to
use
 Having an architecture and data model that will assist
performance
Indexes
 All developers require a sound knowledge of the indexes on
their tables
 Indexes should be used to avoid FULL Table Scans ONLY
when the SQL requires the index to perform acceptably
 Occasionally Oracle will prefer FULL Scans and the
developer will need to place a hint into the SQL to make it
use the correct index
 If the developer is required to add an index

 Be aware of the impact on other transactions

 Be aware of the issues caused by single column indexes,


adding one concatenated index in preference to two
single column indexes
What is an Index?

 An index is an hierarchically organized structure designed to facilitate the


rapid lookup of row identifiers (rowids) based on key values .

Types Of Index
 B-tree indexes
This is the standard tree index that Oracle has been using since the earliest
releases.

 Bitmap indexes
Bitmap indexes are used where an index column has a relatively small
number of distinct values (low cardinality). These are super-fast for read-only
databases, but are not suitable for systems with frequent updates

 Bitmap join indexes


This is an index structure whereby data columns from other tables appear in
a multi-column index of a junction table. This is the only create index syntax
to employ a SQL-like from clause and where clause
Inside Oracle indexes

 In order to properly manage the blocks, Oracle controls the allocation of


pointers within each data block. As an Oracle tree grows (via inserting rows
into the table), Oracle fills the block, and when full it splits, creating new
index nodes (data blocks) to manage the symbolic keys within the index.

 Hence, an Oracle index block may contain two types of pointers


- Pointers to other index nodes (data blocks)
- ROWID pointers to specific table rows

 Oracle manages the allocation of pointers within index blocks, and this is the
reason why we are unable to specify a PCTUSED value (the freelist re-link
threshold) for indexes. When we examine an index block structure, we see
that the number of entries within each index node is a function of two values
1) The length of the symbolic key
2) The blocksize for the index tablespace
 Each data block within the index contains "nodes" in the index tree, with the
bottom nodes (leaf blocks), containing pairs of symbolic keys and ROWID
values. As an Oracle tree grows (via inserting rows into the table), Oracle
fills the block, and when the block is full, it splits, creating new index nodes
(data blocks) to manage the symbolic keys within the index. Hence, an
Oracle index block may contain pointers to other index nodes or
ROWID/Symbolic-key pairs.

Why Indexes Aren't Used


 The presence of an index on a column does not guarantee it will be used.
The following is a small list of factors that will prevent an index from being
used:
 The optimizer decides it would be more efficient not to use the index. As a
rough rule of thumb, on evenly distributed data an index will be used if it
restricts the number rows returned to 5% or less of the total number of rows.
In the case of randomly distributed data, an index will be used if it restricts
the number of rows returned to 25% or less of the total number of rows.
 • You perform a function on the indexed column i.e. WHERE
UPPER(name) = 'JONES'
• You perform mathematical operations on the indexed column i.e. WHERE
salary + 1 = 10001
• You concatenate a column i.e. WHERE firstname || ' ' || lastname = 'JOHN
JONES'
• The use of 'OR' statements confuses the Cost Based Optimizer (CBO).
Explain plan
The EXPLAIN PLAN statement displays execution plans chosen by the
Oracle optimizer for SELECT, UPDATE, INSERT, and DELETE statements.
A statement's execution plan is the sequence of operations Oracle performs
to run the statement.

Whenever an SQL query is issued to the Oracle database, Oracle creates a


plan for executing the query. This execution plan can be saved in a plan
table.
The plan table contains the order of the tables referenced by the statement,
the method used to access the table, the join method, and data operations.

The EXPLAIN PLAN results let you determine whether the optimizer select a
particular execution plan, such as, join, Scan Methods. It also helps you to
understand the optimizer decisions, such as why the optimizer chose a
nested loops join instead of a hash join,
To explain a SQL statement, use the EXPLAIN PLAN FOR clause
immediately before the statement.

For example:
EXPLAIN PLAN FOR SELECT last_name FROM employees; .

You can Display the plan_Table output by using the below sql statement
select * from table(dbms_xplan.display());
Reading Explain Plan
What is COST ?
Cost of the operation as estimated by the optimizer's query approach. Cost
is not determined for table access operations. The value of this column does
not have any particular unit of measurement; it is merely a weighted value
used to compare costs of execution plans. The value of this column is a
function of the CPU_COST and IO_COST columns.

CPU_COST
CPU cost of the operation as estimated by the query optimizer's approach.
The value of this column is proportional to the number of machine cycles
required for the operation. For statements that use the rule-based approach,
this column is null.
IO_COST
I/O cost of the operation as estimated by the query optimizer's approach.
The value of this column is proportional to the number of data blocks read by
the operation. For statements that use the rule-based approach, this column
is null.
What is CARDINALITY ?
Estimate by the query optimization approach of the number of rows
accessed by the operation.

What is BYTES?
Estimate by the query optimization approach of the number of bytes
accessed by the operation.

There are other parameter Like Object Owner which depict the
owner of that particular object, Object Name etc.
Significance of Access Paths

To choose an access path, the optimizer first determines which access paths
are available by examining the conditions in the statement's WHERE clause.
The optimizer then generates a set of possible execution plans using
available access paths and estimates the cost of each plan using the
statistics for the index, columns, and tables accessible to the statement. The
optimizer then chooses the execution plan with the lowest estimated cost

 To choose among available access paths, the optimizer considers the


Selectivity which is the percentage of rows in the table that the query
selects
 A query that selects a small percentage of a table's rows has good
selectivity, while a query that selects a large percentage of rows has poor
selectivity.
 The optimizer is more likely to choose an index scan over a full table scan
for a query with good selectivity than for one with poor selectivity. Index
scans are usually more efficient than full table scans for queries that access
only a small percentage of a table's rows, while full table scans are usually
faster for queries that access a large percentage
FULL TABLE SCAN

This type of scan reads all rows from a table and filters out those that do not
meet the selection criteria

When Oracle performs a full table scan, the blocks are read sequentially.
Each block is read only once.

Why a Full Table Scan Is Faster for Accessing Large Amounts of Data ?
This is because full table scans can use larger I/O calls, and making fewer
large I/O calls is cheaper than making many smaller calls.

When the Optimizer Uses Full Table Scans?


The optimizer uses a full table scan in any of the following cases:
 Lack of Index
 Large Amount of Data
 Small Table
Rowid Scans

The rowid of a row specifies the datafile and data block containing the row
and the location of the row in that block. Locating a row by specifying its
rowid is the fastest way to retrieve a single row, because the exact location
of the row in the database is specified.
To access a table by rowid, Oracle first obtains the rowids of the selected
rows, either from the statement's WHERE clause or through an index scan
of one or more of the table's indexes. Oracle then locates each selected row
in the table based on its rowid.

When the Optimizer Uses Rowids?

This is generally the second step after retrieving the rowid from an index.
The table access might be required for any columns in the statement not
present in the index.
Access by rowid does not need to follow every index scan. If the index
contains all the columns needed for the statement, then table access by
rowid might not occur.
INDEX SCAN
In this method, a row is retrieved by traversing the index, using the indexed
column values specified by the statement. An index scan retrieves data from
an index based on the value of one or more columns in the index. To
perform an index scan, Oracle searches the index for the indexed column
values accessed by the statement. If the statement accesses only columns
of the index, then Oracle reads the indexed column values directly from the
index, rather than from the table.

An index scan can be one of the following types:


 Assessing I/O for Blocks, not Rows

 Index Unique Scans

 Index Range Scans

 Index Skip Scans

 Full Scans

 Fast Full Index Scans

 Index Joins

 Bitmap Indexes
Assessing I/O for Blocks, not Rows

Oracle does I/O by blocks. Therefore, the optimizer's decision to use full
table scans is influenced by the percentage of blocks accessed, not
rows. This is called the index clustering factor. If blocks contain single
rows, then rows accessed and blocks accessed are the same.

However, most tables have multiple rows in each block. Consequently,


the desired number of rows could be clustered together in a few blocks,
or they could be spread out over a larger number of blocks.

Although the clustering factor is a property of the index, the clustering


factor actually relates to the spread of similar indexed column values
within data blocks in the table. A lower clustering factor indicates that the
individual rows are concentrated within fewer blocks in the table.
Conversely, a high clustering factor indicates that the individual rows are
scattered more randomly across blocks in the table. Therefore, a high
clustering factor means that it costs more to use a range scan to fetch
rows by rowid, because more blocks in the table need to be visited to
return the data. shows how the clustering factor can affect cost.
Assume the following situation:
There is a table with 9 rows.
There is a non unique index on col1 for table.
The c1 column currently stores the values A, B, and C.
The table only has three Oracle blocks.

Case 1:
The index clustering factor is low for the rows as they are arranged in the
following diagram.
Block 1 Block 2 Block 3
------- ------- --------
AAA BBB CCC

This is because the rows that have the same indexed column values for c1
are located within the same physical blocks in the table. The cost of using a
range scan to return all of the rows that have the value A is low, because
only one block in the table needs to be read.
Case 2:

If the same rows in the table are rearranged so that the index values are
scattered across the table blocks, then the index clustering factor is higher.

Block 1 Block 2 Block 3


------- ------- --------
ABC ABC ABC

This is because all three blocks in the table must be read in order to retrieve
all rows with the value A in col1.
Index Unique Scan :

Index unique scans, which occur when the Oracle database engine uses an
index to retrieve a specific row from a table. Generally we have the unique
index on the column of the table.
SELECT *
FROM emp e
WHERE e.empno = 1234;
Explain Plan--
SELECT STATEMENT ()
TABLE ACCESS ( BY INDEX ROWID) EMP
INDEX ( UNIQUE SCAN ) UQ_EMP

Index Full Scan :


Oracle will choose an index full scan when the CBO statistics that indicate
that a full-index scan is going to be more efficient than a full-table scan and a
sort of the result set. The full-index scan is normally invoked when the CBO
determines that a query will return numerous rows in index order, and a full-
table scan and sort option may cause a disk sort to the TEMP tablespace.
Index Range Scan

The most common method of index access in Oracle is the index


range scan. An index range scan is used when the SQL statement
contains a restrictive clause that requires a sequential range of
values that are indexes for the table.

SELECT *
FROM emp e
WHERE e.empno > 1234;
Explain Plan
SELECT STATEMENT ()
TABLE ACCESS ( BY INDEX ROWID) EMP
INDEX ( RANGE SCAN ) UQ_EMP
Index Skip Scan:

Index skip-scan feature enables the optimizer to use a concatenated index


even if its leading column is not listed in the WHERE clause.
Skip scanning lets a composite index be split logically into smaller
subindexes. In skip scanning, the initial column of the composite index is not
specified in the query. In other words, it is skipped.

Index skip-scans are faster than full scans of the index, requiring fewer
reads to be performed.
Consider, for example, a table employees (sex, employee_id, address) with
a composite index on (sex, employee_id). Splitting this composite index
would result in two logical subindexes, one for M and one for F.
For this example, suppose you have Created index on sex and employee ID
The index is split logically into the following two subindexes:
The first subindex has the keys with the value F.
The second subindex has the keys with the value M.
The column sex is skipped in the following query
>SELECT * FROM employees WHERE employee_id = 101;
A complete scan of the index is not performed, but the subindex with the
value F is searched first, followed by a search of the subindex with the value
M.
Fast Full Index Scans
Fast full index scans are an alternative to a full table scan when the index
contains all the columns that are needed for the query, and at least one
column in the index key has the NOT NULL constraint. A fast full scan
accesses the data in the index itself, without accessing the table. It cannot
be used to eliminate a sort operation, because the data is not ordered by the
index key. It reads the entire index using multiblock reads, unlike a full index
scan, and can be parallelized.

A fast full scan is faster than a normal full index scan in that it can use
multiblock I/O and can be parallelized just like a table scan.
When a index contains all of the values required to satisfy the query and
table access is not required. The fast full-index scan execution plan will read
the entire index with multi-block reads (using db_file_multiblock_read_count)
and return the rows in unsorted order
The fast full scan usually requires fewer physical I/Os than a full table scan ,
allowing the query to be resolved faster.
There is concatenated index on the columns empno, ename, and deptno.
SELECT e.empno,e.ename,e.deptno
FROM emp e
WHERE e.deptno = 30;
----Explain Plan----
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=4 Card=1 Bytes=5)
1 0 SORT (AGGREGATE)
2 1 INDEX (FAST FULL SCAN)
Since all of the columns in the SQL statement are in the index, a fast full
scan is available.
Index fast full scans are commonly performed during joins in which only the
indexed join key
columns are queried.
Oracle Joining Methods :

Types of Joins:
 Nested loop
 Sort merge joins
 Hash joins
 Index joins

Nested loop :

Oracle reads the first row from the first row source and then checks the
second row source for matches.
All matches are then placed in the result set.
This continues until all rows in the first row source have been processed.
The first row source is often called the outer or driving table,
Second row source is called the inner table.
It is the fastest methods of receiving the first records back from a join.
Suppose somebody gave you a telephone book and a list of 20 names
to look up, and asked you to write down each person’s name and
corresponding telephone number. You would probably go down the list
of names, looking up each one in the telephone book one at a time.
This task would be pretty easy because the telephone book is
alphabetized by name. Moreover, somebody looking over your
shoulder could begin calling the first few numbers you write down
while you are still looking up the rest. This scene describes a NESTED
LOOPS join.

NESTED LOOPS joins are ideal when the driving row source (the
records you are looking for) is small and the joined columns of the
inner row source are uniquely indexed or have a highly selective non-
unique index.

However, NESTED LOOPS joins can be very inefficient if the inner row
source (second table accessed) does not have an index on the joined
columns or if the index is not highly selective. If the driving row source
(the records retrieved from the driving table) is quite large.

SELECT /*+ ordered */ e.ename, d.deptno


FROM dept d, emp e
WHERE d.deptno = e.deptno;
Sort Merge Join

Oracle sorts the first row source by its join columns,


Oracle sorts the second row source by its join columns
Merges the sorted row sources together.
As matches are found, they are put into the result set.

Suppose two salespeople attend a conference and each collect over


100 business cards from potential new customers. They now each
have a pile of cards in random order, and they want to see how many
cards are duplicated in both piles. The salespeople alphabetize their
piles, and then they call off names one at a time. Because both piles of
cards have been sorted, it becomes much easier to find the names that
appear in both piles. This example describes a SORT-MERGE join.

SORT-MERGE joins can be effective when lack of data selectivity or


useful indexes render a NESTED LOOPS join inefficient. SORT-MERGE
joins can be used only for equijoins (WHERE D.deptno = E.deptno, as
opposed to WHERE D.deptno >= E.deptno).
Hash Join
Oracle accesses one table (usually the smaller of the joined results) and
builds a hash table on the join key in memory.
It then scans the other table in the join (usually the larger one) and probes
the hash table for matches to it.
Oracle uses a HASH join efficiently only if the parameter
PGA_AGGREGATE_TARGET is set to a large enough value.
HASH joins work only on equijoins
HASH joins can be effective when the lack of a useful index renders
NESTED LOOPS joins inefficient. The HASH join might be faster than a
SORT-MERGE join, in this case, because only one row source needs to be
sorted, and it could possibly be faster than a NESTED LOOPS join because
probing a hash table in memory can be faster than traversing a b-tree index.
HASH joins use memory resources and can drive up I/O in the temporary
tablespace if the sort memory is not sufficient (which can cause this join
method to be extremely slow).
SELECT /*+ ordered */ e.ename, d.deptno
FROM dept d, emp e
WHERE d.deptno = e.deptno;
Index Join :
If a set of indexes exists that contains all of the information required by the
query then the optimizer can choose to generate a sequence of HASH joins
between the indexes.
Each of the indexes are accessed using a range scan or fast full scan,
depending on the conditions.
This method is extremely efficient when a table has a large number of
columns, but you want to access only a limited number of those columns.
The advantage of INDEX joins over fast full scans is that fast full scans have
a single index satisfying the entire query. INDEX joins have multiple indexes
satisfying the entire query.
Two indexes (one on ENAME and one on DEPTNO) have been created
prior to the execution of the corresponding query in this next listing. The
query does not need to access the table
SELECT e.ename,e.deptno
FROM emp e
WHERE e.deptno = 20
AND e.ename = 'DULLY';
Efficient Sql Writing Tips-

Never use >SELECT * from Statement:


Each query should access only the columns needed for the function that will be
performed. But if all the columns are required then, there is no harm in using
SELECT *.

Never use a Cursor for fetching a single row:


Because for using cursors you will have to DECLARE, OPEN, FETCH and CLOSE
them. A single SELECT is much faster.

Avoid using DISTINCT: The DISTINCT verb removes the duplicates from the result
table. If duplicates are not a problem and if the chance of having a duplicate
does not exist, then avoid using it because it adds to the overheads.

Use Equivalent Data Types: Use the same data type and lengths when
comparing column values to host variables or literals. This eliminates the need
for data conversion. For examples, to compare a CHAR (6) column with a CHAR
(5) column, Oracle has to do the data conversion and when Oracle has to convert
data the available indexes are not used. The same is the case when there is a
mismatch in lengths or data type between a column and a host variable that are
being compared.
Use BETWEEN instead of <= and >=:
The BETWEEN predicate is more efficient that the `greater/less than or
equal to’ predicates because the optimizer select the most efficient path for
the BETWEEN predicate.

Use IN instead of LIKE:


Use IN or BETWEEN instead of LIKE whenever possible. If you know that
only a certain number of occurrences exist, using IN with the specific list will
be much more efficient than LIKE. You can imitate the functionality of LIKE
by using a range of values.

Avoid using NOT:


Whenever possible avoid the use of NOT (<>). Instead reword the SQL so
that the same results are obtained. For example use ‘WHERE sum > 100’
instead of ‘WHERE sum <> 100’.

Code the most restrictive predicate first:


Choose the condition that will eliminate most of the rows as the first
predicate. For example, if you want to get the names of all the person whose
name starts with the letter ‘M’ in ‘Mumbai’ city, instead of coding ‘WHERE
name BETWEEN ‘NNN’ AND ‘MMM’ and city=’Mumbai’, code city=’Mumbai’
AND name BETWEEN ‘NNN’ AND ‘MMM’.
Use UNION ALL instead of UNION:
The UNION operator always results in a sort. So in cases where there is no
possibility of duplicates or in case where elimination of duplicates is
unnecessary use UNION ALL instead of UNION.

Use NOT EXISTS instead of NOT IN:


When you are coding a sub query using negation logic, use NOT EXISTS
instead of NOT IN to increase the efficiency of the SQL.

Use joins instead of sub queries:


A join can be efficient than a sub query or a correlated sub query using IN.

Join on indexed columns:


The efficiency of your programs improves when tables are joined based on
indexed columns rather than non-indexed ones.

Limit the columns Grouped:


When you are using the GROUP BY clause, specify only those columns that
need to be grouped.

Use same cases for all the tables, Views accessed in select statements,
which will allow Oracle to use shared pool efficiently
Limit updating indexed columns:
When columns in indexes are updated, a corresponding update is applied to
all indexes in which the column participates. This will reduce the
performance considerably because of additional I/O overhead.

Avoid Bitmap Index in OLTP DB:


Do not use Bitmap Index on columns which are part of OLTP database ,as
these OLTP tables are getting updated, deleted and inserted frequently. As
To update one value in Bitmap index, Oracle has to rearrange all the Bitmap
index elements

Prefer Truncate in place of Delete:


As Delete statements doesn't reset the High water mark of a table, it leaves
behind empty blocks which may hamper FULL table access of such tables.
Truncate deletes all the data and reset the high-water mark ,which results in
no empty blocks.

Use /*+ APPEND */ Hint while inserting data in Global Temporary Or


Temporary tables used for data manipulation, this will insert data faster as
data will be stored straight way after the High water mark.

Potrebbero piacerti anche