9ib BI

Introduction to Business Intelligence and Data Warehousing
Oracle9i Business Intelligence (Beta)

Historical Trends
• Previous RDBMS releases have primarily focused on basic requirements of DW
– Oracle 7.3:
   – Query performance: star query technology, bitmap indexes, hash joins
– Oracle8:
   – Performance: star transformation, partitioning
   – Scalability: partitioning, parallelism enhancements
   – Manageability: partitioning, backup/recovery
– Oracle8i:
   – Performance: materialized views
   – Scalability: new partitioning techniques
   – Manageability: resource manager, simplified PQ tuning
Copyright  Oracle Corporation, 2001. All rights reserved.
When you look back over these previous three major releases (7.3, 8.0, and
Oracle8i), you see some themes emerge. Oracle has focused on what most people
would consider to be 'primary requirements' for data warehousing:
• Query performance
• Scalability
• Manageability
Examples include:
• In Version 7.3, Oracle introduced bitmap indexes and hash joins to improve
Data Warehouse (DW) query performance (perhaps the most significant
primary requirement for DW).
• In Oracle 8.0, the most significant data warehousing features were
partitioning and enhancements to our parallelism. These improved both
scalability and manageability.
• Oracle8i made enhancements in many areas:
– Materialized views for performance
– New hash and composite partitioning techniques for manageability,
scalability, and query performance (partition pruning)
– Several features for manageability such as Resource Manager
– Simplified parallel query tuning

Historical Trends (Continued)
• Present and future challenges to meet:
– New data warehouses are going to be bigger.
– They will have more users.
– They will have more stringent performance requirements.
– They will have more stringent manageability and availability
requirements.

The Traditional Way: Fragmented Information Supply Chain
[Diagram: separate Data Integration Engine, Data Warehouse Engine, OLAP Engine, and Mining Engine]
• Protracted implementation and maintenance cycle
– Synchronization and currency issues
– Information Management chaos

Previously, building a business intelligence system required the integration of
multiple server products.
The result was that such systems were unnecessarily complex. The integration of
multiple servers was costly. Once the system was implemented, there were
ongoing administration costs in maintaining different servers and keeping the data
synchronized across all servers.

Oracle9i: Major DW Initiatives
• Scale to the Internet
– Volume of Data
– Number of Customers
• Business Analysis
– Advanced analytic operations
– Integrated OLAP Services
• Easy to Build
– 'ETL' enhancements
– Manageability
• Internal Infrastructure
– Performance
– Extensibility and Maintainability

The continuing boom of the Internet brings along an unprecedented volume of data.
Examples include:
• Website clickstreams
• Detailed customer information
• Customer preference data
The second data warehouse focus area of Oracle9i is the seamless integration of
products:
• The Oracle9i server has integrated analytical functions that allow it to
answer queries that normally only multidimensional OLAP (MOLAP)
systems can handle.
• Integrated OLAP services in Oracle9i let Express clients work directly
against the Oracle Server.
Extraction, Transformation and Loading (ETL) of big volumes of data is made
easier. Among the features introduced in Oracle9i are:
• Resumable statements
• External tables
• New SQL statements and options
These improvements required some enhancements to the underlying infrastructure.
Examples include:
• The Change Data Capture feature
• Multiple block sizes
• Additions to Oracle Enterprise Manager (OEM)
The New Way: Oracle9i
[Diagram labels: Data Warehousing, ETL, Data Warehouse Engine, OLAP, Data Mining, Personalization, Oracle9i]
• Single business intelligence platform
– Reduce administration and implementation costs
– Faster deployment
– Improved scalability and reliability

Oracle9i is the solution to this information management problem.
Oracle9i provides a single server platform for all business intelligence needs. All
data is stored in the relational database. All administration is done via Oracle
Enterprise Manager. All business intelligence processes benefit from the scalability
and reliability of Oracle9i.
With Oracle9i, implementing and administering business intelligence systems is
faster and less costly.

[Diagram labels: Oracle Warehouse Builder, Operational Data, Data Transformation Engine, Oracle Business Intelligence Beans, Data Mining Suite, Oracle9i OLAP Services, Data mining algorithms, Oracle9i]

• Oracle Warehouse Builder: Design and manage ETL processes
• ETL Transformation Engine: Scalable (parallel), Extensible (Java, PL/SQL),
Efficient (no data staging). Oracle9i has been significantly enhanced to address
many of the specific requirements of ETL environments. The enhancements make
Oracle9i a fully functional 'transformation engine'. These new features have four
main characteristics:
– Scalable: Enterprise DWs may transform many gigabytes of data in a
single load cycle; the ability to parallelize is a crucial requirement (and is
one that some proprietary transformation solutions are lacking).
– Efficient: Each of the new Oracle9i features improves the efficiency of
doing common ETL operations in the relational database.
– Open: The ETL features are extensions to SQL. Any customer or tool
can leverage these new features in their own ETL environments.
– Extensible: Transformations can be arbitrarily complex, so it is
important to provide extensibility so that customers can implement their
own transformations. The Oracle database supports Java and PL/SQL
procedures (as well as C and C++ callouts), and in Oracle9i these
procedures can be fully parallelized to achieve scalable, extensible
transformations.
Moreover, the enhancements in Oracle9i are in general ‘open’. They are
implemented in SQL, so that they can be widely used in all ETL environments.

The New Way: Oracle9i (Continued)
• Oracle9i OLAP Services: Oracle9i is the new platform for analytical
applications on the Internet. Oracle9i offers a scalable data platform, analytical
functions, Internet-ready OLAP, and a complete development environment
for analytical applications. As a result, customers can quickly build applications
with high analytical content, easily deploy these applications to large,
geographically distributed user communities, and analyze larger data sets than
previously possible. In addition, the Business Intelligence Beans provide the
OLAP-ready application building blocks to support rapid application development
of Internet-based business intelligence applications.
• Data Mining: Oracle Personalization is Oracle’s first product deliverable in our
product direction of moving the data mining algorithms into the database and
adopting standards. Oracle Personalization uses the Transactional Naïve Bayes
and Predictive Association Rules data mining algorithms because they are
algorithms that work well for real-time recommendations on the Web. All of the
models built in Oracle Personalization are built using PL/SQL, and Oracle
Personalization’s API is Java-based.
Oracle Data Mining will continue to move additional data mining algorithms into
the database, including decision trees, clustering techniques, and classic Naïve
Bayes. Oracle will continue to package these data mining algorithms in thin-client
GUIs that focus on solving business problems and to provide standards-based APIs
for custom applications and integration.
Lastly, Oracle is actively adopting and driving open standards in the industry,
including the JSR-73 Java data mining API standard. Oracle’s implementation of
data mining in the database using PL/SQL and Java APIs provides the most
powerful and open data mining engine in the industry.

Oracle9i Server As a Data Warehouse Platform
• All Oracle DW products can leverage performance, scalability
and reliability of the Oracle server
– Oracle Warehouse Builder for ETL operations
– Oracle OLAP Services for analytic operations
– Oracle Data Mining Suite for scalable, integrated data-mining
capabilities
– Oracle Personalization: Real-time personalization engine
for building 1:1 relationships over the web
• All server enhancements are 'open' to third-party DW vendors
– All features implemented in SQL
– Analytic extensions to SQL have been submitted to
ANSI/ISO as a proposed standard
– Support of standards, e.g. Java OLAP API

The enhancements within the Oracle Server will produce better data-warehousing
solutions. The Oracle Server provides better scalability, performance, and reliability
for customers using Oracle Warehouse Builder, Oracle OLAP Services, Oracle
Data Mining Suite, and Oracle Personalization.
Meanwhile, the enhancements to the Oracle Server are in general 'open'. They are
implemented in SQL, and third-party software vendors can leverage the Oracle
Server as a data warehouse platform. Oracle has actively worked to make sure that
these enhancements are useful and attractive to independent software vendors
(ISVs).
For example, the analytic functions implemented in Oracle8i, Release 2 will
become part of the ANSI/ISO SQL standard; this will make it much simpler for
third-party analytic-tool and analytic-application vendors to adopt these new SQL
extensions. Oracle is looking to continue to use standard SQL and extend SQL
standards where appropriate to support its new directions.

[Diagram: components of Oracle's integrated data warehousing solution. Labels: Analytic Applications, Data Warehousing, Operational Data, ERP Data, External Data, ETL, Warehouse Builder, Data Warehouse Engine, OLAP, Data Mining, Personalization, Oracle9i, Reports, Discoverer, iAS, Express, CWM and Repository, Designer and Enterprise Manager]

Oracle leads the industry in providing a complete, integrated solution for data
warehousing. The enhancements in the Oracle Server are part of Oracle's larger
data warehouse strategy.
Note: The Oracle9i OLAP repository is fully compliant with the Common
Warehouse Metamodel (CWM), an Object Management Group (OMG) standard.
Oracle9i dimensions, cubes, measures/facts, and data source mappings all
conform to CWM. This ensures that investments made in Oracle9i
analytical application development are preserved as the industry moves toward the
CWM standard.

SQL Enhancements

Objectives
After this lesson, you should be able to:
• Understand the enhanced Oracle9i analytical functions
• Use grouping sets
• Create SQL statements with the new WITH clause

New Enhancements to the Analytical Functions in Oracle9i
Supplement the power of the relational database for decision support processing by using:
• Inverse percentile functions
• What-if rank and distribution functions
• FIRST/LAST aggregate functions
• WIDTH_BUCKET function
• Grouping sets

Analytical Functions
• Inverse Percentile Functions - allow queries to find the data that corresponds
to a specified percentile value
• Hypothetical Rank and Distribution - allow queries to find what rank or
percentile value a hypothetical data value would have if it were added to an
existing data set
• FIRST/LAST - enable queries to specify sorted aggregate groups and return
the first or last value of each group
• WIDTH_BUCKET - returns the histogram bucket in which a certain input value
would fall
• Grouping sets - extensions to the GROUP BY clause that allow you to
determine which result set rows are subtotals and the exact level of aggregation
for a given subtotal

Benefits
Key benefits provided by the new functions:
• Improved query speed
• Enhanced developer productivity
• Minimized learning effort
• Standardized syntax

• Improved Query Speed - the functions are native SQL, so the processing optimizations
they support enable significantly better query performance. These performance gains
also improve query speeds for Oracle's Express system and other ROLAP products.
• Enhanced Developer Productivity - the functions enable developers to perform
complex analyses with much clearer and more concise SQL code. Tasks which in
the past required multiple SQL statements or the use of procedural languages can
now be expressed using single SQL statements.
• Minimized Learning Effort - the syntax leverages existing aggregate functions,
such as SUM and AVG, so that these well-understood keywords can be used in
extended ways.
• Standardized Syntax - the new syntax is part of the ANSI SQL standard. The
new analytic functions will be supported by a large number of independent software
vendors.

Inverse Percentile Functions - Description
Two new functions, PERCENTILE_CONT and PERCENTILE_DISC, determine the value that
corresponds to a specific percentile.
• Require a sort specification: use the new WITHIN GROUP clause to specify the
data ordering
• Require a percentile parameter value between 0 and 1
• Can be used as either aggregate functions or reporting aggregate functions

One very common analytic question is to find the value in a data set that
corresponds to a specific percentile. Two new Oracle9i functions,
PERCENTILE_CONT and PERCENTILE_DISC, compute inverse percentiles.
These functions require a sort specification and a percentile parameter value
between 0 and 1. The functions can be used as either aggregate functions or
reporting aggregate functions. When used as aggregate functions, they return a
single value per ordered set. When used as reporting aggregate functions, they
repeat the data on each row.
The functions use a new WITHIN GROUP clause to specify the data ordering.
The PERCENTILE_DISC function returns the actual "discrete" value that is closest
to the specified percentile value.
The PERCENTILE_CONT function calculates a "continuous" percentile value using
linear interpolation.
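
The difference between the discrete and the continuous variant can be illustrated outside the database. The following is a minimal Python sketch, not Oracle's implementation: it assumes ascending order, a non-empty input, that the discrete function picks the first value whose cumulative distribution reaches the percentile, and that the continuous function interpolates linearly between the two neighboring rows. The function names and the sample amounts are hypothetical.

```python
import math


def percentile_disc(values, p):
    """Return the first value (ascending) whose cumulative distribution is >= p."""
    ordered = sorted(values)
    n = len(ordered)
    for position, value in enumerate(ordered, start=1):
        if position / n >= p:
            return value
    return ordered[-1]


def percentile_cont(values, p):
    """Interpolate linearly between the two rows that straddle the percentile."""
    ordered = sorted(values)
    n = len(ordered)
    row = p * (n - 1)            # zero-based fractional row number
    lower = math.floor(row)
    upper = math.ceil(row)
    if lower == upper:           # the percentile lands exactly on a row
        return float(ordered[lower])
    return (upper - row) * ordered[lower] + (row - lower) * ordered[upper]


amounts = [10, 20, 30, 40]       # hypothetical sample data
print(percentile_disc(amounts, 0.5))   # an actual value from the data set
print(percentile_cont(amounts, 0.5))   # a value interpolated between two rows
```

With an even number of rows the two functions differ: the discrete variant always returns a value that exists in the data, while the continuous variant may return a value between two rows.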

Inverse Percentile Functions - Example
Example: Find the discrete value closest to the 50th percentile of costs per
channel of distribution for the month of November, 1999.
SELECT c.channel_desc, avg(s.amount),
       percentile_disc(0.5) WITHIN GROUP
       (ORDER BY s.amount desc) PERCENTILE_50
FROM sales s, channels c
WHERE c.channel_id = s.channel_id
AND time_id BETWEEN '01-NOV-1999'
                AND '30-NOV-1999'
GROUP BY c.channel_desc;

For example, in the Sales History schema, cost of goods is tracked through the
channels of distribution for catalog, direct sales, internet, partner and telephone
sales. To find the discrete value closest to the 50th percentile of costs per channel
of distribution for the month of November, 1999, use the PERCENTILE_DISC
function.
To find the median value of the cost data, specify that the data is sorted by cost, and
specify a percentile value of 0.5.
Note: This question is the inverse of the information provided by the Oracle8i
Release 2 CUME_DIST function, which answers the question “what is the
percentile value for each row?”

Inverse Percentile Functions - Results
Example results: Find the discrete value closest to the 50th percentile of costs
per channel of distribution for the month of November, 1999.
CHANNEL_DESC         AVG(S.AMOUNT) PERCENTILE_50
-------------------- ------------- -------------
Catalog                 338.670678         205.8
Direct Sales            254.718435         158.4
Internet                231.018052         187.2
Partners                308.701916         185.4
Tele Sales              90.3982314            48

The results from the query show the discrete values closest to the 50th percentile
for costs per channel of distribution in November, 1999. This information can be
useful for cost projections in future Novembers.

What-if Rank and Distribution - Description
• New syntax available for:
– RANK – ranks items in a group
– DENSE_RANK – ranks items in a group, leaving no gaps in the ranking
after ties
– PERCENT_RANK – returns the percent rank of a value relative to a group
of values
– CUME_DIST – computes the position of a specified value relative to a set
of values
• Find out how a data value would rank if it were added to the data set. The
functions return the rank or percentile value that a row would be assigned if it
were hypothetically inserted.
• Require a WITHIN GROUP clause containing an ORDER BY specification.

In certain analyses, such as financial planning, you may want to know how a data
value would rank if it were added to a data set. For instance, if a new worker is hired
at a salary of $100,000, where would his salary rank compared to the other salaries in
the company?
The hypothetical rank and distribution functions support this form of what-if
analysis. They return the rank or percentile value which a row would be assigned
if the row was hypothetically inserted into a set of other rows.
The hypothetical functions can calculate RANK, DENSE_RANK, PERCENT_RANK,
and CUME_DIST. Like the inverse percentile functions, the hypothetical rank and
distributions functions use a WITHIN GROUP clause containing an ORDER BY
specification.
Note: The RANK, DENSE_RANK, PERCENT_RANK, and CUME_DIST functions were
introduced in Oracle8i Release 2. In Oracle9i they are overloaded with new syntax
for the purpose of 'what if' analysis.

What-if Rank and Distribution - Example
Example: A new worker is hired at a salary of $10,000, where does his salary rank
compared to the salaries per department?
SELECT department_id, ROUND(AVG(salary)),
       RANK(10000) WITHIN GROUP
       (ORDER BY salary DESC) RANK,
       DENSE_RANK(10000) WITHIN GROUP
       (ORDER BY salary DESC) DR
FROM employees
GROUP BY department_id;

For example, determine where a new salary of $10,000 ranks when compared to
the salaries for each department in the company in the Human Resources schema.

What-if Rank and Distribution - Results
Example results: Compare how a salary of $10,000 ranks within each department.
Find the actual and dense ranks.
DEPARTMENT_ID ROUND(AVG(SALARY))      RANK        DR
------------- ------------------ --------- ---------
           10               4400         1         1
           20               9500         2         2
           30               4150         2         2
           40               6500         1         1
           50               3476         1         1
           60               5760         1         1
           70              10000         1         1
           80               8900         9         7
           90              19333         4         3
          100               8600         2         2
          110              10150         2         2

The salaries for department 90 and 100 are:
DEPARTMENT_ID SALARY
------------- ---------
90 17000
90 17000
90 24000
100 6900
100 7700
100 7800
100 8200
100 9000
100 12000
In department 100, a new salary of 10000 ranks second highest.
In department 90, a new salary of 10000 ranks fourth. Its dense rank is third,
because two of the salaries in department 90 are equal.
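
The ranks for departments 90 and 100 can be reproduced with a short Python sketch. It is only an illustration of the descending-order case used in the example, under the assumption that the hypothetical rank is one more than the number of existing salaries above the candidate and that the dense rank counts only distinct salaries; the function names are hypothetical.

```python
def hypothetical_rank(salaries, candidate):
    """Rank of candidate if inserted, ordering salaries from high to low."""
    return 1 + sum(1 for salary in salaries if salary > candidate)


def hypothetical_dense_rank(salaries, candidate):
    """Dense rank of candidate: equal salaries above it count only once."""
    return 1 + len({salary for salary in salaries if salary > candidate})


# Salaries listed in the notes above.
dept_90 = [17000, 17000, 24000]
dept_100 = [6900, 7700, 7800, 8200, 9000, 12000]

for name, salaries in (("90", dept_90), ("100", dept_100)):
    print(name,
          hypothetical_rank(salaries, 10000),
          hypothetical_dense_rank(salaries, 10000))
```

The two 17000 salaries in department 90 both outrank the candidate, which is why the rank is 4 while the dense rank is 3.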

FIRST And LAST Aggregate Values - Description
Enables queries to specify sorted aggregate groups and returns the first or last
value of each group by using the FIRST and LAST aggregate functions
• Obtains the first/last value of a column based on ordering of another column
• Can be used in non-additive aggregate calculations
• Simpler syntax and improved performance

FIRST and LAST aggregate functions allow you to specify an order within the
aggregated groups and then return the first or last row of each group.
While an equivalent query can be created using a join or subquery, the SQL syntax
is cumbersome and performance can be poor. The FIRST and LAST
functions do this work with simpler SQL syntax and better performance.

FIRST And LAST Aggregate Values - Example
Example: Per manager, determine the salary with the lowest commission and the
salary with the highest commission.
SELECT manager_id,
       MIN(salary) KEEP (DENSE_RANK FIRST ORDER BY
       commission_pct) AS low_comm,
       MAX(salary) KEEP (DENSE_RANK LAST ORDER BY
       commission_pct) AS high_comm
FROM employees
WHERE commission_pct IS NOT NULL
GROUP BY manager_id;

For example, to find the salary with the lowest commission and the salary with the
highest commission per manager, use the FIRST/LAST functions.
Sample Data
LAST_NAME MANAGER_ID    SALARY COMMISSION_PCT
--------- ---------- --------- --------------
Zlotkey          100     10500             .2
Cambrault        100     11000             .3
Errazuriz        100     12000             .3
Partners         100     13500             .3
Russell          100     14000             .4
Tuvault          145      7000            .15
Cambrault        145      7500             .2
Olsen            145      8000             .2
Hall             145      9000            .25
Bernstein        145      9500            .25
Tucker           145     10000             .3
Sewall           146      7000            .25
Doran            146      7500             .3
Smith            146      8000             .3
McEwen           146      9000            .35
Sully            146      9500            .35
King             146     10000            .35
...

FIRST And LAST Aggregate Values - Results
Example results: Per manager, determine the salary with the lowest commission and
the salary with the highest commission.
MANAGER_ID  LOW_COMM HIGH_COMM
---------- --------- ---------
       100     10500     14000
       145      7000     10000
       146      7000     10000
       147      6200     10500
       148      6100     11500
       149      6200     11000

The results show that, for manager 100, the lowest commission is earned on a salary
of 10500 and the highest commission is earned on a salary of 14000. For each
manager ID, the report displays the salary with the lowest commission and the
salary with the highest commission.
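
The effect of KEEP (DENSE_RANK FIRST/LAST ORDER BY commission_pct) can be sketched in Python using the sample data for managers 100, 145, and 146. This is an illustration under the assumption that the function first keeps only the rows tied for the lowest (or highest) commission and then applies MIN or MAX to the salaries of those rows; the function and variable names are hypothetical.

```python
def keep_first_last(rows):
    """rows: (salary, commission_pct) pairs for one manager.

    Returns (MIN salary among lowest-commission rows,
             MAX salary among highest-commission rows).
    """
    lowest = min(commission for _, commission in rows)
    highest = max(commission for _, commission in rows)
    low_comm = min(salary for salary, commission in rows if commission == lowest)
    high_comm = max(salary for salary, commission in rows if commission == highest)
    return low_comm, high_comm


# Sample data from the notes, keyed by manager_id.
sample = {
    100: [(10500, .2), (11000, .3), (12000, .3), (13500, .3), (14000, .4)],
    145: [(7000, .15), (7500, .2), (8000, .2), (9000, .25), (9500, .25), (10000, .3)],
    146: [(7000, .25), (7500, .3), (8000, .3), (9000, .35), (9500, .35), (10000, .35)],
}

for manager_id, rows in sample.items():
    print(manager_id, *keep_first_last(rows))
```

Manager 146 shows why the outer aggregate matters: three employees share the highest commission of .35, and MAX picks the largest of their salaries.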

FIRST And LAST Aggregate Values - Example
Example: Per manager, display both the salary and commission for the lowest
commission and the salary and commission for the highest commission.
SELECT manager_id,
       MIN(salary) KEEP (DENSE_RANK FIRST
         ORDER BY commission_pct) AS low_comm,
       MIN(commission_pct) KEEP (DENSE_RANK FIRST
         ORDER BY commission_pct) AS low_comm_value,
       MAX(salary) KEEP (DENSE_RANK LAST
         ORDER BY commission_pct) AS high_comm,
       MAX(commission_pct) KEEP (DENSE_RANK LAST
         ORDER BY commission_pct) AS high_comm_value
FROM employees
WHERE commission_pct IS NOT NULL
GROUP BY manager_id;

The example query shows how to display the actual commission values alongside the
salaries for the lowest and highest commissions per manager.
These are the results:
MANAGER_ID L_COMM L_C_VALUE H_COMM H_C_VALUE
---------- ------ --------- ------ ---------
100 10500 .2 14000 .4
145 7000 .15 10000 .3
146 7000 .25 10000 .35
147 6200 .1 10500 .25
148 6100 .1 11500 .25
149 6200 .1 11000 .3

WIDTH_BUCKET Function
Returns the bucket number that the result of an expression will be assigned to
after it is evaluated.
• Allows you to generate equiwidth histograms
• Valid on numeric, date, or datetime types
• Takes four parameters: the input expression, low boundary, high boundary, and
number of buckets
WIDTH_BUCKET(input_expression,
             low_boundary, high_boundary,
             bucket_count)

For any given expression, the WIDTH_BUCKET function returns the bucket
number that the result of this expression will be assigned to after it is evaluated.
You can generate equiwidth histograms with this function. Equiwidth histograms
divide data sets into buckets with an equal interval size.
You provide the input expression, the minimum boundary value, the maximum
boundary value, and the number of buckets.

WIDTH_BUCKET Example
Generate SALARY equiwidth buckets:
• Low boundary value = 3000
• High boundary value = 13000
• Number of buckets = 5
Bucket:      0   |   1   |   2   |   3   |    4    |    5    |   6
Boundary:      3000    5000    7000    9000     11000     13000

If you ask for five buckets, you actually get seven buckets – five regular ones and
two artificial ones to catch values outside the boundary range:
• Bucket 0 holds the values less than the minimum boundary value
• Bucket 6 holds the values greater than the maximum boundary value
In the example, salaries less than 3000 are placed in bucket 0; salaries greater than
13000 are placed in bucket 6. The other salaries are placed in buckets 1 - 5,
depending on the salary value.

Generate SALARY equiwidth buckets:
• Low boundary value = 3000
• High boundary value = 13000
• Number of buckets = 5
SELECT last_name, salary,
       WIDTH_BUCKET(salary,3000,13000,5)
       AS sal_hist
FROM employees;

LAST_NAME            SALARY    SAL_HIST
-------------------- ------ -----------
King                  24000           6
Kochhar               17000           6
De Haan               17000           6
Hunold                 9000           4
Ernst                  6000           2
Austin                 4800           1
Pataballa              4800           1
Lorentz                4200           1
Greenberg             12000           5
Faviet                 9000           4
Sciarra                7700           3
Popp                   6900           2
Raphaely              11000           5
Khoo                   3100           1
Baida                  2900           0
...

WIDTH_BUCKET Example (continued)

In the example, King has a salary of 24000, which is greater than the maximum
boundary value, so his salary is placed in bucket 6. Baida has a salary of 2900,
which is less than the minimum boundary value of 3000, so his salary is placed in
bucket 0.
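
The bucket assignment can be reproduced with a small Python sketch. It is an illustration, not Oracle's implementation, and it assumes that each regular bucket includes its lower boundary and excludes its upper boundary, so a value exactly equal to the high boundary lands in the overflow bucket; the notes above do not spell out that boundary case. The function name mirrors the SQL function but is otherwise hypothetical.

```python
import math


def width_bucket(value, low, high, buckets):
    """Return the equiwidth bucket number for value.

    Bucket 0 catches values below low; bucket buckets + 1 catches values
    at or above high (assumed boundary behavior).
    """
    if value < low:
        return 0
    if value >= high:
        return buckets + 1
    return math.floor((value - low) * buckets / (high - low)) + 1


# A few salaries from the example output.
for last_name, salary in (("King", 24000), ("Hunold", 9000),
                          ("Ernst", 6000), ("Khoo", 3100), ("Baida", 2900)):
    print(last_name, salary, width_bucket(salary, 3000, 13000, 5))
```

With boundaries 3000 and 13000 and five buckets, each regular bucket is 2000 wide, which is why a salary of 9000 falls into bucket 4 (9000 up to, but not including, 11000).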

Grouping Sets
A grouping set is a set of groups that the user wants the system to form.
• Gives the user the power to specify exactly the groupings of interest in the
GROUP BY clause
• Produces a single result set which is equivalent to a UNION ALL approach
• Adheres to ISO SQL:1999 standards
• Grouping set efficiency:
– Only one pass over base table is required
– No need to write complex UNION statements
– The more elements the grouping sets have, the higher the gain using
grouping sets

A grouping set is a set of groups that the user wants the system to form.
Without the enhancements in Oracle9i, multiple queries combined together with
UNION ALL are required to achieve these tasks. A multi-query approach is
inefficient, for it requires multiple scans of the same data. The extensions to the
GROUP BY clause in Oracle9i allow the optimizer to choose better plans, enabling
the SQL execution engine to execute the query very efficiently. Users can analyze
data in one dimension without completely rolling it up, analyze across multiple
dimensions without computing the whole CUBE, or specify multiple arbitrary
groupings to meet any need.
As with the analytic functions, the GROUP BY extensions follow the international
SQL:1999 standards.

Grouping Sets Example
• Calculate aggregates over three groupings:
– Time, Channel, Product
– Time, Channel
– Channel, Product
• Include data from the first two days of December, 1999
and for products 10, 20, and 45.
SELECT time_id, channel_id, prod_id,
       ROUND(SUM(amount)) AS cost
FROM sales
WHERE (time_id = '01-DEC-1999' or
       time_id = '02-DEC-1999')
AND prod_id IN (10, 20, 45)
GROUP BY GROUPING SETS
  ((time_id, channel_id, prod_id),
   (time_id, channel_id),
   (channel_id, prod_id))

In the example shown, three grouping sets of data are requested:
• Time, Channel, Product
• Time, Channel
• Channel, Product
The result set therefore contains rows for each of the three groupings.
Note: For simplicity, only data for the first two days of December 1999, and for
the products 10, 20, and 45 are selected.

Grouping Sets Example Results
TIME_ID     C   PROD_ID      COST
----------- - --------- ---------
Grouping set = time, channel, and product:
01-DEC-1999 I        10      5673
01-DEC-1999 S        20        47
01-DEC-1999 S        45       356
02-DEC-1999 I        10      2232
02-DEC-1999 S        20       242
02-DEC-1999 S        45      1624
Grouping set = time and channel:
01-DEC-1999 I                5673
01-DEC-1999 S                 403
02-DEC-1999 I                2232
02-DEC-1999 S                1865
Grouping set = channel and product:
            I        10      7905
            S        20       289
            S        45      1980

The resulting data displays cost values for each of the three groups.
The first two rows for the grouping set of time, channel and product are interpreted
as:
• On December 1, through internet sales (I), product 10 was sold for a total
cost of 5673.
• On December 1, through direct sales (S), product 20 was sold for a total cost
of 47.
The first two rows for the grouping set of time and channel are interpreted as:
• On December 1, through internet sales, total cost was 5673.
• On December 1, through direct sales, total cost was 403 (which is equal to
356 + 47).
The first two rows for the grouping set of channel and product are interpreted as:
• Through internet sales, product 10 was sold for a total cost of 7905 (which
is equal to 5673 + 2232).
• Through direct sales, product 20 was sold for a total cost of 289 (which is
equal to 47 + 242).
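
How a single pass over the data can feed several groupings at once is sketched below in Python. This is an illustration of the idea only, not a description of how the Oracle execution engine works. It uses the six detail rows from the results as stand-in base data (the real sales table holds unrounded amounts), fills columns that are not part of a grouping with None, and uses a hypothetical function name.

```python
from collections import defaultdict

COLUMNS = ("time_id", "channel_id", "prod_id")

# Stand-in base data: the six detail rows from the example results.
ROWS = [
    {"time_id": "01-DEC-1999", "channel_id": "I", "prod_id": 10, "amount": 5673},
    {"time_id": "01-DEC-1999", "channel_id": "S", "prod_id": 20, "amount": 47},
    {"time_id": "01-DEC-1999", "channel_id": "S", "prod_id": 45, "amount": 356},
    {"time_id": "02-DEC-1999", "channel_id": "I", "prod_id": 10, "amount": 2232},
    {"time_id": "02-DEC-1999", "channel_id": "S", "prod_id": 20, "amount": 242},
    {"time_id": "02-DEC-1999", "channel_id": "S", "prod_id": 45, "amount": 1624},
]


def grouping_sets(rows, sets):
    """Aggregate amount for every grouping in sets with one scan of rows."""
    totals = [defaultdict(int) for _ in sets]
    for row in rows:                                  # single pass over the data
        for total, columns in zip(totals, sets):
            key = tuple(row[c] if c in columns else None for c in COLUMNS)
            total[key] += row["amount"]
    # Concatenate the per-grouping results, as UNION ALL would.
    return [key + (cost,) for total in totals for key, cost in total.items()]


result = grouping_sets(ROWS, [
    ("time_id", "channel_id", "prod_id"),
    ("time_id", "channel_id"),
    ("channel_id", "prod_id"),
])
for line in result:
    print(line)
```

Each input row is read once and contributes to all three groupings, whereas a UNION ALL of three separate GROUP BY queries would scan the rows three times.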

GROUPING SETS vs. CUBE and ROLLUP
• The GROUPING SETS clause allows you to identify exact groups
• GROUP BY CUBE (time, channel, product) produces 2^3=8 groupings
• Only 3 groupings are needed
SELECT time_id, channel_id, prod_id,
       round(sum(amount)) AS cost
FROM sales
WHERE (time_id = '01-DEC-1999' OR
       time_id = '02-DEC-1999')
AND prod_id IN (10, 20, 45)
GROUP BY CUBE(time_id, channel_id, prod_id);

The GROUPING SETS statement shown previously uses composite columns to
identify exactly the sets wanted. The GROUP BY CUBE statement computes all
8 (2*2*2) groupings, so it produces more output than needed and incurs more
computation and overhead.
TIME_ID C PROD_ID COST
--------- - --------- ---------
01-DEC-99 I 10 5673
01-DEC-99 I 5673
01-DEC-99 S 20 47
01-DEC-99 S 45 356
01-DEC-99 S 403
01-DEC-99 10 5673
01-DEC-99 20 47
01-DEC-99 45 356
01-DEC-99 6076
02-DEC-99 I 10 2232
02-DEC-99 I 2232
02-DEC-99 S 20 242
02-DEC-99 S 45 1624
02-DEC-99 S 1865
27 rows selected.
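
The eight groupings that CUBE computes can be enumerated with a short Python sketch, which makes it easy to see that the three groupings requested earlier are only a subset. The helper name is hypothetical, and the sketch lists grouping column combinations only; it does not compute any aggregates.

```python
from itertools import combinations


def cube(columns):
    """Return every subset of columns, from all columns down to the grand total."""
    return [combo
            for size in range(len(columns), -1, -1)
            for combo in combinations(columns, size)]


all_groupings = cube(("time_id", "channel_id", "prod_id"))
wanted = [
    ("time_id", "channel_id", "prod_id"),
    ("time_id", "channel_id"),
    ("channel_id", "prod_id"),
]

print(len(all_groupings))                        # number of groupings CUBE computes
print([g for g in all_groupings if g not in wanted])   # groupings computed but not needed
```

The empty tuple in the list stands for the grand total.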

GROUPING SETS
• Allows you to define multiple groupings in the

same query
• GROUP BY computes all the groupings specified
and combines them with the UNION ALL operator
• The following 2 statements are equivalent
GROUP BY GROUPING SETS (time_id, channel_id, prod_id)

GROUP BY time_id
UNION ALL
GROUP BY channel_id
UNION ALL
GROUP BY prod_id;
2-23
GROUPING SETS
Using the UNION ALL operator instead of the GROUPING SETS clause requires more
scans of the base table, which makes it less efficient.
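The equivalence and the difference in scans can be modelled in Python. This is a sketch only: the rows are the rounded detail amounts from the example output, the function names are hypothetical, and the single-pass loop is a model of the idea, not a description of Oracle's internal execution plan.

```python
from collections import defaultdict

# Detail rows from the example output: (time_id, channel_id, prod_id, amount).
SALES = [
    ("01-DEC-99", "I", 10, 5673),
    ("01-DEC-99", "S", 20, 47),
    ("01-DEC-99", "S", 45, 356),
    ("02-DEC-99", "I", 10, 2232),
    ("02-DEC-99", "S", 20, 242),
    ("02-DEC-99", "S", 45, 1624),
]
COLUMNS = ("time_id", "channel_id", "prod_id")

def result_key(row, keys):
    """Columns outside the grouping are reported as None (NULL)."""
    record = dict(zip(COLUMNS, row))
    return tuple(record[c] if c in keys else None for c in COLUMNS)

def union_all(rows, groupings):
    """UNION ALL form: one full scan of the rows per grouping."""
    totals = defaultdict(int)
    for keys in groupings:
        for row in rows:
            totals[result_key(row, keys)] += row[3]
    return dict(totals)

def grouping_sets(rows, groupings):
    """GROUPING SETS form: every grouping is fed from a single scan."""
    totals = defaultdict(int)
    for row in rows:
        for keys in groupings:
            totals[result_key(row, keys)] += row[3]
    return dict(totals)

sets = [("time_id",), ("channel_id",), ("prod_id",)]
one_pass = grouping_sets(SALES, sets)
print(one_pass == union_all(SALES, sets))
```

Both functions return the same totals; only the number of passes over SALES differs.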

Composite Columns
Example
• Show totals per product
• Show totals per product, channel, and time
• Show grand total
SELECT prod_id, channel_id, time_id,
       ROUND(SUM(amount)) AS cost
FROM sales
WHERE (time_id = '01-DEC-1999' OR
       time_id = '02-DEC-1999')
AND prod_id IN (10, 20, 45)
GROUP BY ROLLUP (prod_id, (channel_id, time_id));
• Columns enclosed in parentheses are treated as a
unit
2-24
Composite Columns
A composite column is a collection of columns that are treated as a unit during the
computation of groupings. You specify the columns in parentheses. Without
composite columns, CUBE and ROLLUP do not give you full control over the
aggregation levels. For example:
SELECT prod_id, channel_id, time_id, ROUND(SUM(amount)) costs
FROM sales
WHERE (time_id = '01-DEC-1999' or time_id = '02-DEC-1999')
AND prod_id IN (10, 20, 45)
GROUP BY ROLLUP(prod_id, channel_id, time_id);
This results in computing 4 groupings:
1. (product, channel, time)
2. (product, channel)
3. (product)
4. (grand total)
The example in the slide results in three groups by creating a composite grouping
set:
GROUP BY ROLLUP(prod_id, (channel_id, time_id))
The three resultant groupings are:
(product, channel, time), (product), (grand total)
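The effect of a composite column on ROLLUP can be sketched in Python. The helper rollup_groupings is hypothetical; it models a composite column as a tuple that is kept or dropped as a whole.

```python
def rollup_groupings(items):
    """Groupings computed by ROLLUP(items).

    Each item is a column name or a tuple of names (a composite column).
    ROLLUP drops items from the right, one at a time, so a composite
    column is either kept or dropped as a unit.
    """
    groupings = []
    for keep in range(len(items), -1, -1):
        grouping = []
        for item in items[:keep]:
            grouping.extend(item if isinstance(item, tuple) else (item,))
        groupings.append(tuple(grouping))
    return groupings

plain = rollup_groupings(["prod_id", "channel_id", "time_id"])
composite = rollup_groupings(["prod_id", ("channel_id", "time_id")])
print(len(plain), plain)
print(len(composite), composite)
```

The plain ROLLUP yields four groupings; the composite form skips (product, channel) and yields three.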

Composite Columns
Example Results
  PROD_ID C TIME_ID       COSTS
--------- - --------- ---------
       10 I 01-DEC-99      5673   Totals per product, channel, time
       10 I 02-DEC-99      2232
       10                  7905   Total per product (10)
       20 S 01-DEC-99        47
       20 S 02-DEC-99       242
       20                   289
       45 S 01-DEC-99       356
       45 S 02-DEC-99      1624
       45                  1980
                          10174   Grand Total
2-25
Composite Columns
The results show groupings for:
• (product, channel, time)
• (product)
• (grand total)

Concatenated Groupings
• A concise way to generate useful combinations of
groupings
• Groupings specified with the concatenated
groupings yield the cross-product of groupings
from each set
• Concatenated groupings are specified by listing
multiple grouping sets, cubes, and rollups
• Easy to develop
• Useful for OLAP applications
2-26
Concatenated groupings offer a concise way to generate useful combinations of
groupings. Groupings specified with concatenated groupings yield the cross-
product of groupings from each grouping set. The cross-product operation enables
even a small number of concatenated groupings to generate a large number of final
groups. The concatenated groupings are specified simply by listing multiple
grouping sets, cubes, rollups, and separating them with commas.
Benefits of Concatenated Groupings
• Easy to develop
• Use by applications: SQL generated by OLAP applications often involves
concatenation of grouping sets, with each grouping set defining the groupings
needed for a dimension.

Example: Aggregate costs values for each product
rolled up across time, and across channel
SELECT prod_id, channel_id, time_id,
       ROUND(SUM(amount)) AS costs
FROM sales
WHERE (time_id = '01-DEC-1999' OR
       time_id = '02-DEC-1999')
AND prod_id IN (10, 20, 45)
GROUP BY prod_id, CUBE(channel_id),
         ROLLUP(time_id);
2-27
This results in the following groupings:
(product, channel, time), (product, channel), (product, time), (product)
  PROD_ID C TIME_ID       COSTS
--------- - --------- ---------
       10 I 01-DEC-99      5673
       10 I 02-DEC-99      2232
       20 S 01-DEC-99        47
       20 S 02-DEC-99       242
       45 S 01-DEC-99       356
       45 S 02-DEC-99      1624
       10   01-DEC-99      5673
       10   02-DEC-99      2232
       20   01-DEC-99        47
       20   02-DEC-99       242
       45   01-DEC-99       356
       45   02-DEC-99      1624
       10 I                7905
       10                  7905
       20 S                 289
       20                   289
       45 S                1980
       45                  1980
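The cross-product that produces the four groupings listed above can be sketched in Python. The helpers cube, rollup, and concatenate are hypothetical names that model the GROUP BY elements as lists of groupings.

```python
from itertools import combinations, product

def cube(columns):
    """All subsets of the columns, largest first."""
    return [tuple(c) for n in range(len(columns), -1, -1)
            for c in combinations(columns, n)]

def rollup(columns):
    """Prefixes of the column list, longest first."""
    return [tuple(columns[:n]) for n in range(len(columns), -1, -1)]

def concatenate(*grouping_lists):
    """Cross-product of the groupings contributed by each GROUP BY element."""
    return [sum(parts, ()) for parts in product(*grouping_lists)]

groupings = concatenate(
    [("prod_id",)],          # plain column: present in every grouping
    cube(["channel_id"]),    # (channel_id), ()
    rollup(["time_id"]),     # (time_id), ()
)
for grouping in groupings:
    print(grouping)
```

Because each element contributes its groupings independently, 1 x 2 x 2 = 4 final groupings result, and adding one more two-way element would double that count.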

The WITH Clause Overview
• Allows you to use the same query block in a
SELECT statement when it occurs more than once
within a complex query
• Retrieves the results of a query block and stores
them in the user's temporary tablespace
• Can improve performance
2-28
The WITH Clause Overview

Using the WITH clause, you can define a query block once and reuse it, through
materialization, when the block is costly to evaluate and occurs more than once
within a complex query.
The WITH clause lets you define a query block before using it in a query.
This can improve performance.

WITH Clause Example
Example: Find all departments whose total salary cost is
above 1/8 the total salary cost of the whole company
WITH
summary AS (
  SELECT department_name, SUM(salary) AS dept_total
  FROM employees, departments
  WHERE employees.department_id =
        departments.department_id
  GROUP BY department_name )
SELECT department_name, dept_total
FROM summary
WHERE dept_total > (
  SELECT SUM(dept_total) * 1/8
  FROM summary )
ORDER BY dept_total DESC
/
2-29
The WITH Clause Example

The output from the query identifies the Sales and Shipping departments as having
salary costs above one eighth of the total salary cost for the company.
DEPARTMENT_NAME DEPT_TOTAL
------------------------------ ----------
Sales 311500
Shipping 156400
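The shape of the query can be tried outside Oracle. The following Python sketch runs it against an in-memory SQLite database as a stand-in; SQLite also supports the WITH clause, but its optimizer and materialization behavior are not Oracle's. The table rows are hypothetical sample data, not the course schema, so the totals differ from the output above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (department_id INTEGER, department_name TEXT);
    CREATE TABLE employees (department_id INTEGER, salary INTEGER);
    -- Hypothetical sample rows, not the course schema data.
    INSERT INTO departments VALUES (10, 'Sales'), (20, 'Shipping'),
                                   (30, 'IT'), (40, 'Finance');
    INSERT INTO employees VALUES (10, 5000), (10, 6000), (20, 3000),
                                 (20, 3500), (30, 1000), (40, 1500);
""")

# The named block "summary" is written once and referenced twice.
query = """
    WITH summary AS (
        SELECT department_name, SUM(salary) AS dept_total
        FROM employees, departments
        WHERE employees.department_id = departments.department_id
        GROUP BY department_name )
    SELECT department_name, dept_total
    FROM summary
    WHERE dept_total > (SELECT SUM(dept_total) / 8.0 FROM summary)
    ORDER BY dept_total DESC
"""
rows = conn.execute(query).fetchall()
print(rows)
```

With these sample rows the company total is 20000, so only departments above 2500 are returned.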

The WITH Clause Implementation
Equivalent query, less powerful syntax:
SELECT department_name, SUM(salary) AS dept_total
FROM employees, departments
WHERE employees.department_id =
      departments.department_id
GROUP BY department_name HAVING
SUM(salary) > (
  SELECT SUM(salary) * 1/8
  FROM employees, departments
  WHERE employees.department_id =
        departments.department_id )
ORDER BY SUM(salary) DESC
/
2-30

The query retrieves all departments whose total salary cost is above 1/8 of
the total salary cost of the company. The query has two blocks: one main
query block and one subquery block. Both blocks perform the same joins
and aggregations.
Using the WITH clause, we can materialize the query block that does the GROUP
BY, thus avoiding summarizing the department total cost more than once.

• Internally resolved as:
– In-line view
– Temporary table
• Depending on the cost/benefit of temporarily
storing the result of the WITH clause,
the optimizer chooses the appropriate resolution.
2-31

The WITH Clause Usage Notes
• Only used for SELECT statements
• A query name is visible to all WITH element query
blocks defined after it and the main query block
itself.
• When the query name is the same as an existing
table name, the parser searches from the inside
out; the query block name takes precedence over
the table name.
• The WITH clause can hold more than one query;
the queries are separated by commas.
2-32
The WITH Clause Usage Notes

A query name is visible to all WITH element query blocks (including their
subquery blocks) defined after it and the main query block itself (including its
subquery blocks).
A query name can be the same as a persistent table name or a query name in the
WITH list of another query block. If this happens, the parser searches for the right
definition from the inside out (in terms of query block nesting). The innermost query
name definition is used for resolution. It is up to users to make sure that query names
do not conflict with permanent tables unless that is intended. This is similar to the C
language, in which a local variable has priority when its name is the same as that of a
global variable.
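The two visibility rules can be observed in a runnable form with SQLite, used here only as a stand-in that happens to resolve query names the same way; the table and query names are hypothetical, and Oracle's behavior is as described in the notes above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- A persistent table whose name is reused below as a query name.
    CREATE TABLE summary (src TEXT);
    INSERT INTO summary VALUES ('persistent table');
""")

# Outside any WITH clause, the name refers to the table.
plain = conn.execute("SELECT src FROM summary").fetchall()

# Inside the statement, the query name takes precedence over the table,
# and the second query name (tagged) can see the first one (summary).
shadowed = conn.execute("""
    WITH summary AS (SELECT 'query block' AS src),
         tagged  AS (SELECT src || ' (via tagged)' AS src FROM summary)
    SELECT src FROM tagged
""").fetchall()

print(plain)
print(shadowed)
```

The persistent table is untouched by the statement; the shadowing lasts only for the query that declares the name.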

WITH Clause Benefit
Business Value:
• Isolate the business question from data gathering
• Make the query easy to read
• Evaluate a clause only once, even if it appears
multiple times in the query
• Performance, since the instantiated subquery clause
must be calculated only once
2-33

Summary
In this lesson, you should have learned how to:
• Use new analytical functions in Oracle9i
• Use grouping sets
• Use the new WITH clause
2-34

Miscellaneous Enhancements

Objectives
• Use List Partitioning
• Create constraints on views
• Edit Outlines
3-2

List Partitioning Overview and Benefits
• New partition method introduced in Oracle9i
• User controls how rows map to partitions
• The LIST partition method allows for the distribution of
data, based on discrete column values
• Unordered and unrelated sets of data can be grouped
and organized together very naturally using LIST
partitioning
• No relationship between partitions
• Ideal for columns that consist of bounded set of discrete
values
• Powerful data-management capability
3-3
List Partitioning Overview and Benefits

The Partitioned Objects feature has incrementally added new partition methods to the Oracle
RDBMS over two releases: Oracle8 and Oracle8i. However, requirements keep emerging from
customers who cannot take full advantage of the Oracle8i partition methods provided so far
because their data model does not dovetail with those methods.
Thus, Oracle9i adds a new partitioning method called LIST partitioning to the set of partition methods
already supported in the Oracle RDBMS. The LIST method allows explicit control over how rows
map to partitions. This is done by specifying a list of discrete values for the partitioning column in the
description of each partition. This is different from RANGE partitioning, where a range of values is
associated with a partition, and from HASH partitioning, where the user has no control over how
rows map to partitions. This partition method has been added specifically to model data distributions
that follow discrete values, which cannot be done easily with Oracle8i.

List Partitioning Example
CREATE TABLE locations (
  location_id, street_address, postal_code,
  city, state_province, country_id )
PARTITION BY LIST (state_province)
STORAGE(INITIAL 10K NEXT 20K) TABLESPACE tbs5
(
  PARTITION region_east
    VALUES ('MA','NY','CT','NH','ME','MD','VA','PA','NJ')
    STORAGE (INITIAL 20K NEXT 40K PCTINCREASE 50)
    TABLESPACE tbs8,
  PARTITION region_west
    VALUES ('CA','AZ','NM','OR','WA','UT','NV','CO')
    PCTFREE 25 NOLOGGING,
  PARTITION region_south
    VALUES ('TX','KY','TN','LA','MS','AR','AL','GA'),
  PARTITION region_central
    VALUES ('OH','ND','SD','MO','IL','MI',NULL,'IA'));
3-4
List Partitioning Example

The details of LIST partitioning can best be described with an example. In this case we want to
partition the LOCATIONS table by region, that is, we want to group states together according to their
geographical location.
A row is mapped to a partition by checking whether the value of the partitioning column for the row
falls within the set of values that describes the partition.
For example, the following rows are handled as follows:
• (1500,'2011 Interiors Blvd', '99236','South San Francisco', 'CA',
'US') maps to partition region_west
• (5000,'Chemin de la fanee','13840','ROGNES','BR','FR') does not map to
any partition in the table, because the STATE_PROVINCE 'BR' is not part of any
listed values in the partition definitions.
This partitioning method is useful when the partitioning
column consists of a bounded set of discrete values. Note that right now, it is not possible to define
a ‘catch-all’ partition (like MAXVALUE for Range partitioning). This will be available in a later
release of Oracle9i.
In the above example, the partitions with specified physical attributes override the table-level
defaults; any partition with unspecified attributes inherits its physical attributes from the
table-level defaults.
Note: For formatting reasons, we didn’t specify the column data types. Refer to the common schema
definition for more details on those data types.
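The row-to-partition mapping described above can be sketched in Python. The value lists are copied from the CREATE TABLE example; the function name partition_for and the use of ValueError to stand for Oracle's error are assumptions made for illustration.

```python
# Value lists from the LOCATIONS example; None stands for NULL.
PARTITIONS = {
    "region_east": {"MA", "NY", "CT", "NH", "ME", "MD", "VA", "PA", "NJ"},
    "region_west": {"CA", "AZ", "NM", "OR", "WA", "UT", "NV", "CO"},
    "region_south": {"TX", "KY", "TN", "LA", "MS", "AR", "AL", "GA"},
    "region_central": {"OH", "ND", "SD", "MO", "IL", "MI", None, "IA"},
}

def partition_for(state_province):
    """Return the partition whose value list contains the partitioning key."""
    for name, values in PARTITIONS.items():
        if state_province in values:
            return name
    # There is no catch-all partition, so the row cannot be stored.
    raise ValueError(f"{state_province!r} does not map to any partition")

print(partition_for("CA"))    # the South San Francisco row
print(partition_for(None))    # NULL is listed in region_central
try:
    partition_for("BR")       # the ROGNES row
except ValueError as error:
    print("rejected:", error)
```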

List Partitioning Pruning
Here are the three pruning types supported for LIST
partitioned tables:
–Equality
SELECT * FROM LOCATIONS WHERE state_province = 'CA';
–IN-LIST
SELECT * FROM LOCATIONS
WHERE state_province IN ('CA', 'NY');
–Range
SELECT * FROM LOCATIONS WHERE state_province <= 'AZ';
3-5
List Partitioning Pruning

One of the interesting things to note about LIST partitioning is that there is no apparent sense of
ordering between partitions (unlike RANGE). Nevertheless, Oracle9i supports partition pruning for
objects partitioned by the LIST method for queries involving the following predicates on the partitioning
key:
• Equality: Only the corresponding partition is accessed. Based on the LOCATIONS table and the
first query above, only partition region_west is accessed.
• IN-LIST: Only the corresponding partitions are accessed. Based on the LOCATIONS table
and the second query above, Oracle accesses only region_west and region_east.
• Range: Only partitions that contain literal values in their list that correspond to the range
predicate are accessed. Based on the LOCATIONS table and the third query above, only partition
region_west ('CA','AZ','NM','OR','WA','UT','NV','CO') and partition
region_south ('TX','KY','TN','LA','MS','AR','AL','GA') are accessed.
Note: As with other partitioning methods, the LIST partitioning feature also allows pruning with
bind variables.
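The three pruning cases can be reproduced with a short Python sketch: a partition is kept only if at least one of its listed values satisfies the predicate. The prune function is a hypothetical model of the idea, not Oracle's pruning algorithm, and it compares strings with Python's default ordering.

```python
# Value lists from the LOCATIONS example; None stands for NULL.
PARTITIONS = {
    "region_east": {"MA", "NY", "CT", "NH", "ME", "MD", "VA", "PA", "NJ"},
    "region_west": {"CA", "AZ", "NM", "OR", "WA", "UT", "NV", "CO"},
    "region_south": {"TX", "KY", "TN", "LA", "MS", "AR", "AL", "GA"},
    "region_central": {"OH", "ND", "SD", "MO", "IL", "MI", None, "IA"},
}

def prune(predicate):
    """Names of partitions holding a listed value that satisfies predicate."""
    return {
        name
        for name, values in PARTITIONS.items()
        # NULL never satisfies =, IN, or <=, so it is skipped.
        if any(v is not None and predicate(v) for v in values)
    }

equality = prune(lambda v: v == "CA")
in_list = prune(lambda v: v in ("CA", "NY"))
range_scan = prune(lambda v: v <= "AZ")

print(sorted(equality))
print(sorted(in_list))
print(sorted(range_scan))
```

In the range case, 'AZ' keeps region_west while 'AL' and 'AR' keep region_south, which matches the partitions named in the notes.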

ALTER TABLE ADD PARTITION
Example
• For this example to work, you must ensure that no
value in the set of literal values that describes the
partition being added exists in any of the
other partitions of the table
ALTER TABLE locations
ADD PARTITION region_nonmainland
VALUES ('HI', 'PR')
STORAGE (INITIAL 20K NEXT 20K)
TABLESPACE tbs_3
NOLOGGING;
3-6
ALTER TABLE ADD PARTITION Example

In Oracle9i, the syntax and semantics of this DDL statement are modified to support adding a single
partition to a table partitioned using the LIST method. The user simply adds a new
partition to the set of partitions of a table; there is no ordering among the partitions. The
newly created partition has the following characteristics:
• The newly added partition has no data.
• A partition name, a set of literal values describing the partition value list, physical attributes, and
logging attributes may be specified for the partition.
• Every literal value in the set that describes the partition value list must be
unique within that object. If there is a duplicate entry, Oracle returns an error.
• If any physical attributes of the partition are not specified, they are derived by
combining the default table-level attributes with the default attributes of the tablespace in which the
partition is placed.
• If a partition name is not specified, a system-generated name of the form SYS_P### is
assigned to the partition.
• Adding a partition to a table adds a corresponding index partition (with the same value list)
to all local indexes defined on the table. Global indexes are not affected.
• ALTER TABLE ADD PARTITION on a LIST partitioned table locks the table in Exclusive
mode (X) for the duration of the operation. This is a fast dictionary operation, and no other DDL or
DML is permitted on the table while it runs.

ALTER TABLE ADD PARTITION Example (Continued)
The above example adds a new partition to the LOCATIONS table partitioned by
the LIST method. The example specifies some new physical attributes for this new
partition and inherits the table-level defaults for the others.
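The uniqueness rule for the new value list can be sketched in Python. The function add_partition and the dictionary model of the table are hypothetical; ValueError stands in for the Oracle error, and physical attributes are left out.

```python
def add_partition(partitions, name, values):
    """Return a new partition map with an empty partition added."""
    values = set(values)
    for existing, listed in partitions.items():
        clash = values & listed
        if clash:
            # A literal may appear in only one partition of the table.
            raise ValueError(f"{clash} already listed in {existing}")
    updated = {n: set(v) for n, v in partitions.items()}
    updated[name] = values
    return updated

locations = {
    "region_west": {"CA", "AZ", "NM", "OR", "WA", "UT", "NV", "CO"},
    "region_south": {"TX", "KY", "TN", "LA", "MS", "AR", "AL", "GA"},
}
locations = add_partition(locations, "region_nonmainland", ["HI", "PR"])
print(sorted(locations))

try:
    add_partition(locations, "region_pacific", ["HI", "GU"])
except ValueError as error:
    print("rejected:", error)
```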

ALTER TABLE MERGE PARTITION
Example
• This DDL command can be used to merge the contents of
any two arbitrary partitions of a LIST partitioned table
ALTER TABLE locations
MERGE PARTITIONS
region_northwest, region_southwest
INTO PARTITION region_west
PCTFREE 50 STORAGE(MAXEXTENTS 20);
3-8
ALTER TABLE MERGE PARTITION Example

The syntax of this operation is the same for a LIST partitioned table as for a RANGE
partitioned table, but the semantics are modified to operate on tables partitioned using the LIST
method. The statement can be used to merge the contents of any two arbitrary partitions of a table.
The candidate partitions do not necessarily have to be adjacent, because LIST partitioning does not
assume any order for partitions. The resulting partition comprises the union of the sets of values that
formed the two partitions being merged.
Usage:
• If a name for the resulting partition is not specified, a name of the form SYS_P# is assigned.
• Any two partitions can be merged.
• The resulting partition value list comprises the set of literal values that represents the union of
the partition value lists of the two partitions being merged.
• The data in the resulting partition consists of the data from both partitions.
• When merging partitions of tables partitioned using the LIST method, the tablespace in which the
resulting partition is located and the partition’s attributes are determined by the table-level
default attributes, except for those specified explicitly.
• The corresponding local index partitions are also merged, and the resulting local index
partition is marked Index Unusable.
• All global indexes defined on the table are marked Index Unusable. These include both
partitioned and non-partitioned indexes.
• ALTER TABLE MERGE PARTITIONS locks the table in SX mode and the partitions
being merged in X mode.
ALTER TABLE MERGE PARTITION Example (Continued)
The above example merges two partitions of the LOCATIONS table, partitioned
using the LIST method, into a partition that inherits all of its attributes from the
table-level default attributes, except for PCTFREE and MAXEXTENTS, which are
specified in the statement.
If the value list for:
– partition region_northwest was described as
('CA','OR','NV','UT') and
– partition region_southwest was described as
('AZ','NM','CO','WA')
then the resulting partition value list comprises the set that represents the
union of these two partition value lists:
– partition region_west will have the value list
('CA','OR','NV','UT','AZ','NM','CO','WA')
Note: In this example, we suppose that at some point the initial partition
region_west was split into region_northwest and
region_southwest.
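The union of value lists can be sketched in Python using the two lists given above. The function merge_partitions is a hypothetical model that tracks value lists only, not rows, indexes, or physical attributes.

```python
def merge_partitions(partitions, first, second, into):
    """Merge two list partitions; the result lists the union of their values."""
    updated = {n: set(v) for n, v in partitions.items()
               if n not in (first, second)}
    updated[into] = set(partitions[first]) | set(partitions[second])
    return updated

table = {
    "region_northwest": {"CA", "OR", "NV", "UT"},
    "region_southwest": {"AZ", "NM", "CO", "WA"},
}
merged = merge_partitions(
    table, "region_northwest", "region_southwest", "region_west")
print(sorted(merged["region_west"]))
```

No adjacency test is needed because the value lists carry no order; any two partitions can be passed in.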

ALTER TABLE MODIFY PARTITION ADD VALUES
Example
• Specified new literal values must not already
exist in any partition’s value list
ALTER TABLE locations
MODIFY PARTITION region_south
ADD VALUES ('OK', 'KS');
3-10
ALTER TABLE MODIFY PARTITION ADD VALUES Example

This is a new statement that can be used to extend the partition value list of an existing partition
with additional literal values. The statement describes the new set of literal values that
are added to the existing partition value list. None of the new literal values
may already be specified in the value list of any partition of the
table.
The above example adds a new set of state codes ('OK','KS') to the value list of the existing
partition region_south.
Usage:
• The literal values being added to the partition must be unique within the
literal values that describe the partition value lists of all existing partitions. If duplicate values are
found, Oracle returns an error.
• The partition value list of the corresponding local index partition is also
extended.
• The status (Index Unusable) of local and global index partitions is not affected by this
operation.
• This is a fast operation. During the operation, a DML Row Exclusive (SX) lock is
acquired on the table and a DML Exclusive (X) lock is acquired on the partition being modified.
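The check performed before a value list is extended can be sketched in Python. The function add_values is hypothetical, ValueError stands in for the Oracle error, and only value lists are modelled.

```python
def add_values(partitions, name, new_values):
    """Extend one partition's value list after checking every partition."""
    new_values = set(new_values)
    for other, listed in partitions.items():
        duplicates = new_values & listed
        if duplicates:
            raise ValueError(f"{duplicates} already listed in {other}")
    updated = {n: set(v) for n, v in partitions.items()}
    updated[name] |= new_values
    return updated

locations = {
    "region_west": {"CA", "AZ", "NM", "OR", "WA", "UT", "NV", "CO"},
    "region_south": {"TX", "KY", "TN", "LA", "MS", "AR", "AL", "GA"},
}
locations = add_values(locations, "region_south", ["OK", "KS"])
print(sorted(locations["region_south"]))

try:
    add_values(locations, "region_south", ["CA"])
except ValueError as error:
    print("rejected:", error)
```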

ALTER TABLE MODIFY PARTITION DROP VALUES
Example
• There must be no rows in the partition for the literal
values being dropped
DELETE locations
WHERE state_province IN ('OK', 'KS');
ALTER TABLE locations
MODIFY PARTITION region_south
DROP VALUES ('OK', 'KS');
3-11
ALTER TABLE MODIFY PARTITION DROP VALUES Example

This is a new statement that can be used to prune the partition value list of an existing partition so that
it contains fewer literal values. The operation removes a set of literal values from the existing set of
literal values that describes the partition’s value list. The operation expects
no rows to exist in the partition for the literal values being dropped. It checks for
such rows and fails with an error message if any are found, so the user must delete the corresponding
rows first.
The above statement drops a set of state codes ('OK','KS') from the value list of the existing
region_south partition. The operation is always run with validation: it checks whether
any rows exist in the partition that correspond to the values being dropped, and if any such rows
are found, Oracle returns an error message and the operation fails. The user is expected to issue
a DELETE statement first to delete the corresponding rows from the table. The operation also drops
the corresponding values from the dictionary.
Usage:
• The literal values being dropped from the partition must be a subset of the
literal values that describe the current partition value list. Otherwise, Oracle returns an error
message.
• This operation cannot be used to drop all values that correspond to a partition. Use
ALTER TABLE DROP PARTITION instead.

ALTER TABLE MODIFY PARTITION DROP VALUES Example
(Continued)
• The partition is checked for the existence of any row whose value for the
partitioning key corresponds to any of the values being dropped. If such a row
is found, an error is returned and the operation fails. The user must issue a
DML DELETE statement to remove the rows that correspond to the values
being dropped before reissuing the command.
• The dictionary is also updated to reflect the new partition value list for the
partition being modified.
• During the operation, a DML Row Exclusive (SX) lock is acquired on the table
and a DML Exclusive (X) lock is acquired on the partition.
• The DDL timestamp of the table and partition is updated.
• Because a query is executed to check for the existence of rows in the
partition that correspond to the literal values being dropped, it
is advisable to create a local prefixed index on the table. This speeds up the
execution of the query and the overall operation.
• The status (Index Unusable) of local indexes remains unaffected; however, the
affected index partition takes on the new partition value list. Global
indexes remain unaffected by this operation.
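The validation steps for DROP VALUES can be sketched in Python. The function drop_values is hypothetical; it takes the partition's value list and the partitioning-key values of the rows currently stored in it, and ValueError stands in for the Oracle errors.

```python
def drop_values(value_list, stored_keys, to_drop):
    """Return the reduced value list, or raise if the drop is not allowed."""
    to_drop = set(to_drop)
    if not to_drop <= value_list:
        raise ValueError("values to drop must be a subset of the value list")
    if to_drop == value_list:
        raise ValueError("cannot drop all values; drop the partition instead")
    # Validation query: no stored row may use a value being dropped.
    if any(key in to_drop for key in stored_keys):
        raise ValueError("rows exist for the values being dropped")
    return value_list - to_drop

region_south = {"TX", "KY", "TN", "LA", "MS", "AR", "AL", "GA", "OK", "KS"}
rows = ["TX", "OK", "GA"]           # hypothetical partitioning keys in use

try:
    drop_values(region_south, rows, ["OK", "KS"])
except ValueError as error:
    print("rejected:", error)

rows = [key for key in rows if key not in ("OK", "KS")]   # the DELETE step
region_south = drop_values(region_south, rows, ["OK", "KS"])
print(sorted(region_south))
```

The scan over stored_keys is the part that a local prefixed index would speed up on a real table.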

ALTER TABLE SPLIT PARTITION
Example
• In this example, ('CT', 'VA', 'MD') is the list of values for
the first new partition.
• The list of values for the second new partition is:
list of values of region_east excluding ('CT', 'VA', 'MD')
ALTER TABLE locations
SPLIT PARTITION region_east
VALUES ('CT', 'VA', 'MD')
INTO (
  PARTITION region_east_1
    TABLESPACE tbs2,
  PARTITION region_east_2
    STORAGE (NEXT 2M) )
PARALLEL 5;
3-13
ALTER TABLE SPLIT PARTITION Example

The syntax of this statement is modified to support LIST partitioned tables. The semantics of this
statement are also extended to allow the value list of a single partition to be split into two
distinct partitions with non-overlapping value lists. The syntax is exactly the same as that for
splitting RANGE partitions, except that the AT keyword is replaced with the VALUES keyword.
The list of values following the VALUES keyword applies to the first of the two new partitions
being created. The list of values for the second partition is obtained by subtracting the first new
partition’s literal-values list from the original literal-values list of the partition being split.
Additionally, the new partitions can have specified partition names, physical attributes, and a logging
clause.
If the INTO clause is not specified, system-generated names are used for the two new partitions.
The above example splits the partition region_east into two partitions: region_east_1 with the
literal-value list ('CT','VA','MD'), and region_east_2 inheriting the remaining
literal-value list ('NY','NH','ME','MA','PA','NJ'). The individual partitions have new
physical attributes specified at the partition level. This operation is run with parallelism of degree 5.
Usage
• The set of literal values specified for the first new partition must be a subset
of the literal values that describe the value list of the partition being split. Otherwise, Oracle returns
an error message.
• No new literal values can be added to the partition description other than those that existed for
the partition being split.

ALTER TABLE SPLIT PARTITION Example (Continued)
• After the split, rows whose partitioning key value is in the specified
VALUES list are placed in the first partition. Rows whose partitioning key
value is not in the specified VALUES list are placed in the second
partition.
• The new partitions inherit all unspecified physical attributes from the
partition being split (not the table-level defaults).
• This statement also performs a matching split on the corresponding partition
in each local index defined on the table. The index partitions are split even if
they are marked Index Unusable.
• New local index partitions are assigned the same name as the
corresponding new base table partition, except for indexes that already have a
partition with that name, in which case Oracle generates a name
of the form SYS_Pn.
• With the exception of the TABLESPACE attribute, the physical attributes of the
LOCAL index partition being split are used for both new index partitions. If the
parent LOCAL index lacks a default TABLESPACE attribute, the new LOCAL index
partitions reside in the same tablespaces as the corresponding newly
created partitions of the underlying table.
• All global indexes are marked Index Unusable if the partition is not
empty.
• ALTER TABLE SPLIT PARTITION locks the table in Row Exclusive
(SX) mode and locks the partition in Exclusive (X) mode.
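The subtraction that defines the second partition, and the routing of rows after the split, can be sketched in Python. The functions split_partition and route are hypothetical. The non-empty check is an assumption drawn from the rule that every value list needs at least one literal.

```python
def split_partition(value_list, first_values):
    """Split a value list into (first, second) non-overlapping lists."""
    first = set(first_values)
    if not first <= value_list:
        raise ValueError("VALUES must be a subset of the partition being split")
    second = value_list - first
    # Assumption: neither new partition may end up with an empty list.
    if not first or not second:
        raise ValueError("each new partition needs at least one literal")
    return first, second

def route(state_province, first):
    """Rows matching the VALUES list go to the first new partition."""
    return "region_east_1" if state_province in first else "region_east_2"

region_east = {"MA", "NY", "CT", "NH", "ME", "MD", "VA", "PA", "NJ"}
east_1, east_2 = split_partition(region_east, ["CT", "VA", "MD"])
print(sorted(east_1), sorted(east_2))
print(route("VA", east_1), route("NY", east_1))
```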

List Partitioning Usage
• Currently supported only for Heap tables
• Multi-column partitioning not supported
• The specified literal values have to be unique across all
literal values of all partition value lists of the object
• NULL can be specified as a partition literal value
• MAXVALUE cannot be specified
• Any list must have at least one literal
• The string of listed literals for a partition cannot exceed 4K bytes
• Partition pruning, partition wise joins, and parallelism
supported
• Local indexes and global range partitioned indexes
supported
3-15
List Partitioning Usage

In general, all semantics that apply to the RANGE method also apply to LIST. Here are some specific
features that apply to the LIST partition method:
• The LIST method is not extended to IOTs in Oracle9i.
• Unlike RANGE and HASH partitioning, multi-column partitioning is not supported for
LIST partitioning. If a table is partitioned by LIST, the partitioning key can consist of only a single
column of the table. Otherwise, all columns that can be partitioned by RANGE or HASH can be
partitioned by LIST.
• The partitions do not have any implicit ordering as in RANGE, and hence can be specified in any
order.
• The literal values that define the partitioning criterion of any given partition must be
unique across all literal values of all partition value lists of the object. That is, a literal value in one
partition value list (such as ’AZ’) cannot be specified as a value in any other partition. If there are
duplicate values, Oracle returns an error.
• The keyword NULL can be specified as a value element of this list, so NULL values for a
partitioning key can map to a specific partition. However, be cautious with predicates
that use an IN-LIST clause involving NULL values, because SQL semantics treat the NULL
literal differently from other literal values. Queries that test for NULL
should use IS NULL predicates rather than regular equality (=) predicates.
• The literal MAXVALUE cannot be specified as a partition literal value, because it has no real
meaning for LIST partitioning.

List Partitioning Usage (Continued)
• The set of literal values that describe a partition value list must have at least one
element i.e. it cannot be empty
• The string comprising the list of values for a partition cannot exceed 4K bytes.
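The caution about NULL in predicates is standard SQL behavior and can be observed with any SQL engine. The Python sketch below uses an in-memory SQLite table as a stand-in; the two rows and the helper ids are hypothetical, and no partitioning is involved.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE locations (location_id INTEGER, state_province TEXT);
    -- Hypothetical rows: one with a state code, one with NULL.
    INSERT INTO locations VALUES (1, 'OH'), (2, NULL);
""")

def ids(where):
    """Return the location_id values selected by a WHERE condition."""
    sql = f"SELECT location_id FROM locations WHERE {where} ORDER BY location_id"
    return [row[0] for row in conn.execute(sql)]

equals_null = ids("state_province = NULL")           # never true
in_list = ids("state_province IN ('OH', NULL)")      # NULL member matches nothing
is_null = ids("state_province IS NULL")              # finds the NULL row

print(equals_null, in_list, is_null)
```

Only the IS NULL form finds the row whose partitioning key is NULL.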

View Constraints: Overview and Benefits
• Table constraints allow Data Warehouse applications
to identify cubes in the Database
• In an environment where Data Warehouse applications
must access views instead of tables, it is no longer
possible to identify cubes
• Constraints on views will enable:
– Data Warehouse applications to identify view-based
cubes
– New rewrites for Materialized Views containing
views
• Enforced at the table level only
3-17
View Constraints: Overview and Benefits

Data Warehouse applications recognize multi-dimensional cubes in the Oracle
RDBMS by identifying Referential Integrity (RI) constraints in the relational
schema. RI constraints represent primary and foreign key relationships between fact
and dimension tables. By querying the Oracle data dictionary, applications can
recognize RI constraints and hence the cubes in the database.
However, this does not work in an environment where DBAs, for schema
complexity or security reasons, define views on fact and dimension tables. In such
an environment, applications cannot identify the cubes properly.
By allowing constraint definitions between views in Oracle9i, DBAs can propagate
base table constraints to the views, thereby allowing applications to recognize cubes
even in a restricted environment.
Please note that view constraint definitions are declarative in nature. Operations
on views remain subject to the integrity constraints defined on the underlying base
tables, so constraints on views can be enforced through constraints on base tables.
Defining constraints on base tables is necessary, not only for data correctness and
cleanliness, but also for materialized view query rewrite purposes.
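
As an illustration of how an application might discover fact-to-dimension relationships from the data dictionary, the query below lists each foreign key constraint together with the key it references. It is a minimal sketch that assumes the standard USER_CONSTRAINTS dictionary view; the column aliases are illustrative only, and a real tool would likely query the ALL_ or DBA_ variants.

```sql
-- List each referential integrity (foreign key) constraint in the current
-- schema together with the primary or unique key it references.
SELECT fk.table_name      AS referencing_object,
       fk.constraint_name AS fk_name,
       pk.table_name      AS referenced_object,
       pk.constraint_name AS referenced_key,
       fk.rely            AS fk_rely
FROM   user_constraints fk
JOIN   user_constraints pk
       ON  pk.constraint_name = fk.r_constraint_name
       AND pk.owner           = fk.r_owner
WHERE  fk.constraint_type = 'R'          -- 'R' = referential integrity
AND    pk.constraint_type IN ('P', 'U'); -- primary or unique key
```

Because USER_CONSTRAINTS only covers the current schema, keys referenced in another schema would not appear in this result.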

View Constraints Types
• The following types of constraint declarations are
supported:
– Primary key constraint
– Unique constraint
– Referential Integrity constraint:
– A table can reference a view constraint
– A view can reference a table constraint
• NOT NULL constraints are propagated from tables
• DISABLE NOVALIDATE is the only valid state
• RELY or NORELY states are also supported
View Constraints Types

The following types of view constraints are supported:
• Primary key constraint,
• Unique constraint, and
• Referential Integrity constraint. Note that a table can reference a Primary key
or Unique key constraint defined on a view, and a view can reference a
constraint defined on a table.
NOT NULL constraints on views are not supported explicitly, because they are
propagated from the NOT NULL constraints on base tables.
Given that view constraints are declarative, DISABLE NOVALIDATE is the only
valid state for a view constraint. However, you can choose the RELY
or NORELY state for view constraints, because constraints on views may be used to
enable more sophisticated query rewrites. A view constraint in RELY state
allows query rewrite to occur when the query integrity level is set to TRUSTED mode.
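
The session settings that let RELY constraints take part in query rewrite can be sketched as follows. This assumes the standard QUERY_REWRITE_ENABLED and QUERY_REWRITE_INTEGRITY parameters and a session that is allowed to change them; whether a given query is actually rewritten still depends on the materialized views available.

```sql
-- Allow the optimizer to consider query rewrite in this session.
ALTER SESSION SET QUERY_REWRITE_ENABLED = TRUE;

-- TRUSTED lets the optimizer rely on constraints declared RELY,
-- even though they are neither validated nor enforced.
ALTER SESSION SET QUERY_REWRITE_INTEGRITY = TRUSTED;
```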

Creating View Constraints Example
CREATE VIEW DEPT10_EMPLOYEES (
  id PRIMARY KEY RELY DISABLE NOVALIDATE,
  first_name, last_name, email )
AS
SELECT employee_id, first_name, last_name, email
FROM EMPLOYEES
WHERE department_id = 10;

CREATE VIEW DEPT30_EMPLOYEES (
  id, first_name, last_name, email,
  CONSTRAINT pk_d30e
  PRIMARY KEY (id) RELY DISABLE NOVALIDATE )
AS
SELECT employee_id, first_name, last_name, email
FROM EMPLOYEES
WHERE department_id = 30;
Creating View Constraints Examples

As you can see in the above examples, the syntax for defining view constraints
using the CREATE VIEW statement is quite similar to defining constraints on tables
using the CREATE TABLE ... AS SELECT statement. However, unlike CREATE
TABLE AS SELECT, which does not allow Referential Integrity constraints, the
CREATE VIEW statement supports Referential Integrity constraint definitions.
In the above examples, the alias id specifies an employee id, and a primary key
constraint is defined on that view column.
Note:
• In the first example above, Oracle automatically generates a constraint
name, while in the second example the user explicitly specifies a name for the
constraint.
• As for a table, the constraint can be defined at the column level or at the
view level.
• As the above examples show, for each view constraint, you MUST specify
RELY|NORELY DISABLE NOVALIDATE. Otherwise, Oracle signals an
error.
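
The examples above only declare primary keys. A Referential Integrity constraint between two views might be declared along the following lines. This is a sketch under assumptions: the views dept_v and emp_v, the constraint name fk_emp_v_dept and the DEPARTMENTS table are hypothetical, and only the EMPLOYEES table comes from the examples above.

```sql
-- Dimension-style view with a declarative primary key.
CREATE VIEW dept_v (
  dept_id PRIMARY KEY RELY DISABLE NOVALIDATE,
  dept_name )
AS
SELECT department_id, department_name
FROM   departments;

-- Fact-style view whose dept_id column references the key of dept_v.
CREATE VIEW emp_v (
  emp_id PRIMARY KEY RELY DISABLE NOVALIDATE,
  last_name,
  dept_id,
  CONSTRAINT fk_emp_v_dept
    FOREIGN KEY (dept_id) REFERENCES dept_v (dept_id)
    RELY DISABLE NOVALIDATE )
AS
SELECT employee_id, last_name, department_id
FROM   employees;
```

As with the primary keys, the foreign key is declarative only. It documents the relationship for tools and for query rewrite, but does not check any rows.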

Modifying View Constraints Examples
ALTER VIEW DEPT10_EMPLOYEES
ADD CONSTRAINT u_d10e
UNIQUE (email) RELY DISABLE NOVALIDATE;

ALTER VIEW DEPT30_EMPLOYEES
DROP CONSTRAINT pk_d30e;

ALTER VIEW DEPT10_EMPLOYEES
DROP PRIMARY KEY;

ALTER VIEW DEPT10_EMPLOYEES
MODIFY CONSTRAINT u_d10e NORELY;
Modifying View Constraints Examples

Note:
• One cannot drop a UNIQUE or PRIMARY KEY constraint if it is part of a
referential integrity constraint on views, even if the Referential Integrity
constraint is in NORELY DISABLE NOVALIDATE mode.
• One cannot change the state of a UNIQUE or PRIMARY KEY constraint on
a view from RELY to NORELY if it is part of a RELY referential integrity
constraint without also dropping the foreign key or changing its state from
RELY to NORELY.

Dropping Constrained Views Examples
• Because a table or view can reference a view, one may
have to specify the new CASCADE CONSTRAINTS
clause for the DROP VIEW statement
CREATE VIEW DEPT20_EMPLOYEES (
  id PRIMARY KEY RELY DISABLE NOVALIDATE,
  first_name, last_name, email)
AS
SELECT employee_id, first_name, last_name, email
FROM EMPLOYEES
WHERE department_id = 20;

CREATE TABLE TEST ( seq number, test char(50),
  id number REFERENCES DEPT20_EMPLOYEES
  RELY DISABLE NOVALIDATE);

DROP VIEW DEPT20_EMPLOYEES CASCADE CONSTRAINTS;
Dropping Constrained Views Examples

Since a view constraint can be referenced by another table or view, Oracle9i
provides the CASCADE CONSTRAINTS option for the DROP VIEW statement.
The CASCADE CONSTRAINTS option drops all referential integrity constraints
that refer to primary and unique keys in the dropped view. If you omit this clause,
and such referential integrity constraints exist, Oracle returns an error and does not
drop the view.
Note: In the above example, when the user defines table TEST, the reference
MUST be declared RELY|NORELY DISABLE NOVALIDATE. Otherwise,
Oracle generates an error.

View Constraints Restrictions
• Only possible for
– Primary Key
– Unique
– Referential Integrity
• Default values are not allowed
• Cannot be validated nor enforced
• Cannot be deferred
• Cannot use the USING INDEX clause
• ON DELETE actions not supported for Referential
Integrity constraints
• Cannot use EXCEPTIONS INTO clause
View Constraints Restrictions

View constraint declarations are similar to the table constraint clause, but have
the following restrictions:
• Only unique, primary key, and referential integrity constraints are supported
on views. Check constraints are not supported. Note that views have a WITH
CHECK OPTION clause that has the same semantics as check constraints. Note also
that default values cannot be specified.
• Constraints on views are supported in DISABLE NOVALIDATE mode
only (that is, the constraints are declarative only; they are never validated or
enforced).
• Deferrability is not supported, as constraints on views are disabled
and not validated.
• As constraints on views are disabled and not validated, no
physical_attributes_clause is supported in a constraint definition. In particular, the
USING INDEX clause cannot be used.
• ON DELETE actions are not supported for Referential Integrity constraints,
as the constraints are declarative only.
• The EXCEPTIONS INTO clause is not supported, as constraints are not
checked.
Notes:
• The user can name the constraint with the CONSTRAINT keyword. If the name
is omitted, Oracle automatically generates a name for the constraint
of the form SYS_Cn.
• The ALTER VIEW statement checks whether the view is valid before processing the
constraint definition.
Outline Editing Overview
• Stored outlines were introduced in Oracle8i as a way to
preserve execution plan stability across
– Different releases of the database
– Different operating environments
• Stored outline editing is new in Oracle9i. It enables
users and third-party vendors to tune execution plans
without having to change the application
• This is possible by editing the content of the saved
plan
• The idea is to clone the outline in a staging area where
the outline can be safely edited without impacting the
user community
• Once satisfied by the result, the editor can publicize
the result to the user community
Outline Editing Overview

Stored outlines were introduced in Oracle8i as a way to preserve execution plan stability across
releases of the database and across different operating environments. In Oracle9i, Outline Editing
extends the usefulness of stored outlines by introducing a user-editing interface, which enables users
and third-party vendors to tune their execution plans by editing the stored outlines used to influence
the optimizer. This might be useful when using the same application in different environments (small
vs big databases).
This feature will benefit both application developers and customer support personnel. While the
optimizer usually chooses optimal plans for queries, there are times when the user knows something
about the execution environment that is inconsistent with the heuristics the optimizer follows.
Sometimes an execution plan is acceptable in one environment but not in another. By editing the
outline directly, the user can tune the query without having to change the application.
The application developer might generate outlines in a staging area and notice that some plan did not
take advantage of an index that could improve performance. It might be easier to simply edit the
outline to use the index rather than searching through the application code and tuning the SQL till it
eventually yields the desired result.
Customer support might be at a customer site investigating a problem query. By creating an outline
and then editing it, it is likely that with some simple edit such as changing join order the problem
could be solved quickly. In this way, the customer’s problem is solved immediately without having
to go through the process of debugging and updating the application itself.
For the customer whose environment has unique characteristics that might cause an outline to yield a
less than optimal execution plan, the ability to make minor adjustments to the outline enhances the
ability to support specific customer needs. In this sense, stored outlines are made more adaptive as
users can make finely tuned adjustments to the saved plan.
Outline Editing Overview (Continued)
Stored outline metadata resides in the OUTLN schema and is maintained by the
server. Users have been advised not to update these tables directly, in the same
way they are advised not to update system tables. Therefore, Oracle must provide
users with a way to safely edit an outline without compromising the integrity of the
outline for the rest of the user community.
To accomplish this, Oracle is proposing that the outline be cloned into the user’s
schema at the onset of the outline editing session. All subsequent editing
operations will be performed on that clone until the user is satisfied with his edits
and chooses to publicize them. In this way, any editing done by the user does not
impact the rest of the user community which would continue to use the public
version of the outline until the edits are explicitly saved.

Editable Attributes
• Join Order
• Join Methods
• Access Methods
• Distribution Methods
• Query Rewrite
• View/Subquery Merging
Editable Attributes
• Join Order: Join order defines the sequence in which tables are joined during query execution.
This includes tables produced by evaluating subqueries and views as well as tables appearing in the
FROM clauses of subqueries and views.
• Join Methods: Join methods define the methods used to join tables during query execution.
Examples are nested loops join and sort-merge join.
• Access Methods: Access methods define the methods used to retrieve table data from the
database. Examples are indexed access and full table scan.
• Distributed Execution Plans: Distributed queries have execution plans that are generated for
each site at which some portion of the query is executed. The execution plan for the local site at
which the query is submitted can be controlled by plan stability and equivalent plans must be
produced at that site. In addition, driving site selection can be controlled centrally even though it
might normally change when certain schema changes are made.
• Distribution Methods: For parallel query execution, distribution methods define how the inputs
to execution nodes are partitioned.
• View and Subquery Merging and Summary Rewrite: The topic of View and Subquery
Merging and Summary Rewrite is meant to include all transformations in which objects and/or
operations that occur in one subquery of the original SQL statement are caused to migrate to a
different subquery for execution. Summary rewrite may also cause one set of objects and/or
operations to be replaced by another.
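
Since outlines record plan choices as optimizer hints, each editable attribute corresponds roughly to hints that could also be written in a statement. The query below is only a sketch: the tables emp and dept, the index emp_dept_ix and the particular hints chosen are assumptions for illustration, and an actual outline may record its hints differently.

```sql
-- Hints used below and the editable attribute each one illustrates:
--   ORDERED              join order   (join in FROM-clause order)
--   USE_NL(e)            join method  (nested loops when joining e)
--   FULL(d)              access method (full table scan of d)
--   INDEX(e emp_dept_ix) access method (index access on e)
-- Related hint families, not used here: PQ_DISTRIBUTE (distribution
-- methods), MERGE / NO_MERGE (view merging), REWRITE / NOREWRITE
-- (query rewrite).
SELECT /*+ ORDERED USE_NL(e) FULL(d) INDEX(e emp_dept_ix) */
       d.dname, COUNT(*)
FROM   dept d, emp e
WHERE  e.deptno = d.deptno
GROUP  BY d.dname;
```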

Outline Cloning
• Public Outlines
– Default Setting when Creating Outlines
– Stored in the OUTLN Schema
– Used when USE_STORED_OUTLINES is set to
TRUE
• Private Outlines
– Stored in User's Schema
– Can be Edited
– Used when USE_PRIVATE_OUTLINES is set to
TRUE
– Changes can be saved to Public
Outline Cloning
• Public Outlines: In Oracle8i, all outlines are public objects indirectly available to all users on the
system for whom the USE_STORED_OUTLINES configuration parameter setting applies. Outline
data resides in the OUTLN schema that can be thought of as an extension to the SYS schema in the
sense that it is maintained and consumed by the system only. Users have been discouraged from
manipulating this data directly to avoid security and integrity issues associated with outline data.
Outlines will continue to be public by default and only public outlines are generally available to the
user community.
• Private Outlines: In Oracle9i, the notion of a private outline to aid in outline editing is
introduced. A private outline is an outline seen only in the current session and whose data resides in
the current parsing schema. By storing the outline data for a private outline directly in the user’s
schema, users are given the opportunity to directly manipulate the outline data through DML in
whatever way they choose. Any changes made to such an outline are not seen by any other session
on the system and applying a private outline to the compilation of a statement can only be done in the
current session through a new session parameter. Only when a user explicitly chooses to save edits
back to the public area will the rest of the users see them.
An outline clone is a private outline that has been created by copying data from an existing outline.

Outline: Administration and Security
• Privileges required for using CREATE OUTLINE FROM
– SELECT_CATALOG_ROLE
– CREATE ANY OUTLINE
• DBMS_OUTLN_EDIT.create_edit_tables
– Creates required temporary tables in User's Schema
for Cloning and Editing Outlines (equivalent to
executing utleditol.sql)
– Requires EXECUTE privilege on DBMS_OUTLN_EDIT
• v$sql
– Column OUTLINE_SID added
– Identifies the session id from which the outline was
retrieved
Outline: Administration and Security

• SELECT_CATALOG_ROLE: This role is required for the CREATE OUTLINE FROM command
unless the issuer of the command is also the owner of the outline. Any CREATE OUTLINE
statement requires the CREATE ANY OUTLINE privilege. Specifying the FROM clause
requires the additional SELECT_CATALOG_ROLE role, since such a command exposes SQL text to
different users who might otherwise not be privileged to read the text.
• DBMS_OUTLN_EDIT.create_edit_tables: A supporting procedure that
creates the metadata tables in the invoker’s schema. This procedure is callable by anyone with
EXECUTE privilege on DBMS_OUTLN_EDIT. Refer to the Supplied PL/SQL Packages Reference
Release 9.0.0 – BETA December 2000 Part No. A86815-01 for more information. Note also that the
package DBMS_OUTLN is synonymous with OUTLN_PKG. Oracle chose to use session
temporary tables to meet the requirement that private outlines be private to the session editing them.
To accommodate the possibility of multiple users editing the same outline at the same time (not good
practice, but possible), the data had to be partitioned by session, which temporary tables do. This is
scratch data not intended for long-term use, so users do not need to worry about
cleaning up when they are done: the temporary table data goes away with their session without
their having to do anything.
• v$sql extension: A column is added to the V$SQL fixed view to help users distinguish whether
a shared cursor was compiled while using a private outline or a public outline. outline_sid is the
name of this new column and it identifies the session id from which the outline was retrieved. The
default is 0, which implies a lookup in the OUTLN schema.
• utleditol.sql: This script is provided as a harness for outline editing, containing the
commands that create the prerequisite outline tables and indexes in the target user schema. You
should not run it directly; use the supported procedure above instead.
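
The administration steps above might be carried out as in the following sketch. The account name SCOTT is a placeholder, the grants are assumed to be issued by a suitably privileged user, and the explicit grant on DBMS_OUTLN_EDIT may be unnecessary if the package is already executable by the account.

```sql
-- Privileges needed to clone outlines with CREATE OUTLINE ... FROM.
GRANT CREATE ANY OUTLINE TO scott;
GRANT SELECT_CATALOG_ROLE TO scott;

-- Needed to call DBMS_OUTLN_EDIT.CREATE_EDIT_TABLES.
GRANT EXECUTE ON SYS.DBMS_OUTLN_EDIT TO scott;

-- Show cached cursors that were compiled with an outline, and the
-- session the outline came from (0 is described above as the public area).
SELECT sql_text, outline_category, outline_sid
FROM   v$sql
WHERE  outline_category IS NOT NULL;
```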
Configuration Parameters
• USE_PRIVATE_OUTLINES is a session parameter
that controls the use of private outlines instead of
public outlines
ALTER SESSION SET USE_PRIVATE_OUTLINES =
{TRUE | FALSE | category_name}
• Where:
– TRUE enables use of private outlines in the
DEFAULT category
– FALSE disables use of private outlines
– category_name enables use of private
outlines in the named category
Configuration Parameters
The USE_PRIVATE_OUTLINES session parameter is added to control the use of private outlines
instead of public outlines. When an outlined SQL command is issued, this parameter causes
outline retrieval to come from the session private area rather than the public area normally consulted
as per the setting of USE_STORED_OUTLINES. If no outline exists in the session private area, no
outline is used for the compilation of the command.
You can specify a value for this session parameter by using the following syntax:
ALTER SESSION SET USE_PRIVATE_OUTLINES = {TRUE | FALSE |
category_name}
Where:
• TRUE enables use of private outlines and defaults to the DEFAULT category
• FALSE disables use of private outlines
• category_name enables use of private outlines in the named category
When a user begins an outline editing session, the parameter should be set to the category to which
the outline being edited belongs. This enables the feedback mechanism in that it allows the private
outline to be applied to the compilation process.
Upon completion of outline editing, this parameter should be set to false to restore the session to
normal outline lookup as dictated through the USE_STORED_OUTLINES parameter.
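
A typical editing session might bracket its work with the two settings described above. In this sketch the category name cat2 is a placeholder for whatever category the edited outline belongs to.

```sql
-- Start of the editing session: look up outlines in the private area,
-- in the category of the outline being edited (cat2 is hypothetical).
ALTER SESSION SET USE_PRIVATE_OUTLINES = cat2;

-- ... edit the private outline and re-run the outlined statement here ...

-- End of the editing session: return to the normal lookup governed by
-- USE_STORED_OUTLINES.
ALTER SESSION SET USE_PRIVATE_OUTLINES = FALSE;
```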

Create Outline Syntax Changes
• The CREATE OUTLINE command has been
extended to clone outlines:
CREATE [OR REPLACE] [PUBLIC | PRIVATE] OUTLINE [outline_name]
[FROM [PUBLIC | PRIVATE] source_outline_name]
[FOR CATEGORY category_name] [ON statement]
• With the following meaning:
– PUBLIC: The outline is to be created for use by
PUBLIC. This is the default
– PRIVATE: The outline is to be created for
private use by the current session only
– FROM: Provides a way to create an outline by
copying an existing one
– source_outline_name: This is the name of
the outline being cloned
Create Outline Syntax Changes

The new syntax elements are highlighted in blue in the above slide:
• PUBLIC: The outline is to be created for use by PUBLIC. This is the default since outline
creation is intended for system-wide use.
• PRIVATE: The outline is to be created for private use by the current session only and its data is
stored in the current parsing schema. When specified, the prerequisite outline tables and indices must
exist in the local schema.
• FROM: This construct provides a way to create an outline by copying an existing one.
• source_outline_name: This is the name of the outline being cloned. By default, it is found
in the public area but if preceded by the PRIVATE keyword will be found in the local schema.
The addition of the PRIVATE and FROM keywords enables outline cloning. When a user wishes to
edit an outline, the edit is made on a private copy, which is created by specifying the PRIVATE keyword.
In the FROM clause, the source outline to be edited is named and is found in the public area unless
preceded by the PRIVATE keyword, in which case the user would be copying a private version of
the named outline.
When specifying the FROM clause, existing semantics apply to outline name and category, so if
unspecified, an outline name will be generated under the DEFAULT category.
When a PRIVATE outline is being created, if the prerequisite outline tables to hold the outline data
do not exist in the local schema, an error will be returned.
Note: It is also possible to use the PUBLIC and PRIVATE keywords with the ALTER OUTLINE
command.

Outline Cloning Examples
• Clone a public outline to the session private area
CREATE PRIVATE OUTLINE private_outline
FROM public_outline;
• Clone a private copy to another private copy
CREATE PRIVATE OUTLINE private_outline2 FROM
PRIVATE private_outline1 FOR CATEGORY cat2;
• Copy a private outline back to the public area
CREATE OR REPLACE OUTLINE public_outline FROM
PRIVATE private_outline;

• CREATE PRIVATE OUTLINE private_outline FROM public_outline;
This example shows how to clone a public outline to the session private area for editing purposes. In
this case, outline public_outline is cloned into the private area and given the name
private_outline.
• CREATE PRIVATE OUTLINE private_outline2 FROM PRIVATE
private_outline1 FOR CATEGORY cat2;
This example shows how to clone a private copy of an outline to another private copy. In this case,
outline private_outline2 is being created in category cat2 from the outline
private_outline1 which is local to the current session. This might be done if the user wishes
to save away several versions of the outline. Specification of category is necessary here to avoid
namespace collision.
• CREATE OR REPLACE OUTLINE public_outline FROM PRIVATE
private_outline;
This example shows how to copy a private outline back to the public area for general use. This is a
way to save local edits back to the master copy of the outline. This operation is also
called ’publicize’.

• Create a private outline independently
CREATE PRIVATE OUTLINE private_outline ON
select count(*) from employee;
• Copy one public outline to another
CREATE OR REPLACE OUTLINE public_outline2
FROM public_outline1 FOR CATEGORY cat2;
• Replace an outline with itself
CREATE OR REPLACE PRIVATE OUTLINE
private_outline1 FROM PRIVATE
private_outline1;
dbms_outln_edit.refresh_private_outline('private_outline1')
Outline Cloning Examples (Continued)

• CREATE PRIVATE OUTLINE private_outline ON select count(*) from
employee;
This example demonstrates how one might create a private outline independently without copying
from a pre-existing outline. private_outline is created in the session private area. This would
be useful for users wishing to experiment with an outline before creating it for public consumption.
• CREATE OR REPLACE OUTLINE public_outline2 FROM public_outline1
FOR CATEGORY cat2;
This example shows how to copy one public outline to another. Here public outline
public_outline1 is copied to public_outline2 and placed in category cat2.
Specification of category is necessary here to avoid namespace collision.
• CREATE OR REPLACE PRIVATE OUTLINE private_outline1 FROM PRIVATE
private_outline1;
This example shows how to replace an outline with itself. This might seem like a useless operation
but it actually serves the dual purpose of invalidating dependent shared cursors and bringing the
shared pool view of the object in line with what is on disk due to the user’s edits. The feedback
mechanism will rely on this aspect of the command. You can also use
DBMS_OUTLN_EDIT.REFRESH_PRIVATE_OUTLINE or ALTER SYSTEM FLUSH
SHARED_POOL to accomplish this.

Manual Outline Editing Example
1. Connect under a user that has the following privileges:
– CREATE ANY OUTLINE
– SELECT_CATALOG_ROLE
2. Create the outline editing tables in that schema
EXECUTE DBMS_OUTLN_EDIT.CREATE_EDIT_TABLES;
3. Clone the outline for editing
CREATE PRIVATE OUTLINE OL1 FROM OL1;
4. Edit the outline: DMLs against the local ol$hints table
UPDATE OL$HINTS
SET HINT_TEXT = 'INDEX(T1 I1)'
WHERE HINT# = 5;

1. Connect as SCOTT (for example), which must have both CREATE ANY OUTLINE and
SELECT_CATALOG_ROLE in order to edit an outline.
2. Create the outline editing tables in that schema:
EXECUTE DBMS_OUTLN_EDIT.CREATE_EDIT_TABLES;
3. Clone the outline for editing:
CREATE PRIVATE OUTLINE OL1 FROM OL1;
4. Edit the outline by performing DML against the local ol$hints table, for example:
UPDATE OL$HINTS SET HINT_TEXT='INDEX(T1 I1)' WHERE HINT#=5;
To do this correctly, it is useful to understand the structure of the outline tables. Since
manual hint editing is an advanced, potentially dangerous feature, it is highly recommended
to use the graphical outline editor provided.
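
Before changing a hint in step 4, it can help to look at what the cloned outline currently contains. The query below is a small sketch that assumes the private OL$HINTS table created by CREATE_EDIT_TABLES exposes the OL_NAME, HINT# and HINT_TEXT columns and that the clone is named OL1 as in the steps above.

```sql
-- List the hints stored for the private clone OL1, in hint order,
-- so the HINT# of the hint to edit can be identified.
SELECT hint#, hint_text
FROM   ol$hints
WHERE  ol_name = 'OL1'
ORDER  BY hint#;
```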

5. Prepare to test the edits
CREATE OR REPLACE PRIVATE OUTLINE OL1
FROM PRIVATE OL1;
ALTER SESSION SET
USE_PRIVATE_OUTLINES=true;
6. Execute the outlined query
7. Publicize the edits
CREATE OR REPLACE OUTLINE OL1 FROM
PRIVATE OL1;
Manual Outline Editing Example (Continued)

5. Prepare to test the edits:
CREATE OR REPLACE PRIVATE OUTLINE OL1 FROM PRIVATE OL1;
This statement is needed first to re-read the edited version of the outline on disk and to invalidate all
corresponding cursors so that they are recompiled.
ALTER SESSION SET USE_PRIVATE_OUTLINES=true;
6. Execute the outlined query.
7. Save, or publicize, the edits:
CREATE OR REPLACE OUTLINE OL1 FROM PRIVATE OL1;
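
Put together, the seven steps could look like the following SQL*Plus script. It is a sketch under the same assumptions as the steps: a public outline OL1 in the DEFAULT category, a table T1 with an index I1, and a session that holds the required privileges. The extra OL_NAME filter on the UPDATE and the final ALTER SESSION are additions for safety, not part of the original steps.

```sql
-- Step 2: create the private outline tables in the current schema.
EXECUTE DBMS_OUTLN_EDIT.CREATE_EDIT_TABLES;

-- Step 3: clone the public outline OL1 into the private area.
CREATE PRIVATE OUTLINE OL1 FROM OL1;

-- Step 4: edit the clone (restricting the update to OL1 is an added precaution).
UPDATE OL$HINTS
SET    HINT_TEXT = 'INDEX(T1 I1)'
WHERE  HINT# = 5
AND    OL_NAME = 'OL1';

-- Step 5: re-read the edited outline, invalidate dependent cursors,
-- and switch the session to private outlines.
CREATE OR REPLACE PRIVATE OUTLINE OL1 FROM PRIVATE OL1;
ALTER SESSION SET USE_PRIVATE_OUTLINES = TRUE;

-- Step 6: run the statement that OL1 was created for and check its plan.

-- Step 7: publicize the edits.
CREATE OR REPLACE OUTLINE OL1 FROM PRIVATE OL1;

-- Return the session to normal outline lookup.
ALTER SESSION SET USE_PRIVATE_OUTLINES = FALSE;
```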

Outline Editor
Outline Editor
Although outlines are currently shipping (introduced in Oracle8i) and users can create, delete and
copy them manually using SQL (and can potentially create a great number of them by automatically
generating them using “ALTER SESSION SET CREATE_STORED_OUTLINES = TRUE”), there
is currently no convenient way to manage outlines.
Oracle9i gives users the ability to manage their outlines directly using new outline management GUI
functionality, the “Outlines Window”, to browse, sort, delete, export and import them. Users are also
able to tell if an outline, once defined or imported, has been used, or continues to be used after a
database or application upgrade, and are able to change the category of a group of outlines.
The Outlines Window is accessible via SQL Analyze (the primary interface for dealing with
outlines) and the console. Because outlines are primarily used for tuning SQL statements, users
create (other than through SQL), edit, analyze and compare them using the Outline Editor
functionality of SQL Analyze.

Summary
• Use List Partitioning
• Create constraints on views
• Edit Outlines

Extraction, Transformation,
And Loading (ETL)

Objectives
• Understand Oracle’s core ETL framework inside the
database and its integration advantage
• ETL framework components:
– Understand how to use Change Data Capture
– Read non-Oracle data from external tables
– Use the new table functions
– Insert into multiple tables in one statement
– Use the new Upsert statement
– Use transportable tablespace between databases of different
block sizes
• Understand parallel insert operations enhancements
• Use the new Bulk Bind features in PL/SQL
• Use some of the new SQL*Loader possibilities

Classical ETL Process
[Figure: classical multi-stage ETL flow. Flat Files, Load Into Staging Tables, Staging Table 1, Validate Data, Staging Table 2, Transform Data, Staging Table 3, Merge Into Warehouse Tables (INSERT & UPDATE), Warehouse Table]
Classical ETL Process

Data transformations are often the most complex and, in terms of processing time,
the most costly part of the Extraction Transformation Loading (ETL) process. They
can run the gamut from simple data conversions to extremely complex data
scrubbing techniques. Many, if not all, data transformations can occur within an
Oracle9i database, although transformations are often implemented outside of the
database (for example, on flat files) as well.
From an architectural perspective, you can transform your data in two ways:
• Multi-Stage Data Transformation
• Pipelined Data Transformation
Multi-Stage Data Transformation
The data transformation logic for most data warehouses consists of multiple steps.
For example, in transforming new records to be inserted into a sales table, there
may be separate logical transformation steps to validate each dimension key. A
graphical way of looking at the transformation logic is presented on the above
slide. When using Oracle8i as a transformation engine, a common strategy is to
implement each different transformation as a separate SQL operation and to create
a separate, temporary staging table to store the incremental results for each step.
This load-then-transform strategy also provides a natural checkpointing scheme to
the entire transformation process, which enables the process to be more easily
monitored and restarted. However, a disadvantage is that the multiple staging steps
increase the space and time required. It may also be possible to combine many
simple logical transformations into a single SQL statement or single PL/SQL
procedure. Doing so may provide better performance than performing each step
independently, but it may also introduce difficulties in modifying, adding,
recovering, or dropping individual transformations.
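
The multi-stage strategy described above can be sketched in SQL as follows. All table and column names (sales_stage1, sales_stage2, sales_stage3, products, sales and their columns) are hypothetical, and sales_stage1 is assumed to have been loaded already from flat files, for example with SQL*Loader.

```sql
-- Stage 2: validate the product dimension key, keeping only rows
-- whose product exists.
CREATE TABLE sales_stage2 AS
SELECT s.*
FROM   sales_stage1 s
WHERE  EXISTS (SELECT 1 FROM products p
               WHERE  p.product_id = s.product_id);

-- Stage 3: transform the validated rows into the warehouse format.
CREATE TABLE sales_stage3 AS
SELECT sale_id,
       product_id,
       TRUNC(sale_date)      AS sale_day,
       quantity * unit_price AS amount
FROM   sales_stage2;

-- Merge, part 1: update warehouse rows that already exist.
UPDATE sales w
SET    w.amount = (SELECT s.amount FROM sales_stage3 s
                   WHERE  s.sale_id = w.sale_id)
WHERE  EXISTS (SELECT 1 FROM sales_stage3 s
               WHERE  s.sale_id = w.sale_id);

-- Merge, part 2: insert rows that are new to the warehouse.
INSERT INTO sales (sale_id, product_id, sale_day, amount)
SELECT s.sale_id, s.product_id, s.sale_day, s.amount
FROM   sales_stage3 s
WHERE  NOT EXISTS (SELECT 1 FROM sales w
                   WHERE  w.sale_id = s.sale_id);
```

Each staging table acts as a checkpoint: if the transform step fails, it can be rerun from sales_stage2 without reloading the flat files, at the cost of the extra space the intermediate tables occupy.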

Pipelined Data Transformation in Oracle9i
[Figure: pipelined ETL flow in Oracle9i. Flat Files, External Table (External Tables), Validate Data and Transform Data (Table Functions), Merge Into Warehouse Tables (UPSERT), Warehouse Table]
Pipelined Data Transformation in Oracle9i

With the introduction of Oracle9i, Oracle’s database capabilities have been
significantly enhanced to address some of the tasks specific to ETL
environments. The ETL process flow can be changed dramatically, and the
database becomes an integral part of the ETL solution. With the
new functionality, some formerly necessary process steps become obsolete,
while others can be remodeled so that the data flow and the data
transformation become more scalable and non-interruptive. The model is no longer
serial transform-then-load (with most of the tasks done outside the
database) or load-then-transform, but rather an enhanced
transform-while-loading.
Oracle9i offers a wide variety of new capabilities to address all the issues and tasks
relevant in an ETL scenario. It is important to understand that the database offers
toolkit functionality rather than trying to address a one-size-fits-all solution. The
underlying database has to enable the most appropriate ETL process flow for a
specific customer need, and not dictate or constrain it from a technical perspective.
The above slide illustrates the new functionality, which is discussed throughout
later slides in this lesson.

Overview of External Tables
• External tables are read-only tables where the data
is stored outside the database in flat files
• The data can be queried like a ‘virtual table’, using
any supported language inside the Database
• No DML is allowed and no indexes can be created
• The metadata for an external table is created using
a CREATE TABLE statement
• Access rights are controlled via SELECT TABLE
and READ DIRECTORY privilege
• An external table describes how the external data
should be presented to the database
Overview of External Tables

External tables are like regular SQL tables, except that the data is read-only and does
not reside in the database; hence the organization is external. The external table can be queried
directly and in parallel using SQL. As a result, the external table acts as a view. The metadata for
the external table is created using the "CREATE TABLE … ORGANIZATION EXTERNAL"
statement.
No DML operations are possible and no indexes can be created on external tables.
The CREATE TABLE … ORGANIZATION EXTERNAL operation involves only the creation of
metadata in the Oracle dictionary, since the external data already exists outside the database. Once
the metadata is created, the external table feature enables the user to easily perform parallel
extraction of data from the specified external sources.

Applications of External Tables
External tables:
• Allow external data to be queried and joined directly
and in parallel without requiring it to be loaded into the
database
• Eliminate the need for staging the data within the
database for ETL in data warehousing applications
• Are useful in environments where an external source
has to be joined with database objects and then
transformed
• Are useful when the external data is large and not
queried frequently
• Complement SQL*Loader functionality
4-6
Applications of External Tables

Oracle9i’s external table feature allows you to use external data as a “virtual table” that can be
queried and joined directly and in parallel, without requiring the external data to be loaded into the
database first.
External tables enable the pipelining of the loading phase with the transformation phase. The
transformation process can be merged with the loading process without any interruption of the data
stream. It is no longer necessary to stage the data inside the database for comparison or
transformation.
The main difference between external tables and regular tables is that externally organized tables are
read-only. No DML operations are possible and no indexes can be created on them. Oracle9i’s
external tables complement the existing SQL*Loader functionality, and are especially useful
for environments where the complete external source has to be joined with existing database objects
and transformed in a complex manner, or where the external data volume is large and used only once.
SQL*Loader, on the other hand, might still be the better choice for loading data where additional
indexing of the staging table is necessary. This is true for operations where the data is used in
independent complex transformations or is only partially used in further processing.

Example of Defining External Tables
CREATE TABLE employees_ext (employee_id NUMBER,
  first_name CHAR(30), last_name CHAR(30), …)
ORGANIZATION EXTERNAL (              -- External Table
  TYPE oracle_loader                 -- Access Driver
  DEFAULT DIRECTORY delta_dir        -- Files Directory
  ACCESS PARAMETERS                  -- Similar to SQL*Loader
  (RECORDS DELIMITED BY NEWLINE
   BADFILE 'bad_emp_ext'
   LOGFILE 'log_emp_ext'
   FIELDS TERMINATED BY ','
   MISSING FIELD VALUES ARE NULL)
  LOCATION ('emp1.txt','emp2.txt'))
PARALLEL 5                           -- Independent from the number of files
REJECT LIMIT UNLIMITED;
4-7
Example of Defining External Tables

An external table can be created with a single CREATE TABLE DDL command. This creates the
meta information that is necessary to access the external data seamlessly from inside the database.
The following information must be provided:
• Columns and data types for access in the database
• Where to find the external data
• Access driver: It is the responsibility of the access driver and the external table layer to perform the
transformations required on the data in the data file so that it matches the external table
definition. There is one access driver for every implementation of an external table type.
• Format of the external data, similar to SQL*Loader
• Degree of parallelism: Note that the degree is not dependent on the number of external data files.
In the example above, an external table named employees_ext is defined. delta_dir is the directory
where the external flat files reside. This example uses the default access driver for this
implementation of external tables, which is called oracle_loader. The access parameters control the
extraction of data from the flat file using record and file formatting information. The directory object
was introduced in Oracle8i.
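
The directory object named in the DEFAULT DIRECTORY clause must exist, and the querying user needs access to it, before employees_ext can be used. A minimal sketch, assuming a hypothetical operating system path /data/delta and a hypothetical user etl_user (neither appears in this lesson):

```sql
-- Hypothetical path and user name; adjust both to the local environment.
CREATE OR REPLACE DIRECTORY delta_dir AS '/data/delta';

-- READ on the directory lets the user reach the flat files through the external table.
GRANT READ ON DIRECTORY delta_dir TO etl_user;

-- SELECT on the external table is granted in the same way as for any other table.
GRANT SELECT ON employees_ext TO etl_user;
```

Depending on the release, the access driver may also need write access to the directory that receives the bad file and the log file.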

Querying External Tables
SELECT * FROM employees_ext;
[Figure: Oracle9i retrieving the query result from the flat files.]
4-8
Querying External Tables

In the above example, when the external table employees_ext is queried, the
data is retrieved from the external data files. When the user selects data from the
external table, the data flows from the external data source to the Oracle SQL
engine, where it is processed. As data is extracted, it is transparently converted
by the external agent(s) from its external representation into an equivalent Oracle
native representation (the loader stream).
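
Because the external table behaves like any other row source, it can be filtered against database tables and loaded in a single statement. The sketch below assumes a hypothetical regular table employees with columns employee_id, first_name and last_name; only employees_ext comes from the example in this lesson.

```sql
-- Load only those external rows that are not yet present in the
-- (hypothetical) EMPLOYEES table; no staging table is involved.
INSERT /*+ APPEND */ INTO employees (employee_id, first_name, last_name)
SELECT x.employee_id,
       RTRIM(x.first_name),   -- CHAR(30) columns are blank-padded
       RTRIM(x.last_name)
FROM   employees_ext x
WHERE  NOT EXISTS (SELECT 1
                   FROM   employees e
                   WHERE  e.employee_id = x.employee_id);

COMMIT;
```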

Data Dictionary Information
for External Tables
• DBA_EXTERNAL_TABLES
– OWNER
– NAME
– TYPE_OWNER
– TYPE_NAME
– DEFAULT_DIRECTORY_OWNER
– DEFAULT_DIRECTORY_NAME
– REJECT_LIMIT
• DBA_EXTERNAL_LOCATIONS
– OWNER
– TABLE_NAME
– LOCATION
– DIRECTORY_OWNER
– DIRECTORY_NAME
4-9
Data Dictionary Information for External Tables

DBA_EXTERNAL_TABLES lists the specific attributes of all the external tables in the system:
OWNER: Owner of the external table
NAME: Name of the external table
TYPE_OWNER: Implementation type owner
TYPE_NAME: Implementation type name
DEFAULT_DIRECTORY_OWNER: Owner of the default directory for this external table
DEFAULT_DIRECTORY_NAME: Name of the default directory for this external table
REJECT_LIMIT: Reject limit
DBA_EXTERNAL_LOCATIONS lists the specific flat files and the corresponding Oracle directories:
OWNER: Owner of the external table
TABLE_NAME: Name of the external table
LOCATION: Flat file name
DIRECTORY_OWNER: Owner of the directory for this external table
DIRECTORY_NAME: Name of the directory for this external table
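
As an illustration, the two views can be queried as follows. This is a sketch that uses only columns listed above and assumes the session has access to the DBA_ views.

```sql
-- Which flat files back each external table, and in which directory?
SELECT l.owner, l.table_name, l.directory_name, l.location
FROM   dba_external_locations l
ORDER  BY l.owner, l.table_name, l.location;

-- Default directory and reject limit for the external tables of each owner.
SELECT t.owner, t.default_directory_name, t.reject_limit
FROM   dba_external_tables t
ORDER  BY t.owner;
```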

Overview of Table Functions
• Oracle9i supports pipelined and parallelizable table
functions
– Output is a set of rows
– Input can be a set of rows
– The output can be pipelined
– Evaluation of the table function can be parallelized
• Table functions are used in the FROM clause of a
SELECT statement
• Table functions can be defined in PL/SQL using a
native PL/SQL interface or in Java or C using the
Oracle Data Cartridge Interface (ODCI)
4-10
Overview of Table Functions

In the ETL process, the data extracted from a source system passes through a
sequence of transformations before it is loaded into a data warehouse. A large class
of user-defined transformations is implemented in a procedural manner, either
outside the database or inside the database in PL/SQL. Oracle9i’s table functions
support pipelined and parallel execution of such transformations
implemented in PL/SQL, C, or Java. These transformations can be performed
without intermediate staging tables, which interrupt the data
flow through the various transformation steps.

Example of Creating Table Functions
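CREATE OR REPLACE FUNCTION transform(p ref_cur_type)
RETURN table_order_items_type
PIPELINED
PARALLEL_ENABLE (PARTITION p BY ANY) IS
  rec order_items_ext%ROWTYPE; -- Assumes p returns rows of order_items_ext
BEGIN
  LOOP
    FETCH p INTO rec; -- A cursor variable is read with FETCH
    EXIT WHEN p%NOTFOUND;
    … -- Transform this record
    PIPE ROW (rec);
  END LOOP;
  RETURN;
END;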
4-11
Example of Creating Table Functions

In the above example, REF_CUR_TYPE and TABLE_ORDER_ITEMS_TYPE are user-defined
types. REF_CUR_TYPE is defined as a ref cursor. TABLE_ORDER_ITEMS_TYPE is an object table.
Pipelined functions must have RETURN statements that do not return any values.
Here, the idea is to transform each record of the cursor passed as the parameter of the
function and return the corresponding row to the caller. Note that one row is returned as soon as one
record is processed.
The new PIPE ROW statement in PL/SQL:
• Returns a single result row
• Suspends the execution of the function
• Lets the function be restarted when the caller requests the next row
PARALLEL_ENABLE is an optimization hint indicating that the function can be executed in parallel.
The function should not use session state, such as package variables, because those variables may not be
shared among the parallel execution servers.
The optional PARTITION argument BY clause (where argument names the REF CURSOR parameter) is used
only with functions that have a REF CURSOR argument type. It lets you define the partitioning of the
inputs to the function from the REF
CURSOR argument. Partitioning the inputs to the function affects the way the query is parallelized
when the function is used as a table function (that is, in the FROM clause of the query). ANY indicates
that the data can be partitioned randomly among the parallel execution servers. Alternatively, you can
specify RANGE or HASH partitioning on a specified column list.
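
The pieces described above can be put together in a self-contained sketch. All names below (order_item_t, order_item_tab, etl_types, transform_items, and the three columns) are hypothetical and chosen only for illustration; the transformation itself is a trivial placeholder.

```sql
-- Element and collection types for the rows the function returns.
CREATE TYPE order_item_t AS OBJECT (
  order_id   NUMBER,
  product_id NUMBER,
  quantity   NUMBER
);
/
CREATE TYPE order_item_tab AS TABLE OF order_item_t;
/
-- A strongly typed REF CURSOR is declared in a package; HASH and RANGE
-- partitioning need to know the columns of the input cursor.
CREATE OR REPLACE PACKAGE etl_types AS
  TYPE order_item_rec IS RECORD (
    order_id   NUMBER,
    product_id NUMBER,
    quantity   NUMBER
  );
  TYPE order_item_cur IS REF CURSOR RETURN order_item_rec;
END etl_types;
/
CREATE OR REPLACE FUNCTION transform_items (p etl_types.order_item_cur)
  RETURN order_item_tab
  PIPELINED
  PARALLEL_ENABLE (PARTITION p BY HASH (product_id))
IS
  rec etl_types.order_item_rec;
BEGIN
  LOOP
    FETCH p INTO rec;
    EXIT WHEN p%NOTFOUND;
    -- Placeholder transformation: treat a missing quantity as zero.
    PIPE ROW (order_item_t(rec.order_id, rec.product_id, NVL(rec.quantity, 0)));
  END LOOP;
  CLOSE p;
  RETURN;   -- pipelined functions return no value
END;
/
```

Assuming an external table order_items_ext with those three columns, the function would be called as SELECT * FROM TABLE(transform_items(CURSOR(SELECT order_id, product_id, quantity FROM order_items_ext))).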

Using Table Functions Examples
SELECT *
FROM TABLE(transform(CURSOR(SELECT *
FROM order_items_ext)));

INSERT /*+ APPEND PARALLEL */
INTO order_items
SELECT *
FROM TABLE(transform(CURSOR(SELECT *
FROM order_items_ext)));
4-12
Using Table Functions Examples

Pipelined functions can be used in the FROM clause of SELECT statements. The result rows
are retrieved by Oracle iteratively from the table function implementation. Multiple
invocations of a table function, either within the same query or in separate queries, result
in multiple executions of the underlying implementation; that is, there is no buffering or reuse
of rows.

Advantages of PL/SQL Table Functions
• Table functions can reduce response time by
‘pipelining’ the results to the consuming process as
soon as they are produced
• Table functions can return multiple rows during each
invocation (pipelining of data)
• The number of invocations is reduced, thereby
improving performance
• Pipelining eliminates the need for buffering the
produced rows
4-13

Overview of Multi-table Insert Statements
• Allows the INSERT … SELECT statement to insert
rows into multiple tables as part of a single DML
statement
• Can be used in data warehousing systems to
transfer data from one or more operational
sources to a set of target tables
• Is used internally for refreshing materialized views
• Allows you to still benefit from:
– Parallelization
– Direct-load mechanism
4-14
Overview of Multi-table Insert Statements

The INSERT … SELECT statement with the new syntax can be parallelized and used with the
direct-load mechanism. The multi-table INSERT statement inserts computed rows derived from the
rows returned from the evaluation of a subquery. There are two forms of the multi-table INSERT
statement: unconditional and conditional. For the unconditional form, an INTO clause list is executed
once for each row returned by the subquery. For the conditional form, INTO clause lists are guarded
by WHEN clauses that determine whether the corresponding INTO clause list is executed.
• An INTO clause list consists of one or more INTO clauses. The execution of an INTO clause list
causes the insertion of one row for each INTO clause in the list.
• An INTO clause specifies the target into which a computed row is inserted. The target specified
may be any table expression that is legal for an INSERT … SELECT statement. However, aliases
cannot be used. The same table may be specified as the target for more than one INTO clause.
• An INTO clause also provides the value of the row to be inserted using a VALUES clause. An
expression used in the VALUES clause can be any legal expression, but may only refer to columns
returned by the select list of the subquery. If the VALUES clause is omitted, the select list of the
subquery provides the values to be inserted. If a column list is given, each column in the list is
assigned a corresponding value from the VALUES clause or the subquery. If no column list is given,
the computed row must provide values for all columns in the target table.
Note: The multi-table INSERT statement can be performed with or without direct-load, and with or
without parallelization for faster performance. In general, the rules are the same as for the single-table
INSERT statement. The idea is that either all of the target tables are direct-loaded/parallelized,
or none of them are. That is why all you need to specify is the APPEND/PARALLEL hint, without
specifying table names.
Advantages of Multi-table INSERTs
• Eliminates the need for multiple INSERT … SELECT
statements to populate multiple tables
• Eliminates the need for a procedure to do multiple
INSERTs using IF … THEN syntax
• Significant performance improvement over the above
two methods due to the elimination of the cost of
materialization and repeated scans on the source
data
4-15

Types of Multi-table INSERT Statements
• Unconditional INSERT
• Pivoting INSERT
• Conditional ALL INSERT
• Conditional FIRST INSERT
4-16
Note: This feature is an Oracle extension to SQL and is not part of the SQL:1999 standard.

Example of Unconditional INSERT
INSERT ALL
INTO product_activity VALUES (today, product_id,
quantity)
INTO product_sales VALUES (today, product_id, total)
SELECT TRUNC(order_date) today, product_id,
SUM(unit_price) total, SUM(quantity) quantity
FROM orders, order_items
WHERE orders.order_id = order_items.order_id AND
order_date = TRUNC(SYSDATE)
GROUP BY TRUNC(order_date), product_id;
4-17
Example of Unconditional INSERT

ALL into_clause:
Specify ALL followed by multiple insert_into_clauses to perform an unconditional multitable
insert. Oracle executes each insert_into_clause once for each row returned by the subquery.

Example of Pivoting INSERT
INSERT ALL
INTO sales VALUES (product_id, week, sales_sun)
INTO sales VALUES (product_id, week, sales_mon)
INTO sales VALUES (product_id, week, sales_tue)
INTO sales VALUES (product_id, week, sales_wed)
INTO sales VALUES (product_id, week, sales_thu)
INTO sales VALUES (product_id, week, sales_fri)
INTO sales VALUES (product_id, week, sales_sat)
SELECT product_id, week_id week,
sales_sun, sales_mon, sales_tue, sales_wed,
sales_thu, sales_fri, sales_sat
FROM sales_source_data;
4-18
Example of Pivoting INSERT

The above slide is an example of inserting into the same table several times, pivoting the data from a
non-normalized form to a normalized form.

Syntax for Conditional ALL INSERT
INSERT ALL
WHEN product_id IN (SELECT product_id
FROM promotional_items) THEN
INTO promotional_sales VALUES (product_id, list_price)
WHEN order_mode = 'online' THEN
INTO web_orders VALUES (product_id, order_total)
SELECT product_id, list_price, order_total, order_mode
FROM orders;
4-19
Syntax for Conditional ALL INSERT

The above example inserts a row into the PROMOTIONAL_SALES table for products sold that are
on the promotional list, and into the WEB_ORDERS table for products for which online orders were
used. It is possible that two rows are inserted for some item lines, and none for others.

Syntax for Conditional FIRST INSERT
INSERT FIRST
WHEN order_total > 10000 THEN
INTO priority_handling VALUES (id)
WHEN order_total > 5000 THEN
INTO special_handling VALUES (id)
WHEN order_total > 3000 THEN
INTO privilege_handling VALUES (id)
ELSE
INTO regular_handling VALUES (id)
SELECT order_total, order_id id
FROM orders;
4-20
Syntax for Conditional FIRST INSERT

The above statement inserts each order ID into the appropriate handling table according to the order total.
If you specify FIRST, as in the above example, Oracle evaluates each WHEN clause in the order in
which it appears in the statement. For the first WHEN clause that evaluates to true, Oracle executes the
corresponding INTO clause and skips subsequent WHEN clauses for the given row.
For a given row, if no WHEN clause evaluates to true:
• If you have specified an ELSE clause, Oracle executes the INTO clause list associated with the
ELSE clause.
• If you did not specify an ELSE clause, Oracle takes no action for that row.
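
The routing rules above can be tried on a small data set. The following sketch uses hypothetical demo tables (the _demo suffix marks names that are not part of this lesson) and three sample orders.

```sql
CREATE TABLE orders_demo            (order_id NUMBER, order_total NUMBER);
CREATE TABLE priority_handling_demo (id NUMBER);
CREATE TABLE special_handling_demo  (id NUMBER);
CREATE TABLE regular_handling_demo  (id NUMBER);

INSERT INTO orders_demo VALUES (1, 12000);
INSERT INTO orders_demo VALUES (2,  7000);
INSERT INTO orders_demo VALUES (3,   500);

INSERT FIRST
  WHEN order_total > 10000 THEN INTO priority_handling_demo VALUES (id)
  WHEN order_total >  5000 THEN INTO special_handling_demo  VALUES (id)
  ELSE                          INTO regular_handling_demo  VALUES (id)
SELECT order_id id, order_total
FROM   orders_demo;

-- Order 1 goes to priority_handling_demo only: the second WHEN clause is
-- skipped once the first one has matched.
-- Order 2 goes to special_handling_demo, order 3 to regular_handling_demo.
```

With INSERT ALL in place of INSERT FIRST, order 1 would be inserted into both priority_handling_demo and special_handling_demo, because every WHEN clause is evaluated for every row.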

Overview of MERGE statements
• MERGE statements
– Provide the ability to conditionally UPDATE/INSERT
into the database
– Do an UPDATE if the row exists and an INSERT if it is
a new row
– Avoid multiple updates
– Can be used in Data Warehousing applications
4-21

Applications of MERGE statements
• MERGE statements use a single SQL statement to
complete an UPDATE or INSERT or both
• The statement can be parallelized transparently
• Bulk DML can be used
• Performance is improved because fewer
statements require fewer scans of the source
tables
4-22

Example of Using the MERGE Statement
in Data Warehousing
MERGE INTO customer C
USING cust_src S
ON (c.customer_id = s.src_customer_id)
WHEN MATCHED THEN
UPDATE SET c.cust_address = s.cust_address
WHEN NOT MATCHED THEN
INSERT (customer_id, cust_first_name, …)
VALUES (src_customer_id, src_first_name, …);
4-23
Example of MERGE
This is an example of using MERGE in data warehousing. customer (C) is a large fact table and
cust_src is a smaller "delta" table with rows that need to be inserted into customer
conditionally. This MERGE statement indicates that table customer has to be merged with the rows
returned from the evaluation of the USING clause of the MERGE. The USING clause in this case is the
table cust_src (S), but it can be an arbitrary query. Each row from S is checked for a match with
any row in C by evaluating the join condition specified by the ON clause. If a match is found, the
matching row in C is updated using the UPDATE SET clause of the MERGE statement. If no such row
exists in C, then the row is inserted into table C using the INSERT clause under WHEN NOT MATCHED.
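
A complete, runnable variant of the statement may help to see both branches at work. The tables customer_demo and cust_src_demo and their sample rows are hypothetical and reduced to three columns.

```sql
CREATE TABLE customer_demo (
  customer_id     NUMBER PRIMARY KEY,
  cust_first_name VARCHAR2(30),
  cust_address    VARCHAR2(100)
);
CREATE TABLE cust_src_demo (
  src_customer_id NUMBER,
  src_first_name  VARCHAR2(30),
  cust_address    VARCHAR2(100)
);

INSERT INTO customer_demo VALUES (1, 'Ann', 'Old Street 1');
INSERT INTO cust_src_demo VALUES (1, 'Ann', 'New Street 9');  -- existing customer, new address
INSERT INTO cust_src_demo VALUES (2, 'Bob', 'Main Road 5');   -- customer not yet in the target

MERGE INTO customer_demo c
USING cust_src_demo s
ON (c.customer_id = s.src_customer_id)
WHEN MATCHED THEN
  UPDATE SET c.cust_address = s.cust_address
WHEN NOT MATCHED THEN
  INSERT (customer_id, cust_first_name, cust_address)
  VALUES (s.src_customer_id, s.src_first_name, s.cust_address);

-- customer_demo now holds customer 1 with address 'New Street 9'
-- and a newly inserted row for customer 2.
```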

Oracle Change Data Capture Overview
• Extraction of data needs to be efficient in identifying
recent changes
• Oracle Change Data Capture facilitates incremental
data extraction:
– Captures DML made to user tables
– Changes are stored in database tables called
change tables
– Change data is available to users through views
– Propagation uses a publish and subscribe mechanism
– Synchronous: Uses internal triggers in the same
database. This gives the user a synchronous view
of changes
– DBAs must ensure Java is enabled in the database
4-24
Oracle Change Data Capture Overview

An important consideration for extraction is incremental extraction, also called
change data capture. If a data warehouse extracts data from an operational system
on a nightly basis, then the data warehouse requires only the data that has changed
since the last extraction (that is, the data that has been modified in the past 24
hours).
When it is possible to efficiently identify and extract only the most recently
changed data, the extraction process can be much more efficient, because it must
extract a much smaller volume of data. Unfortunately, for many source systems,
identifying the recently modified data may be difficult or intrusive to the operation
of the system. Change data capture is typically the most challenging technical issue
in data extraction.
Oracle Change Data Capture is a new server feature introduced in Oracle9i that
quickly identifies and processes only the data that has changed, not entire tables,
and makes the change data available for further use. It captures all the INSERTs,
UPDATEs, and DELETEs made to user tables. These changes are stored in a new
database object called a change table, and the change data is made available to
applications in a controlled way using views.
Oracle Change Data Capture (CDC) can be used in synchronous mode only:
• Synchronous: Change Data Capture supports a synchronous mode of data
capture, which employs internal database triggers. In this mode, changes are
stored in the same database as the one in which they are tracked.
This has a notable impact on performance, but has the advantage of giving the
user a synchronous view of changes.
Note: Although the user interface is via PL/SQL packages, Oracle Change Data
Capture is implemented in Java and C. DBAs need to have the Java runtime
installed and enabled on the Oracle9i RDBMS server.
Publish and Subscribe Model
• Publisher:
– Determines and advances the change sets
– Uses the Oracle supplied package
DBMS_LOGMNR_CDC_PUBLISH
– Publishes the change data
– Allows controlled access to subscribers
• Subscriber:
– Uses the Oracle supplied package
DBMS_LOGMNR_CDC_SUBSCRIBE
– Extends the window and creates the change view
– Prepares the subscriber views
– Views data stored in change tables
– Purges the subscriber view
– Removes the subscriber views
4-25
Publish and Subscribe Model

Most Change Data Capture systems have one publisher that captures and publishes
change data for any number of Oracle source tables. There can be multiple
subscribers accessing the change data. Change Data Capture provides PL/SQL
packages to accomplish the publish and subscribe tasks.
Publisher
The publisher (who is usually a DBA) determines which tables the warehouse
application is interested in capturing changes from. These tables are referred to as
source tables. For each source table from the OLTP system that must be captured,
the publisher creates a corresponding change table on the staging system. Change
tables are organized into change sets, which keep track of the changes to a number
of tables in a transactionally consistent manner. Oracle CDC ensures that none of the
updates are missed or double counted.
The publisher performs these tasks:
• Determines the source tables from which the data warehouse application is
interested in capturing change data. Uses the Oracle supplied package,
DBMS_LOGMNR_CDC_PUBLISH, to set up the system to capture data from
one or more relational tables (called source tables).
• Publishes the change data in the form of change tables.
• Allows controlled access to subscribers by using the SQL GRANT and
REVOKE statements to grant and revoke the SELECT privilege on change
tables for users and roles.

Publish and Subscribe Model (Continued)
Subscribers
Once the change set has been created, client applications can subscribe to the
change set. The subscribers are consumers of the published change data. Each
subscriber has its own change view on the change data, so that multiple clients can
simultaneously subscribe to the same change set without interfering with one
another.
For example, if the change set contains all the changes that occurred between
Monday and Friday, application A can be processing data from Tuesday, client B
can be looking at the data from Wednesday and Thursday, and so on. Each client has
its own subscription window that contains a block of transactions in the order in
which they committed.
Oracle CDC manages the subscription window on behalf of each subscriber by
creating a database view that returns the range of transactions of interest to that
subscriber. The subscriber accesses the change data by performing a SELECT on
the change view that was generated by CDC. As each client finishes processing
the data in its subscription window, it calls a procedure to purge the contents of the
window. When it wants to read additional change data, it calls a procedure to
extend the window, and CDC creates a new subscriber view.
Each subscriber walks through the data at its own pace, while Oracle CDC takes
care of storage management. Data that is no longer in use by any subscriber is
automatically purged by CDC. This is necessary to prevent the change set from
growing indefinitely.
Subscribers perform the following tasks:
• Use the Oracle supplied package, DBMS_LOGMNR_CDC_SUBSCRIBE, to
subscribe to source tables for controlled access to the published change data for
analysis.
• Extend the window and create a new view when the subscriber is ready to
receive a set of change data.
• Prepare the subscriber views.
• View data stored in change tables through subscriber views by using
SELECT statements to retrieve change data.
• Purge the subscriber view when it is finished processing a block of changes,
effectively making the subscriber view empty.
• Remove the subscriber views
Note: The following slides show examples of how to use CDC.
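
The subscriber tasks listed above might look as follows in SQL*Plus. This is only a sketch: the procedure and parameter names of DBMS_LOGMNR_CDC_SUBSCRIBE are written as they are understood to be in Oracle9i and should be verified against the package specification of the installed release. The schema SCOTT, the table EMP and its column list are hypothetical, and a publisher is assumed to have already created a change table for that source table.

```sql
-- SQL*Plus bind variables for the subscription handle and the generated view name.
VARIABLE subhandle NUMBER
VARIABLE viewname  VARCHAR2(40)

BEGIN
  -- Obtain a handle on the predefined synchronous change set.
  DBMS_LOGMNR_CDC_SUBSCRIBE.GET_SUBSCRIPTION_HANDLE(
    change_set          => 'SYNC_SET',
    description         => 'Changes to EMP',
    subscription_handle => :subhandle);

  -- Subscribe to the published columns of interest, then activate.
  DBMS_LOGMNR_CDC_SUBSCRIBE.SUBSCRIBE(
    subscription_handle => :subhandle,
    source_schema       => 'SCOTT',
    source_table        => 'EMP',
    column_list         => 'EMPNO, ENAME');
  DBMS_LOGMNR_CDC_SUBSCRIBE.ACTIVATE_SUBSCRIPTION(
    subscription_handle => :subhandle);

  -- Extend the window to include new change data and prepare a view over it.
  DBMS_LOGMNR_CDC_SUBSCRIBE.EXTEND_WINDOW(
    subscription_handle => :subhandle);
  DBMS_LOGMNR_CDC_SUBSCRIBE.PREPARE_SUBSCRIBER_VIEW(
    subscription_handle => :subhandle,
    source_schema       => 'SCOTT',
    source_table        => 'EMP',
    view_name           => :viewname);
END;
/
-- The change data is now read with a SELECT on the view named here.
PRINT viewname

BEGIN
  -- When processing is finished, drop the view and purge the window.
  DBMS_LOGMNR_CDC_SUBSCRIBE.DROP_SUBSCRIBER_VIEW(
    subscription_handle => :subhandle,
    source_schema       => 'SCOTT',
    source_table        => 'EMP');
  DBMS_LOGMNR_CDC_SUBSCRIBE.PURGE_WINDOW(
    subscription_handle => :subhandle);
END;
/
```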

Components and Terminology for
Synchronous Change Data Capture
[Figure: POST-DML triggers on Source Tables 1 to 4 populate Change Tables 1 to 4, which belong to the change set SYNC_SET of the change source SYNC_SOURCE. Two subscribers each read a subset of the change table columns through Subscriber View 1 and Subscriber View 2.]
4-27
Components and Terminology for Synchronous Change Data Capture

The following sections describe Change Data Capture components in more detail:
Source System
A source system is a production database that contains source tables for which
Change Data Capture will capture changes.
Source Table
A source table is a database table that resides on the source system that contains the
data you want to capture. Changes made to the source table are immediately
reflected in the change table.
Change Source
A change source represents a source system. There is only one system-generated
change source named SYNC_SOURCE.
Change Set
A change set represents the collection of change tables. There is only one system-
generated change set named SYNC_SET. Change tables are contained in the
predefined SYNC_SET change set.
Change Table
A change table contains the change data resulting from DML statements made to a
single source table. A change table consists of two things: the change data itself,
which is stored in a database table, and the system metadata necessary to maintain
the change table. A given change table can capture changes only from one source
table. In addition to published columns, the change table contains control columns
that are managed by Change Data Capture.
Components and Terminology for Synchronous Change Data Capture (Continued)
Subscriber View
A subscriber view is a view created by Change Data Capture that returns all of the
rows in the subscription window. In the above example, the subscribers have
created two views: one on columns 1, 2, and 3 of Source Table 1 and one on
columns 4, 7, and 8 of Source Table 4. The columns included in the view are based
on the actual columns that the subscribers subscribed to in the source table.
Subscriber Window
A subscriber window defines the time range of change rows that the subscriber can
currently see. The oldest row in the window is the low watermark; the newest row
in the window is the high watermark. Each subscriber has a subscription window.
Note: With synchronous data capture, internal triggers are used to generate the
change data as data manipulation language (DML) operations are made to the
database on the source system. Every time a DML operation occurs on a source
table, the internal trigger is executed and writes a record of that operation to the
change table.

Data Dictionary Views Supporting CDC
• CHANGE_SOURCES lists existing change sources
• CHANGE_SETS lists existing change sets
• CHANGE_TABLES lists existing change tables
• DBA_SOURCE_TABLES lists published source tables
• DBA_PUBLISHED_COLUMNS lists published source table
columns
• DBA_SUBSCRIPTIONS lists all registered subscriptions
• DBA_SUBSCRIBED_TABLES lists published tables to which
subscribers have subscribed
• DBA_SUBSCRIBED_COLUMNS lists the columns of
published tables to which subscribers have subscribed
4-29
Data Dictionary Views Supporting CDC

Note: For most of these views, there are also corresponding ALL_ and USER_
views.

Transportable Tablespaces and Oracle9i
• Introduced in Oracle8i
• New in Oracle9i: Source and target databases can
have different block sizes
• Especially useful for transporting from OLTP to
warehouse
• Same limitations as in Oracle8i except for the
different block sizes limitation
• Refer to the Memory Management Lesson in the
Server Manageability module for more information on how
to set up different block sizes in the same database
• Use the same procedures defined in Oracle8i to
transport a tablespace to a database with a different
block size
4-30

Oracle8i introduced an important mechanism for transporting data: transportable
tablespaces. This feature is the fastest mechanism for moving large volumes of
data between two Oracle databases.
Prior to Oracle8i, the most scalable data transportation mechanisms relied on
moving flat files containing raw data. These mechanisms required that data be
unloaded or exported into files from the source database. Then, after
transportation, these files were loaded or imported into the target database.
Transportable tablespaces entirely bypass the unload and reload steps.
Using transportable tablespaces, Oracle data files (containing table data, indexes,
and almost every other Oracle database object) can be directly transported from
one database to another. Furthermore, like import and export, transportable
tablespaces provide a mechanism for transporting metadata in addition to
transporting data.
Transportable tablespaces have some notable limitations: source and target systems
must be running Oracle8i (or higher), must be running the same operating system,
and must use the same character set and national character set, … Despite these
limitations, transportable tablespaces can be an invaluable data transportation
technique in many warehouse environments.
The most common application of transportable tablespaces in data warehouses is
moving data from an OLTP database to a data warehouse. For this case,
Oracle9i introduces a new feature that allows the DBA to transport a
tablespace with a certain block size to a database with another block size.

Although the procedure used to transport a tablespace from a source database to a
target database with different block sizes is the same as the procedure already
described for Oracle8i databases, the DBA needs to know how to set up databases
with different database block sizes. You should refer to the Server Manageability
module of this track for more details. Just follow the above link to this module.
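
The overall procedure can be sketched as follows. The tablespace name sales_ts, the file names, the cache size and the 16 KB block size are hypothetical; the exp and imp invocations are shown as comments because they are run from the operating system prompt.

```sql
-- On the source database: check that the set is self-contained, then freeze it.
EXECUTE DBMS_TTS.TRANSPORT_SET_CHECK('sales_ts', TRUE);
SELECT * FROM transport_set_violations;      -- should return no rows
ALTER TABLESPACE sales_ts READ ONLY;

-- From the OS prompt, export the tablespace metadata:
--   exp transport_tablespace=y tablespaces=sales_ts file=sales_ts.dmp
-- Then copy sales_ts.dmp and the data files of sales_ts to the target system.

-- On the target database: if sales_ts uses a block size (16 KB assumed here)
-- other than the target's standard block size, a buffer cache for that
-- block size must be configured first.
ALTER SYSTEM SET db_16k_cache_size = 32M;

-- From the OS prompt, plug the tablespace in:
--   imp transport_tablespace=y datafiles='/u01/oradata/sales01.dbf' file=sales_ts.dmp

-- Optionally make the tablespace writable again on the target.
ALTER TABLESPACE sales_ts READ WRITE;
```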

Parallel Direct-Load Inserts over
Partitioned Tables Enhancements
• Since Oracle8, each partition of a table is acted upon
by only one slave during a parallel direct-load
operation
• With Oracle9i, multiple slave processes can insert data
in one partition
• The main benefit is that load balancing is possible with
data skew amongst partitions
• This results in better performance
4-32
Parallel Direct-Load Inserts over Partitioned Tables Enhancements

Since Oracle8, a parallel direct-load insert operation over a partitioned table could
use only one slave process per partition. As a result, load balancing is not possible
if there is data skew amongst the partitions. For example, in many applications the table is
partitioned by range on a date column, and the rows are mainly inserted in the last
partition. The slave working on the last partition then needs to do much more
work than the other slaves.
Allowing multiple slaves to work on a partition helps to alleviate this performance
bottleneck. It takes advantage of the dynamic load balancing capabilities of parallel
execution.
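
A parallel direct-load insert of the kind discussed here is typically issued as shown below. The tables sales and sales_staging and the degree of parallelism of 4 are hypothetical.

```sql
-- Parallel DML is disabled by default and must be enabled in the session.
ALTER SESSION ENABLE PARALLEL DML;

-- APPEND requests a direct-load insert; PARALLEL requests four slaves
-- for the insert and for the scan of the source table.
INSERT /*+ APPEND PARALLEL(sales, 4) */ INTO sales
SELECT /*+ PARALLEL(sales_staging, 4) */ *
FROM   sales_staging;

-- The session cannot query the table again until the transaction ends.
COMMIT;
```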

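How Does It Work?
[Figure: two panels. Oracle8i: query slaves feed, through a repartitioning step, DML slaves that each insert above the high water mark (HWM) of a single partition (Partition 1, Partition 2). Oracle9i: each slave performs both the query and the DML and can insert into either partition, coordinating through an enqueue.]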
4-33
How Does It Work?

In Oracle8i, parallel direct-load inserts over partitioned tables work as shown on
the left part of the above slide. During parallel execution of an insert into a
partitioned table, there are two active slave sets: the query slave set,
which reads the data, and the DML slave set, which inserts the data. Each slave of the
DML slave set works on one partition of the table. The inverse is also true; that is, each
partition is allocated to at most one slave from this slave set.
The rows read by the query slaves have to be repartitioned and redistributed
amongst the DML slaves so that each DML slave gets rows that correspond to the
partition it is populating. These rows are inserted above the high water mark (in
red on the above slide) of the segment of the corresponding partition by the DML
slave.
With the ability to use intra-partition parallelism in inserts, Oracle9i removes
this repartitioning phase altogether. The new mode of operation is represented on
the right part of the above slide. As shown, each slave does the query as well as
the DML part of the execution. Because more than one slave can work simultaneously
on the target partitions, the repartitioning phase is not required.
Each slave, while executing the query, buffers the results in an in-memory buffer
(one for each partition that it can insert into). Whenever a buffer fills up, the
slave knows exactly the amount of space it needs for that partition. The slaves
coordinate their activities on each segment through a “broker enqueue” (in blue on
the above slide). Oracle9i associates with each segment an enqueue that holds
the address, above the high water mark, beyond which no space has been reserved by
any slave.

How does it Work ? (Continued)
After determining the amount of space (X) it needs (at the time when the buffer
corresponding to that partition fills up), the slave acquires the enqueue, reads
its value, and reserves the needed space by increasing the enqueue value by X.
The slave then releases the enqueue before writing into the space it has just
reserved in the segment. At commit time, the high water mark of each corresponding
partition is updated to the enqueue’s value to reflect the changes.
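
The reservation protocol described above can be sketched as a small simulation.
This is a toy model written in Python, not Oracle code: the class SegmentBroker,
the function slave, the buffer sizes, and the starting high water mark of 1000 are
all hypothetical, and a thread lock stands in for the broker enqueue.

```python
import threading


class SegmentBroker:
    """Toy stand-in for the per-segment broker enqueue (hypothetical names)."""

    def __init__(self, high_water_mark):
        self.high_water_mark = high_water_mark  # committed HWM of the segment
        self._next_free = high_water_mark       # first address not yet reserved
        self._lock = threading.Lock()           # plays the role of the enqueue

    def reserve(self, size):
        """Reserve `size` units above the HWM and return the start address."""
        with self._lock:                        # hold the enqueue only briefly
            start = self._next_free
            self._next_free += size             # bump the enqueue value by X
        return start                            # released before any writing

    def commit(self):
        """At commit time the HWM moves up to the enqueue's value."""
        with self._lock:
            self.high_water_mark = self._next_free


def slave(broker, buffer_sizes, extents):
    # Each full buffer triggers one reservation; the write itself needs no lock.
    for size in buffer_sizes:
        start = broker.reserve(size)
        extents.append((start, start + size))


broker = SegmentBroker(high_water_mark=1000)
extents = []
threads = [threading.Thread(target=slave, args=(broker, [8, 16, 8], extents))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
broker.commit()

extents.sort()
print(broker.high_water_mark)  # 1128 = 1000 + 4 * (8 + 16 + 8)
```

Because every reservation is made while holding the lock, the reserved ranges never
overlap, and several slaves can fill the same segment without a repartitioning step.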

SQL*Loader Enhancements
• Oracle9i SQL*Loader allows for correct loading of
integer and zoned/packed decimal datatypes across
platforms. For example:
– INTEGER(n) BYTEORDER
• No more byte ordering problems
• CHECK constraints can be enabled during direct path
loads by using the EVALUATE CHECK_CONSTRAINTS
clause
SQL*Loader Enhancements
In Oracle9i, SQL*Loader can correctly load integer and zoned/packed decimal
datatypes across platforms. This means that SQL*Loader can now do the following:
• Load binary integer data created on a platform whose byte ordering is
different than that of the target platform. INTEGER(n) is a full-word binary
integer, where n is an optionally supplied length of 1, 2, 4, or 8 bytes. If no
length is given, the length in bytes is based on the size of a C long int on your
particular platform, so an INTEGER specified without a length is not portable.
An INTEGER specified with a length (n) and the BYTEORDER keyword is portable.
Specifying an explicit length for binary integers is useful when the input data
was created on a platform whose word length differs from that of the platform on
which SQL*Loader is running. For instance, input data containing binary integers
might be created on a 64-bit platform and loaded into a database using
SQL*Loader on a 32-bit platform. In this case, INTEGER(8) should be used
to instruct SQL*Loader to process the integers as 8-byte quantities, not
as 4-byte quantities.
• Load binary floating-point data created on a platform whose byte ordering is
different than that of the target platform (provided the floating-point format
used by source and target systems is the same)
• Specify the size, in bytes, of a binary integer and load it regardless of the
target platform's native integer size
• Specify that integer values are to be treated as signed or unsigned quantities
• Accept EBCDIC-based zoned/packed decimal data encoded in IBM format
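
The byte-ordering and length problem described above can be illustrated outside
SQL*Loader. The following is a minimal sketch using Python's standard struct
module; it does not use SQL*Loader itself, and the sample value 1 is arbitrary.

```python
import struct

# An 8-byte signed integer as written by a big-endian source platform.
raw = struct.pack(">q", 1)

as_big = struct.unpack(">q", raw)[0]     # byte order declared correctly
as_little = struct.unpack("<q", raw)[0]  # byte order assumed wrongly

# Treating the 8-byte field as a 4-byte integer also gives the wrong value.
first_four = struct.unpack(">i", raw[:4])[0]

print(as_big, as_little, first_four)  # 1 72057594037927936 0
```

The same eight bytes yield three different numbers depending on the declared length
and byte order, which is why a portable load needs both stated explicitly.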

SQL*Loader Enhancements (Continued)
SQL*Loader can also evaluate check constraints during direct path loads.
By default, the following constraints are automatically disabled during a direct
path load:
• Check constraints
• Referential constraints (foreign keys)
You can override the disabling of check constraints by specifying the EVALUATE
CHECK_CONSTRAINTS clause. SQL*Loader then evaluates check constraints during
the direct path load and rejects any row that violates a check constraint.

Overview of Oracle9i Bulk Bind
Enhancements
• Bulk bind features have been enhanced in Oracle9i to
support more efficient and convenient bulk bind
operations:
– BULK FETCH
– BULK EXECUTE IMMEDIATE/BULK EXECUTE
– FORALL, COLLECT INTO and RETURNING INTO
• Restrictions on usage of collections in SELECT and
FETCH clauses have been removed
• Error handling for failure in Bulk Binds has been provided
– In prior releases, exceptions occurring during a FORALL
stopped program execution immediately
– The new error handling mechanism saves the error
information and continues execution
Overview of Oracle9i Bulk Bind Enhancements

Bulk bind features of PL/SQL have been enhanced in Oracle9i to support more convenient and
efficient bulk bind operations and to provide an error handling mechanism. One restriction on
bulk binds in prior releases of Oracle is that variables defined in SELECT and FETCH clauses cannot
be collections of records. In Oracle9i the bulk binding feature has been enhanced to allow users to
easily retrieve multiple rows from database tables. Bulk binding of records used in INSERT and
UPDATE statements is also now supported.
The following functionality has been introduced for Bulk SQL in embedded dynamic SQL:
• BULK FETCH support for cursors opened on dynamic SQL statements
• BULK EXECUTE IMMEDIATE and EXECUTE for dynamic SQL
• The FORALL statement and the COLLECT INTO and RETURNING INTO clauses are extended to
support dynamic SQL statements
In addition, in previous releases a bulk bind operation such as FORALL INSERT/UPDATE/DELETE
stopped immediately whenever an error occurred during its execution, and an exception was then
raised. In certain applications it is better to handle the exception and continue processing. In Oracle9i
an error handling mechanism has been incorporated so that errors during a bulk bind operation are
collected and returned together when the operation completes.

Example of Exception Handling
DECLARE
  TYPE numtab IS TABLE OF NUMBER;
  v_deptid numtab :=
    numtab(10,0,11,12,30,0,20,199,2,0,9,1);
  total_error NUMBER;
BEGIN
  FORALL i IN v_deptid.FIRST .. v_deptid.LAST
    SAVE EXCEPTIONS
    DELETE FROM employees WHERE salary >
      500000/v_deptid(i);
  total_error := SQL%BULK_EXCEPTIONS.COUNT;

The keywords SAVE EXCEPTIONS in a FORALL DELETE/UPDATE/INSERT statement are
required if you want the bulk bind operation to complete regardless of any errors that occur.
All of the errors occurring during the execution are saved in the new cursor attribute,
%BULK_EXCEPTIONS. %BULK_EXCEPTIONS is a collection of records, each of which
has two integer attributes. The first attribute, index, stores the iteration of the FORALL
statement during which the error occurred. The second attribute, error, stores the corresponding
SQL error code, SQLCODE. You can get the corresponding SQL error message by calling SQLERRM with
the SQL error code as the parameter. The number of errors is saved in the count attribute of
%BULK_EXCEPTIONS, that is, %BULK_EXCEPTIONS.COUNT. The subscripts of
%BULK_EXCEPTIONS run from 1 to %BULK_EXCEPTIONS.COUNT.
Without the keywords SAVE EXCEPTIONS, the DELETE/UPDATE/INSERT statement inside
a FORALL statement stops whenever an error occurs. In this situation,
SQL%BULK_EXCEPTIONS.COUNT is one and SQL%BULK_EXCEPTIONS contains one record.
If there is no error at all during the execution of a FORALL statement,
%BULK_EXCEPTIONS.COUNT returns zero.
The values of this new cursor attribute always refer to the most recently executed FORALL statement.
%BULK_EXCEPTIONS cannot be assigned to other collections, and it cannot be passed as a
parameter to subprograms.

(continued)
  dbms_output.put_line('Total number of errors is '
    || total_error);
  FOR i IN 1 .. total_error LOOP
    dbms_output.put_line('Error ' || i || ' occurred at the '
      || SQL%BULK_EXCEPTIONS(i).index ||
      'th iteration');
    -- SQLERRM expects a negative number for an Oracle error code
    dbms_output.put_line('SQL Error Code is ' ||
      SQL%BULK_EXCEPTIONS(i).error ||
      ' Error: ' ||
      SQLERRM(-SQL%BULK_EXCEPTIONS(i).error));
  END LOOP;
END;
Example of Exception Handling (continued)

In the example above, errors with SQL error code 1476 (zero_divide) occur when i = 2, 6,
and 10. The bulk bind operation finishes the whole operation and then returns all of the errors
together in the new cursor attribute, %BULK_EXCEPTIONS.
SQL%BULK_EXCEPTIONS.COUNT is 3 and the contents of SQL%BULK_EXCEPTIONS are
(2,1476), (6,1476), (10,1476).
The output of the above program begins as follows:
Total number of errors is 3
Error 1 occurred at the 2th iteration
SQL Error Code is 1476 Error: ORA-01476: divisor is equal to zero
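
The collect-and-continue behavior of SAVE EXCEPTIONS can be modeled in a few lines.
The following is a language-neutral sketch in Python rather than PL/SQL: the
function forall_save_exceptions is hypothetical, and a plain division stands in for
the DELETE statement so that the zero entries fail in the same positions.

```python
def forall_save_exceptions(values, operation):
    """Run operation for every element; collect failures instead of stopping."""
    bulk_exceptions = []  # (iteration, error) pairs, like %BULK_EXCEPTIONS
    for iteration, value in enumerate(values, start=1):  # 1-based, as in PL/SQL
        try:
            operation(value)
        except ZeroDivisionError as exc:
            bulk_exceptions.append((iteration, exc))     # save it and carry on
    return bulk_exceptions


dept_ids = [10, 0, 11, 12, 30, 0, 20, 199, 2, 0, 9, 1]
errors = forall_save_exceptions(dept_ids, lambda d: 500000 / d)

print(len(errors))                        # 3
print([itr for itr, _ in errors])         # [2, 6, 10]
```

All twelve iterations run, and the three failures are reported together at the end,
which matches the iterations listed in the notes above.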



BULK COLLECT Enhancements Examples
CREATE TYPE Coords AS OBJECT (x NUMBER, y NUMBER);
CREATE TABLE grid (num NUMBER, loc Coords);
INSERT INTO grid VALUES(10, Coords(1,2));
INSERT INTO grid VALUES(20, Coords(3,4));
DECLARE
  TYPE CoordsTab IS TABLE OF Coords;
  pairs CoordsTab;
BEGIN
  SELECT loc BULK COLLECT INTO pairs FROM grid;
  -- now pairs contains (1,2) and (3,4)
END;
Bulk Collect Enhancements Examples

The keywords BULK COLLECT tell the SQL engine to bulk-bind output collections before returning
them to the PL/SQL engine. You can use these keywords in the SELECT INTO, FETCH INTO, and
RETURNING INTO clauses.
In the above example, the SQL engine loads all the values in an object column into a nested table
before returning the table to the PL/SQL engine.

BULK COLLECT Enhancements Examples (Continued)
DECLARE
  TYPE NameList IS TABLE OF employees.last_name%TYPE;
  TYPE SalList IS TABLE OF employees.salary%TYPE;
  CURSOR c1 IS SELECT last_name, salary
    FROM employees WHERE salary > 1000;
  names NameList;
  sals SalList;
BEGIN
  OPEN c1;
  FETCH c1 BULK COLLECT INTO names, sals;
  ...
END;
Bulk Collect Enhancements Examples (Continued)

The above example shows that you can bulk-fetch from a cursor into one or more collections.
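
For readers more familiar with client-side programming, the idea of binding or
fetching a whole set of rows in one call can be tried with Python's standard sqlite3
module. This is only a loose analogy, not Oracle or PL/SQL: the table contents are
made-up sample data, executemany plays the role of a bulk bind on input, and
fetchall plays the role of a bulk fetch into collections.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (last_name TEXT, salary REAL)")

# Bulk bind on input: one call binds every row (loosely like FORALL).
rows = [("King", 24000), ("Kochhar", 17000), ("Olson", 900)]
conn.executemany("INSERT INTO employees VALUES (?, ?)", rows)

# Bulk fetch on output: one call returns every matching row
# (loosely like FETCH c1 BULK COLLECT INTO names, sals).
cur = conn.execute(
    "SELECT last_name, salary FROM employees "
    "WHERE salary > 1000 ORDER BY salary DESC")
fetched = cur.fetchall()
names = [r[0] for r in fetched]
sals = [r[1] for r in fetched]
conn.close()

print(names)  # ['King', 'Kochhar']
```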

Example of Bulk Bind with Dynamic SQL
DECLARE
  TYPE num_tab IS TABLE OF NUMBER;
  TYPE trc IS REF CURSOR;
  rc trc;
  ids num_tab;
BEGIN
  EXECUTE IMMEDIATE 'select employee_id
    from employees'
    BULK COLLECT INTO ids;
END;
Example of Bulk Bind for Defining Variables

The above example declares num_tab locally. The examples that follow use num_tab without
declaring it; they assume it exists as a collection type created as follows:
CREATE TYPE num_tab AS TABLE OF NUMBER;

Example of Bulk Bind for Input Variables
DECLARE
  TYPE char_tab IS TABLE OF VARCHAR2(30);
  ids num_tab := num_tab(12,13,14,15);
  names char_tab := char_tab('R/D','IT','GL','PR');
BEGIN
  FORALL iter IN 1..4
    EXECUTE IMMEDIATE
      'INSERT INTO departments(department_id,
       department_name) VALUES(:1, :2)'
      USING ids(iter), names(iter);
END;
Bulk-bind for Input Variables

Input bind variables of a SQL statement can be bound with the FORALL statement and the USING
clause. In the above example, bulk binding is used to execute an INSERT statement.

Example of Bulk Binding In Output Variables
DECLARE
  sql_str VARCHAR2(200);
  v_sal NUMBER := 10000;
  saltab num_tab;
BEGIN
  sql_str := 'UPDATE employees SET salary = :1
    WHERE department_id = 10
    RETURNING salary INTO :2';
  EXECUTE IMMEDIATE sql_str
    USING v_sal RETURNING BULK COLLECT INTO
    saltab;
END;
Bulk-bind for Output Variables

Only DML statements have out binds. In a bulk bind, output bind variables must be bound with the
RETURNING BULK COLLECT INTO clause, as in the above example.

Summary
In this lesson, you should have learned how to:
• Understand Oracle’s core ETL framework inside the
database and its integration advantage
• ETL framework components:
– Understand how to use Change Data Capture
– Read non-Oracle data from external tables
– Use the new table functions
– Insert into multiple tables in one statement
– Use the new Upsert statement
– Use transportable tablespace between databases of
different block sizes
• Use the new Bulk Bind features in PL/SQL
• Understand parallel insert operations enhancements
• Use some of the new SQL*Loader possibilities

Additional Enhancements for
Data Warehouse Environments

Objectives

• Understand the new bitmap index join mechanism
• Understand some of the Oracle9i Optimizer’s
enhancements
• Use of the enhanced Oracle9i materialized views
• Take advantage of the query rewrite
enhancements

What is a Bitmap Join Index ?
CREATE BITMAP INDEX cust_sales_bji
ON Sales(C.cust_city)
FROM Sales S, Customers C
WHERE C.cust_id = S.cust_id;

Index entries: <key, start ROWID, end ROWID, bitmap>
<Rognes, 1.2.3, 10.8000.3, 1000100100010010100…>
<Aix-en-Provence, 1.2.3, 10.8000.3, 0001010000100100000…>
<Marseille, 1.2.3, 10.8000.3, 0100000011000001001…>
……………………………………

SELECT SUM(S.amount_sold)
FROM Sales S, Customers C
WHERE S.cust_id = C.cust_id
AND C.cust_city = 'Aix-en-Provence';

Only the index and the Sales table are used to evaluate the query.
No join with the Customers table is needed!
What is a Bitmap Join Index ?

In addition to a bitmap index on a single table, you can create a bitmap join index
in Oracle9i. A bitmap join index is a bitmap index for the join of two or more
tables. A bitmap join index is a space efficient way of reducing the volume of data
that must be joined by performing restrictions in advance.
As you can see on the above slide, Oracle9i introduces a new CREATE BITMAP
INDEX syntax allowing you to specify a FROM and a WHERE clause.
Here, we create a new bitmap join index named cust_sales_bji on the
SALES table. The key of this index is the column CUST_CITY of the
CUSTOMERS table.
This example assumes that there is an enforced primary key-foreign key
relationship between SALES and CUSTOMERS, which ensures that what is
stored in the bitmap reflects the reality of the data in the tables. The column
CUST_ID is the primary key of CUSTOMERS and is also a foreign key inside
SALES.
The FROM and WHERE clauses in the CREATE statement allow Oracle9i to make the
link between the two tables. They represent the natural join condition between the
two tables.
The middle part of the above graphic shows a theoretical implementation of this
bitmap join index. Each entry, or key, in the index represents a possible city found
in the CUSTOMERS table. A bitmap is associated with each key. The bitmap has the
same representation as in a traditional bitmap index: each bit corresponds to one
row in the SALES table. Taking the first key above (Rognes), the first bit is set,
so the first row in the SALES table corresponds to a product sold to a Rognes
customer, while the second bit is not set, so the second row does not.

What is a Bitmap Join Index ? (Continued)
The value of this structure becomes clear with the last part of the slide. When
the user asks for the total amount of all sales to Aix-en-Provence customers,
Oracle9i can use just the above bitmap join index and the SALES table to answer
the question. In this case, there is no need to explicitly compute the join
between the two tables. Using the bitmap join index in this case is much faster
than computing the join at query time.
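
The idea can be modeled with a few lines of Python. This is a toy illustration of
the concept, not Oracle's storage format: the sample rows are invented, a list
position stands in for the ROWID, a Python integer stands in for the bitmap, and
the helper names build_bitmap_join_index and sum_amount are hypothetical.

```python
customers = {1: "Rognes", 2: "Aix-en-Provence", 3: "Marseille"}  # cust_id -> city
sales = [  # (cust_id, amount_sold); the list position stands in for the ROWID
    (1, 100.0), (3, 40.0), (2, 25.0), (2, 75.0), (1, 10.0), (3, 60.0),
]


def build_bitmap_join_index(sales, customers):
    """Do the join once, at build time: one bitmap per city over SALES rows."""
    index = {}
    for row_no, (cust_id, _) in enumerate(sales):
        city = customers[cust_id]  # the only place the join is computed
        index[city] = index.get(city, 0) | (1 << row_no)
    return index


def sum_amount(sales, bitmap):
    """Answer the query from the bitmap and SALES alone."""
    return sum(amount for row_no, (_, amount) in enumerate(sales)
               if (bitmap >> row_no) & 1)


bji = build_bitmap_join_index(sales, customers)
print(sum_amount(sales, bji["Aix-en-Provence"]))  # 100.0
```

At query time the CUSTOMERS data is never consulted; the cost of the join was paid
when the index was built, which is also why maintaining the index requires a join.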

Advantages & Disadvantages
• Bitmap join indexes provide an indexing structure
across two or more tables
• Advantages :
– Bitmap Join Indexes provide good performance for
join queries and are space-efficient
– Especially useful for large dimension tables in star
schemas
• Disadvantages :
– More indexes are required: up to one index per
dim-table column rather than one index per dim
table
– Maintenance costs are higher: building/refreshing a
bitmap join index requires a join
Advantages & Disadvantages

A Bitmap Join Index is an index on one table that involves columns of one or more
different tables through a join.
The volume of data that must be joined can be reduced if bitmap join indexes are
used, because the joins have already been precalculated. In addition, bitmap join
indexes that cover multiple dimension tables can eliminate the bitwise operations
that are necessary in the star transformation’s use of bitmap indexes.
An alternative to a bitmap join index is a materialized join view, which is the
materialization of a join in a table. Compared to a materialized join view, a bitmap
join index is much more space efficient because it compresses the rowids of the
fact table. Also, queries using bitmap join indexes can be sped up via bitwise
operations.
On the other hand, you may need to create more bitmap join indexes on the fact
table to satisfy the largest number of different queries. This means that you may
have to create one bitmap join index for each column of the corresponding
dimension table(s). Having many indexes on one table leads to higher maintenance
costs, especially when the fact table is updated.

A more Complex Example
[Slide diagram: the Sales table joined to the Customers and Products tables.]
CREATE BITMAP INDEX c_s_p_bji ON
Sales (C.cust_gender, P.prod_category)
FROM Sales S, Customers C, Products P
WHERE C.cust_id = S.cust_id AND
P.prod_id = S.prod_id;
SELECT SUM(S.amount_sold)
FROM Sales S, Customers C, Products P
WHERE S.cust_id = C.cust_id AND
S.prod_id = P.prod_id AND
C.cust_gender = 'MALE' AND
P.prod_category = 'MOBILE';
A more Complex Example

You can create a bitmap join index on more than one table, as in the above slide,
where c_s_p_bji is a bitmap join index on the SALES table that indexes two
columns in two different tables: CUST_GENDER in CUSTOMERS and
PROD_CATEGORY in PRODUCTS.
The above SELECT statement, which retrieves the total MOBILE sales for MALE
customers, can benefit from c_s_p_bji.
Note: The above example is only meant to clarify the syntax possibilities. Creating
one multi-column bitmap join index for each query could lead to an explosion in the
number of indexes. In the above case, it may be preferable to create two separate
bitmap join indexes instead: one on the SALES table for CUST_GENDER and one on
SALES for PROD_CATEGORY. The optimizer can use both indexes for the above
query. This is, of course, less efficient than the single index shown above, but it
has the great advantage of also serving queries that join only SALES and CUSTOMERS,
or only SALES and PRODUCTS.
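
The trade-off in the note above can be sketched with two single-column bitmaps that
are combined at query time. This is a toy Python model with invented sample rows,
not Oracle's implementation; each tuple represents a SALES row already matched to
its dimension values, and the helper bitmap_for is hypothetical.

```python
# Row i of SALES with its dimension values (hypothetical sample data).
sales = [
    ("MALE", "MOBILE", 300.0),
    ("FEMALE", "MOBILE", 250.0),
    ("MALE", "AUDIO", 80.0),
    ("MALE", "MOBILE", 120.0),
]


def bitmap_for(rows, column, value):
    """Bitmap over SALES rows whose dimension column equals value."""
    bits = 0
    for row_no, row in enumerate(rows):
        if row[column] == value:
            bits |= 1 << row_no
    return bits


gender_male = bitmap_for(sales, 0, "MALE")        # index on cust_gender
category_mobile = bitmap_for(sales, 1, "MOBILE")  # index on prod_category

both = gender_male & category_mobile              # one extra bitwise AND
total = sum(r[2] for i, r in enumerate(sales) if (both >> i) & 1)
print(bin(both), total)  # 0b1001 420.0
```

The extra AND is the price of using two indexes, while either bitmap remains usable
on its own for queries that restrict only gender or only category.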

Bitmap Join Indexes Restrictions
• Parallel DML is currently only supported on the fact
table
• Only one table can be updated concurrently
• No table can appear twice in the join
• Can’t be created on index-organized tables or
temporary tables
• The keys must all be part of the dimension tables
• The dimension table join columns must be either :
– primary key columns OR
– have unique constraints
• If a dimension table has composite primary key, each
column in the primary key must be part of the join.
Bitmap Join Indexes Restrictions

Because a bitmap join index stores the result of a join, bitmap join indexes have
the following restrictions:
• Parallel DML is currently only supported on the fact table. Parallel DML on
one of the participating dimension tables will mark the index as unusable.
• Only one table can be updated concurrently by different transactions when
using the bitmap join index.
• No table can appear twice in the join.
• You cannot create a bitmap join index on an index-organized table or a
temporary table.
• The columns in the index must all be columns of the dimension tables.
• The dimension table join columns must be either primary key columns or
have unique constraints.
• If a dimension table has composite primary key, each column in the primary
key must be part of the join.

New First Rows Optimization
New way :
– OPTIMIZER_MODE =
– FIRST_ROWS_1
– FIRST_ROWS_10
– FIRST_ROWS_100
– FIRST_ROWS_1000
– ALTER SESSION SET OPTIMIZER_GOAL =
FIRST_ROWS_n
– /*+ FIRST_ROWS(x) */
– x can be any positive integer
New First Rows Optimization

Oracle9i introduces a new way of doing first rows optimization. The old mode was
partially based on heuristics, such as always using an index if possible. These
heuristics could sometimes lead to very bad plans.
The new mode is completely cost based and allows optimizing for a particular
number of first rows, for example the first 10 rows.
It is selected through the initialization parameter optimizer_mode or
the session parameter optimizer_goal, which accept the following values:
• first_rows_1
• first_rows_10
• first_rows_100
• first_rows_1000
In addition, the FIRST_ROWS hint now takes a numeric argument that is not
limited to the values for the parameter. For instance, you could specify
/*+ FIRST_ROWS(20) */
Without a numeric argument, both the parameter and the hint imply the old first
rows behavior, which is retained for backwards compatibility and plan stability.
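
Why the target number of rows matters can be seen with a deliberately simple cost
comparison. The sketch below is written in Python and is not Oracle's cost model:
the two plans, their startup and per-row costs, and the linear formula are all
hypothetical numbers chosen only to show the crossover.

```python
def cost_to_fetch(plan, rows):
    """Cost of returning the first `rows` rows under a linear toy model."""
    return plan["startup"] + plan["per_row"] * rows


# Hypothetical plans: an index walk returns rows at once but pays per row;
# a full scan with a sort pays up front and then returns rows cheaply.
plans = {
    "index_range_scan": {"startup": 1.0, "per_row": 2.0},
    "full_scan_and_sort": {"startup": 500.0, "per_row": 0.01},
}


def best_plan(rows):
    return min(plans, key=lambda name: cost_to_fetch(plans[name], rows))


for n in (1, 10, 100, 1000):
    print(n, best_plan(n))
```

With these made-up numbers the index plan is cheapest for 1, 10, and 100 rows and
the full scan wins for 1000 rows. A purely cost-based first rows mode makes this
choice from estimated costs instead of a fixed rule such as always using an index.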

New Gathering Statistic Estimates
New auto-sampling functionality with DBMS_STATS:
• DBMS_STATS.AUTO_SAMPLE_SIZE: New possible value for
estimate_percent parameter. Oracle decides on the
percentage necessary to ensure accurate statistics
collections
• For histograms, DBAs can now specify the following new
size options in the method_opt parameter:
– REPEAT: New histogram with same number of buckets
– AUTO: New histogram based on data distribution AND
application workload
– SKEWONLY: New histogram based on data distribution
EXECUTE DBMS_STATS.GATHER_SCHEMA_STATS(
ownname => 'OE',
estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE,
method_opt => 'for all columns size AUTO');
New Gathering Statistic Estimations

Because the cost-based approach relies on statistics, users should generate statistics
for all tables, clusters, and all indexes accessed by SQL statements before using the
cost-based approach. If the size and data distribution of the tables change
frequently, then users should regenerate these statistics regularly to ensure the
statistics accurately represent the data in the tables.
Exact statistics computation requires enough space to perform scans and sorts of
involved objects. If there is not enough space in memory, then temporary space
might be required. Thus, it is also possible to compute only estimations in order to
reduce resources needed to gather statistics. The difficulty in computing estimated
statistics is to find the best sample size. Some statistics are always computed
exactly, such as the number of data blocks currently containing data in a table or
the depth of an index from its root block to its leaf blocks. Nevertheless, this is not
true for all statistics.
With Oracle9i, Oracle Corporation recommends setting the
ESTIMATE_PERCENT parameter of the DBMS_STATS gathering procedures to
the new value DBMS_STATS.AUTO_SAMPLE_SIZE. This value is introduced to
maximize performance gains while achieving the necessary statistical accuracy,
avoiding the two extremes of collecting inaccurate statistics and wasting
valuable time.

New Gathering Statistic Estimations (Continued)
Oracle9i also introduces new possible values for the size option of the method_opt
parameter of the DBMS_STATS gathering procedures:
• If the size option is set to REPEAT and the column has a histogram with b
buckets, Oracle will attempt to create a new histogram with b buckets. If the
column has no histogram, Oracle will not create a histogram. This option is
used to maintain the same “class” of statistics (histogram/no-histogram) when
looking at new data.
• If the size is set to AUTO, Oracle decides whether to create a histogram based
on the data distribution AND the way the column is being used by the
application. This means that Oracle looks not only at non-uniformity in value
repetition counts (skew) but also at non-uniformity in range (sparsity). If the
application has not yet run long enough to capture the workload involving this
column, it is better to use the SKEWONLY option temporarily.
• If the size is set to SKEWONLY, Oracle will decide to create a histogram
based solely on the data distribution (regardless of how the application uses the
column). This option is useful when gathering statistics for the first time (before
the workload has had time to run). Using SKEWONLY can add quite a bit of
overhead to statistics collection, so Oracle recommends that customers use
AUTO after the application has run for a while.
The above example shows you how to collect all table, column, and index statistics
for OE’s schema where Oracle decides what the sampling percentage should be
and when histograms are necessary (assuming that the workload has run for a
while).
Note: Oracle captures workload information for a cursor when it is hard parsed.
The information is stored in the SGA and regularly flushed to disk. No access to
these memory and disk structures is provided in Oracle9i.

Optimizer Cost Model Enhancements
The cost model now:
• Gives more meaningful cost estimates. PLAN_TABLE
contains three new columns:
– CPU_COST: The estimated CPU cost of the operation
– IO_COST: The estimated I/O cost of the operation
– TEMP_SPACE: The estimated temporary space, in bytes,
used by the operation
• Includes CPU and network usage
• Accounts for the effect of caching
• Accounts for index prefetching
Optimizer Cost Model Enhancements

The role of a query optimizer is to produce the best performing execution plan for a
given query. This process includes selecting access paths for single tables, join
order if more than one table is involved in the query, and the implementation of
each join operation (i.e, join method).
Currently, a user has a choice between the rule-based optimizer (RBO) and the
cost-based optimizer. The cost-based optimizer uses a cost model (a set of cost
functions) to choose between alternative access paths, join orders, and join
methods, while the RBO uses a set of simple rules.
The cost-based optimizer compares the cost of several alternatives and selects the
one with the lowest cost. In addition to the cost model, the cost-based optimizer
uses a size model to derive statistics on intermediate tables, for example the
cardinality of the result of a join operation.
Both the cost and the size model use statistics on the objects manipulated by the
query. Those statistics are produced by using the DBMS_STATS package and are
stored in the database dictionary.
The quality of the execution plan produced by the optimizer depends on the
accuracy of both the cost and the size model.
The current version of both models has several limitations in terms of
accuracy and completeness. For example, the size model assumes independence of
columns when computing the selectivity of multiple predicates on different
columns, and the cost model accounts only for I/O activity.

Optimizer Cost Model Enhancements (Continued)
The cost model is extended to do the following:
• Give users and developers more meaningful cost information.
The PLAN_TABLE contains three new columns:
– CPU_COST: The CPU cost of the operation as estimated by the
optimizer's cost-based approach. For statements that use the rule-based
approach, this column is null. The value of this column is proportional to the
number of machine cycles required for the operation
– IO_COST: The I/O cost of the operation as estimated by the optimizer's
cost-based approach. For statements that use the rule-based approach, this
column is null. The value of this column is proportional to the number of
data blocks read by the operation
– TEMP_SPACE: The temporary space, in bytes, used by the operation as
estimated by the optimizer's cost-based approach. For statements that use
the rule-based approach, or for operations that don't use any temporary
space, this column is null
• Include CPU and network usage. CPU usage will be estimated for SQL
functions and operators. Network usage will be estimated when data is shipped
between query servers running on different nodes of a cluster
• Account for the effect of caching on the performance of nested-loops join
• Account for index prefetching. Index prefetching consists of fetching
multiple leaf blocks in a single I/O operation.

Gathering System Statistics
• System statistics enable the cost-based optimizer to
use CPU and I/O characteristics
• System statistics must be gathered on a regular
basis for the cost-based optimizer
• Does not invalidate cached plans
• Gathering system statistics = analyzing system
activity for a specified period of time
• System analysis is provided by the new procedures:
– DBMS_STATS.GATHER_SYSTEM_STATS
– DBMS_STATS.SET_SYSTEM_STATS
– DBMS_STATS.GET_SYSTEM_STATS
Gathering System Statistics

New in Oracle9i, system statistics allow the optimizer to consider a system's I/O
and CPU performance and utilization. For each candidate plan, the optimizer
computes estimates for I/O and CPU costs. It is important to know the system
characteristics to pick the most efficient plan with optimal proportion between I/O
and CPU cost.
System CPU and I/O characteristics depend on many factors and do not stay
constant all the time. Using the system statistics management routines, database
administrators can capture statistics in the interval of time when the system has
its most common workload. For example, database applications can process OLTP
transactions during the day and run OLAP reports at night. Administrators can
gather statistics for both states and activate the appropriate OLTP or OLAP
statistics when needed. This allows the optimizer to generate relevant costs with
respect to available system resource plans.
When Oracle generates system statistics, it analyzes system activity in a specified
period of time. Unlike table, index, or column statistics, Oracle does not invalidate
already parsed SQL statements when system statistics get updated. All new SQL
statements are parsed using new statistics. Oracle Corporation highly recommends
that you gather system statistics.
The DBMS_STATS.GATHER_SYSTEM_STATS routine collects system statistics
in a user-defined timeframe. You can also set system statistics values explicitly
using DBMS_STATS.SET_SYSTEM_STATS. Use
DBMS_STATS.GET_SYSTEM_STATS to verify system statistics.

Gathering System Statistics Example
• During the first day
EXECUTE DBMS_STATS.GATHER_SYSTEM_STATS(
  interval => 120, stattab => 'mystats', statid => 'OLTP');
• During the first night
EXECUTE DBMS_STATS.GATHER_SYSTEM_STATS(
  interval => 120, stattab => 'mystats', statid => 'OLAP');
• During subsequent days
EXECUTE DBMS_STATS.IMPORT_SYSTEM_STATS(
  stattab => 'mystats', statid => 'OLTP');
• During subsequent nights
EXECUTE DBMS_STATS.IMPORT_SYSTEM_STATS(
  stattab => 'mystats', statid => 'OLAP');
Gathering System Statistics Example

The above example shows database applications processing OLTP transactions
during the day and running reports at night.
First, system statistics must be collected during the day. Here, gathering ends
after 120 minutes and the statistics are stored in the mystats table.
Then, system statistics are collected during the night. Gathering again ends after
120 minutes and the statistics are stored in the mystats table.
Generally, you will use the above syntax to gather the system statistics. In that
case, before invoking the GATHER_SYSTEM_STATS procedure with the INTERVAL
parameter specified, you must activate job processes using a command
like: alter system set job_queue_processes = 1;
Alternatively, you can invoke the same procedure with different arguments to
gather statistics manually instead of using jobs. For syntax information refer to
the Oracle9i Supplied PL/SQL Packages Reference Release 9.0.0 – BETA December
2000 Part No. A86815-01.
If appropriate, you can switch between the sets of statistics gathered. You can
automate this process by submitting jobs that update the dictionary with the
appropriate statistics: during the day, one job imports the OLTP statistics for the
daytime run, and during the night, another job imports the OLAP statistics for the
nighttime run.

Summary Management Changes for
Oracle9i
• Explain Materialized View
• Explain Query Rewrite
• More rewrite with fewer and smaller MVs
• Fast Refresh after SQL DML on all types of
materialized views
• Summary Advisor, new workloads and no
dimensions needed
Summary Management Changes for Oracle9i

The rest of this lesson covers the Oracle9i new features relating to
Summary Management:
• Users can now explain Materialized Views (MVs) to better understand what
kind of rewrite techniques Oracle can use. This feature also lets the user
understand, for example, why a potential MV cannot be fast refreshed
• Users can now explain query rewrite to better understand why a
query rewrite did not occur
• Many new rewrite techniques have been introduced in Oracle9i, especially
MV filtering or containment
• Oracle9i now supports fast refresh of MVs in many scenarios that were
previously unsupported
• Oracle9i dramatically improves the Summary Advisor by allowing user-defined
and SQL cache workloads to be loaded into the system. Also, the analyzed tables
and MVs no longer need to have dimensions and constraints defined on them.

Explain Materialized View
• New procedure DBMS_MVIEW.EXPLAIN_MV
accepts a:
– Materialized View name OR
– SQL statement
• Advises what is/is not possible with this MV or
potential MV before you create it
• Results are stored in MV_CAPABILITIES_TABLE
(relational table) or in a VARRAY
• utlxmv.sql must be executed in the current
schema to create MV_CAPABILITIES_TABLE
5-16
Explain Materialized View

With Oracle8i, it is very difficult for a user to understand why fast refresh is not
possible for a particular Materialized View (MV). The purpose of the “explain
materialized view” procedure, available with Oracle9i, is to advise what is and is
not possible with a given MV or potential MV. This procedure answers questions
such as:
• Is this materialized view fast refreshable?
• What types of query rewrite can Oracle perform with this MV?
The process for using this procedure is very simple. DBMS_MVIEW.EXPLAIN_MV
is called with the schema and MV name of an existing MV as parameters.
Alternatively, you can specify the select string for a potential MV. The MV or
potential MV is then analyzed and the results are written either into a table called
MV_CAPABILITIES_TABLE, which is the default, or into a VARRAY of type
ExplainMVArrayType called MSG_ARRAY.
Note: Unless you are only concerned with VARRAYs, you must run the
utlxmv.sql script prior to calling EXPLAIN_MV. The script is found in the
admin directory. It must be run in the current schema, because that is where
MV_CAPABILITIES_TABLE must exist.

Explain Materialized View Example
EXECUTE dbms_mview.explain_mv ('123', 'SH', 'sales_sum');
SELECT capability_name, possible, related_text, msgtxt
FROM mv_capabilities_table
WHERE statement_id = '123' ORDER BY seq;
CAPABILITY_NAME           P REL_TEXT MSGTXT
------------------------- - -------- ---------------------------
PCT                       N
REFRESH_COMPLETE          Y
REFRESH_FAST              N
REWRITE                   Y
PCT_TABLE                 N SALES    no partition key or PMARKER
                                     in select list
REFRESH_FAST_AFTER_INSERT N TIMES    mv log must have new values
REWRITE_GENERAL           Y
…
5-17
Explain Materialized View Example

In the above example, we assume that the MV_CAPABILITIES_TABLE has
already been created.
Here we want to analyze an existing MV called sales_sum, created in
the SH schema. We need to assign an id to this analysis so that we can retrieve it
afterwards from the MV_CAPABILITIES_TABLE table. We also need to use the
SEQ column in an ORDER BY clause so that the rows display in a logical order.
If a capability is not possible, N appears in the P column and an explanation
appears in the MSGTXT column. If a capability is not possible for more than one
reason, a row is displayed for each reason.
In the above example, we can see that Partition Change Tracking is not possible
because the MV’s select list contains neither the partition key of the SALES table
nor a partition marker (this should become clearer at the end of this lesson).
We can also see that, because the MV log of the TIMES table does not include
new values, Oracle cannot fast refresh this MV.

Explain Query Rewrite
• Goal is to explain why a query did not rewrite
• Use procedure DBMS_MVIEW.EXPLAIN_REWRITE
• Gives a reason why rewrite was not possible
• Parameters
– SQL statement OR
– Materialized View name
• Results are stored in REWRITE_TABLE
(relational table) or in a VARRAY
• utlxrw.sql must be executed in the current
schema to create REWRITE_TABLE
5-18
Explain Query Rewrite

With Oracle8i, users of query rewrite have no tools to find out why
query rewrite failed to occur. They also have no way to find out why the
query rewrite algorithm did not choose a particular materialized view, even
though it may appear to be the right choice. The rules governing query rewrite
eligibility are quite complex, involving various factors such as constraints,
dimensions, query rewrite integrity modes, freshness of the materialized views,
and the types of queries themselves. In addition, you may want to know why
query rewrite chose a particular materialized view instead of another.
To help with this matter, Oracle9i provides a PL/SQL procedure
(DBMS_MVIEW.EXPLAIN_REWRITE) to advise you when a query can be
rewritten and, if not, why not. Using the results from
DBMS_MVIEW.EXPLAIN_REWRITE, you can take the appropriate action
needed to make a query rewrite if at all possible.
Once again, you can obtain the output from
DBMS_MVIEW.EXPLAIN_REWRITE in two ways:
• A table named REWRITE_TABLE created by executing the Oracle-
supplied script utlxrw.sql
• A VARRAY called OUTPUT_ARRAY of type RewriteArrayType
Note: The query specified in the EXPLAIN_REWRITE statement is never
actually executed.

Explain Query Rewrite Example
BEGIN
qrytext:='SELECT cust_last_name, SUM(amount_sold)
FROM sales s, customers c
WHERE s.cust_id = c.cust_id
GROUP BY cust_last_name';
…
dbms_mview.explain_rewrite (qrytext,'124','sales_mv','SH');
…
END;
SELECT message FROM rewrite_table
WHERE statement_id = '124' ORDER BY sequence;
MESSAGE
------------------------------------
QSM-01001: query rewrite not enabled
5-19
Explain Query Rewrite Example

In this example, the user wants to understand why the query defined in the
qrytext PL/SQL variable is not rewritten using the sales_mv MV. As
shown in the above slide, the reason found by Oracle is that query rewrite is not
enabled. Here, the user provides the EXPLAIN_REWRITE procedure with:
• The query text, which is the query the user wants to get rewritten
• A statement id, used to retrieve Oracle’s analysis from the
REWRITE_TABLE table. In this example, we assume that this table has
already been created.
• The name of the MV that the user wants the query to be rewritten against
• The owner of the MV
Note: The parameters representing the name and schema of the MV
are optional to EXPLAIN_REWRITE. When they are not specified,
EXPLAIN_REWRITE returns any relevant messages regarding all the summaries
considered for rewriting the given query.
On the other hand, if you specify the MV to be used and the query is in fact
rewritten, but not using the specified MV, an error message is generated.

Fast Refresh Enhancements
• Fast Refresh after SQL DMLs is now possible for
MVs that include:
– aggregation and joins
– subqueries in the top-level FROM list
– CUBE, ROLLUP, and GROUPING SETS
• Fast (Incremental) refresh after partition DDL
operations on detail tables uses Partition Change
Tracking
• The following cases need MV logs to be created
with the new SEQUENCE option:
– If MIXED DML has been done to more than
one table
– If a table has had a mixture of direct loads
and MIXED DML other than inserts
5-20
Fast Refresh Enhancements

There are two primary ways to refresh a materialized view: complete and fast.
Complete refresh re-executes the materialized view query, thereby completely re-
computing the contents of the materialized view from the detail tables. Fast refresh
uses a variety of incremental algorithms to update the materialized view to take
into account the new and updated data in the detail tables. Because complete
refresh can take hundreds to thousands of times longer to execute than fast refresh,
many data warehouse environments require fast, incremental refresh in order to
meet their operational objectives.
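The difference between the two refresh styles can be pictured with a toy model. The Python sketch below is only an illustration, not Oracle's refresh algorithm: the summary is assumed to be a simple SUM per city, and `delta_log` is a hypothetical list of signed amounts standing in for the changes recorded since the last refresh.

```python
from collections import defaultdict


def complete_refresh(detail_rows):
    """Recompute the whole summary by scanning every detail row."""
    summary = defaultdict(int)
    for city, amount in detail_rows:
        summary[city] += amount
    return dict(summary)


def fast_refresh(summary, delta_log):
    """Apply only the logged changes; each entry is (city, signed amount)."""
    refreshed = dict(summary)
    for city, delta in delta_log:
        refreshed[city] = refreshed.get(city, 0) + delta
    return refreshed


detail = [("Boston", 100), ("Boston", 50), ("Seattle", 70)]
mv = complete_refresh(detail)

# Activity since the last refresh: one insert and one delete on the detail data.
detail.append(("Seattle", 30))
detail.remove(("Boston", 50))
delta_log = [("Seattle", +30), ("Boston", -50)]

mv_fast = fast_refresh(mv, delta_log)
mv_complete = complete_refresh(detail)
```

Both paths end with the same summary, but the fast path reads two log entries while the complete path rescans all detail rows, which is where the difference in cost comes from on large detail tables.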
Oracle9i supports fast refresh of MVs under the following (previously unsupported)
scenarios:
• After partition maintenance operations on detail tables (DROP
PARTITION, EXCHANGE PARTITION, TRUNCATE PARTITION, ADD
PARTITION, SPLIT PARTITION, MERGE PARTITION). This is
supported by the Partition Change Tracking mechanisms and is discussed in more
detail later in this lesson
• After conventional insert, update and delete for MVs that include both
aggregation and joins
• For MVs that include subqueries in the top level FROM list (i.e. inline
views)
• For MVs that employ the CUBE, ROLLUP and GROUPING SETS options
The keyword SEQUENCE is new for Oracle9i and is used when creating materialized
view logs. If a table has had a mixture of inserts and deletes (or updates) done to
it, then we say that it has had MIXED DML done to it. If MIXED DML has been done
to more than one table, then the MV logs for those tables must have a SEQUENCE
NUMBER column. This column can be added by specifying the SEQUENCE option
of the CREATE MATERIALIZED VIEW LOG command.
Also, if a table has had a mixture of direct loads and conventional DML other than
inserts (i.e. deletes or updates) done to it, then its MV log must have a SEQUENCE
NUMBER column. Otherwise incremental refresh is not supported.
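The two SEQUENCE rules above can be written down as a small decision function. This is a sketch of the rules as stated in these notes, not Oracle code: the operation labels (`insert`, `update`, `delete`, `direct_load`) and the `activity` mapping are names invented for the illustration.

```python
def has_mixed_dml(ops):
    """MIXED DML: inserts together with deletes or updates on one table."""
    return "insert" in ops and bool({"delete", "update"} & ops)


def needs_sequence(activity):
    """Decide whether the MV logs need the SEQUENCE option.

    activity maps a detail table name to the set of operations performed
    on it since the last refresh (hypothetical labels, see lead-in).
    """
    # Rule 1: MIXED DML has been done to more than one table.
    mixed_tables = [t for t, ops in activity.items() if has_mixed_dml(ops)]
    if len(mixed_tables) > 1:
        return True
    # Rule 2: one table mixes direct loads with deletes or updates.
    for ops in activity.values():
        if "direct_load" in ops and {"delete", "update"} & ops:
            return True
    return False


both_mixed = needs_sequence({"sales": {"insert", "delete"},
                             "times": {"insert", "update"}})
one_mixed = needs_sequence({"sales": {"insert", "delete"},
                            "times": {"insert"}})
load_and_update = needs_sequence({"sales": {"direct_load", "update"}})
load_and_insert = needs_sequence({"sales": {"direct_load", "insert"}})
```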

Summary Advisor Enhancements
• New Workload sources stored in MVIEW_WORKLOAD
– User defined: load_workload_user
– SQL cache: load_workload_cache
• Filter Workload : add_filter_item
– Application name
– Date a query was last used
– Number of times a query is executed
– …
• Enhanced recommendation engine:
– Dimension Advisor
– Diverse-schema: Dimensions and Constraints are optional
– Support for exact match and partial exact-match rewrite
• New HTML Reports: advisor_detail_report
• SQL script generation: generate_mv_script
5-22
Summary Advisor Enhancements

One of the problems with having materialized views in your system is that it is
difficult for the DBA to determine which ones to keep and which ones to discard.
To resolve this problem, summary management comes with a Summary Advisor
function which recommends what to do with your materialized views.
The Summary Advisor can be run by calling the PL/SQL procedure
recommend_mv_w or by using the Summary Advisor Wizard in Oracle Enterprise
Manager. Although it can be used standalone, the best results are obtained when a
workload is provided. In Oracle8i the query workload is collected via Oracle Trace.
New for Oracle9i is the ability to define a workload in the following two new
ways:
• User Supplied
• Current contents of the SQL Cache
Multiple workloads can now be loaded into a table in the SYSTEM schema called
MVIEW_WORKLOAD. The workloads are loaded using the following new
DBMS_OLAP procedures: load_workload_trace,
load_workload_user or load_workload_cache. When a workload is
used for recommending materialized views, the entire workload does not have to
be considered. The DBMS_OLAP procedure add_filter_item can be used to
specify filtering options. This enables a workload to be filtered by application
name, cardinality of the base table, date a query was last used, the number of times
a query is executed, the database users executing queries, the priority of a query,
the base tables referenced in a query, and the query response time. The Summary
Advisor in Oracle Enterprise Manager also allows you to specify the filtering
values.
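To make the idea of workload filtering concrete, here is a minimal Python sketch. It does not call DBMS_OLAP and does not reproduce the layout of MVIEW_WORKLOAD; the `WorkloadQuery` fields and the `apply_filter` helper are hypothetical names chosen to mirror three of the filter items listed above (application name, date last used, execution count).

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class WorkloadQuery:
    application: str
    last_used: date
    frequency: int
    sql_text: str


def apply_filter(workload, application=None, used_since=None, min_frequency=None):
    """Keep only the entries that satisfy every filter item that was supplied."""
    kept = []
    for query in workload:
        if application is not None and query.application != application:
            continue
        if used_since is not None and query.last_used < used_since:
            continue
        if min_frequency is not None and query.frequency < min_frequency:
            continue
        kept.append(query)
    return kept


workload = [
    WorkloadQuery("reports", date(2001, 3, 1), 40, "SELECT ... sales by city"),
    WorkloadQuery("reports", date(2000, 6, 1), 2, "SELECT ... sales by day"),
    WorkloadQuery("adhoc", date(2001, 3, 5), 90, "SELECT ... sales by store"),
]

filtered = apply_filter(workload, application="reports",
                        used_since=date(2001, 1, 1), min_frequency=5)
```

Only the filtered entries would then be considered when recommendations are generated, which keeps rarely used or obsolete queries from influencing them.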

Summary Advisor Enhancements (Continued)
When it is time to use the procedure recommend_mv_w to generate
recommendations on the materialized views, you may specify which one of the
workloads is to be used and, optionally, whether it is to be filtered. Because
multiple workloads can now be stored, the DBA can perform ‘what-if’ modeling
on various scenarios.
A new Dimension Advisor analyzes a user’s workload history, metadata and
database table contents and generates recommendations for dimensions.
Other enhancements to the Advisor engine include:
• Diverse-schema support: dimensional information and constraint information
are made optional
• Support for exact match and partial exact-match summary: in Oracle8i, only
structural queries are supported. In Oracle9i, the Advisor matches the rewrite
engine’s support for exact and partial exact match.
A detailed HTML report about the recommendations from the Summary Advisor
can now be generated by using the DBMS_OLAP procedure
advisor_detail_report. An alternative to the recommend_mv_w procedure is to use
the Summary Advisor in Oracle Enterprise Manager, which not only shows the
recommendations from recommend_mv_w but can also implement
them. When the procedures recommend_mv and recommend_mv_w are called
directly, the DBMS_OLAP procedure generate_mv_script must be used to
create a script which will implement the advisor’s recommendations.

New DBMS_OLAP Procedures
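[Diagram: workloads from the SQL Cache, User Defined tables, and Oracle Trace are
checked by the validate_workload procedures and loaded by the load_workload
procedures into the MVIEW_WORKLOAD table (cleared with purge_workload).
add_filter_item and purge_filter maintain the MVIEW_FILTER table.
recommend_mv_w fills MVIEW_RECOMMENDATIONS (cleared with purge_results), from
which advisor_detail_report produces a report and generate_mv_script
produces a script.]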
5-24
New DBMS_OLAP Procedures

• MVIEW_WORKLOAD: This table is essentially the Advisor Workload
Repository (AWR). The Advisor performs best when a workload based on usage is
available. The AWR is capable of storing multiple workloads, so that the different
uses of a real-world data warehousing environment can be viewed over a long
period of time. When the workload is loaded using the appropriate load_workload
procedure, it is stored in a new workload repository in the SYSTEM schema called
MVIEW_WORKLOAD.
• PURGE_WORKLOAD: A specific workload can be removed by calling this
procedure.
• LOAD_WORKLOAD_USER: A user-defined workload is loaded using the
procedure LOAD_WORKLOAD_USER. The actual workload is defined in
a separate table called USER_WORKLOAD. This table must be explicitly created
beforehand in the user schema. The user must also explicitly insert the
corresponding workload.
• LOAD_WORKLOAD_TRACE: A workload collected by Oracle Trace is loaded
using the procedure LOAD_WORKLOAD_TRACE. When collection is complete,
Oracle Trace automatically formats the Oracle Trace log file into a set of relations,
which have predefined synonyms beginning with V_ (most of the time these
tables are referred to as V-Tables).
• LOAD_WORKLOAD_CACHE: You obtain a SQL cache workload using the
procedure LOAD_WORKLOAD_CACHE. This procedure captures the current
workload in the SQL cache.
• VALIDATE_WORKLOAD_…: Prior to loading a workload, one of the three
VALIDATE_WORKLOAD procedures: VALIDATE_WORKLOAD_USER,
VALIDATE_WORKLOAD_CACHE, and VALIDATE_WORKLOAD_TRACE may be
called to check that the workload exists. This procedure does not check that the
contents of the workload are valid, it merely checks that the workload exists.
New DBMS_OLAP Procedures (Continued)
• ADD_FILTER_ITEM: The entire contents of a workload do not have to be
used during the recommendation process. Any workload can be filtered by creating
a filter item using the procedure ADD_FILTER_ITEM. The supported Advisor
Filter Items have already been discussed in the previous slide.
• PURGE_FILTER: A filter can be removed at any time by calling the procedure
PURGE_FILTER. This removes the corresponding filter from the Filter Advisor
Repository table: MVIEW_FILTER.
• GENERATE_MV_SCRIPT: When the Summary Advisor is run using Oracle
Enterprise Manager, a facility is provided to implement the advisor’s
recommendations. But when the procedures recommend_mv and
recommend_mv_w are called directly, the procedure generate_mv_script
must be used to create a script which will implement the advisor’s
recommendations.
• ADVISOR_DETAIL_REPORT: A Summary Data Report offers you data about
workloads and filters, and then generates recommendations. The report is in
HTML format and contains the following information: Activity Journal Details,
Activity Log Details, Materialized View Recommendations, Materialized View
Usage, Workload Collection Details, Workload Filter Details, …
• PURGE_RESULTS: Every time the Summary Advisor is run, a new set of
recommendations is created. When they are no longer required, they should be
removed using the procedure PURGE_RESULTS. You can remove all results or
those for a specific run.
Note: This is not an exhaustive list of all the new procedures and
related tables introduced in the Oracle9i Summary Advisor, but it should give you
a good idea of the new possibilities. Note also that it is possible to use previously
declared filters before loading workloads.

Containment and Oracle8i: Exact Match
Application Query
SELECT s.store_name, SUM(f.amount) as sum_sales
FROM sales f, store s
WHERE s.store_key = f.store_key
AND s.city = 'Boston'
GROUP BY s.store_name;
Rewritten Query
SELECT m.store_name, m.sum_sales
FROM sales_boston m;
sales_boston Materialized View Query
SELECT s.store_name, SUM(f.amount) as sum_sales
FROM store s, sales f
WHERE s.store_key = f.store_key AND
s.city = 'Boston'
GROUP BY s.store_name;
5-26
Containment and Oracle8i: Exact Match

The query rewrite feature must be able to handle common queries used by data warehousing
applications. Several key transformations were implemented in Oracle8i.
A major feature we did not support in Oracle8i was the ability to rewrite a query with materialized
views (MVs) that have non-join predicates in the WHERE clause. Such predicates, also known as
selections in relational algebra, select the tuples from a table that satisfy the condition(s) specified in the
predicate. Rewrite with such materialized views in Oracle8i is limited to the case where the text of the
MV and the query match exactly.
Selections in MVs are commonly used to restrict the data stored in the MV to a particular subset. For
instance, an MV may need to store data for a particular region only. All queries for that region may then
be rewritten against that MV.
The above slide shows the only kind of rewrite that was possible in Oracle8i when the materialized view
contained, in its WHERE clause, a predicate other than the join predicates.
Note: In the above slide and in the following ones, the blue rectangle (Materialized View Query)
represents a materialized view definition. For formatting reasons, it was not possible to include the
CREATE MATERIALIZED VIEW part of the statement. We concentrate only on the SELECT part of the
statement, which is in fact the most important part in this series of examples. The gray rectangle
(Application Query) corresponds to the query executed by a hypothetical application. Finally, the orange
rectangle corresponds to the rewritten query. This is never visible to the user, as it is generated
internally by Oracle, but it gives you a good idea of what happens during the rewriting process in those
examples.

Containment and Oracle9i: Example 1
Application Query
SELECT s.city, SUM(f.amount)
FROM store s, sales f
WHERE s.store_key = f.store_key AND
s.store_strength > 75 AND
s.store_strength < 90
GROUP BY s.city;
Rewritten Query
SELECT city, SUM(sum_sales)
FROM total_sales
WHERE store_strength > 75
AND store_strength < 90
GROUP BY city;
total_sales Materialized View Query
SELECT s.city, s.store_strength, SUM(f.amount)
FROM store s, sales f
WHERE s.store_key = f.store_key AND
s.store_strength > 50 AND s.store_strength < 100
GROUP BY s.city, s.store_strength;
5-27

With Oracle9i, it is possible to add non-join predicates in the WHERE clause of the materialized view’s
definition. Here the materialized view contains information for sales by city and store strength for stores
with more than 50 employees and fewer than 100 employees.
The application query asks for sales by city for stores with more than 75 employees and fewer than 90
employees. Because the rows requested by the application query are a subset of the rows in the
materialized view, Oracle can rewrite the application query to use the above materialized view.
Note:
• Here we did not use the AS clause in the select lists, to avoid formatting problems. Thus, we
assume that the SUM(f.amount) column in the MV’s definition is called SUM_SALES.
• The examples described here for containment in the WHERE clause may also be applied
when a materialized view (and query) has a selection in the HAVING clause.
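The equivalence behind this rewrite can be checked on a small data set. The sketch below uses Python's built-in sqlite3 module rather than Oracle, so the materialized view is only simulated by an ordinary table built with CREATE TABLE ... AS SELECT, and nothing is rewritten automatically; the sample rows are invented. It simply runs the application query and the rewritten query and compares the answers.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE store (store_key INTEGER, city TEXT, store_strength INTEGER);
CREATE TABLE sales (store_key INTEGER, amount INTEGER);
INSERT INTO store VALUES (1, 'Boston', 60), (2, 'Boston', 80),
                         (3, 'Seattle', 85), (4, 'Seattle', 95),
                         (5, 'Austin', 40);
INSERT INTO sales VALUES (1, 10), (2, 20), (2, 5), (3, 7), (4, 11), (5, 99);

-- Stand-in for the total_sales MV: stores with 50 < store_strength < 100.
CREATE TABLE total_sales AS
  SELECT s.city AS city, s.store_strength AS store_strength,
         SUM(f.amount) AS sum_sales
  FROM store s, sales f
  WHERE s.store_key = f.store_key
    AND s.store_strength > 50 AND s.store_strength < 100
  GROUP BY s.city, s.store_strength;
""")

application_query = """
SELECT s.city, SUM(f.amount)
FROM store s, sales f
WHERE s.store_key = f.store_key
  AND s.store_strength > 75 AND s.store_strength < 90
GROUP BY s.city ORDER BY s.city"""

rewritten_query = """
SELECT city, SUM(sum_sales)
FROM total_sales
WHERE store_strength > 75 AND store_strength < 90
GROUP BY city ORDER BY city"""

original = con.execute(application_query).fetchall()
rewritten = con.execute(rewritten_query).fetchall()
```

The two result lists agree because the range 75 to 90 lies inside the range 50 to 100 kept by the summary, and because store_strength is available in the summary to re-apply the narrower filter.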

Application Query
SELECT s.city, s.store_strength, SUM(f.amount)
FROM store s, sales f
WHERE s.store_key = f.store_key
AND s.store_strength between 80 and 85
GROUP BY s.city, s.store_strength;
Rewritten Query
SELECT city, store_strength, SUM(sum_sales)
FROM total_sales
WHERE store_strength between 80 and 85
GROUP BY city, store_strength;
total_sales Materialized View Query
SELECT s.city, s.store_strength, SUM(f.amount)
FROM store s, sales f
WHERE s.store_key = f.store_key AND
s.store_strength > 50 AND s.store_strength < 100
GROUP BY s.city, s.store_strength;
5-28

Here the query asks for sales for stores with 'between 80 and 85' employees. Oracle is able to
realize that the 'between' condition represents a subset of the materialized view rows. Thus, Oracle can
rewrite the application query using the same technique as for the previous example.
Note: Here we didn’t use the AS clause in the select lists to avoid formatting problems. Thus, we
assume that SUM(f.amount) in the MV’s definition is called SUM_SALES.

Application Query
SELECT COUNT(f.amount)
FROM sales f, product p
WHERE f.product_key = p.product_key
AND p.color in ('red', 'green');
Rewritten Query
SELECT SUM(m.cnt_sales)
FROM mv_car_color m
WHERE m.color in ('red', 'green');
mv_car_color Materialized View Query
SELECT p.color, COUNT(f.amount) as cnt_sales
FROM sales f, product p
WHERE f.product_key = p.product_key AND
p.color in ('red', 'green', 'blue')
GROUP BY p.color;
5-29

The above example illustrates rewrite with the 'IN' operator. Rewrite can occur as long as the in-list
values requested by the query are also in the in-list values of the materialized view.
In the above example, the application query asks for the count of sales for red and green cars. This query
can be rewritten by using only two rows of the above materialized view, the row corresponding to the
'red' group and the row corresponding to the 'green' group, and summing the cnt_sales
column for the two rows.
Note: The 'IN' operator is equivalent to a series of predicates separated by ORs.
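The in-list rule amounts to a subset test followed by a sum over the matching summary rows. The following Python sketch shows that reasoning only; it is not how Oracle implements rewrite, and the function names and the counts in `mv_rows` are made up for the illustration.

```python
def in_list_contained(query_values, mv_values):
    """True when every value the query asks for is also in the MV's in-list."""
    return set(query_values) <= set(mv_values)


def rewrite_count(query_values, mv_in_list, mv_rows):
    """Answer the COUNT from the summary rows, or return None if the
    query's in-list is not contained in the MV's in-list.

    mv_rows maps color -> cnt_sales, one entry per group stored in the MV.
    """
    if not in_list_contained(query_values, mv_in_list):
        return None
    return sum(mv_rows.get(color, 0) for color in set(query_values))


mv_in_list = ("red", "green", "blue")
mv_rows = {"red": 12, "green": 7, "blue": 4}  # invented counts

red_green = rewrite_count(("red", "green"), mv_in_list, mv_rows)
red_yellow = rewrite_count(("red", "yellow"), mv_in_list, mv_rows)
```

The second call returns None because 'yellow' rows were never stored in the summary, so the query would have to go back to the detail tables.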

Application Query
SELECT p.category_id, SUM(f.amount)
FROM sales f, product p
WHERE f.product_key = p.product_key AND
p.category_id LIKE 'ABCD%'
GROUP BY p.category_id;
Rewritten Query
SELECT category_id, sales
FROM mvproducts
WHERE category_id LIKE 'ABCD%';
mvproducts Materialized View Query
SELECT p.category_id, SUM(f.amount) as sales
FROM sales f, product p
WHERE f.product_key = p.product_key
AND p.category_id LIKE 'AB%'
GROUP BY p.category_id;
5-30

This example shows that it is possible for Oracle to rewrite a query containing the ‘LIKE’ operator.
Here, Oracle can rewrite the query because the application query’s predicate, ‘ABCD%’, selects a subset
of the rows selected by the condition ‘AB%’ in the materialized view definition.
Note: For now, only patterns like the above are supported (‘ABCD%’). Some support is also provided for
‘_’.
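For patterns made of a literal prefix followed by '%', containment reduces to a prefix test: every string that matches the query pattern must also match the MV pattern. The Python sketch below illustrates that test under stated simplifications; it ignores '_' and escape characters entirely, and the helper names are hypothetical.

```python
def literal_prefix(pattern):
    """Return the prefix of a pattern shaped like 'AB%'.

    Returns None for any other shape (leading or embedded wildcards,
    or '_'), which this sketch does not try to handle.
    """
    if pattern.endswith("%") and "%" not in pattern[:-1] and "_" not in pattern:
        return pattern[:-1]
    return None


def like_contained(query_pattern, mv_pattern):
    """True when rows matching query_pattern are a subset of rows
    matching mv_pattern."""
    query_prefix = literal_prefix(query_pattern)
    mv_prefix = literal_prefix(mv_pattern)
    if query_prefix is None or mv_prefix is None:
        return False  # unknown shape: stay conservative, no rewrite
    return query_prefix.startswith(mv_prefix)


narrower = like_contained("ABCD%", "AB%")
wider = like_contained("A%", "AB%")
unsupported = like_contained("%CD", "AB%")
```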

Application Query
SELECT s.city, SUM(f.amount)
FROM store s, sales f
WHERE s.store_key = f.store_key
AND s.store_strength between 50 and 100
GROUP BY s.city;
Rewritten Query
SELECT city, SUM(sum_sales)
FROM mv_non_null
WHERE store_strength between 50 and 100
GROUP BY city;
mv_non_null Materialized View Query
SELECT s.city, SUM(f.amount) as sum_sales,
s.store_strength
FROM store s, sales f
WHERE s.store_key = f.store_key AND
s.store_strength IS NOT NULL
GROUP BY s.city, s.store_strength;
5-31

A materialized view may filter out rows with NULL columns using the IS NOT NULL operator. Such a
materialized view can be used to rewrite a query asking for any non-null range of the column values.
This is illustrated by the above example where we consider the MV_NON_NULL materialized view
which filters out rows where store_strength is NULL.
Note: In real-life, a store with a NULL store_strength may represent an online store.

Application Query
SELECT s.city, SUM(f.amount)
FROM store s, sales f
WHERE s.store_key = f.store_key
AND (s.city_tax + s.regional_tax + s.sales_tax) > 5
GROUP BY s.city;
Rewritten Query
SELECT city, SUM(sum_sales)
FROM sales_hightax
WHERE total_tax > 5
GROUP BY city;
sales_hightax Materialized View Query
SELECT s.city, s.city_tax + s.regional_tax + s.sales_tax,
SUM(f.amount) as sum_sales
FROM store s, sales f
WHERE s.store_key = f.store_key
AND (s.city_tax + s.regional_tax + s.sales_tax) > 2
GROUP BY s.city, s.city_tax + s.regional_tax + s.sales_tax;
5-32

The rule for rewrite in this case is that the expression (in purple in the above slide) in the MV and the
query must match exactly. However, the terms in the expression can be rearranged. For example, due to
the commutative property of +, expression (A+B) is equivalent to (B+A). The predicate can also be
composed of two complex expressions (instead of just one as in the above example).
Also, the range of values in the query must lie within the range of values in the materialized view.
The above query contains a complex predicate on (city_tax + regional_tax + sales_tax). It
computes the sales in cities where the total tax on items is greater than 5 percent.
Note:
• Here, the expression (city_tax + regional_tax + sales_tax) in the MV’s
definition should be given the alias total_tax, but this was not done for formatting reasons.
• Predicates with subqueries or complex predicates with bind variables are not supported for
query rewrite in this release.
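The two conditions above (same expression up to reordering of terms, and query range inside the MV range) can be sketched in a few lines of Python. This is a simplified illustration, not Oracle's matching logic: it only handles sums of column names compared with a lower bound, and both function names are hypothetical.

```python
def canonical_sum(expression):
    """Normalise 'a + b + c' so that the order of the terms does not matter."""
    return tuple(sorted(term.strip() for term in expression.split("+")))


def can_rewrite(query_expr, query_lower, mv_expr, mv_lower):
    """Both predicates are assumed to have the form <sum of columns> > <bound>."""
    same_expression = canonical_sum(query_expr) == canonical_sum(mv_expr)
    # The query's range (> query_lower) must lie within the MV's (> mv_lower).
    return same_expression and query_lower >= mv_lower


mv_expr = "s.city_tax + s.regional_tax + s.sales_tax"

reordered = can_rewrite("s.sales_tax + s.city_tax + s.regional_tax", 5, mv_expr, 2)
too_wide = can_rewrite("s.city_tax + s.regional_tax + s.sales_tax", 1, mv_expr, 2)
different = can_rewrite("s.city_tax + s.sales_tax", 5, mv_expr, 2)
```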

Application Query
SELECT COUNT(f.amount)
FROM sales f, product p
WHERE f.product_key = p.product_key
AND (p.color = 'red' or
p.color = 'green')
GROUP BY p.color;
Rewritten Query
SELECT SUM(cnt_sales)
FROM mv_car_color
WHERE color = 'red'
or color = 'green';
mv_car_color Materialized View Query
SELECT p.color, COUNT(f.amount) as cnt_sales
FROM sales f, product p
WHERE f.product_key = p.product_key
AND ((p.color = 'red') OR (p.color = 'green')
OR (p.color = 'blue'))
GROUP BY p.color;
5-33

The above materialized view stores the number of cars sold by color for red, green and blue cars. The
above application query asks for the count of red cars and green cars only. As this is a subset of the
materialized view, Oracle can rewrite the application query.
This rule extends the rewrite rules described in the previous slides to predicates connected using ORs.
We assume that the <selection predicates> in the where clause of the materialized view and
query are of the following form.
( <predicate-1> AND <predicate-2> …) OR (<predicate-3> AND
<predicate-4>…) OR …
We call each group of predicates separated by ORs, a disjunct. For example, (<predicate-1>
AND <predicate-2> …) is a disjunct.
1. Notice that the previous slide specifies the containment rule when the selection predicate is a single
disjunct. Each such disjunct in the query must be contained in some disjunct in the materialized view.
2. The materialized view may have additional disjuncts not containing any disjunct in the query.
3. In-lists can also be treated as a series of simple predicates connected by ORs. For instance, “city
in ('Boston', 'Seattle')” can also be represented as (city = 'Boston') OR (city
= 'Seattle') . Basically, this example is a generalization of example 3.
Note: All the previous examples give you one possibility Oracle has to rewrite a query. You can of
course combine all the previous examples to create your own MVs. Depending on the application query,
Oracle is able to rewrite it by using one or many of the rules presented so far at the same time.
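The disjunct rule can be sketched as a pair of nested checks. In the Python illustration below, each disjunct is modelled as a dictionary mapping a column to an inclusive (low, high) range, with an equality written as a range whose two ends are equal. This representation and the function names are assumptions made for the sketch, and the test is deliberately conservative: it may refuse a rewrite that a smarter analysis would accept.

```python
def disjunct_contained(query_disjunct, mv_disjunct):
    """A query disjunct (an AND of ranges) is contained in an MV disjunct if,
    for every column the MV restricts, the query's range is inside it."""
    for column, (mv_low, mv_high) in mv_disjunct.items():
        if column not in query_disjunct:
            return False  # query is unrestricted where the MV is restricted
        query_low, query_high = query_disjunct[column]
        if query_low < mv_low or query_high > mv_high:
            return False
    return True


def selection_contained(query_disjuncts, mv_disjuncts):
    """Each disjunct of the query must be contained in some MV disjunct."""
    return all(
        any(disjunct_contained(q, m) for m in mv_disjuncts)
        for q in query_disjuncts
    )


def equals(value):
    return (value, value)


mv = [{"color": equals("red")}, {"color": equals("green")}, {"color": equals("blue")}]
red_or_green = [{"color": equals("red")}, {"color": equals("green")}]
red_or_yellow = [{"color": equals("red")}, {"color": equals("yellow")}]

ok = selection_contained(red_or_green, mv)
not_ok = selection_contained(red_or_yellow, mv)
```

The extra 'blue' disjunct in the MV does not matter, which corresponds to point 2 above: the MV may hold disjuncts that no disjunct of the query uses.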

Exact Text Match and Literal Replacement
in Oracle9i
• In Oracle8i release 2 or greater, when
CURSOR_SHARING was set to FORCE, queries
using literals could not find an exact match of an
existing MV, therefore preventing the query from
being rewritten
• Oracle9i looks at the query and the literals that are
being substituted with bind variables to determine
if the query can be rewritten using a MV
5-34
Exact Text Match and Literal Replacement

In Oracle8i release 2, via the CURSOR_SHARING initialization parameter, it is possible to replace
literals in incoming queries with system-generated bind variables. Thus two queries that are similar
except for the values of literals will look identical within the server and will be represented by a single
shared cursor. This reduces the number of times we need a hard parse of similar queries.
This feature breaks query rewrite with exact text match: a materialized view does not have any bind
variables. The query may originally have no bind variables either, but after literal replacement it will
contain bind variables. Because literal replacement is done well before query rewrite, the text of the
query as it appears to query rewrite no longer matches any materialized view. Further, two queries with
different literal values, which appear identical after literal replacement, must be differentiated within
the exact text match algorithm so that Oracle does not produce incorrect rewrites.
In Oracle9i, query rewrite supports exact text match in the presence of literal replacement.
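The problem and its remedy can be pictured with a small Python sketch. It is not Oracle's implementation: the regular expression only recognises quoted strings and plain numbers, and the bind names `:B1`, `:B2`, ... are placeholders invented here rather than the names the server generates. The idea shown is that a text match after literal replacement must compare the removed literals as well as the remaining text.

```python
import re

# Quoted string literals (with '' as an escaped quote) or plain numbers.
LITERAL = re.compile(r"'(?:[^']|'')*'|\b\d+(?:\.\d+)?\b")


def replace_literals(sql):
    """Return (text with literals replaced by binds, list of the literals)."""
    literals = []

    def substitute(match):
        literals.append(match.group(0))
        return f":B{len(literals)}"

    return LITERAL.sub(substitute, sql), literals


def exact_text_match(query_sql, mv_sql):
    """Match only if both the shared text and the literal values agree."""
    return replace_literals(query_sql) == replace_literals(mv_sql)


mv_sql = "SELECT SUM(amount) FROM sales WHERE city = 'Boston'"
q_boston = "SELECT SUM(amount) FROM sales WHERE city = 'Boston'"
q_seattle = "SELECT SUM(amount) FROM sales WHERE city = 'Seattle'"

shared_boston, _ = replace_literals(q_boston)
shared_seattle, _ = replace_literals(q_seattle)
```

Here `shared_boston` and `shared_seattle` are the same string, which is why the two queries can share a cursor, yet only the Boston query may be answered from a summary defined for Boston.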

Partition-Level Staleness and Query Rewrite
• In Oracle8i, staleness information about an MV is
maintained as a whole:
– If a specific partition of the detail table is
updated, the entire MV is marked stale
• Oracle9i enhances the meta-data to maintain
staleness at a finer granularity:
– Only specific sections of the materialized view
are considered stale
– Query rewrite can use an MV in ENFORCED (or
TRUSTED) mode for rows considered fresh only
– MV’s fresh rows identified by adding selection
predicates to the MV’s definition
– MV’s fresh rows identified by using partition
markers in the MV’s definition
®
5-35
Partition-Level Staleness and Query Rewrite

Data warehouses commonly use partitioning with MVs. In Oracle8i, staleness information about an
MV is maintained as a whole. If a specific partition of the detail table is updated, the entire
MV is marked stale and is therefore ineligible for rewrite at the ENFORCED rewrite integrity level
(and at the TRUSTED level, unless the CONSIDER FRESH command is used).
In Oracle9i, Oracle uses metadata to maintain staleness at a finer granularity. When a certain
partition of the detail table is updated, only specific sections of the materialized view are marked
stale. This feature is often called Partition Change Tracking.
The MV must contain information that identifies the partition of the detail table corresponding to a
particular row or group of the materialized view.
The simplest scenario is when the partitioning key of the table is available in the select (and group
by) list of the MV.
Query rewrite can use an MV in ENFORCED (or TRUSTED) mode provided that the rows from the
MV used to answer the query are known to be fresh.
The fresh rows in the materialized view are identified by adding selection predicates to the
materialized view’s where clause (in the data dictionary). Oracle rewrites a query with this MV if
the query’s answer is contained within this (restricted) MV.
Instead of the partitioning key, a partition marker (a function that identifies the partition given a
rowid) may be present in the select (and group by) list of the MV. Oracle can use the materialized
view to rewrite queries that require data from only certain partitions (identifiable by the partition
marker), for instance, queries that reference a partition-extended table name or queries that have a
predicate specifying ranges of the partitioning keys that contain entire partitions.

Partition-Level Staleness and Query Rewrite (Continued)
One big advantage of using the partition marker function instead of the partition
key is that it enables Enhanced Update Tracking on the MV’s base tables with
significantly less cardinality impact than grouping by the respective partition key
columns.
Note: Partition Change Tracking applies only to tables that use range partitioning or
composite partitioning with a single key. Also, the materialized view must contain the detail
table’s partition key column (or a partition marker) in its SELECT list and GROUP BY clause
(if present). This feature is also used during MV refreshes.

Partition-Level Staleness Example
CREATE TABLE sales (date_entered DATE, … )
PARTITION BY RANGE (date_entered)
( PARTITION fact_part1 VALUES LESS THAN ('1-APR-1999'),
  PARTITION fact_part2 VALUES LESS THAN ('1-JUL-1999'),
  PARTITION fact_part3 VALUES LESS THAN ('1-OCT-1999'),
  PARTITION fact_part4 VALUES LESS THAN ('1-JAN-2000'),
  … );
CREATE TABLE store (store_id NUMBER,
                    region_id NUMBER, … )
PARTITION BY RANGE (store_id)
( PARTITION store_part1 VALUES LESS THAN (1000),
  PARTITION store_part2 VALUES LESS THAN (2000),
  PARTITION store_part3 VALUES LESS THAN (MAXVALUE));
5-37

This example creates two partitioned tables named SALES and STORE.

1. Create the STORE_ID_SALES MV (only the SELECT part of the
CREATE MATERIALIZED VIEW command)
SELECT s.store_id, SUM(f.amount) sum_sales
FROM store s, sales f
WHERE s.store_key = f.store_key
GROUP BY s.store_id;
2. Insert data in the second partition of the
STORE table (1000<=STORE_ID<2000)
3. The MV’s definition reflects the fresh rows
SELECT s.store_id, SUM(f.amount) sum_sales
FROM store s, sales f
WHERE s.store_key = f.store_key AND
( ( s.store_id < 1000 ) OR (s.store_id >= 2000))
GROUP BY s.store_id;
5-38
Partition-Level Staleness Example (Continued)

1. Create a materialized view named STORE_ID_SALES that references the two tables. Note that
STORE_ID_SALES includes the partition key column from table STORE (STORE_ID) in both its
select and group by lists. This enables Enhanced Update Tracking on table STORE for materialized
view STORE_ID_SALES. Note also that, for formatting reasons, only the SELECT part of the CREATE
MATERIALIZED VIEW command is shown.
2. Suppose new data for STORE is inserted into the second partition of the STORE detail table. With
Oracle8i, until a refresh is done, the MV is stale and cannot be used for rewrite in ENFORCED mode.
3. With Oracle9i, after the insert is done, Oracle modifies the MV’s definition in the data
dictionary to reflect the fresh rows. These rows can be represented by modifying the MV’s defining
query as shown in step 3 of the above slide.
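Because the slide shows only the SELECT part, a complete statement might look like the following sketch. The BUILD and REFRESH options are assumptions chosen for illustration and are not taken from the slide. The second statement reads the overall staleness state from the USER_MVIEWS dictionary view.

```sql
-- Sketch only: the BUILD and REFRESH options are assumptions, not from the slide.
CREATE MATERIALIZED VIEW store_id_sales
  BUILD IMMEDIATE
  REFRESH COMPLETE ON DEMAND
  ENABLE QUERY REWRITE
AS
SELECT s.store_id, SUM(f.amount) AS sum_sales
FROM   store s, sales f
WHERE  s.store_key = f.store_key
GROUP  BY s.store_id;

-- After the insert in step 2, check how the dictionary reports the MV.
SELECT mview_name, staleness
FROM   user_mviews
WHERE  mview_name = 'STORE_ID_SALES';
```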

1. The following query:

SELECT s.store_id, SUM(f.amount) sum_sales
FROM store s, sales f
WHERE s.store_key = f.store_key
AND s.store_id > 15 AND s.store_id < 950
GROUP BY s.store_id;
2. Can be rewritten (ENFORCED or TRUSTED):

SELECT m.store_id, m.sum_sales
FROM store_id_sales m
WHERE m.store_id > 15 AND m.store_id < 950;
5-39

Consider the above query (step 1), which asks for sales in a certain range of store_ids. Oracle
knows that this range of rows in the MV is fresh and can therefore rewrite the query with the
MV. Note that the rewrite uses the data containment rules described previously in this lesson.
The rewritten query is shown in step 2 above.
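One way to check whether the rewrite takes place is to inspect the execution plan. The sketch below assumes that a PLAN_TABLE already exists in the current schema and that query rewrite is permitted for the session. The statement identifier 'pct_demo' is an arbitrary label.

```sql
ALTER SESSION SET QUERY_REWRITE_ENABLED = TRUE;
ALTER SESSION SET QUERY_REWRITE_INTEGRITY = ENFORCED;

-- Explain the application query from step 1.
EXPLAIN PLAN SET STATEMENT_ID = 'pct_demo' FOR
SELECT s.store_id, SUM(f.amount) AS sum_sales
FROM   store s, sales f
WHERE  s.store_key = f.store_key
AND    s.store_id > 15 AND s.store_id < 950
GROUP  BY s.store_id;

-- If the rewrite occurred, STORE_ID_SALES appears as an accessed object
-- in place of the STORE and SALES detail tables.
SELECT operation, options, object_name
FROM   plan_table
WHERE  statement_id = 'pct_demo'
ORDER  BY id;
```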

1. Create the STORE_ID_SALES_2 MV:

SELECT s.city, s.store_name,
SUM(f.amount) sum_sales,
dbms_mview.partition_marker(f.rowid) as p_marker
FROM store s, sales f
WHERE s.store_key = f.store_key
GROUP BY s.city, s.store_name,
dbms_mview.partition_marker(f.rowid);
2. Insert data in the second partition of the
STORE table (1000<=STORE_ID<2000)
3. From the dictionary, Oracle knows that the first
partition of the STORE table is fresh in the MV
5-40

Instead of the partitioning key, a partition marker (a function that identifies the partition given a rowid)
may be present in the select (and group by) list of the MV. Oracle can use the materialized view to rewrite
queries that require data from only certain partitions (identifiable by the partition marker), for instance,
queries that reference a partition-extended table name or queries that have a predicate specifying ranges
of the partitioning keys that contain entire partitions.
Here, because the partition key of the STORE table is not in the MV’s definition, Oracle does not change
the MV’s definition based on the STORE modifications as it did in the previous example.
However, after step 2 on the above slide, Oracle knows that the rows corresponding to the second
partition in the STORE table are no longer fresh.
Note: As described previously, Enhanced Update Tracking requires sufficient information in the
materialized view to be able to correlate each materialized view row back to its corresponding detail row
in its source partitioned detail table. In general, this can be accomplished by including the detail table
partition key columns in the select list and, if GROUP BY is used, in the group by list. This has the
unfortunate effect of significantly increasing the cardinality of the materialized view, depending on the
desired level of aggregation and the distinct cardinalities of the partition key columns. In many cases,
the advantages of Enhanced Update Tracking will be offset by this restriction for highly aggregated
materialized views.
The PARTITION_MARKER function addresses this problem. It returns a partition identifier that
uniquely identifies the partition for a specified row within a specified partitioned table. The returned
identifier is not guaranteed to distinguish partitions between different partitioned tables. The
PARTITION_MARKER function can be used in lieu of the partition key column(s) in the select and
group by lists.

The effect is to significantly reduce the cardinality impact of Enhanced Update
Tracking. For instance, if the distinct cardinality of the partition key
columns is 10,000, including the partition key columns in the select and group by
lists can increase the cardinality of the materialized view by a factor of 10,000.
However, if the partitioned table consists of 10 partitions, including the
PARTITION_MARKER function in the select and group by lists increases it by
only a factor of 10.
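As a rough way to estimate these two factors for a given detail table, the following sketch compares the number of distinct partition key values with the number of partitions. It uses the STORE table from the earlier example and the USER_TAB_PARTITIONS dictionary view. The column aliases are illustrative.

```sql
-- Distinct partition key values: the worst-case factor when the
-- partition key column is added to the select and group by lists.
SELECT COUNT(DISTINCT store_id) AS distinct_keys
FROM   store;

-- Number of partitions: the worst-case factor when the partition
-- marker is used instead.
SELECT COUNT(*) AS partition_count
FROM   user_tab_partitions
WHERE  table_name = 'STORE';
```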

1. The following query:

SELECT s.city, SUM(f.amount)
FROM store s, sales f
WHERE s.store_id < 1000
AND s.store_name = 'Sears'
GROUP BY s.city;
2. Can be rewritten:
SELECT m.city, SUM(m.sum_sales)
FROM store_id_sales_2 m
WHERE m.p_marker = 'STORE_PART1'
AND m.store_name = 'Sears'
GROUP BY m.city;
5-42

From the previous slide, Oracle knows that the first partition of the STORE table is still fresh in the MV.
Oracle can rewrite the above query using the previous materialized view. This query restricts the data to
the first partition of the STORE table and further selects only certain values of store_name.

OLAP Operators Rewrite: Example 1
Application Query
SELECT s.city, s.zipcode,
SUM(f.amount)
FROM store s, sales f
WHERE s.store_key = f.store_key
GROUP BY s.city, s.zipcode;
REWRITE
SELECT city, zipcode,
sum_sales
FROM mv_cube
WHERE c = 0 and z = 0;
mv_cube Materialized View Query
SELECT s.city, s.zipcode,
SUM(f.amount) as sum_sales,
GROUPING(s.city) as c, GROUPING(s.zipcode) as z
FROM store s, sales f
WHERE s.store_key = f.store_key
GROUP BY CUBE (s.city, s.zipcode);

5-43

A key aspect of this approach is the ability to rewrite with MVs that materialize a CUBE or a partial
cube. An MV with a GROUP BY CUBE/ROLLUP clause can answer queries with a regular GROUP
BY clause. By supporting incremental refresh and general rewrite of these MVs, Oracle9i provides a much
better OLAP/DSS capability directly inside the kernel.
Oracle9i supports the following types of general rewrite in this area:
• An MV with a GROUP BY CUBE clause can rewrite a query with GROUP BY or GROUP
BY CUBE/ROLLUP clauses.
• An MV with a GROUP BY ROLLUP clause can rewrite a query with GROUP BY or GROUP
BY CUBE/ROLLUP clauses.
• An MV with GROUPING SETS can rewrite queries that request GROUP BY columns in
the materialized GROUPING SETS.
In the above example, note that GROUPING(city) and GROUPING(zipcode) are required to
identify the rows at the appropriate grouping level and to distinguish them from actual null values.
Similar rewrites can be done with an MV that has a GROUP BY ROLLUP clause.
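As an illustration of the ROLLUP case, the sketch below defines a hypothetical MV named MV_ROLLUP over the same STORE and SALES tables and shows which materialized level answers a query that groups by city alone. The MV name and the aliases c and z are assumptions that mirror the CUBE example.

```sql
-- Hypothetical MV; mv_rollup and the aliases c and z are illustrative.
CREATE MATERIALIZED VIEW mv_rollup
  ENABLE QUERY REWRITE
AS
SELECT s.city, s.zipcode,
       SUM(f.amount)       AS sum_sales,
       GROUPING(s.city)    AS c,
       GROUPING(s.zipcode) AS z
FROM   store s, sales f
WHERE  s.store_key = f.store_key
GROUP  BY ROLLUP (s.city, s.zipcode);

-- ROLLUP (city, zipcode) materializes three levels:
--   (city, zipcode)  -> c = 0, z = 0
--   (city)           -> c = 0, z = 1
--   grand total      -> c = 1, z = 1
-- A query that groups by city alone corresponds to the second level:
SELECT city, sum_sales
FROM   mv_rollup
WHERE  c = 0 AND z = 1;
```

Unlike CUBE, ROLLUP does not materialize a (zipcode) level, so a query that groups by zipcode alone would have to be aggregated from the (city, zipcode) rows.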

Application Query
SELECT f.product, f.region,
sum(f.sales) as sum_sales
FROM salesTable F
GROUP BY product, region;
REWRITE
SELECT product, region, sum_sales
FROM mv_grouping_id
WHERE gid = <id of the (product, region) group>;
mv_grouping_id Materialized View Query
SELECT product, region, time, GROUPING_ID() as gid,
sum(sales) as sum_sales
FROM salesTable
GROUP BY GROUPING SETS ((product, region, time),
(product, region),
(product));

5-44

GROUPING_ID is a new function that returns the group identifier corresponding to a grouping in the
GROUPING SETS clause.
If the GROUP BY list contains many expressions, then determining the GROUP BY level of a particular
row requires selecting the GROUPING function on each of the expressions, as in the previous
example. This is not easy to use, and it dramatically increases the number of columns required,
wasting a lot of storage space in an MV.
To address this, Oracle9i introduces an Oracle-specific function, GROUPING_ID(), which returns a
number corresponding to a particular grouping.
The GROUPING SETS clause is also new in Oracle9i. It allows efficient analysis of data across multiple
dimensions without computing the whole CUBE (computing or materializing the whole CUBE is very
expensive).
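The sketch below shows how the group identifier can be derived. It assumes the form of the function in which GROUPING_ID takes the grouped expressions as arguments (the slide for this beta course shows it without arguments) and reuses the salesTable columns from the slide.

```sql
SELECT product, region, time,
       GROUPING_ID(product, region, time) AS gid,
       SUM(sales) AS sum_sales
FROM   salesTable
GROUP  BY GROUPING SETS ((product, region, time),
                         (product, region),
                         (product));

-- gid is the vector of GROUPING values read as a binary number,
-- leftmost argument first:
--   (product, region, time) -> 0 0 0 -> gid = 0
--   (product, region)       -> 0 0 1 -> gid = 1
--   (product)               -> 0 1 1 -> gid = 3
```

Under this assumption, the rows of the (product, region) group are those with gid = 1, so a single numeric column replaces the three separate GROUPING columns that would otherwise be needed.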

Cursor Dependency
• Rewrite cannot use stale MVs in ENFORCED mode
• Thus, rewrite may pick a sub-optimal MV or none at
all
• The corresponding cursor is cached
• If the best MV becomes VALID again, Oracle8i does
not force the cursor’s recompilation and continues
using the cached cursor
• Oracle9i forces the cursor to be automatically
recompiled so that it benefits from the MV’s validation
5-45
Negative Cursor Dependency

Rewrite does not use stale materialized views at the ENFORCED integrity level. If the only MV
eligible to rewrite a query is stale, rewrite does not occur in ENFORCED mode. Similarly, rewrite may
pick a sub-optimal MV because the optimal one is stale. In Oracle9i, if the stale MV is later
refreshed and becomes fresh, rewrite can use it for subsequent queries because the cached cursor is
recompiled automatically.
This was not the case before Oracle9i. In Oracle8i, Oracle kept using the previously compiled
cursor, which used a sub-optimal MV or no MV at all. This behavior was very confusing.
This mechanism is internally referred to as a negative dependency.
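The sequence of events can be sketched as follows, using the STORE_ID_SALES materialized view from the earlier example. The sketch assumes a SQL*Plus session (the slash terminates the PL/SQL block) and that the MV is stale when the first query runs.

```sql
ALTER SESSION SET QUERY_REWRITE_ENABLED = TRUE;
ALTER SESSION SET QUERY_REWRITE_INTEGRITY = ENFORCED;

-- 1. While the MV is stale, the query is compiled against the detail
--    tables and the resulting cursor is cached.
SELECT s.store_id, SUM(f.amount) AS sum_sales
FROM   store s, sales f
WHERE  s.store_key = f.store_key
GROUP  BY s.store_id;

-- 2. Refresh the MV so that it becomes fresh again.
BEGIN
  DBMS_MVIEW.REFRESH('STORE_ID_SALES');
END;
/

-- 3. Re-run the identical statement. In Oracle8i the cached cursor from
--    step 1 is reused; in Oracle9i the cursor is recompiled and rewrite
--    can now consider the refreshed MV.
SELECT s.store_id, SUM(f.amount) AS sum_sales
FROM   store s, sales f
WHERE  s.store_key = f.store_key
GROUP  BY s.store_id;
```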

Summary

• Use the new bitmap join index mechanism
• Gather system statistics
• Use the enhanced Oracle9i materialized views
• Take advantage of the query rewrite
enhancements
5-46
