Student Guide
D101882GC10
Edition 1.0 | April 2018 | D103824
The information contained in this document is subject to change without notice. If you find any problems in the document, please report them in writing to: Oracle University, 500 Oracle Parkway, Redwood Shores, California 94065 USA. This document is not warranted to be error-free.

Graphic Designer: Kavya Bellur
1 Introduction
Overview 1-2
New Release Model 1-3
New Version Numbering for Oracle Database 1-4
Practice 1: Overview 1-5
3 Managing Security
Objectives 3-2
Schema-Only Account 3-3
Encrypting Data Using Transparent Data Encryption 3-4
Managing Keystore in CDB and PDBs 3-6
Keystore Management Changes for PDB 3-8
Defining the Keystore Type 3-9
Isolating a PDB Keystore 3-10
Converting a PDB to Run in Isolated Mode 3-11
Converting a PDB to Run in United Mode 3-12
Migrating a PDB Between Keystore Types 3-13
Creating Your Own TDE Master Encryption Key 3-14
Protecting Fixed-User Database Links Obfuscated Passwords 3-15
Importing Fixed-User Database Links Encrypted Passwords 3-16
DB Replay: The Big Picture 3-17
Duplicating On-Premise Encrypted CDB as Cloud Encrypted CDB 4-16
Duplicating Cloud Encrypted CDB as On-Premise CDB 4-17
Automated Standby Synchronization from Primary 4-18
Summary 4-19
Practice 4: Overview 4-20
6 Improving Performance
Objectives 6-2
In-Memory Column Store: Dual Format of Segments in SGA 6-3
Deploying the In-Memory Column Store 6-4
Setting In-Memory Object Attributes 6-5
Managing Heat Map and Automatic Data Optimization Policies 6-6
Creating ADO In-Memory Policies 6-7
Automatic In-Memory: Overview 6-8
AIM Action 6-9
Configuring Automatic In-Memory 6-10
Diagnostic Views 6-11
Populating In-Memory Expression Results 6-12
Populating In-Memory Expression Results Within a Window 6-13
Memoptimized Rowstore 6-14
In-Memory Hash Index 6-15
DBMS_SQLTUNE Versus DBMS_SQLSET Package 6-16
SQL Tuning Sets: Manipulation 6-17
SQL Performance Analyzer 6-18
Using SQL Performance Analyzer 6-19
Steps 6-7: Comparing / Analyzing Performance and Tuning Regressed SQL 6-20
SQL Performance Analyzer: PL/SQL Example 6-21
SQL Exadata-Aware Profile 6-23
Summary 6-24
Practice 6: Overview 6-25
8 Sharding Enhancements
Objectives 8-2
System-Managed and Composite Sharding Methods 8-3
User-Defined Sharding Method 8-5
Support for PDBs as Shards 8-7
Improved Oracle GoldenGate Support 8-8
Query System Objects Across Shards 8-9
Consistency Levels for Multi-Shard Queries 8-10
Sharding Support for JSON, LOBs, and Spatial Objects 8-11
Improved Multi-Shard Query Support 8-14
Oracle Sharding Documentation 8-15
Summary 8-16
Practice 8: Overview 8-17
A Database Sharding
Objectives A-2
What Is Database Sharding? A-3
Sharding: Benefits A-4
Oracle Sharding: Advantages A-5
Application Considerations for Sharding A-6
Components of Database Sharding A-8
Oracle Internal & Oracle Academy Use Only
1
Introduction
• This course focuses on the new features and enhancements of Oracle Database 18c.
• This course complements the topics covered in the following courses:
– The 5-day Oracle Database 12c: 12.2 New Features for 12.1 Administrators Ed 1 course
– Or the 10-day Oracle Database 12c R2: New Features for Administrators course
— Oracle Database 12c R2: New Features for Administrators Part 1 Ed 1
— Oracle Database 12c R2: New Features for Administrators Part 2 Ed 1
• Previous experience with Oracle Database 12c, and in particular Release 2 (12.2), is required for a full understanding of many of the new features and enhancements.
This three-day course follows either the 5-day Oracle Database 12c: 12.2 New Features for 12.1
Administrators Ed 1 course or the 10-day Oracle Database 12c Release 2 (12.2) course. Both
courses are designed to introduce the new features and enhancements of Oracle Database 12c
Release 2 (12.2.0.1) that are applicable to the work that is usually performed by database
administrators and related personnel.
The course is designed to introduce the major new features and enhancements of
Oracle Database 18c.
You should not expect to discover all the new features in this course without supplemental reading, especially the Oracle Database New Features documentation for 12c Release 2 (12.2) and for Oracle Database 18c.
The course consists of instructor-led lessons, hands-on labs, and Oracle By Example (OBE) tutorials that enable you to see how certain new features behave.
RUs consolidate fixes for common issues encountered by customers and are highly tested as a
bundle by Oracle before shipping.
RURs primarily contain RU content, with sufficient time allowed for the content to be proven in the field by thousands of customer deployments and for any regressions to be resolved.
The table in the slide is an example.
For more information about the availability of Oracle Database 18c for on-prem platforms, refer to the
My Oracle Support (MOS) Release Schedule of Current Database Releases (Doc ID 742060.1).
A CDB fleet is a collection of different CDBs that can be managed as one logical CDB:
• To provide the underlying infrastructure for massive scalability and centralized
management of many CDBs
• To provision more than the maximum number of PDBs for an application
[Slide: two CDB fleets, High_Speed and Low_Speed, each grouping CDBs and their PDBs]
• To manage appropriate server resources for PDBs, such as CPU, memory, I/O rate, and
storage systems
Oracle Database 18c introduces the CDB Fleet feature. CDB Fleet aims to provide the underlying
infrastructure for massive scalability and centralized management of many CDBs.
• The maximum number of PDBs in a single CDB is 4,096. A CDB fleet can hold more than 4,096 PDBs.
• Different PDBs in a single configuration require different types of servers to function
optimally. Some PDBs might process a large transaction load, whereas other PDBs are used
mainly for monitoring. You want the appropriate server resources for these PDBs, such as
CPU, memory, I/O rate, and storage systems.
• Each CDB can use all the usual database features for high availability, scalability, and
recovery of the PDBs in the CDB, such as Real Application Clusters (RAC), Data Guard,
RMAN, PITR, and Flashback.
• PDB names must be unique across all CDBs in the fleet. PDBs can be created in any CDB in
the fleet, but can be opened only in the CDB where they physically exist.
• The CDB lead in a fleet is the CDB from which you perform operations across the fleet.
• The CDB members of the fleet link to the CDB lead through a database link.
[Slide: CDB fleet High_Speed, with CDB lead cdb3 and member CDBs cdb1 and cdb2, each containing its CDB root and PDBs]
A CDB fleet contains a CDB lead and CDB members. PDB information from the individual CDBs is
synchronized with the CDB lead.
From the CDB root, the CDB lead is able to:
• Monitor all PDBs of all CDBs in the fleet
• Report information and collect diagnostic information from all PDBs of all CDBs in the fleet
through a cross-container query
• Query Oracle-supplied objects from all PDBs of all CDBs in the fleet
To configure a CDB fleet, define the lead and then the members.
1. To define a CDB as the CDB lead in a CDB fleet, from the CDB root, set the LEAD_CDB
database property to TRUE.
2. Still in the CDB root of the CDB lead, use a common user and grant appropriate privileges.
3. To define other CDBs as members of the CDB fleet:
a) Connect to the CDB root of another CDB.
b) Use a common user identical to the common user used in the lead CDB because you
have to create a public database link using a fixed user.
c) Set the LEAD_CDB_URI database property to the name of the database link to the
CDB lead.
This assumes that the network is configured so that the current CDB can connect to the CDB lead using the connect descriptor defined in the link.
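The three steps above might be sketched as follows. This is a hedged example: the CDB names, common user, password, connect string, and database link name are all hypothetical, and the exact syntax should be verified against the documentation for your release.

```sql
-- Step 1: in the CDB root of the intended lead CDB, set the LEAD_CDB property.
ALTER DATABASE SET lead_cdb = TRUE;

-- Step 2: still in the lead CDB root, create a common user for fleet operations
-- and grant appropriate privileges (the privilege set shown is an assumption).
CREATE USER c##fleet_admin IDENTIFIED BY "FleetPwd#1" CONTAINER = ALL;
GRANT CREATE SESSION TO c##fleet_admin CONTAINER = ALL;

-- Step 3: in the CDB root of a member CDB, create a fixed-user public database
-- link to the lead using the same common user, then point LEAD_CDB_URI at it.
CREATE PUBLIC DATABASE LINK lead_link
  CONNECT TO c##fleet_admin IDENTIFIED BY "FleetPwd#1"
  USING 'lead_cdb_connect_string';
ALTER DATABASE SET lead_cdb_uri = 'dblink:LEAD_LINK';
```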
• Monitoring and collecting diagnostic information across CDBs from the lead CDB
• Querying Oracle-supplied objects, such as DBA views, in different PDBs across the CDB
fleet
• Serving as a central location where you can view information about and the status of all
the PDBs across multiple CDBs
• The CDB lead in the CDB fleet can monitor PDBs across the CDBs in the CDB fleet. You can
install a monitoring application in one container and use CDB views and GV$ views to
monitor and process diagnostic data for the entire CDB fleet. A cross-container query issued
in the lead CDB can automatically execute in all PDBs across the CDB fleet through the
Oracle-supplied objects.
• Using Oracle-supplied or even common application schema objects in different PDBs (or
application PDBs) across the CDB fleet, you can use the CONTAINERS clause or
CONTAINER_MAP to run queries across all of the PDBs of the multiple CDBs in the fleet.
This enables the aggregation of data from PDBs in different CDBs across the fleet. The
application can be installed in an application root and each CDB in the fleet can have an
application root clone to enable the common application schema across the CDBs.
• The CDB lead can serve as a central location where you can view information about and the
status of all the PDBs across multiple CDBs.
In Oracle Database 18c, when you create a PDB, you can specify whether it is enabled for PDB
snapshots. A PDB snapshot is an archive file (.pdb) containing the contents of the copy of the PDB
at snapshot creation.
PDB snapshots allow the recovery of PDBs back to the oldest PDB snapshot available for a PDB. This extends recovery beyond the flashback retention period, which applies only when database flashback is enabled.
The example in the slide shows a situation where you have to restore PDB1 back to Wednesday.
A use case of PDB snapshots is reporting on historical data. You might create a snapshot of a
sales PDB at the end of the financial quarter. You could then create a PDB based on this snapshot
so as to generate reports from the historical data.
Every PDB snapshot is associated with a snapshot name and the SCN and timestamp at snapshot
creation. The MAX_PDB_SNAPSHOTS database property sets the maximum number of PDB
snapshots for each PDB. The default and allowed maximum is 8. When the maximum number is
reached for a PDB, and an attempt is made to create a new PDB snapshot, the oldest PDB snapshot
is purged. If the oldest PDB snapshot cannot be dropped because it is open, an error is raised. You
can decrease this limit for a given PDB by issuing an ALTER DATABASE statement specifying a
max number of snapshots. If you want to drop all PDB snapshots, you can set the limit to 0.
DATABASE_PROPERTIES
PROPERTY_NAME = MAX_PDB_SNAPSHOTS
PROPERTY_VALUE = 8
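The property and limit described above can be inspected and changed roughly as follows. Run the statements in the PDB concerned; the exact ALTER syntax may differ by release, so treat this as a sketch:

```sql
-- Check the current snapshot limit for this PDB:
SELECT property_value
FROM   database_properties
WHERE  property_name = 'MAX_PDB_SNAPSHOTS';

-- Lower the limit; setting it to 0 drops all existing PDB snapshots:
ALTER PLUGGABLE DATABASE SET MAX_PDB_SNAPSHOTS = 0;
```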
By default, a PDB is enabled for PDB snapshots. There are two ways to define PDBs enabled for
PDB snapshot creation:
• Manually: The first example in the slide uses the SNAPSHOT MODE MANUAL clause of the CREATE PLUGGABLE DATABASE or ALTER PLUGGABLE DATABASE statement. Omitting the clause gives the same result, because manual mode is the default.
• Automatically after a given interval of time: The second example in the slide uses the
SNAPSHOT MODE EVERY <snapshot_interval> [MINUTES|HOURS] clause of the
CREATE PLUGGABLE DATABASE or ALTER PLUGGABLE DATABASE statement. When the
amount of time is expressed in minutes, it must be less than 3000. When the amount of time
is expressed in hours, it must be less than 2000. In the second example in the slide, the
SNAPSHOT MODE clause specifies that a PDB snapshot is created automatically every 24
hours.
Every PDB snapshot is associated with a snapshot name and the SCN and timestamp at snapshot
creation. You can specify a name for a PDB snapshot. The third and fourth examples in the slide
show how to create PDB snapshots manually, even if the PDB is set to have PDB snapshots created
automatically. If PDB snapshots are created automatically, the system generates a name.
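Because the slide examples themselves are not reproduced in these notes, the following hedged sketch illustrates the clauses described above. The PDB and snapshot names are hypothetical:

```sql
-- Manual mode (also the default when no SNAPSHOT MODE clause is given):
ALTER PLUGGABLE DATABASE pdb1 SNAPSHOT MODE MANUAL;

-- Automatic snapshots every 24 hours:
ALTER PLUGGABLE DATABASE pdb1 SNAPSHOT MODE EVERY 24 HOURS;

-- Create a named snapshot manually, regardless of the snapshot mode:
ALTER PLUGGABLE DATABASE pdb1 SNAPSHOT pdb1_snap_monday;
```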
PDB Snapshot Carousel

After a PDB snapshot is created, you can create a new PDB from it:

SQL> CREATE PLUGGABLE DATABASE pdb1_day_1 FROM pdb1
     USING SNAPSHOT <snapshot_name>;
You can create a new PDB from an existing PDB snapshot by using the USING SNAPSHOT clause.
Provide any of the following:
• The snapshot name
• The snapshot SCN at which the snapshot was created
• The snapshot time at which the snapshot was created
When the carousel reaches eight PDB snapshots, or the maximum number of PDB snapshots defined, the oldest PDB snapshot is deleted automatically, whether or not it is in use. There is no need to materialize a PDB snapshot in the carousel, because PDB snapshots are all full clones. Be aware that if the SNAPSHOT COPY clause is used with the USING SNAPSHOT clause, the SNAPSHOT COPY clause is simply ignored.
You can manually drop the PDB snapshots by altering the PDB for which the PDB snapshots were
created and using the DROP SNAPSHOT clause.
[Slide: in cdb1, snapshots PDB1_snapW (Wednesday) and PDB1_snapT (Thursday) exist for PDB1; after a user error is detected, PDB1 is closed, PDB1b is created from PDB1_snapW, PDB1 is dropped, and PDB1b is renamed to PDB1]
PDB snapshots enable the recovery of PDBs back to the oldest PDB snapshot available for a PDB.
The example in the slide shows a situation where you detect an error that happened between
PDB1_SNAPW and PDB1_SNAPT creation. To recover the situation, perform the following steps:
1. Close PDB1.
2. Create PDB1b from the PDB1_SNAPW snapshot created before the user error.
3. Drop PDB1.
4. Rename PDB1b to PDB1.
5. Open PDB1 and create a new snapshot.
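The five steps above might be sketched as follows from the CDB root. The names are those of the slide example; the INCLUDING DATAFILES clause and the restricted-mode requirement for renaming are assumptions to be verified for your release:

```sql
-- 1. Close the damaged PDB:
ALTER PLUGGABLE DATABASE pdb1 CLOSE;

-- 2. Create PDB1b from the snapshot taken before the user error:
CREATE PLUGGABLE DATABASE pdb1b FROM pdb1 USING SNAPSHOT pdb1_snapw;

-- 3. Drop the damaged PDB:
DROP PLUGGABLE DATABASE pdb1 INCLUDING DATAFILES;

-- 4. Rename PDB1b to PDB1 (renaming is assumed to require the PDB open
--    in restricted mode, issued from inside the PDB):
ALTER PLUGGABLE DATABASE pdb1b OPEN RESTRICTED;
ALTER SESSION SET CONTAINER = pdb1b;
ALTER PLUGGABLE DATABASE RENAME GLOBAL_NAME TO pdb1;

-- 5. Reopen the recovered PDB normally and take a fresh snapshot:
ALTER PLUGGABLE DATABASE CLOSE IMMEDIATE;
ALTER PLUGGABLE DATABASE OPEN;
ALTER PLUGGABLE DATABASE SNAPSHOT pdb1_snap_new;
```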
[Slide: a map table with partitions N_AMER, APAC, and EMEA]
• Each PDB corresponds to data for a particular partition.
In Oracle Database 12c, the CONTAINERS clause (table or view) in a query in the CDB root
accesses a table or view in the CDB root and in each of the opened PDBs, and returns a UNION ALL
of the rows from the table or view. This concept is extended to work in an application container.
CONTAINERS (table or view) queried in an application root accesses the table or view in the
application root and in each of the opened application PDBs of the application container.
CONTAINERS (table or view) can be restricted to access a subset of PDBs by using a predicate on
CON_ID. CON_ID is an implicitly generated column of CONTAINERS (table or view).
SELECT fname, lname FROM CONTAINERS(emp) WHERE CON_ID IN (44,56,79);
One drawback of CONTAINERS() is that queries need to be changed to add a WHERE clause on
CON_ID if only certain PDBs should be accessed. Often, rows of tables or views are horizontally
partitioned across PDBs based on a user-defined column.
The CONTAINER_MAP database property provides a declarative way to indicate how rows in
metadata-linked tables or views are partitioned across PDBs.
The CONTAINER_MAP database property is set in the application root. Its value is the name of a
partitioned table (the map object). The names of the partitions of the map object match the names of
the PDBs in the application container. The columns that are used in partitioning the map object
should match the columns in the metadata-linked object that is being queried. The partitioning
schemes that are supported for a CONTAINER_MAP map object are LIST, HASH, and RANGE.
Note: Container maps can be created in CDB root, but the best practice is to create them in
application roots.
In a hybrid model, you can create common partitioned tables in the application root, mapping a
partition of the table to an application PDB of the application container. For example, the
TENANT_GRP1 partition would store data for customers of group1 in the Tenant_GRP1 application
PDB. The TENANT_GRP2 partition would store data for customers of group2 in the Tenant_GRP2
application PDB.
In a data warehouse model, you can create common partitioned tables in the application root, which
are partitioned on a column, such as REGION in the example in the slide, where data is segregated
into separate application PDBs of the application container.
In the example in the slide, the NA partition stores data for AMERICA, MEXICO, and CANADA as
defined in the list, in the NA application PDB. The EMEA partition stores data for UK, FRANCE, and
GERMANY as defined in the list, in the EMEA application PDB.
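A map object matching the slide's regions might be created roughly as follows in the application root. The schema, table, and column names are hypothetical:

```sql
-- Hypothetical map object: a LIST-partitioned table whose partition names
-- match the application PDB names NA and EMEA.
CREATE TABLE app_owner.conmap (region VARCHAR2(30))
  PARTITION BY LIST (region) (
    PARTITION na   VALUES ('AMERICA', 'MEXICO', 'CANADA'),
    PARTITION emea VALUES ('UK', 'FRANCE', 'GERMANY'));

-- Point the application root at the map object:
ALTER PLUGGABLE DATABASE SET CONTAINER_MAP = 'APP_OWNER.CONMAP';
```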
Because data is segregated into separate application PDBs of the application container, querying a
container map table, for example, data for N_AMER, automatically retrieves data from the NA
application PDB. The query is appropriately routed to the relevant partition and therefore to the
relevant application PDB.
If you need to retrieve data from a table that is spread over several application PDBs within an
application container, use the CONTAINERS clause to aggregate rows from partitions from several
application PDBs.
[Slide: an application container with the application root, PDB$SEED, and application PDBs N_AMER, S_AMER, APAC, and EMEA]
In Oracle Database 18c, when a PDB is created, dropped, or renamed, a CONTAINER_MAP defined in the CDB root, in an application root, or in both can be dynamically updated to reflect the change.
The CREATE PLUGGABLE DATABASE statement takes an optional clause that describes the key values affiliated with the new PDB.
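A hedged sketch of this optional clause follows. The PDB name, partition name, and key values are hypothetical, and the exact CONTAINER_MAP UPDATE syntax should be verified in the SQL reference for your release:

```sql
-- Create a new application PDB and, in the same statement, extend the map
-- object with a matching partition for the new PDB's key values:
CREATE PLUGGABLE DATABASE s_amer
  CONTAINER_MAP UPDATE (ADD PARTITION s_amer VALUES ('BRAZIL', 'ARGENTINA'));
```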
CDB_LOCKDOWN_PROFILES
In Oracle Database 12c, a PDB lockdown profile whose name is stored in the PDB_LOCKDOWN
parameter determines the operations that can be performed in a given PDB. If the PDB_LOCKDOWN
parameter is set to a PDB lockdown profile at the CDB root level, and no PDB_LOCKDOWN
parameter is set at the PDB level, then the PDB lockdown profile defined at the CDB root level
determines the operations that can be performed in all the PDBs.
After the PDB_LOCKDOWN parameter is set, the PDB must be bounced before the lockdown profile
can take effect.
A created PDB lockdown profile cannot derive any restriction rule from another PDB lockdown
profile.
In Oracle Database 18c, you can create lockdown profiles in application roots, and not only in the
CDB root.
If the PDB_LOCKDOWN parameter in a PDB is set to a name of a lockdown profile different from that
in its ancestor, the CDB root or application root for application PDBs, the following governs
interaction between restrictions imposed by these lockdown profiles:
• If the PDB_LOCKDOWN parameter in a regular or application PDB is set to a CDB lockdown profile, the lockdown profiles specified by the PDB_LOCKDOWN parameter in the CDB root or the application root, respectively, are ignored.
• If the PDB_LOCKDOWN parameter in an application PDB is set to an application lockdown
profile while the PDB_LOCKDOWN parameter in the application root or CDB root is set to a
lockdown profile, in addition to rules stipulated in the application lockdown profile, the PDB
lockdown profile inherits the DISABLE rules from the lockdown profile set in its nearest
ancestor, the CDB root.
• If there are conflicts between the rules of the CDB lockdown profile and the application lockdown profile, the rules in the CDB lockdown profile take precedence. For example, an OPTION_VALUE clause of a CDB lockdown profile takes precedence over the OPTION_VALUE clause of an application lockdown profile.
[Slide: in CDB1, PDB_SALES has PDB_LOCKDOWN = base_lock_prof2 and PDB_HR has PDB_LOCKDOWN = dynamic_lock_from_prof2]
When a PDB lockdown profile is created, it can derive rules from a “base” lockdown profile.
There are two ways of creating lockdown profiles using existing profiles:
• Static lockdown profiles: When the lockdown profile is being created with the FROM clause,
rules comprising the base profile at the time are copied to the static lockdown profile. Any
subsequent changes to the base profile do not affect the newly created static lockdown
profile.
• Dynamic lockdown profiles: When the lockdown profile is created with the INCLUDING clause, the dynamic lockdown profile inherits the disabled rules comprising the base profile, as well as any subsequent changes to the base profile. If rules explicitly added to the newly created dynamic lockdown profile conflict with rules comprising the base profile, the latter take precedence.
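The two derivation styles might look as follows. The profile names echo the slide example; the explicit rule shown at the end is an assumption for illustration:

```sql
-- Static: copies the base profile's rules as they exist now; later changes
-- to the base profile do not propagate to the derived profile.
CREATE LOCKDOWN PROFILE static_lock_prof FROM base_lock_prof2;

-- Dynamic: inherits the base profile's rules and any future changes to them.
CREATE LOCKDOWN PROFILE dynamic_lock_from_prof2 INCLUDING base_lock_prof2;

-- Rules can still be added explicitly to the derived profile:
ALTER LOCKDOWN PROFILE dynamic_lock_from_prof2
  DISABLE STATEMENT = ('ALTER SYSTEM');
```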
[Slide: PDB1 is hot cloned over a DB link as a refreshable copy, PDB1_REF_CLONE; incremental refreshing (manual or automatic) keeps the clone current, and the clone can be opened only in read-only mode]
The Oracle Database 12c cloning technique copies a remote source PDB into a CDB while the
remote source PDB is still up and fully functional.
Hot remote cloning requires both CDBs to switch from shared UNDO mode to local UNDO mode,
which means that each PDB uses its own local UNDO tablespace.
In addition, hot cloning allows incremental refreshing in that the cloned copy of the production
database can be refreshed at regular intervals. Incremental refreshing means refreshing an existing
clone from a source PDB at a point in time that is more recent than the original clone creation to
provide fresh data. A refreshable copy PDB can be opened only in read-only mode.
Propagating changes from the source PDB can be performed in two ways:
• Manually (on demand)
• Automatically at predefined time intervals
If the source PDB is not accessible at the moment the refreshable copy needs to be updated, archive logs are read from the directory specified by the REMOTE_RECOVERY_FILE_DEST parameter to refresh the cloned PDB.
2. The roles can be reversed: the refreshable clone can be made the primary PDB.
– The new primary PDB can be opened in read/write mode.
– The primary PDB becomes the refreshable clone.
In Oracle Database 18c, after a user creates a refreshable clone of a PDB, the roles can be
reversed. The refreshable clone can be made the primary PDB which can be opened in read/write
mode while the primary PDB becomes the refreshable clone.
The ALTER PLUGGABLE DATABASE command with the SWITCHOVER clause must be executed
from the primary PDB. The refresh mode can be either MANUAL or EVERY <refresh interval>
[MINUTES | HOURS]. REFRESH MODE NONE cannot be specified when issuing this statement.
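A hedged sketch of the switchover statement, executed from the primary PDB as described above; the PDB and database link names are hypothetical:

```sql
-- Reverse the roles: the refreshable clone becomes the primary PDB.
-- A refresh mode must be given; REFRESH MODE NONE is not allowed here.
ALTER PLUGGABLE DATABASE pdb1
  REFRESH MODE EVERY 2 MINUTES
  FROM pdb1@clone_cdb_link
  SWITCHOVER;
```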
After the switchover operation, the primary PDB becomes the refreshable clone and can only be
opened in READ ONLY mode. During the operation, the source is quiesced and any redo generated
from the time of the last refresh is applied to the destination to bring it current.
The database link user also has to exist in the primary PDB if the refreshable clone exists in another
CDB.
In the example in the previous slide, the roles can be reversed at any time because none of the
primary and refreshable cloned PDBs are damaged.
In the example of an unplanned switchover, the primary encounters an issue. Then the way to
refresh the refreshable cloned PDB is to first archive the current redo log file and copy the archive
log files to a new directory that you define as the REMOTE_RECOVERY_FILE_DEST. Then once the
PDB is refreshed, you disable the refreshable capability on the former refreshable cloned PDB that
becomes the primary PDB and open it. You can finally re-create a new refreshable cloned PDB from
the former refreshable cloned PDB. Because the original primary PDB is dropped, you can give the new refreshable cloned PDB the same name as the former primary PDB.
12c
As a clone from another PDB: Copy the data files belonging to the source PDB to the
standby database.
Use the STANDBY_PDB_SOURCE_FILE_DBLINK parameter to specify the name of a database link that is used to copy the data files from the source PDB to which the database link points.
The file copy is done automatically only if the database link points to the source PDB and the source PDB is open in read-only mode.
In Oracle Database 12c, when you create a PDB in the primary CDB with a standby CDB, you must
first copy your data files to the standby. Do one of the following, as appropriate:
• If you plan to create a PDB from an XML file, the data files on the standby are expected to be
found in the PDB's OMF directory. If this is not the case, then copy the data files specified in
the XML file to the standby database.
• If you plan to create a PDB as a clone, then copy the data files belonging to the source PDB
to the standby database.
The path name of the data files on the standby database must be the same as the path name that
will result when you create the PDB on the primary in the next step, unless the
DB_FILE_NAME_CONVERT database initialization parameter has been configured on the standby. In
that case, the path name of the data files on the standby database should be the path name on the
primary with DB_FILE_NAME_CONVERT applied.
In Oracle Database 18c, you can use initialization parameters to automatically copy the data files to
the standby.
• If you plan to create a PDB from an XML file, the
STANDBY_PDB_SOURCE_FILE_DIRECTORY parameter can be used to specify a directory
location on the standby where source data files for instantiating the PDB may be found. If
they are not found there, files are still expected to be found in the PDB's OMF directory on
the standby.
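The two scenarios might be configured on the standby roughly as follows. The directory path and link name are hypothetical:

```sql
-- PDB created from an XML file: where the standby looks for source data files
ALTER SYSTEM SET standby_pdb_source_file_directory =
  '/u01/app/oracle/stby_source_files';

-- PDB created as a clone: database link pointing at the read-only source PDB
ALTER SYSTEM SET standby_pdb_source_file_dblink = 'source_pdb_link';
```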
• If CPU_COUNT is set at the PDB level, the maximum DOP generated by AutoDOP queries is the PDB’s CPU_COUNT.
• Similarly, the default values for PARALLEL_MAX_SERVERS and PARALLEL_SERVERS_TARGET are computed based on the PDB’s CPU_COUNT.
If CPU_COUNT is set at the PDB level, the maximum DOP generated by AutoDOP queries is the PDB’s CPU_COUNT. Similarly, the default values for PARALLEL_MAX_SERVERS and PARALLEL_SERVERS_TARGET are computed based on the PDB’s CPU_COUNT.
In Oracle Database 12c, DBCA enables you to create a new PDB. The new PDB is created as a
clone of the CDB seed.
In Oracle Database 18c, DBCA enables you to create a new PDB as a clone of an existing PDB, and
not necessarily from the CDB seed.
Managing Security
DBA_USERS
AUTHENTICATION_TYPE = NONE | PASSWORD
Application designers may want to create accounts that contain the application data dictionary, but
are not allowed to log in to the instance. This can be used to enforce data access through the
application, separation of duties at the application level, and other security mechanisms.
In addition, utility accounts can be created, but remain inaccessible by denying the ability to log in
except under controlled situations.
Until Oracle Database 12c, DBAs created accounts that did not need to log in to the instance, or logged in only rarely. Nevertheless, for all of these accounts there are default passwords, and there are also requirements to rotate the passwords.
In Oracle Database 18c, an account can be created with the NO AUTHENTICATION clause to
ensure that the account is not permitted to log in to the instance. Removing the password and
removing the ability to log in essentially just leaves a schema. The schema account can be altered to
allow login, but can then have the password removed. The ALTER USER statement can be used to
disable or re-enable the login capability.
The DBA_USERS view has a new column, AUTHENTICATION_TYPE, which displays NONE when NO
AUTHENTICATION is set, and PASSWORD when a password is set.
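A hedged sketch of the statements described above; the account name and password are hypothetical:

```sql
-- Create a schema-only account that cannot log in:
CREATE USER app_schema NO AUTHENTICATION;

-- Temporarily re-enable login, then remove the password again:
ALTER USER app_schema IDENTIFIED BY "TempPwd#1";
ALTER USER app_schema NO AUTHENTICATION;

-- AUTHENTICATION_TYPE shows NONE for schema-only accounts:
SELECT username, authentication_type
FROM   dba_users
WHERE  username = 'APP_SCHEMA';
```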
[Slide: TDE — table keys encrypt data blocks in the TBS_APPS tablespace; the master key is stored in the keystore]
Transparent Data Encryption (TDE) provides easy-to-use protection for your data without requiring
changes to your applications. TDE allows you to encrypt sensitive data in individual columns or
entire tablespaces without having to manage encryption keys. TDE does not affect access controls,
which are configured using database roles, secure application roles, system and object privileges,
views, Virtual Private Database (VPD), Oracle Database Vault, and Oracle Label Security. Any
application or user that previously had access to a table will still have access to an identical,
encrypted table.
TDE is designed to protect data in storage, but does not replace proper access control.
TDE is transparent to existing applications. Encryption and decryption occur at different levels depending on whether it is tablespace or column level, but in either case, encrypted values are not displayed and are not handled by the application. TDE eliminates the ability of anyone who has
direct access to the data files to gain access to the data by circumventing the database access
control mechanisms. Even users with access to the data file at the operating system level cannot see
the data unencrypted.
TDE stores the master key outside the database in an external security module, thereby minimizing
the possibility of both personally identifiable information (PII) and encryption keys being
compromised. TDE decrypts the data only after database access mechanisms have been satisfied.
TDE enables encryption for sensitive data in columns without requiring users or applications to
manage the encryption key. The default external security module is a software keystore.
In a multitenant container database (CDB), the root container and each pluggable database (PDB)
have their own master key used to encrypt data in the PDB, all of them stored in the common single
keystore. The master key must be transported from the source database keystore to the target
database keystore when a PDB is moved from one host to another.
The master keys are stored in a PKCS#12 software keystore or a PKCS#11-based HSM, outside the
database. For the database to use TDE, the keystore must exist.
To create a software keystore and a master key, perform the following steps:
1. Create a directory to hold the keystore, as defined by default in
V$ENCRYPTION_WALLET.WRL_PARAMETER, which is accessible to the Oracle software
owner.
2. If you plan to define another location for the keystore, specify an entry in the
$ORACLE_HOME/network/admin/sqlnet.ora file and create the appropriate directory.
ENCRYPTION_WALLET_LOCATION =
(SOURCE =
(METHOD = FILE)
(METHOD_DATA =
(DIRECTORY = /u01/app/oracle/other_admin_dir/orcl/wallet)))
• There is still one single keystore for the CDB and, optionally, one keystore per PDB.
• There is still one master key per PDB to encrypt PDB data, stored in the PDB keystore.
• Modes of operation:
– United mode: PDB keys are stored in the single CDB root keystore.
– Isolated mode: PDB keys are stored in their own keystores.
– Mixed mode: Some PDBs use united mode, and some use isolated mode.
In an Oracle Database 12c multitenant container database (CDB), the CDB root and each pluggable
database (PDB) have their own master key used to encrypt data in the PDB, all of them stored in the
common single keystore. The master key must be transported from the source database keystore to
the target database keystore when a PDB is moved from one host to another.
The master keys are stored in a PKCS#12 software keystore or a PKCS#11-based HSM, outside the
database. For the database to use TDE, the keystore must exist in a directory defined by the
ENCRYPTION_WALLET_LOCATION location in the sqlnet.ora file.
In Oracle Database 12c, the Multitenant architecture was mainly focused on providing support for
database consolidation.
In Oracle Database 18c, the Multitenant architecture continues to provide support for database
consolidation; however, focus is on independent, isolated PDB administration. To support
independent, isolated PDB administration, support for separate keystores for each PDB is now
provided. Providing the PDBs with their own keystore is called the “Isolated Mode.” Having
independent keystores allows PDBs to be managed independently of each other. The shared
keystore mode provided with Oracle Database 12c is now called "United Mode". Both modes can be used at the same time, in a single multitenant environment, with some PDBs sharing a common keystore in united mode and some having their own independent keystores in isolated mode.
This feature allows each PDB running in isolated mode within a CDB to manage its own keystore.
Isolated mode allows a tenant to manage its TDE keys independently, and supports the requirement
for a PDB to be able to use its own independent keystore password. The project aims to allow the
customer to decide how the keys of a given PDB are protected, either with the independent
password of an isolated keystore, or with the password of the united keystore.
PDBs can optionally have their own keystore, allowing tenants to manage their own keys.
1. Define the shared location for the CDB root and PDB keystores (WALLET_ROOT is a static parameter, so it requires SCOPE=SPFILE and an instance restart):
SQL> ALTER SYSTEM SET wallet_root = '/u01/app/oracle/admin/ORCL/tde_wallet' SCOPE=SPFILE;
2. Define the default PDB keystore type for each future isolated PDB, and then define a different keystore type in each isolated PDB if necessary:
SQL> ALTER SYSTEM SET tde_configuration = 'KEYSTORE_CONFIGURATION=FILE';
– United: WALLET_ROOT/<component>/ewallet.p12
– Isolated: WALLET_ROOT/<pdb_guid>/<component>/ewallet.p12
PDBB /u01/app/oracle/admin/ORCL/tde_wallet/51FE2A4899472AE6/tde/ewallet.p12
PDBC /u01/app/oracle/admin/ORCL/tde_wallet/7893AB8994724ZC8/tde/ewallet.p12
3. Connect as the PDB security admin to the newly created PDB to create the PDB keystore.
The TDE master key of an isolated PDB is stored in its own keystore:
WALLET_ROOT/<pdb_guid>/tde/ewallet.p12
In the case of a newly-created PDB, the ADMINISTER KEY MANAGEMENT privilege needs to be
granted to a newly-created local user in the PDB, who acts as the security officer for the new PDB. It
is assumed that this security officer is provided with the password of the united keystore, because
that password is required to gain access to the TDE master key. Note that knowledge of this
password does not allow the user to perform an ADMINISTER KEY MANAGEMENT UNITE
KEYSTORE operation; additional privilege scope is needed for the UNITE KEYSTORE operation.
The PDB security officer is then allowed to invoke the ADMINISTER KEY MANAGEMENT CREATE
KEYSTORE command, which creates an isolated keystore for the PDB and automatically configures
the keystore of the PDB to run in isolated mode.
Note: Observe that the ADMINISTER KEY MANAGEMENT CREATE KEYSTORE command does
not use the definition of the keystore location. The keystore location is defined in the WALLET_ROOT
parameter.
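As a sketch, the keystore creation described above might look like the following; the password is a placeholder, and the exact syntax should be checked against the Oracle Database Advanced Security Guide:

```sql
-- Run inside the PDB as the PDB security officer (password is a placeholder).
-- The keystore is created under WALLET_ROOT/<pdb_guid>/tde/ and the PDB is
-- automatically configured to run in isolated mode.
ADMINISTER KEY MANAGEMENT CREATE KEYSTORE IDENTIFIED BY "pdb_ks_pwd";
```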
In the V$ENCRYPTION_WALLET view, the KEYSTORE_MODE column shows NONE for the CDB root
container. For the isolated PDB, the KEYSTORE_MODE column shows ISOLATED.
If you want to convert a PDB to run in isolated mode, the ADMINISTER KEY MANAGEMENT
privilege needs to be commonly granted to a newly-created common user who will act as the
security officer for the PDB. The security officer for each PDB will now be managing their own
keystore.
Then after logging in to the PDB as the security officer, the ADMINISTER KEY MANAGEMENT
ISOLATE KEYSTORE command must be executed to isolate the key of the PDB into a separate
isolated keystore. The isolated keystore is created by this command, with its own password.
All of the previously active (historical) master keys associated with the PDB are moved to the
isolated keystore.
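The conversion might be sketched as follows; both passwords are placeholders, and the FORCE KEYSTORE clause is shown on the assumption that the united keystore may need to be opened temporarily:

```sql
-- Run inside the PDB as the commonly granted security officer
ADMINISTER KEY MANAGEMENT ISOLATE KEYSTORE
  IDENTIFIED BY "pdb_ks_pwd"                  -- password of the new isolated keystore
  FROM ROOT KEYSTORE
  FORCE KEYSTORE IDENTIFIED BY "cdb_ks_pwd"   -- password of the united keystore
  WITH BACKUP;
```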
If a PDB no longer needs to manage its own separate keystore in isolated mode, the security officer
can decide to unite its keystore with that of the CDB root and allow the security officer of the CDB
root to administer its keys.
The PDB security officer who is a common user with the ADMINISTER KEY MANAGEMENT privilege
granted commonly logs in to the PDB and issues the ADMINISTER KEY MANAGEMENT UNITE
KEYSTORE command to unite the keys of the PDB with those of the CDB root.
When the keystore of a PDB is being united with that of the CDB root, all of the previously active
(historical) master keys associated with the PDB are also moved to the keystore of the CDB root.
When V$ENCRYPTION_WALLET is queried from the united PDB, which is now configured to use
the CDB root keystore, the KEYSTORE_MODE column shows UNITED.
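A hedged sketch of the command; the passwords are placeholders, and the exact clause order should be verified against the documentation:

```sql
-- Run inside the PDB as the commonly granted security officer
ADMINISTER KEY MANAGEMENT UNITE KEYSTORE
  IDENTIFIED BY "pdb_ks_pwd"                  -- password of the isolated keystore
  WITH ROOT KEYSTORE
  FORCE KEYSTORE IDENTIFIED BY "cdb_ks_pwd"   -- password of the united keystore
  WITH BACKUP;
```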
To migrate a PDB from using wallet as the keystore to using Oracle Key Vault if the PDB is
running in isolated mode:
1. Upload the TDE encryption keys from the isolated keystore to Oracle Key Vault using a
utility.
2. Set the TDE_CONFIGURATION parameter of the PDB to the appropriate value:
SQL> ALTER SYSTEM SET tde_configuration = 'KEYSTORE_CONFIGURATION=OKV';
To migrate a PDB from using wallet as the keystore to using Oracle Key Vault if the PDB is running
in isolated mode, the TDE encryption keys from the isolated keystore need to be uploaded to Oracle
Key Vault by using a utility such as the okvutil upload command to migrate an existing TDE
wallet to Oracle Key Vault. Then the TDE_CONFIGURATION parameter of the PDB needs to be
changed to KEYSTORE_CONFIGURATION=OKV.
Refer to the following Oracle documentation:
• Oracle Database Advanced Security Guide 18c – Chapter Managing the Keystore and the
Master Encryption Key - Migration of Keystores to and from Oracle Key Vault.
• Oracle Key Vault Administrator's Guide 12c Release 2 (12.2) – Chapter Migrating an Existing
TDE Wallet to Oracle Key Vault – Oracle Key Vault Use Case Scenarios
• Oracle Key Vault Administrator's Guide 12c Release 2 (12.2) – Chapter Enrolling Endpoints
for Oracle Key Vault – okvutil upload Command
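As an illustration only, the upload of an isolated PDB wallet (using the PDBB GUID path shown earlier in this lesson) might look like the following; the endpoint must already be enrolled, and the options shown are assumptions to verify against the Oracle Key Vault documentation:

```shell
# Hypothetical upload of an isolated PDB's TDE wallet to Oracle Key Vault
okvutil upload -t WALLET \
  -l /u01/app/oracle/admin/ORCL/tde_wallet/51FE2A4899472AE6/tde -v 2
```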
• Create and then use your own TDE master encryption key by providing raw binary data:
SQL> ADMINISTER KEY MANAGEMENT CREATE KEY
'10203040506032F88967A5419662A6F4E460E892318E307F017BA048707B402493C'
USING ALGORITHM 'SEED128' FORCE KEYSTORE
IDENTIFIED BY "WELcome_12" WITH BACKUP;
This capability is needed by Database Cloud Services to support the integration with the Key
Management service. Instead of requiring that TDE master encryption keys always be generated in
the database, this supports using master keys generated elsewhere.
The ADMINISTER KEY MANAGEMENT command allows you to either SET your own TDE master
encryption key if you want to create and activate the TDE master encryption key within a single
statement, or CREATE if you want to create the key for later use, without activating it yet. To activate
the generated key, first find the key in the V$ENCRYPTION_KEYS view and then use the USE KEY
clause of the same command.
Define the value for the key:
• MKID: The master encryption key ID is a 16-byte hex-encoded value that you can create or
have Oracle Database generate. If you omit this value, Oracle Database generates it.
• MK: The master encryption key is a hex-encoded value that you can create or have Oracle
Database generate, either 32 bytes for the AES256, ARIA256, and GOST256 algorithms or
16 bytes for the SEED128 algorithm. The default algorithm is AES256.
If you omit both the MKID and MK values, then Oracle Database generates both of the values for
you.
To complete the operation, the keystore must be opened. The keystore can be temporarily opened
by using the FORCE KEYSTORE clause.
In Oracle Database 12c, fixed-user database link passwords are obfuscated in the database, and
hackers have found ways to deobfuscate them.
In Oracle Database 18c, you can have the database link passwords replaced with "x" in the dump
file by enabling the credentials encryption in the dictionary of the CDB root and PDBs.
If the operation is performed in the CDB root, the CDB root keystore is required and opened. If the
operation is performed in a PDB and if the PDB works in isolated mode, the PDB keystore is
required and opened.
To perform this operation, the SYSKM privilege is required.
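The statement that enables the encryption of credentials is the counterpart of the disable statement below:

```sql
-- Requires the SYSKM privilege; the keystore must be open
SQL> ALTER DATABASE DICTIONARY ENCRYPT CREDENTIALS;
```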
If you need to disable the credentials encryption, use the following statement:
SQL> ALTER DATABASE DICTIONARY DELETE CREDENTIALS KEY;
In Oracle Database 12c, because fixed-user database links passwords are obfuscated in the
database, Data Pump export exports the database links passwords with the obfuscated value. In this
case, Oracle recommends that you set the ENCRYPTION_PASSWORD parameter on the expdp
command so that the obfuscated database link passwords are encrypted in the dump file. To further
increase security, Oracle recommends that you set the ENCRYPTION_PWD_PROMPT parameter to
YES so that the password can be entered interactively from a prompt, instead of being echoed on the
screen with the ENCRYPTION_PASSWORD parameter.
In Oracle Database 18c, if you enabled the encryption of credentials in the database, a Data Pump
export operation stores an invalid password for the database link password in the dump file. A
warning message during the export and import operations tells you that after the import, the
password for the database link has to be reset.
If you do not reset the password of the imported database link, the following error appears when you
attempt to use it during a query:
SELECT * FROM system.test@test
*
ERROR at line 1:
ORA-28449: cannot use an invalidated database link
Because Oracle Database has managed system changes with record-and-replay since Oracle
Database 11g, a significant benefit is the added confidence to the business in the success of
performing changes. The record-and-replay functionality offers an accurate method to test the
impact of a variety of system changes, including database upgrades, configuration changes, and
hardware upgrades.
A useful application of Database Replay is to test the performance of a new server configuration.
Consider a customer who is utilizing a single instance database and wants to move to a RAC setup.
The customer records the workload of an interesting period and then sets up a RAC test system for
replay. During replay, the customer is able to monitor the performance benefit of the new
configuration by comparing the performance to the recorded system. This can also help convince the
customer to move to a RAC configuration after seeing the benefits of using the Database Replay
functionality.
Another application is debugging. You can record and replay sessions emulating an environment to
make bugs more reproducible. Manageability feature testing is another benefit. Self-managing and
self-healing systems need to implement this advice automatically (“autonomic computing model”).
Multiple replay iterations allow testing and fine-tuning of the control strategies’ effectiveness and
stability. The database administrator, or a user with special privileges granted by the DBA, initiates
the record-and-replay cycle and has full control over the entire procedure.
On the server side, the DB Replay user password is set in the keystore before the workload capture.
On the client side, the DB Replay client password is set in the SSO wallet. The two passwords must
match.
• Before the whole process starts, the DBA must set the password for the DB Replay user
(oracle.rat.database_replay.encryption identifier) in the keystore. The DB
Replay user password is then mapped to an encryption key and stored in the keystore.
• During the capture, the oracle.rat.database_replay.encryption password is
retrieved from the keystore and used to encrypt the sensitive fields. This encryption key is the
first-level encryption key used to generate a second-level encryption key for each capture file.
The second-level encryption key is encrypted by the first-level encryption key and saved in
the capture file header. The second-level encryption key is applied to all sensitive fields in
capture files, such as database connection strings, SQL text, and SQL bind values.
• During processing and replay, the data encrypted in the capture files is decrypted:
a) During processing and replay, the oracle.rat.database_replay.encryption
password is verified against the keystore to see if it matches the one used during the
workload capture.
b) If the verification is successful, the password is mapped to the first-level encryption
key, which subsequently is used to recover the second-level encryption key for each
capture.
c) Once the second-level encryption key is available, all sensitive fields are decrypted.
1. Before the whole process can encrypt sensitive data, set the password for the DB Replay
user (oracle.rat.database_replay.encryption identifier) in the keystore.
To delete the secret password from the keystore, use the ADMINISTER KEY MANAGEMENT
DELETE SECRET FOR CLIENT 'oracle.rat.database_replay.encryption'
command.
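The corresponding step of storing the password in the keystore can be sketched as follows; the secret and keystore password are placeholders, and the clause order should be checked against the ADMINISTER KEY MANAGEMENT reference:

```sql
-- Store the DB Replay user password as a client secret in the keystore
ADMINISTER KEY MANAGEMENT ADD SECRET 'replay_secret'
  FOR CLIENT 'oracle.rat.database_replay.encryption'
  IDENTIFIED BY "keystore_password" WITH BACKUP;
```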
2. Then when starting the capture, specify which encryption algorithm used to encrypt the data
in the workload capture files by using the new ENCRYPTION parameter of the
DBMS_WORKLOAD_CAPTURE.START_CAPTURE procedure:
- NULL: Capture files are not encrypted (default).
- AES128: Capture files are encrypted using AES128.
- AES192: Capture files are encrypted using AES192.
- AES256: Capture files are encrypted using AES256.
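Step 2 might be sketched as follows; the capture name and directory object are placeholders:

```sql
-- Start a capture whose files are encrypted with AES256 (18c ENCRYPTION parameter)
BEGIN
  DBMS_WORKLOAD_CAPTURE.START_CAPTURE(
    name       => 'oltp_capture',
    dir        => 'OLTP',
    encryption => 'AES256');
END;
/
```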
3. Stop the capture when workload is sufficient for testing.
You can encrypt data in existing capture files by using the
DBMS_WORKLOAD_CAPTURE.ENCRYPT_CAPTURE procedure:
SQL> exec DBMS_WORKLOAD_CAPTURE.ENCRYPT_CAPTURE(-
SRC_DIR => 'OLTP', DST_DIR => 'OLTP_ENCRYPTED')
You can also decrypt data in capture files by using the
DBMS_WORKLOAD_CAPTURE.DECRYPT_CAPTURE procedure:
SQL> exec DBMS_WORKLOAD_CAPTURE.DECRYPT_CAPTURE(-
SRC_DIR => 'OLTP_ENCRYPTED', DST_DIR => 'OLTP_DECRYPTED')
4. Process the capture after moving the capture files to the testing server environment.
SQL> exec DBMS_WORKLOAD_REPLAY.PROCESS_CAPTURE(capture_dir => 'OLTP')
7. On the client side, the password for the encrypted capture is retrieved from a client-side SSO
wallet on disk. The wrc replay client retrieves the password by the identifier
oracle.rat.database_replay.encryption. This ensures the safety of the user password
without compromising automation.
Set up a client-side wallet including the same
oracle.rat.database_replay.encryption client password defined in the keystore of
the production database where the capture was executed.
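The wallet setup in step 7 might be sketched with mkstore as follows; the wallet directory is a placeholder, and the -createSSO and -createEntry options are assumptions to verify against the mkstore documentation:

```shell
# Create an auto-login (SSO) wallet and store the capture password in it
mkstore -wrl /u01/replay_wallet -createSSO
mkstore -wrl /u01/replay_wallet \
  -createEntry oracle.rat.database_replay.encryption <capture_password>
```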
8. Start as many wrc replay clients as required to replay the capture workload. The client
retrieves the password for the oracle.rat.database_replay.encryption from the
client-side SSO wallet created in step 7. In the wrc command line, specify the directory
where the client-side SSO wallet was created by using the WALLETDIR parameter.
9. Start the replay.
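Steps 8 and 9 might look like the following; the credentials and directories are placeholders:

```shell
# Start one replay client; WALLETDIR points to the client-side SSO wallet
wrc system/<password> mode=replay replaydir=/u01/replay \
  walletdir=/u01/replay_wallet
```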
Oracle Database Vault (DB Vault) helps customers control privileged user access to sensitive
application data stored in the database.
The slide shows how DB Vault realms prevent privileged database users and even privileged
applications from accessing data outside their authorization. Realms can be placed around a single
table or an entire application. Once in place, they prevent users with privileges such as the DBA role
from accessing data protected by the realm. Interestingly enough, many applications today also have
privileged accounts. When applications are consolidated, it can be advantageous to put realms
around the individual applications to prevent, as an example, an application owner from “peeking
over the fence” at the contents of another application, perhaps containing sensitive financial data.
The following components of Database Vault provide highly configurable access control:
• Realms and authorization types
– Participant
– Owner
1. The DBA can view the ORDERS table data:
SQL> SELECT order_total FROM oe.orders
WHERE customer_id = 101;
• Realms: A boundary defined around a set of objects in a schema, a whole schema, multiple
schemas, or roles. A realm protects the objects in it, such as tables, roles, and packages
from users exercising system or object privileges, such as SELECT ANY TABLE or even from
the schema owner. A realm may also have authorizations given to users or roles as
participants or owners. The security manager can define which users are able to access the
objects that are secured by the realm via realm authorizations and under which conditions
(rule sets).
In Oracle Database Vault 12c, there are two authorization types within a realm:
- Participant: The grantee is able to access the realm-secured objects.
- Owner: The grantee has all the access rights that a participant has, and can also
grant privileges (if they have either GRANT ANY ROLE or were granted that privilege
with the WITH ADMIN option) to others on the objects in the realm.
• Command rules: An ability to block the specified SQL command under a set of specific
conditions (rule sets)
• Rule sets: A collection of rules that are evaluated for the purpose of granting access. Rule
sets work with both realms and command rules to create a trusted path. Rule sets can
incorporate factors to create this trusted path to allow fine grained control on realms and
command rules. Examples of realms and command rules configured with rule sets:
- Realms can be restricted to accept only SQL from authorized users from a specific set
of IP addresses.
- Command rules can limit sensitive commands to certain DBAs from local workstations
during business hours.
New realm authorization types to allow users to run DB Replay capture and replay:
• DBCAPTURE authorization type
• DBREPLAY authorization type
• Managed using Database Vault admin procedures (require the DV_OWNER or DV_ADMIN role):
– DVSYS.DBMS_MACADM.AUTHORIZE_DBCAPTURE
– DVSYS.DBMS_MACADM.UNAUTHORIZE_DBCAPTURE
– DVSYS.DBMS_MACADM.AUTHORIZE_DBREPLAY
– DVSYS.DBMS_MACADM.UNAUTHORIZE_DBREPLAY
• Authorizations are listed in the DVSYS.DBA_DV_DBREPLAY_AUTH view
(GRANTEE = name of the granted user)
You may want to use Database Replay to evaluate the functionality and performance of any mission
critical system with Database Vault.
However, the execution of Database Replay on a database with Database Vault configured requires
that the capture or replay user is granted appropriate access controls required by Database Vault to
access data in the database. Database Vault does not rely on system and object privileges to grant
access to data to users. Database Vault relies on realms with authorizations, rule sets, command
rules and secure application roles to allow or deny users access to application data.
In Oracle Database 18c, two new authorization types can be defined for a realm to allow capture or
replay users to run captures or replays.
• A user is allowed to run a capture only if the user is authorized for DBCAPTURE authorization
type by the Database Vault.
• A user is allowed to run a replay only if the user is authorized for DBREPLAY authorization
type by the Database Vault.
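Granting and verifying a replay authorization might be sketched as follows; the user name is a placeholder:

```sql
-- Run as a user with the DV_OWNER or DV_ADMIN role
EXEC DVSYS.DBMS_MACADM.AUTHORIZE_DBREPLAY('RAT_ADMIN')

-- Verify the authorization
SELECT grantee FROM DVSYS.DBA_DV_DBREPLAY_AUTH;
```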
External directories store user credentials and authorizations in a central location (an LDAP-
compliant directory, such as OID, OUD, or OVD):
• Eases administration through centralization
• Enables single-point authentication
• Eliminates the need for client-side wallets
[Slide: user Paul authenticates once against the directory and is mapped, with his password and
the role_mgr and sales roles, to both the PROD and ORCL databases.]
Authenticating and authorizing users with external directories is an important feature of Oracle
Database 12c Enterprise Edition, which allows users to be defined in an external directory rather
than within the databases. Their identities remain constant throughout the enterprise.
Authenticating and authorizing users with external directories addresses the user, the administrative,
and the security challenges by centralizing storage and management of user-related information with
Enterprise User Security (EUS) relying on Oracle Directory Services such as Oracle Internet
Directory (OID), Oracle Unified Directory (OUD), and Oracle Virtual Directory (OVD).
When an employee changes jobs in such an environment, the administrator needs to modify
information only in one location (the directory) to make changes effective in multiple databases and
systems. This centralization can substantially lower administrative costs while materially improving
enterprise security.
[Slide: ODS/EUS directory metadata repository. Entry for user Ann: DN: Ann, authentication:
password, password: pass_ann; for database ORCL, the mapping schema is user_global.
ldap.ora: DIRECTORY_SERVERS=(oidhost:13060:13130)]
In the example in the slide, a client can submit the same connect command, whether connecting as
a database user or as an enterprise user. The enterprise user has the additional benefit of allowing
the use of a shared schema.
The authentication process is as follows:
1. The user presents a username and password (or other credentials).
2. The directory returns the authorization token to the database.
3. The schema is mapped from the directory information. The directory supplies the global roles
for the user. Enterprise roles are defined in the directory and global roles are defined in the
database (non-CDB or PDB). The mapping from enterprise roles to global roles is in the
directory. The directory can supply the application context. An application context supplied
from the directory is called a global context.
4. Finally, the user is connected.
If the authentication and authorization must be completed with Active Directory (AD), the request
must first go through Oracle Directory Services (ODS), which integrates with AD to obtain the user's
authentication and authorization.
Note: Each PDB has its own metadata, such as global users, global roles, and so on. Each PDB
should have its own identity in the directory.
Create exclusive global schemas and shared global schemas, authenticated by PKI certificates,
passwords, or Kerberos tickets.
[Slide: ODS/EUS entries. The exclusive schemas user_ann and user_tom in ORCL are each
mapped one-to-one to a directory entry: DN: CN=analyst… (authentication: certificate,
certificate: DN_ann) and DN: CN=trainer… (authentication: password, password: pass_tom).
The shared schema GLOBAL_U in ORCL is mapped to several entries: DN: CN=manager…
(password: pass_paul) and DN: CN=director… (password: pass_jean), both password-
authenticated.]
Authentication methods and certificates of users can be centralized in the directory. A user can
connect to the database in two different ways.
• A global exclusive schema in the database that has a one-to-one schema mapping in the
directory. This method requires that the user be created in every database where the end
user requires access. The following command creates a database user identified by a
distinguished name. The DN is the distinguished name in the user’s PKI certificate in the
user’s wallet.
CREATE USER global_ann
IDENTIFIED GLOBALLY AS 'CN=analyst,OU=division1, O=oracle, C=US';
• A global shared schema in the database that has a shared schema mapping in the
directory. Any user identified to the directory can be mapped to the shared global schema in
the database. The mapped user will be authenticated by the directory and the schema
mapping will provide the privileges. The following command creates the global shared
schema:
CREATE USER global_u IDENTIFIED GLOBALLY;
No one connects directly to the GLOBAL_U schema. Any user that is mapped to the
GLOBAL_U schema in the directory can connect to this schema.
12c: Deploy and synchronize database user credentials and authorizations with ODS/EUS first.
18c: Deploy database user credentials and authorizations directly in Active Directory with
Centrally Managed Users (CMU):
– Centralized database user authentication
– Centralized database access authorization
[Slide details: ldap.ora contains DIRECTORY_SERVERS=(oidhost:13060:13130) and
DIRECTORY_SERVER_TYPE = AD. Database parameters: LDAP_DIRECTORY_ACCESS =
PASSWORD | SSL | PASSWORD_XS | SSL_XS, and LDAP_DIRECTORY_SYSAUTH = yes. Users
mapping: AD user Ann maps to the exclusive schema user_ann in ORCL. Groups mapping: AD
group G-ORCL maps to the shared schema g_AD_u in ORCL; AD group MGR-ORCL is granted the
global role mgr_role in ORCL. Global shared or exclusive schemas are authenticated by PKI
certificates, passwords, or Kerberos tickets.]
With Oracle Database 18c, Centrally Managed Users (CMU) allows sites to manage database user
credentials and authorizations in Active Directory directly without the need to deploy and synchronize
them with EUS relying on Oracle Directory Services. Active Directory stores authentication and
authorization data that is used by the database to authenticate users.
CMU provides the following capabilities:
• Supports passwords, Kerberos, and PKI certificates
- AD stores user database password verifiers: Passwords use verifiers as they did
before. The only difference is that the verifier for a user is not stored in the database,
but in AD.
- AD includes Kerberos Key Distribution Center: The difference is that the user is now a
global user (not authenticated externally).
- AD verifies client Distinguished Name (DN) and may act as Certificate Authority
• Supports AD account policies:
- For passwords: Expiration, complexity, and history
- For lockout: Threshold, duration and reset account lockout counter
- For Kerberos: Ticket timeout, clock skew between server and KDC
Simplified Implementation
The following key advantages will lead you to use CMU rather than EUS:
• Simplified centralized directory services integration with less cost and complexity
- Authentication in AD for password, Kerberos and PKI
- Map AD groups to shared database accounts and roles
- Map exclusive AD user to database user
- Support AD account policies
• No client update required
• Supports all Oracle Database clients 10g and later
EUS and Oracle Directory Services authentication and authorization work as before.
After conversion:
• Is it possible to recover the PDB back in time before the non-CDB was converted?
[Slide: migration methods from a non-CDB to a CDB: Data Pump dump file (impdp/TTS),
plugging with an XML manifest, cloning, and data replication.]
In Oracle Database 12c, there are different possible methods to migrate a non-CDB to a CDB.
Whichever method is used, are the non-CDB backups transported with the non-CDB during the
migration?
• Does Oracle Data Pump transport the non-CDB backups?
- You use either transportable tablespaces (TTS), full conventional export/import, or full
transportable database (TDB), provided that, for the last method, every user-defined object
resides in a user-defined tablespace.
- Using any of these Data Pump methods, Data Pump transports object definitions and
data, but not backups.
• Does the plugging method transport the non-CDB backups? Generating an XML metadata
file from the non-CDB to use it during the plugging step into the CDB only describes the
non-CDB data files, but it does not describe the list of backups associated to the non-CDB.
• Does the cloning method transport the non-CDB backups? Cloning non-CDBs in a CDB
requires copying the files of the non-CDB to a new location. It does not copy the backups to a
recovery location.
• Does replication transport the non-CDB backups? The replication method replicates the data
from a non-CDB to a PDB. When the PDB catches up with the non-CDB, you fail over to the
PDB. Backups are not associated with the replication.
Because there are no backups transported with the non-CDB into the target CDB, no restore or
recovery using the old backups is possible. Even if the non-CDB backups were manually
transported or copied to the target CDB, users cannot perform restore/recover operations using
these backups. You had to create a new baseline backup for the non-CDB converted to a PDB.
In Oracle Database 18c, you can transport the existing backups and backed up archive log files of
the non-CDB and reuse them to restore and recover the new PDB.
The backups transported from the non-CDB into the PDB are called preplugin backups.
Transporting the backups and backed up archive log files associated to a non-CDB before migration
requires the following steps:
1. The new DBMS_PDB.exportRmanBackup procedure must be executed in the
non-CDB opened in read/write mode. This is a mandatory step for non-CDB migration.
The procedure exports all RMAN backup metadata that belongs to the non-CDB into its own
dictionary. The metadata is transported along with the non-CDB during the migration.
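For a non-CDB, the call might be sketched as follows; the argument-free form is an assumption (the PDB variant shown later in this lesson takes a PDB name):

```sql
-- Run in the non-CDB, opened in read/write mode
SQL> EXEC DBMS_PDB.exportRmanBackup
```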
2. Use dbms_pdb.describe to generate an XML metadata file from the non-CDB describing
the structure of the non-CDB with the list of datafiles.
3. Archive the current redo log file required for a potential restore/recover using preplugin
backups.
4. Transfer the data files, backups, and archive log files to the target CDB.
5. Use the XML metadata file during the plugging step to create the new PDB into the CDB.
6. Run the ORACLE_HOME/rdbms/admin/noncdb_to_pdb.sql script to delete unnecessary
metadata from the PDB SYSTEM tablespace.
7. Open the PDB. When the PDB is open, the exported backup metadata is automatically
copied into the CDB dictionary, except the current redo log file archived in step 3. Catalog the
archived redo log file as one of the preplugin archived logs.
Because the backups for the PDB are now available in the CDB, they can be reused to recover the
PDB.
After relocating/plugging the PDB into another CDB:
• Is it possible to recover the PDB back in time before it was relocated/unplugged?
• Are the PDB backups transported with the relocated/unplugged PDB?
[Slide: PDB1 is unplugged by using DBMS_PDB, which produces an XML file describing its data
files; the data files are then plugged into CDB2 (with its own control files, redo log files, and CDB
root data files/tempfiles) as PDB2, which is opened.]
In Oracle Database 12c, PDBs can be hot cloned from one CDB into another CDB by using local
UNDO tablespaces.
Are the PDB backups transported with the PDB during the cloning?
• Cloning a PDB into another CDB requires copying the files of the PDB to a new location. It
does not copy the backups to a recovery location.
If there are no backups transported with the PDB into the target CDB, no restore or recovery using
the old backups is possible. Even if the PDB backups were manually transported or copied to the
target CDB, users cannot perform restore/recover operations using these backups. You had to
create a new baseline backup for PDBs that were relocated or plugged.
Transporting backups when replugging a PDB into another CDB (CDB2):
1. Export backup metadata by using DBMS_PDB.exportRmanBackup.
2. Unplug the PDB by using DBMS_PDB.describe, which generates the XML file.
3. Transfer the data files, including backups and archive log files, to the target CDB.
4. Plug using the XML file.
5. Open the PDB. This automatically imports the backup metadata into the CDB dictionary.
Then you can restore/recover the PDB by using the preplugin backups.
In Oracle Database 18c, you can transport the existing backups and backed up archive log files of
the PDB and reuse them to restore and recover the new plugged PDB.
To transport the backups and backed up archive log files associated to a PDB before replugging the
PDB, perform the following steps:
1. The following new DBMS_PDB.exportRmanBackup procedure can be executed in the PDB
opened in read/write mode. This is not a mandatory step before unplugging the PDB:
SQL> EXEC dbms_pdb.exportRmanBackup ('<pdb_name>')
The procedure exports all RMAN backup metadata that belongs to the PDB into its own
dictionary. The metadata is transported along with the PDB during the unplug operation.
2. Unplug the PDB.
3. Transfer the data files, backups, and archive log files to the target CDB.
4. Plug the PDB with the COPY clause to copy the data files, backups, and backed up archive
log files of the source PDB into a new directory.
5. Open the new PDB. When the PDB is open, the exported backup metadata is automatically
copied into the CDB dictionary.
Because the backups for the PDB are now available in the CDB, they can be reused to recover the
PDB.
If you forgot to execute the DBMS_PDB.exportRmanBackup procedure before unplugging the PDB,
you can still catalog the existing backups and backed up archive log files of the plugged PDB after
the plugging operation and reuse them to restore and recover the plugged PDB.
If preplugin backups and archive log files are moved, or new backups and archive log files were
created on the source CDB after the PDB was transported, then the target CDB does not know
about them. You can catalog those preplugin files:
RMAN> SET PREPLUGIN CONTAINER=<pdb_name>;
RMAN> CATALOG PREPLUGIN ARCHIVELOG '<archivelog>';
RMAN> CATALOG PREPLUGIN BACKUP '<backup_name>';
Use the PREPLUGIN option to perform RMAN operations using preplugin backups.
• Restore a PDB from its preplugin backups cataloged in the target CDB.
RMAN> RESTORE PLUGGABLE DATABASE pdb_noncdb FROM PREPLUGIN;
• Recover a PDB from its preplugin backups up to the point at which the datafile was plugged in.
RMAN> RECOVER PLUGGABLE DATABASE pdb_noncdb FROM PREPLUGIN;
• Check whether preplugin backups and archive log files are cataloged in the target CDB.
RMAN> SET PREPLUGIN CONTAINER=pdb1;
• A restore operation from preplugin backups restores the datafiles taken before the PDB was
plugged in.
• A recover operation using preplugin backups uses preplugin incremental backups and
archive logs to recover the datafile up to the time it was plugged in. At the end of the recover
operation, the datafile is checkpointed as of the plugin SCN.
The preplugin archivelogs are restored to the target archivelog destination by default as long
as the target archivelog destination is not a fast recovery area (FRA). If the target archivelog
destination is the FRA, then the user has to provide an explicit archivelog destination using
the SET ARCHIVELOG DESTINATION command before executing RECOVER FROM
PREPLUGIN.
• If preplugin metadata belongs to more than one PDB, a command that does not specify
which PDB it refers to errors out, indicating that the user should scope the PDB. The scoping
can be done by using the SET PREPLUGIN CONTAINER command. Scoping is not
necessary if you connected to the PDB as the target; the SET PREPLUGIN CONTAINER
command is necessary if you connected to the CDB root of the target CDB.
• CROSSCHECK, DELETE, and CHANGE commands can use the PREPLUGIN option. The
CROSSCHECK command can validate the existence of preplugin backups, archived log files,
and image copies. The DELETE command can delete preplugin backups, archived log files
and image copies, and also expired preplugin backups.
RMAN> DELETE EXPIRED PREPLUGIN BACKUP;
RMAN> CHANGE PREPLUGIN ARCHIVELOG ALL UNAVAILABLE;
RMAN> CHANGE PREPLUGIN BACKUP AVAILABLE;
RMAN> CHANGE PREPLUGIN COPY UNAVAILABLE;
• The source and destination CDBs must have COMPATIBLE set to 18.1 or higher to
create, restore, or recover preplugin backups.
• When plugging in a non-CDB, the non-CDB must use ARCHIVELOG mode.
• The target CDB does not manage preplugin backups.
– Use CROSSCHECK and DELETE commands to manage the preplugin backups.
• A RESTORE using preplugin backups can restore datafiles from one PDB only.
• Backups taken by the source cdb1 are visible in target cdb2 only.
• The target CDB does not manage the source database backups. However, there are
commands to delete and crosscheck the source database backups.
• In one RMAN command, you cannot specify datafiles belonging to different PDBs when using
preplugin backups.
• The CDB root must be opened to make use of preplugin backups.
• Backups taken by the source cdb1 are visible in target cdb2 only. For instance, a PDB can
migrate from cdb1 to cdb2 and then to cdb3. The backups of the PDB taken at cdb1 are
accessible by cdb2. They are not accessible by cdb3. The cdb3 can only see backups of
the PDB taken by cdb2.
In this example, you first avoid any ambiguity as to which PDB the backups belong by scoping the
PDB.
Then you catalog the last archive log file created after the PDB was unplugged and the metadata
exported.
Then you restore and recover the PDB using preplugin backups.
And finally, you run a normal media recovery after recovering from preplugin backups.
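The sequence above can be sketched as follows; the PDB name and the archive log path are placeholders, not values from the example itself:

```sql
RMAN> SET PREPLUGIN CONTAINER = pdb1;
RMAN> CATALOG PREPLUGIN ARCHIVELOG '/u01/arch/thread_1_seq_125.arc';
RMAN> RESTORE PLUGGABLE DATABASE pdb1 FROM PREPLUGIN;
RMAN> RECOVER PLUGGABLE DATABASE pdb1 FROM PREPLUGIN;
RMAN> RECOVER PLUGGABLE DATABASE pdb1;
```

The last command runs the normal media recovery, which applies redo generated after the plugin SCN.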
18c Duplicate a PDB or PDB tablespaces in active mode to an existing opened CDB.
– Set the COMPATIBLE initialization parameter to 18.1.
– Clone only one PDB at a time.
In Oracle Database 12c, to duplicate PDBs, you must create the auxiliary instance as a CDB. To do
so, start the instance with ENABLE_PLUGGABLE_DATABASE=TRUE set in the initialization
parameter file. When you duplicate one or more PDBs, RMAN also duplicates the CDB root and the
CDB seed. The resulting duplicate database is a fully functional new CDB that contains the CDB
root, the CDB seed, and the duplicated PDBs.
In Oracle Database 18c, the destination instance acts as the auxiliary instance.
• An active PDB can be duplicated directly into an open CDB.
• The passwords for target and auxiliary connections must be the same when using active
duplicate.
• In the auxiliary instance, define the location where to restore the foreign archive log files via
the new initialization parameter, REMOTE_RECOVERY_FILE_DEST.
RMAN should be connected to the CDB root of the target and auxiliary instances.
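A sketch of the setup step on the auxiliary side; the directory path is a placeholder:

```sql
-- On the destination (auxiliary) CDB: where foreign archive log files are restored
SQL> ALTER SYSTEM SET REMOTE_RECOVERY_FILE_DEST = '/u01/app/oracle/remote_arch';
```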
Limitations
• Non-CDB to PDB duplication is not supported.
• Encryption is not supported for PDB cloning.
• SPFILE, NO STANDBY, FARSYNC STANDBY, LOG_FILE_NAME_CONVERT keywords are not
supported.
• NORESUME, DB_FILE_NAME_CONVERT, SECTION SIZE, USING COMPRESSED
BACKUPSET keywords are supported.
4. Start duplicate.
RMAN> DUPLICATE PLUGGABLE DATABASE pdb1 TO cdb2 FROM ACTIVE DATABASE;
The example shows a duplication of pdb1 from cdb1 into the existing cdb2 as pdb1.
To perform this operation, connections to the source (TARGET) cdb1 and to the destination
(AUXILIARY) cdb2 are required.
The location where to restore the foreign archive log files in the auxiliary instance is defined via the
new initialization parameter, REMOTE_RECOVERY_FILE_DEST.
Then the DUPLICATE command defines that the operation is performed while the source pdb1 is
opened.
• cdb2 needs to be opened in read/write.
4. Start duplicate.
RMAN> DUPLICATE PLUGGABLE DATABASE pdb1 AS pdb2 TO cdb2 FROM ACTIVE DATABASE;
The example shows a duplication of pdb1 from cdb1 into the existing cdb2 as pdb2.
To perform this operation, connections to the source (TARGET) cdb1 and to the destination
(AUXILIARY) cdb2 are required.
The location where to restore the foreign archive log files in the auxiliary instance is still defined via
the new initialization parameter, REMOTE_RECOVERY_FILE_DEST.
Then the DUPLICATE command defines that the operation is performed while the source pdb1 is
opened.
• cdb2 needs to be opened read/write.
ENCRYPT_NEW_TABLESPACES = CLOUD_ONLY
• The Cloud CDB holds a keystore because this is the default behavior on Cloud.
If you decide to migrate an on-premise CDB to the Cloud, any tablespace created in the Cloud CDB
will be encrypted, even if no encryption clause is declared.
Oracle Database 12c allows encryption of new user-defined tablespaces via a new
ENCRYPT_NEW_TABLESPACES instance parameter.
• A user-defined tablespace that is created in a CDB in the Cloud is transparently encrypted
with Advanced Encryption Standard 128 (AES 128), even if the ENCRYPTION clause of the
SQL CREATE TABLESPACE statement is not specified, because the
ENCRYPT_NEW_TABLESPACES instance parameter is set to CLOUD_ONLY by default.
• A user-defined tablespace that is created in an on-premise database is not transparently
encrypted. Only the ENCRYPTION clause of the CREATE TABLESPACE statement determines
if the tablespace is encrypted.
All forms of duplication are supported except standby duplicate.
• Active duplication connects as TARGET to the source database and as AUXILIARY to the
Cloud instance.
• Backup-based duplication without a target connection connects as CATALOG to the recovery
catalog database and as AUXILIARY to the Cloud instance. RMAN uses the metadata in the
recovery catalog to determine which backups or copies are required to perform the
duplication.
[Figure: encrypted tablespaces are copied from the on-premise ORCL database to an ORCL
database in the Oracle Database Cloud Service; a keystore and encryption are mandatory on the
Cloud side.]
If the source database already contains encrypted tablespaces, the DUPLICATE must have access
to the TDE master key of the source (TARGET) database because the clone instance needs to
decrypt the datafiles before re-encrypting them during the restore operation. In this case, the
keystore has to be copied from the on-premise CDB to the clone instance before starting the
DUPLICATE and must be opened.
The DUPLICATE command allows the new ‘AS ENCRYPTED’ clause to restore the CDB with
encryption.
For more information about duplicating databases to Oracle Cloud Infrastructure, refer to
“Duplicating Databases to Oracle Cloud” in Oracle Database Backup and Recovery User’s Guide
18c and also RMAN Duplicate from an Active Database (https://docs.us-phoenix-
1.oraclecloud.com/Content/Database/Tasks/mig-rman-duplicate-active-
database.htm#RMANDUPLICATEfromanActiveDatabase)
[Figure: encrypted tablespaces are copied from the ORCL database in the Oracle Database Cloud
Service to an on-premise ORCL database; a keystore and encryption are optional on the
on-premise side.]
The source database already contains encrypted tablespaces; therefore, DUPLICATE must have
access to the master key of the source (TARGET) database because the clone instance needs to
decrypt the datafiles before the restore operation. In this case, the keystore has to be copied from
the Cloud CDB to the clone instance before starting DUPLICATE and must be opened by using the
SET DECRYPTION WALLET OPEN IDENTIFIED BY 'password' command.
The DUPLICATE command uses the ‘AS DECRYPTED’ clause to restore the CDB without
encryption.
If the user does not have an Advanced Security Option (ASO) license on the on-premise side, the
on-premise database cannot have TDE-encrypted tablespaces. The DUPLICATE command with the
‘AS DECRYPTED’ clause provides a way to move encrypted tablespaces and databases from the
Cloud to on-premise servers. It is important to note that it does not decrypt tablespaces that were
explicitly created with encryption by using the ENCRYPTION USING clause.
For more information, refer to "Duplicating an Oracle Cloud Database as an On-premise Database"
in Oracle Database Backup and Recovery User’s Guide 18c.
In Oracle Database 12c, the RECOVER … FROM SERVICE command refreshes the standby data
files and rolls them forward to the same point in time as the primary. However, the standby control
file still contains old SCN values, which are lower than the SCN values in the standby data files.
Therefore, to complete the synchronization of the physical standby database, you must refresh the
standby control file to update the SCN values: you have to place the physical standby database in
NOMOUNT mode and restore the control file from the primary database to the standby.
The automation in Oracle Database 18c performs the following steps:
1. Remember all datafile names on the standby.
2. Restart the standby in NOMOUNT mode.
3. Restore controlfile from primary.
4. Mount standby database.
5. Rename datafiles from stored standby names.
6. Restore new datafiles to new names.
7. Recover standby.
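In Oracle Database 18c, these steps are carried out by a single command; a sketch, assuming a TNS service name pointing to the primary database:

```sql
RMAN> RECOVER STANDBY DATABASE FROM SERVICE primary_svc;
```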
• Each session can see and modify only its own data.
Temporary tables can be created to hold session-private data that exists only for the duration of a
transaction or session.
The CREATE GLOBAL TEMPORARY TABLE command creates a temporary table that can be
transaction specific or session specific. For transaction-specific temporary tables, data exists for the
duration of the transaction, whereas for session-specific temporary tables, data exists for the
duration of the session. Data in a session is private to the session. Each session can see and modify
only its own data. DML locks are not acquired on the data of the temporary tables. The clauses that
control the duration of the rows are:
• ON COMMIT DELETE ROWS: To specify that rows are visible only within the transaction. This
is the default.
• ON COMMIT PRESERVE ROWS: To specify that rows are visible for the entire session
You can create indexes, views, and triggers on temporary tables and you can also use the Export
and Import utilities to export and import the definition of a temporary table. However, no data is
exported, even if you use the ROWS option. The definition of a temporary table is visible to all
sessions.
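A minimal sketch of a session-specific global temporary table, using the ACC_TMP name shown in the slide (the column list is assumed for illustration):

```sql
-- Rows are private to each session and survive across transactions
-- until the session ends
SQL> CREATE GLOBAL TEMPORARY TABLE acc_tmp
       (acc_id  NUMBER,
        balance NUMBER)
     ON COMMIT PRESERVE ROWS;
```

Omitting the ON COMMIT clause, or specifying ON COMMIT DELETE ROWS, makes the table transaction specific instead.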
USER_PRIVATE_TEMP_TABLES
Private Temporary Tables (PTTs) exist only for the session that creates them.
• You can create a PTT with the CREATE PRIVATE TEMPORARY TABLE statement.
• Table name must start with ORA$PTT_ : PRIVATE_TEMP_TABLE_PREFIX = ORA$PTT_
• The CREATE PRIVATE TEMPORARY TABLE statement does not commit a transaction.
• Two concurrent sessions may have a PTT with the same name but different shape.
Private Temporary Tables (PTTs) are local to a specific session. In contrast with Global Temporary
Tables, the definition and contents are local to the creating session only and are not visible to other
sessions.
There are two types of duration for the created PTTs.
• Transaction: The PTT is automatically dropped when the transaction in which it was created
ends with either a ROLLBACK or COMMIT. This is the default behavior if no ON COMMIT
clause is defined at PTT creation.
• Session: The PTT is automatically dropped when the session that created it ends. This is the
behavior if the ON COMMIT PRESERVE DEFINITION clause is defined at the PTT creation.
A PTT must be named with a prefix 'ORA$PTT_'. The prefix is defined by default by the
PRIVATE_TEMP_TABLE_PREFIX initialization parameter, modifiable at the instance level only.
Creating a PTT does not commit the current transaction. Since it is local to the current session, a
concurrent user may also create a PTT with the same name but having a different shape.
At this time, PTTs cannot include User Defined Types, constraints, column default values, object
types or XML types, or an identity clause.
PTTs must be created in the user schema. Creating a PTT in another schema, using the ALTER
SESSION SET CURRENT SCHEMA command, is not allowed.
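A sketch of a session-duration PTT, using the ORA$PTT_mine name from the slide (the column list is assumed):

```sql
-- The table name must carry the ORA$PTT_ prefix; the PTT is dropped
-- automatically when the creating session ends
SQL> CREATE PRIVATE TEMPORARY TABLE ora$ptt_mine
       (c1 NUMBER,
        c2 DATE)
     ON COMMIT PRESERVE DEFINITION;
```

Without the ON COMMIT PRESERVE DEFINITION clause, the PTT is dropped at the end of the creating transaction.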
12c
When import detects a format error in the data stream, it aborts the load.
• All table data for the current operation is rolled back.
• Solution: Either re-export and re-import or recover as much of the data as possible
from the file with this corruption.
18c
Importing with the CONTINUE_LOAD_ON_FORMAT_ERROR option:
• Detects a format error in the data stream while importing data
• Instead of aborting the import operation, resumes loading data at the next granule
boundary
In Oracle Database 12c, when a stream format error is detected, Data Pump import aborts and all
the rows already loaded are rolled back.
Oracle Database 18c introduces a new value for the DATA_OPTIONS parameter of impdp. When a
stream format error is detected and the CONTINUE_LOAD_ON_FORMAT_ERROR option is specified
for the DATA_OPTIONS parameter, Data Pump jumps ahead and continues loading
from the next granule. Oracle Data Pump has a directory of granules for the data stream for a table
or partition. Each granule has a complete set of rows. Data for a row does not cross granule
boundaries. The directory is a list of offsets into the stream of where a new granule, and therefore, a
new row, begins. Any number of stream format errors may occur. Each time, loading resumes at the
next granule.
Using this parameter for a table or partition that has stream format errors means that some rows
from the export database will not be loaded; this could be hundreds or thousands of rows.
Nevertheless, all rows in granules that do not contain stream format errors are loaded, which could
also be hundreds or thousands of rows.
The DATA_OPTIONS parameter for DBMS_DATAPUMP.SET_PARAMETER has a new flag to enable
this behavior: KU$_DATAOPT_CONT_LD_ON_FMT_ERR.
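A sketch of an import using this option; the user, directory object, and dump file name are placeholders:

```shell
impdp hr DIRECTORY=dp_dir DUMPFILE=hr.dmp \
      DATA_OPTIONS=CONTINUE_LOAD_ON_FORMAT_ERROR
```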
Beginning with Oracle Database 12c, you can use the new ONLINE keyword to allow the execution
of DML statements during the following DDL operations:
• DROP INDEX
• ALTER TABLE DROP CONSTRAINT
• ALTER INDEX UNUSABLE
• ALTER TABLE SET COLUMN UNUSED
• ALTER TABLE MOVE
• ALTER TABLE MODIFY/SPLIT PARTITION
This enhancement enables simpler application development, especially for application migrations.
There are no application disruptions for schema maintenance operations.
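For example, two of the operations listed above with the ONLINE keyword; the index, table, and tablespace names are assumed:

```sql
-- Concurrent DML on the underlying table is allowed during both operations
SQL> DROP INDEX sales_idx ONLINE;
SQL> ALTER TABLE sales MOVE ONLINE TABLESPACE users;
```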
Previously, to change the partitioning method of a table, you had to either use DBMS_REDEFINITION
procedures or do it manually with CTAS (CREATE TABLE AS SELECT).
In Oracle Database 18c, you can change a partitioning method online, for example, convert the
HASH method to the RANGE method, or add or remove subpartitioning on a partitioned table to reflect
a new workload and make the data more manageable. Repartitioning a table can improve
performance, for example, by changing the partitioning key to get more partition pruning, and it
avoids long downtime during the conversion of large partitioned tables. The ALTER TABLE MODIFY
command supports a completely nonblocking DDL to repartition a table.
• Prevents concurrent DDLs on the affected table, until the operation completes
• ONLINE clause: Does not hold a blocking X DML lock on the table being modified
• No tablespace defined for the partitions; defaults to the original table’s tablespace
• The UPDATE INDEXES clause:
– Changes the partitioning state of indexes and storage properties of the indexes being
converted
– Cannot change the columns on which the original list of indexes are defined
– Cannot change the uniqueness property of the index or any other index property
The ONLINE modification of a partitioned table prevents concurrent DDLs on the affected table, until
the operation completes, but it does not hold an exclusive blocking DML lock on the table. If the
ONLINE clause is not mentioned, the DDL operation holds a blocking exclusive DML lock on the
table being modified.
If the user does not specify the tablespace defaults for the partitions, the partitions of the
repartitioned table default to the original table’s tablespace.
The UPDATE INDEXES clause can be used to change the partitioning state of indexes and storage
properties of the indexes being converted. The columns on which the original list of indexes are
defined cannot be changed. This clause cannot change the uniqueness property of the index or any
other index property.
If no partitioning is defined for existing indexes of the original table by using the UPDATE INDEXES
clause, the following defaulting behavior applies for all unspecified indexes:
• Global indexes that are prefixed by the partitioning keys are converted to local partitioned
indexes.
• Local indexes are retained as local partitioned indexes if they are prefixed by the partitioning
keys in either the partitioning or subpartitioning dimension.
• All indexes that are nonprefixed by the partitioning keys are converted to global indexes.
• Because partitioned bitmap indexes can only be local, bitmap indexes are always local
irrespective of their prefixed column behavior.
All auxiliary structures, such as triggers, constraints, and Virtual Private Database (VPD) predicates
associated to the table, are retained exactly on the partitioned table as well.
This modification operation is not supported for IOTs, nor on tables in presence of domain indexes.
Online conversion:
In the example in the slide, the user changes the partitioning method of the range-partitioned SALES
table into a range table subpartitioned by hash and also the state of the existing indexes on the
table. This modification operation is completely nonblocking because the ONLINE keyword is
specified.
The operation subpartitions each partition of the SALES table into eight hash partitions set on the
new subpartitioning key, CUSTNO.
Each partition of the range local partitioned index I1_CUSTNO is hash subpartitioned into eight
subpartitions. The unique index I2_TIME_ID is maintained as a global range partitioned unique
index with no subpartitioning.
All unspecified indexes whose index columns are a prefix of the new subpartitioning key are
automatically converted to a local partitioned index. Other indexes are kept as global nonpartitioned
indexes, such as I3_PRODNO.
All auxiliary structures on the table being modified, such as triggers, constraints, VPDs and others,
are retained on the partitioned table as well.
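The conversion described above can be sketched as follows; the partitioning column, partition bound, and index names are assumptions, since the slide itself is not reproduced here:

```sql
-- Add hash subpartitioning on CUSTNO to the range-partitioned SALES table,
-- without blocking concurrent DML (ONLINE) and keeping I1_CUSTNO local
SQL> ALTER TABLE sales MODIFY
       PARTITION BY RANGE (time_id)
       SUBPARTITION BY HASH (custno) SUBPARTITIONS 8
       ( PARTITION p_2017 VALUES LESS THAN (DATE '2018-01-01') )
     ONLINE
     UPDATE INDEXES (i1_custno LOCAL);
```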
The ONLINE partition maintenance operation, such as merging partitions of a partitioned table,
prevents concurrent DDLs on the affected (sub) partitions, until the operation completes. It also does
not acquire/hold a blocking exclusive DML lock on the (sub) partitions being merged, even if it is only
for a short duration. If the ONLINE clause is not mentioned, the DDL operation holds a blocking
exclusive DML lock on the table being modified.
In the example in the slide, the user merges three partitions (January 2017, February 2017, and
March 2017) of the partitioned SALES table into the q1_2017 (First quarter 2017) partition. This
operation is completely nonblocking because the ONLINE keyword is specified.
The I1_EMPNO unique index is maintained as a local partitioned index. The I2_MGR index is
maintained as a global partitioned index.
The same online merging operation can be executed on subpartitions.
This maintenance operation is not supported for IOTs, nor on tables in presence of domain indexes.
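A sketch of the merge described above; the source partition names are assumptions, while q1_2017 is taken from the description:

```sql
-- Merge three monthly partitions into one quarterly partition
-- without blocking concurrent DML on the affected partitions
SQL> ALTER TABLE sales
       MERGE PARTITIONS sales_jan2017, sales_feb2017, sales_mar2017
       INTO PARTITION q1_2017
     UPDATE INDEXES ONLINE;
```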
• 18c
One single ALTER TABLE statement for all the differences related to scalar columns
– No change in behavior for LOB and complex types
– ALTER TABLE simplified patches
Enterprise class application tables are typically very large and have a large number of columns. In
Oracle Database 12c, when comparing two application tables with different column lists,
DBMS_METADATA detects columns that need to be added or dropped and subsequently generates
one ALTER TABLE for each ADD or DROP column. When there is a large number of columns to be
added, there is a significant performance impact when executing a large number of ALTER TABLE
statements.
The performance impact resulting from executing ALTER TABLE ADD | DROP column statements
can be mitigated by batching the commands and collectively adding or dropping the new columns
with a single ALTER TABLE ADD | DROP column DDL statement.
Previous behavior:
ALTER TABLE app1.big_tab1 ADD (COL3 VARCHAR2(20) COLLATE POLISH_CI);
ALTER TABLE app1.big_tab1 ADD (COL4 CHAR(32) COLLATE BINARY);
ALTER TABLE app1.big_tab1 ADD (COL5 NUMBER);
New behavior:
ALTER TABLE app1.big_tab1 ADD
( COL3 VARCHAR2(20) COLLATE POLISH_CI,
COL4 CHAR(32) COLLATE BINARY,
COL5 NUMBER);
Unicode 9.0 adds a total of 7500 characters. It also includes a few other important updates on the
core specification as well as standard annexes and technical standards. For a complete list of the
changes, refer to the Unicode Consortium website at:
http://unicode.org/versions/Unicode9.0.0/
The new language scripts and characters add support for lesser-used languages worldwide:
• Osage, a Native American language
• Nepal Bhasa, a language of Nepal
• Fulani and other African languages
• The Bravanese dialect of Swahili, used in Somalia
• The Warsh orthography for Arabic, used in North and West Africa
• Tangut, a major historic script of China
Note: An emoji is a small digital image or icon used to express an idea or emotion. The origin of the
word is Japanese, where ‘e’ stands for ‘picture’ and ‘moji’ for ‘letter, character’.
Improving Performance
3. Turn on the INMEMORY attribute at object creation or when you alter it, to convert the object
into a columnar representation in the IM column store. All columns of the in-memory table are
populated into memory unless some columns are disabled by using the NO INMEMORY
clause. It is recommended to specify all columns simultaneously rather than having an
ALTER TABLE for each column, because it is more efficient.
Two INMEMORY subattributes define the following behaviors:
- The loading priority of the object data in the IM column store: The INMEMORY clause
can have the PRIORITY subclause. By default, an in-memory table is populated into
memory at first data access; this is the “on demand” behavior.
Using different priority levels, table data can be populated into the IM column
store soon after the database starts up.
- The degree of compression of the columns of an object in the IM column store: The
INMEMORY clause can have the MEMCOMPRESS subclause.
• The segments that are compatible with the INMEMORY attribute are tables, partitions,
subpartitions, inline LOBs, materialized views, materialized join views, and materialized view
logs.
• Clustered tables and IOTs are not supported with the INMEMORY clause.
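The subattributes above can be combined in one statement; a sketch, where the table and excluded column names are assumptions:

```sql
-- Mark SALES for population soon after startup, compress for query
-- performance, and exclude one column from the IM column store
SQL> ALTER TABLE sales INMEMORY
       MEMCOMPRESS FOR QUERY LOW
       PRIORITY HIGH
     NO INMEMORY (notes);
```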
Oracle Database 12c enables the automation of Information Lifecycle Management (ILM) actions by:
• Collecting heat map statistics that track segment and block data usage and segment-level
usage frequencies in addition to daily aggregate usage statistics.
• Creating Automatic Data Optimization (ADO) policies that define conditions when segments
should be moved to other tablespaces and/or when segments/blocks can be compressed.
1. The first operation for the DBA is to enable heat map at the PDB level, tracking activity on
blocks and segments. The heat map activates system-generated statistics collection, such as
segment access and row and segment modification.
2. Real-time statistics are collected in memory (V$HEAT_MAP_SEGMENT view) and regularly
flushed by scheduled DBMS_SCHEDULER jobs to the persistent table HEAT_MAP_STAT$.
Persistent data is visible by using the DBA_HEAT_MAP_SEG_HISTOGRAM view.
3. The next step is to create ADO policies in the PDB on segments or groups of segments or as
default ADO behavior on tablespaces.
4. The next step is to schedule when ADO policy evaluation must happen if the default
scheduling does not match business requirements. ADO policy evaluation relies on heat map
statistics. MMON evaluates row-level policies periodically and starts jobs to compress
whichever blocks qualify. Segment-level policies are evaluated and executed only during the
maintenance window.
5. The DBA can then view ADO execution results by using the DBA_ILMEVALUATIONDETAILS
and DBA_ILMRESULTS views in the PDB.
6. Finally, the DBA can verify if the segment in the PDB is moved and stored on the tablespace
that is defined in the ADO policy and/or if blocks or the segment was compressed, by viewing
the COMPRESSION_STAT$ table.
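Steps 1 and 3 of this workflow can be sketched as follows; the table name and the policy condition are assumptions for illustration:

```sql
-- Step 1: enable Heat Map tracking
SQL> ALTER SYSTEM SET heat_map = ON;

-- Step 3: segment-level ADO policy, compressing the segment
-- after 30 days without modification
SQL> ALTER TABLE sales ILM ADD POLICY
       ROW STORE COMPRESS ADVANCED SEGMENT
       AFTER 30 DAYS OF NO MODIFICATION;
```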
Oracle Database 18c: New Features for Administrators 6 - 6
Creating ADO In-Memory Policies
12c
In Oracle Database 12c, without any Automatic Data Optimization (ADO) policies defined on an in-
memory segment, a segment that is populated in the IM column store is removed only if the segment
is dropped, moved, or the INMEMORY attribute on the segment is removed. This behavior can result
in memory pressure if the size of the data to be loaded into memory is more than the free space
available in the IM column store. The performance of the user workload would be optimal if the IM
column store contains the most frequently queried segments.
A later Oracle Database 12c release introduced three types of ADO In-Memory policies:
• An ADO policy to set the INMEMORY attribute on an object. This type of policy allows
specification of an IM clause as part of the ADO policy clause and annotates the table or
partition with this IM clause when the policy condition is satisfied. It does not populate the
segment to the IM store; the segment gets populated based on the priority in the IM clause.
SQL> ALTER TABLE t1 ILM ADD POLICY SET INMEMORY
AFTER 5 days OF creation;
• An ADO policy for the anticipated length of inactivity (NO ACCESS or MODIFICATION) that
would indicate eviction of the object from the IM column store. The ADO policy considers
heat map statistics. The object is kept in the IM column store as long as the activity does not
subside. Eviction unsets the INMEMORY attribute on the object.
• An ADO policy to modify IM compression: Change the compression level of an object from a
lower level of compression to a higher level.
SQL> ALTER TABLE t1 ILM ADD POLICY MODIFY INMEMORY
MEMCOMPRESS FOR QUERY HIGH AFTER 10 days OF no access;
Oracle Database 18c introduces the Automatic In-Memory (AIM) feature. The benefits of configuring
Automatic In-Memory (AIM) are:
• Ease of management of the IM store: Management of the IM column store for reducing
memory pressure by eviction of cold IM segments involves significant user intervention. AIM
addresses these issues with minimal user intervention.
• Improved performance: AIM ensures that the “working data set” is in the IM column store at
all times. The working data set is a subset of all the IM enabled segments that is actively
queried at any time. The working data set is expected to change with time for many
applications. The working data set (or actively queried IM segments) contains a hot portion
that is active and a cold portion that is not active. For data ageing applications, the action
would be to remove cold IMCUs from the IM column store.
With AIM, the DBA need not define IM priority attributes or ADO IM policies on IM segments.
AIM automatically reconfigures the IM column store by evicting cold data out of the IM column store
and populating the hot data. The unit of data eviction and population is an on-disk segment. AIM
uses the heat map statistics of IM-enabled segments together with user-specified configurations to
decide the set of objects to evict under memory pressure.
• Increase the effective capacity of the IM column store by evicting inactive IM segments
with priority NONE from the IM column store under memory pressure.
• Evict at segment level:
– According to the amount of time that an IM segment has been inactive
– According to the window of time used by AIM to determine the statistics for decision-
making
• Populate hot data.
• Activate heat map statistics: SQL> ALTER SYSTEM SET heat_map = ON;
• Set the initialization parameter:
SQL> ALTER SYSTEM SET INMEMORY_AUTOMATIC_LEVEL = MEDIUM SCOPE = BOTH;
The new INMEMORY_AUTOMATIC_LEVEL initialization parameter makes the IM column store self-
managed eventually. However, limited controls are needed to modify the behavior of this feature and
to disable it if necessary. You can turn on or off the automatic management of the IM column store
by using one of the possible values:
• LOW: When under memory pressure, the database evicts cold segments from the IM column
store. This is the default value.
• MEDIUM: In Oracle Database 12c, an in-memory table is populated into memory at first data
access by default. This default behavior is the “on demand” behavior. Using different priority
levels, table data can be populated into the IM column store soon after the database starts
up. In Oracle Database 18c, this AIM level includes an additional optimization that prioritizes
population of segments under memory pressure rather than allowing on-demand population.
This level ensures that any hot segment that was not populated because of memory pressure
is populated first.
• OFF: This option disables AIM, returning the IM column store to its Oracle Database
12c Release 2 behavior.
Oracle recommends that you provision enough memory for the working data set to fit in the IM
column store. As a general rule, AIM requires an additional 5 KB multiplied by the number
of INMEMORY segments of SGA memory. For example, if 10,000 segments have
the INMEMORY attribute, then reserve 50 MB of the IM column store for AIM.
AIM uses the new DBMS_INMEMORY_ADMIN.AIM_SET_PARAMETER procedure to set the duration
to filter heat map statistics for IM-enabled objects as part of its decision algorithms. The constants
are used to populate the SYS.ADO_IMPARAM$ table. The default value for the sliding stats window
in days is: AIM_STATWINDOW_DAYS_DEFAULT := 31
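For example, shrinking the statistics window from its 31-day default; the value of 14 days is an arbitrary illustration:

```sql
SQL> exec DBMS_INMEMORY_ADMIN.AIM_SET_PARAMETER( -
       DBMS_INMEMORY_ADMIN.AIM_STATWINDOW_DAYS, 14)
```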
• V$IM_ADOTASKS / DBA_INMEMORY_AIMTASKS: Track decisions made by AIM at a point
in time (STATUS/STATE = RUNNING | UNKNOWN | DONE)
• V$IM_ADOTASKDETAILS / DBA_INMEMORY_AIMTASKDETAILS: Information about the
options considered and the decisions made (ACTION = EVICT | NO ACTION)
• V$IM_ADOTASKS: This view provides information about AIM tasks. An AIM IM task provides
a way to track decisions made by AIM at a point in time. The STATUS column describes the
current state of the task, RUNNING, UNKNOWN, or DONE.
• DBA_INMEMORY_AIMTASKS: This view provides information on AIM IM tasks to database
administrators. The view columns are identical to the V$IM_ADOTASKS view with an extra
column, IM_SIZE, which corresponds to the in-memory size at the time of task creation.
• V$IM_ADOTASKDETAILS: The database investigates various possible actions as part of an
AIM task. This view provides information about the options considered and the decisions
made (ACTION column).
• DBA_INMEMORY_AIMTASKDETAILS: This is a view that provides the database administrator
with details related to the AIM task actions, and particularly the AIM action decided for this
object.
For details about these new Oracle Database 18c views, refer to the Oracle Database In-Memory
Guide 18c.
18c
SQL> exec DBMS_INMEMORY_ADMIN.IME_CAPTURE_EXPRESSIONS ('WINDOW')
Automatically identifying frequently used complex expressions or calculations, and then storing their
results in the IM column store can improve query performance. Storing precomputed virtual column
results can also significantly improve query performance by avoiding repeated evaluations.
The cached results can range from function evaluations on columns used in application, scan, or join
expressions, to bit-vectors derived during predicate evaluation for in-memory scans. Caching can
also address other internal computations that are not explicitly recited in a database query, such as
hash value computations for join operations.
Where are the in-memory expressions and virtual column results (IMEs) stored?
An IMCU is a basic unit of the in-memory copy of the table data. Each IMCU has its own in-memory
expression unit (IMEU), which contains expression results corresponding to the rows stored in that
IMCU.
Why are expressions and virtual columns considered good IME candidates?
Statistics such as frequency of execution and cost of evaluation on a per-segment basis are
regularly maintained by the optimizer and stored in the Expression Statistics Store (ESS). ESS uses
an LRU algorithm to automatically track which expressions are most frequently used.
In Oracle Database 12c, the DBMS_INMEMORY_ADMIN.IME_CAPTURE_EXPRESSIONS procedure
identifies the most frequently accessed (hottest) expressions in the database in the specified time
range, materializes them as hidden virtual columns, and adds them to their respective tables during
the next repopulation. The time range can be defined as:
• CUMULATIVE: The database considers all expression statistics since the creation of the
database.
• CURRENT: The database considers only expression statistics from the past 24 hours.
Oracle Database 18c introduces the new WINDOW time range.
Oracle Database 18c: New Features for Administrators 6 - 12
Populating In-Memory Expression Results Within a Window
Optionally, get the current capture state of the expression capture window and the
time stamp of the most recent modification.
SQL> exec DBMS_INMEMORY_ADMIN.IME_GET_CAPTURE_STATE( P_CAPTURE_STATE, -
P_LAST_MODIFIED)
You can define an expression capture window of an arbitrary length, which ensures that only the
expressions occurring within this window are considered for in-memory materialization. This
mechanism is especially useful when you know of a small interval that is representative of the entire
workload. For example, during the trading window, a brokerage firm can gather the set of
expressions, and materialize them in the IM column store to speed up future query processing for
the entire workload.
To populate expressions tracked in the most recent user-specified expression capture window,
perform the following steps:
1. Open a window by invoking the DBMS_INMEMORY_ADMIN.IME_OPEN_CAPTURE_WINDOW
procedure.
2. Let the workload run until you think you have collected enough expressions.
3. Close the window by invoking the
DBMS_INMEMORY_ADMIN.IME_CLOSE_CAPTURE_WINDOW procedure.
4. Add all the hot expressions captured in the previous window into the IM column store by
invoking the DBMS_INMEMORY_ADMIN.IME_CAPTURE_EXPRESSIONS('WINDOW')
procedure.
You can get the current capture state of the expression capture window and the time stamp of the
most recent modification by invoking the DBMS_INMEMORY_ADMIN.IME_GET_CAPTURE_STATE
procedure.
You can still invoke the DBMS_INMEMORY_ADMIN.IME_CAPTURE_EXPRESSIONS('CURRENT')
procedure to add all the hot expressions captured in the past 24 hours, which includes WINDOW as
well, and the DBMS_INMEMORY_ADMIN.IME_CAPTURE_EXPRESSIONS('CUMULATIVE')
procedure to add all the hot expressions captured since the creation of the database.
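The four steps above can be sketched as a single SQL*Plus session; the length of the window is arbitrary and depends on how long your representative workload runs:

```sql
-- Sketch of the expression capture window workflow
SQL> EXEC DBMS_INMEMORY_ADMIN.IME_OPEN_CAPTURE_WINDOW

-- ... let a representative workload run ...

SQL> EXEC DBMS_INMEMORY_ADMIN.IME_CLOSE_CAPTURE_WINDOW

-- Materialize the hot expressions captured in the window
SQL> EXEC DBMS_INMEMORY_ADMIN.IME_CAPTURE_EXPRESSIONS('WINDOW')
```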
Oracle Database 18c: New Features for Administrators 6 - 13
Memoptimized Rowstore
18c
Supporting fast ingest and query rates for thousands of Internet-connected devices requires:
• High-speed streaming of single-row inserts
• Very fast lookups to key-value type data in the database buffer cache
– Querying data with the PRIMARY KEY integrity constraint enabled
– Using a new in-memory hash index structure
– Accessing table rows permanently pinned in the buffer cache
• Data aggregated and streamed to the database through trusted clients
Smart devices connected to the Internet that have the ability to send and receive data require
support for fast ingest and query rates for thousands of devices. The Memoptimized Rowstore
feature is meant to provide high-speed streaming of single-row inserts and very fast lookups to key-
value type data. The feature works only on tables that have the PRIMARY KEY integrity constraint
enabled.
To provide the speed necessary to service thousands of devices, the data is aggregated and
streamed to the database through the trusted clients.
The fast query part of the Memoptimized Rowstore feature allows access to existing rows through a
new hash index structure and pinned database blocks.
Oracle Database supports ingest and access of row-based data in a fraction of the time that it takes
for conventional SQL transactions. With the ability to ingest high-speed streaming of input data and
the use of innovative protocols and hash indexing of key-value pairs for lookups, the Memoptimized
Rowstore feature significantly reduces transaction latency and overhead, and enables businesses to
deploy thousands of devices to monitor and control all aspects of their business.
In-Memory Hash Index
The hash index maps a given key to the address of rows in the database buffer cache:
1. Get the address of the row in the buffer cache.
2. Read the row from the buffer cache.
An in-memory hash table mapping a given key to the location of corresponding rows enables quick
access of the Oracle data block storing the row.
The in-memory hash table is indexed with a user-specified primary key, very similar to hash clusters
containing tables with PRIMARY KEY constraint enabled. This in-memory structure is called a hash
index, although the underlying data structure is a hash table. The data structure resides in the
instance memory, requiring additional space in SGA. You can set the MEMOPTIMIZE_POOL_SIZE
initialization parameter to reserve static SGA allocation at instance startup.
To build a fast code path, having an in-memory hash index data structure is not sufficient. Rows of
tables are stored in disk blocks and, when row data is queried, the database buffer cache caches
table blocks in the SGA of the database instance. Because blocks are aged out based on the
replacement policy used by the buffer cache, the blocks have to be permanently pinned in the buffer
cache to avoid disk I/O. This is why you set the MEMOPTIMIZE FOR READ attribute on a table;
doing so does not change the on-disk structure.
Using the DDL command ALTER TABLE t MEMOPTIMIZE FOR READ cascades the attribute to all
existing partitions (and sub-partitions). Use the NO MEMOPTIMIZE FOR READ clause to disable the
feature on an object. By default, tables are MEMOPTIMIZE FOR READ disabled.
Setting the MEMOPTIMIZE FOR WRITE attribute on a table that has a primary key inserts the key
and the corresponding data and metadata into the hash index structure during row inserts. The hash
index structure is also updated during other write operations, such as delete and update.
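As a sketch, enabling fast lookup on a table might look as follows; the pool size and table name are illustrative only:

```sql
-- Reserve static SGA space for the hash index (takes effect at instance restart)
ALTER SYSTEM SET memoptimize_pool_size = 200M SCOPE=SPFILE;

-- Enable fast lookup; cascades to all partitions and subpartitions
ALTER TABLE sh.sales MEMOPTIMIZE FOR READ;

-- Disable the feature again
ALTER TABLE sh.sales NO MEMOPTIMIZE FOR READ;
```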
Use the DBMS_MEMOPTIMIZE.POPULATE procedure to populate the hash index for an object, table,
partition or subpartition.
SQL> EXEC DBMS_MEMOPTIMIZE.POPULATE (schema_name => 'SH', -
         table_name => 'SALES', partition_name => 'SALES_Q3_2003')
Oracle Database 18c: New Features for Administrators 6 - 15
DBMS_SQLTUNE Versus DBMS_SQLSET Package
In Oracle Database 12c, to perform manual and automatic tuning of statements, and management of
SQL profiles and SQL Tuning Sets (STS), you can use the DBMS_SQLTUNE package that contains
the necessary APIs.
In Oracle Database 18c, the DBMS_SQLSET package is the new package that contains SQL Tuning
Set functionality.
SQL Tuning Sets: Manipulation
12c
SQL Tuning Set functionality is available only if one of the following conditions exists:
• Tuning Pack is enabled.
• Real Application Testing (RAT) option is installed.
18c
SQL Tuning Set functionality is available for free with Oracle DB Enterprise Edition.
• A new DBMS_SQLSET package is available to create, edit, drop, populate, and query
STS and manipulate staging tables.
SQL> EXEC dbms_sqlset.create_sqlset | delete_sqlset | update_sqlset |
drop_sqlset
In Oracle Database 12c, the package containing the SQL Tuning Set functionality is DBMS_SQLTUNE,
which is part of the Tuning Pack or the Real Application Testing option.
In Oracle Database 18c, the DBMS_SQLSET package is the new package to contain the SQL Tuning
Set functionality.
• Create and drop STS: CREATE_SQLSET, DROP_SQLSET
• Populate STS: CAPTURE_CURSOR_CACHE, LOAD_SQLSET
• Query STS content: SELECT_SQLSET function
• Manipulate staging tables: CREATE_STGTAB, PACK_STGTAB, UNPACK_STGTAB,
REMAP_STGTAB
The new package is part of neither the Tuning Pack nor the Real Application Testing option. It is
available for free with Oracle Database Enterprise Edition.
Most of the functions and procedures of the DBMS_SQLTUNE package can be found in the new
DBMS_SQLSET package, except the procedures related to profiles (ACCEPT_SQL_PROFILE), tuning
tasks, and baselines.
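A minimal DBMS_SQLSET workflow might look like the following sketch; the STS and staging table names are hypothetical, and the parameter names are assumed to mirror their DBMS_SQLTUNE equivalents:

```sql
-- Create an STS and load it from the cursor cache
SQL> EXEC DBMS_SQLSET.CREATE_SQLSET(sqlset_name => 'MY_STS')
SQL> EXEC DBMS_SQLSET.CAPTURE_CURSOR_CACHE(sqlset_name => 'MY_STS')

-- Stage the STS for transport to another database
SQL> EXEC DBMS_SQLSET.CREATE_STGTAB(table_name => 'MY_STS_STAGE')
SQL> EXEC DBMS_SQLSET.PACK_STGTAB(sqlset_name => 'MY_STS', -
         staging_table_name => 'MY_STS_STAGE')
```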
SQL Performance Analyzer
The Oracle Real Application Testing option in Oracle Database 12c includes SQL Performance
Analyzer (SQLPA), which gives you an accurate assessment of the impact of change on the SQL
statements that make up the workload.
SQLPA helps you forecast the impact of a potential change on the performance of a SQL query
workload. SQLPA is used to predict and prevent potential performance problems for any database
environment change like database upgrades, schema or parameter changes, or statistics gathering
change that affects the structure of the SQL execution plans. This capability provides DBAs with
detailed information about the performance of SQL statements, such as before-and-after execution
statistics, and statements with performance improvement or degradation. This enables you to make
changes in a test environment to determine whether the workload performance will be improved
through a database upgrade.
In Oracle Database 12c, when a SQLPA task is executed for analysis, each statement in the SQL
Tuning Set (STS) is executed one after the other, sequentially. Depending on the number of
statements stored in the STS and their complexity, the execution might experience long running
times.
In an STS, the statements are independent of one another. This makes it possible to concurrently
execute the statements in an STS. Oracle Database 18c allows the concurrent execution of
statements in an STS. You can choose the execution mode for an SPA task to concurrently execute
STS statements and define the degree of parallelism (DOP) to be used during SPA task execution.
After re-executing the SQL statements, you compare and analyze before and after performance,
based on the execution statistics, such as elapsed time, CPU time, and buffer gets.
In Oracle Database 18c, SQL Performance Analyzer (SPA) result set validation allows users to
validate that the same result set is returned during the initial SPA test-execute and during
subsequent test-executes. It assures you that repeated SQL queries are executing as expected and
is required in certain regulatory environments. If the result set returned by a query is different before
and after the change, it is most likely due to a bug in the SQL execution layer. Because this can have
a severe impact on SQL, it is desirable for SPA to be able to detect such issues and report them.
You can ensure the result set validation by setting the COMPARE_RESULTSET parameter.
The example in the slide shows you how to use the DBMS_SQLPA package to invoke SQL
Performance Analyzer to assess the SQL performance impact of some changes.
1. Create the tuning task to run SQL Performance Analyzer.
2. Set the degree of concurrency for re-executing the statements with the TEST_EXECUTE_DOP
parameter.
3. Execute the task once to build before-change performance data. You can specify various
parameters, for example, the EXECUTION_TYPE parameter as follows:
- EXPLAIN PLAN to generate explain plans for all SQL statements in the SQL
workload
- TEST EXECUTE to execute all SQL statements in the SQL workload. The procedure
executes only the query part of the DML statements to prevent side-effects to the
database or user data. When TEST EXECUTE is specified, the procedure generates
execution plans and execution statistics.
- COMPARE [PERFORMANCE] to analyze and compare two versions of SQL performance
data
- CONVERT SQLSET to read the statistics captured in a SQL Tuning Set and model
them as a task execution
4. Produce the before-change report (special settings for report: set long 100000,
longchunksize 100000, and linesize 90).
5. Make your changes and execute the task again after making the changes.
6. Generate the after-changes report.
7. Compare the two executions. You can set the COMPARE_RESULTSET parameter to TRUE to
validate that the result set returned during the initial SPA test-execute is identical to the result
during subsequent test-executes.
8. Generate the analysis report.
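The steps above might be sketched as follows with DBMS_SQLPA; the task name, STS name, and chosen DOP are hypothetical:

```sql
-- 1. Create the SPA task over an existing STS
VARIABLE tname VARCHAR2(64)
EXEC :tname := DBMS_SQLPA.CREATE_ANALYSIS_TASK(sqlset_name => 'MY_STS')

-- 2. Set the degree of concurrency for test-execution
EXEC DBMS_SQLPA.SET_ANALYSIS_TASK_PARAMETER(:tname, 'TEST_EXECUTE_DOP', 4)

-- 3. Before-change execution
EXEC DBMS_SQLPA.EXECUTE_ANALYSIS_TASK(:tname, 'TEST EXECUTE', execution_name => 'before')

-- 5. After making the change, execute again
EXEC DBMS_SQLPA.EXECUTE_ANALYSIS_TASK(:tname, 'TEST EXECUTE', execution_name => 'after')

-- 7. Compare, validating that both runs returned the same result sets
EXEC DBMS_SQLPA.SET_ANALYSIS_TASK_PARAMETER(:tname, 'COMPARE_RESULTSET', 'TRUE')
EXEC DBMS_SQLPA.EXECUTE_ANALYSIS_TASK(:tname, 'COMPARE PERFORMANCE')

-- 8. Generate the analysis report
SELECT DBMS_SQLPA.REPORT_ANALYSIS_TASK(:tname, 'TEXT') FROM dual;
```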
Note: For more information about the DBMS_SQLPA package, see the Oracle Database PL/SQL
Packages and Types Reference Guide.
SQL Tuning Advisor provides better execution plans, faster SQL, less resource usage, and
enhanced performance of Exadata systems by using new algorithms.
• SQL Tuning Advisor executes a new analysis to determine if any of these system
statistics are not up to date:
– I/O Seek Time
– Multi-Block Read Count (MBRC)
– I/O Transfer Speed
• If any of the system statistics are found to be stale and gathering them improves the
performance of the SQL being tuned, SQL Tuning Advisor recommends an Exadata-aware SQL profile.
On Exadata systems, the cost of smart scans is dependent on three system statistics:
• I/O Seek Time
• Multi-Block Read Count (MBRC)
• I/O Transfer Speed
The values of these system statistics are usually different on Exadata as compared to non-Exadata
and can influence which execution plan would be optimal.
In Oracle Database 18c, SQL Tuning Advisor executes a new analysis to determine if any of these
system statistics are not up to date. If any of the system statistics are found to be stale and gathering
them improves the performance of the SQL being tuned, this will be recommended via a SQL profile
called an Exadata-aware SQL profile.
Accepting such a profile impacts the performance of only the SQL statement being tuned and not any
other SQL statements. This is consistent with the existing behavior of a SQL profile.
The SQL Tuning Advisor report in this case displays:
1- SQL Profile Finding (see explain plans section below)
--------------------------------------------------------
A potentially better execution plan was found for this statement.
Recommendation (better benefit)
-------------------------------
- Consider accepting the recommended SQL profile. It is an Exadata-aware
SQL profile.
execute dbms_sqltune.accept_sql_profile(task_name => 'TASK_XXXXX',
task_owner => 'EXAUSR', replace => TRUE);
1. Create the external table before querying against the external table.
SQL> CREATE TABLE ext_emp (id NUMBER, …, email VARCHAR2(25))
     ORGANIZATION EXTERNAL
     ( TYPE ORACLE_LOADER
       DEFAULT DIRECTORY ext_dir
       ACCESS PARAMETERS
       ( records delimited by newline
         badfile ext_dir:'empxt%a_%p.bad'
         logfile ext_dir:'empxt%a_%p.log'
         fields terminated by ','
         missing field values are null
         (emp_id, first_name, last_name, job_id) )
       LOCATION ('empext1.dat') )
     REJECT LIMIT UNLIMITED;
The clauses highlighted on the slide are:
• DEFAULT DIRECTORY
• ACCESS PARAMETERS (DISCARDFILE, BADFILE, LOGFILE)
• LOCATION
• REJECT LIMIT
2. Modify parameters during a query: no need to alter the external table definition.
SQL> SELECT * FROM ext_emp
     EXTERNAL MODIFY
     ( ACCESS PARAMETERS ( BADFILE ext_dir:'empxt2%a_%p.bad'
                           LOGFILE ext_dir:'empxt2%a_%p.log')
       LOCATION ('empext2.dat'));
In Oracle Database 12c, querying an external table requires a persistent object for the external table
to be created in the data dictionary.
You can define the DEFAULT DIRECTORY, ACCESS PARAMETERS, LOCATION, and REJECT
LIMIT values. When querying the external table, these parameters can be modified.
Modifications in ACCESS PARAMETERS are limited to DISCARDFILE, BADFILE, and LOGFILE.
If the second query in the slide is running concurrently with the first one that reads the data from
‘empext1.dat,’ there is no need to create a separate external table for the second query. The
second query overrides the default LOCATION to fetch external data from another location. The
queries share the external metadata, which was created for a single EXT_EMP table in the data
dictionary.
Increasing the flexibility and ease of SQL access, inlined external tables:
• Are similar to inline views
• Allow the runtime definition of an external table as part of a SQL query statement
• Transparently access data outside the Oracle database
• Simplify the access of external data by a simpler and more efficient code
SQL> SELECT ext_emp.id FROM EXTERNAL ( (id NUMBER, …, email VARCHAR2(25))
TYPE ORACLE_LOADER
DEFAULT DIRECTORY ext_dir
ACCESS PARAMETERS …
Oracle Database 18c offers the possibility to query an external table without creating a persistent
object for the external table in the data dictionary. Compared to the example in the previous slide,
the first step can be skipped.
In this case, the query inlines the external table. To inline an external table, you must provide
the EXTERNAL keyword along with the same external table parameters and user-specified columns
that would otherwise be specified in the CREATE TABLE syntax.
This information includes a list of external table columns defining the table, access driver type and
external table parameters. A REJECT limit can be specified as an option. Note that the MODIFY
keyword must be omitted when inlining an external table because the external table is not referenced
in the data dictionary.
External table metadata exists only for the duration of the query. It is created during query
compilation and purged when the query has been aged out of the cursor cache.
The user querying the inlined external table must have the READ privilege on the directory object
containing the external data, and the WRITE privilege on the directory objects containing the bad, log
and discard files.
In the example in the slide, the external table is aliased as EXT_EMP. This allows inlined external
tables to be joined.
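For illustration, a complete inline external table query corresponding to the earlier EXT_EMP example might look like the following sketch; the abbreviated column list and access parameters are assumptions:

```sql
SQL> SELECT ext_emp.id, ext_emp.email
     FROM EXTERNAL (
            (id NUMBER, email VARCHAR2(25))
            TYPE ORACLE_LOADER
            DEFAULT DIRECTORY ext_dir
            ACCESS PARAMETERS (FIELDS TERMINATED BY ',')
            LOCATION ('empext1.dat')
            REJECT LIMIT UNLIMITED
          ) ext_emp;
```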
There are restrictions:
• Partitioning and LONG, BFILE, or ADT external table columns are not supported.
• Creating a materialized view or materialized zone map that includes an inline external table
clause in the definition query raises an error.
• Uses the features of Database In-Memory when the external data must be queried
repeatedly, avoiding multiple accesses of the external storage.
– Set the INMEMORY attribute on the external table / all external table partitions.
– Set the MEMCOMPRESS attribute.
SQL> CREATE TABLE test (…) ORGANIZATION EXTERNAL (TYPE ORACLE_LOADER … )
INMEMORY MEMCOMPRESS FOR CAPACITY HIGH;
• Enables populating data from external tables into the in-memory column store.
SQL> EXEC DBMS_INMEMORY.POPULATE ('HR', 'test')
Oracle Database 18c enables the population of data from external tables into the in-memory column
store.
This allows population of data that is not stored in Oracle Database. This can be valuable when you
have other external data stores and you want to perform advanced analytics on that data with
Database In-Memory. This can be particularly valuable when the external data needs to be queried
repeatedly. You can avoid multiple accesses of the external storage and the queries can use the
features of Database In-Memory multiple times.
Data from external sources with ORACLE_LOADER and ORACLE_DATAPUMP access types can be
summarized and populated into the in-memory column store where repeated, ad-hoc analytic
queries can be run that might be too expensive to run on the source data.
The in-memory external tables also benefit from in-memory expressions.
You can set the INMEMORY attribute and its correlated MEMCOMPRESS attribute when creating and
altering an external table. If the external table is partitioned, all individual partitions are defined as in-
memory segments. The ability to exclude certain columns is not yet implemented.
Querying an in-memory external table requires the QUERY_REWRITE_INTEGRITY parameter in the
session to be set to STALE_TOLERATED. If the external file is updated, you must either repopulate
the in-memory segment with the DBMS_INMEMORY.REPOPULATE procedure, or alter the table to
NO INMEMORY and then set it back to INMEMORY.
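A sketch of the refresh steps after the external file changes, using the schema and table names from the earlier example:

```sql
-- Required in the querying session
SQL> ALTER SESSION SET query_rewrite_integrity = stale_tolerated;

-- Refresh the in-memory copy after the external file is updated
SQL> EXEC DBMS_INMEMORY.REPOPULATE('HR', 'TEST')
```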
Statistics are collected on in-memory external tables as they are on in-memory heap tables, for
example, IM populate external table read time (ms).
Although Oracle Database 18c allows Heat Map and Automatic Data Optimization (ADO) to manage
the eviction of cooling segments from the in-memory area, this is not the case for external tables
and external table partitions.
Oracle Database 18c: New Features for Administrators 7 - 5
Analytic Views
Analytic views (AVs) are metadata-only objects defined over standard tables or views:
• They provide a hierarchical organization and analytical hierarchy-aware calculations.
• They are queried via SQL or the DBMS_MDX_ODBO package.
12c
Microsoft’s Multidimensional Expression (MDX) language provides features that are not
accessible in SQL queries.
18c
Enhancing the AV SQL query capabilities adds support for two such features:
– Query-scoped calculations
Dynamically modify an AV at query-time
– Filter-before aggregate
Oracle Database 12c introduced Analytic views (AVs). Along with their associated objects—attribute
dimensions and hierarchies—AVs are metadata-only objects defined over standard tables or views.
They provide a hierarchical organization and analytical hierarchy-aware calculations, helping Oracle
to compete in the business analytics space with players such as SAP’s HANA.
An AV encapsulates aggregations, calculations, and joins of fact data that are specified by attribute
dimensions and hierarchies and by measures.
AVs may be queried either directly in SQL or via Microsoft’s Multidimensional Expression (MDX)
language. The MDX interface is provided by the PL/SQL package DBMS_MDX_ODBO, and is called by
the Oracle OLE DB for OLAP Provider (ODBO) or an XML for Analysis (XMLA) configuration.
MDX provides features that are not accessible in SQL queries.
When you create an AV with the CREATE ANALYTIC VIEW statement, the calculations are burnt
into the persisted AV and the user cannot add additional computations at query time.
Oracle Database 18c addresses that shortfall in SQL by enhancing the AV SQL query capabilities to
add support for two features:
• Query-scoped calculations
• Filter-before aggregate
Oracle Database 18c offers the ability to dynamically modify an AV at query time, allowing utilization
of the two preceding features on a query-by-query basis.
An AV is defined on top of a fact table (or view), which provides the data for the leaves of the
associated hierarchies. A query on an AV produces rows for all members of its hierarchies. Measure
data for rows representing non-leaves are aggregated according to the function specified in the AV
metadata, typically a SUM function. Any predicates specified in the WHERE clause simply reduce the
rows returned, but do not impact the aggregated measured data.
MDX uses a feature called visual totals that can be toggled on or off via a check box in Excel to
impact the aggregated measured data.
• When the feature is disabled, the data returned via an MDX AV query is identical to the data
returned for the equivalent SQL AV query. In the example in the slide, even if the selection
includes the year 2016 but none of its descendants, the total for 2016 is aggregated from all
of its leaves.
• When the feature is enabled, however, they can diverge. When visual totals is enabled, if a
node has any descendants in the selection only those descendants are used to aggregate up
to that node. For example, if the selection includes the year 2016 but none of its
descendants, the data for 2016 is aggregated from all of its leaves. If the selection includes
2016 and, in addition, Q1-2016 and Q2-2016, the data for 2016 is aggregated using only those
quarters.
A user can produce either non-visual or visual results when using SQL to query an AV.
Each hierarchy may specify a filter-before aggregate predicate, which serves to filter the leaves of
that hierarchy before aggregating the measures. The predicate for a given hierarchy specifies some
set of hierarchy members. The fact rows are then filtered to include only the leaf descendants of
those members.
If the example in the slide had facts at the month level rather than quarter level, a filter-before
predicate that included (Q1-2016, Q2-2016) would therefore filter the fact rows to only (JAN-2016,
FEB-2016, MAR-2016, APR-2016, MAY-2016, JUN-2016). This does not impact the sales data for
the quarter rows because each quarter includes all months within that quarter, but causes the year
row to aggregate only over the first six months. The resulting query includes rows for the non-filtered
leaves and their ancestors.
In addition to filtering leaves based on hierarchy navigation, you can also define calculated
measures that are not declared in the AV, like the percentage change of sales from a current period
and a previous period. The temporary calculated measures apply to the rows resulting from the filter-
before aggregation.
A WHERE clause can be added to simply reduce the rows returned after the filter-before aggregation.
The WHERE clause does not impact the aggregated measure data.
Use the USING and ADD MEASURES clauses within a SELECT statement to define a calculated
measure.
The query in the example of the slide adds two calculated measures that are not burnt into the AV,
the sales value of the previous period and the percent change of sales from the previous period and
the current period.
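Because the slide query itself is not reproduced here, the following is only a sketch of the query-scoped calculation syntax; the AV name (sales_av), hierarchy name (time_hier), and measure names are hypothetical:

```sql
SQL> SELECT time_hier.member_name, sales, sales_prior, pct_chg_sales
     FROM ANALYTIC VIEW (
            USING sales_av HIERARCHIES (time_hier)
            ADD MEASURES (
              -- sales value of the previous period
              sales_prior AS (LAG(sales) OVER (HIERARCHY time_hier OFFSET 1)),
              -- percent change of sales from the previous period
              pct_chg_sales AS ((sales - LAG(sales) OVER (HIERARCHY time_hier OFFSET 1))
                                / LAG(sales) OVER (HIERARCHY time_hier OFFSET 1))
            )
          );
```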
The query in the slide combines FILTER FACT and ADD MEASURES to return sales, sales prior
period, and percent change sales prior period for the first half of years in Mexico and Canada.
Moreover, the added WHERE clause simply reduces the rows returned after the filter-before
aggregation and the measure calculations are applied. If you compare the resulting rows with the
resulting rows from the previous slide, you observe that it impacts neither the filter-before
aggregated values nor the aggregated measure data; it restricts only the resulting aggregated rows
to those whose sales are greater than 30000000.
• A table function (TF) is a function that returns a collection of rows and can be called from
the FROM clause of a SQL query block.
SQL> SELECT * FROM TF_NOOP(emp);
Goal
• Provide a framework for DBAs and developers for writing PTFs that are simple to use
In Oracle Database 12c, you can quickly write a very simple table function that will accept a range of
table shapes but will be quite slow because it cannot be parallelized and data is returned only after
all source rows have been processed.
In Oracle Database 18c, a polymorphic table function (PTF) provides an efficient and scalable
mechanism to extend the analytical capabilities of the RDBMS to be used by SQL and PL/SQL
developers who require a simpler, more flexible and performant table function. A table function will
act like a normal Oracle row-source object, accepting a wide range of table shapes where the
RDBMS manages the execution plan.
The SQL writer is able to invoke table functions (TF) without knowing the details of the
implementation of the PTF, and the PTF does not need to know about the details or how the function
is being executed by the RDBMS (for example in serial or parallel), and whether the input rows to the
PTF were partitioned and/or ordered.
1. Create the package containing the DESCRIBE function and FETCH_ROWS procedure.
CREATE OR REPLACE PACKAGE change_case_p AS
function Describe(tab IN OUT DBMS_TF.Table_t, new_case varchar2)
return DBMS_TF.describe_t;
procedure Fetch_Rows(new_case varchar2);
END change_case_p;
/
CREATE OR REPLACE PACKAGE BODY change_case_p AS …
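The package above is only half of the definition. As a sketch, the PTF itself would then be created with the polymorphic table function syntax and invoked like a regular table function; the function name follows the example, and the package body is omitted here:

```sql
-- Step 2 (sketch): create the row-semantics PTF over the implementation package
SQL> CREATE OR REPLACE FUNCTION change_case(tab TABLE, new_case VARCHAR2)
       RETURN TABLE PIPELINED ROW POLYMORPHIC USING change_case_p;
     /

-- Invoke it like any table function
SQL> SELECT * FROM change_case(emp, 'UPPER');
```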
The DESCRIBE function returns a descriptor containing the type information of the new column that
the PTF produces.
Additionally, DESCRIBE marks the columns of the input table with two kinds of non-mutually
exclusive annotations:
• Read Columns: The columns that are going to be read during execution. By default, none of
the input table columns are marked “read”.
• Pass-through Columns: The columns that should be passed (unmodified) from the PTF
input to the output. By default, all of the input table columns are marked “pass-through”.
The input to a PTF is a single stream of rows that is divided into arbitrary sized chunks of rows
(including, possibly, zero rows). Each of these chunks is called a rowset.
The FETCH_ROWS procedure is responsible for consuming the rows in the input stream, one rowset at
a time, and producing the corresponding new columns. There is only one rowset active at any time.
• Each call to FETCH_ROWS must act upon the active rowset, and after processing the active
rowset, it can either return or remain inside the FETCH_ROWS and request and process
another rowset.
The RDBMS calls the FETCH_ROWS procedure during query execution. Associated with each call to
FETCH_ROWS is an input rowset (a collection of rows from the underlying table or query) that the
PTF is expected to process. The FETCH_ROWS procedure uses the PTF server API (either
DBMS_TF.Get_Row_Set or DBMS_TF.Get_Col) to read the input rowset. Typically, this rowset is used
to produce an output rowset (a collection of rows to be returned as output), which is then written
back to the RDBMS using the PTF server API (either DBMS_TF.Put_Row_Set or DBMS_TF.Put_Col).
Conceptually, a TS PTF operates on an entire table (or a logical partition of a table), while an RS
PTF can produce a new row exclusively from a single input row.
A TS PTF is designed to be used for implementing analytic functions that can act like aggregation
functions.
The query can optionally partition the TS PTF input and optionally order it. This is not allowed for an
RS PTF.
In data analysis applications, users need to find the most frequent values.
Example: Find the top three job titles contributing to the most payroll expenses.
• Use a row-limiting clause to limit the rows returned by the query.
– Specify the number of rows to return with the FETCH FIRST/NEXT keywords.
– Specify the percentage of rows to return with the PERCENT keyword.
• Queries that order data and limit row output are referred to as Top-N queries.
• The Top-N queries return exact results.
• FETCH FIRST n ROWS ONLY applies to global ordering.
In Oracle Database 12c, SQL SELECT syntax has been enhanced to allow a row-limiting clause,
which limits the number of rows that are returned in the result set.
Limiting the number of rows returned can be valuable for reporting, analysis, data browsing, and
other tasks. Queries that order data and then limit row output are widely used and are often referred
to as Top-N queries.
You can specify the number of rows or percentage of rows to return with the FETCH FIRST/NEXT
keywords. You can use the OFFSET keyword to specify that the returned rows begin with a row after
the first row of the full result set.
You specify the row-limiting clause in the SQL SELECT statement by placing it after the ORDER BY
clause. Note that an ORDER BY clause is not required.
• OFFSET: Use this clause to specify the number of rows to skip before row limiting begins.
The value for offset must be a number. If you specify a negative number, offset is treated as
0. If you specify NULL or a number greater than or equal to the number of rows that are
returned by the query, 0 rows are returned.
• ROW | ROWS: Use these keywords interchangeably. They are provided for semantic clarity.
• FETCH: Use this clause to specify the number of rows or percentage of rows to return.
- FIRST | NEXT: Use these keywords interchangeably. They are provided for
semantic clarity.
- row_count | percent PERCENT: Use row_count to specify the number of rows
to return. Use percent PERCENT to specify the percentage of the total number of
selected rows to return. The value for percent must be a number.
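As a hedged illustration of the OFFSET and FETCH keywords (using the classic EMP sample table; the column names are assumptions for this sketch), the following query skips the three highest-paid employees and returns the next three:

```sql
-- Sketch: skip the 3 highest-paid employees, then return the next 3.
SELECT empno, ename, sal
FROM   emp
ORDER  BY sal DESC
OFFSET 3 ROWS
FETCH  NEXT 3 ROWS ONLY;
```

Replacing the FETCH clause with FETCH FIRST 10 PERCENT ROWS ONLY would instead return the top 10 percent of rows by salary.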
Example: Find the top three job titles contributing to the most payroll expenses within each
department.
• Use a rank window function and a nested query.
SQL> SELECT deptno, job, sum_sal
FROM (SELECT deptno, job, SUM(sal) sum_sal, RANK() OVER
(PARTITION BY deptno ORDER BY sum(sal) DESC)
sum_sal_rank
FROM emp
GROUP BY deptno, job)
WHERE sum_sal_rank <= 3;
Analytic functions compute an aggregate value based on a group of rows. They differ from aggregate
functions in that they return multiple rows for each group.
The group of rows is called a window and is defined by the analytic clause (the OVER clause). For each row, a sliding
window of rows is defined. The window determines the range of rows used to perform the
calculations for the current row. Window sizes can be based on either a physical number of rows or
a logical interval such as time.
Analytic functions are the last set of operations performed in a query except for the final ORDER BY
clause. All joins and all WHERE, GROUP BY, and HAVING clauses are completed before the analytic
functions are processed. Therefore, analytic functions can appear only in the select list or ORDER BY
clause.
Use the PARTITION BY clause to partition the query result set into groups based on one or more
values. If you omit this clause, then the function treats all rows of the query result set as a single
group.
You can specify multiple analytic functions in the same query, each with the same or different
PARTITION BY keys.
Use the ORDER BY clause to specify how data is ordered within a partition.
When aggregation functions and analytic functions sort large volumes of data, exact Top-N queries require a large amount of memory and are time-consuming.
• Approximate query processing is much faster.
• It is useful for situations where a tolerable amount of error is acceptable.
• APPROX_FOR_AGGREGATION = true: Automatically replaces exact query processing for aggregation queries with approximate query processing.
• APPROX_FOR_COUNT_DISTINCT = true: Automatically replaces COUNT(DISTINCT expr) queries with APPROX_COUNT_DISTINCT queries.
• Use the new approximate functions, APPROX_COUNT and APPROX_SUM, to replace their exact counterparts, COUNT and SUM.
• For each APPROX_COUNT / APPROX_SUM that appears in the SELECT list, a
corresponding APPROX_RANK function in the HAVING clause is required.
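The two initialization parameters above can be set at the system or session level; a minimal session-level sketch:

```sql
-- Sketch: enable approximate processing of exact aggregation queries
-- for the current session.
ALTER SESSION SET approx_for_aggregation = TRUE;
ALTER SESSION SET approx_for_count_distinct = TRUE;
```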
Oracle Database 18c introduces the APPROX_COUNT and APPROX_SUM approximate functions to
replace their exact counterparts, COUNT and SUM functions in approximate Top-N queries.
To use one or the other approximate function in a SELECT list, there must be an APPROX_RANK
function in the HAVING clause.
In the example in the slide, the basic syntax returns the Top-N rows globally.
Compared to the basic syntax for the exact aggregate queries, the extended syntax lifts the following
restriction: It is legal to apply an expression on top of an approximate function.
SELECT job, 0.9 * APPROX_SUM(sal) sum_sal
FROM emp
GROUP BY job
HAVING APPROX_RANK(ORDER BY APPROX_SUM(sal) DESC) <= 10;
• Find the jobs that are among the top 10 in terms of total salary per department.
SQL> SELECT deptno, job, APPROX_SUM(sal),
APPROX_RANK(partition by deptno ORDER BY APPROX_SUM(sal) desc) Rk
FROM emp
GROUP BY deptno, job
HAVING APPROX_RANK(partition by deptno ORDER BY APPROX_SUM(sal) desc)
<= 10;
If you want to partition the table and return the Top-N rows per partition, called “intra-partition Top-N,” the basic syntax must use an inline view and a rank window function.
SELECT deptno, job, sum_sal
FROM (SELECT deptno, job, sum(sal) sum_sal,
RANK() OVER (PARTITION BY deptno ORDER BY sum(sal) DESC)
sum_sal_rank
FROM emp
GROUP BY deptno, job)
WHERE sum_sal_rank < 10;
There is an extended syntax for approximate Top-N queries to handle such cases as in the example
in the slide. The detailed processing is as follows:
1. GROUP BY expr_1, …, expr_j first. Each output of GROUP BY contains the group by keys and
the approximate aggregated values.
2. Partition the GROUP BY outputs. The partition key is specified in APPROX_RANK(PARTITION
BY … ORDER BY … DESC). Within each partition, the approximate aggregated values are
ranked by the specified order. Each output of PARTITION BY contains the group by keys,
the approximate aggregated values, and their corresponding ranks. The PARTITION BY
keys must be a subset of the GROUP BY keys. When PARTITION BY keys are equivalent to
GROUP BY keys, the output PARTITION BY is simply the output of GROUP BY
concatenated with ranks that are all one because there is only one row per partition.
3. Filters are applied on the PARTITION BY outputs. A HAVING clause is mandatory. The
HAVING clause can contain only ANDed predicates. Each predicate must be in the format of
APPROX_RANK(PARTITION BY … ORDER BY … DESC)<= N.
• Find the jobs that are among the top 2 in terms of total salary, and among the top 3 in
terms of number of employees holding the job titles per department.
SQL> SELECT deptno, job, APPROX_SUM(sal), APPROX_COUNT(*)
FROM emp GROUP BY deptno, job
HAVING APPROX_RANK(partition by deptno order by APPROX_SUM(sal) desc)
<= 2
AND APPROX_RANK(partition by deptno order by APPROX_COUNT(*) desc) <= 3;
• There can be multiple approximate functions in the SELECT list. For each approximate function, there must be a corresponding predicate in the HAVING clause.
To use both approximate functions in a SELECT list, you must define a HAVING clause for the first
approximate function and an ANDed predicate for the other approximate function.
The query asks to return the jobs that are among the top 2 in terms of total salary, and among the
top 3 in terms of number of employees holding the job titles per department. A job title of ‘CEO’ might satisfy the first filter condition (the CEO has the highest pay) but not the second (there is only one CEO in a company).
The corresponding execution plan shows:
-----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 8 | 120 | 4 (25)| 00:00:01 |
|* 1 | SORT GROUP BY APPROX| | 8 | 120 | 4 (25)| 00:00:01 |
| 2 | TABLE ACCESS FULL | EMP | 42 | 630 | 3 (0)| 00:00:01 |
-----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(APPROX_RANK(PARTITION BY "DEPTNO" ORDER BY
APPROX_SUM("SAL") DESC)<=2 AND APPROX_RANK(PARTITION BY "DEPTNO" ORDER
BY APPROX_COUNT(0) DESC)<=3)
APPROX is the option of the SORT GROUP BY row source to indicate that the row source contains
approximate aggregates.
• Find the jobs that are among the top 3 in terms of total salary, and among the top 2 in
terms of number of employees holding the job titles per department.
SQL> SELECT deptno, job, APPROX_SUM(sal), APPROX_COUNT(*)
FROM emp
GROUP BY deptno, job
HAVING APPROX_RANK(partition by deptno order by APPROX_SUM(sal) desc)
<= 3
AND APPROX_RANK(partition by deptno order by APPROX_COUNT(*) desc)
<= 2;
The query asks to return the jobs that are among the top 3 in terms of total salary, and among the top 2 in terms of number of employees holding the job titles per department.
Comparing the query in the slide with the query in the previous slide, the results are different. Both
filter conditions must be satisfied to produce appropriate groups.
• Report the accuracy of the approximate aggregate by using the MAX_ERROR attribute.
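A hedged sketch of the MAX_ERROR attribute (reusing the EMP example above; see the SQL Language Reference for the exact invocation): passing 'MAX_ERROR' as a second argument makes the approximate function return the maximum error of the approximation instead of the aggregate itself.

```sql
-- Sketch: report each approximate total salary together with the
-- maximum error of the approximation.
SELECT deptno, job,
       APPROX_SUM(sal)              approx_sum_sal,
       APPROX_SUM(sal, 'MAX_ERROR') max_err
FROM   emp
GROUP  BY deptno, job
HAVING APPROX_RANK(PARTITION BY deptno ORDER BY APPROX_SUM(sal) DESC) <= 3;
```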
Sharding Enhancements
Composite sharding: Data is first partitioned by list or range across multiple shardspaces, and then further partitioned by consistent hash across multiple shards within each shardspace.
Introduced in Oracle Database 12c release 12.2.0.1, Oracle Sharding provided two methods of
sharding data:
• System-managed sharding
• Composite Sharding
System-managed sharding is a sharding method that does not require the user to specify a
mapping of data to shards. Data is automatically distributed across shards using partitioning by
consistent hash. The partitioning algorithm evenly and randomly distributes data across shards. The
distribution used in system-managed sharding is intended to eliminate hot spots and provide uniform
performance across shards. Oracle Sharding automatically maintains balanced distribution of data
when shards are added to or removed from an SDB.
Consistent hash is a partitioning strategy that is commonly used in scalable distributed systems. It is
different from traditional hash partitioning. With traditional hashing, the bucket number is calculated
as HF(key) % N, where HF is a hash function and N is the number of buckets. This approach works
well if N is constant, but requires reshuffling of all data when N changes. More advanced algorithms,
such as linear hashing, do not require rehashing of the entire table to add a hash bucket, but they
impose restrictions on the number of buckets (such as it can only be a power of 2), and on the order
in which the buckets can be split.
The implementation of consistent hashing that is used in Oracle Sharding avoids these limitations by
dividing the possible range of values of the hash function (for example, from 0 to 2^32) into a set of N adjacent intervals, and assigning each interval to a chunk. In this example, the SDB contains 1024 chunks, and each chunk gets assigned a range of 2^22 hash values. Therefore, partitioning by consistent hash is essentially partitioning by the range of hash values.
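Conceptually (this is an illustration only, not how Oracle Sharding computes chunks internally; the customers table and cust_id column are placeholders), the mapping from a key's hash value to a chunk can be pictured as a simple range lookup:

```sql
-- Illustration only: ORA_HASH(expr) returns a value in 0..2^32-1 by
-- default, so dividing by 2^22 maps each key to one of 1024 adjacent
-- hash ranges (chunks).
SELECT cust_id,
       TRUNC(ORA_HASH(cust_id) / POWER(2, 22)) AS chunk_id
FROM   customers;
```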
Oracle Database 18c introduces the user-defined sharding method that lets you explicitly specify the
mapping of data to individual shards. It is used when, because of performance, regulatory, or other
reasons, certain data needs to be stored on a particular shard, and the administrator must have full
control over moving data between shards.
Another advantage of user-defined sharding is that, in case of planned or unplanned outage of a
shard, you know exactly what data is not available. The disadvantage of user-defined sharding is the
need for the database administrator to monitor and maintain balanced distribution of data and
workload across shards.
With user-defined sharding, a sharded table can be partitioned by range or list. There is no
tablespace set defined for user-defined sharding. Each tablespace has to be created individually and
explicitly associated with a shardspace. A shardspace is a set of shards that store data that
corresponds to a range or list of key values.
As with system-managed sharding, tablespaces created for user-defined sharding are assigned to
chunks. However, no chunk migration is automatically started when a shard is added to the SDB.
The user needs to execute the MOVE CHUNK command for each chunk that needs to be migrated.
GDSCTL CREATE SHARDCATALOG supports user-defined sharding with the value USER in the -sharding option.
The SPLIT CHUNK command, which is used to split a chunk in the middle of the hash range for
system-managed sharding, is not supported for user-defined sharding. You must use the ALTER
TABLE SPLIT PARTITION statement to split a chunk.
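A hedged sketch of a user-defined sharded table partitioned by list (the table, column, partition, and tablespace names are hypothetical; the tablespaces are assumed to have been created individually and associated with shardspaces beforehand, as described above):

```sql
-- Sketch: each partition is explicitly placed in a tablespace that
-- belongs to a particular shardspace.
CREATE SHARDED TABLE accounts
( id         NUMBER NOT NULL,
  account_nr NUMBER,
  state      VARCHAR2(2) NOT NULL,
  CONSTRAINT pk_accounts PRIMARY KEY (state, id)
)
PARTITION BY LIST (state)
( PARTITION p_east VALUES ('NY', 'NJ', 'MA') TABLESPACE tbs1,
  PARTITION p_west VALUES ('CA', 'OR', 'WA') TABLESPACE tbs2
);
```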
12c
Sharded databases consist of shard catalogs and shards, which can be:
• Single-instance database
• Oracle RAC–enabled stand-alone databases
• CDBs not supported
18c
• A shard and shard catalog can be a single PDB in a CDB.
• GDSCTL ADD SHARD command includes the –cdb option.
18c
• Split chunk support
• Automatic CDR support of tables with unique indexes/constraints
Enhancements in Oracle GoldenGate 13c provided support for Oracle Sharding high availability, but with some limitations. In 18c, GoldenGate supports the GDSCTL
SPLIT CHUNK command. Auto CDR was introduced in Oracle Database 12.2 (and Oracle
GoldenGate 12.3) to automate the conflict detection and resolution configuration in active-active
GoldenGate replication setups. However, Auto CDR was allowed only on tables with primary keys. In
Oracle Database 18c, this restriction is relaxed and Auto CDR is supported on tables with just
unique keys/indexes but no primary keys.
12c
• Shards managed individually
• No aggregate views from all shards
18c
SHARDS() clause and shard_id
SQL> SELECT sql_text, shard_id FROM SHARDS(sys.v$sql) a WHERE a.sql_id = '1234';
In Oracle Database 12c Release 2, to perform maintenance operations, you had to go to each
database individually. Easy, centralized diagnostics collection from all of the shards was not
available.
With Oracle Database 18c, you can use the SHARDS() clause to query Oracle-supplied tables to
gather performance, diagnostic, and audit data from V$ views and DBA_* views. The shard catalog
database can be used as the entry point for centralized diagnostic operations using the SQL
SHARDS() clause. The SHARDS() clause allows you to query the same Oracle supplied objects,
such as V$, DBA/USER/ALL views and dictionary objects and tables, on all of the shards and return
the aggregated results.
As shown in the examples, an object in the FROM part of the SELECT statement is wrapped in the
SHARDS() clause to specify that this is not a query to a local object, but to objects on all shards in
the sharded database configuration. A virtual column called SHARD_ID is automatically added to a
SHARDS()-wrapped object during execution of a multi-shard query to indicate the source of every
row in the result. The same column can be used in a predicate to prune the query.
A query with the SHARDS() clause can be executed only on the shard catalog database.
12c
Multi-shard queries always used SCN synchronization and were resource intensive
18c
New initialization parameter: MULTISHARD_QUERY_DATA_CONSISTENCY
You may want to specify different data consistency levels for some multi-shard queries because, for example, it may be desirable to avoid the cost of SCN synchronization across multiple shards, which may be globally distributed. Another use case is when you use standbys for replication and slightly stale data is acceptable for multi-shard queries; in that case, the results can be fetched from the primary or its standbys.
A new user-visible database parameter, MULTISHARD_QUERY_DATA_CONSISTENCY, has been
added in Oracle Database 18c to specify the consistency level for multi-shard queries. The parameter
can have one of the following values:
• strong (default): With this setting, SCN synchronization is performed across all shards, and data is consistent across all shards. This setting provides global consistent read capability.
• shard_local: With this setting, SCN synchronization is not performed across all shards.
Data is consistent within each shard. This setting provides the most current data.
• delayed_standby_allowed: With this setting, SCN synchronization is not performed
across all shards. Data is consistent within each shard. This setting allows data to be fetched
from Data Guard standby databases when possible (for example, depending on load
balancing), and may return stale data from standby databases.
The default mode is strong, which performs SCN synchronization across all shards. The other modes skip SCN synchronization. The delayed_standby_allowed mode also allows fetching data from the standbys, depending on load balancing and so on, and thus may return stale data.
This parameter can be set either at the system level or at the session level.
See the Oracle Database Reference Guide for more information about
MULTISHARD_QUERY_DATA_CONSISTENCY usage.
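A minimal sketch of setting the parameter at the session level (the value is one of the three modes described above):

```sql
-- Sketch: skip SCN synchronization for this session's multi-shard
-- queries; data remains consistent within each shard.
ALTER SESSION SET MULTISHARD_QUERY_DATA_CONSISTENCY = 'SHARD_LOCAL';
```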
LOB is a widely used, first-class data type in Oracle Database. Release 18c enables the use of LOBs, JSON, and spatial objects in an Oracle Sharding environment, which is useful for applications that use these data types where storage in sharded tables would facilitate business requirements. JSON operators that generate temporary LOBs, large JSON documents (those that require LOB storage), spatial objects, indexes and operators, and persistent LOBs can be used in an Oracle Sharding environment. The following interfaces are new or changed as part of this feature.
In a system-managed sharded database, you must specify a tablespace set for the LOBs, and then
include it in the CREATE SHARDED TABLE statement for the parent table as shown in the examples
here.
SQL> CREATE SHARDED TABLE customers ( CustId VARCHAR2(60) NOT NULL, … image BLOB,
CONSTRAINT pk_customers PRIMARY KEY (CustId),
CONSTRAINT json_customers CHECK (CustProfile IS JSON))
In a composite sharded database, you must specify a tablespace set for each shardspace for the
LOBs, and then include them in the CREATE SHARDED TABLE statement for the parent table as
shown in the examples in the slide.
In a user-defined sharded database, you must specify a tablespace, not a tablespace set, for each
shardspace for the LOBs, and then include them in the CREATE SHARDED TABLE statement for the
parent table as shown in the examples in the slide.
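A hedged sketch of the system-managed case (the tablespace sets ts1 and lobts1 are assumed to exist; the LOB storage clause follows the pattern described above):

```sql
-- Sketch: store the BLOB column in its own tablespace set.
CREATE SHARDED TABLE customers
( CustId      VARCHAR2(60) NOT NULL,
  CustProfile VARCHAR2(4000),
  image       BLOB,
  CONSTRAINT pk_customers PRIMARY KEY (CustId),
  CONSTRAINT json_customers CHECK (CustProfile IS JSON)
)
TABLESPACE SET ts1
LOB (image) STORE AS (TABLESPACE SET lobts1)
PARTITION BY CONSISTENT HASH (CustId) PARTITIONS AUTO;
```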
12c
• There are restrictions on query shapes.
• Only system-managed sharding is supported.
18c
• All query shapes supported
• System-managed, user-defined, and composite sharding methods supported
• Centralized execution plan display available
• Oracle supplied objects in queries
In Oracle Database 12.2, there were several restrictions on the query shapes that could be used on
queries over multiple shards, and multi-shard queries were supported only in sharded databases
using the system-managed sharding method.
The restrictions lifted in Oracle Database 18c are:
• Support for composite and user-defined sharding
• Multi-shard query execution plan display
• Support for all query shapes, such as views, subqueries, joins on non-sharding columns, and so on
• Support for Oracle-supplied tables/views (using the SHARDS() clause and SHARD_ID) and PL/SQL functions
• Support for multi-column sharding keys
• Use of SET operators
18c
Oracle Sharding documentation has its own book, Oracle Database Using Oracle Sharding,
included in Oracle Database documentation library in Oracle Help Center.
In Oracle Database 18c, the Oracle Sharding documentation has been moved from part seven of the
Oracle Database Administrator’s Guide to its own new book, called Oracle Database Using Oracle
Sharding, in the Oracle Database documentation library in Oracle Help Center.
Database Sharding
To get detailed information about how to perform any of the operations covered in this lesson, refer to
the following guides in the Oracle documentation:
• Oracle Database SQL Language Reference 12c Release 2 (12.2)
• SQL*Plus User’s Guide and Reference Release 2 (12.2)
[Diagram: an unsharded table horizontally partitioned across three servers: Server A, Server B, and Server C]
Sharding is a data tier architecture where data is horizontally partitioned across independent
databases. Each database in such a configuration is called a shard. All shards together make up a
single logical database, which is known as a sharded database or SDB.
Horizontal partitioning involves splitting a database table across shards so that each shard contains
the table with the same columns but a different subset of rows. The diagram in the slide shows an
unsharded table on the left with the rows represented by different colors. On the right, the same table
data is shown horizontally partitioned across three shards or independent databases. Each partition
of the logical table resides in a specific shard. Such a table is referred to as a sharded table.
Sharding is a shared-nothing database architecture because shards do not share physical resources
such as CPU, memory, or storage devices. Shards are also loosely coupled in terms of software;
they do not run clusterware.
From a database administrator’s perspective, an SDB consists of multiple databases that can be
managed either collectively or individually. However, from an application developer’s perspective, an
SDB looks like a single database: the number of shards and the distribution of data across them are
completely transparent to database applications.
Sharding eliminates performance bottlenecks and makes it possible to linearly scale performance
and capacity by adding shards. Sharding is a shared-nothing architecture that eliminates single
points of failure—such as shared disks, SAN, and clusterware—and provides strong fault isolation.
The failure or slowdown of one shard does not affect the performance or availability of other shards.
Sharding enables storing particular data close to its consumers and satisfying regulatory
requirements when data must be located in a particular jurisdiction. Applying configuration changes
on one shard at a time does not affect other shards, and allows administrators to first test changes
on a small subset of data. Sharding is well suited to deployment in the cloud. Shards may be sized
as required to accommodate whatever cloud infrastructure is available and still achieve required
service levels. A sharded database (logical representation) supports up to 1,000 shards
(independent databases).
• Relational schemas
• Database partitioning
• ACID properties and read consistency
• SQL and other programmatic interfaces
• Complex data types
• Online schema changes
• Multicore scalability
Oracle Sharding provides the benefits of sharding without sacrificing the capabilities of an enterprise
RDBMS.
Oracle Sharding is for OLTP applications that are suitable for a sharded database.
Existing applications that were never intended to be sharded require some level of redesign to
achieve the benefits of a sharded architecture. In some cases, it may be as simple as providing the
sharding key; in other cases, it may be impossible to horizontally partition the data and workload as
required by a sharded database.
Many customer-facing web applications, such as e-commerce, mobile, and social media, are well-suited for sharding. Such applications have a well-defined data model and data distribution strategy
(hash, range, list, or composite) and primarily access data by using a sharding key. Examples of
sharding keys include customer_ID, account_number, and country_id. Applications also
usually require partial denormalization of data to perform well with sharding.
OLTP transactions that access data associated with a single value of the sharding key are the
primary use cases for a sharded database—for example, lookup and update of a customer’s records,
subscriber documents, financial transactions, e-commerce transactions, and so on. Because all the
rows that have the same value of the sharding key are guaranteed to be on the same shard, such
transactions are always single-shard and executed with the highest performance and provide the
highest level of consistency. Multi-shard operations are supported, but with a reduced level of
performance and consistency. Such transactions include simple aggregations, reporting, and so on,
and play a minor role in a sharded application relative to workloads dominated by single-shard OLTP
transactions.
Shards are independent Oracle databases that are hosted on database servers that have their own
local resources: CPU, memory, and disk. No shared storage is required across the shards. A
sharded database is a collection of shards. Shards can all be placed in one region or can be placed
in different regions. A region in the context of Oracle Sharding represents a data center or multiple
data centers that are in close network proximity. All shards of an SDB always have the same
database schema and contain the same schema objects.
A global service is an extension to the notion of a traditional database service. All the properties of
traditional database services are supported for global services. For sharded databases, additional
properties are set for global services, for example, database role, replication lag tolerance, region
affinity between clients and shards, and so on. For a read/write transactional workload, a single
global service is created to access data from any primary shard in an SDB.
The shard catalog is an enhanced Global Data Services (GDS) catalog to support Oracle Sharding.
A shard director is a specific implementation of a global service manager that acts as a regional
listener for clients that connect to an SDB, and maintains a current topology map of the SDB. Oracle
supports connection pooling in data access drivers such as OCI, JDBC, ODP.NET, and so on. In
Oracle 12c Release 2, these drivers can recognize sharding keys that are specified as part of a
connection request. The diagram in the slide shows the typical components of Oracle Sharding.
The shard catalog is a special-purpose Oracle Database that is a persistent store for SDB
configuration data, and plays a key role in centralized management of a sharded database. All
configuration changes, such as adding and removing shards and global services, are initiated on the
shard catalog. All DDLs in an SDB are executed by connecting to the shard catalog.
The shard catalog also contains the master copy of all duplicated tables in an SDB. It uses
materialized views to automatically replicate changes to duplicated tables in all shards. The shard
catalog database also acts as a query coordinator that is used to process multi-shard queries and
queries that do not specify a sharding key.
High availability for the shard catalog can be implemented by using Oracle Data Guard. The
availability of the shard catalog has no impact on the availability of the SDB. An outage of the shard
catalog affects only the ability to perform maintenance operations or multi-shard queries during the
brief period required to complete an automatic failover to a standby shard catalog. OLTP
transactions continue to be routed and executed by the SDB, and are unaffected by a catalog
outage.
The global service manager was introduced in Oracle Database 12c to route connections based on
database role, load, replication lag, and locality. In support of Oracle Sharding, global service
managers have been enhanced to support the routing of connections based on the location of data.
A global service manager, in the context of Oracle Sharding, is known as a shard director.
A shard director is a specific implementation of a global service manager that acts as a regional
listener for clients that connect to an SDB, and maintains a current topology map of the SDB. Based
on the sharding key that is passed during a connection request, it routes the connections to the
appropriate shard.
For a typical SDB, a set of shard directors is installed on dedicated low-end commodity servers in
each region. Multiple shard directors should be deployed for high availability. In Oracle Database
12c Release 2, up to five shard directors can be deployed in a given region.
[Diagram: clients connect through connection pools to shard directors (Shdir1, Shdir2) in a region; the shard catalog (shardcat) and a shardgroup (shgrp1) of primary shards in Availability_Domain1 are protected by Data Guard fast-start failover to HA standbys]
Oracle Sharding is built on the Global Data Services (GDS) architecture. GDS is the Oracle
scalability, availability, and manageability framework for multidatabase environments. GDS presents
a multi-database configuration to database clients as a single logical database by transparently
providing failover, load balancing, and centralized management for database services.
GDS routes a client request to an appropriate database based on availability, load, network latency,
replication lag, and other parameters. In Oracle Database 12c Release 1, GDS supports only fully
replicated databases: it assumes that when a global database service is enabled on multiple
databases, all of them contain a full set of data provided by the service.
Oracle Database 12c Release 2 extends the concept of a GDS pool to a Sharded GDS pool. Unlike
the regular GDS pool that contains a set of fully replicated databases, the sharded GDS pool
contains all shards of an SDB and their replicas. For database clients, the sharded GDS pool creates
an illusion of a single sharded database, the same way as the regular GDS pool creates an illusion
of a single non-sharded database.
The diagram in the slide illustrates a typical GDS architecture that has two data centers (APAC,
EMEA) and two sets of replicated databases (SALES, HR). The GDS catalog is using Oracle Data
Guard between the two regions for high availability. The SALES database is replicated with Active
Data Guard. The HR database is replicated with Oracle GoldenGate.
• Use a sharding key (partition key) to distribute partitions across shards at the tablespace
level.
• The NUMBER, INTEGER, SMALLINT, RAW, (N)VARCHAR, (N)CHAR, DATE, and
TIMESTAMP data types are supported for the sharding key.
A sharded table is a table that is partitioned into smaller and more manageable pieces among
multiple database instances, called shards. Oracle Sharding is implemented based on the Oracle
Database partitioning feature. It is essentially distributed partitioning because it extends partitioning
by supporting the distribution of table partitions across shards.
Partitions are distributed across shards at the tablespace level, based on a sharding key. Each
partition of a sharded table resides in a separate tablespace, and each tablespace is associated with
a specific shard. Depending on the sharding method, the association can be established
automatically or defined by the administrator. Even though the partitions of a sharded table reside in
multiple shards, to the application, the table looks and behaves exactly the same as a partitioned
table in a single database. The SQL statements that are issued by an application need not refer to
shards or depend on the number of shards and their configuration.
The slide syntax shows a table that is partitioned by consistent hash, which is a special type of hash
partitioning that is commonly used in scalable distributed systems. This technique automatically
spreads tablespaces across shards to provide an even distribution of data and workload. The
database creates and manages tablespaces as a unit, called a tablespace set. The PARTITIONS
AUTO clause specifies that the number of partitions should be automatically determined. This type of
hashing provides more flexibility and efficiency in migrating data between shards, which is important
for elastic scalability.
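A hedged sketch of this system-managed syntax (table and column names are placeholders; the tablespace set ts1 is assumed to exist):

```sql
-- Sketch: partitions are spread across shards automatically by
-- consistent hash on the sharding key.
CREATE SHARDED TABLE customers
( cust_id NUMBER NOT NULL,
  name    VARCHAR2(50),
  CONSTRAINT pk_customers PRIMARY KEY (cust_id)
)
TABLESPACE SET ts1
PARTITION BY CONSISTENT HASH (cust_id) PARTITIONS AUTO;
```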
A sharded table family is a set of tables that are sharded in the same way. Parent-child relationships
between database tables with a referential constraint in a child table (foreign key) that refers to the
primary key of the parent table form a tree-like structure where every child has a single parent. Such
a set of tables is referred to as a table family. A table in a table family that has no parent is called the
root table. There can be only one root table in a table family. In Oracle Database 12c Release 2, only
a single table family is supported in an SDB.
Reference partitioning is the recommended way to create a sharded table family. The corresponding
partitions of all the tables in the family are stored in the same tablespace set. Partitioning by
reference simplifies the syntax because the partitioning scheme is specified only for the root table.
Also, partition management operations that are performed on the root table are automatically
propagated to its descendants. For example, when adding a partition to the root table, a new
partition is created on all its descendants. The partitioning column is present in all tables in the
family, even though reference partitioning in general allows a child table to be equi-partitioned with
its parent without duplicating the key columns in the child table. The reason is that reference
partitioning requires a primary key in the parent table, because the primary key must be referenced
by the foreign key constraint of the child table that links the child to its parent. A primary key on a
sharded table, however, must either be the same as the sharding key or contain the sharding key as
the leading column. This makes it possible to enforce global uniqueness of the primary key without
coordination with other shards, a critical requirement for linear scalability.
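The reasoning above can be sketched quickly: if the primary key's leading column is the sharding key, any two rows with the same primary key necessarily share a sharding key and therefore land on the same shard, so a local unique index is enough to enforce global uniqueness. The shard-mapping function and key values below are hypothetical, for illustration only.

```python
# Hypothetical stand-in for the shard-mapping function; Oracle Sharding
# actually uses partitioning by consistent hash, not a plain modulo.
def shard_of(sharding_key: int, num_shards: int = 4) -> int:
    return sharding_key % num_shards

# Two rows with an identical primary key (cust_id, order_id) share cust_id,
# so they always map to the same shard, where the local unique index on the
# primary key rejects the duplicate -- no cross-shard check is needed.
pk1 = (101, 5001)   # (sharding key, local key)
pk2 = (101, 5001)
same_shard = shard_of(pk1[0]) == shard_of(pk2[0])
print(same_shard)   # True
```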
Distribution of partitions across shards is achieved by creating partitions in tablespaces that reside
on different shards. Each partition of a sharded table is stored in a separate tablespace, making the
tablespace the unit of data distribution in an SDB. To minimize the number of multi-shard joins, the
corresponding partitions of all the tables in a table family are always stored in the same shard. This
is guaranteed when the tables in a table family are created in the same set of distributed tablespaces
as shown in the syntax examples for this lesson, where the tablespace set ts1 is used for all tables.
However, it is possible to create different tables from a table family in different sets of tablespaces,
for example, the Customers table in the tablespace set ts1 and Orders in the tablespace set ts2. In
this case, it must be guaranteed that the tablespace that stores partition 1 of Customers always
resides in the same shard as the tablespace that stores partition 1 of Orders. To support this
functionality, a set of corresponding partitions from all the tables in a table family, called a chunk, is
formed. A chunk contains a single partition from each table of a table family.
The illustration in the slide shows a chunk that contains corresponding partitions from the tables of
the Customers-Orders-LineItems schema.
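As a rough model of the chunk concept (the partition naming here is illustrative, not what Oracle generates):

```python
# A chunk groups the corresponding partition of every table in a table
# family, so those partitions are always stored, and moved, together.
FAMILY = ["Customers", "Orders", "LineItems"]  # tables from the example schema

def chunk(partition_no: int) -> dict[str, str]:
    """Partition n of every table in the family forms chunk n."""
    return {table: f"{table}_P{partition_no}" for table in FAMILY}

print(chunk(1))
# {'Customers': 'Customers_P1', 'Orders': 'Orders_P1', 'LineItems': 'LineItems_P1'}
```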
System-managed sharding is a sharding method that does not require the user to specify a mapping
of data to shards. Data is automatically distributed across shards using partitioning by consistent
hash. The partitioning algorithm evenly and randomly distributes data across shards. The distribution
used in system-managed sharding is intended to eliminate hot spots and provide uniform
performance across shards. Oracle Sharding automatically maintains balanced distribution of data
when shards are added to or removed from an SDB.
Consistent hash is a partitioning strategy that is commonly used in scalable distributed systems. It is
different from traditional hash partitioning. With traditional hashing, the bucket number is calculated
as HF(key) % N where HF is a hash function and N is the number of buckets. This approach works
fine if N is constant, but requires reshuffling of all data when N changes. More advanced algorithms,
such as linear hashing, do not require rehashing of the entire table to add a hash bucket, but they
impose restrictions on the number of buckets (for example, it can only be a power of 2) and on the
order in which the buckets can be split.
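The reshuffling cost of traditional hashing is easy to demonstrate. This sketch uses integer keys with an identity hash, so the exact fraction moved is an artifact of the toy setup:

```python
# Traditional hash partitioning: bucket = HF(key) % N.
# Growing N from 4 to 5 buckets changes the bucket of most keys,
# which is why nearly all data must be reshuffled.
def bucket(key: int, n: int) -> int:
    return key % n  # identity hash for this toy example

keys = range(10_000)
moved = sum(1 for k in keys if bucket(k, 4) != bucket(k, 5))
print(moved, len(keys))  # 8000 of 10000 keys change buckets
```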
The implementation of consistent hashing that is used in Oracle Sharding avoids these limitations by
dividing the possible range of values of the hash function (for example, from 0 to 2^32) into a set of N
adjacent intervals, and assigning each interval to a chunk. In this example, the SDB contains 1024
chunks, and each chunk gets assigned a range of 2^22 hash values. Therefore, partitioning by
consistent hash is essentially partitioning by the range of hash values.
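The interval-to-chunk arithmetic described above can be sketched with the numbers from this example (a 32-bit hash space and 1024 chunks); this illustrates the mapping, not Oracle's internal code:

```python
# Partitioning by consistent hash, modeled as range partitioning over
# the hash value space: each chunk owns one contiguous interval.
HASH_SPACE = 2 ** 32                     # hash values 0 .. 2^32 - 1
NUM_CHUNKS = 1024
CHUNK_RANGE = HASH_SPACE // NUM_CHUNKS   # 2^22 hash values per chunk

def chunk_for(hash_value: int) -> int:
    """Return the chunk whose interval contains this hash value."""
    return hash_value // CHUNK_RANGE

print(CHUNK_RANGE)                # 4194304 == 2**22
print(chunk_for(0))               # 0 (first chunk)
print(chunk_for(HASH_SPACE - 1))  # 1023 (last chunk)
```

Moving a chunk to another shard relocates only that interval's data; the other 1,023 chunks keep their assignments, which is the flexibility the text attributes to consistent hashing.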
Data is first partitioned by list or range across multiple shardspaces, and then further
partitioned by consistent hash across multiple shards in each shardspace.
The composite sharding method allows you to create multiple shardspaces for different subsets of
data in a table partitioned by consistent hash. A shardspace is a set of shards that stores data that
corresponds to a range or list of key values. System-managed sharding does not give you any
control over the assignment of data to shards.
When sharding by consistent hash on a primary key, there is often a requirement to differentiate
subsets of data within an SDB in order to store them in different geographic locations, allocate to
them different hardware resources, or configure high availability and disaster recovery differently.
Usually this differentiation is done based on the value of another (non-primary) column, for example,
customer location or a class of service.
With composite sharding, data is first partitioned by list or range across multiple shardspaces, and
then further partitioned by consistent hash across multiple shards in each shardspace. The two
levels of sharding make it possible to automatically maintain a balanced distribution of data across
shards in each shardspace, and at the same time, partition data across shardspaces. The slide
illustration shows two tablespace sets: tbs1 at the top and tbs2 at the bottom. Tablespace set tbs1 is
labeled “Shardspace for GOLD customers - shspace1” and contains three shards, each of which
contains a range of tablespaces and their respective partitions. Tablespace set tbs2 is labeled
“Shardspace for SILVER customers - shspace2” and contains four shards, each of which contains a
range of tablespaces and their respective partitions.
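The two-level routing can be sketched using the shardspace names and shard counts from the illustration (the mapping function is a simplified stand-in; real routing goes through chunks and tablespace sets):

```python
# Composite sharding: level 1 partitions by LIST on the class-of-service
# column to pick a shardspace; level 2 partitions by consistent hash on
# the sharding key within that shardspace (modulo used as a stand-in).
SHARDSPACES = {"GOLD": "shspace1", "SILVER": "shspace2"}
SHARD_COUNT = {"shspace1": 3, "shspace2": 4}  # from the slide illustration

def route(customer_class: str, cust_id: int) -> tuple[str, int]:
    space = SHARDSPACES[customer_class]   # list partitioning
    shard = cust_id % SHARD_COUNT[space]  # hash partitioning stand-in
    return space, shard

print(route("GOLD", 7))     # ('shspace1', 1)
print(route("SILVER", 7))   # ('shspace2', 3)
```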
In addition to sharded tables, an SDB can contain tables that are duplicated on all shards. For many
applications, the number of database requests handled by a single shard can be maximized by
duplicating read-only or read-mostly tables across all shards. This strategy is a good choice for
relatively small tables that are often accessed together with sharded tables. A table with the same
contents in each shard is called a duplicated table.
Oracle Sharding synchronizes the contents of duplicated tables by using Materialized View
Replication. A duplicated table on each shard is represented by a read-only materialized view. The
master table for the materialized views is located in the shard catalog. The CREATE DUPLICATED
TABLE statement automatically creates the master table, materialized views, and other objects
required for materialized view replication. The materialized views on all the shards are automatically
refreshed at a configurable frequency. The refresh frequency of all duplicated tables is controlled by
the SHRD_DUPL_TABLE_REFRESH_RATE database initialization parameter. The default value for the
parameter is 60 seconds.
• Direct Routing: In the first case (the first bullet point), a transaction executes on a single
shard. In the second case (the second bullet point), JDBC/UCP, OCI, and ODP.NET recognize
the sharding keys.
• Proxy Routing: In the last case (the last bullet point), queries execute in parallel across
shards (for example, aggregates on sales data).
[Figure: three application tiers, each with its own connection pool, connecting through the routing tier (the shard directors) to the shards]
• The DBA can manually move or split a chunk from one shard to another.
• When a new shard is added, chunks are automatically rebalanced.
• Before a shard is removed, chunks must be manually moved.
• Connection pools are notified (via ONS) about chunk splits and moves, the addition or removal
of shards, auto-resharding, and read-only access operations.
• All shards can be patched with one command via opatchauto.
• Enterprise Manager (EM) supports monitoring and management of an SDB.
In the second case (the second bullet point), RMAN incremental backup and transportable
tablespace are used.
In the fourth case (the fourth bullet point), the application can either reconnect or continue with read-only access.
The Oracle Sharding architecture uses separate server hosts for the shard catalog, the shard directors,
and the shards. Up to 1,000 shards are supported in a given sharded database (SDB). Deploying a
sharded database can be a lengthy process because the Oracle software is installed separately on
each server host.
The slide presents a very high-level overview of the steps that are necessary to deploy
Oracle Sharding. For detailed information, see Oracle Database Administrator’s Guide 12c Release
2 (12.2).
The slide continues with the very high-level overview of the steps that are necessary to
deploy Oracle Sharding. For detailed information, see Oracle Database Administrator’s Guide 12c
Release 2 (12.2).
14. Use GDSCTL connected to the shard director host to run the DEPLOY command, which:
– Creates all primary and standby shard databases using DBCA
– Enables archiving and flashback for all shards
– Configures Data Guard Broker with Fast-Start Failover enabled
– Starts observers on the standby group’s shard director
15. Use GDSCTL to add and start a global service that runs on all primary shards.
16. Use GDSCTL to add and start a global service for read-only workloads on all standby
shards.
The slide continues with the very high-level overview of the steps that are necessary to
deploy Oracle Sharding. For detailed information, see Oracle Database Administrator’s Guide 12c
Release 2 (12.2).