
DISTRIBUTED DATABASE

Distributed databases are systems that act as a single database but are located in
different locations. These locations can be anywhere, from the next office to the other
side of the world. In a networked environment with various nodes connected together,
distributed databases act as a single system. In order to fully understand how a distributed
system works, the DBA must understand multiple hardware
systems, networking, client/server technology, and database management.

There is little information concerning the management of distributed databases.
This changes dramatically with the release of Oracle8i and the further development of data
warehouses.

Describing Each Type of Database :

Distributed databases are actually two types of systems :

• Distributed databases with remote queries, data manipulation, and two-phase
commit.
• Replicated databases through database-managed methods such as snapshots and
triggers, or other non-database-managed methods (such as the COPY command in
SQL*Plus).

A distributed database, in the purest form, is a series of independent databases
logically linked together through a network to form a single view. Replicated databases
contain information from other remote databases copied through a network connection.

Replicated databases are most easily classified by the method used to pass
information between them. The following are the two primary methods for this copy
process (most commonly referred to as propagation) :

• Distributed transactions
• Snapshot refreshes

A distributed transaction is the process wherein a user updates one site and the
changes are sent to another site by means of triggers and procedures. Snapshots are
copies of a table (or a subset of it) that are propagated to each of the remote sites from a master
site.

In order to determine the best access technique for an optimum execution plan,
the optimizer must first be capable of determining a sufficient number of alternative
paths. The remote join enhancements in Oracle8i present more options, allowing for
better execution plans to be generated with a corresponding performance increase.

DATABASE NAMING CONVENTIONS

Access to the Internet and Internet-based databases requires naming conventions.
Oracle recommends that the naming convention of databases and their links follow
standard Internet domain-naming conventions. A name following this convention can have several parts,
each separated by dots like those in an IP address.

The name of the database is read from right to left. This naming convention can
have several parts, with the first (rightmost) being the base or root domain. By
default, the domain is world. This is not required in Net8, but for consistency it might be
best to include it in order to support the older versions of SQL*Net.

The domain name can be based on the structure of the company or locations. For
example, the naming convention for the database in Germany can be done in a few ways.
In the simplest form, the name can be GERMANY.WORLD. If the database name is
based on location, the name would be GERMANY.BWC with BWC being the domain.
One other way would be to expand the location to continents, and then the name would
be GERMANY.EUROPE.BWC. Whatever the naming convention, it must be easily
understood for future maintainability yet remain transparent to programmers and users.
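
The right-to-left reading order described above can be illustrated with a small sketch. This is not an Oracle utility; the function name parse_global_name is hypothetical, and the fallback to WORLD simply mirrors the default domain mentioned in the text.

```python
def parse_global_name(global_name):
    """Split a dotted database name into (database, domain parts).

    The domain parts are returned root-first, which mirrors the
    right-to-left reading order of the name.
    """
    parts = global_name.strip().upper().split(".")
    if any(not part for part in parts):
        raise ValueError(f"empty component in {global_name!r}")
    database = parts[0]
    # Assumption for this sketch: fall back to WORLD when no domain is given.
    domain = parts[1:] or ["WORLD"]
    return database, list(reversed(domain))


for name in ("GERMANY.WORLD", "GERMANY.BWC", "GERMANY.EUROPE.BWC", "GERMANY"):
    print(name, "->", parse_global_name(name))
```

For GERMANY.EUROPE.BWC the sketch yields the database GERMANY with the domain read as BWC first, then EUROPE.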

There are some limitations when naming a domain and database. In addition to
normal Oracle naming conventions (no spaces, no special characters, and so on), the size
of the name may be limited by the operating system or network.

ACHIEVING TRANSPARENCY

It is important that when providing names to the tables and objects on remote
systems, the naming convention allows the programmer and user of the system to access
the table as they would if it were local. Transparency is the concept that all objects are
accessible and look the same for the DBA and user alike. Oracle, through the use of
SQL*Net and transparent gateways, enables the development of a system that looks the
same regardless of database vendor, type, and location.

The purpose of transparency is to provide seamless access to all databases. For
example, users should be able to access any table in the same way. A table located on
a Sybase database should be accessible with the same SQL syntax as a local table. A
table located on an Oracle database in Germany should be accessible with the same
syntax as a local table in Detroit. For example, through the use of a transparent gateway,
the Sybase system will look like an Oracle database and can be referenced by
using a database link.

Heterogeneous Services (HS) agents have been made multithreaded in Oracle8i.
This new feature reduces the amount of system resources consumed when there are
large numbers of user sessions concurrently accessing the same non-Oracle system. This
more efficient use of system resources allows a greater number of concurrent user
sessions.

[Figure: transparent gateway configuration. Left side: Oracle Database (db.hpux.world), TCP/IP Protocol Adapter, HP/UX, TCP/IP Network. Right side: Sybase Database (db.ibm.world), Transparent Gateway Adapter for Sybase, Token Ring Protocol Adapter, IBM MVS, Token Ring Network. Bottom: Client with Multi-protocol Adapter.]

There must be a methodology in place to propagate any change of database
address to other distributed sites. This is ultimately the responsibility of the DBA. Many
sites place the TNSNAMES.ORA and SQLNET.ORA on a centralized application server,
which enables management of TNSNAMES in a less painful manner. However, this may
not always be an optimal solution. In a distributed environment, each site might have
other independent databases and cannot have a localized TNSNAMES. If this is the case,
the company might decide to use Oracle Names rather than local TNSNAMES files as the method of
managing connectivity. Schema names, userIDs, and passwords must also be managed
when creating a distributed environment.

Agent self-registration was introduced in Oracle8.0. It reduces or eliminates the
need for DBA intervention in configuring heterogeneous services. In Oracle8i, code has
been rewritten to make the self-registration process more efficient.

New in Oracle8i is an agent-specific shared library for HS object files, other than
drivers, that is substituted when linking agent executables. While the benefits from this
change are platform-specific, they can improve scalability by using a single agent library
for all types of agents (extproc, hsalloci, hssqlpss, hsdepxa and hsots). Also, memory
requirements might be reduced because agent executables become quite small.

USING A DISTRIBUTED DATABASE

The distributed database is made up of several databases that are linked by
database links and network connections. Management of these systems can vary greatly.
In large, highly decentralized companies, each location may have its own DBA. In a
highly centralized company, one DBA may be responsible for databases located in
different states, and in some cases, countries.

There are several reasons to utilize a distributed database system. The following
are some of them :

• Application design – Certain software designs evolve into a distributed system.
Companies that have multiple locations with separate IT/IS departments or
databases will utilize the same or similar software. Each locality will maintain
data that is unique to its own environment. As the company evolves, each locality
will require access to data located at the other sites. This will eventually develop
into a distributed system.
• Improved Performance – Local databases will be smaller than a larger centralized
database. As a result, queries and other transactions on the database will be faster.
Network activity will be significantly reduced, improving the overall system.
• Smaller, less expensive hardware requirements – By reducing the total number of
users on each system, the hardware required to support such a system can be
considerably smaller.
• Improved reliability and availability – By distributing the data over several sites,
the system is more likely to be up. While one site might be down, the rest of the
system will still be up and accessible. This can be improved by distributing not
only the data but the software as well. The distribution of software adds another
dimension of reliability.
• Improved mass deployment support and front office applications – In release 8.1
(8i), Oracle provides advanced replication for both back office and front office
applications. Back office applications require near
real-time replication of data. The front office applications area is a growing market,
particularly for mass deployment, where the advanced replication functionality has
many advantages.

SETTING UP A DISTRIBUTED SYSTEM

The setup of a distributed system might be dependent on the application. If the
software utilizes the distributed database, many of the tasks performed by the DBA, such
as the creation of database links, might be performed automatically during the installation
of the software.

Each database will be installed as an autonomous database. The setup should be
based on the number of concurrent users on the database in each country. Other factors to
consider when installing the database are as follows :
• Overall use of the database
• Other applications using the database
• Standard DBA tuning techniques
• Type of network protocol for connectivity
• How dynamic the data is, and how it will affect the network traffic
• Current level of network traffic and what the new databases will do to affect it

USING DATABASE LINKS

Database links are the method Oracle uses to access a remote database
object. There are three types of database links :

• PUBLIC LINKS : A public database link is similar to a public synonym; when
referenced, it is accessible to all users.

• PRIVATE LINKS : Private links are accessible only by the owner (schema) of the link.

• GLOBAL LINKS : A global link is created automatically when using Oracle
Names.

The syntax for creating a public or private database link is essentially the same :

CREATE
[SHARED]
[PUBLIC]
DATABASE LINK dblink
[authenticated clause] | [CONNECT TO {CURRENT_USER | user
IDENTIFIED BY password} [authenticated clause]] USING '{connect string}';

There are several differences between Oracle8 and Oracle7 for this syntax.

SHARED is an entirely new option. In order to use it, the database
must be using a multithreaded server. This enables the creation of a database link that can
use existing connections, if available.

AUTHENTICATED BY username IDENTIFIED BY password

The authenticated clause is associated with the SHARED portion of the database
link. This must be used when using SHARED.

CURRENT_USER uses the new Oracle8 global user type. This powerful new
feature enables creation of a user that can access any database on a node (server) with
the use of a single login.

Using the BWC example, the following code would create a simple public link
to the Germany database from Detroit :

CREATE PUBLIC DATABASE LINK germany.bwc.com
USING 'GERMANY';

If not defined by Oracle Names, the name GERMANY must appear, either as an
alias or as the actual name, in TNSNAMES.ORA. The TNSNAMES.ORA file must be on the
client and server. With this syntax, the user must have a current login on both the Germany
and Detroit databases.

USING INITIALIZATION PARAMETERS FOR DISTRIBUTED SYSTEMS :

Parameter                      Description

COMMIT_POINT_STRENGTH          Sets the commit point site in the two-phased
(0-255)                        commit. The site with the highest commit point
                               strength will be the commit point site. Each of
                               the sites using this as the commit point site
                               must be on the same node. The factors
                               determining which database should be the commit
                               point site should be ownership (driver) of
                               data, criticality of the data, and availability
                               of the system.

DISTRIBUTED_TRANSACTIONS       Limits the number of concurrent distributed
(0-TRANSACTIONS)               transactions. If set to 0, the RECO process
                               (Oracle’s recovery process) is not activated,
                               and distributed capabilities of the database
                               are disabled.

GLOBAL_NAMES                   If set to TRUE, the name referenced in the link
                               must match the name of the database and not the
                               alias. This must be used in order to utilize
                               advanced replication features of Oracle8.

DML_LOCKS                      Limits the number of DML locks in a single
(20-unlimited)                 transaction.

ENQUEUE_RESOURCES              Allows several concurrent processes to share
(10-65535)                     resources. To determine whether this value is
                               set appropriately, check enqueue_waits in the
                               V$SYSSTAT table. If this is a non-zero value,
                               increase ENQUEUE_RESOURCES.

MAX_TRANSACTION_BRANCHES       The maximum value for this parameter has been
(1-32)                         increased from 8 to 32. Branches are the number
                               of different servers (or server groups) that
                               can be accessed in a single distributed
                               transaction. Reducing this number might
                               decrease the amount of shared pool use.

OPEN_LINKS                     The number of concurrent open connections to
(0-255)                        other databases by one session. In a
                               distributed environment, care must be taken to
                               ensure that this value is not less than the
                               number of remote tables that can be referenced
                               in a single SQL statement. If the value is set
                               to 0, there can be no distributed transactions.

OPEN_LINKS_PER_INSTANCE        Limits the number of open links created by an
(0-UB4MAXVAL)                  external transaction manager.
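
The COMMIT_POINT_STRENGTH rule in the table (the site with the highest value becomes the commit point site) can be sketched as follows. This is only an illustration of the selection rule, not Oracle's implementation; the site names and strength values are made up, and the table does not say how ties are broken, so this sketch simply keeps the first site with the highest value.

```python
def choose_commit_point_site(strengths):
    """Return the site whose COMMIT_POINT_STRENGTH is highest.

    strengths maps a site name to its COMMIT_POINT_STRENGTH (0-255).
    """
    if not strengths:
        raise ValueError("no sites given")
    for site, value in strengths.items():
        # The table above gives 0-255 as the valid range.
        if not 0 <= value <= 255:
            raise ValueError(f"{site}: strength {value} is outside 0-255")
    # Assumption: on a tie, the first site in the mapping wins.
    return max(strengths, key=strengths.get)


# Hypothetical sites and values, for illustration only.
sites = {"DETROIT": 100, "GERMANY": 200, "JAPAN": 50}
print(choose_commit_point_site(sites))
```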

IDENTIFYING POTENTIAL PROBLEMS WITH A DISTRIBUTED SYSTEM :

System changes are the biggest problem in managing a simple distributed system.
It is similar to managing a single, autonomous database, but the difference is that each
database can see objects in other databases. Following are the changes within a system
that must be coordinated within a distributed system.

• Structural changes – Procedures and triggers used to update or replicate specific
remote tables will fail when the remote table is structurally altered or removed.
Statements that reference a dropped table through a database link will fail.
• Schema changes – In addition to tables being removed, a user
referenced in a database link must have the proper privileges. Changes to profiles
and roles must be coordinated to ensure that they will not affect the rest of the
distributed system.
• Changes to SQL*Net objects – Changes to files such as TNSNAMES, PROTOCOL, and
TNSNAV will affect the distributed database system.
• System changes – Coordination of outages is crucial, particularly for
systems that depend on distributed updates, snapshots, or replication. In many
cases, if this process is broken, manual intervention must occur in order to repair
the problem.

USING DISTRIBUTED TRANSACTIONS :

When referring to a distributed transaction and the two-phased commit, there is a
complexity that might seem overwhelming. Oracle7 and Oracle8i manage the distributed
transaction automatically. Most DBAs will not have the opportunity to actually see a
two-phase commit because they are trigger-based processes.

UNDERSTANDING TWO-PHASED COMMIT :

A two-phased commit occurs with all data manipulation language (DML), and as
the name implies, it is made up of two distinct phases.

The First Phase (Prepare Phase) :

The first phase of a two-phased commit is called the prepare phase. The initiating
database is known as the global coordinator; it sends a message to all other databases
informing them that an update is about to occur. Each receiving database responds to
the global coordinator, indicating whether the update can occur. A receiving database can respond in
three ways :

• Prepared – Ready for update.
• Read-only – No preparation necessary.
• Abort – The child cannot perform the update.

The receiving databases must also perform some system checks, including
sending a prepare statement to any of its own coordinating databases. This includes
placing locks (if not read-only) on the row to be updated. After all the receiving databases
have responded, the global coordinator places the transaction in-doubt. If all the children
respond with prepared or read-only, the global coordinator determines the commit point
site. The commit point site is the first database to be committed. Then it keeps up with
the status of the transaction until all transactions have been completed.

The second phase, or the commit phase, is the process of committing the data.
After all the receiving databases have successfully committed and communicated their
success to the global coordinator, the global coordinator sends a message to the commit
point site. The commit point site removes the in-doubt status. Each of the children
will commit and release the row-level lock on the record.
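
The two phases described above can be sketched as a small simulation. This is a simplified model under stated assumptions, not Oracle's protocol code: the Vote, Participant, and two_phase_commit names are hypothetical, each participant's answer is fixed in advance, and the commit point site, locking details, and network failures are left out.

```python
from enum import Enum


class Vote(Enum):
    PREPARED = "prepared"
    READ_ONLY = "read-only"
    ABORT = "abort"


class Participant:
    """A receiving database whose prepare answer is fixed in advance."""

    def __init__(self, name, vote):
        self.name = name
        self.vote = vote
        self.state = "idle"

    def prepare(self):
        # Phase one: record the answer sent back to the global coordinator.
        self.state = {
            Vote.PREPARED: "prepared",
            Vote.READ_ONLY: "read-only",
            Vote.ABORT: "aborted",
        }[self.vote]
        return self.vote

    def commit(self):
        # Only prepared participants have work (and locks) to commit.
        if self.state == "prepared":
            self.state = "committed"

    def rollback(self):
        if self.state == "prepared":
            self.state = "rolled back"


def two_phase_commit(participants):
    """Run the prepare phase, then commit or roll back everywhere."""
    votes = [p.prepare() for p in participants]
    if any(vote is Vote.ABORT for vote in votes):
        for p in participants:
            p.rollback()
        return "rolled back"
    # Phase two: every child answered prepared or read-only.
    for p in participants:
        p.commit()
    return "committed"


children = [Participant("GERMANY", Vote.PREPARED),
            Participant("JAPAN", Vote.READ_ONLY)]
print(two_phase_commit(children), [p.state for p in children])
```

A single Abort answer in the prepare phase is enough to roll back every prepared participant, which is the property that keeps the databases consistent.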

DEALING WITH IN-DOUBT TRANSACTION :

There might also be information in the ADVICE column. This indicates to the
DBA where the transaction was. For example

ALTER SESSION ADVISE COMMIT;

The column GLOBAL_TRAN_ID in the DBA_2PC_PENDING table provides
the DBA with the transaction ID. This ID is the same on all databases involved in the
distributed transaction.

Look at the DBA_2PC_NEIGHBORS table. Its columns are as follows :

LOCAL_TRAN_ID – The local transaction ID for this distributed transaction.

IN_OUT – Indicates whether the transaction is IN or OUT.

DATABASE – If the transaction is IN, this is the name of the database sending the
transaction. If the transaction is OUT, this is the name of the database link to the database
to which the transaction is being sent.

DBUSER_OWNER – If the transaction is IN, this is the name of the local user. If the
transaction is OUT, this is the owner of the database link.

INTERFACE – C for commit request, N if in prepare or read-only state.

DBID – Oracle unique identifier for the connecting database.

SESS# – Session ID of the connection.

BRANCH# – Transaction ID for this branch.

Using the information in this table (the database and LOCAL_TRAN_ID), follow
this thread to the next database.

Query the DBA_2PC_PENDING table. If LOCAL_TRAN_ID and
GLOBAL_TRAN_ID match, this is the global coordinator. This is where the DBA
should start to follow the transaction thread.
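
The check described above can be sketched over rows already fetched from DBA_2PC_PENDING on each database. The rows, ID values, and the function name find_global_coordinator are hypothetical. The sketch also assumes that "match" means the global ID either equals the local ID or ends with it; verify this against the IDs actually returned on your system.

```python
def find_global_coordinator(pending_by_db):
    """Return the database whose pending row has matching IDs, or None.

    pending_by_db maps a database name to a list of rows, each a dict
    with LOCAL_TRAN_ID and GLOBAL_TRAN_ID keys.
    """
    for db, rows in pending_by_db.items():
        for row in rows:
            local_id = row["LOCAL_TRAN_ID"]
            global_id = row["GLOBAL_TRAN_ID"]
            # Assumption: a match is equality or a trailing local ID.
            if global_id == local_id or global_id.endswith("." + local_id):
                return db
    return None


# Hypothetical rows, for illustration only.
pending = {
    "GERMANY": [{"LOCAL_TRAN_ID": "3.7.120",
                 "GLOBAL_TRAN_ID": "DETROIT.BWC.ab12.1.15.870"}],
    "DETROIT": [{"LOCAL_TRAN_ID": "1.15.870",
                 "GLOBAL_TRAN_ID": "DETROIT.BWC.ab12.1.15.870"}],
}
print(find_global_coordinator(pending))
```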

To force a transaction, use the transaction ID from the DBA_2PC_PENDING
table, as in the following :

COMMIT FORCE 'transaction_id';

ROLLBACK FORCE 'transaction_id';

If in doubt, always roll the transaction back. Then communicate this rollback to
the user community. As an added precaution, check the status of the tables affected to
ensure data consistency.
