Documenti di Didattica
Documenti di Professioni
Documenti di Cultura
This is a discussion on Oracle 10g RAC load-balancing configuration within the Database
Discussions forums in Database and Unix Discussions category; Hi, I am new to RAC
administration. I have been trying to configure load-balancing across two nodes in a cluster.
192.168.2.5, 192.168.2.6 are public IP addresses of the nodes and 192.168.2.9, 192.168.2.10
are the VIP addresses. My configuration files look like this:
-------------------- SQLNET.ORA ---------------------------
TRACE_LEVEL_CLIENT = OFF
names.directory_path = (TNSNAMES)
names.default_domain = world
name.default_zone = world
break_poll_skip=2000000000
----------------------------------------------------------------------------
--------------------- TNSNAMES.ORA --------------------------
TPCC.world =
(DESCRIPTION =
(load_balance = on)
(failover = on)
(ADDRESS_LIST =
(FAILOVER = ON)
(ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.2.9)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.2.10)(PORT = 1521))
)
(CONNECT_DATA =
(SERVICE_NAME = tpcc)
)
)
TPCC1.world =
(DESCRIPTION =
(load_balance = on)
(failover = on)
(ADDRESS_LIST =
(FAILOVER = ON)
(ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.2.9)(PORT = 1521))
)
(CONNECT_DATA =
(SERVICE_NAME = tpcc)
)
)
TPCC2.world =
(DESCRIPTION =
(load_balance = on)
(failover = on)
(ADDRESS_LIST =
(FAILOVER = ON)
(ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.2.10)(PORT = 1521))
)
(CONNECT_DATA =
(SERVICE_NAME = tpcc)
)
)
EXTPROC_CONNECTION_DATA =
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC))
(CONNECT_DATA =
(SID=PLSExtProc)(PRESENTATION=RO)))
LISTENERS_TPCC =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.2.9)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.2.10)(PORT = 1521))
)
----------------------------------------------------------------------------
LISTENER_TPCC2 =
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.2.10)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.2.6)(PORT = 1521))
)
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC))
)
)
----------------------------------------------------------------------------
INIT.ORA ---------------------------
BACKGROUND_DUMP_DEST =
/home/oracle/app/oracle/OraHome_1/rdbms/log
USER_DUMP_DEST =
/home/oracle/app/oracle/OraHome_1/rdbms/log
control_files = ?/dbs/tpcc_disks/control_001
parallel_max_servers = 48
DB_BLOCK_SIZE = 2048
recovery_parallelism = 20
compatible = 10.1.0.2
db_name = tpcc
db_files = 200
db_file_multiblock_read_count = 16
pga_aggregate_target = 800m
dml_locks = 500
log_checkpoint_interval = 100000000
log_checkpoints_to_alert = TRUE
open_cursors = 400
processes = 200
sessions = 600
transactions = 400
transactions_per_rollback_segment = 1
cursor_sharing = force
cursor_space_for_time = TRUE
timed_statistics = TRUE
disk_asynch_io = TRUE
db_writer_processes = 5
hpux_sched_noage = 178
db_cache_size = 2700m
java_pool_size = 0
large_pool_size = 300m
shared_pool_size = 1000m
log_buffer = 1024000
db_recycle_cache_size = 16m
db_cache_advice = ON
plsql_optimize_level = 2
undo_management = AUTO
-------------------------------------------------------------------
On both the nodes I have started the listeners LISTENER_TPCC1 and
LISTENER_TPCC2 respectively and their status looks similar to this:
-------------------------------------------------------------------
> lsnrctl start LISTENER_TPCC1
Connecting to
(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.
2.9)(PORT=1521)))
STATUS of the LISTENER
------------------------
Alias LISTENER_TPCC1
Version TNSLSNR for HPUX: Version 10.1.0.2.0 -
Production
Start Date 25-APR-2006 20:16:12
Uptime 0 days 0 hr. 0 min. 0 sec
Trace Level off
Security ON: Local OS Authentication
SNMP OFF
Listener Parameter File
/home/oracle/app/oracle/OraHome_1/bench/bench1000/benchrun/network/listener
.ora
Listener Log File
/home/oracle/app/oracle/OraHome_1/network/log/listener_tpcc1.log
Listening Endpoints Summary...
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168. 2.9)(PORT=1521)))
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168. 2.5)(PORT=1521)))
(DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC)) )
The listener supports no services
The command completed successfully
-------------------------------------------------------------------
Everytime I try to start the database instances I get the following
error:
Can somebody please help me out in pointing out what am I doing wrong?
Thanks a lot,
Sudhanshu
SOLUTION
Thanks for your time guys. I tried quite a few things in the pfile -
single quotes, without quotes, double quotes - finally I just took the
entry from the tnsnames.ora and pasted it there in the pfile itself. So
the entries in the new pfile look like this:
tpcc1.local_listener = '(ADDRESS =
(PROTOCOL=TCP)(HOST=192.168.2.9)(PORT=1521))'
tpcc2.local_listener = '(ADDRESS =
(PROTOCOL=TCP)(HOST=192.168.2.10)(PORT=1521))'
remote_listener = '(ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)(HOST =
192.168.2.9) (PORT = 1521)) (ADDRESS = (PROTOCOL = TCP)(HOST =
192.168.2.10)(PORT = 1521)))'
After this the database came up without problems. Still connection load
balancing was not working. So I figured it had to do something with
tnsnames.ora. I rewrote the tnsnames.ora file as:
TPCC.WORLD =
(DESCRIPTION =
(load_balance = on)
(failover = on)
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.2.9)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.2.10)(PORT = 1521))
)
(CONNECT_DATA = (SERVICE_NAME = tpcc))
)
TPCC1.WORLD =
(DESCRIPTION =
(load_balance = on)
(failover = on)
(ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.2.9)(PORT = 1521))
(CONNECT_DATA =
(SERVICE_NAME = tpcc)
(INSTANCENAME = tpcc1)
(SERVER = DEDICATED)
)
)
TPCC2.WORLD =
(DESCRIPTION =
(load_balance = on)
(failover = on)
(ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.2.10)(PORT = 1521))
(CONNECT_DATA =
(SERVICE_NAME = tpcc)
(INSTANCENAME = tpcc2)
(SERVER = DEDICATED)
)
---------------------------------------------------------------------------
---------------------------------------------------------------------------------------------
--------------------------
> lsnrctl services LISTENER_TPCC1
Connecting to (DESCRIPTION=(address=(protocol=ipc)(key=extproc)) )
Services Summary...
Service "plsextproc" has 1 instance(s).
Instance "plsextproc", status UNKNOWN, has 1 handler(s) for this
service...
Handler(s):
"DEDICATED" established:0 refused:0
LOCAL SERVER
Service "tpcc" has 2 instance(s).
Instance "tpcc1", status READY, has 2 handler(s) for this service...
Handler(s):
"DEDICATED" established:0 refused:0 state:ready
REMOTE SERVER
(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.2.9)(PORT=152 1))
"DEDICATED" established:0 refused:0 state:ready
LOCAL SERVER
Instance "tpcc2", status READY, has 1 handler(s) for this service...
Handler(s):
"DEDICATED" established:0 refused:0 state:ready
REMOTE SERVER
(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.2.10)(PORT=15 21))
The command completed successfully
--------------------------------------------------------------------------------------------
After checking several connections using:
sqlplus "tpcc-at-tpcc/tpcc"
I could see the connections were being load balanced properly between
the two nodes.
Thanks a lot for your time guys! Seems the problem had to do with
entries in tnsnames.ora and pfile.
- Sudhanshu
RAC Networking,
Failover, and Load
Balancing
Setting up failover, load
balancing, TAF and FCF
Objectives
At the end of this module the
student will understand the
following tasks and concepts.
• Understand how Oracle Net
interacts with RAC.
• Understand how to configure
tnsnames for RAC with
Transparent Application Failover
(TAF)
• Understand how to configure
JDBC connections for RAC
• Understand how to configure
Fast Connection Failover as an
alternative to RAC
Overview
• The Listener
• Configuration Files
• Oracle Net and RAC
• Connect-Time Failover
• TAF Failover
• Load Balancing
• JDBC Configuration for Failover
and Load Balancing
• Fast Connection Failover
Listener
Client
Node 1
Node 2
Database
Listener
• Resides on the server
• Listens for incoming client
connection requests
• Manages the traffic to the
server; when a client requests a
network session with a server,
the listener actually receives the
request and brokers the client
request
• If the client's information
matches the listener's
information, then the listener
grants a connection to the
server.
listener.ora
• Configuration file for the listener
process. Identifies:
o The listener name
listening
• Typically resides in the
%ORACLE_HOME%\network\adm
in directory on Windows NT.
tnsnames.ora
• Configuration file that
determines how you connect to
the database.
o Connect to specific instance
o Load Balancing
Connect-Time Failover
• One service
o Tries one node
o Session is lost
o Reconnection occurs
automatically
Connect-Time Failover
• TESTA1 =
• (DESCRIPTION=
• (ADDRESS_LIST=
• (LOAD_BALANCE=OFF)
• (FAILOVER=ON)
• (ADDRESS=(PROTOCOL=TCP)(HOST=oradb1)(PORT=1521))
• (ADDRESS=(PROTOCOL=TCP)(HOST=oradb2)(PORT=1521)))
• (CONNECT_DATA=
• (SERVICE_NAME=TESTA)
• )
• )
• TESTA2 =
• (DESCRIPTION=
• (ADDRESS_LIST=
• (LOAD_BALANCE=OFF)
• (FAILOVER=ON)
• (ADDRESS=(PROTOCOL=TCP)(HOST=oradb2)(PORT=1521))
• (ADDRESS=(PROTOCOL=TCP)(HOST=oradb1)(PORT=1521)))
• (CONNECT_DATA=
• (SERVICE_NAME=TESTA)
• )
• )
Failover - Backup
• Pairs of services
o The first service defines the
o Session is maintained
o Reconnection occurs
automatically
Failover – Backup
• TESTA1 =
• (DESCRIPTION =
• (ENABLE = BROKEN)
• (LOAD_BALANCE = off)
• (FAILOVER = on)
• (ADDRESS = (PROTOCOL = TCP)(HOST = oradb1)(PORT = 1523))
• (CONNECT_DATA =
• (SERVICE_NAME = TESTA)
• (INSTANCE_NAME = TESTA1)
• (FAILOVER_MODE =
• (BACKUP = TESTA2)
• (TYPE = SELECT)
• (METHOD = BASIC)
• (RETRIES = 120)
• (DELAY = 5))))
• TESTA2 =
• (DESCRIPTION =
• (ENABLE = BROKEN)
• (LOAD_BALANCE = off)
• (FAILOVER = on)
• (ADDRESS = (PROTOCOL = TCP)(HOST = oradb2)(PORT = 1523))
• (CONNECT_DATA =
• (SERVICE_NAME = TESTA)
• (INSTANCE_NAME = TESTA2)
• (FAILOVER_MODE =
• (BACKUP = TESTA1)
• (TYPE = SELECT)
• (METHOD = BASIC)
• (RETRIES = 120)
• (DELAY = 5))))
Failover - Preconnect
• Pairs of services
o The first service defines the second as
its backup
o The second service defines the first
service as its backup
o Both services preconnect to their
backup when the primary connections
are made
• Used for RAC failover
• Balance is done manually
• In the event of a failure
o Transaction is lost
o Session is maintained
o Queries are restarted
o Connection is quickly switched over to
the preconnected backup
Failover - Preconnect
• TESTA1 =
• (DESCRIPTION =
• (ENABLE = BROKEN)
• (LOAD_BALANCE = off)
• (FAILOVER = on)
• (ADDRESS = (PROTOCOL = TCP)(HOST = oradb1-vip)(PORT = 1523))
• (CONNECT_DATA =
• (SERVICE_NAME = TESTA)
• (INSTANCE_NAME = TESTA1)
• (FAILOVER_MODE =
• (BACKUP = TESTA2)
• (TYPE = SELECT)
• (METHOD = PRECONNECT)
• (RETRIES = 120)
• (DELAY = 5))))
• TESTA2 =
• (DESCRIPTION =
• (ENABLE = BROKEN)
• (LOAD_BALANCE = off)
• (FAILOVER = on)
• (ADDRESS = (PROTOCOL = TCP)(HOST = oradb2-vip)(PORT = 1523))
• (CONNECT_DATA =
• (SERVICE_NAME = TESTA)
• (INSTANCE_NAME = TESTA2)
• (FAILOVER_MODE =
• (BACKUP = TESTA1)
• (TYPE = SELECT)
• (METHOD = PRECONNECT)
• (RETRIES = 120)
• (DELAY = 5))))
Load Balancing
• One service
o Oracle Net determines which
Load Balancing
• TESTA =
• (DESCRIPTION =
• (ADDRESS_LIST =
• (LOAD_BALANCE = on)
• (FAILOVER = off)
• (ADDRESS = (PROTOCOL =
TCP)(HOST = oradb1-vip)(PORT
= 1523))
• (ADDRESS = (PROTOCOL =
TCP)(HOST = oradb2-vip)(PORT
= 1523))
• )
• (CONNECT_DATA =
• (SERVER = DEDICATED)
• (SERVICE_NAME = TESTA)
• )
• )
Transparent Application
Failover
• One service
o Multiple sessions are created
o No loss of connection
o Transaction is lost
o Query is restarted
Transparent Application
Failover
• TESTA =
• (DESCRIPTION =
• (ENABLE = BROKEN)
• (ADDRESS_LIST =
• (LOAD_BALANCE = on)
• (FAILOVER = on)
• (ADDRESS = (PROTOCOL = TCP)(HOST = oradb1-
vip)(PORT = 1523))
• (ADDRESS = (PROTOCOL = TCP)(HOST = oradb2-
vip)(PORT = 1523))
• )
• (CONNECT_DATA =
• (SERVER = DEDICATED)
• (SERVICE_NAME = TESTA)
• (FAILOVER_MODE =
• (TYPE = SELECT)
• (METHOD = BASIC)
• (RETRIES = 120)
• (DELAY = 5)
• )
• )
• )
Characteristics of TAF
• TAF protects:
o Client/server connection
o OCI programs
drivers)
o Sql*plus
srvctl or OEM
o Failover takes a few seconds
as one minute
o Can increase risk of disrupted
HKEY_LOCAL_MACHINE :
SYSTEM CurrentControlSet :
Services : Tcpip : Parameters
o Add KeepAliveTime :
Fast Connection
Failover
• Fast Connection Failover
provides a way to control
failover response
o Effective for Oracle 10g
sensitivity
o Fast and efficient
m:6200,RAC2.mycompany.co
m:6200,
o RAC3.mycompany.com:6200,
APPS1.mycompany.com:6200
,
o APPS2.mycompany.com:6200
ONS on the
Applications Tier
• When starting the
application:
o Specify the system property
–
Doracle.ons.iraclehome=<locatio
n of ons on client>
o Include the ons.jar file in the
application CLASSPATH
Modifying Server-Side
ONS for FCF
• Use the
%ORA_CRS_HOME%\bin\racgons
command to modify the server-
side ONS configuration to be
aware of the mid-tier nodes:
o racgons add_config
APPS1.mycompany.com:6200
APPS2.mycompany.com:6200
o This adds the host and port
Differences between
FCF and TAF
• Application-Level
Connection
Retries.
o Fast Connection Failover
supports application-level
connection retries. This
gives the application control
of responding to connection
failovers.
o TAF supports connection
retries only at the OCI/Net
layer.
Differences between
FCF and TAF
• Integration with
the Connection
Cache
o Fast Connection Failover is
well-integrated with the
Implicit Connection Cache,
which allows the connection
cache manager to manage
the cache for HA.
o Failed connections are
automatically invalidated in
the cache.
o TAF works at the network
level on a per-connection
basis, which means that the
connection cache cannot be
notified of failures.
Differences between
FCF and TAF
• Event-Based
o Fast Connection Failover is
based on the RAC event
mechanism. This means that
Fast Connection Failover is
efficient and detects failures
quickly for both active and
inactive connections.
o TAF is based on the network
call mechanism.
Differences between
FCF and TAF
• Load-Balancing
Support
o Fast Connection Failover
supports UP event load
balancing of connections and
runtime work request
distribution across active
RAC instances.
Review
• What are the two key
configuration files for Oracle
Net?
• What key phrase in the
tnsnames.ora file influences
load balance behaviour?
• What key phrase in the
tnsnames.ora file influences
failover behaviour?
• Which Oracle Net configurations
allow users to maintain
connections through a failover
event?
• Which JDBC drivers are
supported under TAF and FCF?
• What is the name of the utility
to add mid-tier nodes to the
server-side ONS configuration?
Summary
• Listener process
• Configuration files
o listener.ora
o tnsnames.ora
• Failover
o Basic
o Preconnect
• Load Balancing
• TAF
• JDBC Thick and Thin drivers and
TAF
• Setting up Fast Connection
Failover
The listener is a process that resides on the server whose responsibility is to
listen for incoming client connection requests and manage the traffic to the
server.
The listener brokers the client request, handing off the request to the server.
Every time a client (or server acting as a client) requests a network session with
a server, a listener receives the actual request. If the client's information matches
the listener's information, then the listener grants a connection to the server.
The tnsnames.ora (or the equivalent) must be used if you want to use
Transparent Application Failover. There must be a copy located on each RAC
server. In addition, key components of the server-side tnsnames.ora file must be
present in a client-side tnsnames.ora file. However, for JDBC connections, the
tnsnames.ora content is often contained in a long connection string in the JDBC
configuration file on the client. Note that the JDBC driver for TAF must be the
Oracle supported “thick” OCI driver.
Connect-Time Failover
By setting up multiple network paths, OCI will try each of the paths in the connection list. In this
manner you can set up multiple nodes in the cluster, but at the same time, specifying which node
is the primary connection. This can be done by putting the following entry into the tnsnames.ora
file.
prod1.us.company.com =
(DESCRIPTION=
(ADDRESS_LIST=
(LOAD_BALANCE=OFF)
(FAILOVER=ON)
(ADDRESS=(PROTOCOL=tcp)(HOST=prod1)(PORT=1521))
(ADDRESS=(PROTOCOL=tcp)(HOST=prod2)(PORT=1521)))
(CONNECT_DATA=
(SERVICE_NAME=prod1.us.company.com)))
prod2.us.company.com =
(DESCRIPTION=
(ADDRESS_LIST=
(LOAD_BALANCE=OFF)
(FAILOVER=ON)
(ADDRESS=(PROTOCOL=tcp)(HOST=prod2)(PORT=1521))
(ADDRESS=(PROTOCOL=tcp)(HOST=prod1)(PORT=1521)))
(CONNECT_DATA=
(SERVICE_NAME=prod2.us.company.com)))
In this example, there are two different service names. The first uses prod1 as the primary and
prod2 as the failover, the second uses prod2 as the primary and prod1 as the failover. This
allows you to do your own manual load balancing, including failover to the designated node. The
connection is re-established during failover, but session state is lost for all transactions and
queries.
This type of failover pre-dates RAC, and can even be used with some Oracle 8i Net connections.
TAF failover is preferred with RAC.
This failover method uses only one address per service name. However, each service explicitly identifies another service as its
BACKUP service. If the primary service fails, the connection will be switched to the backup service. This can be done by putting
the following entry into the tnsnames.ora file.
prod1.us.company.com =
(DESCRIPTION =
(LOAD_BALANCE = off)
(FAILOVER = on)
(ADDRESS = (PROTOCOL=tcp)(HOST=prod1)(PORT=1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = prod.us.company.com)
(INSTANCE_NAME = prod1.us.company.com)
(FAILOVER_MODE =
(BACKUP = prod2.us.company.com)
(TYPE = select)
(METHOD = basic))))
prod2.us.company.com =
(DESCRIPTION =
(LOAD_BALANCE = off)
(FAILOVER = on)
(ADDRESS = (PROTOCOL=tcp)(HOST=prod1)(PORT=1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = prod.us.company.com)
(INSTANCE_NAME = prod2.us.company.com)
(FAILOVER_MODE =
(BACKUP = prod1.us.company.com)
(TYPE = select)
(METHOD = basic))))
In this example, there are two different service names. The first uses prod1 as the primary and prod2 as the failover, the second
uses prod2 as the primary and prod1 as the failover. Note the extended syntax under “FAILOVER_MODE=” as compared to
connect-time failover. This syntax is RAC specific, and allows the session state to be maintained through failover for queries
(transactions are rolled back). The RETRIES and DELAY parameters allow the connection to be retried if the first attempt is not
successful.
Note that the IP addresses or host names used in the tnsnames.ora file should
be Virtual IP addresses or host names for Oracle 10g RAC.
Failover - Preconnect
Note that setting the “METHOD = preconnect” allows the backup connection to be pre-
allocated. This can aid failover speed, at the cost of twice as many concurrent connections.
prod1.us.company.com =
(DESCRIPTION =
(LOAD_BALANCE = off)
(FAILOVER = on)
(ADDRESS = (PROTOCOL=tcp)(HOST=prod1)(PORT=1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = prod.us.company.com)
(INSTANCE_NAME = prod1.us.company.com)
(FAILOVER_MODE =
(BACKUP = prod2.us.company.com)
(TYPE = select)
(METHOD = preconnect))))
prod2.us.company.com =
(DESCRIPTION =
(LOAD_BALANCE = off)
(FAILOVER = on)
(ADDRESS = (PROTOCOL=tcp)(HOST=prod1)(PORT=1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = prod.us.company.com)
(INSTANCE_NAME = prod2.us.company.com)
(FAILOVER_MODE =
(BACKUP = prod1.us.company.com)
(TYPE = select)
(METHOD = preconnect))))
Note that the IP addresses or host names used in the tnsnames.ora file should
be Virtual IP addresses or host names for Oracle 10g RAC.
Load Balancing
prod.us.company.com =
(DESCRIPTION=
(ADDRESS_LIST=
(LOAD_BALANCE=ON)
(ADDRESS=(PROTOCOL=tcp)(HOST=prod1)(PORT=1521))
(ADDRESS=(PROTOCOL=tcp)(HOST=prod2)(PORT=1521)))
(CONNECT_DATA=
(SERVICE_NAME=prod.us.company.com)))
Here either prod1 or prod2 will be used. The system used is determined at
connection time. Load balancing is commonly used with RAC, both in the
server-side tnsnames.ora file and the client-side tnsnames.ora file.
Note that the IP addresses or host names used in the tnsnames.ora file should
be Virtual IP addresses or host names for Oracle 10g RAC.
The TAF mechanism provides some fault tolerance in the event of a node failure. It does this by
establishing a connection to multiple nodes in the cluster. TAF will automatically resend some
transactions, and will not result in the loss of connection to the user in the event of a failure. TAF
is configured by specifying both FAILOVER and LOAD_BALANCE as shown here:
prod.us.company.com=
(DESCRIPTION=
(LOAD_BALANCE=on)
(FAILOVER=on)
(ADDRESS=
(PROTOCOL=tcp)
(HOST=prod1)
(PORT=1521))
(ADDRESS=
(PROTOCOL=tcp)
(HOST=prod2)
(PORT=1521))
(CONNECT_DATA=
(SERVICE_NAME=prod.us.company.com)
(FAILOVER_MODE=
(TYPE=select)
(METHOD=basic))))
session - In this mode a new session is created on the failover server in the event of a failure.
select - Cursor processing is maintained. This incurs significant overhead on the client.
preconnect - establishes the connection to the failover server at initial connection time.
Note that the IP addresses or host names used in the tnsnames.ora file should
be Virtual IP addresses or host names for Oracle 10g RAC.
TAF automatically restores some or all of the following elements associated with active database connections. Other elements,
however, may need to be embedded in the application code to enable TAF to recover the connection.
Executed Commands
If a command was completely executed upon connection failure, and it changed the state of the database, TAF does not resend
the command. If TAF reconnects in response to a command that may have changed the database, TAF issues an error message
to the application.
Active Transactions
Any active transactions are rolled back at the time of failure because TAF cannot preserve active transactions after failover. The
application instead receives an error message until a ROLLBACK is submitted.
Note the “LOAD_BALANCE=on” is set, but failover is not set (TAF failover is not
supported with the thin driver).
Do not modify the server-side ons.config file to add the mid-tier hosts and ports.
This does not modify the OCR. Use the racgons command instead.
Note: The application should not try to roll back the transaction; the transaction
was already rolled back in the database by the time the application received the
exception.
You should not attempt to run both TAF and FCF at the same time.
Note: The application should not try to roll back the transaction; the transaction
was already rolled back in the database by the time the application received the
exception.
You should not attempt to run both TAF and FCF at the same time.
Note: The application should not try to roll back the transaction; the transaction
was already rolled back in the database by the time the application received the
exception.
You should not attempt to run both TAF and FCF at the same time.
Note: The application should not try to roll back the transaction; the transaction
was already rolled back in the database by the time the application received the
exception.
You should not attempt to run both TAF and FCF at the same time.
Oracle RAC Administration - Part 15: Connection Load
Balancing and FAN
Tarry Singh, tarry.singh@gmail.com
Brief intro
In our last article, we looked at the service architecture. We will continue our quest to understand
the workload characteristics and the load balancing connections in our Virtualized RAC.
With Oracle Notification Services (ONS), we are able to have our RAC balance our client
connections across the nodes. The CLB (connection load balancing) can be handled on the
server-side or on the client-side. The client-side takes care of the connection sharing across the
Listeners, while server-side load balancing manages and handles the connection requests to the
most willingly available node whose information is in turn provided to it by the LBA (Load
Balancing Advisory).
This in turn is analyzing the goals that may have been set for the LBA. You can use a long goal or
a short goal for this connection load balancing. Let’s take a look at them both quickly.
Long Goals: Use long goals for applications that need long uninterrupted connections. It is the
default connection load balancing goal. To see how to implement them, lets take a simple syntax.
Using the DBMS_SERVICE and CLB_GOAL_LONG packages, you can define the connection load
balancing for long connections as follows and here FOKESERV is our service:
Short Goals: Think of quick and fast transactions like ordering or placing a secure purchase like
that on Amazon or any other place. You would prefer to fill the order up fast and if, for any reason
the process is delayed, the connection is restarted, ensuring that the transaction is performed in a
short-lived connection. Checking the syntax, here CARTSERV is the service in question:
Note: When you install your RAC with the DBCA (Database Configuration Assistant), server-side
load balancing is enabled by default. Also check out the sample script in tnsnames.ora file. The
services created through the DBCA will have the Connection Load Balancing default setting as
CLB_GOAL=CLB_GOAL_LONG.
The ONS maintains client connections until the client decides to close the connection or if a node
fails or any other form of outage prevents the continuity of that connection. However, if you
configure TAF (transparent application failover) then the connections are moved to another
healthy instance. For a typical SARG (Select Arguments), TAF can restart the query. Alternately,
from the client side, if you were accessing information on the web, the search would simply carry
on without you noticing the failover of the connection to another node. However, if you were in the
middle of a transaction (INSERT, UPDATE, or DELETE) then the application must rollback the
failed transaction and submit it again. Any other session's customizations must be re-executed.
However, in a normal processing scenario the sessions don’t move irrespective of the workload
changes, newer sessions are just redirected to other nodes.
With the services, the deployment of TAF only becomes easier. By defining a TAF policy for a
service, all connections using this service will have TAF automatically enabled for them. No client-
side intervention is required, meaning the server-side TAF setting will override the TAF setting at
the client level. In order to define the TAF policy you will need the DBMS_SERVICE procedure:
Client-side load balancing is defined in your client connection with the parameter
LOAD_BALANCE=ON (the default is ON). Upon setting this parameter to ON, Oracle will pick out an
address from the address list at random and connect to that node, thereby balancing the client
connections across the cluster. Run the command lsnrctl services occasionally to see what
services a listener supports.
FAN is a notification mechanism that RAC uses to notify other processes about configuration and
service level information such as includes service status changes, such as UP or DOWN events.
Applications can respond to FAN events and take immediate action. FAN UP and DOWN events
can apply to instances, services, and nodes.
RAC publishes the FAN events the minute any changes are made to the cluster. So, instead of
waiting for the application to check on individual nodes to detect an anomaly, the applications are
notified by FAN events and are able to react immediately.
FAN also publishes load balancing advisory (LBA) events. Applications are in a position to take
full advantage of the LBA FAN events to facilitate a smooth transition of connections to healthier
nodes in a cluster. One can take advantage of FAN is the following ways:
• When using integrated Oracle Client, the applications can use FAN with no
programming whatsoever. Oracle 120g JDBC, ODP.NET and OCI would be
considered as the components of the integrated clients.
• Programmatic changes in ONS API make it possible for applications to still
subscribe to the FAN events and can execute the event handling actions
appropriately.
• FAN can be implemented with server-side callouts on your database tier.
For instance, a typical DOWN event will prevent any further disruption of service by cleanly
terminating the sessions on that failed node and notifying the user. Moreover, a typical UP event
can address and allocate extra resources for new incoming requests/ connections. There are
however several additional benefits with the server-side callouts, mainly you can utilize FAN in
order to:
• Logging
• Paging/SMS the DBA and/or to open trouble tickets when the resources fail
to (re)start
• Change resource plans or to shut down services when the number of
available instances decreases, thus preventing further load on the cluster
and keeping the RAC running until another healthy node is added to the
cluster.
• Automate the fail service back to PREFERRED instances when required.
Conclusion
In this article, we looked at the CLB and FAN. In future articles, we will continue to discuss the
services architecture and discuss FAN in detail and take a look at the LBA.
We will take a little detour from our regular path of following the manual, some would say, to more
practical path. One topic that most intrigues both an Oracle DBA and their clients (in their own
way) is connectivity. DBAs are sometimes puzzled with the load balancing and failover part of
connectivity. Where is it, on the Client-side OR the Server-side? Is tnsnames.ora still useful if I
deploy the Easy Connect Net Naming method? Let's brush up on some basics.
Easy Connect Naming methods were introduced in Oracle 10g. The benefits were simple. The
need to do a service name lookup in the tnsnames.ora files for TCP/IP environment was
eliminated, a directory naming system is no longer needed and clients are connected
automatically. Connecting is also fairly simple:
Syntax:
CONNECT username/password@[//]host[:port][/service_name]
So our typical examples could look like:
CONNECT oe/oe@localhost/nickserv
CONNECT scott/tiger@172.22.202.100/examples
Connect-Time Failover
From the clients end, when your connection fails at one node or service, you can then do a look
up from your tnsnames.ora file and go on seeking a connection with the other available node.
Take this example of our 4-node VMware ESX 3.x Oracle Linux Servers:
FOKERAC =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST =
nick01.wolga.com)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = nick02.wolga.
com)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = brian01.wolga.
com)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = brian02.wolga.
com)(PORT = 1521))
(CONNECT_DATA =
(SERVICE_NAME = fokerac)
)
)
Here the first address in the list is tried at the client’s end. Should the connection to
nick01.wolga.nl fail, then the next address, nick02.wolga.nl, will be tried. This phenomenon is
called connection-time failover. You could very well have a 32-node RAC cluster monitoring the
galactic system at NASA and thus have all those nodes typed in your tnsnames.ora file.
Moreover, these entries do not necessarily have to be part of the RAC cluster. So it is possible
that you are using Streams, Log Shipping or Advanced Replication to maintain your HA (High
Availability) model. These technologies facilitate continued processing of the database by such a
HA (High Availability) model in a non-RAC environment. In a RAC environment we know (and
expect) the data to be the same across all nodes since there is only one database.
Transparent Application Failover actually refers to a failover that occurs when a node or instance
is unavailable due to an outage or other reason that prohibits a connection to be established on
that node. This can be set to on with the following parameter FAILOVER. Setting it to ON will
activate the TAF. It is turned on by default unless you set it to OFF to disable it. Now, when you
turn it on you have two types of connections available by the means of the FAILOVER_MODE
parameter. The type can be session, which is default or select. When the type is SESSION, if the
instance fails, then the user is automatically connected to the next available node without the
user’s manual intervention. The SQL statements need to be carried out again on the next node.
However, when you set the TYPE to SELECT, then if you are connected and are in the middle of
your query, then your query will be restarted after you have been failed over to the next available
node. Take this example of our tnsnames.ora file, (go to the section beginning with
CONNECT_DATA):
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = fokerac.wolga.com)
(FAILOVER_MODE =
(TYPE = SELECT)
(METHOD = BASIC)
(RETRIES = 180)
(DELAY = 5)
)
)
This is also demonstrated in this image from my article on Quest’s soRAC tool.
This is more intelligent in the sense that should the instance, (where the user has fired a query),
go down then the session will remember the rows in its Global Buffer Cache, restart the query
and pick up from where the user left off when they were disconnected from the first node.
Compare it to a scenario where the manager is running the query through a client such as Crystal
Reports or Oracle Reports. Running the monthly query to get the data from his Server Monitoring
software, let's say NetIQ, to get the statistics of all the servers the DBA needs to draw a report on
monitoring the CPU, memory or disk usage. Now if he ran a query that had to fetch 5000 rows
and he gets cut off at 3000 rows, he still has to reissue (not manually though) start from the
beginning but will only see the results when he crosses the 3001 mark. Then upon prompt, the
results will be shown. However, this does not apply to the DML (Data Manipulation Language)
such as insert, update, delete. The user is met with an error if node failure happens in the middle
of the processing and all the (uncommitted) transactions are rolled back. The session eventually
fails over without the user having to connect, however, he does have to issue the query again.
Preconnect Method
Consider our tnsnames.ora file again. Here in our older example, where the METHOD was
BASIC, there is some additional time loss in reconnecting the lost connection to the immediate
available node. However, when you set the METHOD to PRECONNECT, a session is connected
to all available nodes in the list. The advantage is that failover is a lot quicker than the BASIC
method but the downside is that the other instances will continue to claim necessary resources
such a system memory to keep those connections alive.
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = fokerac.wolga.com)
(FAILOVER_MODE =
(TYPE = SELECT)
(METHOD = PRECONNECT)
(RETRIES = 180)
(DELAY = 5)
)
)
Conclusion
As a DBA, these little things are very crucial in deciding what you intend to do with the RAC. Is it
a DSS or will it be used for a heavy OLTP environment. On a DSS, you can easily choose a
PRECONNECT method as you expect to have less connections with long durations (batch
processing, etc.) where as a typical OLTP will perform better if the cluster is optimally configured
with a BASIC method.