Databases
Instances
Applications
Node Monitoring
Event Services
High Availability
Databases
Instances
Applications
Cluster Management
Node Management
Event Services
High Availability
Storage Management (with help of ASM and other new ACFS filesystem)
Removes the OS-dependent hang checker and similar components; manages these with its own
additional monitor process
6. What are the background processes that exist in 11gR2, and what is their functionality?
Process
Name
Functionality
crsd
cssd
diskmon
evmd
mdnsd
Multicast domain name service (mDNS): Allows DNS requests. The mDNS
process is a background process on Linux and UNIX, and a service on
Windows.
gnsd
Oracle Grid Naming Service (GNS): Is a gateway between the cluster mDNS
and external DNS servers. The GNS process performs name resolution within
the cluster.
ons
oraagent
orarootagent
oclskd
gipcd
ctssd
Owner
ohasd
init, root
root
grid owner
evmd, evmlogger
grid owner
octssd
root
ons, eons
grid owner
Oracle Agent
oraagent
grid owner
orarootagent
root
gnsd
root
gpnpd
grid owner
mdnsd
grid owner
9. As you said, the voting disk and OCR reside in ASM disk groups, but in the startup
sequence OCSSD starts before ASM. How is that possible?
How does OCSSD start if the voting disk and OCR reside in ASM disk groups?
You might wonder how CSSD, which is required to start the clustered ASM instance, can be
started if voting disks are stored in ASM? This sounds like a chicken-and-egg problem:
without access to the voting disks there is no CSS, hence the node cannot join the cluster.
But without being part of the cluster, CSSD cannot start the ASM instance. To solve this
problem the ASM disk headers have new metadata in 11.2: you can use kfed to read the
header of an ASM disk containing a voting disk. The kfdhdb.vfstart and kfdhdb.vfend fields
tell CSS where to find the voting file. This does not require the ASM instance to be up. Once
the voting disks are located, CSS can access them and join the cluster.
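The header check described above can be sketched as a small filter over kfed output. This is only an illustration: the function name voting_file_extent is hypothetical, it assumes kfed prints one "field: value ; offset" entry per line, and it assumes that zero in both fields means the disk holds no voting file.

```shell
#!/bin/sh
# Sketch: report the voting-file start/end fields found in `kfed read <disk>`
# output supplied on stdin.
# Assumption: each line looks like "kfdhdb.vfstart:   96 ; 0x0ec: 0x00000060".
voting_file_extent() {
    awk -F'[:;]' '
        $1 ~ /^kfdhdb\.vfstart[ \t]*$/ { gsub(/[ \t]/, "", $2); vfstart = $2 }
        $1 ~ /^kfdhdb\.vfend[ \t]*$/   { gsub(/[ \t]/, "", $2); vfend = $2 }
        END {
            if (vfstart == "" || vfend == "") {
                print "no voting file fields found"
                exit 1
            }
            if (vfstart + 0 == 0 && vfend + 0 == 0) {
                print "no voting file on this disk"
                exit 0
            }
            printf "vfstart=%s vfend=%s\n", vfstart, vfend
        }'
}
```

A possible invocation, with a made-up device path, is `kfed read /dev/mapper/asmdisk1 | voting_file_extent`. Because only the disk header is read, this works while the ASM instance is down.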
1. The client connects through the SCAN name of the cluster (remember that all three IP
addresses resolve round robin to the same host name, the SCAN name). In this case our SCAN
name is cluster01-scan.cluster01.example.com.
2. The request reaches the DNS server in your corporate network and resolves to one of the
three SCAN addresses.
a. If GNS (Grid Naming Service) is configured, that is, a subdomain is delegated in DNS to
resolve cluster addresses, the request is handed over to GNS (gnsd).
3. Here, assume there is no GNS. The request is then handled with the help of the SCAN
listeners, whose endpoints are configured to the database listeners.
4.
5.
6.
Configuring or reconfiguring itself using profile data, making host names and
addresses resolvable on the network
To add a node, simply connect the server to the cluster and allow the cluster to configure the
node.
To make this happen, Oracle uses the profile located in
$GI_HOME/gpnp/profiles/peer/profile.xml, which contains the cluster resources, for example
the disk locations of ASM.
When a node is plugged into the cluster, this profile is read locally or from a remote
machine and the node is added to the cluster dynamically.
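To see what the profile holds for ASM, a minimal sketch such as the following can pull a single attribute out of profile.xml. The function name gpnp_asm_attr is hypothetical, and the element and attribute names (ASM-Profile, DiscoveryString, SPFile) are assumptions about how 11.2 profiles are usually laid out, so check them against your own file.

```shell
#!/bin/sh
# Sketch: print one attribute of the ASM-Profile element in a GPnP profile.
# Usage: gpnp_asm_attr <attribute> <profile.xml>
# Assumption: the profile holds an element such as
#   <orcl:ASM-Profile id="asm" DiscoveryString="..." SPFile="..."/>
gpnp_asm_attr() {
    attr=$1
    file=$2
    [ -r "$file" ] || { echo "cannot read $file" >&2; return 1; }
    # Profiles are often one long line, so break after every tag first.
    tr '>' '\n' < "$file" |
        sed -n "s/.*ASM-Profile[^>]* ${attr}=\"\([^\"]*\)\".*/\1/p" |
        head -n 1
}
```

For example, `gpnp_asm_attr DiscoveryString $GI_HOME/gpnp/profiles/peer/profile.xml` would print the ASM disk discovery string if the profile follows the assumed layout.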
13. What file types does ASM support and keep in disk groups?
Control files
Flashback logs
Data Pump dump sets
Data files
DB SPFILE
Data Guard configuration
Temporary data files
RMAN backup sets
Change tracking bitmaps
Online redo logs
OCR files
Archive logs
Transport data files
ASM SPFILE
Is cluster-aware
Supports reading from mirrored copy instead of primary copy for extended clusters
Description
RBAL
ARBn
GMON
MARK
Onnn
PZ9n
The node listener is a process that helps establish network connections from ASM
clients to the ASM instance.
Is capable of listening for all database instances on the same machine in addition to
the ASM instance
Also from 11gR2 manages the cluster resources like network,vip,disks etc
cat /etc/oracle/ocr.loc
ocrconfig_loc=+DATA
local_only=FALSE
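The same lookup can be scripted. The sketch below reads a key from an ocr.loc-style file and reports whether the OCR location is an ASM disk group (leading "+") or an ordinary path. The function names ocr_loc_value and ocr_storage_type are hypothetical, and the default path is simply the one shown in the listing above.

```shell
#!/bin/sh
# Sketch: read a key from an ocr.loc-style file (one key=value per line).
# Usage: ocr_loc_value <key> [file]
ocr_loc_value() {
    key=$1
    file=${2:-/etc/oracle/ocr.loc}
    # Print the value of the first matching key, trimming surrounding blanks.
    sed -n "s/^[[:space:]]*${key}[[:space:]]*=[[:space:]]*\(.*[^[:space:]]\)[[:space:]]*$/\1/p" "$file" |
        head -n 1
}

# Classify the OCR location: a leading "+" denotes an ASM disk group.
# Usage: ocr_storage_type [file]
ocr_storage_type() {
    loc=$(ocr_loc_value ocrconfig_loc "$1")
    case $loc in
        +*) echo "asm:${loc#+}" ;;
        "") echo "unknown" ;;
        *)  echo "file:$loc" ;;
    esac
}
```

With the file shown above, `ocr_storage_type` would print `asm:DATA`.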
Disk group type     | Supported mirroring levels              | Default mirroring level
External redundancy | Unprotected (none)                      | Unprotected (none)
Normal redundancy   | Two-way, three-way, unprotected (none)  | Two-way
High redundancy     | Three-way                               | Three-way
ASM stripes files using extents with a coarse method for load balancing or a fine method to
reduce latency.
26. How many ASM Diskgroups can be created under one ASM Instance?
ASM imposes the following limits:
1.
2.
3.
Sets permissions on the Oracle Inventory (central inventory) directory. Reconfigures primary
and secondary group memberships for the installation owner, if necessary, for the Oracle
Inventory directory and the operating system privileges groups.
crsctl stop cluster (possible only from 11gR2). Note that crsctl cluster commands can now
act on remote nodes: without options the command stops the stack on the local node only,
and the scope is widened with -all or -n, for example
crsctl stop cluster -all (stops the Clusterware stack on all nodes)
crsctl stop cluster -n <nodename> (stops it only on the specified node)
36. CRS is not starting automatically after a node reboot. What do you do to make it
happen?
crsctl enable crs (as root)
to disable
crsctl disable crs (as root)
41. What is the difference between TAF, FAN and FCF? Under what conditions do you
use them?
1) TAF with tnsnames
a feature of Oracle Net Services for OCI8 clients. TAF is transparent application failover
which will move a session to a backup connection if the session fails. With Oracle 10g
Release 2, you can define the TAF policy on the service using dbms_service package. It will
only work with OCI clients. It will only move the session and if the parameter is set, it will
failover the select statement. For insert, update or delete transactions, the application must
be TAF aware and roll back the transaction. YES, you should enable FCF on your OCI client
when you use TAF, it will make the failover faster.
Note: TAF will not work with JDBC thin.
2) FAN with tnsnames with aq notifications true
FAN is a feature of Oracle RAC which stands for Fast Application Notification. This allows the
database to notify the client of any change (Node up/down, instance up/down, database
up/down). For integrated clients, inflight transactions are interrupted and an error message
is returned. Inactive connections are terminated.
FCF is the client feature for Oracle Clients that have integrated with FAN to provide fast
failover for connections. Oracle JDBC Implicit Connection Cache, Oracle Data Provider for
.NET (ODP.NET) and Oracle Call Interface are all integrated clients which provide the Fast
Connection Failover feature.
3) FCF, along with FAN when using connection pools
FCF is a feature of Oracle clients that are integrated to receive FAN events and abort inflight
transactions, clean up connections when a down event is received as well as create new
connections when a up event is received. Tomcat or JBOSS can take advantage of FCF if the
Oracle connection pool is used underneath. This can be either UCP (Universal Connection
Pool for JAVA) or ICC (JDBC Implicit Connection Cache). UCP is recommended as ICC will be
deprecated in a future release.
4) ONS, with Clusterware, for either FAN or FCF
ONS is part of the clusterware and is used to propagate messages both between nodes and
to application tiers.
ONS is the foundation for FAN, upon which FCF is built.
RAC uses FAN to publish configuration changes and LBA (Load Balancing Advisory) events.
Applications can react to those published events in two ways:
- by using the ONS API (you need to program it)
- by using FCF (automatic when the JDBC implicit connection cache is used on the application server)
You can also respond to FAN events with server-side callouts, but these run on the server
side (as their name suggests).
47. Can you modify VIP address after your cluster installation?
Yes
48. How do you interpret AWR report in RAC instances, what sections in awr report for rac
instances are most important?
Read here.
Update 12-May-2013, Some practical questions added here
1. Viewing Contents in OCR/Voting disks
There are three possible ways to view the OCR contents:
a. ocrdump (or)
b. crs_stat -p (or)
c. By using strings.
The voting disk contents are not persistent and are continually overwritten, so there is
normally no need to view them. If you still need to, use strings.
SCAN IP can be disabled if not required. However, SCAN IP is mandatory during the
RAC installation. Enabling/disabling SCAN IP is mostly used in Oracle Apps environments
by the concurrent manager (a kind of job scheduler in Oracle Apps).
To disable the SCAN IP:
i. Do not use SCAN IP at the client end.
ii. Stop the SCAN listener
srvctl stop scan_listener
iii. Stop SCAN
srvctl stop scan (this will stop the SCAN VIPs)
iv. Disable SCAN and disable the SCAN listener
srvctl disable scan
srvctl disable scan_listener
b. Case 2: Migrating disk group from one to another with different diskgroup name.
1) Create the Disk group with new name in the new storage.
2) Create the spfile in new diskgroup and change the parameter scope = spfile
for control files etc.
c. Case 3: Migrating disk group to new storage but no additional diskgroup given
1) Take the RMAN backup as copy of all the databases with new format and
place it in the disk.
2) Prepare rename commands from v$log ,v$datafile etc (dynamic queries)
3) Take a backup of pfile and modify the following referring to new diskgroup
name
.control_files
.db_create_file_dest
.db_create_online_log_dest_1
.db_create_online_log_dest_2
.db_recovery_file_dest
4) Stop the database
5) Unmount the diskgroup
asmcmd umount ORA_DATA
6) Use the renamedg command (11gR2 only) to rename the diskgroup to the new name
renamedg phase=both dgname=ORA_DATA newdgname=NEW_DATA verbose=true
7)
8) Start the database in mount mode with the new pfile backed up in step 3
9) Run the rename file scripts generated in step 2
10) Add the diskgroup to the cluster (if using RAC)
srvctl modify database -d orcl -p +NEW_FRA/orcl/spfileorcl.ora
srvctl modify database -d orcl -a "NEW_DATA"
srvctl config database -d orcl
srvctl start database -d orcl
11) Delete the old diskgroup from the cluster
crsctl delete resource ora.ORA_DATA.dg
12) Open the database.
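The rename commands from step 2 of Case 3 can be generated mechanically once you have a list of file names. The sketch below is one way to do it, assuming the names arrive one per line on stdin; the function name gen_rename_sql is hypothetical, and the disk group names are the ones used in the example above.

```shell
#!/bin/sh
# Sketch: turn a list of ASM file names (stdin, one per line) into
# ALTER DATABASE RENAME FILE statements for a renamed disk group.
# Usage: gen_rename_sql <old_diskgroup> <new_diskgroup>
gen_rename_sql() {
    old_dg=$1
    new_dg=$2
    # Only names that start with +<old_diskgroup>/ are rewritten; others are skipped.
    sed -n "s|^+${old_dg}/\(.*\)\$|ALTER DATABASE RENAME FILE '+${old_dg}/\1' TO '+${new_dg}/\1';|p"
}
```

The input would typically come from a query on the file-name columns of views such as v$datafile and v$logfile, spooled to a text file, for example `gen_rename_sql ORA_DATA NEW_DATA < files.txt > rename_files.sql` (both file names are placeholders).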
8. How do you find the database to which a particular service is attached when a large
number of databases run on the server and you cannot check them one by one manually?
Write a shell script that reads the database names from oratab and loops over them, passing
each database name to srvctl to get the result.
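#!/bin/ksh
# Usage: pass the service name as the first argument.
# Set ORACLE_HOME to the database home before running.
ORACLE_HOME=
PATH=$ORACLE_HOME/bin:$PATH
LD_LIBRARY_PATH=${ORACLE_HOME}/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}
export TNS_ADMIN ORACLE_HOME PATH LD_LIBRARY_PATH
# Loop over every database name in /etc/oratab, skipping comment lines
for INSTANCE in $(grep -v "^#" /etc/oratab | cut -f1 -d: -s)
do
export ORACLE_SID=$INSTANCE
# Print a line only for the database on which the service is running
srvctl status service -d "$INSTANCE" -s "$1" 2>/dev/null | grep -i "is running"
done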
9. Difference between OHAS and CRS
OHAS is the complete cluster stack, which includes some kernel-level tasks like managing the
network, time synchronization, disks etc., whereas CRS manages
resources like databases, listeners, applications, etc. With both of these, Oracle provides
high-availability clustering services rather than only an affinity to databases.
same block is modified by more than one instance, synchronization/locking of the data blocks does not take place and blocks may be
overwritten by others in the cluster. This state is called split brain.
How do you determine what protocol is being used for Interconnect traffic?
One of the ways is to look at the database alert log for the time period when the database was started up.
What methods are available to keep the time synchronized on all nodes in the cluster?
Either the Network Time Protocol(NTP) can be configured or in 11gr2, Cluster Time Synchronization Service (CTSS) can be used.
Where does the Clusterware write when there is a network or Storage missed heartbeat?
The network ping failure is written in $CRS_HOME/log
How do you find out what object has its blocks being shipped across the instance the most?
You can use the dba_hist_seg_stats.
What would be the possible performance impact in a cluster if a less powerful node (e.g.
slower CPUs) is added to the cluster?
All processing will slow down to the CPU speed of the slowest server.
Datafiles
Redo logfiles
Spfiles
In 12c the files below can also now be stored in the ASM Diskgroup
Password file
CLUSTER_DATABASE
CLUSTER_DATABASE_INSTANCES
ACTIVE_INSTANCE_COUNT
UNDO_MANAGEMENT
Is there an easy way to verify the inventory for all remote nodes
You can run the opatch lsinventory -all_nodes command from a single node to look at the inventory details for all nodes in the cluster.
affinity for increased performance. Resource affinity optimizes the system in situations where
update transactions are being executed in one instance. When activity shifts to another
instance, the resource affinity correspondingly moves to that instance. If activity is not
localized, then resource ownership is hashed to the instances.
In 10g, dynamic remastering happens at the file+object level. The conditions for remastering are
very stringent: one instance must touch the object more than 50 times as often as the other
instances in a particular period (say 10 minutes). This touch ratio and time can be tuned with the
_gc_affinity_limit and _gc_affinity_time parameters.
Q If there is some issue with virtual IP how will you troubleshoot it?How will you
change virtual ip?
To change the VIP (virtual IP) on a RAC node, use the command
Q What is RAC?
RAC stands for Real Application Clusters. It is a clustering solution from Oracle Corporation
that ensures high availability of databases by providing instance failover and media failover
features.
Q What is GRD?
GRD stands for Global Resource Directory. GES and GCS maintain records of the
status of each datafile and each cached block using the Global Resource Directory. This process
is referred to as cache fusion and helps maintain data integrity.
This process is called the Global Cache Service process. It maintains the status of
datafiles and of each cached block by recording information in the Global Resource
Directory (GRD). It also controls the flow of messages to remote instances,
manages global data block access, and transmits block images between the buffer caches of
different instances. This processing is part of the cache fusion feature.
Q How to export and import crs resources while migrating Oracle RAC to new
server.
The script below generates srvctl add commands for the databases, instances, services and 11g
listeners registered in the OCR of the current RAC.
Save the output of the script and run it on the new RAC.
for DBNAME in $(srvctl config database)
do
# Generate DB resource
srvctl config database -d $DBNAME -a | awk -v dbname="$DBNAME" \
'BEGIN { FS=":" }
$1~/Oracle home/ || $1~/ORACLE_HOME/ {dbhome = "-o" $2}
$1~/Spfile/ || $1~/SPFILE/ {spfile = "-p" $2}
$1~/Disk Groups/ {dg = "-a" $2}
END { if (avail == "-a ") {avail = ""}; printf "%s %s %s %s %s\n", "srvctl add database -d ",
dbname, dbhome, spfile, dg }'
END { if (avail == "-a ") {avail = ""}; printf "%s %s %s %s %s %s %s %s %s %s\n", "srvctl
add service -d ",dbname, "-s ", sname, pref, avail ,ft, fm,g, "-P BASIC"}'
echo "srvctl start service -d $DBNAME -s $sname"
done
done
# Listener at 11G Home. A 10G listener can't be added with srvctl.
Q What are the administrative tools used for Oracle RAC environments?
An Oracle RAC cluster can be administered as a single image using OEM (Enterprise
Manager), SQL*Plus, Server Control (SRVCTL), Cluster Verification Utility (CVU), DBCA and NETCA.
Q What is FAN?
Fast Application Notification, abbreviated to FAN, relates to events concerning
instances, services and nodes. It is a notification mechanism that Oracle RAC uses to notify
other processes about configuration and service-level information, including service
status changes such as UP or DOWN events. Applications can respond to FAN events and
take immediate action. This saves applications from
polling the database and detecting a problem only after such a state change.
Q State the initialization parameters that must have same value for every
instance in an Oracle RAC database
Some initialization parameters are critical at database creation time and must have the
same values. Their values must be specified in the SPFILE or PFILE for every instance. The
parameters that must be identical on every instance are listed below:
ACTIVE_INSTANCE_COUNT
ARCHIVE_LAG_TARGET
COMPATIBLE
CLUSTER_DATABASE
CLUSTER_DATABASE_INSTANCES
CONTROL_FILES
DB_BLOCK_SIZE
DB_DOMAIN
DB_FILES
DB_NAME
DB_RECOVERY_FILE_DEST
DB_RECOVERY_FILE_DEST_SIZE
DB_UNIQUE_NAME
INSTANCE_TYPE (RDBMS or ASM)
PARALLEL_MAX_SERVERS
REMOTE_LOGIN_PASSWORDFILE
UNDO_MANAGEMENT
Q What is ORA-00603: ORACLE server session terminated by fatal error or ORA-29702: error occurred in Cluster Group Service operation?
RAC node name was listed in the loopback address...
Q What are the modes of deleting instances from Oracle Real Application Clusters
databases?
We can delete instances using silent mode or interactive mode with DBCA (Database
Configuration Assistant).
We can verify if ASM has been removed by issuing the following command:
srvctl config asm -n node_name
Q How do we verify that an instance has been removed from OCR after deleting
an instance?
Issue the following srvctl command:
srvctl config database -d database_name
cd CRS_HOME/bin
./crs_stat
PROD2
CPU 15
12 GB RAM
PROD3
CPU 8
16 GB RAM
What are you looking for here? What tuning information do you expect?
It is a 3 node cluster with different hardware configuration running RAC.
I would put 20% of the memory for Oracle in each node. So that would mean that the SGA is
different in each of the nodes.
Also since the CPU's are different PROD2 can have more number of max number of
processes as compared to the rest of them.
But as I said this is just configuration, this is not tuning. Question is not clear.
Q Write a sample script for RMAN for the recovery if all the instance are down.
(First explain the procedure how you will restore)
Bring all nodes down.
Start one Node
Restore all datafiles and archive logs.
Recover 1 Node.
Open the database.
bring other nodes up.
Confirm that all nodes are operational.
Q. Clients are performing some operation and suddenly one of the datafile is
experiencing problem what do you do? The cluster is a two node one.
A. Bring the datafile offline recover the datafile.
OCR file. Use the following command to generate an export of the online OCR file:
In 10.2
# ocrconfig -export <file_name> -s online
In 11g
# ocrconfig -manualbackup
The new OCR disk must be owned by root, must be in the oinstall group, and must have
permissions set to 640. Provide at least 100 MB disk space for the OCR.
On one node as root run:
# ocrconfig -replace ocr <new OCR location>
# ocrconfig -replace ocrmirror <new OCR mirror location>
Now run ocrcheck to verify if the OCR is pointing to the new file
Moving Voting Disk
==================
Note: crsctl votedisk commands must be run as root
Shutdown the Oracle Clusterware (crsctl stop crs as root) on all nodes before making any
modification to the voting disk. Determine the current voting disk location using:
crsctl query css votedisk
Take a backup of all voting disk:
dd if=voting_disk_name of=backup_file_name
To move a Voting Disk, provide the full path including file name:
crsctl delete css votedisk <old voting disk path> -force
crsctl add css votedisk <new voting disk path> -force
After modifying the voting disk, start the Oracle Clusterware stack on all nodes
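The dd backup step above can be wrapped in a small helper that also verifies the copy. This is a sketch only: the function name backup_voting_disk and the backup file naming scheme are hypothetical, and it assumes Clusterware is already stopped so that the source does not change during the copy.

```shell
#!/bin/sh
# Sketch: copy one voting disk to a backup directory with dd and verify it.
# Usage: backup_voting_disk <voting_disk_path> <backup_dir>
backup_voting_disk() {
    src=$1
    dest_dir=$2
    [ -r "$src" ] || { echo "cannot read $src" >&2; return 1; }
    mkdir -p "$dest_dir" || return 1
    # Hypothetical naming scheme: <basename>.<timestamp>.bak
    dest="$dest_dir/$(basename "$src").$(date +%Y%m%d%H%M%S).bak"
    dd if="$src" of="$dest" bs=1048576 2>/dev/null || return 1
    # Compare source and copy byte for byte before trusting the backup.
    cmp -s "$src" "$dest" || { echo "backup of $src does not match" >&2; return 1; }
    echo "$dest"
}
```

Run it as root once per path reported by `crsctl query css votedisk`; on success it prints the name of the backup file it created.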
3. After the Summary screen, OUI will start copying under the $CRS_HOME (this is the
$ORACLE_HOME for Oracle Clusterware) in the local node the libraries and executables.
- here we will have the daemons and scripts init.* created and configured properly.
Oracle Clusterware is formed of several daemons, each one of which have a special function
inside the stack. Daemons are executed via the init.* scripts (init.cssd, init.crsd and
init.evmd).
- note that for CRS only some client libraries are recreated, but not all the executables (as
for the RDBMS).
4. Later the software is propagated to the rest of the nodes in the cluster and the
oraInventory is updated.
5. The installer will ask to execute root.sh on each node. Until this step the software for
Oracle Clusterware is inside the $CRS_HOME.
Running root.sh will create several components outside the $CRS_HOME:
- OCR and VD will be formatted.
- control files (or SCLS_SRC files ) will be created with the correct contents to start Oracle
Clusterware.
These files are used to control some aspects of Oracle Clusterware like:
- enable/disable processes from the CSSD family (Eg. oprocd, oslsvmon)
- stop the daemons (ocssd.bin, crsd.bin, etc).
- prevent Oracle Clusterware from being started when the machine boots.
- etc.
- /etc/inittab will be updated and the init process is notified.
In order to start the Oracle Clusterware daemons, the init.* scripts first need to be run.
These scripts are executed by the daemon init. To accomplish this some entries must be
created in the file /etc/inittab.
- the different processes init.* (init.cssd, init.crsd, etc) will start the daemons (ocssd.bin,
crsd.bin, etc). When all the daemons are running then we can say that the installation was
successful
- On 10.2 and later, running root.sh on the last node in the cluster also will create the
nodeapps (VIP, GSD and ONS). On 10.1, VIPCA is executed as part of the RAC installation.
6. After running root.sh on each node, we need to continue with the OUI session. After
pressing the 'OK' button OUI will include the information for the public and
cluster_interconnect interfaces. Also CVU (Cluster Verification Utility) will be executed.
Q What are Oracle Clusterware processes for 10g on Unix and Linux
Cluster Synchronization Services (ocssd) Manages cluster node membership and runs as
the oracle user; failure of this process results in cluster restart.
Cluster Ready Services (crsd) The crs process manages cluster resources (which could be
a database, an instance, a service, a Listener, a virtual IP (VIP) address, an application
process, and so on) based on the resource's configuration information that is stored in the
OCR. This includes start, stop, monitor and failover operations. This process runs as the root
user
Event manager daemon (evmd) A background process that publishes events that crs
creates.
Process Monitor Daemon (OPROCD) This process monitors the cluster and provides I/O
fencing. OPROCD performs its check, stops running, and if the wake up is beyond the
expected time, then OPROCD resets the processor and reboots the node. An OPROCD failure
results in Oracle Clusterware restarting the node. OPROCD uses the hangcheck timer on
Linux platforms.
RACG (racgmain, racgimon) Extends clusterware to support Oracle-specific requirements
and complex resources. Runs server callout scripts when FAN events occur.
Q What is SCAN?
Single Client Access Name (SCAN) is a new Oracle Real Application Clusters (RAC) 11g
Release 2 feature that provides a single name for clients to access an Oracle Database
running in a cluster. The benefit is clients using SCAN do not need to change if you add or
remove nodes in the cluster.
Q What do you do if you see GC CR BLOCK LOST in top 5 Timed Events in AWR
Report?
This is most likely due to a fault in the interconnect network.
Check netstat -s.
If you see "fragments dropped" or "packet reassemblies failed", work with your system
administrator to find the fault in the network.
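The netstat check above can be automated with a filter like the one below. It is a rough sketch: the function name check_ip_reassembly is hypothetical, the exact counter wording differs between operating systems (so the patterns are deliberately loose), and it assumes the counter value is the first field on the line, as on Linux.

```shell
#!/bin/sh
# Sketch: flag IP fragment drops and reassembly failures in `netstat -s`
# output supplied on stdin. Exit status is 1 when a non-zero counter is found.
check_ip_reassembly() {
    awk '
        /fragments dropped/ || /reassembl[a-z]* failed/ {
            n = $1 + 0                      # counter leads the line on Linux
            if (n > 0) {
                bad = 1
                sub(/^[ \t]+/, "")          # trim the indentation netstat adds
                print "WARN: " $0
            }
        }
        END {
            if (!bad) print "OK: no fragment drops or reassembly failures"
            exit (bad ? 1 : 0)
        }'
}
```

Typical use is `netstat -s | check_ip_reassembly`, run on each node. Since the counters are cumulative, comparing two runs a few minutes apart shows whether the problem is still occurring.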
Q Srvctl cannot start instance, I get the following error PRKP-1001 CRS-0215,
however sqlplus can start it on both nodes? How do you identify the problem?
Set the environment variable SRVM_TRACE to true and start the instance with srvctl. You
will now get a detailed error stack.
With Oracle RAC 10g Release 2 or later, you can also use the export
command:
#ocrconfig -export <file_name> -s online, and use the -import option to restore the contents
back.
With Oracle RAC 11g Release 1, you can do a manual backup of the OCR
with the command:
# ocrconfig -manualbackup
How do you backup voting disk
#dd if=voting_disk_name of=backup_file_name
How do I identify the voting disk location
#crsctl query css votedisk
How do I identify the OCR file location
check /var/opt/oracle/ocr.loc or /etc/ocr.loc ( depends upon platform)
or
#ocrcheck
Is ssh required for normal Oracle RAC operation?
"ssh" is not required for normal Oracle RAC operation. However, "ssh"
should be enabled for Oracle RAC and patchset installation.
What is the purpose of Private Interconnect ?
Clusterware uses the private interconnect for cluster synchronization
(network heartbeat) and daemon communication between the clustered
nodes. This communication is based on the TCP protocol.
RAC uses the interconnect for cache fusion (UDP) and inter-process
listeners.
This is in order to facilitate:
a. the FAN (Fast Application Notification) feature, which allows applications to
respond to database state changes.
b. the 10gR2 Load Balancing Advisory, the feature that permits load
balancing across different RAC nodes depending on the load on the different
nodes. The RDBMS MMON process creates an advisory for distribution of work
every 30 seconds and forwards it via racgimon and ONS to listeners and
applications.
http://dbaanswers.blogspot.com/2007/06/sroracle-dba-racdatagaurd-interview.html