ARCHIVE GAP
An archive gap is a set of archived redo logs that could not be transmitted from the Primary database to the Standby site. This most often happens when network connectivity between the Primary and Standby sites becomes unavailable. When the network is available again, Data Guard resumes redo transmission from the Primary to the Standby site. Oracle Data Guard provides two methods for gap resolution: AUTOMATIC and FAL (Fetch Archive Log).
Consider an extended network failure between the Primary and Standby machines that leaves the Standby very far behind the Primary database; in that case an RMAN INCREMENTAL BACKUP can be used to roll the Standby database forward faster than redo log apply.
When archived logs are missing on the Standby database, we can simply ship the missing logs from the Primary to the Standby database (practical only when the missing count is small, e.g. below 15). We then need to register all shipped logs on the Standby database so that the gap can be resolved.
In this article I will demonstrate how to resolve archive log gaps using the following methods.
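The before/after checks in this article compare the highest archived/applied sequence on each side. The extract does not show the statement used, but a standard query of this form is assumed:

```sql
-- Assumed query: highest sequence per thread. Run on the Primary, and on the
-- Standby (add APPLIED = 'YES' to see the last applied log).
SELECT thread#, MAX(sequence#)
FROM   v$archived_log
GROUP  BY thread#;
```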
# On Primary database
# On Standby database
# On Primary database
THREAD# MAX(SEQUENCE#)
---------- --------------
1 551
# On Standby database
THREAD# MAX(SEQUENCE#)
---------- --------------
1 551
$ cd /u01/app/oracle/diag/rdbms/stbycrms/stbycrms/trace
$ tail -f alert_stbycrms.log
DETECTING GAPS
Oracle Data Guard provides us with a simple view (V$ARCHIVE_GAP) to detect a gap.
On Standby database
The output on the Standby database shows that it is currently missing log files from sequence# 552 to 556; the Standby database is 5 logs behind the Primary database. ORACLE NOTE: refer to BUG #10072528:
V$ARCHIVE_GAP may not detect an archive gap when the Physical Standby is open read only.
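The gap itself is listed on the Standby with the documented view; the extract omits the query text, but the standard form is:

```sql
-- Run on the Physical Standby: one row per detected gap
SELECT thread#, low_sequence#, high_sequence#
FROM   v$archive_gap;
```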
After identifying a gap (as shown above), as a DBA you need to query the Primary database to locate the archived redo logs there. The local archive destination on the Primary is configured as LOG_ARCHIVE_DEST_1.
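The file names listed below can be retrieved on the Primary with a query along these lines (the exact statement is not shown in the extract):

```sql
-- Run on the Primary: locate the archived logs covering the gap
-- (sequence range 552-556 from the V$ARCHIVE_GAP output)
SELECT name
FROM   v$archived_log
WHERE  thread# = 1
AND    dest_id = 1
AND    sequence# BETWEEN 552 AND 556;
```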
NAME
--------------------------------------------------------------------------------------
/u01/app/oracle/flash_recovery_area/CRMS/archivelog/2016_06_01/o1_mf_1_552_cnvodqq5_.arc
/u01/app/oracle/flash_recovery_area/CRMS/archivelog/2016_06_01/o1_mf_1_553_cnvohcy1_.arc
/u01/app/oracle/flash_recovery_area/CRMS/archivelog/2016_06_01/o1_mf_1_554_cnvokk6z_.arc
/u01/app/oracle/flash_recovery_area/CRMS/archivelog/2016_06_01/o1_mf_1_555_cnvom1l4_.arc
/u01/app/oracle/flash_recovery_area/CRMS/archivelog/2016_06_01/o1_mf_1_556_cnvomxkz_.arc
Copy the above redo log files to the Physical Standby database and register them using the
ALTER DATABASE REGISTER LOGFILE ... SQL statement on the Physical Standby database.
$ ls /u01/app/oracle/flash_recovery_area/CRMS/archivelog/2016_06_01/
As per the above example, you need to transfer all the listed archive logs to the Standby server.
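A minimal sketch of the transfer, using the Standby host and archivelog directory shown later in this article:

```shell
# Ship the missing archived logs (sequence 552-556) to the Standby server
cd /u01/app/oracle/flash_recovery_area/CRMS/archivelog/2016_06_01/
scp o1_mf_1_55[2-6]* \
  oracle@192.168.222.134:/u01/app/oracle/flash_recovery_area/STBYCRMS/archivelog/2016_06_01/
```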
Database altered.
Database altered.
SYS> alter database register logfile '/u01/app/oracle/flash_recovery_area/STBYCRMS/archivelog/2016_06_01/o1_mf_1_554_cnvokk6z_.arc';
Database altered.
Database altered.
Database altered.
The recovery process should pick up the registered logs automatically; otherwise stop the managed recovery process (MRP) and restart it. That's it!
Because the archive logs were not transferred from the Primary to the Standby server by the log transport service (but through SCP), the managed recovery process has no information about them. Manually transferred logs therefore need to be registered with the managed recovery process before they can be applied by the log apply service.
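Restarting managed recovery can be sketched with the standard syntax (the DISCONNECT option is illustrative; adjust to your configuration):

```sql
-- On the Standby: stop and restart the managed recovery process (MRP)
SYS> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
SYS> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION;
```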
When the Standby database lags far behind the Primary database, you will notice a huge sequence mismatch between the Primary and Standby databases. There can be several reasons for this. When a Physical Standby database is out of sync with the Primary and the required archive logs are missing or corrupt, the traditional fix is to rebuild the Standby from scratch. If the database size is in terabytes, rebuilding the Standby database is a tedious job; but we have a solution for this kind of issue.
As a DBA you can use an RMAN incremental backup to sync a Physical Standby with the Primary database: the RMAN BACKUP INCREMENTAL FROM SCN command creates a backup on the Primary database that starts at the Standby database's current SCN, which can then be used to roll the Standby database forward in time.
Assume a bunch of archive logs were deleted or corrupted on the Primary database server before they could be transferred to the Standby database server. For this case, I demonstrate an efficient way to sync the Standby with the Primary (an alternative to rebuilding the Standby database)!
DISASTER RECOVERY
# On Primary database
# On Standby database
THREAD# MAX(SEQUENCE#)
---------- --------------
1 653
# On Standby database
THREAD# MAX(SEQUENCE#)
---------- --------------
1 558
Note the difference: on the Standby the last applied SEQUENCE# is 558, but on the Primary it is 653.
SYS> @archive_gap.sql
# On Standby database
The last SCN of the Standby will be used for the RMAN incremental backup on the Primary database server.
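The SCN values below are assumed to come from the standard query:

```sql
-- Run on the Standby first, then on the Primary
SELECT current_scn FROM v$database;
```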
CURRENT_SCN
-----------
3473651
CURRENT_SCN
-----------
4538393
Note the SCN difference: 4538393 - 3473651 = 1064742 (the Standby is lagging behind).
But I want to know how far behind the Standby database is in terms of hours/days. For that I used the SCN_TO_TIMESTAMP function to translate the SCN to a timestamp.
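The translations below use queries of this form on the Primary:

```sql
SYS> SELECT scn_to_timestamp(4538393) FROM dual;
SYS> SELECT scn_to_timestamp(3473651) FROM dual;
```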
# On Primary database
SCN_TO_TIMESTAMP(4538393)
---------------------------------------
01-JUN-16 11.12.52.000000000 PM
SCN_TO_TIMESTAMP(3473651)
--------------------------------------------------
01-JUN-16 04.32.44.000000000 AM
The Standby database is more than 18 hours behind the Primary database.
NOTE: this query does not work in OPEN READ ONLY or MOUNTED mode.
$ ls /u01/app/oracle/flash_recovery_area/CRMS/archivelog/2016_06_01/
To sync the Standby with the Primary we need archive logs SEQUENCE# 559 to SEQUENCE# 644, but these archives are NOT found at the Primary site; the only choice is an RMAN incremental backup.
On the Primary, take an incremental backup starting from the last recorded SCN# (3473651) of the Standby database.
# Connect target as Primary & Create control file backup for Standby database
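A sketch of the two backup commands, assuming the /u03/rman-bkp/ location shown below (the FORMAT and TAG strings are illustrative):

```sql
RMAN> BACKUP INCREMENTAL FROM SCN 3473651 DATABASE
      FORMAT '/u03/rman-bkp/stby_%U' TAG 'STBY_SYNC';
RMAN> BACKUP CURRENT CONTROLFILE FOR STANDBY
      FORMAT '/u03/rman-bkp/stdb_ctrl.ctl';
```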
$ ls -l /u03/rman-bkp/
total 4707832
-rw-r----- 1 oracle oinstall 4792238080 Jun 2 02:42 stby_03r73ftc_1_1
-rw-r----- 1 oracle oinstall 11927552 Jun 2 02:42 stby_04r73gm6_1_1
-rw-r----- 1 oracle oinstall 11927552 Jun 2 02:49 stdb_ctrl.ctl
Use an OS utility to transfer these backups from the Primary server to the Standby server.
$ scp * oracle@192.168.222.134:/u03/bkp/
oracle@192.168.222.134's password: ******
$ ls -l /u03/bkp/
total 4707832
-rw-r----- 1 oracle oinstall 4792238080 Jun 2 05:40 stby_03r73ftc_1_1
-rw-r----- 1 oracle oinstall 11927552 Jun 2 05:40 stby_04r73gm6_1_1
-rw-r----- 1 oracle oinstall 11927552 Jun 2 05:40 stdb_ctrl.ctl
Note the location of all data files & Control file(s) at the Standby Database Server.
$ cd /u01/app/oracle/flash_recovery_area/stbycrms/
$ mv control02.ctl /tmp/control02.ctl.bkp
$ cd /u01/app/oracle/oradata/stbycrms/
$ mv control01.ctl /tmp/control01.ctl.bkp
NOTE: If you do not want to rename the control files, you can remove these files at the OS level.
$ export ORACLE_SID=stbycrms
$ rman target /
RMAN does not know about these backup sets, so we must catalog them on the Standby database. That is, when we register these backup files in the Standby database control file with the RMAN CATALOG command, the Standby control file is updated with information about them, which allows RMAN to recover the Standby using these backup files.
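The restore-and-catalog sequence can be sketched with standard RMAN syntax, using the /u03/bkp/ location shown above:

```sql
RMAN> SHUTDOWN IMMEDIATE;
RMAN> STARTUP NOMOUNT;
RMAN> RESTORE STANDBY CONTROLFILE FROM '/u03/bkp/stdb_ctrl.ctl';
RMAN> ALTER DATABASE MOUNT;
RMAN> CATALOG START WITH '/u03/bkp/';
```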
File Name:
/u01/app/oracle/flash_recovery_area/STBYCRMS/archivelog/2016_06_01/o1_mf_1_646_cnxof4nl_.arc
File Name:
/u01/app/oracle/flash_recovery_area/STBYCRMS/archivelog/2016_06_01/o1_mf_1_647_cnxocnkr_.arc
File Name:
/u01/app/oracle/flash_recovery_area/STBYCRMS/archivelog/2016_06_01/o1_mf_1_652_cnxpz1lz_.arc
File Name:
/u01/app/oracle/flash_recovery_area/STBYCRMS/archivelog/2016_06_01/o1_mf_1_645_cnxodgj5_.arc
File Name:
/u01/app/oracle/flash_recovery_area/STBYCRMS/archivelog/2016_06_01/o1_mf_1_651_cnxodqq5_.arc
File Name:
/u01/app/oracle/flash_recovery_area/STBYCRMS/archivelog/2016_06_01/o1_mf_1_650_cnxp48lh_.arc
File Name:
/u01/app/oracle/flash_recovery_area/STBYCRMS/archivelog/2016_06_01/o1_mf_1_649_cnxoc5oq_.arc
File Name:
/u01/app/oracle/flash_recovery_area/STBYCRMS/archivelog/2016_06_01/o1_mf_1_653_cnxpzpvs_.arc
File Name:
/u01/app/oracle/flash_recovery_area/STBYCRMS/archivelog/2016_06_01/o1_mf_1_648_cnxobwyc_.arc
Do you really want to catalog the above files (enter YES or NO)? Y
cataloging files...
cataloging done
The recovery process will use the cataloged incremental backup sets because we have registered them. Since this backup was taken to sync a Physical Standby database, the NOREDO keyword is required.
If data files were added to the Primary database during the archive log gap period, they are not included in the incremental backup sets. We need to restore the newly created files to the Standby database; we can find them using the CURRENT_SCN of the Standby.
# Execute the following query at Primary DB - (Last SCN of the Standby database)
SYS> select file#, name from v$datafile where creation_change# > 3473651;
FILE# NAME
---------- --------------------------------------------
6 /u01/app/oracle/oradata/crms/users03.dbf
RMAN> run
{restore datafile 6;}
All changed blocks have been captured in the incremental backup and applied to the Standby database, which brings the Standby database up to date with the Primary database. Check the SCNs on the Primary and Standby; they should be close to each other.
THREAD# MAX(SEQUENCE#)
---------- --------------
1 773
THREAD# MAX(SEQUENCE#)
---------- --------------
1 773
SYS> select process, status, thread#, sequence#, block#, blocks from v$managed_standby;
...
REFERENCE DOCS :
https://docs.oracle.com/cd/E11882_01/server.112/e41134/rman.htm#SBYDB00759
http://docs.oracle.com/cd/B19306_01/backup.102/b14191/rcmdupdb.htm#sthref955
https://web.stanford.edu/dept/itss/docs/oracle/10gR2/backup.102/b14191/rcmdupdb008.htm
# Perform Recover
RMAN> recover database noredo;
These are the steps I took to roll the Standby database forward in time. That's it!
If connectivity is lost between the Primary and Standby databases (due to network problems),
redo data being generated on the primary database cannot be sent to the Standby database.
Once a connection is reestablished, the missing archived redo log files are automatically
detected by Data Guard, which then automatically transmits the missing archived redo log
files to the Standby databases. The Standby databases are synchronized with the Primary
database, without manual intervention by the DBA. How does this happen? Let's dig in deeper.
The RFS process compares the sequence number of the redo file currently being archived with the sequence number of the previously received archived redo file; if the current sequence# is greater than the last sequence# received plus one, there is a gap. An archived redo log file is uniquely identified by its sequence number and thread number.
Suppose three files are missing: there is a gap between the received files, so RFS automatically requests the missing redo log sequence#s from the Primary database again via the ARCH-RFS heartbeat ping. The archiver of the Primary then retransmits these missing archived redo files.
This type of gap resolution uses the service defined in LOG_ARCHIVE_DEST_n on the Primary database serving this Standby database.
The archiver process of the Primary database polls the Standby databases every minute (referred to as the heartbeat) to see if there is a gap in the sequence of archived redo logs. If a gap is detected, the ARCH process sends the missing archived redo log files to the Standby databases that reported the gap. Once a gap is resolved, the transport process (ARCH/LGWR) is notified about the resolution of the gap.
As I said above, when RFS receives an archive log on the Standby database, the archive log is registered in the Standby control file with its name and location. Missing log files are typically detected by the log apply services (MRP) on the Standby database. If an archived redo log file is missing or corrupted for any reason (e.g. it got deleted), FAL is required to resolve the gap and obtain a new copy of the corrupted or deleted file.
Since MRP has no direct communication with the log transport services of the Primary database, it must use the FAL_SERVER and FAL_CLIENT initialization parameters to resolve the gap. Both of these parameters must be set in the Standby initialization parameter file.
FAL_CLIENT and FAL_SERVER need to be defined in the initialization parameter file of the Standby database(s). They are mainly used for gap resolution through the FAL background process. Using the FAL parameters, the MRP on the Physical Standby database checks automatically and resolves gaps at the time redo is applied.
FAL_CLIENT: the destination (an Oracle Net service name) to which the FAL_SERVER database should send the requested archived log(s). In earlier releases, when you set the FAL_CLIENT parameter on the Standby database, the Primary database (FAL_SERVER) used that Standby net service name to connect to the Standby database.
# On Primary database
FAL_SERVER: an Oracle Net service name (TNS alias or connect descriptor). This parameter must be configured on the Standby database system and points to the database from which the missing archive log(s) should be requested. It is recommended to also set FAL_SERVER on the Primary database with the value of the Standby database's net service name; this helps during a switchover.
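A minimal configuration sketch, assuming net service names CRMS (Primary) and STBYCRMS (Standby) matching the database names used elsewhere in this article:

```sql
-- On the Standby: fetch missing logs from the Primary (service names assumed)
SYS> ALTER SYSTEM SET fal_server='CRMS';
SYS> ALTER SYSTEM SET fal_client='STBYCRMS';
```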
Once the log apply services (MRP) detect an archive gap, MRP sends a FAL request to the FAL_SERVER. Once communication has been established with the Primary database, it passes the sequence numbers of the archived files causing the archive gap, to be retransmitted by the archiver process of the Primary database; additionally it passes the service name defined by the FAL_CLIENT parameter to the Primary ARCH process.
An ARCH process on the FAL_SERVER tries to pick up the requested sequences from that database and sends them to the FAL_CLIENT. That is, the Primary database ARCH process ships the requested archived logs to the remote archive destination of the corresponding service name.
RFS[54]: Opened log for thread 1 sequence 1076 dbid 1613387466 branch 913081878
Media Recovery Log
/u01/app/oracle/flash_recovery_area/STBYCRMS/archivelog/2016_06_04/o1_mf_1_1076_co5qn3sd_.arc
RFS[55]: Opened log for thread 1 sequence 1077 dbid 1613387466 branch 913081878
Media Recovery Log
/u01/app/oracle/flash_recovery_area/STBYCRMS/archivelog/2016_06_04/o1_mf_1_1077_co5qn4db_.arc
In order to successfully complete a gap request, the requested archive log sequence(s) must be available on the FAL_SERVER database.
When you have multiple Physical Standby databases, the FAL mechanism can automatically retrieve missing archived redo log files from another Physical Standby database.
If the requested sequences are not available, the FAL request fails and a corresponding message is written to the alert.log of the Standby database. The example was taken for some other sequences (1078 to 1081).
Every minute the Primary database polls its Standby databases to see if there are gaps in
the sequence of archived redo log files.
The FAL client requests the transfer of archived redo log files automatically.
The FAL server services the FAL requests coming from the FAL client.
A separate FAL server is created for each incoming FAL client.
FAL has been available since Oracle 9.2.0 for Physical Standby databases and Oracle 10.1.0 for Logical Standby databases. In 11g, if you do not set the FAL_CLIENT parameter, the Primary database obtains the service name from the related LOG_ARCHIVE_DEST_n parameter.
REFERENCE LINKS
http://flylib.com/books/en/1.145.1.66/1/
https://docs.oracle.com/cd/B19306_01/server.102/b14239/log_transport.htm#i1268294
Oracle Note: Data Guard Gap Detection and Resolution Possibilities [ID 1537316.1]