
Oracle Data Guard

Causes of Data Loss

Hardware & system errors: 49%
Human errors: 36%
Computer viruses: 7%
Software corruption: 4%
Natural disasters: 3%

Source: Disaster Recovery Journal

Unplanned Downtime
System Failures: Real Application Clusters (Continuous Availability for all Applications)
Site Failures: Data Guard (Guaranteed Zero Data Loss)
Human Error: Flashback (Guaranteed Zero Data Loss)
Storage/Net Failures: ASM Mirroring (Storage Failure Protection)

Planned Downtime
System Maintenance: Dynamic Reconfiguration (Capacity on Demand without Interruption)
Database Maintenance: Online Redefinition (Adapt to Change Online)

What is a Standby Database?

A copy of a production database that you can use for disaster protection.
You can update the standby database with redo logs from the production database in order to keep it current.
If a disaster destroys the production database, you can activate the standby database and make it the new production database.
You can maintain the standby data in one of the following modes:
For physical standby databases: Redo Apply
For logical standby databases: SQL Apply

A Standby Database is NOT Data Guard

Why Data Guard?


Data Guard helps you protect your data.
Takes your data and automatically puts it elsewhere
Makes it available for failover in case of failure
The apply process also revalidates the log records to prevent application of any log corruptions
Geographically dispersed sites
Useful against logical data corruption if the standby is configured to lag behind the primary
Flexible configuration options for protection level
Reporting and backups can be diverted to the standby
Automatic resync for failed primary
Switchover for maintenance

Traditional Physical Standby Databases Investment in Disaster Recovery

Active Data Guard 11g Investment in Improved Quality of Service

Requirements

Data Guard 11g has several options for deploying different CPU architectures, operating system binaries and Oracle database binaries on primary and standby systems. For example, the primary database may be on Windows, and the standby database may be on Linux. See MetaLink Note 413484.1 for the latest capabilities and restrictions.

Bandwidth Requirements

Depends on redo generation
Find peak redo in the AWR report:

Load Profile
~~~~~~~~~~~~            Per Second    Per Transaction
                   ---------------    ---------------
  Redo size:             51,944.64           5,177.09

Bandwidth in Mbps = ((redo bytes per sec / 0.7) * 8) / 1,000,000
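As a worked example, the formula above can be evaluated directly in SQL*Plus. This is only a sketch: it plugs in the sample redo rate from the Load Profile shown on this slide (51,944.64 bytes per second), and the column alias required_mbps is made up for illustration. In practice the peak redo rate for your own system should be used.

```sql
-- Sketch: apply the bandwidth formula to the sample AWR redo rate.
-- 51944.64 is the "Redo size per second" figure from the slide;
-- dividing by 0.7 allows for network overhead, * 8 converts bytes to bits.
SELECT ROUND(((51944.64 / 0.7) * 8) / 1000000, 2) AS required_mbps
FROM   dual;
```

For the sample figure this works out to roughly 0.59 Mbps.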

Physical Standby

Protection Modes
Physical Standby Architecture
Standby Redo Logs
Real Time Apply
Automatic Resynchronization

Database Protection Modes


Maximum Protection
No data loss and no data divergence
Arch_dest: mandatory, lgwr, sync, affirm
Primary db shuts down when unable to access the standby

Maximum Availability
Arch_dest: mandatory, lgwr, sync, affirm
Protection automatically lowered when the standby is unavailable

Maximum Performance
Arch_dest: lgwr/arch, sync/async, mandatory/optional
Minimal performance impact
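The three modes above map onto three variants of the same statement, issued on the primary. The sketch below assumes the matching redo transport attributes are already set on the archive destination; only one of the statements would be used at a time.

```sql
-- Sketch: choose ONE of the following on the primary database.
-- Assumption: LOG_ARCHIVE_DEST_n already carries the attributes listed
-- above for the chosen mode. Depending on the release, raising the mode
-- may require the primary to be mounted rather than open.
ALTER DATABASE SET STANDBY DATABASE TO MAXIMIZE PROTECTION;
ALTER DATABASE SET STANDBY DATABASE TO MAXIMIZE AVAILABILITY;
ALTER DATABASE SET STANDBY DATABASE TO MAXIMIZE PERFORMANCE;

-- Confirm the configured mode and the level currently in effect.
SELECT protection_mode, protection_level FROM v$database;
```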

Maximum Availability Mode


Protection Mode: Maximum Availability (Zero Data Loss)
Failure Protection: Protects Against Primary Failure
Redo Shipping: LGWR using SYNC

Zero Data Loss as long as the network stays up!
Enforces protection of every transaction
Configuration: LGWR SYNC
If the last standby is unavailable, processing continues at the primary
When the standby becomes available again, synchronization with the primary is automatic

ALTER DATABASE SET STANDBY DATABASE TO MAXIMIZE AVAILABILITY;

Architecture

[Diagram. Primary database: transactions, LGWR, online redo logs, ARC0, archived redo logs. Oracle Net carries redo to the standby database: RFS, standby redo logs, ARC0, archived redo logs, MRP or LSP, FAL (MRP only). The standby is used for backup and reports.]

Standby Redo Logs

[Diagram: redo from the primary database is received by RFS and written to the standby redo logs; ARC0 archives them to archived redo logs, and MRP/LSP applies the redo to the standby database.]
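Standby redo logs have to be created before RFS can write into them. A minimal sketch follows; the group number and size are placeholders rather than values from this deck, and the statement as written assumes Oracle Managed Files is configured (otherwise a file name must be given).

```sql
-- Sketch: add one standby redo log group and check its state.
-- Assumptions: group number 10 and SIZE 50M are illustrative only;
-- size them like the online redo logs. With no Oracle Managed Files,
-- supply a file specification, e.g. GROUP 10 ('<path>') SIZE 50M.
ALTER DATABASE ADD STANDBY LOGFILE GROUP 10 SIZE 50M;

SELECT group#, thread#, sequence#, bytes, archived, status
FROM   v$standby_log;
```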

Real Time Apply


Redo data is applied to the standby database as soon as it is received from the primary database
In Oracle9i Data Guard, apply had to wait until an archived log was created on the standby database

For Redo Apply:
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE;
When real-time apply is enabled, the RECOVERY_MODE column in V$ARCHIVE_DEST_STATUS displays MANAGED REAL TIME APPLY
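The check mentioned above can be run as a simple query. This is a sketch; the filter on STATUS is an assumption added here to hide unused destinations and is not part of the original slide.

```sql
-- Sketch: show the recovery mode reported for each configured destination.
-- With real-time apply running, RECOVERY_MODE should read
-- MANAGED REAL TIME APPLY for the relevant destination.
SELECT dest_id, dest_name, recovery_mode
FROM   v$archive_dest_status
WHERE  status <> 'INACTIVE';
```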

SQL> ALTER SYSTEM SET LOG_ARCHIVE_DEST_2='SERVICE=tmstby OPTIONAL LGWR SYNC AFFIRM VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=tmstby';
SQL> ALTER SYSTEM SET LOG_ARCHIVE_CONFIG='DG_CONFIG=(tmtst,tmstby)';
SQL> ALTER DATABASE SET STANDBY DATABASE TO MAXIMIZE AVAILABILITY;
SQL> SELECT PROTECTION_MODE, PROTECTION_LEVEL FROM V$DATABASE;

PROTECTION_MODE      PROTECTION_LEVEL
-------------------- --------------------
MAXIMUM AVAILABILITY MAXIMUM AVAILABILITY

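Real-Time Apply Architecture

[Diagram. Primary database: transactions, LGWR, online redo logs, ARCH, archived redo logs. Oracle Net carries redo to RFS on the physical/logical standby database, which writes the standby redo logs. MRP/LSP performs real-time apply from the standby redo logs, while ARCH archives them to archived redo logs.]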

Real Time Apply Benefits

Standby databases now more closely synchronized with the primary
More up-to-date, real-time reporting
Faster switchover and failover times
Reduces planned and unplanned downtime
Better Recovery Time Objective (RTO) for DR

Real Time Apply - Tuning Media Recovery

Monitor via
Data Dictionary, OEM, Standby Statspack in 11g

Big performance boost in Oracle 11g
Up to 100% increase in redo apply performance

New standby statspack in Oracle 11g
See MetaLink Note 454848.1
Includes information specific to a standby
Output from V$RECOVERY_PROGRESS
Output from V$MANAGED_STANDBY

Data Dictionary

V$DATABASE
DATABASE_ROLE: LOGICAL STANDBY, PHYSICAL STANDBY or PRIMARY
PROTECTION_LEVEL: protection level currently in effect
FS_FAILOVER_STATUS: synchronization status

V$DATAGUARD_STATS
V$DATAGUARD_STATUS
V$LOG & V$STANDBY_LOG: redo log changes
V$MANAGED_STANDBY: recovery progress
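A few of the views listed above can be queried as follows. This is a sketch of typical monitoring queries, not the presenter's own scripts; the column selection is a choice made here for illustration.

```sql
-- Sketch: role and protection settings (run on primary or standby).
SELECT database_role, protection_mode, protection_level, fs_failover_status
FROM   v$database;

-- Sketch: what the Data Guard processes on the standby are doing
-- (RFS receiving redo, MRP0 applying it, ARCH archiving).
SELECT process, status, client_process, thread#, sequence#, block#
FROM   v$managed_standby;

-- Sketch: recent Data Guard messages, newest first.
SELECT timestamp, severity, message
FROM   v$dataguard_status
ORDER  BY timestamp DESC;
```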

Determining Query Latency

From Primary (requires database link)

select scn_to_timestamp( (select current_scn from v$database) )
     - scn_to_timestamp( (select current_scn from v$database@dg) )
from dual;

If you do not wish to connect to the Primary, determine the value of APPLY LAG for a best estimate
Use Enterprise Manager monitoring
Query V$DATAGUARD_STATS:

select value, unit, time_computed from v$dataguard_stats where name='apply lag';

Automatic Resynchronization

Network connectivity problems may occur
Data Guard automatically resynchronizes standbys after network connectivity is restored

Implicit
An ARCH process idling on the primary pings all standbys on a regular basis to see if they are missing any redo data
If so, it sends them the missing redo data

Explicit
Gap discovered during the apply process in a physical standby
Based on the FAL_SERVER and FAL_CLIENT settings, the primary is notified and sends the missing redo data
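The explicit gap resolution described above depends on the FAL parameters being set on the standby. A minimal sketch, assuming the Oracle Net aliases tmtst (primary) and tmstby (standby) used elsewhere in this deck:

```sql
-- Sketch, run on the physical standby.
-- Assumption: tmtst and tmstby are the Oracle Net aliases of the
-- primary and the standby, as in the demo configuration of this deck.
ALTER SYSTEM SET FAL_SERVER='tmtst';
ALTER SYSTEM SET FAL_CLIENT='tmstby';

-- Returns a row only while an archive gap is blocking redo apply.
SELECT thread#, low_sequence#, high_sequence#
FROM   v$archive_gap;
```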

Data Guard Role Transitions


Switchover
Planned role reversal
Used for OS or hardware maintenance

Failover
Unplanned role reversal
Use in emergency
Zero or minimal data loss depending on choice of data protection mode

Different steps for Physical and Logical Standby
Switchover using Enterprise Manager is literally two mouse clicks
We'll do a Physical Standby Failover via the command line using the Broker
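Before a planned role transition it can help to check whether each database reports itself ready. The query below is a sketch of that check; it is not part of the original demo.

```sql
-- Sketch: run on both the primary and the standby before a switchover.
-- On the primary, values such as TO STANDBY or SESSIONS ACTIVE indicate
-- a switchover can proceed; NOT ALLOWED indicates it cannot.
SELECT database_role, switchover_status
FROM   v$database;
```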

Data Guard Broker
Fast-Start Failover
Fast-Start Failover Demo
Client Failover
Oracle 11g Active Data Guard

Fast-Start Failover

Makes Data Guard more than a Standby Database.
Enables automatic failover with no data loss.
A feature of Oracle Database Enterprise Edition.

Only supports up to Maximum Availability Mode.
Requires a 3rd server. Install the DGMGRL client, part of the Oracle client administrator software.
Observer process continuously monitors primary and standby databases.
If the listener is not running on port 1521, local_listener must be set in the spfile.
Observer detects failure.
Observer automatically executes database failover once the threshold has been exceeded.
DB_ROLE_CHANGE trigger fires: enables primary service.
This trigger can be customized to restart JDBC mid-tier clients and call any other OCI-enabled application.
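A role-change trigger of the kind mentioned above might look like the following. This is a sketch under assumptions: the trigger name is hypothetical, and the service name tmOCI.vodacom.co.za is taken from the client failover slides later in this deck. It would normally be created by a suitably privileged user such as SYS.

```sql
-- Sketch: start the application service once this database becomes primary.
-- Assumptions: trigger name is made up for illustration; the service
-- tmOCI.vodacom.co.za has already been created with DBMS_SERVICE.
CREATE OR REPLACE TRIGGER start_service_on_role_change
AFTER DB_ROLE_CHANGE ON DATABASE
DECLARE
  v_role VARCHAR2(30);
BEGIN
  SELECT database_role INTO v_role FROM v$database;
  IF v_role = 'PRIMARY' THEN
    DBMS_SERVICE.START_SERVICE('tmOCI.vodacom.co.za');
  END IF;
END;
/
```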

Fast-Start Failover

1. Data Guard in steady state, transmitting redo
2. Observer monitoring the state of the configuration
3. Disaster strikes the primary; connections are lost
4. Observer times out
5. Observer validates connection with target standby
6. Observer begins Fast-Start Failover
7. Target standby automatically becomes new primary (DB_ROLE_CHANGE trigger fires)
8. After old primary is repaired, Observer re-establishes connection
9. Observer automatically reinstates old primary to be a new standby
10. Redo transmission starts from new primary to new standby

Events that trigger Fast-Start Failover

Database conditions:
Server crash or shutdown (without db shutdown)
Database instance failure (or last instance failure in a RAC configuration)
Shutdown abort (or shutdown abort of the last instance in a RAC configuration)
Datafiles taken offline due to I/O errors

Network conditions:
When both the Observer and the standby database lose their network connection to the primary database, and the standby database confirms that it is in a synchronized state.

Fast-Start Failover Conclusion


Fast
Site failover time measured in seconds, not minutes
Failover is automatic, no manual intervention

Reliable
Eliminates human error
Zero data loss failover

Simple
Automatically determines if failover criteria are met
Original primary database is automatically reinstated as a new standby database following failover

Prevention of "Split Brain" due to accidental startup of former primary database
Reduced downtime through automatic activation of the standby database
A failover solution without a shared disk system, with additional advantages (enhanced data availability) and even reduced failover time compared to an HA cluster
Many technical prerequisites (Flashback Database, special Maximum Availability Mode)
No automatic failover to a second standby database possible

Fast-Start Failover

Requirements
Fast-Start Failover is a feature of Oracle Data Guard and can't run without a Data Guard Broker configuration!
Observer machine and configuration
Special entry in Data Guard Broker configuration
Maximum Availability Mode (mandatory)
but: special startup behaviour
but: primary stalls in certain situations
Flashback Database must be activated

Demo: Switchover

1. Configure Broker and Fast-Start Failover
2. Configure Observer
3. Shutdown abort on the primary database TMTST
4. Wait until Fast-Start Failover occurs on TMSTBY
5. Restart the old primary TMTST
6. Verify that the Observer reinstates database TMTST

Demo: Configure Fast_Start Failover


Flash Recovery Areas are set up on both sides

SQL> show parameter DB_RECOVERY_FILE_DEST
NAME                           TYPE         VALUE
db_recovery_file_dest          string       /tst/dump/oracle/fra
db_recovery_file_dest_size     big integer  2G

Set up Flashback Database (on both):

SQL> select FLASHBACK_ON from v$database;
FLASHBACK_ON
NO

SQL> SHUTDOWN IMMEDIATE;
SQL> STARTUP MOUNT;
SQL> ALTER DATABASE FLASHBACK ON;
SQL> ALTER DATABASE OPEN;

SQL> show parameter flash
NAME                           TYPE         VALUE
db_flashback_retention_target  integer      1440 (1 day)

Broker Parameters:

SQL> ALTER SYSTEM SET DG_BROKER_CONFIG_FILE1='+TVBSDG1/dr1tmstby.dat';
SQL> ALTER SYSTEM SET DG_BROKER_CONFIG_FILE2='+TVBSDG1/dr2tmstby.dat';
SQL> ALTER SYSTEM SET DG_BROKER_START=TRUE;

Demo: Configure Fast_Start Failover


Listener.ora on Primary

SID_LIST_LSNR_DGTEST =
  (SID_LIST =
    (SID_DESC =
      (GLOBAL_DBNAME = tmtst.vodacom.co.za)
      (ORACLE_HOME = /tst/opt/apps/oracle/database/10.2.0.4)
      (SID_NAME = tmtst)
    )
    (SID_DESC =
      (GLOBAL_DBNAME = tmtst_DGMGRL.vodacom.co.za)
      (ORACLE_HOME = /tst/opt/apps/oracle/database/10.2.0.4)
      (SID_NAME = tmtst)
    )
  )

Demo: Configure Fast_Start Failover


Tnsnames on both:

TMTST.VODACOM.CO.ZA =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = prab03.vodacom.co.za)(PORT = 1521))
    )
    (CONNECT_DATA =
      (SERVICE_NAME = tmtst_DGMGRL.vodacom.co.za)
    )
  )

TMSTBY.VODACOM.CO.ZA =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = tvbs01.vodacom.co.za)(PORT = 1521))
    )
    (CONNECT_DATA =
      (SERVICE_NAME = tmstby_DGMGRL.vodacom.co.za)
    )
  )

Demo: Configure Observer


#> dgmgrl
DGMGRL> connect sys/xxx@tmtst
Connected.
DGMGRL> CREATE CONFIGURATION TMDRTEST AS
>   PRIMARY DATABASE IS tmtst            --> SHOW PARAMETER DB_UNIQUE_NAME
>   CONNECT IDENTIFIER IS tmtst;         --> tns entry
Configuration "tmdrtest" created with primary database "tmtst"
DGMGRL> ADD DATABASE tmstby AS
>   CONNECT IDENTIFIER IS tmstby
>   MAINTAINED AS PHYSICAL;
Database "tmstby" added
DGMGRL> ENABLE CONFIGURATION;
DGMGRL> SHOW CONFIGURATION;
DGMGRL> SHOW DATABASE VERBOSE tmtst;
DGMGRL> EDIT DATABASE tmtst SET PROPERTY 'LogXptMode'='SYNC';
DGMGRL> EDIT DATABASE tmtst SET PROPERTY FastStartFailoverTarget='tmstby';
DGMGRL> EDIT DATABASE tmstby SET PROPERTY FastStartFailoverTarget='tmtst';
DGMGRL> ENABLE FAST_START FAILOVER;
DGMGRL> START OBSERVER;                  --> warning, prompt will not be returned!

Demo: Switchover
DGMGRL> SWITCHOVER TO tmstby;            ------ duration 90 seconds!
Performing switchover NOW, please wait...
Operation requires shutdown of instance "tmtst" on database "tmtst"
Shutting down instance "tmtst"...
ORA-01109: database not open
Database dismounted.
ORACLE instance shut down.
Operation requires shutdown of instance "tmstby" on database "tmstby"
Shutting down instance "tmstby"...
ORA-01109: database not open
Database dismounted.
ORACLE instance shut down.
Operation requires startup of instance "tmtst" on database "tmtst"
Starting instance "tmtst"...
ORACLE instance started.
Database mounted.
Operation requires startup of instance "tmstby" on database "tmstby"
Starting instance "tmstby"...
ORACLE instance started.
Database mounted.
Switchover succeeded, new primary is "tmstby"

Verify on both

SELECT DATABASE_ROLE, STATUS, DB_UNIQUE_NAME,
       PROTECTION_MODE, PROTECTION_LEVEL, SWITCHOVER_STATUS,
       CHECKPOINT_CHANGE#, CURRENT_SCN, STANDBY_BECAME_PRIMARY_SCN,
       FS_FAILOVER_STATUS, FS_FAILOVER_CURRENT_TARGET,
       FS_FAILOVER_THRESHOLD, FS_FAILOVER_OBSERVER_PRESENT,
       FS_FAILOVER_OBSERVER_HOST
FROM   V$DATABASE;

Client Failover Best Practices


SQL> exec DBMS_SERVICE.CREATE_SERVICE ( -
       service_name        => 'tmOCI.vodacom.co.za', -
       network_name        => 'tmOCI.vodacom.co.za', -
       aq_ha_notifications => true, -
       failover_method     => 'BASIC', -
       failover_type       => 'SELECT', -
       failover_retries    => 180, -
       failover_delay      => 1);
SQL> exec DBMS_SERVICE.START_SERVICE('tmOCI.vodacom.co.za');
SQL> select value from v$parameter where name = 'service_names';

VALUE
------------------------------------------------
tmtst_DGMGRL.vodacom.co.za, tmOCI.vodacom.co.za

Client Failover Best Practices


Configure startup trigger for service

SQL> CREATE OR REPLACE TRIGGER manage_OCIservice
     AFTER STARTUP ON DATABASE
     DECLARE
       role VARCHAR2(30);
     BEGIN
       SELECT DATABASE_ROLE INTO role FROM V$DATABASE;
       IF role = 'PRIMARY' THEN
         DBMS_SERVICE.START_SERVICE('tmOCI.vodacom.co.za');
       ELSE
         DBMS_SERVICE.STOP_SERVICE('tmOCI.vodacom.co.za');
       END IF;
     END;
     /

Client Failover Best Practices


Client tns entry configuration

TMOCI =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = prab03.vodacom.co.za)(PORT = 1521))
      (ADDRESS = (PROTOCOL = TCP)(HOST = tvbs01.vodacom.co.za)(PORT = 1521))
      (LOAD_BALANCE = yes)
    )
    (CONNECT_DATA =
      (SERVICE_NAME = tmOCI)
    )
  )

Active Data Guard 11g Investment in Improved Quality of Service

Active Data Guard

Begin with a Data Guard 11g physical standby database


If redo apply is running, stop redo apply
Open the standby database read-only
Start redo apply
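In SQL*Plus on the physical standby, the three steps above could be sketched as follows. This assumes an 11g physical standby that is currently mounted with managed recovery running; the final query is an added check and is not part of the original slide.

```sql
-- Sketch: enable read-only access with redo apply on an 11g physical standby.
-- 1. Stop redo apply (only needed if managed recovery is running).
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;

-- 2. Open the standby read-only.
ALTER DATABASE OPEN READ ONLY;

-- 3. Restart redo apply in the background, using real-time apply.
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE
  USING CURRENT LOGFILE DISCONNECT FROM SESSION;

-- Added check: confirm the open mode and role.
SELECT open_mode, database_role FROM v$database;
```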

Data Guard Broker & Enterprise Manager


Data Guard Broker CLI
Stop redo apply with the following command:
EDIT DATABASE TMSTBY SET STATE='APPLY-OFF';

Open standby read-only via SQL*Plus:
SQL> alter database open read only;

Restart redo apply via broker CLI:
EDIT DATABASE TMSTBY SET STATE='APPLY-ON';

Oracle Enterprise Manager 10g


Stop redo apply within Data Guard GUI
Open standby in read-only mode in Advanced Startup Options
Restart redo apply within Data Guard GUI

Supported Operations for Read Only


When connected to an Active Data Guard standby database, read-only applications can perform/use:
Selects
Alter session / system
Set role
Lock table
Call stored procedures
DBlinks to write to remote databases
Stored procedures to call remote procedures via DBlinks
SET TRANSACTION READ ONLY for transaction level read consistency
Complex queries, e.g. grouping set queries and WITH clause queries
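As an illustration of transaction-level read consistency from the list above, a reporting session on the standby might run something like this. It is only a sketch; the dictionary views queried are stand-ins for real application tables.

```sql
-- Sketch: two queries that see the same consistent snapshot of the standby.
-- SET TRANSACTION must be the first statement of the transaction.
SET TRANSACTION READ ONLY;

SELECT COUNT(*) FROM all_tables;   -- stand-in for an application query
SELECT COUNT(*) FROM all_indexes;  -- sees data as of the same point in time

-- End the read-only transaction.
COMMIT;
```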

THANK YOU

Thinus.Meyer@vodacom.co.za http://martinmeyer.blogspot.com
