Techniques
Silberschatz Chapter 16
Database Recovery
Basic Storage Structure
– Storage structure: stable Storage
– Data Access, Buffering and DBMS caching
What is recovery?
– Types Of Failure
– Write-Ahead Logging, Steal/No-Steal, and Force/No-Force
– Checkpoints in the System Log
– Transaction Rollback
How to do recovery?
– Log based recovery: Deferred Update (no UNDO/REDO)
– Log based recovery: Immediate Update (UNDO/REDO)
– Shadow Paging recovery (no UNDO/no REDO)
– ARIES
Slide -2
Storage Structure
Volatile storage:
– does not survive system crashes.
– examples: main memory, cache memory.
Nonvolatile storage:
– survives system crashes.
– examples: disk, tape, flash memory,
non-volatile (battery backed up) RAM.
Stable storage:
– a theoretical form of storage that survives all failures.
– approximated by maintaining multiple copies on
distinct nonvolatile media (e.g. combination of RAID
and archive tape backups, copy block to remote site).
Slide -4
Example of Data Access
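[Figure: main memory holds buffer blocks A and B (containing X and Y) and the transaction work area with local copies x1, x2 and y1; the disk holds blocks A and B. input(A) copies block A from disk into the buffer, output(B) copies buffer block B to disk, and read(X) / write(Y) move items between the buffer and the work area.]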
Slide -5
Data Access
Physical blocks are those blocks residing on the disk.
We assume, for simplicity, that each data item fits in, and is stored
inside, a single block.
Slide -6
Data Access (Cont.)
Transaction transfers data items between system buffer blocks and its
private work-area using the following operations :
– read(X) assigns the value of data item X to the local variable xi.
– write(X) assigns the value of local variable xi to data item X in the
buffer block.
– Both commands may require an input(B_X) instruction before the
assignment, if the block B_X in which X resides is not already in memory.
Transactions
– Perform read(X) while accessing X for the first time;
– All subsequent accesses are to the local copy.
– After last access, transaction executes write(X).
DBMS maintains a directory for the cache to keep track of which database
items are in the buffers. It maintains a number of lists containing:
– the active transaction list (started but not yet committed)
– all transactions committed since the last checkpoint
– all transactions aborted since the last checkpoint
The DBMS first checks the cache directory for the required disk page. If the
page is not listed there, the DBMS locates it on disk and copies the disk page
containing the item into the cache.
It may be necessary to replace (or flush) the cache buffers to make space
available using LRU (least recently used) or FIFO (first in first out) buffer
replacement strategy.
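The data-access operations and LRU replacement described above can be sketched as a toy buffer manager. This is a minimal illustration, not how any particular DBMS implements its cache: the class name `BufferManager`, the dictionary-based "disk", and passing the block identifier explicitly to `read` and `write` are all assumptions made for the example.

```python
from collections import OrderedDict


class BufferManager:
    """Toy buffer pool (hypothetical): disk and buffers map block_id -> {item: value}."""

    def __init__(self, disk, capacity):
        self.disk = disk              # simulated nonvolatile storage
        self.capacity = capacity      # maximum number of buffer blocks
        self.buffers = OrderedDict()  # cache directory, least recently used first
        self.dirty = set()            # blocks modified since they were input

    def _input(self, block_id):
        """input(B): bring block B into a buffer, evicting the LRU block if full."""
        if block_id in self.buffers:
            self.buffers.move_to_end(block_id)  # mark as most recently used
            return
        if len(self.buffers) >= self.capacity:
            victim = next(iter(self.buffers))   # least recently used block
            self.output(victim)
            del self.buffers[victim]
        self.buffers[block_id] = dict(self.disk[block_id])

    def output(self, block_id):
        """output(B): copy buffer block B back to disk if it was modified."""
        if block_id in self.dirty:
            self.disk[block_id] = dict(self.buffers[block_id])
            self.dirty.discard(block_id)

    def read(self, block_id, item):
        """read(X): return the buffered value of item X (input the block first)."""
        self._input(block_id)
        return self.buffers[block_id][item]

    def write(self, block_id, item, value):
        """write(X): update item X in the buffer block only, not on disk."""
        self._input(block_id)
        self.buffers[block_id][item] = value
        self.dirty.add(block_id)


disk = {"A": {"X": 1}, "B": {"Y": 2}, "C": {"Z": 3}}
pool = BufferManager(disk, capacity=2)
x1 = pool.read("A", "X")           # input(A), then read(X)
pool.write("B", "Y", x1 + 10)      # input(B), then write(Y)
y_on_disk_before = disk["B"]["Y"]  # disk copy is unchanged so far
pool.read("A", "X")                # touch A, so B becomes least recently used
pool.read("C", "Z")                # pool is full: B is evicted via output(B)
```

Note that write(Y) changes only the buffer block; the new value reaches the disk when the block is later output, which is exactly the window in which a crash can lose or expose updates.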
Slide -8
Oracle SGA
Large pool
– To support occasional processing that requires large chunks, e.g. backups and
parallel processing.
Slide -10
DBMS caching
When performing an action on an item, the DBMS first checks
the DBMS cache (in-memory buffers) to determine if the disk
page containing the item is in the cache.
Slide -11
DBMS caching
Slide -12
Types Of Failure
Network failure
Instance failure
Media failure
Slide -22
Recovery for Non-catastrophic failures
Slide -23
Recovery Concepts: Write-Ahead Logging (WAL)
Write-ahead logging
– Used with in-place updating.
– The BFIM (before image, the old value) is recorded in a log entry, and the
entry is flushed (force-written) to disk before the AFIM (after image, the
new value) replaces the BFIM.
A REDO-type log entry includes the AFIM (new value), so it can redo the
update and set the database item to its new value.
– REDO should be idempotent, i.e., executing it over and over is equivalent
to executing it just once.
An UNDO-type log entry includes the BFIM (old value), so it can undo the
update and set the database item back to its old value.
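The write-ahead rule and the idempotence of REDO and UNDO can be illustrated with a small sketch. Everything here is hypothetical scaffolding for the example: the class `WalStore`, its method names, and the tuple layout `(txn, item, bfim, afim)` are assumptions, not part of any real DBMS.

```python
class WalStore:
    """Toy store (hypothetical) that refuses in-place updates until the log is flushed."""

    def __init__(self, initial):
        self.disk_db = dict(initial)  # database items on disk
        self.disk_log = []            # log records already force-written to disk
        self.log_buffer = []          # log records still only in memory

    def log_update(self, txn, item, afim):
        """Record BFIM and AFIM in the in-memory log buffer."""
        bfim = self.disk_db[item]
        self.log_buffer.append((txn, item, bfim, afim))

    def flush_log(self):
        """Force-write the buffered log records to the log on disk."""
        self.disk_log.extend(self.log_buffer)
        self.log_buffer.clear()

    def apply_in_place(self, txn, item, afim):
        """Overwrite the BFIM with the AFIM, only if the log entry is on disk."""
        logged = any(r[0] == txn and r[1] == item and r[3] == afim
                     for r in self.disk_log)
        if not logged:
            raise RuntimeError("WAL violated: log record not yet flushed")
        self.disk_db[item] = afim


def redo(db, record):
    """Set the item to its AFIM; repeating this has no further effect."""
    _, item, _, afim = record
    db[item] = afim


def undo(db, record):
    """Set the item back to its BFIM; repeating this has no further effect."""
    _, item, bfim, _ = record
    db[item] = bfim


store = WalStore({"X": 5})
store.log_update("T1", "X", 9)
try:
    store.apply_in_place("T1", "X", 9)  # log entry not on disk yet
    wal_enforced = False
except RuntimeError:
    wal_enforced = True
store.flush_log()
store.apply_in_place("T1", "X", 9)      # now allowed

record = store.disk_log[0]
redo(store.disk_db, record)
redo(store.disk_db, record)             # second redo changes nothing
after_redo = store.disk_db["X"]
undo(store.disk_db, record)
undo(store.disk_db, record)             # second undo changes nothing
after_undo = store.disk_db["X"]
```

Because the log entries store values rather than deltas, applying one twice gives the same result as applying it once, which is what makes recovery safe to restart after a crash during recovery.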
Slide -24
Recovery Concepts: Steal/No-Steal, Force/No-Force
Steal
– A cache page updated by a transaction can be written to disk before the
transaction commits.
– The buffer manager “steals” the page's frame, for example when it needs to
free up buffer frames for another transaction, so the most recently updated
page is written to disk even before the transaction has committed.
– Does not require a large buffer memory to store updated pages.
No-steal
– A cache page updated by a transaction cannot be written to disk before the
transaction commits (e.g. the pin bit is set to 1).
Force
– All pages updated by a transaction are immediately written to disk when
the transaction commits.
No-force
– Pages updated by a transaction are not necessarily written to disk
immediately when the transaction commits.
– A deferred update approach.
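The consequence of each policy combination for recovery can be summarised in a few lines. This is a sketch of the standard reasoning (steal implies UNDO may be needed, no-force implies REDO may be needed); the function name `recovery_needs` is made up for the example.

```python
def recovery_needs(steal: bool, force: bool) -> str:
    """Return which recovery actions a buffer policy combination can require."""
    # Steal: pages of uncommitted transactions may already be on disk,
    # so their effects may have to be undone.
    needs_undo = steal
    # No-force: pages of committed transactions may not be on disk yet,
    # so their effects may have to be redone.
    needs_redo = not force
    undo_part = "UNDO" if needs_undo else "NO-UNDO"
    redo_part = "REDO" if needs_redo else "NO-REDO"
    return f"{undo_part}/{redo_part}"


for steal in (True, False):
    for force in (True, False):
        policy = f"{'steal' if steal else 'no-steal'}/{'force' if force else 'no-force'}"
        print(f"{policy:>18} -> {recovery_needs(steal, force)}")
```

Steal/no-force is the most demanding combination for the recovery manager (UNDO/REDO), but it places the fewest constraints on the buffer manager.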
Slide -25
Exercise:
Why do most DBMS use a steal/no-force strategy?
Slide -26
Recovery Concepts: Checkpoints
Slide -27
Checkpoints in System Log
Slide -28
Checkpoints (Cont.)
Transactions that committed before a checkpoint do not need to have
their WRITE operations redone after a system failure.
1. Scan backwards from the end of the log to find the most recent
<checkpoint> record.
2. Continue scanning backwards until a <Ti start> record is found.
3. Only the part of the log from this start record onward needs to be
considered. The earlier part of the log can be ignored during recovery
and can be erased whenever desired.
4. Recovery in the case of immediate modification:
• For all transactions (starting from Ti or later) with no <Ti commit>,
execute undo(Ti).
• Scanning forward in the log, for all transactions starting from Ti or
later with a <Ti commit>, execute redo(Ti).
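The four steps above can be sketched as a function that decides which transactions to undo and which to redo. The log encoding (tuples such as `("start", "T1")` and `("checkpoint",)`) and the function name are assumptions for the example, and the sketch follows the slide's simplified scheme in which transactions run one after another.

```python
def classify_after_checkpoint(log):
    """Return (undo_list, redo_list) using the checkpoint scheme from the steps above."""
    # Step 1: scan backwards for the most recent checkpoint record.
    cp = next((i for i in range(len(log) - 1, -1, -1)
               if log[i][0] == "checkpoint"), None)
    if cp is None:
        start = 0  # no checkpoint: the whole log must be considered
    else:
        # Step 2: keep scanning backwards until a <Ti start> record is found.
        start = cp
        while start > 0 and log[start][0] != "start":
            start -= 1
    # Step 3: only the log from that start record onward matters.
    tail = log[start:]
    started = [rec[1] for rec in tail if rec[0] == "start"]
    committed = {rec[1] for rec in tail if rec[0] == "commit"}
    # Step 4: undo transactions without a commit record, redo those with one.
    undo_list = [t for t in started if t not in committed]
    redo_list = [t for t in started if t in committed]
    return undo_list, redo_list


log = [
    ("start", "T1"), ("write", "T1", "A", 1, 2), ("commit", "T1"),
    ("start", "T2"), ("write", "T2", "B", 5, 6),
    ("checkpoint",),
    ("commit", "T2"),
    ("start", "T3"), ("write", "T3", "C", 7, 8),
]
undo_list, redo_list = classify_after_checkpoint(log)
```

In this made-up log, T1 committed before the checkpoint and is ignored, T2 was active at the checkpoint and committed afterwards so it is redone, and T3 never committed so it is undone.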
Slide -30
Example Of Checkpoints Recovery Using Immediate Update
[Figure: timeline showing transactions T1 to T4 relative to the checkpoint time Tc and the failure time Tf.]
Slide -31
Recovery Concepts: Transaction Rollback
Slide -33
Example: Illustrating cascading rollback
(a) The read and write operations of three transactions.
(b) System log at point of crash
(c) Operations before the crash.
1. What recovery is necessary if the system crashes before
[read_item, T3, A]?
Slide -34
Example: Illustrating cascading rollback
(a) The read and write operations of three transactions.
(b) System log at point of crash
(c) Operations before the crash.
2. What recovery is necessary if the system crashes before
[write_item, T2, D, 25, 26]?
Slide -35
Two techniques (approaches) in Recovery:
Log-based recovery.
– Deferred update – RDU (No-Undo/Redo)
– Immediate update – RIU (Undo/Redo)
Shadow-paging (NO-UNDO/NO-REDO)
Slide -37
Recovery based on deferred update techniques - RDU
This is suitable for short transactions that change only a few items
and therefore do not take up excessive buffer space.
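A minimal sketch of recovery under deferred update (NO-UNDO/REDO) follows. Since no update reaches the database before commit, nothing needs to be undone; recovery only re-applies the new values logged by committed transactions. The log encoding, the function name `recover_deferred`, and the account values are illustrative assumptions.

```python
def recover_deferred(db, log):
    """Redo the writes of committed transactions; ignore everything else."""
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    for rec in log:  # forward scan keeps the original write order
        if rec[0] == "write" and rec[1] in committed:
            _, _, item, new_value = rec
            db[item] = new_value  # only the new value is logged and needed
    return db


# Disk state at the crash: no deferred write has been applied yet.
db_at_crash = {"A": 1000, "B": 2000, "C": 700}
log = [
    ("start", "T0"),
    ("write", "T0", "A", 950),
    ("write", "T0", "B", 2050),
    ("commit", "T0"),
    ("start", "T1"),
    ("write", "T1", "C", 600),  # T1 never committed
]
recovered = recover_deferred(dict(db_at_crash), log)
```

T1's write to C is simply discarded, because under deferred update it was never applied to the database in the first place.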
Slide -39
Deferred Update and Recovery
Silberschatz Fig. 17.4: example of transaction logs
Slide -40
Recovery based on immediate update techniques - RIU
Using the write-ahead logging (WAL) protocol, the log (on disk) records each
update operation before the update is applied to the database.
If all physical updates of the database occur prior to the commit point:
– Use the UNDO/NO-REDO algorithm.
If only some physical updates of the database occur prior to the commit
point:
– Use the UNDO/REDO algorithm. Updates not yet written to disk at the
commit point are still in main-memory buffers, so the log is force-written
before commit to make REDO possible.
Slide -41
Immediate Database Modification
Output of updated blocks to disk can take place at any time before or
after transaction commit.
Order in which blocks are output to disk can be different from the
order in which they are written.
Slide -42
Immediate Database Modification (Cont.)
Recovery procedure has two operations instead of one:
– undo(Ti) restores the value of all data items updated by Ti to their
old values, going backwards from the last log record for Ti.
– redo(Ti) sets the value of all data items updated by Ti to the new
values, going forward from the first log record for Ti
Silberschatz Fig. 17.7
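The direction of each scan matters when a transaction updates the same item more than once. The sketch below assumes a log of `("write", txn, item, old, new)` tuples; the names `undo_txn` and `redo_txn` are hypothetical.

```python
def undo_txn(db, log, txn):
    """Restore old values, going backwards from the last log record of txn."""
    for rec in reversed(log):
        if rec[0] == "write" and rec[1] == txn:
            _, _, item, old, _ = rec
            db[item] = old


def redo_txn(db, log, txn):
    """Apply new values, going forward from the first log record of txn."""
    for rec in log:
        if rec[0] == "write" and rec[1] == txn:
            _, _, item, _, new = rec
            db[item] = new


# T1 updates item A twice: 10 -> 20 -> 30.
log = [
    ("start", "T1"),
    ("write", "T1", "A", 10, 20),
    ("write", "T1", "A", 20, 30),
]

db_undo = {"A": 30}
undo_txn(db_undo, log, "T1")  # backwards: 30 -> 20 -> 10

db_redo = {"A": 10}
redo_txn(db_redo, log, "T1")  # forwards: 10 -> 20 -> 30
```

Scanning in the wrong direction would leave A at the intermediate value 20 in both cases, which is why undo goes backwards and redo goes forwards.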
Slide -44
Summary Of Recovery On Concurrent Transactions
Slide -45
Exercise
What recovery is necessary for these concurrent transactions using deferred update?
Slide -46
Shadow Paging
Slide -47
Shadow Paging
Slide -49
An example of shadow paging.
Slide -50
The ARIES Recovery Algorithm
Slide -51
ARIES Recovery Algorithm
Redo pass:
– Repeats history, redoing all actions from RedoLSN.
– RecLSN and PageLSNs are used to avoid redoing actions already
reflected on the page.
Undo pass:
– Rolls back all incomplete transactions.
– Transactions whose abort was completed earlier are not undone.
– Key idea: there is no need to undo these transactions, because their
earlier undo actions were logged and are redone as required.
Silberschatz 16.8.6.2
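The way RecLSN and PageLSN let the redo pass skip work can be sketched as follows. This is a simplified illustration of the checks only, not a full ARIES implementation: the `Page` class, the in-memory "dirty page table" dictionary, and the `(lsn, page_id, new_value)` record layout are assumptions for the example.

```python
from dataclasses import dataclass


@dataclass
class Page:
    value: int
    page_lsn: int  # LSN of the last log record whose effect is on this page


def redo_pass(log, pages, dirty_page_table, redo_lsn):
    """Repeat history from redo_lsn; return the LSNs that were actually redone."""
    redone = []
    for lsn, page_id, new_value in log:  # records in increasing LSN order
        if lsn < redo_lsn:
            continue  # before the redo starting point
        rec_lsn = dirty_page_table.get(page_id)
        if rec_lsn is None or lsn < rec_lsn:
            continue  # page was clean, or this update was already flushed
        page = pages[page_id]
        if page.page_lsn >= lsn:
            continue  # effect is already reflected on the page
        page.value = new_value
        page.page_lsn = lsn
        redone.append(lsn)
    return redone


pages = {
    "P1": Page(value=10, page_lsn=2),  # on disk with the LSN 2 update applied
    "P2": Page(value=0, page_lsn=0),   # never flushed after its update
    "P3": Page(value=5, page_lsn=1),   # clean page, not in the dirty page table
}
dirty_page_table = {"P1": 2, "P2": 3}        # page_id -> RecLSN
redo_lsn = min(dirty_page_table.values())    # earliest RecLSN
log = [(1, "P3", 5), (2, "P1", 10), (3, "P2", 7), (4, "P1", 11)]

first_run = redo_pass(log, pages, dirty_page_table, redo_lsn)
second_run = redo_pass(log, pages, dirty_page_table, redo_lsn)
```

Running the pass a second time redoes nothing, because every page now carries a PageLSN at least as large as the records that touch it.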
Slide -52
Oracle Recovery Structure
Controlfiles contain pointers to datafiles, dictating where datafiles
should be in relation to redo log entries. Controlfiles are used during
database mount.
– This file is typically multiplexed and stored as control01.ctl,
control02.ctl and control03.ctl.
Redo logs and archive logs consist of records of all transactions made
to a database.
– The redo log is multiplexed in three groups and stored as redo01.log,
redo02.log and redo03.log.
– Restoring a recovered backup is a simple process of applying
redo log entries to the datafile until the datafile “catches up” to the
time indicated by the controlfile.
Undo log segments are used for transaction rollback (before commit)
– stored in the undo tablespace datafile undotbs01.dbf
Slide -54
LGWR Process
Each redo log group consists of a redo log file (member) and its multiplexed
copies.
LGWR writes redo records to all members of a redo log group until the file is
filled or a log switch is requested, in which case it writes to the next group.
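The circular use of redo log groups can be pictured with a toy model. This is only an illustration of the behaviour described above, not Oracle's implementation: the class `RedoLogGroups`, counting capacity in records, and clearing a group on reuse are all simplifying assumptions.

```python
class RedoLogGroups:
    """Toy model (hypothetical): groups of identical members, written in a cycle."""

    def __init__(self, n_groups, members_per_group, capacity):
        self.groups = [[[] for _ in range(members_per_group)]
                       for _ in range(n_groups)]
        self.capacity = capacity  # records per member before it counts as full
        self.current = 0          # index of the group being written

    def log_switch(self):
        """Move to the next group, wrapping around, and reuse its members."""
        self.current = (self.current + 1) % len(self.groups)
        for member in self.groups[self.current]:
            member.clear()  # old content is overwritten when the group is reused

    def write(self, record):
        """Write a record to every member of the current group."""
        if len(self.groups[self.current][0]) >= self.capacity:
            self.log_switch()
        for member in self.groups[self.current]:
            member.append(record)


redo = RedoLogGroups(n_groups=3, members_per_group=2, capacity=2)
for i in range(1, 8):
    redo.write(f"r{i}")  # r1..r7 fill groups 0, 1, 2 and wrap back to group 0
```

After the seventh record the writer has wrapped around to the first group, whose earlier records r1 and r2 are gone unless they were copied elsewhere first.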
Source: Oracle Database 10g: Administration Workshop I
Slide -55
ARCn Process
If NOARCHIVELOG mode is set (i.e. the default), the redo log file will be
overwritten. To check: SQL> ARCHIVE LOG LIST;
The distance between the checkpoint position and the end of the redo log
group can never be more than 90% of the size of the smallest redo log
group.
Source: Oracle Database 10g: Administration Workshop I
Slide -61
Using Advisor (MTTR – Mean Time To Recovery)
Slide -65
UNDO DATA vs REDO DATA
COMMIT statement
– Makes the pending changes of the current transaction in the current
session permanent
– Permanently stores changes to a database
ROLLBACK statement
– Removes the pending changes of the current transaction in the current
session
– Reverses only changes that have not yet been committed; changes already
made permanent with COMMIT cannot be rolled back
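The difference between pending and committed changes can be tried out with Python's built-in sqlite3 module. SQLite stands in for Oracle here purely as an assumption for the sake of a runnable example; the table and values are made up, and only the COMMIT and ROLLBACK semantics are the point.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES ('A', 100)")
conn.commit()  # the insert is now permanent


def balance():
    """Read the current balance of account A."""
    return conn.execute(
        "SELECT balance FROM accounts WHERE name = 'A'").fetchone()[0]


# A pending (uncommitted) change is removed by ROLLBACK.
conn.execute("UPDATE accounts SET balance = 50 WHERE name = 'A'")
conn.rollback()
after_rollback = balance()

# Once committed, a change is not affected by a later ROLLBACK.
conn.execute("UPDATE accounts SET balance = 70 WHERE name = 'A'")
conn.commit()
conn.rollback()  # nothing is pending, so nothing is reversed
after_commit = balance()

conn.close()
```

The first update disappears because it was still pending; the second survives because the rollback arrives after the commit.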
Slide -67
Transaction Prior To Commit
COMMIT statement
– Makes the pending changes of the current transaction in the current
session permanent
– Permanently stores changes to a database
Source: Oracle 10g DB Administrator: Implementation and Administration
Slide -69
Transaction Rollback
Slide -72
Oracle Backup Structure
Slide -73
What is Oracle Backup?
Applying redo
Slide 78
Use of Undo Data
Slide 79
Use of Undo Data
Slide 80
– Redo log changes are applied to data files until the current online log
is reached and the most recent transactions have been re-entered.
– Undo is used to roll back uncommitted changes.
Slide 81
Flashback Technology
Slide -82
Flashback Drop and Recycle Bin
Slide -84
Recycle Bin
– Like a data dictionary table without a tablespace.
Slide -85
Flashback Query Data
Slide -86
Methods of Backup and Recovery
Slide -88
Cold Backup
Since the database is shut down, a cold backup provides a consistent snapshot
of the database to be copied.
Slide -94
Database Availability (24x7x365)
Slide -97
Acceptable Loss Upon Failure
Slide -98
Available Equipment
Slide -99
Planning for Potential Disaster and Recovery
Slide -100
Physical Standby Database
Silberschatz 16.9
Slide -104
Four main issues concerning remote backup
1. Detection of failure:
– The backup site must detect when the primary site has failed.
– To distinguish primary site failure from link failure, maintain several
communication links between the primary and the remote backup.
2. Transfer of control:
– To take over control, the backup site first performs recovery using its
copy of the database and all the log records it has received from the
primary. Thus, completed transactions are redone and incomplete
transactions are rolled back.
– When the backup site takes over processing, it becomes the new
primary.
– To transfer control back to the old primary when it recovers, the old
primary must receive redo logs from the old backup and apply all updates
locally.
Slide -105
Remote Backup Systems (Cont.)
Slide -106
Remote Backup Systems (Cont.)
4. Time to commit:
– Ensure durability of updates by delaying transaction commit until the update
is logged at the backup; this delay can be avoided by permitting lower degrees
of durability.
Slide -107