Chapter 16: Concurrency Control


Lock-Based Protocols
Timestamp-Based Protocols
Validation-Based Protocols
Multiple Granularity
Lock-Based Protocols
A lock is a mechanism to control concurrent access to a data item
Data items can be locked in two modes:
1. exclusive (X) mode. Data item can be both read as well as
written. X-lock is requested using lock-X instruction.
2. shared (S) mode. Data item can only be read. S-lock is
requested using lock-S instruction.
Lock requests are made to concurrency-control manager. Transaction can
proceed only after request is granted.
Lock-Based Protocols (Cont.)
Lock-compatibility matrix:

        S       X
  S     true    false
  X     false   false

A transaction may be granted a lock on an item if the requested lock is
compatible with locks already held on the item by other transactions
Any number of transactions can hold shared locks on an item,
but if any transaction holds an exclusive lock on the item, no other
transaction may hold any lock on the item.
If a lock cannot be granted, the requesting transaction is made to wait till
all incompatible locks held by other transactions have been released.
The lock is then granted.
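A minimal sketch of this granting rule in Python (the names `COMPATIBLE` and `can_grant` are illustrative, not from the slides):

```python
# Shared (S) locks are mutually compatible; exclusive (X) conflicts with all.
COMPATIBLE = {
    ("S", "S"): True,
    ("S", "X"): False,
    ("X", "S"): False,
    ("X", "X"): False,
}

def can_grant(requested_mode, held_modes):
    """A lock is granted only if it is compatible with every lock
    already held on the item by other transactions."""
    return all(COMPATIBLE[(held, requested_mode)] for held in held_modes)
```

With no locks held, any request is grantable; a single S lock blocks X but not further S requests.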
Lock-Based Protocols (Cont.)
Example of a transaction performing locking:

  T2: lock-S(A);
      read(A);
      unlock(A);
      lock-S(B);
      read(B);
      unlock(B);
      display(A+B)

Locking as above is not sufficient to guarantee serializability: if A and B
get updated in between the reads of A and B, the displayed sum would be
wrong.
A locking protocol is a set of rules followed by all transactions while
requesting and releasing locks. Locking protocols restrict the set of
possible schedules.
Pitfalls of Lock-Based Protocols
Consider the partial schedule:

  T3              T4
  lock-X(B)
  read(B)
  write(B)
                  lock-S(A)
                  read(A)
                  lock-S(B)
  lock-X(A)

Neither T3 nor T4 can make progress: executing lock-S(B) causes T4
to wait for T3 to release its lock on B, while executing lock-X(A) causes
T3 to wait for T4 to release its lock on A.
Such a situation is called a deadlock.
  To handle a deadlock one of T3 or T4 must be rolled back
  and its locks released.
Pitfalls of Lock-Based Protocols (Cont.)
The potential for deadlock exists in most locking protocols. Deadlocks
are a necessary evil.
Starvation is also possible if concurrency control manager is badly
designed. For example:
A transaction may be waiting for an X-lock on an item, while a
sequence of other transactions request and are granted an S-lock
on the same item.
The same transaction is repeatedly rolled back due to deadlocks.
Concurrency control manager can be designed to prevent starvation.
The Two-Phase Locking Protocol
This is a protocol which ensures conflict-serializable schedules.
Phase 1: Growing Phase
transaction may obtain locks
transaction may not release locks
Phase 2: Shrinking Phase
transaction may release locks
transaction may not obtain locks
The protocol assures serializability. It can be proved that the
transactions can be serialized in the order of their lock points (i.e.
the point where a transaction acquired its final lock).
The Two-Phase Locking Protocol (Cont.)
Two-phase locking does not ensure freedom from deadlocks
Cascading roll-back is possible under two-phase locking. To avoid
this, follow a modified protocol called strict two-phase locking. Here
a transaction must hold all its exclusive locks till it commits/aborts.
Rigorous two-phase locking is even stricter: here all locks are held
till commit/abort. In this protocol transactions can be serialized in the
order in which they commit.
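The growing/shrinking rule above can be sketched as simple bookkeeping on one transaction (class and method names are illustrative):

```python
class TwoPhaseTransaction:
    """Tracks the two phases: locks may only be acquired before
    the first unlock (the lock point)."""
    def __init__(self):
        self.locks = set()
        self.shrinking = False  # becomes True after the first unlock

    def lock(self, item):
        if self.shrinking:
            raise RuntimeError("2PL violation: no new locks in shrinking phase")
        self.locks.add(item)

    def unlock(self, item):
        self.shrinking = True   # the lock point has passed
        self.locks.discard(item)
```

Once any unlock has been issued, a further lock request raises an error rather than silently breaking the protocol.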
The Two-Phase Locking Protocol (Cont.)
There can be conflict serializable schedules that cannot be obtained if
two-phase locking is used.
However, in the absence of extra information (e.g., ordering of access
to data), two-phase locking is needed for conflict serializability in the
following sense:
Given a transaction Ti that does not follow two-phase locking, we can
find a transaction Tj that uses two-phase locking, and a schedule for Ti
and Tj that is not conflict serializable.
Lock Conversions
Two-phase locking with lock conversions:
First Phase:
can acquire a lock-S on item
can acquire a lock-X on item
can convert a lock-S to a lock-X (upgrade)
Second Phase:
can release a lock-S
can release a lock-X
can convert a lock-X to a lock-S (downgrade)
This protocol assures serializability. But it still relies on the programmer to
insert the various locking instructions.
Implementation of Locking
A lock manager can be implemented as a separate process to which
transactions send lock and unlock requests
The lock manager replies to a lock request by sending a lock-grant
message (or a message asking the transaction to roll back, in case of
a deadlock)
The requesting transaction waits until its request is answered
The lock manager maintains a data-structure called a lock table to
record granted locks and pending requests
The lock table is usually implemented as an in-memory hash table
indexed on the name of the data item being locked
Lock Table
Black rectangles indicate granted locks,
white ones indicate waiting requests
Lock table also records the type of lock
granted or requested
New request is added to the end of the
queue of requests for the data item, and
granted if it is compatible with all earlier
locks
Unlock requests result in the request
being deleted, and later requests are
checked to see if they can now be
granted
If transaction aborts, all waiting or
granted requests of the transaction are
deleted
lock manager may keep a list of
locks held by each transaction, to
implement this efficiently
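The lock table described above can be sketched as a hash table of per-item request queues (a toy version; the names and list layout are illustrative, and only S-S is treated as compatible):

```python
from collections import defaultdict

COMPAT = {("S", "S")}  # the only compatible pair of modes

# item -> list of [txn, mode, granted]; new requests go to the end
lock_table = defaultdict(list)

def request(txn, item, mode):
    queue = lock_table[item]
    # granted iff compatible with all earlier requests in the queue
    granted = all((r[1], mode) in COMPAT for r in queue)
    queue.append([txn, mode, granted])
    return granted

def release(txn, item):
    """Delete the transaction's requests, then re-check later
    requests to see if they can now be granted."""
    queue = lock_table[item]
    queue[:] = [r for r in queue if r[0] != txn]
    granted_modes = []
    for r in queue:
        if not r[2]:
            r[2] = all((g, r[1]) in COMPAT for g in granted_modes)
        if r[2]:
            granted_modes.append(r[1])
```

Releasing both shared locks on an item lets a waiting exclusive request at the head of the queue be granted.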
Graph-Based Protocols
Graph-based protocols are an alternative to two-phase locking
Impose a partial ordering → on the set D = {d1, d2, ..., dh} of all data
items.
  If di → dj, then any transaction accessing both di and dj must
  access di before accessing dj.
Implies that the set D may now be viewed as a directed acyclic
graph, called a database graph.
The tree-protocol is a simple kind of graph protocol.
Tree Protocol
1. Only exclusive locks are allowed.
2. The first lock by Ti may be on any data item. Subsequently, a data item Q
   can be locked by Ti only if the parent of Q is currently locked by Ti.
3. Data items may be unlocked at any time.
4. A data item that has been locked and unlocked by Ti cannot
   subsequently be relocked by Ti.
Graph-Based Protocols (Cont.)
The tree protocol ensures conflict serializability as well as freedom from
deadlock.
Unlocking may occur earlier in the tree-locking protocol than in the two-
phase locking protocol.
shorter waiting times, and increase in concurrency
protocol is deadlock-free, no rollbacks are required
Drawbacks
Protocol does not guarantee recoverability or cascade freedom
Need to introduce commit dependencies to ensure recoverability
Transactions may have to lock data items that they do not access.
increased locking overhead, and additional waiting time
potential decrease in concurrency
Schedules not possible under two-phase locking are possible under tree
protocol, and vice versa.
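The tree-protocol rules can be sketched as a lock-request check over a parent map (helper names are illustrative; only exclusive locks exist, per rule 1):

```python
def try_lock(item, parent, held, released):
    """Tree-protocol check. held: items the transaction currently locks;
    released: items it has locked and unlocked before."""
    if item in released:
        return False                 # rule 4: no relocking
    if not held:
        return True                  # rule 2: the first lock may be anywhere
    return parent.get(item) in held  # rule 2: parent must currently be held

def unlock(item, held, released):
    held.discard(item)               # rule 3: may unlock at any time
    released.add(item)               # remember for rule 4
```

A transaction holding B may lock B's child D, but not C (whose parent it does not hold), and may never relock B after releasing it.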
Multiple Granularity
Allow data items to be of various sizes and define a hierarchy of data
granularities, where the small granularities are nested within larger
ones
Can be represented graphically as a tree (but don't confuse with tree-
locking protocol)
When a transaction locks a node in the tree explicitly, it implicitly locks
all the node's descendants in the same mode.
Granularity of locking (level in tree where locking is done):
fine granularity (lower in tree): high concurrency, high locking
overhead
coarse granularity (higher in tree): low locking overhead, low
concurrency
Example of Granularity Hierarchy
The levels, starting from the coarsest (top) level are
database
area
file
record
Intention Lock Modes
In addition to S and X lock modes, there are three additional lock
modes with multiple granularity:
intention-shared (IS): indicates explicit locking at a lower level of
the tree but only with shared locks.
intention-exclusive (IX): indicates explicit locking at a lower level
with exclusive or shared locks
shared and intention-exclusive (SIX): the subtree rooted by that
node is locked explicitly in shared mode and explicit locking is
being done at a lower level with exclusive-mode locks.
Intention locks allow a higher-level node to be locked in S or X mode
without having to check all descendant nodes.
Compatibility Matrix with
Intention Lock Modes
The compatibility matrix for all lock modes is:

         IS    IX    S     SIX   X
  IS     ✓     ✓     ✓     ✓     ✗
  IX     ✓     ✓     ✗     ✗     ✗
  S      ✓     ✗     ✓     ✗     ✗
  SIX    ✓     ✗     ✗     ✗     ✗
  X      ✗     ✗     ✗     ✗     ✗

Multiple Granularity Locking Scheme
Transaction Ti can lock a node Q, using the following rules:
1. The lock compatibility matrix must be observed.
2. The root of the tree must be locked first, and may be locked in any
   mode.
3. A node Q can be locked by Ti in S or IS mode only if the parent of Q
   is currently locked by Ti in either IX or IS mode.
4. A node Q can be locked by Ti in X, SIX, or IX mode only if the parent
   of Q is currently locked by Ti in either IX or SIX mode.
5. Ti can lock a node only if it has not previously unlocked any node
   (that is, Ti is two-phase).
6. Ti can unlock a node Q only if none of the children of Q are currently
   locked by Ti.
Observe that locks are acquired in root-to-leaf order, whereas they are
released in leaf-to-root order.
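Rules 2–4 above reduce to a check on the parent's mode, which can be sketched as a small table lookup (names are illustrative):

```python
# For each requested mode, the modes the parent must be held in (rules 3-4).
REQUIRED_PARENT = {
    "S":   {"IS", "IX"},
    "IS":  {"IS", "IX"},
    "X":   {"IX", "SIX"},
    "SIX": {"IX", "SIX"},
    "IX":  {"IX", "SIX"},
}

def parent_mode_ok(requested, parent_mode, is_root=False):
    if is_root:
        return True   # rule 2: the root may be locked in any mode
    return parent_mode in REQUIRED_PARENT[requested]
```

So a transaction wanting X on a record must hold IX or SIX on the containing file, while IS on the file suffices for an S lock on the record.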
Timestamp-Based Protocols
Each transaction is issued a timestamp when it enters the system. If an old
transaction Ti has time-stamp TS(Ti), a new transaction Tj is assigned time-
stamp TS(Tj) such that TS(Ti) < TS(Tj).
The protocol manages concurrent execution such that the time-stamps
determine the serializability order.
In order to assure such behavior, the protocol maintains for each data Q two
timestamp values:
W-timestamp(Q) is the largest time-stamp of any transaction that
executed write(Q) successfully.
R-timestamp(Q) is the largest time-stamp of any transaction that
executed read(Q) successfully.
Timestamp-Based Protocols (Cont.)
The timestamp ordering protocol ensures that any conflicting read
and write operations are executed in timestamp order.
Suppose a transaction Ti issues a read(Q):
1. If TS(Ti) < W-timestamp(Q), then Ti needs to read a value of Q
   that was already overwritten.
     Hence, the read operation is rejected, and Ti is rolled back.
2. If TS(Ti) ≥ W-timestamp(Q), then the read operation is executed,
   and R-timestamp(Q) is set to max(R-timestamp(Q), TS(Ti)).
Timestamp-Based Protocols (Cont.)
Suppose that transaction Ti issues write(Q):
1. If TS(Ti) < R-timestamp(Q), then the value of Q that Ti is
   producing was needed previously, and the system assumed that
   that value would never be produced.
     Hence, the write operation is rejected, and Ti is rolled back.
2. If TS(Ti) < W-timestamp(Q), then Ti is attempting to write an
   obsolete value of Q.
     Hence, this write operation is rejected, and Ti is rolled back.
3. Otherwise, the write operation is executed, and W-timestamp(Q)
   is set to TS(Ti).
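The read and write tests above can be sketched directly (here `q` is a dict holding the item's R- and W-timestamps; "rollback" stands for rejecting the operation and rolling the transaction back):

```python
def ts_read(ts, q):
    """Timestamp-ordering read test for a transaction with timestamp ts."""
    if ts < q["W"]:            # the value ts needed was already overwritten
        return "rollback"
    q["R"] = max(q["R"], ts)
    return "ok"

def ts_write(ts, q):
    """Timestamp-ordering write test."""
    if ts < q["R"]:            # a later reader already needed the old value
        return "rollback"
    if ts < q["W"]:            # obsolete write
        return "rollback"
    q["W"] = ts
    return "ok"
```

For example, after transaction 3 writes Q, a read by transaction 2 is rejected because its needed value has been overwritten.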
Example Use of the Protocol
A partial schedule for several data items for transactions with
timestamps 1, 2, 3, 4, 5.

[Figure: a schedule of read and write operations on data items X, Y,
and Z interleaved across T1–T5; the protocol rejects an operation of
two of the transactions, which therefore abort.]
Correctness of Timestamp-Ordering Protocol
The timestamp-ordering protocol guarantees serializability since all the
arcs in the precedence graph are of the form:

  (transaction with smaller timestamp) → (transaction with larger timestamp)

Thus, there will be no cycles in the precedence graph.
The timestamp protocol ensures freedom from deadlock as no transaction
ever waits.
But the schedule may not be cascade-free, and may not even be
recoverable.
Deadlock Handling
Consider the following two transactions:

  T1: write(X)        T2: write(Y)
      write(Y)            write(X)

Schedule with deadlock:

  T1                        T2
  lock-X on X
  write(X)
                            lock-X on Y
                            write(Y)
                            wait for lock-X on X
  wait for lock-X on Y
Deadlock Handling
System is deadlocked if there is a set of transactions such that every
transaction in the set is waiting for another transaction in the set.
Deadlock prevention protocols ensure that the system will never
enter into a deadlock state. Some prevention strategies :
Require that each transaction locks all its data items before it
begins execution (predeclaration).
Impose partial ordering of all data items and require that a
transaction can lock data items only in the order specified by the
partial order (graph-based protocol).
More Deadlock Prevention Strategies
Following schemes use transaction timestamps for the sake of deadlock
prevention alone.
wait-die scheme (non-preemptive)
  older transaction may wait for younger one to release a data item.
  Younger transactions never wait for older ones; they are rolled back
  instead.
  a transaction may die several times before acquiring a needed data
  item
wound-wait scheme (preemptive)
  older transaction wounds (forces rollback of) the younger transaction
  instead of waiting for it. Younger transactions may wait for older
  ones.
  may be fewer rollbacks than in the wait-die scheme.
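The two rules can be sketched as decision functions over the timestamps of the requesting and holding transactions (function names and return strings are illustrative; smaller timestamp means older):

```python
def wait_die(requester_ts, holder_ts):
    """Non-preemptive: older may wait, younger dies (is rolled back)."""
    return "wait" if requester_ts < holder_ts else "die"

def wound_wait(requester_ts, holder_ts):
    """Preemptive: older wounds (rolls back) the younger holder,
    younger waits for the older."""
    return "wound holder" if requester_ts < holder_ts else "wait"
```

Note the symmetry: in both schemes it is always the younger transaction that is rolled back, which is why restarting with the original timestamp avoids starvation.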
Deadlock prevention (Cont.)
Both in wait-die and in wound-wait schemes, a rolled-back
transaction is restarted with its original timestamp. Older transactions
thus have precedence over newer ones, and starvation is hence
avoided.
Timeout-Based Schemes :
a transaction waits for a lock only for a specified amount of time.
After that, the wait times out and the transaction is rolled back.
thus deadlocks are not possible
simple to implement; but starvation is possible. Also difficult to
determine good value of the timeout interval.
Deadlock Detection
Deadlocks can be described as a wait-for graph, which consists of a
pair G = (V, E), where
  V is a set of vertices (all the transactions in the system)
  E is a set of edges; each element is an ordered pair Ti → Tj.
If Ti → Tj is in E, then there is a directed edge from Ti to Tj, implying
that Ti is waiting for Tj to release a data item.
When Ti requests a data item currently being held by Tj, then the edge
Ti → Tj is inserted in the wait-for graph. This edge is removed only when
Tj is no longer holding a data item needed by Ti.
The system is in a deadlock state if and only if the wait-for graph has a
cycle. Must invoke a deadlock-detection algorithm periodically to look
for cycles.
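The periodic check reduces to cycle detection on the wait-for graph; a plain depth-first search sketch (an edge ("Ti", "Tj") means Ti waits for Tj):

```python
def has_cycle(edges):
    """Return True iff the directed wait-for graph contains a cycle."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, []).append(b)
        graph.setdefault(b, [])
    WHITE, GREY, BLACK = 0, 1, 2    # unvisited / on current path / done
    color = {v: WHITE for v in graph}

    def dfs(v):
        color[v] = GREY
        for w in graph[v]:
            # a GREY neighbour means we closed a loop back into the path
            if color[w] == GREY or (color[w] == WHITE and dfs(w)):
                return True
        color[v] = BLACK
        return False

    return any(color[v] == WHITE and dfs(v) for v in graph)
```

A chain T1 → T2 → T3 is deadlock-free, while T1 → T2 → T1 is exactly the T3/T4 situation shown earlier.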
Deadlock Detection (Cont.)
Wait-for graph without a cycle
Wait-for graph with a cycle
Deadlock Recovery
When a deadlock is detected:
  Some transaction will have to be rolled back (made a victim) to break
  the deadlock. Select as victim the transaction that will incur the
  minimum cost.
Rollback -- determine how far to roll back transaction
Total rollback: Abort the transaction and then restart it.
More effective to roll back transaction only as far as necessary
to break deadlock.
Starvation happens if same transaction is always chosen as
victim. Include the number of rollbacks in the cost factor to avoid
starvation
Three classic problems
Problem: two (or more) transactions read and write the same part of the
database. Although each transaction executes correctly on its own, their
operations may interleave in different ways, giving rise to three classic
problems:

  Lost Update
  Uncommitted Dependency
  Inconsistent Analysis
Lost Update problem
  Time   User 1 (Trans A)   User 2 (Trans B)
   1     Retrieve t
   2                        Retrieve t
   3     Update t
   4                        Update t

t: a tuple in a table. Trans A loses an update at time 4:
the update at time 3 is lost (overwritten) at time 4 by B.
Uncommitted Dependency
  Time   User 1 (Trans A)   User 2 (Trans B)
   1                        Update t
   2     Retrieve t
   3                        Rollback

   6                        Update t
   7     Update t
   8                        Rollback

Two problems (times 1-3 and times 6-8): one transaction is allowed to
retrieve (or update) a tuple updated by another transaction that has
not yet committed.
Trans A is dependent at time 2 on an uncommitted change
made by Trans B, which is lost on Rollback.
Inconsistent Analysis
Initially: Acc1 = 40; Acc2 = 50; Acc3 = 30.

  Time   User 1 (Trans A)           User 2 (Trans B)
   1     Retrieve Acc1: Sum = 40
   2     Retrieve Acc2: Sum = 90
   3                                Retrieve Acc3
   4                                Update Acc3: 30 → 20
   5                                Retrieve Acc1
   6                                Update Acc1: 40 → 50
   7                                Commit
   8     Retrieve Acc3: Sum = 110 (not 120)

Trans A sees an inconsistent DB state after B updated the accounts,
and so performs an inconsistent analysis.
Why these problems?
Retrieve: read (R)
Update: write (W)
Interleaving two transactions gives four cases:
  R-R  no problem
  W-W  lost update
  W-R  uncommitted dependency
  R-W  inconsistent analysis
How to prevent such problems?
Use a locking protocol.
Other approaches: serializability, time-stamping, and shadow paging
(see books).
If the risk of interference is low, two-phase locking is the common
approach, although it requires deadlock avoidance.
A lock applies to a tuple:
  exclusive (write; X) or
  shared (read; S).
Lost Update solved
  Time   User 1 (Trans A)               User 2 (Trans B)
   1     Retrieve t (get S-lock on t)
   2                                    Retrieve t (get S-lock on t)
   3     Update t (request X-lock on t)
   4     wait                           Update t (request X-lock on t)
   5     wait                           wait
   6     wait                           wait

No update is lost, but the result is a deadlock.
Uncommitted Dependency solved
  Time   User 1 (Trans A)                  User 2 (Trans B)
   1                                       Update t (get X-lock on t)
   2     Retrieve t (request S-lock on t)
   3     wait
   4     wait
   5     wait                              Commit / Rollback
                                           (releases X-lock on t)
   6     Resume: Retrieve t
         (get S-lock on t)
Chapter 17: Recovery System
Failure Classification
Storage Structure
Recovery and Atomicity
Log-Based Recovery
Shadow Paging
Recovery With Concurrent Transactions

Failure Classification
Transaction failure :
Logical errors: transaction cannot complete due to some internal
error condition
System errors: the database system must terminate an active
transaction due to an error condition (e.g., deadlock)
System crash: a power failure or other hardware or software failure
causes the system to crash.
Fail-stop assumption: non-volatile storage contents are assumed
to not be corrupted by system crash
Database systems have numerous integrity checks to prevent
corruption of disk data
Disk failure: a head crash or similar disk failure destroys all or part of
disk storage
Destruction is assumed to be detectable: disk drives use
checksums to detect failures
Recovery Algorithms
Recovery algorithms are techniques to ensure database consistency
and transaction atomicity and durability despite failures
Focus of this chapter
Recovery algorithms have two parts
1. Actions taken during normal transaction processing to ensure
enough information exists to recover from failures
2. Actions taken after a failure to recover the database contents to a
state that ensures atomicity, consistency and durability
Storage Structure
Volatile storage:
does not survive system crashes
examples: main memory, cache memory
Nonvolatile storage:
survives system crashes
examples: disk, tape, flash memory,
non-volatile (battery backed up) RAM
Stable storage:
a mythical form of storage that survives all failures
approximated by maintaining multiple copies on distinct
nonvolatile media
Stable-Storage Implementation
Maintain multiple copies of each block on separate disks
copies can be at remote sites to protect against disasters such as
fire or flooding.
Failure during data transfer can still result in inconsistent copies: Block
transfer can result in
Successful completion
Partial failure: destination block has incorrect information
Total failure: destination block was never updated
Protecting storage media from failure during data transfer (one
solution):
Execute output operation as follows (assuming two copies of each
block):
1. Write the information onto the first physical block.
2. When the first write successfully completes, write the same
information onto the second physical block.
3. The output is completed only after the second write
successfully completes.
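The two-copy output rule above can be sketched as follows (in-memory dictionaries stand in for the two physical disks; names are illustrative):

```python
# The second write starts only after the first completes, so a crash
# can leave at most one copy mid-write.
disk1, disk2 = {}, {}

def stable_write(block_id, data):
    disk1[block_id] = data      # 1. write the first physical block
    # (a real system waits for the first write to be acknowledged here)
    disk2[block_id] = data      # 2. then write the second copy
    return "output complete"    # 3. complete only after both writes succeed
```

After a successful output, the two copies are identical, which is what the recovery comparison step relies on.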
Stable-Storage Implementation (Cont.)
Protecting storage media from failure during data transfer (cont.):
Copies of a block may differ due to failure during output operation. To
recover from failure:
1. First find inconsistent blocks:
1. Expensive solution: Compare the two copies of every disk block.
2. Better solution:
Record in-progress disk writes on non-volatile storage (Non-
volatile RAM or special area of disk).
Use this information during recovery to find blocks that may be
inconsistent, and only compare copies of these.
Used in hardware RAID systems
2. If either copy of an inconsistent block is detected to have an error (bad
checksum), overwrite it by the other copy. If both have no error, but are
different, overwrite the second block by the first block.
Data Access
Physical blocks are those blocks residing on the disk.
Buffer blocks are the blocks residing temporarily in main memory.
Block movements between disk and main memory are initiated
through the following two operations:
input(B) transfers the physical block B to main memory.
output(B) transfers the buffer block B to the disk, and replaces the
appropriate physical block there.
Each transaction Ti has its private work-area in which local copies of
all data items accessed and updated by it are kept.
  Ti's local copy of a data item X is called xi.
We assume, for simplicity, that each data item fits in, and is stored
inside, a single block.
Data Access (Cont.)
Transaction transfers data items between system buffer blocks and its
private work-area using the following operations:
  read(X) assigns the value of data item X to the local variable xi.
  write(X) assigns the value of local variable xi to data item X in
  the buffer block.
  Both these commands may necessitate the issue of an input(BX)
  instruction before the assignment, if the block BX in which X
  resides is not already in memory.
Transactions
Perform read(X) while accessing X for the first time;
All subsequent accesses are to the local copy.
After its last access, the transaction executes write(X).
output(BX) need not immediately follow write(X). The system can perform
the output operation when it deems fit.
Example of Data Access
[Figure: disk blocks A and B; buffer blocks A and B in memory; work
areas of T1 (holding local copies x1, y1) and T2 (holding x2); the
operations input(A), output(B), read(X), write(Y) move data between
disk, buffer, and work areas.]
Recovery and Atomicity
Modifying the database without ensuring that the transaction will commit
may leave the database in an inconsistent state.
Consider transaction Ti that transfers $50 from account A to account B;
the goal is either to perform all database modifications made by Ti or none
at all.
Several output operations may be required for Ti (to output A and B). A
failure may occur after one of these modifications has been made but
before all of them are made.
Recovery and Atomicity (Cont.)
To ensure atomicity despite failures, we first output information
describing the modifications to stable storage without modifying the
database itself.
We study two approaches:
log-based recovery, and
shadow-paging
We assume (initially) that transactions run serially, that is, one after
the other.

Log-Based Recovery
A log is kept on stable storage.
The log is a sequence of log records, and maintains a record of
update activities on the database.
When transaction Ti starts, it registers itself by writing a
<Ti start> log record
Before Ti executes write(X), a log record <Ti, X, V1, V2> is written,
where V1 is the value of X before the write, and V2 is the value to be
written to X.
  The log record notes that Ti has performed a write on data item Xj;
  Xj had value V1 before the write, and will have value V2 after the write.
When Ti finishes its last statement, the log record <Ti commit> is written.
We assume for now that log records are written directly to stable
storage (that is, they are not buffered)
Two approaches using logs
Deferred database modification
Immediate database modification
Deferred Database Modification
The deferred database modification scheme records all
modifications to the log, but defers all the writes to after partial
commit.
Assume that transactions execute serially
Transaction starts by writing <Ti start> record to log.
A write(X) operation results in a log record <Ti, X, V> being written,
where V is the new value for X
  Note: old value is not needed for this scheme
The write is not performed on X at this time, but is deferred.
When Ti partially commits, <Ti commit> is written to the log
Finally, the log records are read and used to actually execute the
previously deferred writes.
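The replay step can be sketched as a single pass over the log, installing <Ti, X, V> values only for committed transactions (log records are modeled here as tuples; names are illustrative):

```python
def recover_deferred(log, db):
    """Deferred-modification redo: replay new values only for
    transactions whose commit record appears in the log."""
    committed = {r[1] for r in log if r[0] == "commit"}
    for rec in log:
        if rec[0] == "write" and rec[1] in committed:
            _, txn, item, value = rec
            db[item] = value        # redo: install the new value
    return db
```

An uncommitted transaction's writes were never applied to the database, so they can simply be ignored; no undo is ever needed.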
Deferred Database Modification (Cont.)
During recovery after a crash, a transaction needs to be redone if and
only if both <Ti start> and <Ti commit> are there in the log.
Redoing a transaction Ti (redo Ti) sets the value of all data items updated
by the transaction to the new values.
Crashes can occur while
  the transaction is executing the original updates, or
  while recovery action is being taken
Example transactions T0 and T1 (T0 executes before T1):

  T0: read(A)            T1: read(C)
      A := A − 50            C := C − 100
      write(A)               write(C)
      read(B)
      B := B + 50
      write(B)
Deferred Database Modification (Cont.)
Below we show the log as it appears at three instances of time
(values assume A = 1000, B = 2000, C = 700 initially, as in the
immediate-modification example):

  (a) <T0 start>        (b) <T0 start>        (c) <T0 start>
      <T0, A, 950>          <T0, A, 950>          <T0, A, 950>
      <T0, B, 2050>         <T0, B, 2050>         <T0, B, 2050>
                            <T0 commit>           <T0 commit>
                            <T1 start>            <T1 start>
                            <T1, C, 600>          <T1, C, 600>
                                                  <T1 commit>

If log on stable storage at time of crash is as in case:
  (a) No redo actions need to be taken
  (b) redo(T0) must be performed since <T0 commit> is present
  (c) redo(T0) must be performed followed by redo(T1) since
      <T0 commit> and <T1 commit> are present
Immediate Database Modification
The immediate database modification scheme allows database
updates of an uncommitted transaction to be made as the writes are
issued
since undoing may be needed, update logs must have both old
value and new value
Update log record must be written before database item is written
We assume that the log record is output directly to stable storage
Can be extended to postpone log record output, so long as prior to
execution of an output(B) operation for a data block B, all log
records corresponding to items in B must be flushed to stable
storage
Output of updated blocks can take place at any time before or after
transaction commit
Order in which blocks are output can be different from the order in
which they are written.
Immediate Database Modification Example
  Log                     Write             Output

  <T0 start>
  <T0, A, 1000, 950>
  <T0, B, 2000, 2050>
                          A = 950
                          B = 2050
  <T0 commit>
  <T1 start>
  <T1, C, 700, 600>
                          C = 600
                                            BB, BC
  <T1 commit>
                                            BA

Note: BX denotes the block containing X.
Immediate Database Modification (Cont.)
Recovery procedure has two operations instead of one:
  undo(Ti) restores the value of all data items updated by Ti to their
  old values, going backwards from the last log record for Ti
  redo(Ti) sets the value of all data items updated by Ti to the new
  values, going forward from the first log record for Ti
Both operations must be idempotent
  That is, even if the operation is executed multiple times the effect is
  the same as if it is executed once
  Needed since operations may get re-executed during recovery
When recovering after failure:
  Transaction Ti needs to be undone if the log contains the record
  <Ti start>, but does not contain the record <Ti commit>.
  Transaction Ti needs to be redone if the log contains both the record
  <Ti start> and the record <Ti commit>.
Undo operations are performed first, then redo operations.
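Both passes can be sketched over <Ti, X, Vold, Vnew> records modeled as tuples ("write", txn, item, old, new); names are illustrative:

```python
def recover_immediate(log, db):
    """Immediate-modification recovery: undo incomplete transactions
    backwards, then redo committed transactions forwards."""
    started = {r[1] for r in log if r[0] == "start"}
    committed = {r[1] for r in log if r[0] == "commit"}
    incomplete = started - committed
    # undo pass: scan backwards, restoring old values
    for rec in reversed(log):
        if rec[0] == "write" and rec[1] in incomplete:
            db[rec[2]] = rec[3]     # old value
    # redo pass: scan forwards, installing new values
    for rec in log:
        if rec[0] == "write" and rec[1] in committed:
            db[rec[2]] = rec[4]     # new value
    return db
```

Running this on the log of case (b) above undoes T1 (C back to 700) and redoes T0 (A = 950, B = 2050).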
Immediate DB Modification Recovery Example
Below we show the log as it appears at three instances of time:

  (a) <T0 start>            (b) <T0 start>            (c) <T0 start>
      <T0, A, 1000, 950>        <T0, A, 1000, 950>        <T0, A, 1000, 950>
      <T0, B, 2000, 2050>       <T0, B, 2000, 2050>       <T0, B, 2000, 2050>
                                <T0 commit>               <T0 commit>
                                <T1 start>                <T1 start>
                                <T1, C, 700, 600>         <T1, C, 700, 600>
                                                          <T1 commit>

Recovery actions in each case above are:
  (a) undo(T0): B is restored to 2000 and A to 1000.
  (b) undo(T1) and redo(T0): C is restored to 700, and then A and B are
      set to 950 and 2050 respectively.
  (c) redo(T0) and redo(T1): A and B are set to 950 and 2050
      respectively. Then C is set to 600
Checkpoints
Problems in the recovery procedure as discussed earlier:
  1. searching the entire log is time-consuming
  2. we might unnecessarily redo transactions which have already
     output their updates to the database
Streamline the recovery procedure by periodically performing
checkpointing:
  1. Output all log records currently residing in main memory onto
     stable storage.
  2. Output all modified buffer blocks to the disk.
  3. Write a log record <checkpoint> onto stable storage.
Checkpoints (Cont.)
During recovery we need to consider only the most recent transaction
Ti that started before the checkpoint, and transactions that started
after Ti.
  1. Scan backwards from end of log to find the most recent
     <checkpoint> record
  2. Continue scanning backwards till a record <Ti start> is found.
  3. Need only consider the part of log following above start record.
     Earlier part of log can be ignored during recovery, and can be
     erased whenever desired.
  4. For all transactions (starting from Ti or later) with no <Ti commit>,
     execute undo(Ti). (Done only in case of immediate modification.)
  5. Scanning forward in the log, for all transactions starting
     from Ti or later with a <Ti commit>, execute redo(Ti).
Example of Checkpoints

[Figure: timeline running from checkpoint time Tc to failure time Tf,
showing transactions T1–T4 relative to the checkpoint and the system
failure.]

T1 can be ignored (updates already output to disk due to checkpoint)
T2 and T3 redone
T4 undone
Recovery With Concurrent Transactions
We modify the log-based recovery schemes to allow multiple
transactions to execute concurrently.
All transactions share a single disk buffer and a single log
A buffer block can have data items updated by one or more
transactions
We assume concurrency control using strict two-phase locking;
i.e. the updates of uncommitted transactions should not be visible to
other transactions
Otherwise how to perform undo if T1 updates A, then T2 updates
A and commits, and finally T1 has to abort?
Logging is done as described earlier.
Log records of different transactions may be interspersed in the log.
The checkpointing technique and actions taken on recovery have to be
changed
since several transactions may be active when a checkpoint is
performed.
Recovery With Concurrent Transactions (Cont.)
Checkpoints are performed as before, except that the checkpoint log record
is now of the form
  <checkpoint L>
where L is the list of transactions active at the time of the checkpoint
  We assume no updates are in progress while the checkpoint is carried
  out (will relax this later)
When the system recovers from a crash, it first does the following:
  1. Initialize undo-list and redo-list to empty
  2. Scan the log backwards from the end, stopping when the first
     <checkpoint L> record is found.
     For each record found during the backward scan:
       if the record is <Ti commit>, add Ti to redo-list
       if the record is <Ti start>, then if Ti is not in redo-list, add Ti
       to undo-list
  3. For every Ti in L, if Ti is not in redo-list, add Ti to undo-list
Recovery With Concurrent Transactions (Cont.)
At this point undo-list consists of incomplete transactions which must
be undone, and redo-list consists of finished transactions that must be
redone.
Recovery now continues as follows:
1. Scan log backwards from most recent record, stopping when
   <Ti start> records have been encountered for every Ti in undo-list.
     During the scan, perform undo for each log record that
     belongs to a transaction in undo-list.
2. Locate the most recent <checkpoint L> record.
3. Scan log forwards from the <checkpoint L> record till the end of
   the log.
     During the scan, perform redo for each log record that
     belongs to a transaction on redo-list
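Steps 1–3 of the list-building phase above can be sketched as one backward scan (log records are modeled as tuples, with the checkpoint record carrying the active list L; names are illustrative):

```python
def build_lists(log):
    """Backward scan to the most recent <checkpoint L> record,
    collecting redo-list from commits and undo-list from starts
    with no matching commit, plus unfinished transactions in L."""
    redo, undo = [], []
    for rec in reversed(log):
        kind = rec[0]
        if kind == "checkpoint":
            active = rec[1]                          # L at checkpoint time
            undo += [t for t in active if t not in redo]
            break
        if kind == "commit":
            redo.append(rec[1])
        elif kind == "start" and rec[1] not in redo:
            undo.append(rec[1])
    return undo, redo
```

A transaction active at the checkpoint but never committed afterwards (here T1) lands on undo-list even though its start record lies before the checkpoint.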
Unit 7: Security
Discretionary Access Control
Mandatory Access Control
Encryption
Note: Integrity is not Security.

Integrity ensures that the things the users are trying to do are correct.
Security ensures that things users are doing are only what they are
allowed to do.
Security vs. Integrity
Integrity: Ensuring what users are trying to do is correct.
Security: Ensuring users are allowed to do things they are trying to
do.
Both require rules that users must not violate.
Database Security Approaches
1. Discretionary Control: Named users, Privileges or access rights to data
objects. Distributed control.

2. Mandatory Control: Users have Clearance, Objects have classification
levels. Central control.
Security Mechanisms
The security sub-system checks user IDs against security rules.
SQL Syntax for Security Rules
GRANT [privilege-commalist | ALL PRIVILEGES]
ON object-name
TO [authorisation_id_list | PUBLIC]
[WITH GRANT OPTION]
Each privilege is one of the following:
SELECT
DELETE
INSERT [ (attribute-commalist)]
UPDATE [ (attribute-commalist) ]
REFERENCES [ (attribute-commalist) ]
The REFERENCES privilege allows the named table (or attributes) to be
referenced in integrity constraints of CREATE TABLE.
The GRANT OPTION allows the named users to pass the privileges on
to other users.
Grant and Revoke
If a user A grants privileges to user B, then A can also revoke them, e.g.

REVOKE ALL PRIVILEGES ON STATS FROM John;
SQL REVOKE syntax
REVOKE [GRANT OPTION FOR]
[privilege_list | ALL PRIVILEGES]
ON object_name
FROM [authorisation_list|PUBLIC] [RESTRICT|CASCADE]
If the RESTRICT option is given, the command is not executed if any
dependent privileges exist, i.e. privileges granted to other users
through WITH GRANT OPTION.
CASCADE forces a REVOKE of any dependent privileges.
Security Summary
A DBMS security-subsystem enforces security
Access is checked against security rules
Discretionary control rules involve users, privileges
and objects
Mandatory controls have clearance and
classification levels
Audit trails are used to record attempted security
breaches
GRANT/ REVOKE syntax in SQL
We have not dealt with data-encryption, which
deals with the storing and transmission of sensitive
data.
Model
[Figure: access-control model diagram; image not recoverable from text]
Bell-LaPadula Model / Mandatory Access
Control
Thus, a mandatory access control technique classifies data and users
based on security classes such as top secret (TS), secret (S),
confidential (C) and unclassified (U).
The DBMS determines whether a given user can read or write a given
object based on certain rules that involve the security level of the
object and the clearance of the user.
The commonly used mandatory access control technique for multi-
level security is known as the Bell-LaPadula model.
The Bell-LaPadula model is described in terms of subjects (for
example, users, accounts, programs), objects (for example, relations
or tables, tuples, columns, views, operations), security classes (for
example, TS, S, C or U) and clearances.
The Bell-LaPadula model classifies each subject and object into one of
the security classifications TS, S, C or U.
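The model's two best-known rules, not spelled out above, are the simple security property (no read up) and the star property (no write down). A minimal sketch, with the level ordering as the only assumption beyond the TS/S/C/U classes named in the text:

```python
# Minimal sketch of Bell-LaPadula access checks (illustrative only).
# Security classes ordered from lowest to highest clearance.
LEVELS = {"U": 0, "C": 1, "S": 2, "TS": 3}

def can_read(subject_clearance, object_class):
    # Simple security property ("no read up"): a subject may read
    # only objects classified at or below its clearance.
    return LEVELS[subject_clearance] >= LEVELS[object_class]

def can_write(subject_clearance, object_class):
    # Star property ("no write down"): a subject may write only
    # objects classified at or above its clearance, so secret data
    # cannot leak into lower-classified objects.
    return LEVELS[subject_clearance] <= LEVELS[object_class]

print(can_read("S", "C"), can_write("S", "C"))   # True False
```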
Views
A View is a "virtual table". It is not a simple base table, but a
virtual table whose columns and data come from one or more underlying
tables.
A View does not itself contain any data; it is a stored query, applied
to one or more tables, that is kept in the database as an object.
After a view is created over some table(s), it is used as a reference
to those tables; when executed, it shows only the data specified by the
query given when the View was created.
Advantages of Views
Views are used as security mechanisms in databases, because they
restrict users from viewing certain columns and rows. A View displays
only the data returned by the query that was defined when the View was
created; the rest of the data is completely hidden from the end user.
Along with security, another advantage of Views is data abstraction,
because the end user is not aware of all the data in a table.
Creation of Views
Syntax:
CREATE VIEW [View_Name] AS [SELECT Statement]

Example
CREATE VIEW SampleView AS
SELECT EmpID, EmpName FROM EmpInfo;
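The CREATE VIEW example above can be tried end-to-end with an in-memory SQLite database. The EmpInfo schema beyond the two named columns, and the sample rows, are made-up assumptions for illustration:

```python
import sqlite3

# Build a throwaway database containing the document's EmpInfo table
# (the Salary column and sample rows are invented for this demo).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EmpInfo (EmpID INTEGER, EmpName TEXT, Salary REAL)")
conn.executemany("INSERT INTO EmpInfo VALUES (?, ?, ?)",
                 [(1, "Alice", 50000.0), (2, "Bob", 45000.0)])

# The view exposes only EmpID and EmpName; Salary stays hidden,
# illustrating the security/abstraction point made above.
conn.execute("CREATE VIEW SampleView AS SELECT EmpID, EmpName FROM EmpInfo")

rows = conn.execute("SELECT * FROM SampleView ORDER BY EmpID").fetchall()
print(rows)  # [(1, 'Alice'), (2, 'Bob')]
```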
Encryption
Reason: to protect sensitive data both when it is stored in the
database and when it is transmitted over a network.
Basic terminology
Plaintext: original message to be encrypted
Ciphertext: the encrypted message
Enciphering or encryption: the process of converting plaintext into
ciphertext
Encryption algorithm: performs encryption
Two inputs: a plaintext and a secret key
Symmetric Cipher Model
Deciphering or decryption: recovering plaintext from ciphertext
Decryption algorithm: performs decryption
Two inputs: ciphertext and secret key
Secret key: same key used for encryption and decryption
Also referred to as a symmetric key
Cipher or cryptographic system: a scheme for encryption and decryption
Cryptography: science of studying ciphers
Cryptanalysis: science of studying attacks against cryptographic systems
Cryptology: cryptography + cryptanalysis
Ciphers
Symmetric cipher: same key used for
encryption and decryption
Block cipher: encrypts a block of plaintext at a
time (typically 64 or 128 bits)
Stream cipher: encrypts data one bit or one byte
at a time
Asymmetric cipher: different keys used for
encryption and decryption
Symmetric Encryption
or conventional / secret-key / single-key
sender and recipient share a common key
all classical encryption algorithms are symmetric
The only type of cipher prior to the invention of asymmetric-key
ciphers in the 1970s
by far most widely used
Symmetric Encryption
Mathematically:
Y = E_K(X) or Y = E(K, X)
X = D_K(Y) or X = D(K, Y)
X = plaintext
Y = ciphertext
K = secret key
E = encryption algorithm
D = decryption algorithm
Both E and D are known to the public
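The relationship X = D_K(E_K(X)) can be demonstrated with a toy symmetric cipher. XOR with a repeating key is chosen here purely for illustration; it is not a secure cipher, and the sample key and message are invented:

```python
from itertools import cycle

# Toy symmetric cipher: XOR each plaintext byte with a repeating key.
# The same secret key K is the second input to both E and D.
def E(K, X):
    return bytes(b ^ k for b, k in zip(X, cycle(K)))

def D(K, Y):
    # XOR is its own inverse, so decryption is the same operation.
    return E(K, Y)

X = b"attack at dawn"   # plaintext (illustrative)
K = b"secret"           # shared secret key (illustrative)
Y = E(K, X)             # ciphertext
print(D(K, Y) == X)     # True: D_K(E_K(X)) = X
```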
Cryptanalysis
Objective: to recover the plaintext of a
ciphertext or, more typically, to recover
the secret key.
Two general approaches:
brute-force attack
non-brute-force attack (cryptanalytic attack)
Classical Ciphers
Plaintext is viewed as a sequence of
elements (e.g., bits or characters)
Substitution cipher: replacing each
element of the plaintext with another
element.
Transposition (or permutation) cipher:
rearranging the order of the elements of
the plaintext.
Product cipher: using multiple stages of
substitutions and transpositions
Caesar Cipher
Earliest known substitution cipher
Invented by Julius Caesar
Each letter is replaced by the letter three positions further down the alphabet.
Plain: a b c d e f g h i j k l m n o p q r s t u v w x y z
Cipher: D E F G H I J K L M N O P Q R S T U V W X Y Z A B C
Example: ohio state → RKLR VWDWH
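The substitution table above amounts to shifting each letter three positions, which can be sketched directly:

```python
# Caesar cipher with shift 3, matching the plain/cipher alphabets above.
def caesar_encrypt(plaintext, shift=3):
    out = []
    for ch in plaintext:
        if ch.isalpha():
            # Shift within the alphabet, emitting uppercase ciphertext.
            out.append(chr((ord(ch.upper()) - ord('A') + shift) % 26 + ord('A')))
        else:
            out.append(ch)  # spaces etc. pass through unchanged
    return "".join(out)

print(caesar_encrypt("ohio state"))  # RKLR VWDWH
```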
Cryptanalysis of Caesar Cipher
Key space: {0, 1, ..., 25}
Vulnerable to brute-force attacks.
E.g., break ciphertext "UNOU YZGZK"
We need to be able to recognize the plaintext when we find it.
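With only 26 possible keys, the attack is a simple loop over every shift, after which the attacker recognizes the one readable decryption:

```python
# Brute-force attack on the Caesar ciphertext above: try all 26 shifts.
def caesar_decrypt(ciphertext, shift):
    out = []
    for ch in ciphertext:
        if ch.isalpha():
            out.append(chr((ord(ch) - ord('A') - shift) % 26 + ord('A')))
        else:
            out.append(ch)
    return "".join(out).lower()

for shift in range(26):
    print(shift, caesar_decrypt("UNOU YZGZK", shift))
# Only shift 6 produces readable English: "ohio state"
```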
Monoalphabetic Substitution Cipher
Shuffle the letters and map each plaintext letter to a
different random ciphertext letter:

Plain letters: abcdefghijklmnopqrstuvwxyz
Cipher letters: DKVQFIBJWPESCXHTMYAUOLRGZN

Plaintext: ifwewishtoreplaceletters
Ciphertext: WIRFRWAJUHYFTSDVFSFUUFYA

What does a key look like?
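To answer the question above: the key is simply the shuffled 26-letter cipher alphabet itself, and encryption is a per-letter table lookup:

```python
# Monoalphabetic substitution using the cipher alphabet shown above.
plain_alpha  = "abcdefghijklmnopqrstuvwxyz"
cipher_alpha = "DKVQFIBJWPESCXHTMYAUOLRGZN"   # this permutation is the key

# Build the letter-for-letter substitution table.
enc_table = str.maketrans(plain_alpha, cipher_alpha)

ciphertext = "ifwewishtoreplaceletters".translate(enc_table)
print(ciphertext)  # WIRFRWAJUHYFTSDVFSFUUFYA
```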
Monoalphabetic Cipher Security
Now we have a total of 26! ≈ 4 × 10^26 keys.
With so many keys, it is secure against brute-force attacks.
But not secure against some cryptanalytic attacks.
The problem is language characteristics (e.g. letter frequencies).
The Rotors
Enigma Rotor Machine
Transposition Ciphers
Also called permutation ciphers.
Shuffle the plaintext, without altering the actual letters used.
Example: Row Transposition Ciphers

Row Transposition Ciphers
Plaintext is written row by row in a rectangle.
Ciphertext: write out the columns in an order
specified by a key.
Key:        3 4 2 1 5 6 7

Plaintext:  a t t a c k p
            o s t p o n e
            d u n t i l t
            w o a m x y z

Ciphertext: TTNAAPTMTSUOAODWCOIXKNLYPETZ
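The row transposition example can be reproduced in a few lines; in this example the columns are read out in the order the key digits appear left to right (3, 4, 2, 1, 5, 6, 7):

```python
# Row transposition cipher matching the example above: write the
# plaintext row by row into a 7-column rectangle, then read out whole
# columns in the order given by the key digits.
def row_transposition_encrypt(plaintext, key):
    ncols = len(key)
    # Split the plaintext into rows of ncols characters each
    # (the example is already padded with x, y, z to a full rectangle).
    rows = [plaintext[i:i + ncols] for i in range(0, len(plaintext), ncols)]
    # For each key digit k, emit column k-1 from top to bottom.
    return "".join(row[k - 1] for k in key for row in rows).upper()

key = [3, 4, 2, 1, 5, 6, 7]
ct = row_transposition_encrypt("attackpostponeduntiltwoamxyz", key)
print(ct)  # TTNAAPTMTSUOAODWCOIXKNLYPETZ
```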