
Transaction Processing and Concurrency Control

CHAPTER-6
Transaction Processing and Concurrency Control

 INTRODUCTION
 Transaction is a logical unit of work that represents real-world events of any organisation or an
enterprise whereas concurrency control is the management of concurrent transaction
execution.
 Transaction processing systems execute database transactions with large databases and
hundreds of concurrent users, for example, railway and air reservations systems, banking
system, credit card processing, stock market monitoring, super market inventory and checkouts
and so on.
 Transaction processing and concurrency control form important activities of any database
system.

 TRANSACTION CONCEPTS
 A transaction is a logical unit of work of database processing that includes one or more
database access operations.
 A transaction can be defined as an action or series of actions that is carried out by a single user
or application program to perform operations for accessing the contents of the database.
 The operations can include retrieval, (Read), insertion (Write), deletion and modification.
 A transaction must be either completed or aborted.
 A transaction is a program unit whose execution may change the contents of a database.
 It can either be embedded within an application program or can be specified interactively via a
high-level query language such as SQL.
 Its execution preserves the consistency of the database.
 No intermediate states are acceptable. If the database is in a consistent state before a
transaction executes, then the database should still be in consistent state after its execution.
 Therefore, to ensure these conditions and preserve the integrity of the database, a database
transaction must be atomic. An atomic transaction is a transaction in which either all actions
associated with the transaction are executed to completion or none are performed.


 In other words, each transaction should access shared data without interfering with the other
transactions and whenever a transaction successfully completes its execution; its effect should
be permanent.
 However, if due to any reason, a transaction fails to complete its execution (for example,
system failure) it should not have any effect on the stored database.
 This basic abstraction frees the database application programmer from the following concerns:
 Inconsistencies caused by conflicting updates from concurrent users.
 Partially completed transactions in the event of systems failure.
 User-directed undoing of transactions.

 A transaction is a sequence of READ and WRITE actions that are grouped together to form a
single unit of database access.
 Whenever we Read from and/or Write to (update) the database, a transaction is created.
 A transaction may consist of a simple SELECT operation to generate a list of table contents, or it
may consist of a series of related UPDATE command sequences.
 A transaction can include the following basic database access operations:

 Read_item(X): This operation reads a database item named X into a program variable Y.
 Execution of the Read_item(X) command includes the following steps:
 Find the address of the disk block that contains the item X.
 Copy that disk block into a buffer in main memory.
 Copy item X from the buffer to the program variable named Y.

 Write_item(X): This operation writes the value of a program variable Y into the database item
named X.
 Execution of the Write_item(X) command includes the following steps:
 Find the address of the disk block that contains item X.
 Copy that disk block into a buffer in main memory.
 Copy item X from the program variable named Y into its correct location in the buffer.
 Store the updated block from the buffer back to disk.
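The Read_item and Write_item steps above can be sketched as a toy buffer-manager simulation. The dict-based "disk", the block layout and the function names are illustrative assumptions, not a real DBMS interface:

```python
# Toy buffer-manager simulation of Read_item(X)/Write_item(X).
# The "disk" maps block ids to dicts of items; names are illustrative.
disk = {0: {"X": 100, "A": 5}}       # block 0 holds items X and A
item_location = {"X": 0, "A": 0}     # item name -> block id

buffer_pool = {}                     # block id -> in-memory copy of the block

def read_item(name):
    block_id = item_location[name]                    # 1. find the block's address
    if block_id not in buffer_pool:
        buffer_pool[block_id] = dict(disk[block_id])  # 2. copy block into a buffer
    return buffer_pool[block_id][name]                # 3. copy item to program variable

def write_item(name, value):
    block_id = item_location[name]                    # 1. find the block's address
    if block_id not in buffer_pool:
        buffer_pool[block_id] = dict(disk[block_id])  # 2. copy block into a buffer
    buffer_pool[block_id][name] = value               # 3. update item in the buffer
    disk[block_id] = dict(buffer_pool[block_id])      # 4. store the block back to disk

y = read_item("X")        # Y := 100
write_item("X", y + 500)  # X becomes 600 on disk
```

A real DBMS would defer step 4 to the buffer manager's replacement and logging policy; here the block is flushed immediately to keep the sketch short.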

 Example of transaction that updates columns (attributes) in several relation (table) rows
(tuples) by incrementing their values by 500:

 BEGIN_TRANSACTION_1:
READ (TABLE = T1, ROW = 15, OBJECT = COL1);
:COL1 = COL1 + 500;
WRITE (TABLE = T1, ROW = 15, OBJECT = COL1, VALUE = :COL1);
READ (TABLE = T2, ROW = 30, OBJECT = COL2);
:COL2 = COL2 + 500;
WRITE (TABLE = T2, ROW = 30, OBJECT = COL2, VALUE = :COL2);
END_OF_TRANSACTION_1;


 As can be seen from the above update operation, the transaction is basically divided into
two pairs of READ and WRITE operations.
 Each operation reads the value of a column from a table and increments it by the given
amount.
 It then proceeds to write to new value back into the column before proceeding to the next
table.

 Fig. shows an example of a typical loan transaction that updates a salary database table of M/s KLY
Associates.
 In this example, a loan amount of INR 10000.00 is being subtracted from an already stored loan
value of INR 80000.00. After the update, it leaves INR 70000.00 as loan balance in the database.
 A transaction that changes the contents of the database must alter the database from one
consistent state to another.
 A consistent database state is one in which all data integrity constraints are satisfied.
 To ensure database consistency, every transaction must begin with the database in a known
consistent state.

 Transaction Execution and Problems

 A transaction which successfully completes its execution is said to have been committed.
 Otherwise, the transaction is aborted.
 Thus, if a committed transaction performs any update operation on the database, its effect
must be reflected on the database even if there is a failure.
 A transaction can be in one of the following states:
 Active state: After the transaction starts its operation.
 Partially committed: After the last statement of the transaction has been executed.
 Aborted: When the normal execution can no longer be performed.
 Committed: After successful completion of transaction.
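The states above, together with the terminated state described later, can be sketched as a small transition table. The event names (end, abort, ok, fail, cleanup) are illustrative labels for the arcs of the state diagram, not DBMS terminology:

```python
# Legal transitions of the transaction state diagram: active ->
# partially committed -> committed, with failure paths to aborted
# and a final cleanup step to terminated.
TRANSITIONS = {
    ("active", "end"): "partially committed",
    ("active", "abort"): "aborted",
    ("partially committed", "ok"): "committed",
    ("partially committed", "fail"): "aborted",
    ("committed", "cleanup"): "terminated",
    ("aborted", "cleanup"): "terminated",
}

def step(state, event):
    """Follow one transition; any pair not listed above is illegal."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"illegal transition {event!r} from {state!r}")

s = "active"
s = step(s, "end")    # last READ/WRITE executed
s = step(s, "ok")     # recovery checks succeed: commit point reached
```

Note that there is no arc out of "committed" except cleanup: once a transaction commits, it can never move to the aborted state.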


 A transaction may be aborted when the transaction itself detects an error during execution
which it cannot recover from, for example, a transaction trying to debit loan amount of an
employee from his insufficient gross salary. A transaction may also be aborted before it has
been committed due to system failure or any other circumstances beyond its control.

 A transaction is said to be in a committed state if it has partially committed and it can be
ensured that it will never be aborted.

 Fig. shows a state transition diagram that describes how a transaction moves through its execution
states.
 A transaction goes into an active state immediately after it starts execution, where it can issue
READ and WRITE operations.
 When the transaction ends, it moves to the partially committed state.
 At this point, some recovery protocols need to ensure that a system failure will not result in an
inability to record the changes of the transaction permanently.
 Once this check is successful, the transaction is said to have reached its commit point and
enters the committed state.
 Once a transaction is committed, it has concluded its execution successfully and all its changes
must be recorded permanently in the database.
 However, a transaction can go to an aborted state if one of the checks fails or if the transaction is
aborted during its active state.
 The transaction may then have to be rolled back to undo the effect of its WRITE operations on
the database.
 In the terminated state, the transaction information maintained in system tables while the
transaction has been running is removed.
 Failed or aborted transactions may be restarted later, either automatically or after being
resubmitted by the user as new transactions.


 Transaction Execution with SQL

 The American National Standards Institute (ANSI) has defined standards that govern SQL
database transactions.
 Transaction support is provided by two SQL statements namely COMMIT and ROLLBACK.
 The ANSI standards require that, when a transaction sequence is initiated by a user or an
application program, it must continue through all succeeding SQL statements until one of the
following four events occurs:

 A COMMIT statement is reached, in which case all changes are permanently recorded
within the database. The COMMIT statement automatically ends the SQL transaction. The
COMMIT operation indicates a successful end-of-transaction.

 A ROLLBACK statement is reached, in which case all the changes are aborted and the
database is rolled back to its previous consistent state. The ROLLBACK operation indicates
unsuccessful end-of-transaction.

 The end of a program is successfully reached, in which case all changes are permanently
recorded within the database. This action is equivalent to COMMIT.

 The program is abnormally terminated, in which case the changes made in the database are
aborted and the database is rolled back to its previous consistent state. This action is
equivalent to ROLLBACK.

 Example of COMMIT, which updates an employee's loan balance (EMP-LOAN-BAL) in the table
EMPLOYEE:
UPDATE EMPLOYEE
SET EMP-LOAN-BAL = EMP-LOAN-BAL - 10000
WHERE EMP-ID = 'E0001';
COMMIT;
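The same UPDATE-then-COMMIT pattern, and its ROLLBACK counterpart, can be tried with Python's built-in sqlite3 module. This is an illustrative sketch, not part of the ANSI example: the identifiers use underscores because SQLite does not allow bare hyphens in column names, and Python's sqlite3 implicitly begins a transaction before each data-modifying statement:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE EMPLOYEE (EMP_ID TEXT PRIMARY KEY, EMP_LOAN_BAL INTEGER)")
cur.execute("INSERT INTO EMPLOYEE VALUES ('E0001', 80000)")
conn.commit()

# Successful transaction: repay INR 10000, then COMMIT makes it permanent.
cur.execute("UPDATE EMPLOYEE SET EMP_LOAN_BAL = EMP_LOAN_BAL - 10000 "
            "WHERE EMP_ID = 'E0001'")
conn.commit()

# Failed transaction: ROLLBACK returns the database to its prior state.
cur.execute("UPDATE EMPLOYEE SET EMP_LOAN_BAL = EMP_LOAN_BAL - 99999 "
            "WHERE EMP_ID = 'E0001'")
conn.rollback()

balance = cur.execute("SELECT EMP_LOAN_BAL FROM EMPLOYEE "
                      "WHERE EMP_ID = 'E0001'").fetchone()[0]   # 70000
```

Only the committed repayment survives; the rolled-back update leaves no trace in the stored balance.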

 As shown in the above example, a transaction begins implicitly when the first SQL statement is
encountered.
 Not all SQL implementations follow the ANSI standard. Some use explicit statements, such as the
following, to indicate the beginning and end of a new transaction:

BEGIN TRANSACTION_T1;
READ (TABLE = EMPLOYEE, EMP-ID = 'E0001', OBJECT = EMP-LOAN-BAL);
:EMP-LOAN-BAL = EMP-LOAN-BAL - 10000;
WRITE (TABLE = EMPLOYEE, EMP-ID = 'E0001', OBJECT = EMP-LOAN-BAL, VALUE = :EMP-LOAN-BAL);
END TRANSACTION_T1;


 Transaction Properties

 A transaction must have the following four properties, called ACID properties (also called
ACIDITY of a transaction), to ensure that the database remains in a consistent state after the
transaction is executed:
 Atomicity.
 Consistency.
 Isolation.
 Durability.

 Atomicity:

 The atomicity property of a transaction requires that all operations of a transaction be
completed; if not, the transaction is aborted.
 In other words, a transaction is treated as a single, indivisible logical unit of work.
 Therefore, a transaction must execute and complete each operation in its logic before it
commits its changes.
 As stated earlier, the transaction is considered as one operation even though it contains
multiple reads and writes.
 Thus, a transaction completes or fails as one unit.
 The atomicity property of transaction is ensured by the transaction recovery subsystem of a
DBMS.
 In the event of a system crash in the midst of transaction execution, the recovery
techniques undo any effects of the transaction on the database.

 Consistency:

 Database consistency is the property that every transaction sees a consistent database
instance.
 In other words, execution of a transaction must leave a database in either its prior stable
state or a new stable state that reflects the new modifications (updates) made by the
transaction.
 If the transaction fails, the database must be returned to the state it was in prior to the
execution of the failed transaction.
 If the transaction commits, the database must reflect the new changes.
 Thus, all resources are always in a consistent state.
 The preservation of consistency is generally the responsibility of the programmers who
write the database programs or of the DBMS module that enforces integrity constraints.
 A database program should be written in a way that guarantees that, if the database is in a
consistent state before executing the transaction, it will be in a consistent state after the
complete execution of the transaction, assuming that no interference with other
transactions occur.
 In other words, a transaction must transform the database from one consistent state to
another consistent state.


 Isolation:

 Isolation property of a transaction means that the data used during the execution of a
transaction cannot be used by a second transaction until the first one is completed. This
property isolates transactions from one another.
 In other words, if a transaction T1 is being executed and is using the data item X, that data
item cannot be accessed by any other transaction (T2………..Tn) until T1 ends.
 The isolation property is enforced by the concurrency control subsystem of the DBMS.

 Durability:

 The durability property of a transaction indicates the permanence of the database's
consistent state.
 It states that the changes made by a transaction are permanent.
 They cannot be lost by either a system failure or by the erroneous operation of a faulty
transaction.
 When a transaction is completed, the database reaches a consistent state and that state
cannot be lost, even in the event of system's failure.
 Durability property is the responsibility of the recovery subsystem of the DBMS.

 Transaction Log (or Journal)

 To support transaction processing, DBMSs maintain a transaction record of every change made
to the database into a log (also called journal).
 Log is a record of all transactions and the corresponding changes to the database.
 The information stored in the log is used by the DBMS for a recovery requirement triggered by
a ROLLBACK statement, a program's abnormal termination, a system (power or network)
failure, or a disk crash.
 Some relational database management systems (RDBMSs) use the transaction log to recover a
database forward to a currently consistent state.

 The DBMS automatically updates the transaction log while executing transactions that modify
the database.
 The transaction log stores before-and-after data about the database and any of the tables, rows
and attribute values that participated in the transaction.
 The beginning and the ending (COMMIT) of the transaction are also recorded in the transaction
log.
 The use of a transaction log increases the processing overhead of a DBMS and the overall cost
of the system.
 For each transaction, the following data is recorded on the log:
 A start-of-transaction marker.
 The transaction identifier which could include who and where information.
 The record identifiers which include the identifiers for the record occurrences.
 The operation(s) performed on the records (for example, insert, delete, modify).
 The previous value(s) of the modified data. This information is required for undoing the
changes made by a partially completed transaction. It is called the undo log. Where the
modification made by the transaction is the insertion of a new record, the previous values
can be assumed to be null.
 The updated value(s) of the modified record(s). This information is required for making sure
that the changes made by a committed transaction are in fact reflected in the database and


can be used to redo these modifications. This information is called the redo part of the log.
In case the modification made by the transaction is the deletion of a record, the updated
values can be assumed to be null.
 A commit transaction marker if the transaction is committed, otherwise an abort or rollback
transaction marker.
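The fields listed above can be collected into a log-record type. This is an illustrative sketch; the class and field names are assumptions, not a standard log format:

```python
from dataclasses import dataclass
from typing import Any, Optional

@dataclass
class LogRecord:
    """One entry of a transaction log (journal); field names are illustrative."""
    transaction_id: str           # identifier (may carry who/where information)
    operation: str                # 'start', 'insert', 'delete', 'modify',
                                  # plus 'commit' or 'rollback' markers
    record_id: Optional[str] = None   # affected record occurrence, if any
    before: Any = None            # undo part: previous value (None for inserts)
    after: Any = None             # redo part: updated value (None for deletes)

# A committed transaction as it would appear in the log:
log = [
    LogRecord("T1", "start"),
    LogRecord("T1", "modify", record_id="E0001", before=80000, after=70000),
    LogRecord("T1", "commit"),
]
```

The before value carries the undo information and the after value the redo information, exactly mirroring the two bullet points above.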

 The log is written before any updates are made to the database.
 This is called write-ahead log strategy.
In this strategy, a transaction is not allowed to modify the physical database until the undo
portion of the log has been written to stable storage. The Table shows an example of a
transaction log for SQL update sequences on the database table EMPLOYEE.
 In case of a system failure, the DBMS examines the transaction log for all uncommitted or
incomplete transactions and restores (ROLLBACK) the database to its previous state based
on the information in the transaction log.
 When the recovery process is completed, the DBMS redoes, using the transaction log, all
committed transactions whose changes were not physically written to the database before
the failure occurred.
 The TRANSACTION-ID is automatically assigned by the DBMS.
 If a ROLLBACK is issued before the termination of a transaction, the DBMS restores the
database only for that particular transaction, rather than for all transactions, in order to
maintain the durability of the previous transactions. In other words, committed
transactions are not rolled back.
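A recovery pass over such a log can be sketched as: redo the modifications of committed transactions and undo those of uncommitted ones. The tuple layout, the table values and the function name below are illustrative assumptions:

```python
# Log entries as (txn, op, record_id, before, after); the layout and
# values are illustrative.  T1 committed before the crash, T2 did not.
log = [
    ("T1", "start",  None,    None,  None),
    ("T1", "modify", "E0001", 80000, 70000),
    ("T1", "commit", None,    None,  None),
    ("T2", "start",  None,    None,  None),
    ("T2", "modify", "E0002", 50000, 60000),   # crash before T2 commits
]

def recover(log, db):
    """Redo committed transactions' changes, undo uncommitted ones."""
    committed = {txn for (txn, op, *_rest) in log if op == "commit"}
    for txn, op, rec, before, after in log:
        if op != "modify":
            continue
        db[rec] = after if txn in committed else before
    return db

# Disk state after the crash: T1's change never reached disk,
# while T2's uncommitted change did.
db = recover(log, {"E0001": 80000, "E0002": 60000})
```

After recovery, E0001 holds T1's committed value and E0002 is rolled back to its pre-T2 value, restoring a consistent state.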

 CONCURRENCY CONTROL

 Concurrency control is the process of managing simultaneous execution of transactions (such
as queries, updates, inserts, deletes and so on) in a multiprocessing database system without
having them interfere with one another.
 This property of DBMS allows many transactions to access the same database at the same time
without interfering with each other.
 The primary goal of concurrency control is to ensure the atomicity of the execution of transactions in a
multi-user database environment.
 Concurrency control mechanisms attempt to interleave (parallel) READ and WRITE operations
of multiple transactions so that the interleaved execution yields results that are identical to the
results of a serial schedule execution.


 Problems of Concurrency Control

 When concurrent transactions are executed in an uncontrolled manner, several problems
can occur.
 Concurrency control addresses the following three main problems:
Lost updates.
Dirty read (or uncommitted data).
Unrepeatable read (or inconsistent retrievals).

 Lost Update Problem

 A lost update problem occurs when two transactions that access the same database items
have their operations interleaved in a way that makes the value of some database item incorrect.
 In other words, if transactions T1 and T2 both read a record and then update it, the effects of
the first update will be overwritten by the second update.
 Let us consider an example where two accountants in a Finance Department of M/s KLY
Associates are updating the salary record of a marketing manager 'Rahul'.
 The first accountant is giving an annual salary adjustment to 'Rahul' and the second
accountant is reimbursing the travel expenses of his marketing tours to customer
organisation.

 Without a suitable concurrency control mechanism the effect of the first update will be
overwritten by the second.

 Fig. shows an example of lost update in which the update performed by one transaction is
overwritten by the other.
 Let us now consider the earlier SQL transaction example, which updates an attribute
called employee's loan balance (EMP-LOAN-BAL) in the table EMPLOYEE.
 Assume that the current value of EMP-LOAN-BAL is INR 70000.
 Now assume that two concurrent transactions T1 and T2 that update the EMP-LOAN-BAL
value for some item in the EMPLOYEE table.
 The transactions are as follows:
 Transaction T1: take additional loan of INR 20000
EMP-LOAN-BAL = EMP-LOAN-BAL + 20000
 Transaction T2: repay loan of INR 30000
EMP-LOAN-BAL = EMP-LOAN-BAL - 30000


Table-1: Serial execution of transactions T1 and T2.

Table-2: Interleaved execution resulting in the lost update problem.
 Table 1 shows the serial execution of these transactions under normal circumstances,
yielding the correct result of EMP-LOAN-BAL = 60000.
 Now, suppose that a transaction is able to read employee's EMP-LOAN-BAL value from the
table before a previous transaction for EMP-LOAN-BAL has been committed.
 Table 2 shows the sequence of execution resulting in the lost update problem.
 It can be observed from this table that the first transaction T1 has not yet been committed
when the second transaction T2 is executed.
 Therefore, transaction T2 still operates on the value 70000, and its subtraction yields 40000
in the memory.
 In the meantime, transaction T1 writes the value 90000 to the storage disk, which is
immediately overwritten by transaction T2.
 Thus, the addition of INR 20000 is lost during the process.
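The interleaving described above can be replayed deterministically in code; plain variables stand in for the stored EMP-LOAN-BAL and each transaction's local copy:

```python
stored = 70000                 # EMP-LOAN-BAL on disk

t1_local = stored              # t1: T1 reads 70000
t2_local = stored              # t2: T2 reads 70000 (T1 not yet committed)
stored = t1_local + 20000      # t3: T1 writes 90000
stored = t2_local - 30000      # t4: T2 writes 40000, overwriting T1's update

# A serial execution would have given 70000 + 20000 - 30000 = 60000;
# the INR 20000 addition has been lost.
```

Because T2 computed its result from the stale value 70000, T1's write at t3 leaves no trace in the final stored value.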

 Dirty Read (or Uncommitted Data) Problem


 A dirty read problem occurs when one transaction updates a database item and then the
transaction fails for some reason.
 The updated database item is accessed by another transaction before it is changed back to
the original value.
 In other words, a transaction T1 updates a record, which is read by the transaction T2.
 Then T1 aborts and T2 now has values which have never formed part of the stable database.
 Let us consider an example where an accountant in a Finance Department of M/s KLY
Associates records the travelling allowance of INR 10000.00 to be given to the marketing
manager 'Rahul' every time he visits customer organisation.


 This value is read by a report-generating transaction, which includes it in the report, before
the accountant realizes the error and corrects the travelling allowance value.

 The error arises because the second transaction sees the first transaction's update before it commits.
 Fig. shows an example of dirty read in which one transaction uses a value written by another
transaction that never forms part of the stable database.
 In dirty read, data are not committed when two transactions T1 and T2 are executed
concurrently and the first transaction T1 is rolled back after the second transaction T2 has
already accessed the uncommitted data. Thus, it violates the isolation property of
transactions.

 Consider the lost update example again, with the difference that this time the transaction T1 is
rolled back, eliminating the addition of INR 20000.
 Because transaction T2 subtracts INR 30000 from the original INR 70000, the correct answer
should be INR 40000.
 The transactions are as follows:
Transaction T1: take additional loan of INR 20000
EMP-LOAN-BAL = EMP-LOAN-BAL + 20000 (Rollback)
Transaction T2: repay loan of INR 30000
EMP-LOAN-BAL = EMP-LOAN-BAL - 30000
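This rollback scenario can also be replayed step by step; plain variables again stand in for the stored value and each transaction's view (an illustrative sketch):

```python
committed_value = 70000                  # last committed EMP-LOAN-BAL

# T1 takes an additional loan of INR 20000 but will be rolled back.
dirty = committed_value + 20000          # T1 writes 90000, not yet committed

# T2 reads the uncommitted (dirty) value before T1's ROLLBACK.
t2_read = dirty                          # dirty read: 90000
stored = committed_value                 # T1 rolls back; 90000 never persists

# T2 repays INR 30000 based on the dirty value it read.
stored = t2_read - 30000                 # T2 writes 60000

# Correct result: with T1 rolled back, only T2's repayment should apply.
correct = committed_value - 30000        # 40000
```

T2's arithmetic is internally correct, yet the final balance is wrong because it was derived from a value that never formed part of the stable database.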

Table-3


 Table 3 shows the serial execution of these transactions, in which the ROLLBACK of T1
completes before T2 begins, yielding the correct result of EMP-LOAN-BAL = 40000.
 Table 4 illustrates the sequence of execution resulting in the dirty read (or uncommitted data)
problem, when the ROLLBACK is completed after transaction T2 has begun its execution.

 Unrepeatable Read (or Inconsistent Retrievals) Problem

 Unrepeatable read (or inconsistent retrievals) occurs when a transaction calculates some
summary (aggregate) function over a set of data while other transactions are updating the
data.
 The problem is that the transaction might read some data before they are changed and
other data after they are changed, thereby yielding inconsistent results.
 In an unrepeatable read, the transaction T1 reads a record and then does some other
processing during which the transaction T2 updates the record.
 Now, if T1 rereads the record, the new value will be inconsistent with the previous value.
 Let us suppose that a report transaction produces a profile of average monthly travelling
details for every marketing manager of M/s KLY Associates whose travel bills are more than
5% different from the previous month's.
 If the travelling records are updated after this transaction has started, it is likely to show
details and totals which do not meet the criterion for generating the report.


 Fig. shows an example of unrepeatable read in which, if T1 were to read the value of X again after T2 had
updated X, the result in T1 would be different.
Transaction T1 calculates the total loan balance of all employees in the EMPLOYEE table
of M/s KLY Associates.
At a parallel level (at the same time), transaction T2 updates employee's loan balance
(EMP-LOAN-BAL) for two employees (EMP-ID) '106519' and '112233' of EMPLOYEE
table.

 The above two transactions are as follows:

Transaction T1: SELECT SUM (EMP-LOAN-BAL)
               FROM EMPLOYEE;

Transaction T2: UPDATE EMPLOYEE
               SET EMP-LOAN-BAL = EMP-LOAN-BAL + 20000
               WHERE EMP-ID = '106519';

               UPDATE EMPLOYEE
               SET EMP-LOAN-BAL = EMP-LOAN-BAL - 20000
               WHERE EMP-ID = '112233';

               COMMIT;

Table-5

 The initial and final EMP-LOAN-BAL values are shown in Table-5.
 Although the final results are correct after the adjustment, inconsistent retrievals are
possible during the correction process, as illustrated in Table-6.


Table-6
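The inconsistent retrieval can be reproduced by interleaving T2's two updates in the middle of T1's scan. The starting balances below are assumed for illustration; only the employee IDs and the INR 20000 adjustment come from the example above:

```python
# Assumed starting balances; '106519' and '112233' are the employees
# whose loans T2 adjusts by INR 20000 in opposite directions.
loans = {"106519": 50000, "112233": 90000, "E0001": 70000}
true_total = sum(loans.values())      # 210000, unchanged by T2's transfer

# T1 scans the table row by row, summing EMP-LOAN-BAL as it goes.
running = 0
running += loans["106519"]            # T1 reads 106519 -> 50000

# T2 commits both of its updates after T1 has already read 106519:
loans["106519"] += 20000              # missed by T1's scan
loans["112233"] -= 20000              # will be seen by T1's scan

running += loans["112233"]            # T1 reads 112233 -> 70000
running += loans["E0001"]             # T1 reads E0001  -> 70000
# running is 190000: neither the old total nor the new one.
```

T2 only moves money between two rows, so every consistent snapshot sums to the same total; T1's mixed-snapshot sum matches no state the database was ever in.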

 Degree of Consistency

 Following four levels of transaction consistency have been defined by Gray (1976):

 Level 0 consistency: In general, level 0 transactions are not recoverable since they may have
interactions with the external world which cannot be undone. They have the following
property:
 The transaction T does not overwrite other transaction's dirty (or uncommitted) data.

 Level 1 consistency: Level 1 consistency is the minimum requirement that allows a
transaction to be recovered in the event of a system failure. Level 1 transactions have the following properties:
 The transaction T does not overwrite other transaction's dirty (or uncommitted) data.
 The transaction T does not make any of its updates visible before it commits.

 Level 2 consistency: Level 2 transaction consistency isolates a transaction from the updates of other
transactions. Level 2 transactions have the following properties:
 The transaction T does not overwrite other transaction's dirty (or uncommitted) data.
 The transaction T does not make any of its updates visible before it commits.
 The transaction T does not read other transaction's dirty (or uncommitted) data.

 Level 3 consistency: Level 3 transaction consistency adds consistent reads so that successive
reads of a record will always give the same values. They have the following properties:
 The transaction T does not overwrite other transaction's dirty (or uncommitted) data.
 The transaction T does not make any of its updates visible before it commits.


 The transaction T does not read other transaction's dirty (or uncommitted) data.
 The transaction T can perform consistent reads, that is, no other transaction can update
data read by the transaction T before T has committed.

 Permutable Actions

 An action is a unit of processing that is indivisible from the DBMS's perspective.
 In systems where the granule is a page, the actions are typically read-page and write-page.
 The actions provided are determined by the system designers, but in all cases they are
free of side-effects.
 A pair of actions Ai and Aj is permutable if every execution of Ai followed by Aj has the same result as
the execution of Aj followed by Ai on the same granule. Actions on different granules are
always permutable.

 For the actions read and write we have:
 Read-Read: Permutable.
 Read-Write: Not permutable, since the result is different depending on whether the read is first
or the write is first.
 Write-Write: Not permutable, as the second write always nullifies the effect of the first
write.
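Permutability can be checked mechanically: run a pair of actions in both orders from the same starting state and compare the final state together with the values observed by the reads. A sketch; the action encoding is an assumption for illustration:

```python
def apply(state, action):
    """Apply one action to a granule's value; return (new_state, observed)."""
    kind, arg = action
    if kind == "read":
        return state, state        # a read observes the current value
    if kind == "write":
        return arg, None           # a write replaces the value
    raise ValueError(kind)

def permutable(a1, a2, state=0):
    """True if a1;a2 and a2;a1 give the same final state and observations."""
    s1, obs1 = apply(state, a1)
    s1, obs2 = apply(s1, a2)
    s2, obs2r = apply(state, a2)
    s2, obs1r = apply(s2, a1)
    return (s1, obs1, obs2) == (s2, obs1r, obs2r)

read = ("read", None)
write5, write9 = ("write", 5), ("write", 9)
```

Read-Read commutes; Read-Write fails because the read observes a different value in each order; Write-Write fails because the final state keeps only the second write.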

 Schedule

 A schedule (also called history) is a sequence of actions or operations (for example, reading,
writing, aborting or committing) that is constructed by merging the actions of a set of
transactions, respecting the sequence of actions within each transaction.
 As long as two transactions T1 and T2 access unrelated data, there is no conflict and the order of
execution is not relevant to the final result.
 Thus, the DBMS has an inbuilt software module called the scheduler, which determines the correct order of
execution.
 The scheduler establishes the order in which the operations within concurrent transactions are
executed.
 The scheduler interleaves the execution of database operations to ensure serialisability (as
explained in next section).
 The scheduler bases its actions on concurrency control algorithms, such as locking or time
stamping methods.


 The scheduler also ensures the efficient utilization of the central processing unit (CPU) of the computer
system.
 The Fig. above shows a schedule involving two transactions.
 It can be observed that the schedule does not contain an ABORT or COMMIT action for
either transaction.
 Schedules which contain either an ABORT or COMMIT action for each transaction whose
actions are listed in them are called complete schedules.
 If the actions of different transactions are not interleaved, that is, transactions are executed
one by one from start to finish, the schedule is called a serial schedule.
 A non-serial schedule is a schedule where the operations from a group of concurrent
transactions are interleaved.
 A serialisable schedule gives the benefits of concurrent execution without giving up any
correctness.
 The disadvantage of a serial schedule is that it represents inefficient processing because no
interleaving of operations from different transactions is permitted.
 This can lead to low CPU utilization while a transaction waits for disk input/output (I/O), or
for another transaction to terminate, thus slowing down processing considerably

 Serialisable Schedules

 A serialisable schedule is a schedule of a set of transactions in which the interleaved execution
produces the same effect as executing the transactions in some serial order, like a serial
schedule.
 The execution of transactions in a serialisable schedule is a sufficient condition for preventing
conflicts.
 The serial execution of transactions always leaves the database in a consistent state.
 Serialisability describes the concurrent execution of several transactions.
 The objective of Serialisability is to find the non-serial schedules that allow transactions to
execute concurrently without interfering with one another and thereby producing a database
state that could be produced by a serial execution.
 Serialisability must be guaranteed to prevent inconsistency from transactions interfering with
one another.
 The order of Read and Write operations are important in serialisability.
 The Serialisability rules are as follows:
 If two transactions T1 and T2 only Read a data item, they do not conflict and the order is not
important.
 If two transactions T1 and T2 either Read or Write completely separate data items, they do
not conflict and the execution order is not important.
 If one transaction T1 Writes a data item and another transaction T2 either Reads or Writes
the same data item, the order of execution is important.

 Serialisability can also be depicted by constructing a precedence graph.


 A precedence relationship between T1 and T2 can be defined as: transaction T1 precedes transaction T2
if there are two non-permutable actions A1 and A2 such that A1 is executed by T1
before A2 is executed by T2.
 Given the existence of non-permutable actions and the sequence of actions in a transaction it is
possible to define a partial order of transactions by constructing a precedence graph.
 A precedence graph is a directed graph in which:
 The set of vertices is the set of transactions.
 An arc exists between transactions T1 and T2 if T1 precedes T2.


 A schedule is serialisable if the precedence graph is acyclic.


 The serialisability property of transactions is important in multi-user and distributed databases,
where several transactions are likely to be executed concurrently.
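The precedence-graph test can be coded directly: add an arc Ti -> Tj for every pair of non-permutable actions on the same data item where Ti acts first, then declare the schedule serialisable exactly when the graph is acyclic. A sketch, with schedule entries as (transaction, action, item) tuples (an assumed encoding):

```python
def precedence_graph(schedule):
    """Arcs Ti -> Tj for non-permutable actions on the same item, Ti first."""
    edges = set()
    for i, (t1, a1, x1) in enumerate(schedule):
        for t2, a2, x2 in schedule[i + 1:]:
            if x1 == x2 and t1 != t2 and "write" in (a1, a2):
                edges.add((t1, t2))
    return edges

def is_serialisable(schedule):
    """A schedule is serialisable iff its precedence graph is acyclic."""
    graph = {}
    for u, v in precedence_graph(schedule):
        graph.setdefault(u, set()).add(v)
    visiting, done = set(), set()

    def has_cycle(u):
        visiting.add(u)
        for v in graph.get(u, ()):
            if v in visiting or (v not in done and has_cycle(v)):
                return True
        visiting.discard(u)
        done.add(u)
        return False

    return not any(has_cycle(u) for u in list(graph) if u not in done)

# T1 runs completely before T2: serial, hence serialisable.
ok = is_serialisable([("T1", "read", "X"), ("T1", "write", "X"),
                      ("T2", "read", "X"), ("T2", "write", "X")])
# T2's write slips between T1's read and write of X: not serialisable.
bad = is_serialisable([("T1", "read", "X"), ("T2", "write", "X"),
                       ("T1", "write", "X")])
```

In the second schedule the graph contains both T1 -> T2 (read before write) and T2 -> T1 (write before write), forming a cycle, so no equivalent serial order exists.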

 LOCKING METHODS FOR CONCURRENCY CONTROL

 Lock Granularity
 Lock Types
 Deadlocks

 Lock Granularity

 A database is basically represented as a collection of named data items.
 The size of the data item chosen as the unit of protection by a concurrency control program is
called GRANULARITY.
 Locking can take place at the following levels:
 Database level.
 Table level.
 Page level.
 Row (Tuple) level.
 Attributes (fields) level.

 Database Level Locking


 At database level locking, the entire database is locked.
 Thus, it prevents the use of any tables in the database by transaction T2 while transaction T1
is being executed.
 Database level of locking is suitable for batch processes.
 Being very slow, it is unsuitable for on-line multi-user DBMSs.

 Table level Locking


 At table level locking, the entire table is locked.
 Thus, it prevents the access to any row (tuple) by transaction T2 while transaction T1 is using
the table.
 If a transaction requires access to several tables, each table may be locked.
 However, two transactions can access the same database as long as they access different
tables.
 Table level locking is less restrictive than database level locking.
 Table level locks are not suitable for multi-user DBMSs.

 Page Level Locking


 At page level locking, the entire disk-page (or disk-block) is locked.
 A page has a fixed size such as 4 K, 8 K, 16 K, 32 K and so on.
 A table can span several pages, and a page can contain several rows (tuples) of one or more
tables.
 Page level of locking is most suitable for multi-user DBMSs.


 Row Level Locking


 At row level locking, a particular row (or tuple) is locked.
 A lock exists for each row in each table of the database.
 The DBMS allows concurrent transactions to access different rows of the same table, even if
the rows are located on the same page.
 The row level lock is much less restrictive than database level, table level, or page level
locks.
 The row level locking improves the availability of data. However, the management of row
level locking requires high overhead cost.

 Attribute (or Field) Level Locking


 At attribute level locking, a particular attribute (or field) is locked.
 Attribute level locking allows concurrent transactions to access the same row, as long as
they require the use of different attributes within the row.
 The attribute level lock yields the most flexible multi-user data access.
 It requires a high level of computer overhead.

 Lock Types
 The DBMS mainly uses the following types of locking techniques:
 Binary locking.
 Exclusive locking.
 Shared locking.
 Two-phase locking (2PL).

 Binary locking
 A binary lock can have two states or values: locked and unlocked (or 1 and 0, for simplicity).
A distinct lock is associated with each database item X.
If the value of the lock on X is 1, item X cannot be accessed by a database operation that
requests the item.
If the value of the lock on X is 0, the item can be accessed when requested. We refer to
the current value (or state) of the lock associated with item X as LOCK(X).

 Two operations, lock_item and unlock_item, are used with binary locking.
 Lock_item(X):
A transaction requests access to an item X by first issuing a lock_item(X) operation. If
LOCK(X) = 1, the transaction is forced to wait. If LOCK(X) = 0, it is set to 1 (the
transaction locks the item) and the transaction is allowed to access item X.


 Unlock_item (X):
When the transaction is through using the item, it issues an unlock_item(X) operation,
which sets LOCK(X) to 0 (unlocks the item) so that X may be accessed by other
transactions. Hence, a binary lock enforces mutual exclusion on the data item; i.e., at any
time only one transaction can hold the lock on an item.

 If the simple binary locking scheme described here is used, every transaction must obey the
following rules:
A transaction T must issue the operation lock_item(X) before any read_item(X) or
write_item(X) operations are performed in T.
A transaction T must issue the operation unlock_item(X) after all read_item(X) and
write_item(X) operations are completed in T.
A transaction T will not issue a lock_item(X) operation if it already holds the lock on item
X.
A transaction T will not issue an unlock_item(X) operation unless it already holds the
lock on item X.
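The lock_item/unlock_item pair can be modelled with a condition variable. The sketch below is an illustrative model of the protocol only, not a real DBMS lock manager (which would also track which transaction holds the lock):

```python
import threading

class BinaryLock:
    """Minimal binary lock: LOCK(X) is 1 (locked) or 0 (unlocked)."""

    def __init__(self):
        self._state = 0                    # 0 = unlocked, 1 = locked
        self._cond = threading.Condition()

    def lock_item(self):
        with self._cond:
            while self._state == 1:        # LOCK(X) = 1: the caller must wait
                self._cond.wait()
            self._state = 1                # LOCK(X) = 0: acquire the lock

    def unlock_item(self):
        with self._cond:
            self._state = 0                # release the item
            self._cond.notify()            # wake one waiting transaction
```

A transaction would call lock_item() before its read_item/write_item operations and unlock_item() after them, exactly as the rules above require.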

 Shared/Exclusive (or Read/Write) Locks:

 A read-locked item is also called share-locked, because other transactions are allowed to
read the item, whereas a write-locked item is called exclusive-locked, because a single
transaction exclusively holds the lock on the item.
 There are three locking operations: read_lock(X), write_lock(X), and unlock(X).


 When we use the shared/exclusive locking scheme, the system must enforce the following rules:

 A transaction T must issue the operation read_lock(X) or write_lock(X) before any
read_item(X) operation is performed in T.
 A transaction T must issue the operation write_lock(X) before any write_item(X) operation
is performed in T.
 A transaction T must issue the operation unlock(X) after all read_item(X) and write_item(X)
operations are completed in T.
 A transaction T will not issue a read_lock(X) operation if it already holds a read (shared) lock
or a write (exclusive) lock on item X. This rule may be relaxed.
 A transaction T will not issue a write_lock(X) operation if it already holds a read (shared)
lock or write (exclusive) lock on item X. This rule may be relaxed.
 A transaction T will not issue an unlock(X) operation unless it already holds a read (shared)
lock or a write (exclusive) lock on item X.


 Two-phase Locking (2PL)

 Two-phase locking (also called 2PL) is a method or a protocol of controlling concurrent
processing in which all locking operations precede the first unlocking operation.
 Thus, a transaction is said to follow the two-phase locking protocol if all locking operations
(such as read_Lock, write_Lock) precede the first unlock operation in the transaction.
 Two-phase locking is the standard protocol used to maintain level 3 consistency.
 2PL defines how transactions acquire and relinquish locks.
 The essential discipline is that after a transaction has released a lock, it may not obtain any
further locks. In practice, this means that transactions hold all their locks until they are ready
to commit.

 2PL has the following two phases:


A growing phase, in which a transaction acquires all the required locks without
unlocking any data. Once all locks have been acquired, the transaction is in its locked
point.
A shrinking phase, in which a transaction releases all locks and cannot obtain any new
lock.

 The above two-phase locking is governed by the following rules:


Two transactions cannot have conflicting locks.
No unlock operation can precede a lock operation in the same transaction.
No data are affected until all locks are obtained, that is, until the transaction is in its
locked point.
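The growing/shrinking discipline can be enforced with a small piece of bookkeeping per transaction. The class and exception names below are invented for this sketch:

```python
class TwoPhaseLockingError(Exception):
    """Raised when a transaction tries to lock after it has unlocked."""
    pass

class Transaction2PL:
    """Tracks a transaction's 2PL phases: growing, then shrinking."""

    def __init__(self, name):
        self.name = name
        self.locks = set()
        self.shrinking = False     # flips to True at the first unlock

    def acquire(self, item):
        if self.shrinking:
            raise TwoPhaseLockingError(
                f"{self.name}: cannot lock {item!r} after an unlock (2PL violated)")
        self.locks.add(item)       # growing phase: locks may still be added

    def release(self, item):
        self.shrinking = True      # entering the shrinking phase
        self.locks.discard(item)

t = Transaction2PL("T1")
t.acquire("A")
t.acquire("B")                     # growing phase; locked point reached here
t.release("A")                     # shrinking phase begins
try:
    t.acquire("C")                 # violates the two-phase rule
except TwoPhaseLockingError as e:
    print(e)
```

The raised exception corresponds to the rule that no lock operation may follow an unlock operation in the same transaction.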

 Fig. shows a schedule with strict two-phase locking in which transaction T1 would obtain an
exclusive lock on A first and then Read and Write A
 Fig. example of strict two-phase locking with serial execution in which first strict locking is
done as explained above, then transaction T2 would request an exclusive lock on A.
 However, this request cannot be granted until transaction T1 releases its exclusive lock on
A, and the DBMS therefore suspends transaction T2.
 Transaction T1 now proceeds to obtain an exclusive lock on B, Reads and Writes B, then
finally commits, at which time its locks are released.


 The lock request of transaction T2 is now granted, and it proceeds

 Deadlocks

 A deadlock is a condition in which two (or more) transactions in a set are waiting
simultaneously for locks held by some other transaction in the set.
 Neither transaction can continue because each transaction in the set is on a waiting queue,
waiting for one of the other transactions in the set to release the lock on an item.
 Thus, a deadlock is an impasse that may result when two or more transactions are each
waiting for locks to be released that are held by the other.
 Transactions whose lock requests have been refused are queued until the lock can be
granted.
 A deadlock is also called a circular waiting condition where two transactions are waiting
(directly or indirectly) for each other.
 Thus in a deadlock, two transactions are mutually excluded from accessing the next record
required to complete their transactions, also called a deadly embrace.

Table-7


 A deadlock exists when two transactions T1 and T2 exist in the following mode
Transaction T1 = access data items X and Y
Transaction T2 = access data items Y and X
If transaction T1 has not unlocked the data item Y, transaction T2 cannot begin.
Similarly, if transaction T2 has not unlocked the data item X, transaction T1 cannot
continue.
Transactions T1 and T2 wait indefinitely, each waiting for the other to unlock the
required data item.
Table-7 shows a deadlock situation of transactions T1 and T2.
In this example, only two concurrent transactions have been shown to demonstrate a
deadlock situation.

 Deadlock Detection and Prevention

 Deadlock detection is a periodic check by the DBMS to determine if the waiting line for
some resource exceeds a predetermined limit.
 The frequency of deadlocks is primarily dependent on the query load and the physical
organisation of the database.
 There are three basic schemes to detect and prevent deadlocks:

 Never allow deadlock (deadlock prevention):

Deadlock prevention technique avoids the conditions that lead to deadlocking.


It requires that every transaction lock all data items it needs in advance.
If any of the items cannot be obtained, none of the items are locked.
In other words, a transaction requesting a new lock is aborted if there is the possibility
that a deadlock can occur.
Thus, a timeout may be used to abort transactions that have been idle for too long.
This is a simple but indiscriminate approach.
If the transaction is aborted, all the changes made by this transaction are rolled back
and all locks obtained by the transaction are released.
The transaction is then rescheduled for execution.
Deadlock prevention technique is used in two-phase locking.

 Detect deadlock whenever a transaction is blocked (deadlock detection):

In a deadlock detection technique, the DBMS periodically tests the database for
deadlocks.
If a deadlock is found, one of the transactions is aborted and the other transaction
continues.
The aborted transaction is now rolled back and restarted.
This scheme is expensive since most blocked transactions are not involved in deadlocks.
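Detection amounts to finding a cycle in the waits-for graph. The sketch below makes the simplifying assumption that each blocked transaction waits for exactly one other; the transaction names are invented for the example:

```python
def find_deadlock(waits_for):
    """waits_for: dict mapping each transaction to the one it waits for
    (or None if it is not blocked). Returns the transactions forming a
    cycle, or None if there is no deadlock."""
    for start in waits_for:
        seen = []
        t = start
        while t is not None and t not in seen:   # follow the waits-for chain
            seen.append(t)
            t = waits_for.get(t)
        if t is not None and t in seen:
            return seen[seen.index(t):]          # the cycle itself
    return None

# T1 waits for T2 and T2 waits for T1: a deadlock (deadly embrace)
cycle = find_deadlock({"T1": "T2", "T2": "T1", "T3": None})
print(cycle)   # one of these transactions is chosen as the victim and aborted
```

Once a cycle is found, the DBMS aborts one transaction in it (the victim) and rolls it back, which breaks the cycle and lets the others proceed.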

 Detect deadlocks periodically (deadlock avoidance):

In a deadlock avoidance technique, the transaction must obtain all the locks it needs
before it can be executed.


Thus, it avoids rollback of conflicting transactions by requiring that locks be obtained in
succession.
This is the optimal scheme if the detection period is suitable. The ideal period is that
which, on average, detects one deadlock cycle.
A shorter period than this means that deadlock detection is done unnecessarily and a
longer period involves transactions in unnecessarily long waits until the deadlock is
broken.

 Deadlock in Distributed System

 A deadlock in a distributed system may be either local or global.


 The local deadlocks are handled in the same way as the deadlocks in centralised systems.
 Global deadlocks occur when there is a cycle in the global waits-for graph involving cohorts
in session wait and lock wait.
 Figure shows an example of distributed deadlock.

 A distributed deadlock has a number of cohorts, each operating on a separate node of the
system, as shown in Fig.
 A cohort is a process and so may be in one of a number of states (for example, processor
wait, execution wait, I/O wait and so on).
 The session wait and lock wait are the states of interest for deadlock detection.
 In session wait, a cohort waits for data from one or more other cohorts.
 The detection of deadlock in a distributed system is a more difficult problem because a
cycle may involve several nodes.
 Cycles in a distributed waits-for graph are detected through actions of a designated process
at one node which:
Periodically requests fragments of local waits-for graph from all other distributed sites.
Receives from each site its local graph containing cohorts in session wait.
Constructs the global waits-for graph by matching up the local fragments.


Selects victims until there are no remaining cycles in the global graph.
Broadcasts the result, so that the session managers at the sites coordinating the victims
can abort them.

 TIMESTAMP METHODS FOR CONCURRENCY CONTROL

 Timestamp is a unique identifier created by the DBMS to identify the relative starting time of a
transaction.
 Typically, timestamp values are assigned in the order in which the transactions are submitted to
the system.
 So, a timestamp can be thought of as the transaction start time.
 Therefore, timestamping is a method of concurrency control in which each transaction is
assigned a transaction timestamp.
 Timestamps must have two properties, namely:
 Uniqueness: The uniqueness property assures that no equal timestamp values can exist.
 Monotonicity: The monotonicity property assures that timestamp values always increase.

 Granule Timestamps
 Timestamp Ordering
 Conflict Resolution in Timestamps

 Granule Timestamps

 Granule timestamp is a record of the timestamp of the last transaction to access it.
 Each granule accessed by an active transaction must have a granule timestamp.
 A separate record of the last Read and Write accesses may be kept.
 Granule timestamps may cause additional Write operations for Read accesses if they are
stored with the granules.
 The problem can be avoided by maintaining granule timestamps as an in-memory table.
 The table may be of limited size, since conflicts may only occur between current transactions.
 An entry in a granule timestamp table consists of the granule identifier and the transaction
timestamp.
 The record containing the largest (latest) granule timestamp removed from the table is also
maintained.
 A search for a granule timestamp, using the granule identifier, will either be successful or will
use the largest removed timestamp.

 Timestamp Ordering

 Following are the three basic variants of timestamp-based methods of concurrency control:
 Total timestamp ordering
 Partial timestamp ordering
 Multiversion timestamp ordering


 Total timestamp ordering


 The total timestamp ordering algorithm depends on maintaining access to granules in
timestamp order by aborting one of the transactions involved in any conflicting access.
 No distinction is made between Read and Write access, so only a single value is required for
each granule timestamp.

 Partial timestamp ordering


 In a partial timestamp ordering, only non-permutable actions are ordered to improve upon
the total timestamp ordering.
 In this case, both Read and Write granule timestamps are stored.
 The algorithm allows the granule to be read by any transaction younger than the last
transaction that updated the granule.
 A transaction is aborted if it tries to update a granule that has previously been accessed by
a younger transaction.
 The partial timestamp ordering algorithm aborts fewer transactions than the total
timestamp ordering algorithm, at the cost of extra storage for granule timestamps.
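The Read and Write checks of partial timestamp ordering can be sketched as two small functions. A smaller timestamp means an older transaction; the granule is modelled here as a plain dict holding its two timestamps (an invented layout for the example):

```python
def try_read(tx_ts, granule):
    """Allow the read only if no younger transaction has written the granule."""
    if tx_ts < granule["write_ts"]:
        return "abort"            # granule was updated by a younger transaction
    granule["read_ts"] = max(granule["read_ts"], tx_ts)
    return "ok"

def try_write(tx_ts, granule):
    """Allow the write only if no younger transaction has read or written it."""
    if tx_ts < granule["read_ts"] or tx_ts < granule["write_ts"]:
        return "abort"            # a younger transaction already accessed it
    granule["write_ts"] = tx_ts
    return "ok"

g = {"read_ts": 0, "write_ts": 0}
print(try_write(5, g))    # ok: nothing younger has touched the granule
print(try_read(3, g))     # abort: granule updated by a younger transaction (ts 5)
print(try_read(7, g))     # ok: the reader is younger than the last writer
```

An aborted transaction is restarted with a new timestamp, as described in the conflict resolution discussion below.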

 Multiversion timestamp ordering


 The multiversion timestamp ordering algorithm stores several versions of an updated
granule, allowing each transaction to see a consistent set of versions for all the granules it
accesses.
 So, it reduces the conflicts that result in transaction restarts to those where there is a
Write-Write conflict.
 Each update of a granule creates a new version, with an associated granule timestamp.
 A transaction that requires read access to the granule sees the youngest version that is
older than the transaction.
 That is, the version having a timestamp equal to or immediately below the transaction's
timestamp.
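The version-selection rule just stated can be sketched as a lookup over (timestamp, value) pairs; the version list here is invented for the example:

```python
def read_version(versions, tx_ts):
    """versions: list of (write_ts, value) pairs for one granule.
    Return the value of the youngest version with write_ts <= tx_ts,
    or None if no such version exists."""
    older = [(ts, v) for ts, v in versions if ts <= tx_ts]
    if not older:
        return None
    return max(older)[1]       # version with the largest qualifying timestamp

versions = [(1, "v1"), (4, "v4"), (9, "v9")]
print(read_version(versions, 6))   # 'v4': youngest version older than ts 6
print(read_version(versions, 9))   # 'v9': timestamp equal to the transaction's
```

Because readers always find a suitable committed version, only Write-Write conflicts force a restart under this scheme.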

 Conflict Resolution in Timestamps

 To deal with conflicts in timestamp algorithms, some transactions involved in conflicts are
made to wait while others are aborted.
 Following are the main strategies of conflict resolution in timestamps:
 Wait-Die:
 The older transaction waits for the younger if the younger has accessed the granule first.
 The younger transaction is aborted (dies) and restarted if it tries to access a granule after
an older concurrent transaction.
 Wound-Wait:
 The older transaction pre-empts the younger by suspending (wounding) it if the younger
transaction tries to access a granule after an older concurrent transaction.
 An older transaction will wait for a younger one to commit if the younger has accessed a
granule that both want.
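Both strategies reduce to a comparison of transaction timestamps (a smaller timestamp means an older transaction). A minimal sketch of the two decision rules:

```python
def wait_die(requester_ts, holder_ts):
    """Wait-Die: an older requester waits; a younger requester dies."""
    return "wait" if requester_ts < holder_ts else "die"

def wound_wait(requester_ts, holder_ts):
    """Wound-Wait: an older requester wounds (pre-empts) the holder;
    a younger requester waits."""
    return "wound" if requester_ts < holder_ts else "wait"

print(wait_die(1, 5))     # 'wait' : older transaction waits for the younger
print(wait_die(5, 1))     # 'die'  : younger transaction is aborted
print(wound_wait(1, 5))   # 'wound': older pre-empts the younger holder
print(wound_wait(5, 1))   # 'wait' : younger waits for the older
```

In both schemes the decision always favours the older transaction, which is what guarantees that no deadlock cycle can form.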

 The handling of aborted transactions is an important aspect of a conflict resolution algorithm.


 In the case that the aborted transaction is the one requesting access, the transaction must be
restarted with a new (younger) timestamp.
 It is possible that the transaction can be repeatedly aborted if there are conflicts with other
transactions.


 An aborted transaction that had prior access to the granule where the conflict occurred can
be restarted with the same timestamp.
 This gives it priority and eliminates the possibility of the transaction being continuously
locked out.

 Drawbacks of Timestamp

 Each value stored in the database requires two additional timestamp fields, one for the last
time the field (attribute) was read and one for the last update.
 Timestamping thus increases the memory requirements and the processing overhead of the
database.

 OPTIMISTIC METHODS FOR CONCURRENCY CONTROL

 The optimistic method of concurrency control is based on the assumption that conflicts of
database operations are rare and that it is better to let transactions run to completion and only
check for conflicts before they commit.
 An optimistic concurrency control method is also known as validation or certification methods.
 No checking is done while the transaction is executing.
 The optimistic method does not require locking or timestamping techniques.
 Instead, a transaction is executed without restrictions until it is committed.
 In optimistic methods, each transaction moves through the following phases:
 Read phase.
 Validation or certification phase.
 Write phase.

 Read phase

 In a Read phase, the updates are prepared using private (or local) copies (or versions) of the
granule.
 In this phase, the transaction reads values of committed data from the database, executes
the needed computations, and makes the updates to a private copy of the database values.
 All update operations of the transaction are recorded in a temporary update file, which is
not accessed by the remaining transactions.
 It is conventional to allocate a timestamp to each transaction at the end of its Read phase
to determine the set of transactions that must be examined by the validation procedure.
 This set consists of those transactions that have finished their Read phases since the start of
the transaction being validated.

 Validation or certification phase

 In a validation (or certification) phase, the transaction is validated to assure that the
changes made will not affect the integrity and consistency of the database.
 If the validation test is positive, the transaction goes to the write phase.
 If the validation test is negative, the transaction is restarted, and the changes are discarded.
 Thus, in this phase the list of granules is checked for conflicts.
 If conflicts are detected in this phase, the transaction is aborted and restarted.


 The validation algorithm must check that the transaction has:
Seen all modifications of transactions committed after it starts.
Not read granules updated by a transaction committed after its start.
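A validation check of this kind can be sketched by intersecting the read set of the validating transaction with the write sets of transactions that committed after it started. The dictionary layout and timestamps below are invented for the example:

```python
def validate(tx, committed):
    """Backward-validation sketch: tx fails validation if any transaction
    that committed after tx started wrote a granule that tx read."""
    for other in committed:
        if other["commit_ts"] > tx["start_ts"]:
            if other["write_set"] & tx["read_set"]:
                return False       # tx read a granule updated after its start
    return True

tx = {"start_ts": 10, "read_set": {"X", "Y"}, "write_set": {"Y"}}
committed = [{"commit_ts": 12, "write_set": {"X"}}]   # wrote X after tx began
print(validate(tx, committed))   # False: tx is restarted with its changes discarded
```

If the check passes, the transaction moves on to the Write phase; if it fails, the private copies are simply thrown away and the transaction restarts.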

 Write Phase

 In a Write phase, the changes are permanently applied to the database and the updated
granules are made public.
 Otherwise, the updates are discarded and the transaction is restarted.
 This phase is only for Read-Write transactions and not for Read-only transactions.

 Advantages of Optimistic Methods for Concurrency Control

 This technique is very efficient when conflicts are rare. The occasional conflicts result in a
transaction rollback.
 The rollback involves only the local copy of data; the database itself is not involved, and
thus there will not be any cascading rollbacks.

 Problems of Optimistic Methods for Concurrency Control

 Conflicts are expensive to deal with, since the conflicting transaction must be rolled back.
 Longer transactions are more likely to have conflicts and may be repeatedly rolled back
because of conflicts with short transactions.

 Applications of Optimistic Methods for Concurrency Control

 Only suitable for environments where there are few conflicts and no long transactions.
 Acceptable for mostly Read or Query database systems that require very few update
transactions.

