
TRANSACTION MANAGEMENT AND CONCURRENCY CONTROL

Process: A process (sometimes called a task or a job) is, informally, a program in execution. A process is not the same as a program: there is a difference between a passive program stored on disk and an actively executing process. Multiple people can run the same program; each running copy corresponds to a distinct process. The program is only part of a process; the process also contains the execution state.
-Database transactions reflect real-world transactions that are triggered by events such as buying a product, registering for a course or making a deposit to your account. A transaction may contain many parts, e.g. a sales transaction may require updating the customer's account and adjusting the product inventory. All parts of a transaction must be successfully completed to prevent data integrity problems.

Definition
A transaction is a series of actions, carried out by a single user or application program, which must be treated as a logical unit of work: it is either entirely completed or aborted, and no intermediate states are accepted. It results from the execution of a user program delimited by statements (function calls) of the form begin transaction and end transaction.

Transaction States
A transaction that changes the contents of the database must transform the database from one consistent state to another. The database state is the collection of all the stored data items (values) in the database at a given point in time. A consistent database state is one in which all data integrity constraints are satisfied.

If a transaction completes successfully, it is said to have committed, and the database reaches a new consistent state. If the transaction does not execute successfully, it is aborted, and the database is restored to the previous consistent state through ROLLBACK (undoing).

The partially committed state occurs after the final statement has been executed. At this point the transaction may still be aborted because it violates integrity constraints. Alternatively, the system may fail before the data is recorded on secondary storage, in which case the transaction goes to the failed state and is aborted.

Failed state: occurs when the transaction cannot be committed or is aborted while active.
Begin transaction: marks the beginning of transaction execution.
End transaction: specifies that the READ and WRITE transaction operations have ended.
Active state: a transaction goes into the active state immediately after it starts execution, where it can issue read and write operations.
Partially committed state: at this point, the recovery protocol checks whether the transaction execution violates the integrity constraints. If not, the updates are committed; otherwise the transaction is aborted.
Terminated state: corresponds to the transaction leaving the system.
Commit transaction: signals a successful end of the transaction, so that any changes (updates) executed by the transaction can be safely committed to the database and will not be lost.
Abort (rollback): signals that the transaction has ended unsuccessfully, so that any changes or effects the transaction may have applied to the database must be undone.
Undo: similar to rollback, but it applies to a single operation rather than to a whole transaction.
Redo: specifies that certain transaction operations must be redone to ensure that all the operations of a committed transaction have been applied successfully to the database.
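The commit and abort operations listed above can be tried out with Python's built-in sqlite3 module. This is only a minimal sketch: the account table, its CHECK constraint and the helper names make_db, transfer and balances are assumptions made for illustration, not part of these notes.

```python
import sqlite3

def make_db():
    """Create a throwaway in-memory database with two accounts (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account (id INTEGER PRIMARY KEY, bal INTEGER CHECK (bal >= 0))"
    )
    conn.execute("INSERT INTO account VALUES (1, 100)")
    conn.execute("INSERT INTO account VALUES (2, 50)")
    conn.commit()
    return conn

def transfer(conn, src, dst, amount):
    """Credit dst, then debit src, as one logical unit of work."""
    try:
        # Used as a context manager, the connection commits if the block
        # succeeds and rolls back if it raises.
        with conn:
            conn.execute("UPDATE account SET bal = bal + ? WHERE id = ?", (amount, dst))
            conn.execute("UPDATE account SET bal = bal - ? WHERE id = ?", (amount, src))
        return "committed"
    except sqlite3.IntegrityError:
        # The debit violated the integrity constraint, so the credit is undone too.
        return "aborted"

def balances(conn):
    """Return the current balances as {id: bal}."""
    return dict(conn.execute("SELECT id, bal FROM account ORDER BY id"))
```

With the starting balances of 100 and 50, transferring 30 from account 1 to account 2 commits and leaves 70 and 80. Transferring 500 afterwards aborts, and the balances stay at 70 and 80 even though the credit had already been applied when the debit failed.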

Transaction execution

PROPERTIES OF A TRANSACTION (ACID)

(i) ATOMICITY (all or nothing): A transaction is performed in its entirety or not performed at all. All operations (parts) of the transaction must be reflected in the database properly; otherwise the transaction is aborted. A transaction is treated as an atomic unit of work, and it is the responsibility of the recovery subsystem of the DBMS to ensure atomicity.

(ii) CONSISTENCY (SERIALIZABILITY): A transaction is consistency preserving if its complete execution takes the DB from one consistent state to another. Concurrent transactions are treated as though they were executed in serial order (one after another). Thus the execution of a transaction in isolation preserves the consistency of the database. It is the responsibility of the DBMS to enforce integrity and consistency.

(iii) ISOLATION (independence): The execution of a transaction should not be interfered with in any way by other transactions executing concurrently, i.e. the data used by a transaction during its execution cannot be used by another transaction until the first one is completed. It is the responsibility of the concurrency control system to ensure isolation.

(iv) DURABILITY (persistence): Changes (updates) applied to the DB by a committed transaction persist and cannot be lost, even in the event of system failure. This indicates the permanence of the DB's consistent state. It is the responsibility of the recovery system to ensure durability.

CONCURRENCY, SERIALIZABILITY AND DEADLOCK

Concurrency Control
The process of managing simultaneous operations on the database without having them interfere with one another (Connolly). It is required because many users wish to access the same data, and accessing the same data can lead to errors.

-In a single-user database only one user is accessing the data at any time. This means that the DBMS does not have to be concerned with how changes made to the database will affect other users. In a multi-user database many users may be accessing the data at the same time, and the operations of one user may interfere with other users of the database.

-The DBMS uses concurrency control to manage multi-user databases. McFadden et al define concurrency control as being concerned with preventing loss of data integrity due to interference between users in a multi-user environment. Concurrency control provides a mechanism for avoiding and managing conflict between users.

-Conditions for conflict include:
1. Transactions that begin at the same time.
2. Transactions that operate independently of each other, that is, transactions that do not co-ordinate their access to the database.
3. Transactions reading and/or writing the same data items.

Conflicting Transactions
1. Record for product 10 has value 50.
2. Transaction Y reads record for product 10.
3. Transaction X reads record for product 10.
4. Transaction Y increments product 10's value by 15.
5. Transaction X decrements product 10's value by 20.
6. Transaction Y writes new record for product 10 to disc.
7. Transaction X writes new record for product 10 to disc.

Transactions Y and X have been executed at the same time:
Transaction Y: read product 10's value as 50, added 15 to produce a value of 65, and wrote the updated product 10 record with a value of 65.
Transaction X: read product 10's value as 50, subtracted 20 to produce a value of 30, and wrote the updated product 10 record with a value of 30.

The result of this execution is that product 10 has a value of 30. Transaction X has overwritten the result of transaction Y. Product 10 should have a value of 45, i.e. 50 + 15 - 20 = 45.
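The seven steps above can be replayed in a few lines of Python. This is a small sketch rather than a model of a real DBMS: the run function, the step tuples and the key "product10" are names invented for the illustration, and each transaction simply keeps a private copy of the value it read.

```python
def run(schedule, db):
    """Execute (transaction, operation, argument) steps against a dict database."""
    local = {}  # each transaction's private working copy
    for txn, op, arg in schedule:
        if op == "read":
            local[txn] = db[arg]      # copy the stored value
        elif op == "add":
            local[txn] += arg         # change only the private copy
        elif op == "write":
            db[arg] = local[txn]      # overwrite the stored value
    return db

# Steps 2 to 7 of the example: Y and X overlap.
interleaved = [
    ("Y", "read", "product10"),
    ("X", "read", "product10"),
    ("Y", "add", 15),
    ("X", "add", -20),
    ("Y", "write", "product10"),
    ("X", "write", "product10"),
]

# The same operations with Y finishing before X starts.
serial = [
    ("Y", "read", "product10"),
    ("Y", "add", 15),
    ("Y", "write", "product10"),
    ("X", "read", "product10"),
    ("X", "add", -20),
    ("X", "write", "product10"),
]
```

Starting from a value of 50, run(interleaved, ...) leaves product 10 at 30, while run(serial, ...) leaves it at 45, matching the figures in the example.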

Problem: Both transactions read and updated the same value for product 10.

There are three common types of conflict problem:
1. The lost update problem
2. The uncommitted dependency problem
3. The inconsistent analysis problem

The Lost Update Problem

-In this example, transaction Y has read the value of bal at time t2 as 100 and transaction X has read the value of bal at time t3 as 100. At time t4, transaction Y writes the new value of bal (200) to disc. Meanwhile, at time t4, transaction X subtracts 10 from its own copy of bal (100) to produce 90. Transaction X updates the value of bal on disc at time t5. The result is that the update performed by transaction Y (bal + 100 = 200) has been lost: transaction X has overwritten the result of transaction Y. This problem is avoided by not allowing transaction X to read bal until transaction Y has committed its update.

The Uncommitted Dependency Problem

-In this example, transaction Y reads and updates the value of bal (100 + 100 = 200) and writes the result at time t4. At time t5, transaction X reads the value of bal (200) written by transaction Y and updates it. However, at time t6 transaction Y fails and is rolled back. This means that the update it made to bal is undone and the value of bal is returned to 100. Transaction X is therefore updating an incorrect value of bal (200); it should be updating bal = 100, because transaction Y has been rolled back and its changes undone. Transaction X has used a result of transaction Y that was never committed. This problem is avoided by not allowing transaction X to read the value of bal until transaction Y either commits or rolls back.

The Inconsistent Analysis Problem
-In this example, transaction Y is summing the values of balx, baly and balz. However, at the same time, transaction X is transferring 10 pounds between balx and balz. Because transaction Y reads a mix of balances from before and after the transfer, its final result is incorrect. This problem would be solved by preventing transaction X from transferring the money between accounts before transaction Y has committed.
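The inconsistent analysis problem can be reproduced with one fixed interleaving. The sketch below assumes starting balances of 100, 50 and 25 and a transfer from balx to balz; neither the amounts nor the direction are given in these notes, and summed_during_transfer is a name made up for the illustration.

```python
def summed_during_transfer(balances, amount=10):
    """Interleave Y's summation with X's transfer from balx to balz."""
    db = dict(balances)
    total = db["balx"]        # Y reads balx before the transfer starts
    db["balx"] -= amount      # X debits balx
    total += db["baly"]       # Y reads baly
    db["balz"] += amount      # X credits balz
    total += db["balz"]       # Y reads balz after the transfer has finished
    return total, sum(db.values())
```

With the assumed balances, Y reports 185 although the three accounts hold 175 both before and after the transfer: the 10 pounds is counted once in the old balx and again in the new balz.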

Schedules

Schedule: a sequence of reads/writes by a group of transactions.

Types of schedules
Serial schedule: a schedule where transactions are executed consecutively.
Non-serial schedule: a schedule where the operations of the transactions are interleaved.

-The lost update, uncommitted dependency and inconsistent analysis problems are caused by executing two or more transactions at the same time. It is possible to avoid all of these problems by executing the transactions one at a time, with each transaction committed before the next begins. However, it is frequently possible to interleave the execution of transactions, that is, to let the operations of two transactions overlap as they execute.

-A serial schedule guarantees that the transactions will not conflict, because the transactions are run at different times. However, different serial schedules may produce different results. A non-serial schedule does not guarantee that transactions will not conflict.

Serial Schedule

-Transactions Y and X are executed one after another. Therefore, they cannot interfere with each other. This is a serial schedule.

Non-Serial Schedule
-Transactions X and Y are interleaved. That is, the operations of transaction X overlap with the operations of transaction Y. This is a non-serial schedule. This schedule produces a conflict because, at the same time as transaction X is transferring 10 from bal1 to bal2, transaction Y is also transferring money between bal1 and bal2.
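One simple way to tell the two kinds of schedule apart in code is to write a schedule as a list of (transaction, operation, item) steps and check whether any transaction resumes after another one has started. The representation and the name is_serial are assumptions made for this sketch.

```python
def is_serial(schedule):
    """Return True if every transaction's steps form one unbroken run."""
    finished = set()   # transactions whose run has already ended
    current = None     # transaction whose run is in progress
    for txn, _op, _item in schedule:
        if txn != current:
            if txn in finished:
                return False          # txn resumes after being interrupted
            if current is not None:
                finished.add(current)
            current = txn
    return True

serial_schedule = [
    ("Y", "read", "bal1"), ("Y", "write", "bal1"),
    ("X", "read", "bal1"), ("X", "write", "bal1"),
]
non_serial_schedule = [
    ("X", "read", "bal1"), ("Y", "read", "bal1"),
    ("X", "write", "bal1"), ("Y", "write", "bal1"),
]
```

Here is_serial(serial_schedule) is True and is_serial(non_serial_schedule) is False.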

Serialisable Schedule
-A non-serial schedule that produces the same result as some serial schedule is called a serialisable schedule. For instance, consider the figure below:

-Transactions X and Y are interleaved and, therefore, this is a non-serial schedule. However, this schedule does not cause conflict. The result of this schedule is the same as executing transaction X before transaction Y.

Serialisability
Executing a serialisable schedule is equivalent to executing some serial schedule. However, the serialisable schedule may make better use of the computing resources.

Serialisability Example
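As a rough illustration of the definition, the sketch below runs a schedule and compares its final database state with the state produced by every serial ordering of the same transactions. It checks one starting state only, which is weaker than the conflict-based tests a DBMS would normally rely on. The step format, the two sample schedules and the names execute and is_serialisable are assumptions made for this sketch and are not taken from the figures in these notes.

```python
from itertools import permutations

def execute(schedule, initial):
    """Run (txn, op, item, amount) steps; each txn keeps private copies of items."""
    db, local = dict(initial), {}
    for txn, op, item, amount in schedule:
        if op == "read":
            local[txn, item] = db[item]
        elif op == "add":
            local[txn, item] += amount
        elif op == "write":
            db[item] = local[txn, item]
    return db

def is_serialisable(schedule, initial):
    """True if some serial order of the transactions gives the same final state."""
    txns = list(dict.fromkeys(step[0] for step in schedule))
    outcome = execute(schedule, initial)
    for order in permutations(txns):
        serial = [step for t in order for step in schedule if step[0] == t]
        if execute(serial, initial) == outcome:
            return True
    return False

# X moves 10 from bal1 to bal2; Y adds 5 to bal1 after X has written bal1.
interleaved_ok = [
    ("X", "read", "bal1", 0), ("X", "add", "bal1", -10), ("X", "write", "bal1", 0),
    ("Y", "read", "bal1", 0), ("Y", "add", "bal1", 5), ("Y", "write", "bal1", 0),
    ("X", "read", "bal2", 0), ("X", "add", "bal2", 10), ("X", "write", "bal2", 0),
]

# Both transactions read bal1 before either writes it.
lost_update = [
    ("X", "read", "bal1", 0), ("Y", "read", "bal1", 0),
    ("X", "add", "bal1", -10), ("Y", "add", "bal1", 5),
    ("X", "write", "bal1", 0), ("Y", "write", "bal1", 0),
]
```

Starting from bal1 = 100 and bal2 = 100, interleaved_ok ends with the same balances as running X and then Y, so is_serialisable returns True. lost_update leaves bal1 at 105, which neither serial order produces, so it returns False.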

-Schedule 1, above, is a non-serial schedule that produces the same result as the serial schedule 2. Therefore, schedule 1 is a serialisable schedule; that is, there is an equivalent serial schedule which may be used. Schedule 1 would not produce the same result as executing transaction Y before transaction X, but it produces the same result as executing transaction Y after transaction X.

Concurrency Control Techniques

Locking
-A DBMS can ensure that a schedule for a set of transactions is a serialisable schedule by requiring the transactions to lock data items before they use them.
-Connolly et al define locking as a procedure used to control concurrent access to data. When one transaction is accessing the database, a lock may deny access to other transactions to prevent incorrect updates.
-A lock is used by a transaction to notify the DBMS that the transaction is about to read or write a particular data item. The DBMS may then take steps to avoid conflict with other transactions.
-There are two main types of locks:
Read lock: allows a transaction to read a data item but not to change its contents.
Write lock: allows a transaction to both read and update a data item.
-More than one transaction may have a read lock on a data item. This is because none of the transactions can change the data item and, therefore, they will all be working with the same value. Only one transaction may have a write lock on a data item at any one time, and while it does, other transactions may not hold read locks on that data item. Other transactions would conflict if they tried to use the data item's value while it is being updated.

Using Locks
-Locks determine how a transaction may use a data item. When a transaction wishes to access a data item it must lock it first. When a transaction requests a lock on a data item, the DBMS checks if the data item is already locked. If the data item is not locked, the transaction is allowed to lock it. If the data item is read-locked, the transaction may also obtain a read lock on it. If the data item is write-locked, the transaction must wait until the lock is released. Hence, when two transactions wish to update the same data item, one of them will be given a write lock and the other will be required to wait until the first transaction finishes.

Granularity of Locks
-Locks can be categorized according to their level of granularity. There are three major levels of granularity:
Database locks: a database lock stops access to the whole database.
Relation locks: a table lock stops access to a single relation.
Tuple locks: a record lock stops access to a single tuple in a relation.
-Selecting the correct granularity of locks used by a transaction is important. For example, a transaction that locks the database will exclude all other transactions from accessing the database (a database lock creates a single-user database). A database lock is normally used to perform operations that require exclusive access to the database, for example a complete database backup. Simple reads and updates will normally use tuple locks.

TYPES OF LOCKS
1. Binary locks: A binary lock has two states (values), i.e. locked (1) or unlocked (0). Suppose that X is a data item. If the value of the lock on X is 1, the item cannot be accessed by a database operation that requests it. If the value of the lock on X is 0, the item can be accessed when requested. This lock and unlock mechanism prevents conflicting access, but it is considered too restrictive to yield optimal concurrency, because at most one transaction can hold the lock on an item at any time.
2. Shared locks: Used during read operations, since reads cannot conflict, i.e. transactions can be allowed to access the same data item if they all access it for reading purposes only. A shared lock is issued when a transaction wants to read data from the database and no other transaction is updating the same data.

3. Exclusive locks: If a transaction is to write to a data item, it must have exclusive access to it. While a transaction holds an exclusive lock, no other transaction can read or update the item. Exclusive locks must be used when the potential for conflict exists; an exclusive lock is granted if a transaction wants to update (write) the data and no other locks are held on it.

Two Phase Locking
-To be able to guarantee serialisability, DBMSs require transactions to use the two-phase locking protocol. Two-phase locking is the procedure used by transactions to obtain and release locks on data items.
-Transactions that use the two-phase locking protocol can be safely interleaved with other transactions that also use the two-phase locking protocol.
-A transaction that uses two-phase locking has two parts to it:
1. The growing (expanding) phase: new locks on items can be acquired but none can be released. During this phase a transaction obtains all the locks it will require during its processing.
2. The shrinking phase: existing locks can be released but no new ones can be acquired. During this phase a transaction releases all the locks it obtained during the growing phase.
-Transactions must obtain locks on all data items they will use. Once a transaction releases a lock, it is not allowed to obtain any other locks. Using the two-phase locking protocol, a transaction will obtain all the locks it requires (growing), process the data and release all the locks (shrinking). All locking operations (read_lock, write_lock) precede the first unlock operation in the transaction.

There are several types of two-phase locking:

Solving the Lost Update Problem
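A minimal sketch of how a write lock removes the lost update, using Python threads in place of transactions. The Account class and the update and run_concurrently helpers are hypothetical. A single threading.Lock stands in for the write lock on bal: each transaction takes it before reading (growing phase) and gives it up only after writing (shrinking phase).

```python
import threading

class Account:
    def __init__(self, bal):
        self.bal = bal
        self.write_lock = threading.Lock()  # stand-in for a write lock on bal

def update(account, delta):
    """Read, change and write bal while holding the write lock throughout."""
    with account.write_lock:        # acquire before the first read
        value = account.bal         # read
        value += delta              # compute on a private copy
        account.bal = value         # write
    # the lock is released here, after the write has been applied

def run_concurrently(bal, deltas):
    """Run one updating transaction per delta and return the final balance."""
    account = Account(bal)
    threads = [threading.Thread(target=update, args=(account, d)) for d in deltas]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return account.bal
```

With bal = 100, transaction Y adding 100 and transaction X subtracting 10, run_concurrently(100, [100, -10]) returns 190 whichever thread gets the lock first, because the second transaction cannot read bal until the first has written it.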

Solving the Uncommitted Dependency Problem
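The same idea handles the uncommitted dependency if the write lock is held until the writer commits or rolls back. The sketch below is a single-threaded model with invented names (LockedItem, MustWait). Where a real DBMS would make the second transaction wait, this model raises an exception so the effect is easy to see.

```python
class MustWait(Exception):
    """Raised where a real DBMS would make the requesting transaction wait."""

class LockedItem:
    def __init__(self, value):
        self.value = value
        self.holder = None         # transaction holding the write lock, if any
        self.before_image = None   # value to restore if the holder rolls back

    def _check(self, txn):
        if self.holder not in (None, txn):
            raise MustWait(f"{txn} must wait for {self.holder}")

    def read(self, txn):
        self._check(txn)
        return self.value

    def write(self, txn, new_value):
        self._check(txn)
        if self.holder is None:
            # first write by txn: take the lock and remember the old value
            self.holder, self.before_image = txn, self.value
        self.value = new_value

    def commit(self, txn):
        if self.holder == txn:
            self.holder = self.before_image = None

    def rollback(self, txn):
        if self.holder == txn:
            self.value = self.before_image
            self.holder = self.before_image = None
```

If Y writes 200 over a bal of 100, a read by X raises MustWait. Once Y rolls back, X reads 100 and never sees the uncommitted 200.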

Deadlock

-When two or more transactions request a lock on a data item that is already locked, it is possible for the transactions to become deadlocked. In the example above, transaction X is waiting for a lock on record 20 while holding a lock on record 10. At the same time, transaction Y is waiting for a lock on record 10 while holding a lock on record 20. The transactions are waiting for each other to release a lock before they can continue. This situation is called deadlock. Connolly et al define deadlock as an impasse that may result when two (or more) transactions are each waiting for locks held by the other to be released.

Conditions for Deadlock
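A circular wait such as X holding record 10 and waiting for record 20 while Y holds record 20 and waits for record 10 can be found by building a wait-for graph and searching it for a cycle. The sketch below is one simple way to do this; the holds and wants dictionaries and both function names are assumptions made for the illustration.

```python
def build_waits_for(holds, wants):
    """Map each waiting transaction to the transaction holding the record it wants."""
    owner = {record: txn for txn, records in holds.items() for record in records}
    return {
        txn: [owner[record]]
        for txn, record in wants.items()
        if record in owner and owner[record] != txn
    }

def find_deadlock(waits_for):
    """Return a list of transactions forming a cycle, or None if there is none."""
    def visit(txn, path):
        if txn in path:
            return path[path.index(txn):]    # came back to a transaction on the path
        for other in waits_for.get(txn, ()):
            cycle = visit(other, path + [txn])
            if cycle:
                return cycle
        return None

    for start in waits_for:
        cycle = visit(start, [])
        if cycle:
            return cycle
    return None
```

For holds = {"X": [10], "Y": [20]} and wants = {"X": 20, "Y": 10}, the graph is X waits for Y and Y waits for X, and find_deadlock returns ["X", "Y"]. If Y is not waiting for anything, it returns None.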

-Deadlock occurs when a circular chain of transactions have write locks on data items and are each waiting for locks held by the next transaction in the chain. A transaction cannot force another transaction to release a lock.

Avoiding Deadlock
-Only one resource is locked at one time by each transaction. This is not useful when more than one record must be updated.
-Resource locks are obtained in one order, e.g. locks on record 10 are always obtained before locks on record 20.
-All locks are obtained before updating begins. Transactions cannot start until all locks are available.

Three general techniques for handling deadlock:
-Timeouts.
-Deadlock prevention.
-Deadlock detection and recovery.

(i) Timestamp methods: Transactions are ordered globally so that older transactions (transactions with smaller timestamps) get priority in the event of conflict. Conflict is resolved by rolling back and restarting a transaction.
Timestamp: a unique identifier created by the DBMS that indicates the relative starting time of a transaction. It can be generated by using the system clock at the time the transaction started, or by incrementing a logical counter every time a new transaction starts.
A read/write proceeds only if the last update on that data item was carried out by an older transaction. Otherwise, the transaction requesting the read/write is restarted and given a new timestamp.
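The timestamp rule as stated here can be sketched with a logical counter and one stored timestamp per data item. This is a simplified model: it implements only the rule given in these notes, whereas full timestamp-ordering protocols also record when each item was last read. The class and method names are hypothetical.

```python
import itertools

class TimestampScheduler:
    def __init__(self):
        self._counter = itertools.count(1)  # logical counter, one tick per start
        self.write_ts = {}                  # item -> timestamp of its last writer

    def begin(self):
        """Start (or restart) a transaction and return its timestamp."""
        return next(self._counter)

    def access(self, ts, item, write=False):
        """Return True if the read/write may proceed, False if ts must restart."""
        if self.write_ts.get(item, 0) > ts:
            return False                    # last update came from a younger transaction
        if write:
            self.write_ts[item] = ts
        return True
```

If an older transaction (timestamp 1) tries to read bal after a younger one (timestamp 2) has written it, access returns False. The older transaction then calls begin again, receives timestamp 3 and can proceed.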

(ii) Optimistic methods: Based on the assumption that conflict is rare and that it is more efficient to let transactions proceed without the delays needed to ensure serializability. At commit, a check is made to determine whether a conflict has occurred. If there is a conflict, the transaction must be rolled back and restarted.
Three phases:
Read: from the start to just before commit; data is read from the database into local variables and updates are applied to the local data.
Validation: a check to ensure serializability is not violated; if it is violated, the transaction is aborted and restarted.
Write (for update transactions): the updates made to the local variables are then applied to the database.

(iii) Recovery: The process of restoring the database to a correct state in the event of a failure.

Need for Recovery Control
There are two types of storage: volatile (main memory) and nonvolatile. Volatile storage does not survive system crashes. Stable storage represents information that has been replicated in several nonvolatile storage media with independent failure modes.

Types of Failure
-System crashes, resulting in loss of main memory.
-Media failures, resulting in loss of parts of secondary storage.
-Application software errors.
-Natural physical disasters.
-Carelessness or unintentional destruction of data or facilities.
-Sabotage.

Transactions and Recovery
Transactions represent the basic unit of recovery. The recovery manager is responsible for atomicity and durability. If a failure occurs between commit and the database buffers being flushed to secondary storage then, to ensure durability, the recovery manager has to redo (rollforward) the transaction's updates. If a transaction had not committed at failure time, the recovery manager has to undo (rollback) any effects of that transaction for atomicity.
Partial undo: only one transaction has to be undone.
Global undo: all transactions have to be undone.

The DBMS starts at time t0, but fails at time tf. Assume data for transactions T2 and T3 have been written to secondary storage. T1 and T6 have to be undone. In the absence of any other information, the recovery manager has to redo T2, T3, T4 and T5.

Log File
-Contains information about all updates to the database: transaction records and checkpoint records. It is often used for other purposes (for example, auditing).

Recovery Facilities
A DBMS should provide the following facilities to assist with recovery:

-Backup mechanism, which makes periodic backup copies of the database.
-Logging facilities, which keep track of the current state of transactions and database changes.
-Checkpoint facility, which enables updates to the database that are in progress to be made permanent.
-Recovery manager, which allows the DBMS to restore the database to a consistent state following a failure.
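The redo/undo rule described in this section can be sketched against a toy log. The record layout used here, ("start", txn), ("write", txn, item, before, after) and ("commit", txn), and the names classify and recover are assumptions made for the illustration. Checkpoints are left out, and the sketch assumes that committed and uncommitted transactions did not write the same item.

```python
def classify(log):
    """Split the transactions in the log into those to redo and those to undo."""
    started = [rec[1] for rec in log if rec[0] == "start"]
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    redo = [t for t in started if t in committed]
    undo = [t for t in started if t not in committed]
    return redo, undo

def recover(disk, log):
    """Return the database state after undoing and redoing from the log."""
    redo, undo = classify(log)
    db = dict(disk)
    # Undo: walk the log backwards, restoring before-images of uncommitted writes.
    for kind, txn, *rest in reversed(log):
        if kind == "write" and txn in undo:
            item, before, _after = rest
            db[item] = before
    # Redo: walk the log forwards, reapplying after-images of committed writes.
    for kind, txn, *rest in log:
        if kind == "write" and txn in redo:
            item, _before, after = rest
            db[item] = after
    return db
```

For a log in which T1 to T6 all start and only T2, T3, T4 and T5 commit, classify returns T2 to T5 for redo and T1 and T6 for undo, in line with the example above.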
