
Distributed Database Systems

Distributed Reliability Protocols


Ozsu - Chapter 12 David Silberberg

Introduction

Goal: maintain atomicity and durability of distributed transactions that execute over multiple databases

Relevant commands

- Begin transaction
- Read
- Write
- Abort
- Commit

Commands that are not problematic

- Begin transaction executes just as it does in a centralized database
- Read and Write execute using the Read One Write All (ROWA) algorithm

At each site

- Commands are executed as in a centralized database
- Abort is executed by undoing the database's effects, as in the centralized case

D. Silberberg

Distributed Database Systems Reliability Protocols

Actors and Algorithms

Abstraction of actors

- Coordinator: responsible for coordinating the states of the other active members of the distributed database for a particular transaction
- Participant: a process that carries out one of the components of a distributed transaction on one machine

Termination protocols

- Commit and Recovery need special execution in a distributed system
- If one site fails in its part of a transaction, we want the other sites to terminate as well


Termination and Recovery

Opposite sides of the recovery problem

- Termination: how do sites deal with a failure?
- Recovery: how does a site recover its state once it is restarted?

Maintain atomicity of transaction

- Some sites may fail
- However, the transaction must be all-or-nothing

Nonblocking protocols

- Transaction termination at one site does not need to wait for failed site(s) to recover

Independent recovery protocols

- Recover without having to consult other sites in a transaction
- Reduces message exchange during recovery
- Independence implies nonblocking, but not vice versa

Two-Phase Commit (2PC) Protocol

Ensures atomic commitment of distributed transactions

- All sites must agree to commit before any permanent effects take place

Synchronization is necessary

- Depending on the type of concurrency control algorithm used, some schedulers may not be ready to terminate a transaction
- A transaction may also not be ready to commit because a deadlock prevents it from committing


2PC Protocol (that does not consider failures)

- One coordinator controls the process
- Other actors are participants
- Coordinator starts by asking participants to prepare for the distributed transaction, then enters the wait state
- Participants determine whether or not they can commit
- Participants vote
  - If one votes to commit, it enters the ready state and waits for the coordinator's response
  - If one votes abort, it forgets the transaction
- Coordinator decides whether the transaction should continue
  - If all participants vote commit, it sends a global-commit to all participants
  - If just one votes abort, it sends a global-abort to all participants
- Participants act
  - If global-commit, they commit their transactions
  - If global-abort, they abort their transactions
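The failure-free exchange above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the `Participant` class, the `can_commit` flag, and `run_2pc` are hypothetical names, and messages are modeled as plain method calls with no logging or timeouts.

```python
from enum import Enum


class Vote(Enum):
    COMMIT = "vote-commit"
    ABORT = "vote-abort"


class Decision(Enum):
    GLOBAL_COMMIT = "global-commit"
    GLOBAL_ABORT = "global-abort"


class Participant:
    """One site's part of the transaction (hypothetical class, for illustration)."""

    def __init__(self, name, can_commit):
        self.name = name
        self.can_commit = can_commit  # assumed to be known locally at the site
        self.state = "INITIAL"

    def on_prepare(self):
        # Phase 1: a commit voter waits in READY, an abort voter is already done.
        if self.can_commit:
            self.state = "READY"
            return Vote.COMMIT
        self.state = "ABORT"
        return Vote.ABORT

    def on_decision(self, decision):
        # Phase 2: only participants still in READY act on the global decision.
        if self.state == "READY":
            self.state = "COMMIT" if decision is Decision.GLOBAL_COMMIT else "ABORT"


def run_2pc(participants):
    """Coordinator side: collect all votes, then broadcast one global decision."""
    votes = [p.on_prepare() for p in participants]
    if all(v is Vote.COMMIT for v in votes):
        decision = Decision.GLOBAL_COMMIT
    else:
        decision = Decision.GLOBAL_ABORT
    for p in participants:
        p.on_decision(decision)
    return decision
```

A single abort vote is enough to turn the outcome into a global abort, which is the all-or-nothing property the protocol is meant to provide.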

Two Phase Commit Algorithm


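[Figure: flowchart of the 2PC algorithm for the coordinator and a participant]

Coordinator

- Initial: write begin commit in log, send prepare, move to Wait
- Wait: any "no" vote?
  - yes: write abort in log, send global abort, move to Abort
  - no: write commit in log, send global commit, move to Commit
- After the ACKs arrive: write end of Xaction in log

Participant

- Initial: on prepare, ready to commit?
  - no: write abort in log, send vote-abort, move to Abort
  - yes: write ready in log, send vote-commit, move to Ready
- Ready: type of message?
  - global abort: write abort in log, send ACK, move to Abort
  - global commit: write commit in log, send ACK, move to Commit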


Observations of Simple 2PC

- Each participant can unilaterally abort
- Once a participant aborts, it cannot change its vote
- If a participant is ready to commit, it can still abort or commit depending on the final determination of the coordinator
- The coordinator makes the final decision based on an all-or-nothing vote (this is atomicity)
- Both the coordinator and participants enter states where they wait for each other
  - Both set timers
  - If they do not hear from each other within a certain time limit, they end the transaction without committing

Algorithm Concept Centralized


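[Figure: centralized 2PC communication structure. Phase 1: the coordinator sends prepare to the participants, and each participant answers with vote abort or vote commit. Phase 2: the coordinator sends global commit or global abort, and each participant answers with committed or aborted.]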


Linear 2PC Algorithm

Centralized 2PC has some drawbacks

- Many messages are sent between the coordinator and participants
- All participants must be synchronized

Linear 2PC addresses some of these issues

- The sites participating in the distributed transaction are ordered for communication (1, 2, ..., N)
- The coordinator is first

Algorithm

- Coordinator (#1) sends prepare to participant (#2)
- Participant (#2) decides to abort or commit
  - If abort, (#2) sends abort message to (#3)
  - If commit, (#2) sends commit message to (#3)
- All subsequent participants
  - If they receive abort, they send the abort message to the next site
  - If they receive commit, they determine if they can commit and send their vote onward
- Last participant (#N) sends global-commit or global-abort back to (#N-1), and so on down to (#1)

Drawback is that Linear 2PC is slow
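A rough sketch of the linear variant, assuming the coordinator itself is willing to commit and that no site fails. The function name `run_linear_2pc` and the message counter are illustrative only. The sketch shows why the protocol is slow: every hop must wait for the previous one, so the 2(N-1) messages are strictly sequential.

```python
def run_linear_2pc(participant_can_commit):
    """Simulate linear 2PC over sites 1..N, where site 1 is the coordinator.

    participant_can_commit[k] tells whether site k+2 is able to commit.
    Returns (decision, messages): "GC" or "GA", and the number of hops used.
    """
    n = len(participant_can_commit) + 1   # total sites, coordinator included
    carried = "prepare"                   # what site 1 sends to site 2
    messages = 0
    for can_commit in participant_can_commit:
        messages += 1                     # one hop forward to this site
        if carried == "VA" or not can_commit:
            carried = "VA"                # an abort is simply passed along
        else:
            carried = "VC"                # this site adds its own commit vote
    # The last site turns the accumulated vote into the global decision,
    # which then travels back from site N to site 1.
    decision = "GC" if carried == "VC" else "GA"
    messages += n - 1
    return decision, messages
```

With four sites, for example, the forward pass and the backward pass each take three hops.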



Linear 2 Phase Commit

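[Figure: linear 2PC communication structure. The prepare message and the votes (VC = vote-commit, VA = vote-abort) travel forward from site 1 to site N; the global decision (GC = global-commit, GA = global-abort) travels back from site N to site 1.]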


Distributed 2 Phase Commit

Advantages of Distributed 2PC

- All participants communicate with each other
- All participants independently arrive at the conclusion
- Eliminates the second phase of communication

Disadvantages of Distributed 2PC

- Each participant must be aware of every other participant
- N^2 messages must be sent
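The trade-off can be illustrated with a small simulation, offered as a sketch under simplifying assumptions: `run_distributed_2pc` is a hypothetical name, only the vote exchange is counted (not the coordinator's prepare messages), and no site fails. Each participant reaches the same decision on its own, at the price of N(N-1) vote messages.

```python
def run_distributed_2pc(can_commit):
    """can_commit maps a participant name to whether it is able to commit.

    Returns (decisions, messages): each participant's own decision and the
    number of vote messages exchanged.
    """
    # Every participant sends its vote to every other participant.
    inbox = {name: {} for name in can_commit}
    messages = 0
    for sender, ok in can_commit.items():
        for receiver in can_commit:
            if receiver != sender:
                inbox[receiver][sender] = ok
                messages += 1
    # Each participant decides independently from its own vote plus
    # the votes it received; no second phase is needed.
    decisions = {}
    for name, ok in can_commit.items():
        all_commit = ok and all(inbox[name].values())
        decisions[name] = "commit" if all_commit else "abort"
    return decisions, messages
```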


Distributed 2PC Diagram


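[Figure: distributed 2PC communication structure. The coordinator (C) sends prepare to all participants (P); every participant sends its vote (abort or commit) to all the others; the global commit or abort decision is then made independently at each site.]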


Variations for Performance Improvement

Presumed 2PC Protocols

- Reduce messages among coordinator and participants
- Reduce log writes through presumption of operations

Presumed Abort 2PC

- Useful for READ and partial UPDATE transactions
- Algorithm
  - If coordinator decides to abort, it forgets about the transaction
  - Participants poll the coordinator for its commit/abort decision
  - If there is no entry, the participants presume that the transaction has aborted
  - If the coordinator meant to commit, it would not have forgotten about the transaction
- Advantages
  - Coordinator can forget about the transaction after an abort: no need to write to the transaction log
  - Participants do not acknowledge an abort
  - The coordinator's abort record does not need to be forced to permanent storage
  - If a participant fails and recovers, the lack of an abort record is enough to tell the participant to abort
  - Participants do not need to write aborts either
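The presumption rule can be sketched as follows. The class `PresumedAbortCoordinator` and its `commit_log` set are hypothetical stand-ins for the coordinator's stable log; the point is only that an abort leaves no trace, so a missing entry is read as an abort.

```python
class PresumedAbortCoordinator:
    """Hypothetical coordinator that keeps entries only for committed transactions."""

    def __init__(self):
        self.commit_log = set()   # stand-in for the stable transaction log

    def decide(self, txn_id, votes):
        if all(votes):
            self.commit_log.add(txn_id)   # a commit must be remembered
            return "commit"
        return "abort"                    # an abort is forgotten: nothing is logged

    def inquire(self, txn_id):
        # A participant polls for the outcome; no entry means presumed abort.
        return "commit" if txn_id in self.commit_log else "abort"
```

A transaction the coordinator never logged, including one it has never heard of, is answered with "abort".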


Presumed Commit 2PC

Since most transactions are expected to commit, we presume that a lack of information implies a commit

Coordinator during the prepare phase

- Forces a collecting record to its stable storage, listing all the participants

Participants enter collecting state

Coordinator sends prepare message and enters wait state

Participants decide what to do

- Write abort or commit records
- Send vote message to coordinator

Coordinator makes decision

- If abort, it writes a global-abort record and sends it to participants
- If commit, it writes a commit record and sends global-commit, then it forgets about the transaction

Participants

- If they receive global-commit, they write the record and do not acknowledge
- If they receive global-abort, they abort and acknowledge

Site Failures

Need protocols to recover when a site fails

Desired characteristics of protocol

- Independent: each site performs its own recovery
- Non-blocking: each site can proceed without waiting for other sites' operations
- Independent implies non-blocking

It is possible to design such protocols for a single site failure

It is not possible to design independent protocols for multiple site failures

Next slides will demonstrate termination and recovery algorithms for 2PC

- However, they are inherently blocking
- Three Phase Commit (3PC) will make our termination and recovery algorithms non-blocking

Termination Protocols

Performed when coordinator and/or participants time out

- Coordinator and/or participants do not receive a message before the timeout
- We assume that the lack of a message is due to a site failure

Termination action dependent on state

- Different actions are performed at different points in the 2PC algorithm
- State transition diagrams help us understand the state and corresponding actions

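2PC state transition diagrams (each transition is labeled message received / message sent)

Coordinator

- INITIAL to WAIT: commit / prepare
- WAIT to ABORT: vote-abort / global-abort
- WAIT to COMMIT: vote-commit / global-commit

Participant

- INITIAL to ABORT: prepare / vote-abort
- INITIAL to READY: prepare / vote-commit
- READY to ABORT: global-abort / ack
- READY to COMMIT: global-commit / ack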


Coordinator Timeouts
This is straightforward: only two cases to consider

If the timeout occurs in the WAIT state

- Coordinator is waiting for participant decisions
- Cannot unilaterally commit without a unanimous vote
- However, it can unilaterally abort
  - Writes abort record in log
  - Sends participants a global-abort message

If the timeout occurs in the COMMIT or ABORT states

- It does not know if participants have completed their tasks
- Coordinator repeatedly sends out global-commit or global-abort messages until it receives responses
- It will eventually hear from all (we hope)
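The two coordinator timeout cases can be sketched as a small handler. This is an illustration under assumptions: `coordinator_on_timeout` is a hypothetical name, `log` is a plain list standing in for the stable log, and `send` is any callable that broadcasts a message to the participants.

```python
def coordinator_on_timeout(state, log, send):
    """Return the coordinator's next state after a timeout in `state`."""
    if state == "WAIT":
        # Not all votes arrived: committing is impossible, aborting is safe.
        log.append("abort")
        send("global-abort")
        return "ABORT"
    if state == "COMMIT":
        # The decision stands; resend it until every participant responds.
        send("global-commit")
        return "COMMIT"
    if state == "ABORT":
        send("global-abort")
        return "ABORT"
    raise ValueError(f"no timeout is expected in state {state!r}")
```

In the COMMIT and ABORT cases the handler would be called again on every further timeout, which corresponds to the repeated resending described above.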


Participant Timeouts
Participants can time out in both the INITIAL and READY states

INITIAL state timeout

- Waiting for prepare message
- Assume that the coordinator failed in the INITIAL state
- Participant unilaterally aborts after the timeout
- If prepare arrives after this point
  - Participant votes abort
  - Or, it ignores the prepare message; the coordinator would then time out

READY state timeout

- Participant voted to commit, but does not know the global decision
- Cannot make a unilateral decision or change its vote
- Participant must remain blocked until it receives the global decision
  - If the coordinator failed, it will remain blocked (bad)
  - However, it can ask other participants for information


Other Participants Address the Blocking

If participants communicate with each other, they can help determine a course of action

Participant that times out (PT) can ask other participants (PO) for their state

If some PO is in the INITIAL state

- PO has not voted and may not have even received a prepare message
- PO unilaterally aborts and sends vote-abort to PT

If some PO is in the READY state

- PO has voted to commit, but has not received word on the global decision
- It cannot help PT

If some PO is in the ABORT or COMMIT states

- PO either unilaterally aborted or received the global decision from the coordinator
- It sends PT either global-abort or global-commit



How the Participant Interprets the Responses

PT receives vote-abort from all other participants PO

- They have all aborted
- Thus, PT can abort as well

PT receives vote-abort from some participants PO, but others are in the READY state

- At least one participant aborted, so the coordinator would have to abort
- PT can abort as well

PT finds that all PO are in the READY state

- No participant can proceed
- Stuck waiting for coordinator
- PT is back to where it started

PT gets global-abort or global-commit from all other participants

- PT can proceed accordingly
- All participants must be in agreement: it is impossible otherwise

PT gets global-abort or global-commit from some participants

- Other PO are in the READY state
- PT can proceed with the reported ABORT or COMMIT, because the coordinator must have already made that decision
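The five cases reduce to a short decision function. This is a sketch, assuming each PO answers with exactly one of four reply strings; the function name `resolve_ready_timeout` and the reply labels are chosen here for illustration.

```python
def resolve_ready_timeout(responses):
    """Decide what PT does from the replies of the other participants (PO).

    Each reply is one of "vote-abort", "ready", "global-abort", "global-commit".
    Returns "abort", "commit", or "blocked".
    """
    replies = set(responses)
    if "global-commit" in replies:
        return "commit"    # the coordinator already decided to commit
    if "global-abort" in replies or "vote-abort" in replies:
        return "abort"     # a decided abort, or a vote that forces one
    return "blocked"       # every PO is READY: only the coordinator knows
```

The function returns "blocked" only when no reply carries a vote-abort or a global decision, which is the case that makes 2PC a blocking protocol.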

How to Handle the Blocked State

In the third case (all PO are in the READY state), there is still blocking

- It must be that the coordinator has failed
- All participants elect a new coordinator
- New coordinator restarts the commit process

If the coordinator and one participant both fail, this process is more difficult

In the end, 2PC is a blocking protocol because of this case


Recovery Protocols

Algorithms that help recover state when coordinator or participants fail and restart

- Would like the protocol to ensure independence
- However, it is not possible to design a protocol that is independent and maintains atomicity at the same time
- This is because 2PC is a blocking protocol

Assumptions

- Writing an action in the log and sending a message is an atomic operation
  - Not fully realistic
  - Can be handled in a straightforward way
- State transition occurs after sending a message


Coordinator Site Failure

Fails in the INITIAL state

- Nothing has been sent yet
- Just restarts the transaction when the site restarts

Fails in the WAIT state

- The prepare message has already been sent
- Not all participants may have received the prepare message
- Thus, when the site restarts, it resends the prepare message

Fails in the COMMIT or ABORT states

- It has already informed participants of its decision
- Upon restart
  - If all participants sent an acknowledgment, the coordinator does nothing
  - If some participants have not responded, the coordinator starts the termination protocol

Participant Site Failures

Fails in INITIAL state

- Upon recovery, the participant should abort the transaction unilaterally
- Coordinator will be in either the INITIAL or the WAIT state
  - If in the INITIAL state, it will send the prepare message and move to the WAIT state
- Coordinator will either time out or receive an abort message from the failed participant
- Either way, the transaction will be aborted by the coordinator

Fails in READY state

- Participant already voted to commit
- Upon recovery, the participant can handle this as a timeout and hand it over to the termination protocol
- Many cases can be handled without blocking, but there is the potential for blocking

Fails in ABORT or COMMIT states

- These are termination states
- Upon recovery, no special action need be taken
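The 2PC recovery rules for a restarted coordinator and a restarted participant can be summarized as a lookup on the role and the last logged state. The sketch below is illustrative: `recovery_action`, its string results, and the `all_acked` flag are names assumed here, not part of the protocol description.

```python
def recovery_action(role, state, all_acked=False):
    """Return what a restarted site does, given its role and last logged state."""
    if role == "coordinator":
        if state == "INITIAL":
            return "restart the transaction"
        if state == "WAIT":
            return "resend prepare"
        if state in ("COMMIT", "ABORT"):
            # Nothing to do only if every participant already acknowledged.
            return "nothing" if all_acked else "run termination protocol"
    elif role == "participant":
        if state == "INITIAL":
            return "abort unilaterally"
        if state == "READY":
            return "run termination protocol"   # this path may still block
        if state in ("COMMIT", "ABORT"):
            return "nothing"
    raise ValueError(f"unexpected role/state: {role!r}, {state!r}")
```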

Three Phase Commit (3PC) Protocol

Non-blocking protocol to address the limitations of 2PC

Conditions for non-blocking

- No state is adjacent to both a COMMIT and an ABORT state
- No non-committable state is adjacent to a COMMIT state

WAIT and READY states of 2PC are problematic

- Both are adjacent to both ABORT and COMMIT
- Both are non-committable, but next to COMMIT

[Figure: the 2PC state transition diagrams for the coordinator (INITIAL, WAIT, COMMIT, ABORT) and the participant (INITIAL, READY, COMMIT, ABORT), repeated for reference]


Modified State Transition

Solve the problem by adding a new (3rd) state before COMMIT


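3PC state transition diagrams (each transition is labeled message received / message sent)

Coordinator

- INITIAL to WAIT: commit / prepare
- WAIT to ABORT: vote-abort / global-abort
- WAIT to PRECOMMIT: vote-commit / prepare-to-commit
- PRECOMMIT to COMMIT: ready-to-commit / global-commit

Participant

- INITIAL to ABORT: prepare / vote-abort
- INITIAL to READY: prepare / vote-commit
- READY to ABORT: global-abort / ack
- READY to PRECOMMIT: prepare-to-commit / ready-to-commit
- PRECOMMIT to COMMIT: global-commit / ack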
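The two non-blocking conditions can be checked mechanically against the participant diagrams. The sketch below assumes that "adjacent" means reachable in one transition and that the committable states are COMMIT in 2PC and PRECOMMIT plus COMMIT in 3PC; the function `violations` and the dictionary encoding are illustrative choices, not notation from the slides.

```python
def violations(transitions, committable):
    """Return the states that break either non-blocking condition.

    transitions: dict mapping a state to the set of states one transition away.
    committable: states reached only after every site has voted to commit.
    """
    bad = set()
    for state, nxt in transitions.items():
        if {"COMMIT", "ABORT"} <= nxt:
            bad.add(state)   # adjacent to both COMMIT and ABORT
        if "COMMIT" in nxt and state not in committable:
            bad.add(state)   # non-committable, yet next to COMMIT
    return bad


# Participant diagrams; the coordinator diagrams have the same shape.
TWO_PC = {
    "INITIAL": {"READY", "ABORT"},
    "READY": {"COMMIT", "ABORT"},
}
THREE_PC = {
    "INITIAL": {"READY", "ABORT"},
    "READY": {"PRECOMMIT", "ABORT"},
    "PRECOMMIT": {"COMMIT"},
}
```

Under these assumptions, `violations(TWO_PC, {"COMMIT"})` flags READY, while `violations(THREE_PC, {"PRECOMMIT", "COMMIT"})` flags nothing, because PRECOMMIT separates READY from COMMIT.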


3PC Protocol
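[Figure: flowchart of the 3PC algorithm for the coordinator and a participant]

Coordinator

- Initial: write begin commit in log, send prepare, move to Wait
- Wait: any "no" vote?
  - yes: write abort in log, send global-abort, move to Abort
  - no: write prepare to commit in log, send prepare-to-commit, move to Pre-Commit
- Pre-Commit: on ready-to-commit, write commit in log, send global-commit, move to Commit
- After the ACKs arrive: write end of Xaction in log

Participant

- Initial: on prepare, ready to commit?
  - no: write abort in log, send vote-abort, move to Abort
  - yes: write ready in log, send vote-commit, move to Ready
- Ready: type of message?
  - global-abort: write abort in log, send ACK, move to Abort
  - prepare-to-commit: write prepare to commit in log, send ready-to-commit, move to Pre-Commit
- Pre-Commit: on global-commit, write commit in log, send ACK, move to Commit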


3PC Coordinator Timeout Protocol

Timeout in WAIT state

- Same as the coordinator timeout in the WAIT state for 2PC
- Unilaterally decides to abort the transaction
- Sends global-abort message to all participants

Timeout in PRECOMMIT state

- Coordinator does not know if the non-respondents have moved to the PRECOMMIT state
- It knows that they are at least in the READY state: they have all voted to commit
- Moves all participants to the PRECOMMIT state by sending a prepare-to-commit message
- Globally commits the transaction
  - Updates log
  - Sends global-commit message to all participants

Timeout in COMMIT or ABORT states

- Coordinator does not know if participants have performed their commits or aborts
- They are at least in the PRECOMMIT state (or the READY state for an abort)
- They need to follow the termination protocol (described soon)

3PC Participant Timeout Protocol

Timeout in INITIAL state

- Same as the timeout in the 2PC INITIAL state
- Waiting for prepare message
- Assume that the coordinator failed in the INITIAL state
- Participant unilaterally aborts after the timeout
- If prepare arrives after this point
  - Participant votes abort
  - Or, it ignores the prepare message; the coordinator would then time out

Timeout in READY state

- Already voted to commit
- Does not know the global decision
- Elects new coordinator
- New coordinator terminates the transaction

Timeout in PRECOMMIT state

- Already received prepare-to-commit message
- Waiting for final global-commit
- Elects new coordinator
- New coordinator terminates the transaction


3PC Termination Protocols

New coordinator is selected

- Sends its own state to all participants, asking them to assume that state
  - Participants that have already passed that state ignore the message
  - The rest make the state transition and send back the appropriate messages
- Non-blocking

Protocol

If coordinator in WAIT state

- Globally aborts the transaction
- If a participant is in the PRECOMMIT state
  - There is no transition from it to the ABORT state
  - A new transition from PRECOMMIT to ABORT needs to be added

If coordinator in PRECOMMIT state

- No participant can be in the ABORT state (since they have all voted to commit)
- Sends out global-commit

If coordinator in ABORT state

- Moves all participants to ABORT state
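A rough sketch of these termination rules, covering only the three coordinator states listed here. The names `new_coordinator_decision` and `apply_decision` are hypothetical, and the participant function returns the final state without modeling intermediate steps such as the move through PRECOMMIT.

```python
def new_coordinator_decision(state):
    """Message the newly elected coordinator sends, given its own state."""
    table = {
        "WAIT": "global-abort",
        "PRECOMMIT": "global-commit",
        "ABORT": "global-abort",
    }
    if state not in table:
        raise ValueError(f"state {state!r} is not covered by this sketch")
    return table[state]


def apply_decision(participant_state, decision):
    """Final state of one participant after it receives the decision."""
    if participant_state in ("COMMIT", "ABORT"):
        return participant_state   # already terminated: ignore the message
    # Includes the added PRECOMMIT-to-ABORT transition for a global-abort.
    return "COMMIT" if decision == "global-commit" else "ABORT"
```

Because every reachable participant state has an outcome for either decision, no operational participant is left waiting on the failed coordinator.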

3PC Recovery Protocol: very similar to 2PC, with some variations


