
Chapter 7: Lazy Replication
Klemens Böhm, Distributed Data Management

Update Propagation Protocols (1)

Alternatives for updating replicas, synchronous vs. asynchronous:
- synchronous: within the transaction (eager replication),
- asynchronous: a separate transaction updates the replicas (lazy replication).
Synchronous updates of replicas do not scale, in particular if each peer issues transactions: each transaction must lock a quorum or the primary copy.


Update Propagation Protocols (2)



Furthermore:
- With synchronous updates, we must wait for the slowest node (illustration).
- Failure of nodes poses a similar problem.

Update Propagation Protocols (3)

Lazy replication, the underlying idea:
- If an update has been successful on one node, it will eventually be successful on all other nodes as well (illustrated on a subsequent slide).
- That is, a transaction may commit without having updated all replicas.


Lazy Replication & Primary-Copy Technique Illustration


[Figure: sites s1, s2, s3; s1 holds the primary copy of a, s2 a replica of a and the primary copy of b, s3 replicas of a and b; T1 originates at s1, T2 at s2, T3 at s3.]

Update Propagation Protocols (4)

Three transactions: T1 on Site 1 updates a; T2 on Site 2 reads a and writes b; T3 on Site 3 reads a and b. With lazy propagation of updates, a possible sequence of executions is:
- on Site 2: w1[a] r2[a] w2[b]
- on Site 3: w2[b] r3[a] r3[b] w1[a]

Eager vs. lazy:
- Lazy performance tends to be better (but not by orders of magnitude, depending on the distribution scheme of the replicas),
- serializability is typically not guaranteed with lazy replication.


Update Propagation Protocols (5)



Our topic in what follows: lazy protocols guaranteeing serializability.
1. Protocols making assumptions regarding the arrangement of replicas.
2. A hybrid protocol without such assumptions that is lazy whenever possible.

System Model/Assumptions

- Each data object has a primary site: primary copy vs. secondary copies (replicas).
- Each transaction has an originating site. A transaction can only modify data objects whose primary site is its originating site, but it may read any data object.
- Nodes (sites) use 2PL.
- The network is reliable; messages are delivered in FIFO order.
- A transaction consists of a primary subtransaction and secondary subtransactions.


Copy Graph
Copy Graph:
- nodes = sites,
- an edge from si to sj iff the primary copy of a data object is on si and a secondary copy on sj,
- Backedges := a set of edges whose deletion makes the copy graph cycle-free,
- Example: three sites, two data objects a and b.

[Figure: copy graph over sites s1, s2, s3 for data objects a and b.]
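To make the construction concrete, here is a small Python sketch, purely illustrative and not from the slides: it builds the copy graph from a primary-copy assignment and computes a set of backedges (as defined above) via depth-first search; all helper names are my own.

```python
from collections import defaultdict

def copy_graph(primary, copies):
    """primary: object -> primary site; copies: object -> set of sites."""
    edges = defaultdict(set)
    for obj, p in primary.items():
        for s in copies[obj]:
            if s != p:
                edges[p].add(s)           # edge: primary site -> secondary site
    return edges

def backedges(edges):
    """Edges whose removal leaves the graph cycle-free (DFS back edges)."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color, result = defaultdict(int), set()
    def dfs(u):
        color[u] = GRAY
        for v in edges[u]:
            if color[v] == GRAY:
                result.add((u, v))        # (u, v) closes a cycle
            elif color[v] == WHITE:
                dfs(v)
        color[u] = BLACK
    for u in list(edges):
        if color[u] == WHITE:
            dfs(u)
    return result

# The running example: primary of a at s1, primary of b at s2.
g = copy_graph({'a': 's1', 'b': 's2'},
               {'a': {'s1', 's2', 's3'}, 'b': {'s2', 's3'}})
assert backedges(g) == set()              # this copy graph is acyclic
```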

Example of Non-serializable Execution

[Figure: T1 originates at s1, T2 at s2, T3 at s3; s1 holds the primary copy of a, s2 the primary copy of b and a replica of a, s3 replicas of a and b.]

Three transactions: T1 on Site 1 updates a; T2 on Site 2 reads a and writes b; T3 on Site 3 reads a and b. With lazy propagation of updates, a possible sequence of executions is:
- on Site 2: w1[a] r2[a] w2[b]
- on Site 3: w2[b] r3[a] r3[b] w1[a]
This execution is not serializable: Site 2 orders T1 before T2, while Site 3 reads the new b but the old a, which orders T2 before T3 before T1, a cycle.


DAG(WT) Protocol (1)


DAG without Timestamps.
- Prerequisite: the copy graph is cycle-free.
- Generate a tree T from the copy graph: si a successor of sj in the copy graph becomes si a child of sj in T. Example:

[Figure: the example copy graph over s1, s2, s3 and the tree T derived from it.]

DAG(WT) Protocol (2)
- A node forwards update transactions to its children in T.
- Commit order = order in which transactions arrive at a node = order in which transactions are forwarded to the children.
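A minimal sketch of this forwarding discipline in Python (the Site class and the transaction representation are my own; local execution under 2PL is abstracted away as appending to a commit log):

```python
from collections import deque

class Site:
    def __init__(self, name):
        self.name = name
        self.children = []       # children in the tree T derived from the copy graph
        self.inbox = deque()     # FIFO queue: arrival order = commit order
        self.log = []            # committed transactions, in commit order

    def deliver(self, txn):
        self.inbox.append(txn)

    def process(self):
        while self.inbox:
            txn = self.inbox.popleft()   # commit in arrival order ...
            self.log.append(txn)
            for child in self.children:  # ... and forward in the same order
                child.deliver(txn)

# Tree s1 -> s2 -> s3: updates committed at s1 reach s3 in the same order.
s1, s2, s3 = Site('s1'), Site('s2'), Site('s3')
s1.children, s2.children = [s2], [s3]
for t in ('T1', 'T2'):
    s1.deliver(t)
s1.process(); s2.process(); s3.process()
assert s3.log == ['T1', 'T2']
```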


DAG(WT) Protocol (3)


Point that is still open:
- Local deadlocks are possible, i.e., a transaction does not necessarily commit.
- A local deadlock is possible because of a conflict of T with a transaction Tx that has not yet occurred at other sites (illustration on the subsequent slide), or because of an interaction with a transaction that has already occurred at other sites. There is no handshaking; only the commit order is given.
- But a local transaction must commit.
- Thus, the victim selection policy should be fair, e.g.: abort the last transaction.
DAG(WT) Protocol (4)

Point that is still open (cont.): illustration of local deadlocks:


[Figure: copy graph over sites s1, s2, s3 for data objects a, b, c; s1 holds the primary copies of a and c.]
T1: r1(c) w1(c) w1(a); T2: r2(a) r2(c) w2(b). Possible execution at s2: w1(c) r2(a) r2(c) w1(a), where r2(c) waits for T1's lock on c and w1(a) waits for T2's lock on a: a local deadlock.
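To see the deadlock, one can derive the waits-for relation from the lock requests under 2PL; a tiny sketch (the schedule encoding and helper are my own, and blocked requests are recorded as holders for simplicity, which is good enough to expose the cycle):

```python
# Each operation: (txn, mode, object). Under 2PL, a request waits if another
# transaction holds a conflicting lock (at least one of the two is a write).
def waits_for(schedule):
    held, edges = {}, set()              # held: object -> list of (txn, mode)
    for txn, mode, obj in schedule:
        for (other, omode) in held.get(obj, []):
            if other != txn and 'w' in (mode, omode):
                edges.add((txn, other))  # txn waits for other
        held.setdefault(obj, []).append((txn, mode))
    return edges

# Possible execution at s2: w1(c) r2(a) r2(c) w1(a)
s = [('T1', 'w', 'c'), ('T2', 'r', 'a'), ('T2', 'r', 'c'), ('T1', 'w', 'a')]
assert waits_for(s) == {('T2', 'T1'), ('T1', 'T2')}   # a cycle: deadlock
```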


DAG(WT) Protocol Observation

- Transactions are routed to nodes that are not relevant.

[Figure: copy graph over s1, s2, s3 with data objects a, b, d; updates pass through sites that hold no copy of the updated objects.]

- This causes delays.

DAG(T) Protocol
DAG with Timestamps:
- Propagate update transactions along the edges of the copy graph.
- Primary subtransactions carry a timestamp that specifies the execution order.
Outline of the following:
- structure of the timestamp (a total order is necessary),
- the protocol itself.
The structure of the timestamps is interesting, since their generation is decentralized.

Timestamps (1)

In what follows:
- auxiliary notion: local timestamp counter,
- auxiliary notion: timestamp of a site,
- timestamp of a transaction.
Acyclicity yields a total order of the sites: si < si+k.
Auxiliary construct, a tuple corresponding to node si: (si, LTSi), where LTSi is the local timestamp counter of si; it counts the primary subtransactions that have committed there.



Local Timestamp Counters Example


[Figure: local timestamp counters in the running example, initially (s1, 0), (s2, 0), (s3, 0); after T1 commits at s1: (s1, 1); after T2 commits at s2: (s2, 1).]

Timestamps (2)

Timestamp of a site si:
- a vector of tuples,
- one tuple for si and further tuples for the predecessors of si in the copy graph,
- important: the tuples in the vector are ordered by site.
The timestamp of a site reflects how many primary subtransactions and how many secondary subtransactions have committed there.


Timestamps (3)
Example:
[Figure, site timestamps of s1 and s2 in four stages:
1. s1: <(s1, 0)>, s2: <(s1, 0), (s2, 0)>
2. after T1 commits at s1: s1: <(s1, 1)>, s2: <(s1, 0), (s2, 0)>
3. after T1's secondary subtransaction commits at s2: s1: <(s1, 1)>, s2: <(s1, 1), (s2, 0)>
4. after T2 commits at s2: s1: <(s1, 1)>, s2: <(s1, 1), (s2, 1)>]

Timestamps (4)

Timestamp TS(Ti) of a transaction: the timestamp of the site of the primary subtransaction immediately after the point of commit. Example:
- T1: <(s1, 1)>
- T2: <(s1, 1), (s2, 1)>
- T3: <(s1, 1), (s3, 1)>

Timestamps (5)
The order of timestamps shall reflect the commit order. Lexicographic ordering < of timestamps, TS1 < TS2 iff:
- TS1 is a proper prefix of TS2, or
- TS1 = X (si, LTSi) Y1 and TS2 = X (sj, LTSj) Y2 with common prefix X, and
  1. si > sj, or
  2. si = sj and LTSi < LTSj.

Timestamps Explanation (1)

The order of timestamps shall reflect the commit order.
- TS1 < TS2 if TS1 is a prefix of TS2. Example: <(s1, 1)> describes a state before <(s1, 1), (s2, 1)>.
- TS1 < TS2 if TS1 = X (si, LTSi) Y1, TS2 = X (sj, LTSj) Y2, si = sj, and LTSi < LTSj. Example: <(s1, 1)> describes a state before <(s1, 2)>.
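A minimal sketch of this ordering in Python, assuming a timestamp is represented as a list of (site, LTS) pairs ordered by site, with site si encoded as the integer i (the representation and function name are my own):

```python
def ts_less(ts1, ts2):
    """True iff ts1 < ts2 under the lexicographic order of the slides."""
    for (s1, c1), (s2, c2) in zip(ts1, ts2):
        if (s1, c1) == (s2, c2):
            continue                      # still in the common prefix X
        if s1 != s2:
            return s1 > s2                # rule 1: larger site id => smaller timestamp
        return c1 < c2                    # rule 2: same site, smaller counter
    return len(ts1) < len(ts2)            # a proper prefix is smaller

# Examples from the slides:
assert ts_less([(1, 1)], [(1, 1), (2, 1)])            # prefix rule
assert ts_less([(1, 1)], [(1, 2)])                    # same site, smaller LTS
assert ts_less([(1, 1), (3, 1)], [(1, 1), (2, 1)])    # s3 > s2, so T3's timestamp is smaller
```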


Timestamps Example, Continued (1)


Example, continued:

[Figure: the running example; T1 at s1, T2 at s2, T3 at s3; the site timestamps evolve through the four stages shown before. What are the timestamps of T2 and T3?]

Timestamps Example, Continued (2)

Example, continued: T2 gets timestamp <(s1, 1), (s2, 1)>, T3 gets timestamp <(s1, 1), (s3, 1)>.
Hence, <(s1, 1), (s2, 1)> > <(s1, 1), (s3, 1)>.


Data Structures
Data structures at each node:
- the timestamp vector of the node (= the timestamp of the last committed secondary subtransaction + the tuple of the node itself),
- waiting queues, one for each predecessor.

Primary Transaction
A primary transaction Ti commits at node si:
1. increment LTSi (the local timestamp counter),
2. TS(Ti) := TS(si),
3. send the secondary (sub-)transactions to the children of si.
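A minimal sketch of these three steps, with a data layout of my own choosing (the site timestamp is kept as a mapping site -> LTS and ordered by site when assigned to the transaction):

```python
class Txn:
    def __init__(self, name):
        self.name, self.ts = name, None

class Node:
    def __init__(self, sid, ts):
        self.sid, self.ts = sid, dict(ts)  # site timestamp: site -> LTS
        self.children = []                 # children in the copy graph
        self.inbox = []
    def deliver(self, txn):
        self.inbox.append(txn)

def commit_primary(node, txn):
    node.ts[node.sid] += 1                 # 1. increment LTSi
    txn.ts = sorted(node.ts.items())       # 2. TS(Ti) := TS(si), ordered by site
    for child in node.children:            # 3. send secondary subtransactions
        child.deliver(txn)

s1 = Node(1, {1: 0})
t1 = Txn('T1')
commit_primary(s1, t1)
assert t1.ts == [(1, 1)]                   # T1 gets timestamp <(s1, 1)>
```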



Secondary Transaction (1)


- Assumption: only one secondary transaction runs at a time.
- There is one waiting queue for each direct predecessor of the site in the copy graph.
- Choose the transaction from the queues with the minimal timestamp.
- There must be at least one transaction in each queue before computing the minimal timestamp.
- Idle nodes should commit dummy transactions from time to time.

Secondary Transaction (2)

- After commit: TS(si) := TS(Ti) extended by the node's own tuple (si, LTSi).
- How do we know that the next transaction in commit order is not still being propagated through the network? Its queue would be empty; furthermore, messages arrive in FIFO order.
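A sketch of the scheduling rule for secondary subtransactions (all names are my own; ts_less is the timestamp order from the earlier sketch, repeated compactly here so the example stands alone):

```python
import functools

def ts_less(ts1, ts2):
    # lexicographic order from the Timestamps slides
    for (s1, c1), (s2, c2) in zip(ts1, ts2):
        if (s1, c1) != (s2, c2):
            return s1 > s2 if s1 != s2 else c1 < c2
    return len(ts1) < len(ts2)

def next_secondary(queues):
    """queues: predecessor -> FIFO list of (timestamp, txn) pairs."""
    if any(not q for q in queues.values()):
        return None   # an earlier transaction may still be in transit (FIFO links)
    ts_key = functools.cmp_to_key(lambda a, b: -1 if ts_less(a, b) else 1)
    p_min = min(queues, key=lambda p: ts_key(queues[p][0][0]))
    return queues[p_min].pop(0)   # head with the minimal timestamp commits next

# T1 from s1 must run before T2 from s2 (T1's timestamp is a prefix):
qs = {'s1': [([(1, 1)], 'T1')], 's2': [([(1, 1), (2, 1)], 'T2')]}
assert next_secondary(qs)[1] == 'T1'
```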


Continuation of Example
[Figure: the running example; T1 at s1 with its secondary subtransactions at s2 and s3, T2 at s2.]
T1 has timestamp <(s1, 1)>; T2 has timestamp <(s1, 1), (s2, 1)>. (When T1 commits on s2, the site timestamp is set to <(s1, 1), (s2, 0)>.) At site s3, the timestamp of T1 is a prefix of that of T2, so T1 is executed there before T2.
BackEdge Protocol Motivation

The copy graph must be acyclic for the DAG(WT) and DAG(T) protocols. Example:
- two sites s1 and s2,
- s1 holds the primary copy of a and a copy of b; s2 vice versa,
- T1 at node s1 reads b and updates a; T2 at node s2 reads a and updates b,
- both transactions execute concurrently and commit (illustration),
- no serializability.


BackEdge Protocol Overview


- Hybrid: eager updates for some replicas, lazy updates for the others.
- Can be described both as an extension of DAG(WT) and of DAG(T).
- Gdag results from G by removing the backedges.

BackEdge Protocol Terminology (1)

- Backedge from si to sj: sj is a predecessor of si in Gdag.
- Ti: the primary subtransaction with node si.
- Backedge subtransactions: transactions S1, ..., Sj at predecessors si1, ..., sij of si in T, where si1 is the most distant from si, and so on.
[Figure: sites s1, s2, s3 holding copies of a and b; the most distant predecessor of s3 is marked; a transaction Tx writes b.]

BackEdge Protocol Terminology (2)


Illustration:

[Figure: as before; the most distant predecessor of s3 is marked, and a transaction Tx writing b propagates through the sites. How many predecessors does s3 have?]

BackEdge Protocol

1. After execution of Ti, S1 is sent to si1: no commit, locks are not released.
2. S2, ..., Sj likewise, without commit and without releasing locks.
3. 2PC (two-phase commit) for Ti, S1, ..., Sj.
4. The remaining secondary subtransactions run lazily, as in DAG(T).

[Figure: a transaction Tx writing b executes lock b, write b, unlock b at each site holding a copy of b.]
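A high-level sketch of this commit path; all interfaces here (execute_no_commit, deliver, prepare/commit/abort) and the 2PC stub are assumptions of mine, not from the slides:

```python
def two_phase_commit(participants):
    # Stub standing in for a real 2PC coordinator.
    if all(p.prepare() for p in participants):   # phase 1: collect votes
        for p in participants:
            p.commit()                           # phase 2: commit everywhere
        return True
    for p in participants:
        p.abort()
    return False

def run_with_backedges(ti, backedge_preds, lazy_children):
    # 1./2. execute backedge subtransactions S1..Sj at si1..sij
    # (most distant predecessor first); locks stay held, nothing commits yet
    subs = [pred.execute_no_commit(ti) for pred in backedge_preds]
    # 3. commit Ti together with S1..Sj atomically
    if two_phase_commit([ti] + subs):
        # 4. the remaining secondary subtransactions propagate lazily (DAG(T))
        for child in lazy_children:
            child.deliver(ti)
```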

BackEdge Protocol Discussion


- No magic here.
- Better than primary-copy schemes, according to simulations: up to a factor of 5 with many readers and few backedges.
- Implementation on top of commercial DBMSs is relatively easy.

Literature

Yuri Breitbart et al.: Update Propagation Protocols for Replicated Databases. Proceedings of ACM SIGMOD 1999.


Potential Exam Questions


- Illustrate that lazy replication schemes, when designed carelessly, may lead to inconsistencies.
- Why is the topology of the copy graph important in the context of lazy replication?
- Explain the different approaches from the lecture that ensure consistency with lazy replication.

