Problem
Statement-based replication
[Diagram: Master and Slave]
  Target scenario
  What kind of errors?
  How to find errors
  Results on production systems
Target Scenario
[Diagram: Master and Slave]
  Don't interfere with workload
  Minimize communication
  Detect when master ≠ slave
  Deal with replication lag
  Use vanilla MySQL
Easy Errors
  Table does not exist
  Different schema
  Database offline
Kinds of Errors
  Wrong data
  Slave missing row
  Slave extra row
First thoughts
[Diagram: DB1 and DB2 each send their DB contents to a Compare step]
Second thoughts
[Diagram: DB1 and DB2 each send fingerprints to a Compare step]
A New Plan
  1. Fast pass to narrow search to blocks
  2. CM-fingerprint narrows search to rows
  3. Third pass gives definite answers
Block Boundaries
[Figure: rows 1 to 20 split into six blocks]
  Rows start - 4
  Rows 4 - 7
  Rows 7 - 10
  Rows 10 - 13
  Rows 13 - 16
  Rows 16 - end
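
The block pass can be sketched roughly as follows. This is only an illustration under stated assumptions, not the implementation behind the slides: rows are assumed to be (key, payload) pairs, each block is assumed to be a half-open key range, and CRC32 from Python's zlib stands in for whatever checksum the real system uses. The names block_checksums and suspect_blocks are hypothetical.

```python
import zlib
from bisect import bisect_right

def block_checksums(rows, boundaries):
    """Return one CRC32 per block (hypothetical helper).

    rows: iterable of (key, payload_bytes) in any order.
    boundaries: sorted keys; block b holds keys k with
    boundaries[b-1] <= k < boundaries[b]; the first and last
    blocks are open-ended ("start" and "end").
    """
    sums = [0] * (len(boundaries) + 1)
    for key, payload in sorted(rows):
        b = bisect_right(boundaries, key)
        # Chain the CRC over key and payload in key order.
        data = repr(key).encode() + b"\x00" + payload
        sums[b] = zlib.crc32(data, sums[b])
    return sums

def suspect_blocks(master_rows, slave_rows, boundaries):
    """Indices of blocks whose checksums differ between master and slave."""
    m = block_checksums(master_rows, boundaries)
    s = block_checksums(slave_rows, boundaries)
    return [b for b, (x, y) in enumerate(zip(m, s)) if x != y]

# Assumed toy data: 20 rows, boundaries as on the slide.
boundaries = [4, 7, 10, 13, 16]
master = [(k, f"row{k}".encode()) for k in range(1, 21)]
# The slave has wrong data in row 8 and is missing row 14.
slave = [(k, b"ROW8" if k == 8 else p) for k, p in master if k != 14]

print(suspect_blocks(master, slave, boundaries))
```

Because the boundaries are key values rather than row counts, a missing or extra row on the slave only disturbs the checksum of the block it falls into; the other blocks still line up.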
Now What?
  We know which blocks may have inconsistencies
  Which rows in those blocks have inconsistencies?
[Diagram: DB1 and DB2 each send CM-fingerprints to a Compare step]
CM-fingerprinting: encoding
[Figure: a bad block with rows 0 to 7 and row fingerprints fp0 to fp7, with columns labeled 0 and 1]
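CM-fingerprinting: decoding
  x00 combines fp_i over all i with binary(i) = 0**: fp0, fp1, fp2, fp3
  x01 combines fp_i over all i with binary(i) = 1**: fp4, fp5, fp6, fp7
  x10 combines fp_i over all i with binary(i) = *0*: fp0, fp1, fp4, fp5
  x11 combines fp_i over all i with binary(i) = *1*: fp2, fp3, fp6, fp7
  x20 and x21 follow the same pattern for binary(i) = **0 and **1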
[Figure: three example comparisons of x00, x01, x10, x11, x20, x21: one where every entry is 0, one where only some entries are 0, and one where every entry is marked "?"]
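
A minimal sketch of the encode and decode steps, under loudly stated assumptions: the operator that combines the fp_i is not legible in the slides, so XOR is assumed here; row fingerprints are assumed to be 32-bit CRC32 values; rows are assumed to be indexed 0 to 2^nbits - 1 within the block; and decoding is only attempted for a single differing row. The names row_fp, encode, and decode are hypothetical.

```python
import zlib

def row_fp(payload: bytes) -> int:
    """32-bit row fingerprint (CRC32 is an assumption)."""
    return zlib.crc32(payload)

def encode(fps, nbits):
    """x[j][b] = XOR of fps[i] over rows i whose bit j equals b.

    Bit 0 is the most significant bit, so x[0][0] covers binary(i) = 0**.
    """
    x = [[0, 0] for _ in range(nbits)]
    for i, fp in enumerate(fps):
        for j in range(nbits):
            bit = (i >> (nbits - 1 - j)) & 1
            x[j][bit] ^= fp
    return x

def decode(x_master, x_slave):
    """Return ("same", None), ("row", i), or ("ambiguous", None)."""
    diff = [[m ^ s for m, s in zip(pm, ps)]
            for pm, ps in zip(x_master, x_slave)]
    if all(d == 0 for pair in diff for d in pair):
        return ("same", None)
    index, delta = 0, None
    for d0, d1 in diff:
        # A single bad row changes exactly one side of every bit position,
        # and changes it by the same amount each time.
        if (d0 == 0) == (d1 == 0):
            return ("ambiguous", None)
        d = d0 or d1
        if delta is None:
            delta = d
        elif d != delta:
            return ("ambiguous", None)
        index = (index << 1) | (1 if d1 else 0)
    return ("row", index)

# Assumed toy block of 8 rows; the slave has wrong data in row 5.
master_fps = [row_fp(f"row{i}".encode()) for i in range(8)]
slave_fps = list(master_fps)
slave_fps[5] = row_fp(b"ROW5")

print(decode(encode(master_fps, 3), encode(slave_fps, 3)))  # ('row', 5)
```

With two differing rows the sketch reports "ambiguous", since the rows disagree in at least one bit position and both sides of that position change. Three or more differing rows could in principle cancel out and be misread, which is one reason a later pass is still needed for definite answers.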
CM-fingerprinting: analysis
[Figure: rows 0 to n, encoded with log2(n) bit positions]
  Blocks of 1000 rows require CM-fingerprints of size
  2 * ceil(log2(1000)) * 32 bits = 2 * 10 * 32 bits = 640 bits
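
The size estimate can be checked with a few lines of arithmetic. The function name cm_fingerprint_bits is hypothetical, and the 32-bit fingerprint width is taken from the slide.

```python
def cm_fingerprint_bits(block_rows: int, fp_bits: int = 32) -> int:
    """Size in bits of one CM-fingerprint for a block of block_rows rows."""
    # Number of bits needed to index rows 0 .. block_rows - 1,
    # i.e. ceil(log2(block_rows)) computed without floating point.
    index_bits = (block_rows - 1).bit_length()
    # Two combined fingerprints (bit = 0 and bit = 1) per index bit.
    return 2 * index_bits * fp_bits

print(cm_fingerprint_bits(1000))  # 640
```

For comparison, sending one 32-bit fingerprint per row for the same block would take 1000 * 32 = 32,000 bits.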
Comparing Rows
[Figure: Master Snapshot and Slave Snapshot are compared in stages: Fingerprint, then Compare; CM-fingerprint, then Decode; Snapshot, then Compare]
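
The final snapshot comparison can be sketched as below. This assumes both snapshots are already consistent with each other and small enough to hold in memory as dictionaries from key to row contents; how the snapshots are taken is not shown. The function name classify is hypothetical, and the three labels are the kinds of errors named in the slides.

```python
def classify(master: dict, slave: dict):
    """Compare two snapshots and label each inconsistent key."""
    errors = []
    for key in sorted(master.keys() | slave.keys()):
        if key not in slave:
            errors.append((key, "slave missing row"))
        elif key not in master:
            errors.append((key, "slave extra row"))
        elif master[key] != slave[key]:
            errors.append((key, "wrong data"))
    return errors

# Assumed toy snapshots of the suspect rows only.
master_snapshot = {1: "a", 2: "b", 3: "c"}
slave_snapshot = {1: "a", 2: "B", 4: "d"}

for key, kind in classify(master_snapshot, slave_snapshot):
    print(key, kind)
```

Only rows that survived the earlier passes need to be included in the snapshots, which keeps this exact comparison cheap.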
Results
  On Facebook's User Databases
  Rate of inconsistency: 0.0056%
  - Strange Tables
  Rate of inconsistency: 0.0027%
  What did we find, at what cost?
Finding Inconsistencies (log scale)
  Before Pass 1: 100%
  Pass 1 (Checksum): 1.12%
  Pass 2 (CM-fingerprint): 0.014%
  Pass 3 (Consistent Snapshot): 0.0027%
Future Work
  Master-master mode
  No consistent snapshot!
  Measure growth rate
  Evaluate block size vs. data traffic