Michael Nygard
Tuesday, April 13, 2010
Agenda
Domain of Applicability
Technical Foundations
Amdahl’s Law
The Universal Scalability Law
Reducing Contention
Reducing Coherence
Some Specific Techniques
Questions Wide of the Mark
“Is it scalable?”
My personal favorite,
[Figure: Nodes (100, 1000, 10000) plotted against Requests (1 M / day, 1 M / hour, 10 M / hour, 10 B / hour), with regions labeled Medium Scale, Large Scale, and Extreme Scale.]
Extreme Scale (10000 nodes)
Operations centric
Distributed & non-relational data storage
Ubiquitous caching
Ubiquitous partitioning
Sharding
Self-managing infrastructure
Large Scale (1000 nodes)
Build own CDN
Data centric
Multiple datastores
Heavy use of async messaging
Caching servers
Automated operations
Medium Scale (100 nodes)
Much CDN use
Technical Foundations
Amdahl’s Law
[Figure: execution time T1 divided into serial and parallelizable portions.]
S(p) = p / (1 + σ(p − 1))
[Plot: S(p) against p for σ = 10%, with p from 1 to 81 and a vertical axis from 0 to 10.]
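As a rough illustration of Amdahl's Law, S(p) = p / (1 + σ(p − 1)), the sketch below evaluates the speedup at the p values marked on the plot axis, using σ = 10%. The function name `amdahl_speedup` is an assumption made for this example and is not from the talk.

```python
def amdahl_speedup(p: int, sigma: float) -> float:
    """Speedup on p processors when a fraction sigma of the work is serial."""
    return p / (1 + sigma * (p - 1))


if __name__ == "__main__":
    sigma = 0.10  # serial fraction shown on the slide's plot
    for p in (1, 21, 41, 61, 81):
        print(f"p={p:3d}  S(p)={amdahl_speedup(p, sigma):.2f}")
    # As p grows, S(p) approaches but never reaches 1 / sigma.
    print(f"upper bound 1/sigma = {1 / sigma:.1f}")
```

With σ = 10% the speedup can never exceed 10, no matter how many processors are added, which matches the vertical range of the plot.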
Contention and Coherency
C(p) = p / (1 + σ(p − 1) + κp(p − 1))
σ = Contention
Degree of serialization on shared writable data, contention for resources.
κ = Coherency
Penalty for maintaining consistency of shared writable data.
[Plot: C(p) against p for σ = 10% and κ = 0.0025, with p from 1 to 81 and a vertical axis from 0 to 10.]
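To make the effect of the coherency term concrete, here is a minimal sketch that evaluates C(p) = p / (1 + σ(p − 1) + κp(p − 1)) for σ = 10% and κ = 0.0025. The function names `usl_capacity` and `usl_peak` are assumptions made for this example. The peak location comes from setting the derivative of C(p) to zero while treating p as continuous, which gives p = sqrt((1 − σ) / κ).

```python
import math


def usl_capacity(p: int, sigma: float, kappa: float) -> float:
    """Relative capacity C(p) with contention sigma and coherency kappa."""
    return p / (1 + sigma * (p - 1) + kappa * p * (p - 1))


def usl_peak(sigma: float, kappa: float) -> float:
    """Value of p (treated as continuous) at which C(p) is largest."""
    return math.sqrt((1 - sigma) / kappa)


if __name__ == "__main__":
    sigma, kappa = 0.10, 0.0025
    for p in (1, 21, 41, 61, 81):
        print(f"p={p:3d}  C(p)={usl_capacity(p, sigma, kappa):.2f}")
    best = max(range(1, 101), key=lambda p: usl_capacity(p, sigma, kappa))
    print(f"continuous peak near p={usl_peak(sigma, kappa):.1f}, best integer p={best}")
```

Unlike Amdahl's Law, which only flattens out, this curve turns downward past the peak: beyond roughly 19 nodes, adding more reduces total capacity for these parameter values.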
Corollary:
Slower response time means you
need more hardware to serve the
same load.
Faster response time means more
capacity on the same hardware.
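One way to see why this corollary holds is Little's Law, which says the average number of requests in flight equals the arrival rate times the response time. The surviving slide text does not name that law, so treating it as the basis here is an assumption. The sketch below also assumes each server handles a fixed number of concurrent requests; the function name `servers_needed` and the figures are hypothetical.

```python
def servers_needed(requests_per_second: int,
                   response_time_ms: int,
                   concurrent_per_server: int) -> int:
    """Servers required to hold all in-flight requests (Little's Law)."""
    # In-flight requests = rate * response time. Both sides are scaled by
    # 1000 so the arithmetic stays in integers (milliseconds).
    in_flight_x1000 = requests_per_second * response_time_ms
    capacity_x1000 = concurrent_per_server * 1000
    return -(-in_flight_x1000 // capacity_x1000)  # ceiling division


if __name__ == "__main__":
    for ms in (100, 200, 400):
        n = servers_needed(1000, ms, 50)
        print(f"response time {ms} ms -> {n} servers")
```

Under these assumptions, doubling the response time at a fixed request rate doubles the number of servers required, and halving it frees half of them.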
Example:
Availability lookups handled separately from reservations.
[Diagram: client side, a load balancer/content switch, and four nodes labeled AS.]
Example:
Akamai DNS responds with nearest
point-of-presence.
Static content is
inherently parallel!
Consistency:
There exists a total ordering on all
operations, and all nodes in the system
agree on that ordering at every point in
time.
Availability:
Every request received by a non-failing
node must result in a response. (Every
algorithm must terminate.)
Partition-tolerance:
The network may lose arbitrarily many
messages from any subset of nodes to any
other subset of nodes.
Formal definitions from Gilbert, Lynch. “Brewer’s Conjecture and the Feasibility of Consistent, Available, Partition-Tolerant Web Services”
ACM SIGACT News, 2002.
[Diagram: Editor, Approve copy, 1 hour.]
With 2 hours of delay (minimum) built-in, does the last nanosecond really matter?
Always ask yourself:
transfer occur or
neither do.
[Diagram: Database 1 and Database 2.]
transactions
asynchronously.
[Diagram: Database 1 and Database 2, with steps labeled "Give money to user B", "Send", and "Reconcile".]
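The slide text here is fragmentary, so the following is only one plausible reading of the "give money, send, reconcile" steps: debit locally on one database, send a message, credit on the other database later, and periodically check for transfers that never arrived. All names are hypothetical, and in-memory dictionaries stand in for the two databases.

```python
from collections import deque


def debit_and_enqueue(db1, outbox, sent_log, transfer_id, sender, receiver, amount):
    """Local step on database 1: take the money and queue a message."""
    if db1[sender] < amount:
        raise ValueError("insufficient funds")
    db1[sender] -= amount
    sent_log.append(transfer_id)
    outbox.append({"id": transfer_id, "to": receiver, "amount": amount})


def deliver(outbox, db2, applied_ids):
    """Asynchronous step on database 2: credit each transfer at most once."""
    while outbox:
        msg = outbox.popleft()
        if msg["id"] in applied_ids:
            continue  # duplicate delivery, already credited
        db2[msg["to"]] = db2.get(msg["to"], 0) + msg["amount"]
        applied_ids.add(msg["id"])


def reconcile(sent_log, applied_ids):
    """Return ids of transfers that were debited but not yet credited."""
    return [t for t in sent_log if t not in applied_ids]


if __name__ == "__main__":
    db1, db2 = {"A": 100}, {"B": 0}
    outbox, sent_log, applied = deque(), [], set()
    debit_and_enqueue(db1, outbox, sent_log, "t1", "A", "B", 30)
    print("pending before delivery:", reconcile(sent_log, applied))
    deliver(outbox, db2, applied)
    print("balances:", db1, db2, "pending:", reconcile(sent_log, applied))
```

The design choice in this sketch is that no step spans both databases: each side commits on its own, and the reconcile pass is what detects a transfer that was debited but never credited.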
Post-relational databases
SimpleDB, BigTable, Hypertable
Michael Nygard
michael.nygard@n6consulting.com
www.michaelnygard.com/blog