
Part 1: Hadoop

A demo about setting up a
Hadoop cluster with PoolParty
Service-oriented
deployment
• I want to deploy a service
(http/mail/etc.)
• I don’t care about the platform
• I want it to work
• I want to do it a million times
• I want it now
Self managing
Gruntwork
Why?
Let’s be

• Cutting-edge
• Intuitive
• Tool-driven
• Lazy
Tools
Because we are human
And not apes
Big frontier
lots of settlements
• Shell scripts
• Capistrano
• Package managers (apt, yum,
tpkg)
• Chef
• Puppet
PoolParty
Enjoyable cloud infrastructure
Demo
Part 2
Distributed Algorithms
Discussion of distributed algorithms, the
Hermes project and “NoSQL”
What?

• A distributed algorithm is an
algorithm designed to run on
computer hardware constructed
from interconnected processors.
(Wikipedia)
Why?
• Because scale is becoming
increasingly important
• “Datacenters” are becoming
accessible
• Commodity hardware is cheap
• Network is cheaper
When?

• Now
Assorted types
• MapReduce
• Atomic Commit
• Consensus
• Mutual exclusion (distributed
mutex)
• Distributed search
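As an illustration of the first item, here is a minimal single-process sketch of the MapReduce pattern, counting words. The function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and chosen for this example; a real framework such as Hadoop runs the map and reduce steps on many nodes and handles the shuffle over the network.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group all emitted values by key, as a framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: collapse the values for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce in sequence over a list of documents."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())
```

Because each call to `map_phase` and `reduce_phase` depends only on its own input, the calls could be handed to different machines, which is what makes the pattern attractive at scale.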
Why it’s easy

• Math is fun
Why it’s hard

• Account for failure
• Unsafe networks
• Data sharding
• Job distribution
Decentralization
assumptions
• Nodes are prone to failure
• Nodes are homogeneous*
• The dataset is large
• Nodes are cheap (easy to
add/remove)
• Network is unowned
Scale

• Greater utilization of hardware
• Inexpensive
• Cooperative application space
• And it’s green
NoSQL

• Scaling relational databases is not
easy and seriously no fun at all
• Key/Value stores are easier to
scale
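One reason key/value stores are easier to scale is that every operation touches a single key, so the key alone can decide which node holds the data. The sketch below assumes simple hash-modulo placement; `shard_for` and `ShardedStore` are hypothetical names for this example, and in-memory dicts stand in for remote nodes.

```python
import hashlib


def shard_for(key, num_shards):
    """Pick a shard index from a stable hash of the key."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedStore:
    """Toy key/value store; each dict stands in for one node."""

    def __init__(self, num_shards):
        self.shards = [{} for _ in range(num_shards)]

    def put(self, key, value):
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key, default=None):
        return self.shards[shard_for(key, len(self.shards))].get(key, default)


store = ShardedStore(4)
store.put("user:42", {"name": "Ari"})
print(store.get("user:42"))
```

With plain modulo placement, changing the number of shards moves most keys. Production stores usually use consistent hashing or a similar scheme so that adding or removing a node relocates only a small fraction of the data.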
NoSQL

• BigTable (column-oriented
database)
• Cassandra
• Voldemort
• Scalaris
Paxos
• An algorithm for deciding
consensus within a network of
unreliable nodes.
• “Transaction” layer
• Atomic commits
• Strong data consistency
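The following is a minimal in-process sketch of single-decree Paxos, to show how a majority of acceptors can agree on one value even when some nodes are unreachable. It is a teaching sketch under stated assumptions, not the Hermes implementation: the names `Acceptor` and `propose` are hypothetical, messages are plain method calls, and an unreachable node is modelled by leaving it out of the `reachable` list.

```python
class Acceptor:
    """One Paxos acceptor holding its promise and any accepted proposal."""

    def __init__(self):
        self.promised = -1          # highest proposal number promised so far
        self.accepted_n = -1        # number of the accepted proposal, -1 if none
        self.accepted_value = None

    def prepare(self, n):
        """Phase 1: promise to ignore proposals numbered below n."""
        if n > self.promised:
            self.promised = n
            return True, self.accepted_n, self.accepted_value
        return False, None, None

    def accept(self, n, value):
        """Phase 2: accept unless a higher-numbered promise has been made."""
        if n >= self.promised:
            self.promised = n
            self.accepted_n = n
            self.accepted_value = value
            return True
        return False


def propose(reachable, cluster_size, n, value):
    """Run one round with proposal number n; return the chosen value or None."""
    majority = cluster_size // 2 + 1

    # Phase 1: gather promises from the acceptors we can reach.
    promises = []
    for acceptor in reachable:
        ok, acc_n, acc_value = acceptor.prepare(n)
        if ok:
            promises.append((acc_n, acc_value))
    if len(promises) < majority:
        return None

    # If any acceptor already accepted a value, adopt the highest-numbered one.
    prev_n, prev_value = max(promises, key=lambda p: p[0])
    if prev_n >= 0:
        value = prev_value

    # Phase 2: ask the same acceptors to accept the (possibly adopted) value.
    accepted = sum(1 for acceptor in reachable if acceptor.accept(n, value))
    return value if accepted >= majority else None
```

The adoption step in the middle is what gives the strong consistency mentioned above: once a value has been accepted by a majority, any later proposer that reaches a majority will see it and re-propose it instead of its own value.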
Paxos

• Devised by Leslie Lamport in 1990
• Published in 1998
• Based on Viewstamped Replication
(published 2 years earlier)
Big names

• Google’s Chubby (and BigTable)
• IBM SAN Volume Controller
• Microsoft
What?
What? (cont’d)
Hermes
Open-source internode communication
project
What

• Erlang-y
• Consensus algorithms
• Distributed mutex
• Mapping/Reduction
Where? (almost)
http://github.com/auser/hermes/tree/master
Thanks
arilerner@mac.com
Thanks
• Ari Lerner
• AT&T CloudTeam
• And all the various funny image
sources
• irc.freenode.net/#poolpartyrb
• You
