Architecture Lessons Learned from the Trials of Scaling a High Traffic Website

• Founded in 2005
• 3rd Largest Social Network in United States
• Teenage Demographic
• 60+ Employees

January 2007
• 100M Pageviews
• 1 Database Server
• 1 Web Application Server
• Daily issues with load and site availability

September 2008
• 2.5B Pageviews
• 30 Database Servers
• 120 Web Application Servers
• 99.94% Uptime as measured by pingdom.com

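To put the 99.94% figure in perspective, the short sketch below converts an uptime percentage into permitted downtime. The 30-day window is an assumption made for illustration; the slides do not state the measurement period.

```python
def allowed_downtime_minutes(uptime_percent: float, days: int = 30) -> float:
    """Return the minutes of downtime permitted over `days` at the given uptime."""
    total_minutes = days * 24 * 60
    return total_minutes * (1 - uptime_percent / 100)

# 30 days is 43,200 minutes; 0.06% of that is roughly 26 minutes.
print(round(allowed_downtime_minutes(99.94), 1))
```
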
Key Architecture Components
• PHP5, APC
• Lighttpd
• Apache httpd
• Isilon IQ Clustered NAS
• PostgreSQL
• Message Systems eCelerity
• Memcached
• Apache ActiveMQ
• Subversion

Web Application Architecture
• 2005-2007: Monolithic Code Base
• 2008: Migrating to a Services Oriented Architecture
  – Applications get own resources
  – Loosely Coupled architecture
• MVC Application using XSLT

Web Application Architecture
• Why SOA?
  – Monolithic app wastes hardware
  – Cross Data-Center Operations
  – Selective Maintenance

Scaling Postgres

Rules for Scaling
1. Plan for Growth
2. Know the internals
3. Bigger Hardware is Better

Our Postgres Scaling History
• Quarter 1, 2007
  – Monolithic database with one schema, many complex joins and poor optimization
  – No plan for growth
  – No DBA

Our Postgres Scaling History
• Quarter 3, 2008
  – Horizontal "Sharded" Data
  – Vertical Partitioning
  – 5000 Connections/sec Avg

Scaling Postgres: Lessons Learned
• Scaling web servers means many database connections, so pooling was needed
  – Started with pgPool, moved to pgBouncer
• Started with Slony replicating to read-only slaves
  – High IO/CPU Overhead

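The pooling idea can be sketched in a few lines. This is a minimal in-process illustration, not how pgBouncer is implemented: pgBouncer is a separate proxy sitting between the web servers and PostgreSQL. `FakeConnection` and `Pool` are hypothetical names, and the pool size of 5 is arbitrary.

```python
import queue
from contextlib import contextmanager

class FakeConnection:
    """Stand-in for a real database connection (hypothetical)."""
    opened = 0  # counts how many connections were ever created

    def __init__(self):
        FakeConnection.opened += 1

    def execute(self, sql: str) -> str:
        return f"ok: {sql}"

class Pool:
    """Fixed-size pool: connections are opened once and then reused."""

    def __init__(self, size: int):
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(FakeConnection())

    @contextmanager
    def connection(self):
        conn = self._idle.get()      # blocks if all connections are busy
        try:
            yield conn
        finally:
            self._idle.put(conn)     # hand the connection back for reuse

pool = Pool(size=5)
results = []
for request_id in range(100):        # 100 requests share 5 connections
    with pool.connection() as conn:
        results.append(conn.execute(f"SELECT {request_id}"))
```

The database only ever sees five connections here, however many requests arrive. That is the property that matters once the number of web servers grows.
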
Scaling Postgres: Lessons Learned
• Began scaling vertically by separating application data onto separate database servers, and removed the read-only slaves
• Needed a few small tables replicated that could be slightly inaccurate and eventually consistent (BASE)

Scaling Postgres: Lessons Learned
• Enter plProxy
  – Database partitioning language by Skype utilizing PostgreSQL functions
  – Trigger based plProxy functions replicate needed tables without the Queue overhead
  – NOT TRANSACTION SAFE

Scaling Postgres: Lessons Learned
• Standard Use of plProxy
  – Horizontal partitioning of data by ID across multiple servers
  – Example: Messaging System
    • 8 Servers store actual partitioned message data
    • Rule #1 – Plan for Growth

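Partitioning by ID comes down to a deterministic mapping from an ID to a server. plProxy expresses this inside PostgreSQL functions; the Python sketch below only illustrates the routing idea. The host names are hypothetical, and the bit-mask scheme with a power-of-two partition count is an assumption chosen for the illustration.

```python
NUM_PARTITIONS = 8  # matches the 8 message servers mentioned above

# Hypothetical host names, for illustration only.
PARTITIONS = [f"msgdb{n}.example.internal" for n in range(NUM_PARTITIONS)]

def partition_for(user_id: int) -> str:
    """Map a numeric ID to one partition using a bit mask.

    Masking with (count - 1) only works when the count is a power of two.
    """
    if NUM_PARTITIONS & (NUM_PARTITIONS - 1) != 0:
        raise ValueError("partition count must be a power of two")
    return PARTITIONS[user_id & (NUM_PARTITIONS - 1)]

# The same ID always lands on the same server.
print(partition_for(42))
```

A power-of-two count ties in with "Plan for Growth": doubling the count splits each partition into exactly two, so only part of one server's data moves to each new server.
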
Scaling Postgres: Lessons Learned
• Knowing internals
  – pg_catalog
    • pg_stat_user_tables
    • pg_stat_user_indexes

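As one example of using these views, the sketch below flags tables that are read mostly by sequential scan. The column names (`relname`, `seq_scan`, `seq_tup_read`, `idx_scan`) exist in `pg_stat_user_tables`. The threshold and the sample rows are invented for illustration, and the query is shown as a string rather than executed so that the example runs without a database.

```python
# Query one might run against PostgreSQL to obtain the rows.
QUERY = """
SELECT relname, seq_scan, seq_tup_read, idx_scan
FROM pg_stat_user_tables
"""

def seq_scan_heavy(rows, min_scans=1000):
    """Return table names that are mostly read by sequential scan, worst first."""
    flagged = []
    for row in rows:
        idx_scan = row["idx_scan"] or 0   # treat a NULL idx_scan as zero
        if row["seq_scan"] >= min_scans and row["seq_scan"] > idx_scan:
            flagged.append((row["seq_tup_read"], row["relname"]))
    return [name for _, name in sorted(flagged, reverse=True)]

# Made-up sample rows standing in for a query result.
sample = [
    {"relname": "messages", "seq_scan": 50_000, "seq_tup_read": 9_000_000, "idx_scan": 120},
    {"relname": "users", "seq_scan": 40, "seq_tup_read": 4_000, "idx_scan": 800_000},
    {"relname": "lookup", "seq_scan": 2_000, "seq_tup_read": 20_000, "idx_scan": None},
]
print(seq_scan_heavy(sample))
```
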
Scaling Postgres: Knowing Internals

Scaling Postgres: Lessons Learned
• Database Ecosystem
  – Performance Factors
    • Index bloat
    • Usage changes
  – Abuse
    • Cache utilization contention

Scaling Postgres: Lessons Learned
• Bigger is Better
  – More RAM
  – More Disks
  – Faster and More CPU

Scaling Postgres: Lessons Learned
Scaling Across CPU Cores
Before and After Upgrade
• PostgreSQL Scales to 32 Cores
• Extensive Benchmarking @ MYB

Scaling Postgres: Future Plans
• More Partitioning
• SOA Data Distribution
  – Golconde
    • Python Based
    • Apache ActiveMQ

Apache ActiveMQ
• Java based Message Broker software
• Client language neutral
• Implements JMS 1.1, Stomp, XMPP, REST and Others

ActiveMQ @ myYearbook.com
Out-of-band Processing
• Uploaded content processing
  – Image Resize
  – Content analysis (R&D)
  – Anti-Virus Scans
• Comment and Message processing
  – Spam Processing
• Email spooling from web application
• Anywhere we can that makes sense
Targeted Workload
• Message Queues allow for the right server for the job
• Better distribution of CPU intensive tasks without negatively impacting the user experience
• Clusterable, Scalable

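The out-of-band pattern can be sketched as follows: the web request only enqueues a job and returns, and a separate consumer does the slow work. Python's in-process `queue.Queue` stands in for an ActiveMQ queue here. The function names and the job format are hypothetical.

```python
import queue
import threading

jobs = queue.Queue()        # in-process stand-in for a message broker queue
resized = []                # records completed work, for demonstration

def handle_upload(filename: str) -> str:
    """Web request path: enqueue the slow work and return immediately."""
    jobs.put({"task": "resize", "file": filename})
    return "upload accepted"

def worker() -> None:
    """Background consumer: would run on a separate, suitably sized server."""
    while True:
        job = jobs.get()
        if job is None:     # sentinel tells the worker to stop
            jobs.task_done()
            break
        resized.append(job["file"])   # placeholder for the real image resize
        jobs.task_done()

thread = threading.Thread(target=worker)
thread.start()
responses = [handle_upload(name) for name in ("a.jpg", "b.jpg", "c.jpg")]
jobs.put(None)
jobs.join()                 # wait until every queued job has been processed
thread.join()
```

With a real broker the consumer lives on different hardware from the web tier, which is what lets CPU-heavy work be sized and scaled independently of page serving.
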
Memcached: Key for Success
• Valuable Scaling Tool
  – Over 250k get requests per second during peak
  – Over 750GB of cached data
  – Easy to Deploy
  – The more distributed the cache becomes, the less impact each cache failure has: more boxes are better than fewer

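The "more boxes" point can be checked with a small simulation: when keys are spread across N nodes, losing one node loses roughly 1/N of the cache. The sketch assumes simple hash-modulo placement and synthetic key names. Real memcached clients may place keys differently, for example with consistent hashing.

```python
import hashlib

def node_for(key: str, node_count: int) -> int:
    """Pick a cache node by hashing the key (simple modulo placement)."""
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % node_count

def share_lost_if_one_node_fails(node_count: int, keys) -> float:
    """Fraction of keys that lived on node 0, i.e. lost if that node dies."""
    keys = list(keys)
    lost = sum(1 for key in keys if node_for(key, node_count) == 0)
    return lost / len(keys)

keys = [f"user:{n}" for n in range(10_000)]   # synthetic keys
for nodes in (2, 10, 50):
    print(nodes, round(share_lost_if_one_node_fails(nodes, keys), 3))
```
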
Memcached: Potential Problems
• Large scale implementations can have some hidden problems
  – Lots of network traffic
  – Non-partitioned or unevenly distributed data
• What to do for data that is not evenly distributed?
  – Implemented a round-robin cluster of memcache servers that contain the same data

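A round-robin cluster of identical caches might look like the sketch below: every write goes to all replicas, and reads rotate across them, so a single hot key no longer hammers one server. Plain dictionaries stand in for memcached servers, and the class and key names are hypothetical. The slides do not describe the actual implementation.

```python
import itertools

class ReplicatedCache:
    """Writes go to every replica; reads rotate across replicas.

    Plain dicts stand in for memcached servers (an assumption for this sketch).
    """

    def __init__(self, replica_count: int):
        self.replicas = [{} for _ in range(replica_count)]
        self.reads_per_replica = [0] * replica_count
        self._next = itertools.cycle(range(replica_count))

    def set(self, key, value):
        for replica in self.replicas:     # same data on every server
            replica[key] = value

    def get(self, key):
        index = next(self._next)          # round-robin choice of replica
        self.reads_per_replica[index] += 1
        return self.replicas[index].get(key)

cache = ReplicatedCache(replica_count=3)
cache.set("site:config", {"theme": "blue"})
values = [cache.get("site:config") for _ in range(9)]
```

The trade-off is extra write traffic and memory, since each value is stored once per replica. That cost is why this approach suits a small set of very hot keys better than the whole cache.
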
Research and Development
• Copyr
  – Copy-on-Write Filesystem Replication
• Framewerk
  – PHP5 OO Development Framework
• Golconde
  – Queue Based Data Distribution for PostgreSQL
• Lightr
  – PHP5 XMPP Class Library
• mod_xsltd
  – Lighttpd XSL Transformation module
• Playr
  – PostgreSQL Log Replay
• Staplr
  – STAtistical Package Logically engineered

Right Tools for Success
• Operations Portal
  – Executive Level Overview of Operational Status and Production Change Log
• Staplr
  – Trending & Analytics System

Operations Portal

Trending and Analysis: Staplr
• Version 0.6
  – PHP Based
  – Process forking
  – Shelled RRD Commands
• Version 2.0
  – Python Based
  – Threaded
  – Python wrappers to librrd

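A threaded poller in the spirit of version 2.0 could be sketched as below. This is not Staplr's code. The poller functions are hypothetical placeholders returning fixed values, where real pollers would query each service and hand the samples to RRD.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical pollers; real ones would query each service for its counters.
def poll_memcached():
    return {"get_hits": 0}

def poll_postgresql():
    return {"connections": 0}

POLLERS = {"memcached": poll_memcached, "postgresql": poll_postgresql}

def poll_all(pollers):
    """Run every poller in its own thread and collect results by name."""
    with ThreadPoolExecutor(max_workers=len(pollers)) as executor:
        futures = {name: executor.submit(fn) for name, fn in pollers.items()}
        return {name: future.result() for name, future in futures.items()}

samples = poll_all(POLLERS)
```

Threads suit this workload because each poller spends most of its time waiting on the network, so a slow service does not delay polling of the others.
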
Trending and Analysis: Staplr
• Polls for:
  – Apache httpd
  – Apache ActiveMQ
  – lighttpd
  – memcached
  – MySQL
  – pgBouncer
  – PostgreSQL
  – SNMP Data
    • APC, Isilon, F5, Xiotech, Others
  – SysStat

Questions?