9.1 Introduction
In this lecture we are going to derive Chernoff bounds. We will then look at applications of
Chernoff bounds to coin flipping, hypergraph coloring and randomized rounding.
Recall Markov's inequality: for a non-negative random variable $X$ and any $\lambda > 0$,
$$\Pr[X > \lambda] < \frac{E[X]}{\lambda}$$
Applying Markov's inequality to $f(X)$ for a non-negative function $f$,
$$\Pr[f(X) > f(\lambda)] < \frac{E[f(X)]}{f(\lambda)}$$
In particular, if $f$ is monotone increasing, then
$$\Pr[X > \lambda] = \Pr[f(X) > f(\lambda)] < \frac{E[f(X)]}{f(\lambda)}$$
First we will state our assumptions and definitions. Let $X$ be a sum of $n$ independent random variables $\{X_i\}$, with $E[X_i] = p_i$. We assume for simplicity that $X_i \in \{0, 1\}$ for all $i \le n$. Similar bounds hold for the case when the $X_i$'s are arbitrary bounded random variables (see Homework 3). Let $\mu$ denote the expected value of $X$. Then we have
$$\mu = E\Big[\sum_i X_i\Big] = \sum_i E[X_i] = \sum_i p_i$$
To establish our bound we pick $f(X) = e^{tX}$, and compute the probability that $X$ deviates significantly from $\mu$:
$$\Pr[X > (1+\delta)\mu] = \Pr\big[e^{tX} > e^{(1+\delta)t\mu}\big] \le \frac{E[e^{tX}]}{e^{(1+\delta)t\mu}} \qquad (9.4.1)$$
We will now establish a bound on $E[e^{tX}]$.
$$E[e^{tX}] = E\big[e^{t\sum_i X_i}\big] = E\Big[\prod_i e^{tX_i}\Big]$$
$$= \prod_i E[e^{tX_i}] \qquad \text{(by independence)}$$
$$= \prod_i \big(p_i e^t + (1 - p_i)\cdot 1\big)$$
$$= \prod_i \big(1 + p_i(e^t - 1)\big)$$
Using $1 + x \le e^x$,
$$E[e^{tX}] \le \prod_i e^{p_i(e^t - 1)} = e^{\sum_i p_i(e^t - 1)} = e^{(e^t - 1)\mu}$$
Substituting $E[e^{tX}] \le e^{(e^t - 1)\mu}$ into Equation 9.4.1, we get that for all $t \ge 0$,
$$\Pr[X > (1+\delta)\mu] \le \frac{e^{(e^t - 1)\mu}}{e^{(1+\delta)t\mu}}$$
In order to make the bound as tight as possible, we find the value of $t$ that minimizes the above expression: differentiating the exponent $(e^t - 1)\mu - (1+\delta)t\mu$ with respect to $t$ and setting the derivative to zero gives $t = \ln(1+\delta)$. Substituting this into the above expression, we obtain, for all $\delta \ge 0$:
$$\Pr[X > (1+\delta)\mu] \le e^{\big((e^{\ln(1+\delta)} - 1) - (1+\delta)\ln(1+\delta)\big)\mu} = \Big[e^{\delta - (1+\delta)\ln(1+\delta)}\Big]^{\mu}$$
Assuming that $0 \le \delta < 1$, and thereby ignoring the higher order terms of the Taylor expansion, we get
$$(1+\delta)\ln(1+\delta) > \delta + \frac{\delta^2}{2} - \frac{\delta^3}{6} \ge \delta + \frac{\delta^2}{3}$$
Plugging this into our original expression we obtain:
$$\Pr[X > (1+\delta)\mu] \le e^{-\delta^2\mu/3} \qquad (0 < \delta < 1)$$
A symmetric argument for the lower tail gives
$$\Pr[X < (1-\delta)\mu] \le e^{-\delta^2\mu/2} \qquad (0 < \delta < 1)$$
For arbitrary $\delta \ge 0$, one can instead use the inequality
$$\delta - (1+\delta)\ln(1+\delta) \le \frac{-\delta^2}{2+\delta}$$
Hence we obtain the following bound, which works for all positive values of $\delta$:
$$\Pr[X > (1+\delta)\mu] \le e^{-\delta^2\mu/(2+\delta)} \qquad (\delta \ge 0)$$
$$\Pr[X < (1-\delta)\mu] \le e^{-\delta^2\mu/(2+\delta)} \qquad (\delta \ge 0)$$
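As a sanity check on the bound just derived, one can compare it against the exact binomial upper tail. The sketch below assumes $X \sim \mathrm{Binomial}(n, 1/2)$ with $n = 200$; the choice of $n$ and the tested values of $\delta$ are arbitrary illustrations.

```python
import math

def binom_upper_tail(n, p, thresh):
    """Exact Pr[X > thresh] for X ~ Binomial(n, p)."""
    return sum(math.comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(math.floor(thresh) + 1, n + 1))

n, p = 200, 0.5
mu = n * p
for delta in (0.1, 0.3, 0.5, 1.0):
    exact = binom_upper_tail(n, p, (1 + delta) * mu)
    bound = math.exp(-delta**2 * mu / (2 + delta))
    # the Chernoff bound must dominate the exact tail probability
    assert exact <= bound
```

The same check with `math.exp(-delta**2 * mu / 3)` works for $0 < \delta < 1$, matching the simpler form above.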
9.5 Examples
9.5.1 Coin-flipping
We will now use Chernoff bounds to analyze n coin flips of an unbiased coin.
Let $X_i = 1$ if the $i$th flip is heads and $0$ otherwise. Let $X = \sum_i X_i$ be the number of heads in $n$ flips. It is immediate that we expect to see $\mu = n/2$ heads. We now compute the deviation using Chernoff bounds.
$$\Pr[X \ge \mu + \lambda] = \Pr\Big[X \ge \mu\Big(1 + \frac{\lambda}{\mu}\Big)\Big] \le e^{-\frac{\lambda^2}{\mu^2}\cdot\frac{\mu}{3}} = e^{-\frac{\lambda^2}{3\mu}}$$
If $\lambda^2 = O(n)$, we get that $\Pr\big[X \ge \mu(1 + \frac{\lambda}{\mu})\big] \le e^{-O(1)}$. To obtain a lower probability of error, we can use $\lambda = O(\sqrt{n \log n})$, getting $\Pr\big[X \ge \mu(1 + \frac{\lambda}{\mu})\big] \le \frac{1}{n^C}$. Hence, w.h.p., $X \in \mu \pm O(\sqrt{n \log n})$.
Let us now compare this with the bound given by Chebyshev's inequality. Note that $\sigma^2(X_i) = p(1-p) = \frac{1}{4}$. Therefore, $\sigma^2(X) = \frac{n}{4}$. So we get
$$\Pr[X \ge \mu + \lambda] \le \frac{\sigma^2}{\lambda^2} = O\Big(\frac{1}{\log n}\Big) \qquad \text{with } \lambda = O(\sqrt{n \log n})$$
Chernoff gives a much stronger bound on the probability of deviation than Chebyshev. This is because Chebyshev only uses pairwise independence between the r.v.s, whereas Chernoff uses full independence. Full independence can sometimes imply exponentially better bounds.
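The gap can be seen concretely in a minimal numerical comparison ($n = 10{,}000$ and $\lambda = \sqrt{n \log n}$ are illustrative choices; constants are dropped as in the text):

```python
import math

n = 10_000
mu, var = n / 2, n / 4                   # X = number of heads in n fair flips
lam = math.sqrt(n * math.log(n))         # deviation on the scale sqrt(n log n)

chebyshev = var / lam**2                 # sigma^2 / lambda^2 = 1/(4 log n)
chernoff = math.exp(-lam**2 / (3 * mu))  # e^{-lambda^2 / (3 mu)}

# Chernoff decays exponentially in lambda^2/mu, Chebyshev only polynomially
assert chernoff < chebyshev
```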
We now apply Chernoff bounds to hypergraph coloring: given sets $S_1, \ldots, S_m$ over $n$ vertices, color each vertex $v$ independently and uniformly at random with a color $Y_v \in \{0, 1\}$. Let $Y = \sum_{v \in S_i} Y_v$. Then $\mathrm{Disc}(S_i) = 2|k/2 - Y|$, where $k = |S_i|$. Note that $E[Y] = \frac{k}{2}$. Applying Chernoff bounds:
$$\Pr\Big[\big|Y - \tfrac{k}{2}\big| \ge \lambda\Big] \le e^{-\frac{2\lambda^2}{3k}}$$
Substituting $\lambda = \sqrt{3k \log m}$,
$$\Pr\Big[\mathrm{Disc}(S_i) \ge \sqrt{12k \log m}\Big] \le \frac{1}{m^2}$$
Taking a union bound over all $m$ sets (using $k \le n$),
$$\Pr\Big[\exists i :\ \mathrm{Disc}(S_i) \ge \sqrt{12n \log m}\Big] \le \sum_i \Pr\Big[\mathrm{Disc}(S_i) \ge \sqrt{12n \log m}\Big] \le m \cdot \frac{1}{m^2} = \frac{1}{m}$$
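The bound can be checked by simulation. In the sketch below, the hypergraph is random and the sizes $n = 1000$, $m = 200$ are illustrative assumptions, not from the analysis above:

```python
import math
import random

def discrepancy(S, coloring):
    """Disc(S) = 2|k/2 - Y| where Y = number of vertices of S colored 1."""
    y = sum(coloring[v] for v in S)
    return abs(2 * y - len(S))

random.seed(0)                       # reproducible sketch
n, m = 1000, 200
sets = [{v for v in range(n) if random.random() < 0.5} for _ in range(m)]
coloring = [random.randint(0, 1) for _ in range(n)]

worst = max(discrepancy(S, coloring) for S in sets)
bound = math.sqrt(12 * n * math.log(m))
# with probability at least 1 - 1/m, every set is below the bound
assert worst <= bound
```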
We now turn to randomized rounding, illustrated on a routing problem: route a unit of flow for each demand pair in $D$ so as to minimize edge congestion. Solving this program is NP-complete, so a natural way to relax the problem is to allow $f_P^i \in [0,1]$ for all $i, P$, transforming the above program into a linear program. Note that the congestion of the optimal solution to the LP is less than or equal to the best possible congestion. Therefore an approximation to the congestion of this solution implies an approximation to the problem.
We solve the LP to obtain a fractional solution and then round it to obtain an integral solution. The solution to the LP is a unit flow between every pair of vertices in $D$. Suppose that this solution has congestion $C^*$. In order to round this, for every $i$, we pick a path $P_i \in \mathcal{P}_i$ with probability $f_{P_i}^i$. Now let us analyze the congestion on an edge $e$. Note that flow $i$ picks $e$ w.p. $\sum_{\{P \in \mathcal{P}_i,\, e \in P\}} f_P^i$. Let $X_i^e = 1$ if flow $i$ uses $e$. Then the congestion on $e$ is given by $X^e = \sum_i X_i^e$.
$$E[X_i^e] = \sum_{\{P \in \mathcal{P}_i,\, e \in P\}} f_P^i$$
$$E[X^e] = E\Big[\sum_i X_i^e\Big] = \sum_i E[X_i^e] = \sum_i \sum_{\{P \in \mathcal{P}_i,\, e \in P\}} f_P^i \le C^*$$
We want to pick a value of $k$ such that $\Pr[X^e > kC^*] < \frac{1}{n^3}$. Then, taking a union bound over all edges, we get a $k$-approximation. (Think about this!)
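The rounding procedure itself is easy to sketch. The two-commodity fractional solution `frac_flow` below is a made-up illustration (not the output of an actual LP solver); paths are tuples of edges, and the weights $f_P^i$ sum to 1 per commodity:

```python
import random
from collections import defaultdict

# Hypothetical fractional LP solution: commodity i -> {path: f_P^i}
frac_flow = {
    0: {(("s0", "a"), ("a", "t0")): 0.7, (("s0", "b"), ("b", "t0")): 0.3},
    1: {(("s1", "a"), ("a", "t1")): 0.5, (("s1", "b"), ("b", "t1")): 0.5},
}

def round_paths(frac_flow, rng):
    """Pick one path P_i per commodity i with probability f_{P_i}^i."""
    chosen = {}
    for i, paths in frac_flow.items():
        options, weights = zip(*paths.items())
        chosen[i] = rng.choices(options, weights=weights)[0]
    return chosen

def congestion(chosen):
    """Max over edges e of X^e = number of chosen paths using e."""
    load = defaultdict(int)
    for path in chosen.values():
        for e in path:
            load[e] += 1
    return max(load.values())

rng = random.Random(0)
print(congestion(round_paths(frac_flow, rng)))  # 1 or 2, depending on the draw
```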
Claim 9.5.2 If $C^* \gg \log n$ then the above rounding gives a $(1+\epsilon)$-approximation.
Proof: If $k = 1 + \epsilon$ then, using the simpler form of Chernoff, we get $\Pr[X^e > kC^*] \le e^{-\epsilon^2 C^*/3} \le \frac{1}{n^{\mathrm{const}}}$.
When $C^*$ is small, we do not get a $(1+\epsilon)$-approximation with high probability. Instead, we know that for any $k = 1 + \delta$, $\delta > 0$, $\Pr[X^e > kC^*] \le e^{-\frac{\delta^2 C^*}{2+\delta}} \simeq e^{-\delta C^*}$ for $\delta \gg 2$. Here $\delta = O(\log n)$ suffices, and we get an $O(\log n)$ approximation.
In order to get a better approximation, we use the stronger form of Chernoff. For any $k = 1 + \delta$:
$$\Pr[X^e > kC^*] \le \left[\frac{e^\delta}{(1+\delta)^{(1+\delta)}}\right]^{C^*} \le \frac{e^\delta}{(1+\delta)^{(1+\delta)}} \qquad \text{as } C^* \ge 1$$
Now if we pick $\delta = O\big(\frac{\log n}{\log\log n}\big)$, then $(1+\delta)\ln(1+\delta) = \Theta(\ln n)$, and therefore $(1+\delta)^{(1+\delta)} > n^c$ for some constant $c$. Therefore we get that, with high probability, the congestion is at most $1 + \delta = O\big(\frac{\log n}{\log\log n}\big)$.
This is essentially the best possible bound achievable using randomized rounding.
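To get a feel for this choice of $\delta$, here is a small numeric check (the specific values of $n$ are arbitrary) that $(1+\delta)\ln(1+\delta)$ stays within a constant factor of $\ln n$ when $\delta = \ln n / \ln\ln n$:

```python
import math

for n in (1e3, 1e6, 1e12, 1e24):
    ln_n = math.log(n)
    delta = ln_n / math.log(ln_n)        # delta = ln n / ln ln n
    ratio = (1 + delta) * math.log(1 + delta) / ln_n
    # (1 + delta) ln(1 + delta) = Theta(ln n): the ratio stays near 1
    assert 0.5 < ratio < 1.5
```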