Neal, WKU
MATH 329
Chebyshev's Inequality
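Let $X$ be an arbitrary random variable with mean $\mu$ and variance $\sigma^2$. What is the probability that $X$ is within $t$ of its average $\mu$? If we knew the exact distribution and pdf of $X$, then we could compute this probability:

$$P(|X - \mu| \le t) = P(\mu - t \le X \le \mu + t).$$

But there is another way to find a lower bound for this probability. For instance, we may obtain an expression like $P(|X - \mu| \le 2) \ge 0.60$. That is, there is at least a 60% chance for an obtained measurement of this $X$ to be within 2 of its mean.

Theorem (Chebyshev's Inequality). Let $X$ be a random variable with mean $\mu$ and variance $\sigma^2$. Then for every $t > 0$,

$$P(|X - \mu| > t) \le \frac{\sigma^2}{t^2} \quad \text{and} \quad P(|X - \mu| \le t) \ge 1 - \frac{\sigma^2}{t^2}.$$

Proof. Consider

$$Y = \begin{cases} t^2 & \text{if } |X - \mu| > t \\ 0 & \text{otherwise.} \end{cases}$$

Then $Y \le |X - \mu|^2$ in every case, so

$$t^2 \, P(|X - \mu| > t) = E[Y] \le E\big[|X - \mu|^2\big] = \mathrm{Var}(X) = \sigma^2;$$

thus, $P(|X - \mu| > t) \le \sigma^2 / t^2$. Therefore,

$$P(|X - \mu| \le t) = 1 - P(|X - \mu| > t) \ge 1 - \frac{\sigma^2}{t^2}.$$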
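The key step of the proof, that $Y$ never exceeds $(X - \mu)^2$, can be checked on simulated data. The following is only a sketch under assumptions that are not part of the notes: it uses an exponential distribution with rate 1 (mean 1, variance 1), the cutoff $t = 2$, and a fixed random seed, all chosen purely for illustration.

```python
import random

random.seed(0)  # fixed seed so the run is reproducible (illustrative choice)

# Assumed test case: X ~ exponential with rate 1, so mu = 1 and variance = 1.
mu, variance, t = 1.0, 1.0, 2.0
samples = [random.expovariate(1.0) for _ in range(100_000)]

# Y = t^2 when |X - mu| > t and 0 otherwise, as in the proof.
y_values = [t**2 if abs(x - mu) > t else 0.0 for x in samples]

# Pointwise domination Y <= (X - mu)^2, the step that gives E[Y] <= Var(X).
dominated = all(y <= (x - mu) ** 2 for x, y in zip(samples, y_values))

# Empirical tail frequency compared with the Chebyshev upper bound.
tail_freq = sum(1 for x in samples if abs(x - mu) > t) / len(samples)
chebyshev_upper = variance / t**2

print(f"Y <= (X - mu)^2 for every sample: {dominated}")
print(f"empirical P(|X - mu| > t) = {tail_freq:.4f}")
print(f"Chebyshev upper bound     = {chebyshev_upper:.4f}")
```

For this distribution the event $|X - 1| > 2$ is the same as $X > 3$, so the empirical frequency should sit near $e^{-3}$, well under the bound of 0.25.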
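Chebyshev's Inequality is meaningless when $t \le \sigma$. For instance, when $t = \sigma$ it is simply saying $P(|X - \mu| > t) \le 1$ and $P(|X - \mu| \le t) \ge 0$, which are already obvious. So we must use $t > \sigma$ to apply the inequalities. We illustrate next with some standard distributions.

Example. (a) Let $X \sim Poi(9)$. Give a lower bound for $P(|X - \mu| \le 5)$.
(b) Let $X \sim N(100, 15)$. Give a lower bound for $P(|X - \mu| \le 20)$.

Solution. (a) For $X \sim Poi(9)$, $\mu = 9 = \sigma^2$; so $\sigma = 3$. Then

$$P(4 \le X \le 14) = P(|X - 9| \le 5) = P(|X - \mu| \le 5) \ge 1 - \frac{3^2}{5^2} = 1 - \frac{9}{25} = 0.64.$$

(b) For $X \sim N(100, 15)$, we have $\sigma = 15$ and

$$P(80 \le X \le 120) = P(|X - 100| \le 20) = P(|X - \mu| \le 20) \ge 1 - \frac{15^2}{20^2} = 0.4375.$$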
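To see how loose these two bounds are, they can be set beside the exact probabilities. This is a minimal sketch using only the Python standard library; the function names `chebyshev_lower_bound`, `poisson_interval_prob`, and `normal_interval_prob` are hypothetical helpers introduced here, not part of the notes.

```python
import math


def chebyshev_lower_bound(variance, t):
    """Chebyshev lower bound for P(|X - mu| <= t), clipped at 0."""
    return max(0.0, 1.0 - variance / t**2)


def poisson_interval_prob(lam, lo, hi):
    """Exact P(lo <= X <= hi) for X ~ Poisson(lam), summing the pmf."""
    return sum(math.exp(-lam) * lam**k / math.factorial(k)
               for k in range(lo, hi + 1))


def normal_interval_prob(mu, sigma, lo, hi):
    """Exact P(lo <= X <= hi) for X ~ N(mu, sigma), sigma = standard deviation."""
    def cdf(x):
        return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))
    return cdf(hi) - cdf(lo)


# (a) X ~ Poi(9): variance 9, t = 5, interval 4 <= X <= 14.
poisson_bound = chebyshev_lower_bound(9, 5)
poisson_exact = poisson_interval_prob(9, 4, 14)

# (b) X ~ N(100, 15): variance 15^2, t = 20, interval 80 <= X <= 120.
normal_bound = chebyshev_lower_bound(15**2, 20)
normal_exact = normal_interval_prob(100, 15, 80, 120)

print(f"Poisson: bound {poisson_bound:.4f}, exact {poisson_exact:.6f}")
print(f"Normal:  bound {normal_bound:.4f}, exact {normal_exact:.6f}")
```

In both cases the exact probability is noticeably larger than the Chebyshev bound, which is expected because the bound uses only the mean and variance and nothing else about the distribution.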
Note: Using a calculator, we obtain $P(80 \le X \le 120) \approx 0.817577$.

From these examples, we see that the lower bound provided by Chebyshev's Inequality is not very accurate. However, the inequality is very useful when applied to the sample mean $\bar{x}$ from a large random sample. Recall that if $X$ is an arbitrary measurement with mean $\mu$ and variance $\sigma^2$, and $\bar{x}$ is the sample mean from random samples of size $n$, then

$$\mu_{\bar{x}} = \mu \quad \text{and} \quad \sigma^2_{\bar{x}} = \frac{\sigma^2}{n}.$$

Applying Chebyshev's Inequality, we obtain a lower bound for the probability that $\bar{x}$ is within $t$ of $\mu$:

$$P(|\bar{x} - \mu| \le t) = P(|\bar{x} - \mu_{\bar{x}}| \le t) \ge 1 - \frac{\sigma^2_{\bar{x}}}{t^2} = 1 - \frac{\sigma^2}{n t^2}.$$
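Suppose $X$ is an arbitrary measurement with unknown mean and variance but with known range such that $c \le X \le d$. Then $\sigma \le (d - c)/2$ and $\sigma^2 \le (d - c)^2 / 4$. Thus,

$$P(|\bar{x} - \mu| \le t) \ge 1 - \frac{(d - c)^2}{4 n t^2}.$$

Similarly, for a sample proportion $\hat{p}$ from samples of size $n$ that estimates a proportion $p$, we have $\mu_{\hat{p}} = p$ and

$$\sigma^2_{\hat{p}} = \frac{p(1 - p)}{n} \le \frac{0.25}{n}.$$

Thus,

$$P(|\hat{p} - p| \le t) \ge 1 - \frac{0.25}{n t^2}.$$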
Example. Let $X \sim N(100, 15)$. Let $\bar{x}$ be the sample mean from random samples of size 400. Give a lower bound for $P(|\bar{x} - \mu| \le 2)$.

Solution. For random samples of size 400, we have

$$P(98 \le \bar{x} \le 102) = P(|\bar{x} - 100| \le 2) = P(|\bar{x} - \mu| \le 2) \ge 1 - \frac{15^2}{400 \cdot 2^2} = 0.859375.$$

Thus, for samples of size 400, there is a relatively high chance that $\bar{x}$ will be within 2 of the average $\mu = 100$.

Example. Let $X$ be an arbitrary measurement with unknown distribution but with known range such that $10 \le X \le 30$. For random samples of size 1000, give a lower bound for $P(|\bar{x} - \mu| \le 1)$.

Solution. Here $\sigma \le \dfrac{30 - 10}{2} = 10$, so that $\sigma^2 \le 100$. Then

$$P(|\bar{x} - \mu| \le 1) \ge 1 - \frac{100}{1000 \cdot 1^2} = 0.90.$$

So there is at least a 90% chance that a sample mean $\bar{x}$ will be within 1 of the unknown mean $\mu$.
Example. Let $p$ be an unknown proportion that we are estimating with sample proportions $\hat{p}$ from computer simulations with samples of size 4000. Give a lower bound for $P(|\hat{p} - p| \le 0.02)$.

Solution. For the proportion $p$ and trials of size 4000, we have

$$P(|\hat{p} - p| \le 0.02) \ge 1 - \frac{0.25}{n t^2} = 1 - \frac{0.25}{4000 \cdot 0.02^2} = 0.84375.$$
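The three sample-based bounds used in these examples are simple enough to wrap in small functions. The sketch below is one possible way to do so; the names `mean_bound`, `range_bound`, and `proportion_bound` are hypothetical and introduced only for illustration.

```python
def mean_bound(variance, n, t):
    """Lower bound for P(|xbar - mu| <= t) when the variance is known."""
    return max(0.0, 1.0 - variance / (n * t**2))


def range_bound(c, d, n, t):
    """Lower bound for P(|xbar - mu| <= t) when only c <= X <= d is known.

    Uses sigma^2 <= (d - c)^2 / 4.
    """
    return max(0.0, 1.0 - (d - c) ** 2 / (4 * n * t**2))


def proportion_bound(n, t):
    """Lower bound for P(|phat - p| <= t), using p(1 - p) <= 0.25."""
    return max(0.0, 1.0 - 0.25 / (n * t**2))


# The three worked examples: N(100, 15) with n = 400 and t = 2,
# range 10 <= X <= 30 with n = 1000 and t = 1,
# and a proportion with n = 4000 and t = 0.02.
example_mean = mean_bound(15**2, 400, 2)
example_range = range_bound(10, 30, 1000, 1)
example_proportion = proportion_bound(4000, 0.02)

print(f"sample mean, known variance: {example_mean:.6f}")
print(f"sample mean, known range:    {example_range:.6f}")
print(f"sample proportion:           {example_proportion:.6f}")
```

Each function clips the result at 0 because the raw expression can go negative for small $n$ or small $t$, in which case the bound carries no information.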
Law of Large Numbers (a.k.a. Law of Averages)

Let $\bar{x}$ be the sample mean from random samples of size $n$ for a measurement with mean $\mu$, and let $\hat{p}$ be the sample proportion for a proportion $p$. As the sample size $n$ increases, the probability that $\bar{x}$ is within $t$ of $\mu$ increases to 1, and the probability that $\hat{p}$ is within $t$ of $p$ increases to 1. So for very large $n$ and small $t$, we can say that virtually all $\bar{x}$ are good approximations of $\mu$ and virtually all $\hat{p}$ are good approximations of $p$.
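The effect of the sample size can be seen by tabulating the Chebyshev lower bound $1 - \sigma^2/(n t^2)$ for growing $n$. The sketch below reuses the $N(100, 15)$ measurement with $t = 2$; the list of sample sizes and the helper name `mean_bound` are assumptions made only for illustration.

```python
def mean_bound(variance, n, t):
    """Chebyshev lower bound for P(|xbar - mu| <= t), clipped at 0."""
    return max(0.0, 1.0 - variance / (n * t**2))


variance, t = 15**2, 2                # sigma = 15, tolerance t = 2
sample_sizes = [100, 400, 1600, 6400]  # illustrative choice of n values
bounds = [mean_bound(variance, n, t) for n in sample_sizes]

for n, bound in zip(sample_sizes, bounds):
    print(f"n = {n:5d}: P(|xbar - 100| <= 2) >= {bound:.6f}")
```

Since the lower bound climbs toward 1 as $n$ grows and the probability itself can never exceed 1, the probability is squeezed to 1 as well.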
Exercises

1. Let $X \sim exp(20)$.
   (a) Use Chebyshev's Inequality to give a lower bound for $P(|X - \mu| \le 25)$.
   (b) Use the cdf of $X$ to give a precise value for $P(|X - \mu| \le 25)$.

2. Let $X$ be a measurement with range $2 \le X \le 10$. For random samples of size 400, give a lower bound for $P(|\bar{x} - \mu| \le 0.5)$.

3. With samples of size 1200, let $\hat{p}$ be the sample proportion for an unknown proportion $p$. Give a lower bound for $P(|\hat{p} - p| \le 0.03)$.