
Discrete Random Variables and Probability Distributions
Random Variables
• Random Variable (RV): A numeric outcome that results
from an experiment
• For each element of an experiment’s sample space, the
random variable can take on exactly one value
• Discrete Random Variable: An RV that can take on only
a finite or countably infinite set of outcomes
• Continuous Random Variable: An RV that can take on
any value along a continuum (but may be reported
“discretely”)
• Random Variables are denoted by upper case letters (Y)
• Individual outcomes for an RV are denoted by lower
case letters (y)
Probability Distributions
• Probability Distribution: A table, graph, or formula that describes the values a random variable can take on and the corresponding probability (discrete RV) or density (continuous RV)
• Discrete Probability Distribution: Assigns probabilities (masses) to the individual outcomes
• Continuous Probability Distribution: Assigns a density at individual points; probabilities of ranges are obtained by integrating the density function
• Discrete probabilities denoted by: p(y) = P(Y=y)
• Continuous densities denoted by: f(y)
• Cumulative Distribution Function: F(y) = P(Y≤y)
Discrete Probability Distributions
Probability (Mass) Function:

$$p(y) = P(Y = y) \qquad p(y) \ge 0 \;\;\forall y \qquad \sum_{\text{all } y} p(y) = 1$$

Cumulative Distribution Function (CDF):

$$F(y) = P(Y \le y) \qquad F(b) = P(Y \le b) = \sum_{y=-\infty}^{b} p(y)$$

$$F(-\infty) = 0 \qquad F(\infty) = 1 \qquad F(y) \text{ is monotonically increasing in } y$$
Example – Rolling 2 Dice (Red/Green)
Y = sum of the up faces of the two dice. The table gives the value of y for each element of S.

Red\Green 1 2 3 4 5 6

1 2 3 4 5 6 7
2 3 4 5 6 7 8
3 4 5 6 7 8 9
4 5 6 7 8 9 10
5 6 7 8 9 10 11
6 7 8 9 10 11 12
Rolling 2 Dice – Probability Mass Function & CDF

y      p(y)    F(y)
2      1/36    1/36
3      2/36    3/36
4      3/36    6/36
5      4/36    10/36
6      5/36    15/36
7      6/36    21/36
8      5/36    26/36
9      4/36    30/36
10     3/36    33/36
11     2/36    35/36
12     1/36    36/36

$$p(y) = \frac{\#\text{ of ways the 2 dice can sum to } y}{\#\text{ of ways the 2 dice can result } (= 36)} \qquad F(y) = \sum_{t=2}^{y} p(t)$$
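As a quick companion to the table, the PMF and CDF can be rebuilt by brute-force enumeration of the 36 equally likely outcomes. A minimal sketch in Python (standard library only; the names are illustrative):

```python
from fractions import Fraction
from itertools import product

# Enumerate the 36 equally likely (red, green) outcomes and tally each sum.
counts = {}
for red, green in product(range(1, 7), repeat=2):
    y = red + green
    counts[y] = counts.get(y, 0) + 1

pmf = {y: Fraction(c, 36) for y, c in counts.items()}

# CDF as the running sum of the PMF: F(y) = sum of p(t) for t <= y.
cdf, total = {}, Fraction(0)
for y in sorted(pmf):
    total += pmf[y]
    cdf[y] = total

print(pmf[7], cdf[7])   # 1/6 (= 6/36) and 7/12 (= 21/36)
```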
Rolling 2 Dice – Probability Mass Function
[Figure: bar chart of the dice-rolling probability function p(y), y = 2, …, 12; the bars rise from 1/36 at y = 2 to a peak of 6/36 ≈ 0.167 at y = 7, then fall symmetrically to 1/36 at y = 12.]
Rolling 2 Dice – Cumulative Distribution Function
[Figure: step plot of the dice-rolling CDF F(y); it climbs from 0 below y = 2 to 1 at y = 12.]
Expected Values of Discrete RV’s

• Mean (a.k.a. Expected Value) – The long-run average value an RV (or a function of an RV) will take on
• Variance – The average squared deviation between a realization of an RV (or a function of an RV) and its mean
• Standard Deviation – The positive square root of the variance (in the same units as the data)
• Notation:
  – Mean: E(Y) = μ
  – Variance: V(Y) = σ²
  – Standard Deviation: σ
Expected Values of Discrete RV’s
$$\text{Mean: } E(Y) = \mu = \sum_{\text{all } y} y\,p(y)$$

$$\text{Mean of a function } g(Y): \; E[g(Y)] = \sum_{\text{all } y} g(y)\,p(y)$$

$$\text{Variance: } V(Y) = \sigma^2 = E\left[(Y - E(Y))^2\right] = E\left[(Y - \mu)^2\right] = \sum_{\text{all } y}(y-\mu)^2 p(y) = \sum_{\text{all } y}\left(y^2 - 2y\mu + \mu^2\right)p(y)$$

$$= \sum_{\text{all } y} y^2 p(y) - 2\mu\sum_{\text{all } y} y\,p(y) + \mu^2\sum_{\text{all } y} p(y) = E\left[Y^2\right] - 2\mu(\mu) + \mu^2(1) = E\left[Y^2\right] - \mu^2$$

$$\text{Standard Deviation: } \sigma = +\sqrt{\sigma^2}$$
Expected Values of Linear Functions of Discrete RV’s
Linear functions: g(Y) = aY + b (a, b constants)

$$E[aY + b] = \sum_{\text{all } y}(ay + b)p(y) = a\sum_{\text{all } y} y\,p(y) + b\sum_{\text{all } y} p(y) = a\mu + b$$

$$V[aY + b] = \sum_{\text{all } y}\left[(ay + b) - (a\mu + b)\right]^2 p(y) = \sum_{\text{all } y}(ay - a\mu)^2 p(y) = a^2\sum_{\text{all } y}(y - \mu)^2 p(y) = a^2\sigma^2$$

$$\sigma_{aY+b} = |a|\,\sigma$$
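A tiny numeric illustration of these two identities (pure Python; the three-point PMF below is a made-up example, not from the slides):

```python
# Hypothetical three-point PMF, used only to illustrate the identities.
pmf = {0: 0.2, 1: 0.5, 2: 0.3}
a, b = 2, 1                      # W = a*Y + b

mu = sum(y * p for y, p in pmf.items())
var = sum(y**2 * p for y, p in pmf.items()) - mu**2

EW = sum((a * y + b) * p for y, p in pmf.items())
VW = sum((a * y + b) ** 2 * p for y, p in pmf.items()) - EW**2
print(EW, a * mu + b)     # both ~ 3.2  -> E[aY+b] = a*mu + b
print(VW, a**2 * var)     # both ~ 1.96 -> V[aY+b] = a^2 * sigma^2
```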
Example – Rolling 2 Dice

y      p(y)       y·p(y)     y²·p(y)
2      1/36       2/36       4/36
3      2/36       6/36       18/36
4      3/36       12/36      48/36
5      4/36       20/36      100/36
6      5/36       30/36      180/36
7      6/36       42/36      294/36
8      5/36       40/36      320/36
9      4/36       36/36      324/36
10     3/36       30/36      300/36
11     2/36       22/36      242/36
12     1/36       12/36      144/36
Sum    36/36=1    252/36=7   1974/36=54.8333

$$\mu = E(Y) = \sum_{y=2}^{12} y\,p(y) = 7.0$$

$$\sigma^2 = E\left[Y^2\right] - \mu^2 = \sum_{y=2}^{12} y^2 p(y) - \mu^2 = 54.8333 - (7.0)^2 = 5.8333$$

$$\sigma = \sqrt{5.8333} = 2.4152$$
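The tabulated moments are easy to verify numerically; a minimal check in pure Python:

```python
from math import sqrt

# PMF of the sum of two fair dice: counts out of 36.
counts = {2: 1, 3: 2, 4: 3, 5: 4, 6: 5, 7: 6,
          8: 5, 9: 4, 10: 3, 11: 2, 12: 1}
pmf = {y: c / 36 for y, c in counts.items()}

mu = sum(y * p for y, p in pmf.items())          # E(Y)
ey2 = sum(y**2 * p for y, p in pmf.items())      # E(Y^2)
var = ey2 - mu**2                                # V(Y) = E(Y^2) - mu^2
print(mu, ey2, var, sqrt(var))   # 7.0, 54.8333..., 5.8333..., 2.4152...
```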
Tchebysheff’s Theorem/Empirical Rule
• Tchebysheff: Suppose Y is any random variable with mean μ and standard deviation σ. Then:
  P(μ − kσ ≤ Y ≤ μ + kσ) ≥ 1 − (1/k²) for k ≥ 1
  – k = 1: P(μ − 1σ ≤ Y ≤ μ + 1σ) ≥ 1 − (1/1²) = 0 (trivial result)
  – k = 2: P(μ − 2σ ≤ Y ≤ μ + 2σ) ≥ 1 − (1/2²) = 3/4
  – k = 3: P(μ − 3σ ≤ Y ≤ μ + 3σ) ≥ 1 − (1/3²) = 8/9
• Note that this is a very conservative bound, but it works for any distribution (see the numeric check below)
• Empirical Rule (mound-shaped distributions):
  – k = 1: P(μ − 1σ ≤ Y ≤ μ + 1σ) ≈ 0.68
  – k = 2: P(μ − 2σ ≤ Y ≤ μ + 2σ) ≈ 0.95
  – k = 3: P(μ − 3σ ≤ Y ≤ μ + 3σ) ≈ 1
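For the two-dice sum, the k = 2 interval can be checked directly against Tchebysheff's guarantee; a sketch in Python reusing the PMF from the previous example:

```python
# Tchebysheff check at k = 2 for the two-dice sum (mu = 7, sigma ~ 2.4152).
counts = {2: 1, 3: 2, 4: 3, 5: 4, 6: 5, 7: 6,
          8: 5, 9: 4, 10: 3, 11: 2, 12: 1}
pmf = {y: c / 36 for y, c in counts.items()}

mu, sigma, k = 7.0, 2.4152, 2
lo, hi = mu - k * sigma, mu + k * sigma
prob = sum(p for y, p in pmf.items() if lo <= y <= hi)
print(prob)   # 0.9444... >= 3/4, comfortably above the bound
```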
Proof of Tchebysheff’s Theorem
Breaking the real line into 3 parts:

$$\text{i) } (-\infty,\, \mu - k\sigma) \qquad \text{ii) } [\mu - k\sigma,\, \mu + k\sigma] \qquad \text{iii) } (\mu + k\sigma,\, \infty)$$

Making use of the definition of variance:

$$V(Y) = \sigma^2 = \sum_{\text{all } y}(y-\mu)^2 p(y) = \sum_{y < \mu - k\sigma}(y-\mu)^2 p(y) + \sum_{\mu - k\sigma \le y \le \mu + k\sigma}(y-\mu)^2 p(y) + \sum_{y > \mu + k\sigma}(y-\mu)^2 p(y)$$

In region i): $y < \mu - k\sigma \Rightarrow (y-\mu)^2 > k^2\sigma^2$
In region iii): $y > \mu + k\sigma \Rightarrow (y-\mu)^2 > k^2\sigma^2$

Dropping the (nonnegative) middle sum and bounding the outer sums:

$$\sigma^2 \ge k^2\sigma^2 P(Y < \mu - k\sigma) + k^2\sigma^2 P(Y > \mu + k\sigma) = k^2\sigma^2\left[1 - P(\mu - k\sigma \le Y \le \mu + k\sigma)\right]$$

$$\Rightarrow \frac{\sigma^2}{k^2\sigma^2} = \frac{1}{k^2} \ge 1 - P(\mu - k\sigma \le Y \le \mu + k\sigma) \;\Rightarrow\; P(\mu - k\sigma \le Y \le \mu + k\sigma) \ge 1 - \frac{1}{k^2}$$
Moment Generating Functions (I)
Consider the series expansion of $e^x$:

$$e^x = \sum_{i=0}^{\infty}\frac{x^i}{i!} = 1 + x + \frac{x^2}{2} + \frac{x^3}{6} + \cdots$$

Note that by taking derivatives with respect to x, we get:

$$\frac{d e^x}{dx} = 0 + 1 + \frac{2x}{2!} + \frac{3x^2}{3!} + \cdots = 1 + x + \frac{x^2}{2!} + \cdots = e^x$$

$$\frac{d^2 e^x}{dx^2} = 0 + 1 + \frac{2x}{2!} + \cdots = e^x$$

Now, replacing x with tY, we get:

$$e^{tY} = \sum_{i=0}^{\infty}\frac{(tY)^i}{i!} = 1 + tY + \frac{(tY)^2}{2} + \frac{(tY)^3}{6} + \cdots = 1 + tY + \frac{t^2Y^2}{2} + \frac{t^3Y^3}{6} + \cdots$$
Moment Generating Functions (II)
Taking derivatives with respect to t and evaluating at t = 0:

$$\left.\frac{d e^{tY}}{dt}\right|_{t=0} = \left.\left(0 + Y + \frac{2tY^2}{2!} + \frac{3t^2Y^3}{3!} + \cdots\right)\right|_{t=0} = \left.\left(Y + tY^2 + \frac{t^2Y^3}{2!} + \cdots\right)\right|_{t=0} = Y + 0 + 0 + \cdots = Y$$

$$\left.\frac{d^2 e^{tY}}{dt^2}\right|_{t=0} = \left.\left(0 + Y^2 + tY^3 + \cdots\right)\right|_{t=0} = Y^2 + 0 + \cdots = Y^2$$

Taking the expected value of $e^{tY}$, and labelling the function M(t):

$$M(t) = E\left[e^{tY}\right] = \sum_{\text{all } y} e^{ty} p(y) = \sum_{\text{all } y}\left(\sum_{i=0}^{\infty}\frac{(ty)^i}{i!}\right)p(y)$$

$$\Rightarrow M'(t)\big|_{t=0} = E(Y) \qquad M''(t)\big|_{t=0} = E\left[Y^2\right] \qquad \ldots \qquad M^{(k)}(t)\big|_{t=0} = E\left[Y^k\right]$$

M(t) is called the moment-generating function for Y, and can be used to derive any non-central moments of the random variable (assuming it exists in a neighborhood around t = 0). It is also useful in determining the distributions of functions of random variables.
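As a symbolic sanity check of the recipe M′(0) = E(Y), M″(0) = E(Y²), here is a sketch assuming SymPy is available, using the Bernoulli(p) MGF M(t) = (1−p) + p·eᵗ:

```python
import sympy as sp

t, p = sp.symbols('t p')
M = (1 - p) + p * sp.exp(t)          # MGF of a Bernoulli(p) variable

EY = sp.diff(M, t).subs(t, 0)        # M'(0)  = E(Y)
EY2 = sp.diff(M, t, 2).subs(t, 0)    # M''(0) = E(Y^2)
# E(Y) = p, E(Y^2) = p, and the variance is p - p^2 = p(1-p).
print(EY, EY2, sp.simplify(EY2 - EY**2))
```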
Probability Generating Functions

Consider the function $t^Y$ and its derivatives:

$$\frac{d\,t^Y}{dt} = Y t^{Y-1} \qquad \frac{d^2 t^Y}{dt^2} = Y(Y-1)t^{Y-2} \qquad \frac{d^k t^Y}{dt^k} = Y(Y-1)\cdots(Y-(k-1))\,t^{Y-k}$$

Let $P(t) = E\left[t^Y\right]$; P(t) is called the probability generating function for Y:

$$P'(t)\big|_{t=1} = E(Y) \qquad P''(t)\big|_{t=1} = E\left[Y(Y-1)\right] \qquad P^{(k)}(t)\big|_{t=1} = E\left[Y(Y-1)\cdots(Y-(k-1))\right]$$
Discrete Uniform Distribution
• Suppose Y can take on any integer value between a and b inclusive, each equally likely (e.g., rolling a die, where a = 1 and b = 6). Then Y follows the discrete uniform distribution.
$$f(y) = \frac{1}{b - (a-1)} \qquad a \le y \le b$$

$$F(y) = \begin{cases} 0 & y < a \\ \dfrac{\mathrm{int}(y) - (a-1)}{b - (a-1)} & a \le y \le b \\ 1 & y > b \end{cases} \qquad \mathrm{int}(x) = \text{integer portion of } x$$

$$E(Y) = \sum_{y=a}^{b} y\left[\frac{1}{b-(a-1)}\right] = \frac{1}{b-(a-1)}\left[\sum_{y=1}^{b} y - \sum_{y=1}^{a-1} y\right] = \frac{1}{b-(a-1)}\left[\frac{b(b+1)}{2} - \frac{(a-1)a}{2}\right] = \frac{b(b+1) - a(a-1)}{2(b-(a-1))}$$

$$E\left[Y^2\right] = \sum_{y=a}^{b} y^2\left[\frac{1}{b-(a-1)}\right] = \frac{1}{b-(a-1)}\left[\sum_{y=1}^{b} y^2 - \sum_{y=1}^{a-1} y^2\right] = \frac{1}{b-(a-1)}\left[\frac{b(b+1)(2b+1)}{6} - \frac{(a-1)a(2a-1)}{6}\right] = \frac{b(b+1)(2b+1) - a(a-1)(2a-1)}{6(b-(a-1))}$$

$$\Rightarrow V(Y) = E\left[Y^2\right] - \left[E(Y)\right]^2 = \frac{b(b+1)(2b+1) - a(a-1)(2a-1)}{6(b-(a-1))} - \left[\frac{b(b+1) - a(a-1)}{2(b-(a-1))}\right]^2$$

Note: When a = 1 and b = n:

$$E(Y) = \frac{n+1}{2} \qquad V(Y) = \frac{(n-1)(n+1)}{12} \qquad \sigma = \sqrt{\frac{(n-1)(n+1)}{12}}$$
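A quick numeric confirmation for a fair die (a = 1, b = 6, so n = 6), in pure Python:

```python
# Discrete uniform on {1, ..., 6}: compare brute-force moments to the
# closed forms E(Y) = (n+1)/2 and V(Y) = (n-1)(n+1)/12.
a, b = 1, 6
n = b - (a - 1)
vals = range(a, b + 1)

mu = sum(vals) / n
var = sum(y**2 for y in vals) / n - mu**2
print(mu, var)                              # 3.5, 2.9166...
print((n + 1) / 2, (n - 1) * (n + 1) / 12)  # same: 3.5, 2.9166...
```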
Bernoulli Distribution
• An experiment consists of one trial. It can result in one of
2 outcomes: Success or Failure (or a characteristic being
Present or Absent).
• Probability of Success is p (0<p<1)
• Y = 1 if Success (Characteristic Present), 0 if not

$$p(y) = \begin{cases} p & y = 1 \\ 1-p & y = 0 \end{cases}$$

$$E(Y) = \sum_{y=0}^{1} y\,p(y) = 0(1-p) + 1(p) = p$$

$$E\left[Y^2\right] = 0^2(1-p) + 1^2(p) = p$$

$$\Rightarrow V(Y) = E\left[Y^2\right] - \left[E(Y)\right]^2 = p - p^2 = p(1-p) \qquad \sigma = \sqrt{p(1-p)}$$
Binomial Experiment
• Experiment consists of a series of n identical trials
• Each trial can end in one of 2 outcomes: Success or
Failure
• Trials are independent (outcome of one has no
bearing on outcomes of others)
• Probability of Success, p, is constant for all trials
• The random variable Y, the number of Successes in the n trials, is said to follow the Binomial Distribution with parameters n and p
• Y can take on the values y=0,1,…,n
• Notation: Y~Bin(n,p)
Binomial Distribution
Consider the outcomes of an experiment with 3 trials:

$$SSS \Rightarrow y = 3 \qquad P(SSS) = P(Y=3) = p(3) = p^3$$
$$SSF, SFS, FSS \Rightarrow y = 2 \qquad P(SSF \cup SFS \cup FSS) = P(Y=2) = p(2) = 3p^2(1-p)$$
$$SFF, FSF, FFS \Rightarrow y = 1 \qquad P(SFF \cup FSF \cup FFS) = P(Y=1) = p(1) = 3p(1-p)^2$$
$$FFF \Rightarrow y = 0 \qquad P(FFF) = P(Y=0) = p(0) = (1-p)^3$$

In general:

1) # of ways of arranging y S's (and (n−y) F's) in a sequence of n positions: $\binom{n}{y} = \frac{n!}{y!(n-y)!}$

2) Probability of each arrangement of y S's (and (n−y) F's): $p^y(1-p)^{n-y}$

3) $\Rightarrow P(Y=y) = p(y) = \binom{n}{y} p^y (1-p)^{n-y} \qquad y = 0, 1, \ldots, n$

EXCEL functions:
p(y) is obtained by the function: =BINOM.DIST(y, n, p, 0)
F(y) is obtained by the function: =BINOM.DIST(y, n, p, 1)

Binomial expansion: $(a+b)^n = \sum_{i=0}^{n}\binom{n}{i} a^i b^{n-i}$

$$\Rightarrow \sum_{y=0}^{n} p(y) = \sum_{y=0}^{n}\binom{n}{y}p^y(1-p)^{n-y} = \left[p + (1-p)\right]^n = 1 \;\Rightarrow\; \text{“Legitimate” probability distribution}$$
Binomial Distribution (n=10,p=0.10)

[Figure: bar chart of the Bin(10, 0.10) PMF; the mass is concentrated at small y, peaking near y = 1 and negligible beyond y = 4.]
Binomial Distribution (n=10, p=0.50)

[Figure: bar chart of the Bin(10, 0.50) PMF; symmetric about y = 5, peaking at p(5) ≈ 0.246.]
Binomial Distribution(n=10,p=0.8)

[Figure: bar chart of the Bin(10, 0.80) PMF; skewed left, peaking at y = 8.]
Binomial Distribution – Expected Value
$$f(y) = \frac{n!}{y!(n-y)!}p^y q^{n-y} \qquad y = 0, 1, \ldots, n \qquad q = 1-p$$

$$E(Y) = \sum_{y=0}^{n} y\,\frac{n!}{y!(n-y)!}p^y q^{n-y} = \sum_{y=1}^{n} y\,\frac{n!}{y!(n-y)!}p^y q^{n-y} \qquad (\text{summand} = 0 \text{ when } y = 0)$$

$$\Rightarrow E(Y) = \sum_{y=1}^{n}\frac{y\,n!}{y(y-1)!(n-y)!}p^y q^{n-y} = \sum_{y=1}^{n}\frac{n!}{(y-1)!(n-y)!}p^y q^{n-y}$$

Let $y^* = y - 1 \Rightarrow y = y^* + 1$. Note: $y = 1, \ldots, n \Rightarrow y^* = 0, \ldots, n-1$.

$$\Rightarrow E(Y) = \sum_{y^*=0}^{n-1}\frac{n(n-1)!}{y^*!\left(n-(y^*+1)\right)!}p^{y^*+1}q^{n-(y^*+1)} = np\sum_{y^*=0}^{n-1}\frac{(n-1)!}{y^*!\left((n-1)-y^*\right)!}p^{y^*}q^{(n-1)-y^*}$$

$$= np(p+q)^{n-1} = np\left[p + (1-p)\right]^{n-1} = np(1) = np$$
Binomial Distribution – Variance and S.D.
$$f(y) = \frac{n!}{y!(n-y)!}p^y q^{n-y} \qquad y = 0, 1, \ldots, n \qquad q = 1-p$$

Note: $E\left[Y^2\right]$ is difficult (impossible?) to get directly, but $E\left[Y(Y-1)\right] = E\left[Y^2\right] - E(Y)$ is not:

$$E\left[Y(Y-1)\right] = \sum_{y=0}^{n} y(y-1)\frac{n!}{y!(n-y)!}p^y q^{n-y} = \sum_{y=2}^{n} y(y-1)\frac{n!}{y!(n-y)!}p^y q^{n-y} \qquad (\text{summand} = 0 \text{ when } y = 0, 1)$$

$$\Rightarrow E\left[Y(Y-1)\right] = \sum_{y=2}^{n}\frac{n!}{(y-2)!(n-y)!}p^y q^{n-y}$$

Let $y^{**} = y - 2 \Rightarrow y = y^{**} + 2$. Note: $y = 2, \ldots, n \Rightarrow y^{**} = 0, \ldots, n-2$.

$$\Rightarrow E\left[Y(Y-1)\right] = \sum_{y^{**}=0}^{n-2}\frac{n(n-1)(n-2)!}{y^{**}!\left(n-(y^{**}+2)\right)!}p^{y^{**}+2}q^{n-(y^{**}+2)} = n(n-1)p^2\sum_{y^{**}=0}^{n-2}\frac{(n-2)!}{y^{**}!\left((n-2)-y^{**}\right)!}p^{y^{**}}q^{(n-2)-y^{**}}$$

$$= n(n-1)p^2(p+q)^{n-2} = n(n-1)p^2\left[p+(1-p)\right]^{n-2} = n(n-1)p^2$$

$$\Rightarrow E\left[Y^2\right] = E\left[Y(Y-1)\right] + E(Y) = n(n-1)p^2 + np = np\left[(n-1)p + 1\right] = n^2p^2 - np^2 + np = n^2p^2 + np(1-p)$$

$$\Rightarrow V(Y) = E\left[Y^2\right] - \left[E(Y)\right]^2 = n^2p^2 + np(1-p) - (np)^2 = np(1-p) \qquad \Rightarrow \sigma = \sqrt{np(1-p)}$$
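The closed forms E(Y) = np and V(Y) = np(1−p) are easy to spot-check (a sketch, assuming scipy):

```python
from scipy.stats import binom

n, p = 10, 0.3
Y = binom(n, p)
print(Y.mean(), n * p)             # 3.0, 3.0
print(Y.var(), n * p * (1 - p))    # 2.1, 2.1
```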
Binomial Distribution – MGF & PGF

$$M(t) = E\left[e^{tY}\right] = \sum_{y=0}^{n} e^{ty}\binom{n}{y}p^y(1-p)^{n-y} = \sum_{y=0}^{n}\binom{n}{y}\left(pe^t\right)^y(1-p)^{n-y} = \left[pe^t + (1-p)\right]^n$$

$$M'(t) = n\left[pe^t + (1-p)\right]^{n-1}pe^t = np\left[pe^t + (1-p)\right]^{n-1}e^t$$

$$M''(t) = np\left[(n-1)\left(pe^t + (1-p)\right)^{n-2}pe^t\,e^t + \left(pe^t + (1-p)\right)^{n-1}e^t\right]$$

$$\Rightarrow E(Y) = M'(0) = np\left[p(1) + (1-p)\right]^{n-1}(1) = np$$

$$\Rightarrow E\left[Y^2\right] = M''(0) = np\left[(n-1)\left(p(1)+(1-p)\right)^{n-2}p(1)(1) + \left(p(1)+(1-p)\right)^{n-1}(1)\right] = np\left[(n-1)p + 1\right] = n^2p^2 + np(1-p)$$

$$\Rightarrow V(Y) = E\left[Y^2\right] - \left[E(Y)\right]^2 = n^2p^2 + np(1-p) - (np)^2 = np(1-p) \qquad \sigma = \sqrt{np(1-p)}$$

$$P(t) = E\left[t^Y\right] = \sum_{y=0}^{n} t^y\binom{n}{y}p^y(1-p)^{n-y} = \sum_{y=0}^{n}\binom{n}{y}(pt)^y(1-p)^{n-y} = \left[pt + (1-p)\right]^n$$
Geometric Distribution
• Used to model the number of Bernoulli trials needed until the first Success occurs (P(S) = p)
  – First Success on Trial 1 ⇒ S, y = 1 ⇒ p(1) = p
  – First Success on Trial 2 ⇒ FS, y = 2 ⇒ p(2) = (1−p)p
  – First Success on Trial k ⇒ F…FS, y = k ⇒ p(k) = (1−p)^{k−1} p

$$p(y) = (1-p)^{y-1}p \qquad y = 1, 2, \ldots$$

$$\sum_{y=1}^{\infty}p(y) = \sum_{y=1}^{\infty}(1-p)^{y-1}p = p\sum_{y=1}^{\infty}(1-p)^{y-1}$$

Setting $y^* = y - 1$ and noting that $y = 1, 2, \ldots \Rightarrow y^* = 0, 1, \ldots$:

$$\Rightarrow \sum_{y=1}^{\infty}p(y) = p\sum_{y^*=0}^{\infty}(1-p)^{y^*} = p\cdot\frac{1}{1-(1-p)} = \frac{p}{p} = 1$$
Geometric Distribution - Expectations
 
$$E(Y) = \sum_{y=1}^{\infty} y\,q^{y-1}p = p\sum_{y=1}^{\infty}\frac{dq^y}{dq} = p\,\frac{d}{dq}\left[\sum_{y=1}^{\infty}q^y\right] = p\,\frac{d}{dq}\left[\frac{q}{1-q}\right] = p\left[\frac{(1-q)(1) - q(-1)}{(1-q)^2}\right] = p\cdot\frac{1}{(1-q)^2} = \frac{p}{p^2} = \frac{1}{p}$$

$$E\left[Y(Y-1)\right] = \sum_{y=1}^{\infty}y(y-1)q^{y-1}p = pq\sum_{y=1}^{\infty}\frac{d^2q^y}{dq^2} = pq\,\frac{d^2}{dq^2}\left[\sum_{y=1}^{\infty}q^y\right] = pq\,\frac{d^2}{dq^2}\left[\frac{q}{1-q}\right]$$

$$= pq\,\frac{d}{dq}\left[\frac{1}{(1-q)^2}\right] = pq\left[-2(1-q)^{-3}(-1)\right] = \frac{2pq}{(1-q)^3} = \frac{2pq}{p^3} = \frac{2q}{p^2}$$

$$\Rightarrow E\left[Y^2\right] = E\left[Y(Y-1)\right] + E(Y) = \frac{2q}{p^2} + \frac{1}{p} = \frac{2(1-p)+p}{p^2} = \frac{2-p}{p^2}$$

$$\Rightarrow V(Y) = E\left[Y^2\right] - \left[E(Y)\right]^2 = \frac{2-p}{p^2} - \left(\frac{1}{p}\right)^2 = \frac{2-p-1}{p^2} = \frac{1-p}{p^2} = \frac{q}{p^2}$$

$$\sigma = \sqrt{\frac{q}{p^2}}$$
Geometric Distribution – MGF & PGF

$$M(t) = E\left[e^{tY}\right] = \sum_{y=1}^{\infty}e^{ty}q^{y-1}p = \frac{p}{q}\sum_{y=1}^{\infty}e^{ty}q^y = \frac{p}{q}\sum_{y=1}^{\infty}\left(qe^t\right)^y = \frac{p}{q}\cdot\frac{qe^t}{1-qe^t} = \frac{pe^t}{1-(1-p)e^t} \qquad \left(\text{for } qe^t < 1\right)$$

$$P(t) = E\left[t^Y\right] = \sum_{y=1}^{\infty}t^y q^{y-1}p = \frac{p}{q}\sum_{y=1}^{\infty}(tq)^y = \frac{p}{q}\cdot\frac{tq}{1-tq} = \frac{pt}{1-(1-p)t}$$
Negative Binomial Distribution
• Used to model the number of trials needed until the rth
Success (extension of Geometric distribution)
• Based on there being r−1 Successes in the first y−1 trials, followed by a Success on trial y

$$p(y) = \binom{y-1}{r-1}p^r(1-p)^{y-r} \qquad y = r, r+1, \ldots$$

$$E(Y) = \frac{r}{p} \quad \text{(proof given in Chapter 5)} \qquad V(Y) = \frac{r(1-p)}{p^2} \quad \text{(proof given in Chapter 5)}$$
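One caveat when checking this numerically: scipy.stats.nbinom counts the number of failures before the rth Success rather than the total number of trials, so the trial count is Y = failures + r. A sketch assuming scipy:

```python
from scipy.stats import nbinom

r, p, y = 3, 0.4, 7
# scipy's nbinom models F = # failures before the r-th Success, so Y = F + r.
print(nbinom.pmf(y - r, r, p))   # P(Y = 7) = C(6,2) * 0.4^3 * 0.6^4 = 0.124416
print(nbinom.mean(r, p) + r)     # E(Y) = r(1-p)/p + r = r/p = 7.5
```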
Poisson Distribution
• Distribution often used to model the number of
incidences of some characteristic in time or space:
– Arrivals of customers in a queue
– Numbers of flaws in a roll of fabric
– Number of typos per page of text.
• Distribution obtained as follows:
– Break down the “area” into many small “pieces” (n pieces)
– Each “piece” can have only 0 or 1 occurrences (p = P(1))
– Let λ = np ≡ average number of occurrences over the “area”
– Y ≡ # of occurrences in the “area” is the sum of the 0s & 1s over the “pieces”
– Y ~ Bin(n, p) with p = λ/n
– Take the limit of the Binomial distribution as n → ∞ with p = λ/n
Poisson Distribution - Derivation

$$p(y) = \frac{n!}{y!(n-y)!}p^y(1-p)^{n-y} = \frac{n!}{y!(n-y)!}\left(\frac{\lambda}{n}\right)^y\left(1-\frac{\lambda}{n}\right)^{n-y}$$

Taking the limit as $n \to \infty$:

$$\lim_{n\to\infty}p(y) = \lim_{n\to\infty}\frac{n!}{y!(n-y)!}\left(\frac{\lambda}{n}\right)^y\left(1-\frac{\lambda}{n}\right)^{n-y} = \frac{\lambda^y}{y!}\lim_{n\to\infty}\frac{n(n-1)\cdots(n-y+1)(n-y)!}{n^y(n-y)!}\left(1-\frac{\lambda}{n}\right)^n\left(\frac{n}{n-\lambda}\right)^y$$

$$= \frac{\lambda^y}{y!}\lim_{n\to\infty}\left(\frac{n}{n-\lambda}\right)\left(\frac{n-1}{n-\lambda}\right)\cdots\left(\frac{n-y+1}{n-\lambda}\right)\left(1-\frac{\lambda}{n}\right)^n$$

Note: $\lim_{n\to\infty}\left(\frac{n}{n-\lambda}\right)\cdots\left(\frac{n-y+1}{n-\lambda}\right) = 1$ for all fixed y

$$\Rightarrow \lim_{n\to\infty}p(y) = \frac{\lambda^y}{y!}\lim_{n\to\infty}\left(1-\frac{\lambda}{n}\right)^n$$

From calculus, we get: $\lim_{n\to\infty}\left(1+\frac{a}{n}\right)^n = e^a$

$$\Rightarrow \lim_{n\to\infty}p(y) = \frac{\lambda^y}{y!}e^{-\lambda} = \frac{e^{-\lambda}\lambda^y}{y!} \qquad y = 0, 1, 2, \ldots$$

Series expansion of the exponential function: $e^x = \sum_{i=0}^{\infty}\frac{x^i}{i!}$

$$\Rightarrow \sum_{y=0}^{\infty}p(y) = \sum_{y=0}^{\infty}\frac{e^{-\lambda}\lambda^y}{y!} = e^{-\lambda}\sum_{y=0}^{\infty}\frac{\lambda^y}{y!} = e^{-\lambda}e^{\lambda} = 1 \;\Rightarrow\; \text{“Legitimate” probability distribution}$$

EXCEL functions:
p(y): =POISSON(y, λ, 0)
F(y): =POISSON(y, λ, 1)
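As with the binomial slide, the Excel calls have direct SciPy equivalents (a sketch, assuming scipy):

```python
from scipy.stats import poisson

lam, y = 2.5, 3
print(poisson.pmf(y, lam))   # p(3), like =POISSON(3, 2.5, 0) -> 0.2138...
print(poisson.cdf(y, lam))   # F(3), like =POISSON(3, 2.5, 1) -> 0.7576...
```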
Poisson Distribution - Expectations

$$f(y) = \frac{e^{-\lambda}\lambda^y}{y!} \qquad y = 0, 1, 2, \ldots$$

$$E(Y) = \sum_{y=0}^{\infty} y\,\frac{e^{-\lambda}\lambda^y}{y!} = \sum_{y=1}^{\infty}\frac{e^{-\lambda}\lambda^y}{(y-1)!} = \lambda e^{-\lambda}\sum_{y=1}^{\infty}\frac{\lambda^{y-1}}{(y-1)!} = \lambda e^{-\lambda}e^{\lambda} = \lambda$$

$$E\left[Y(Y-1)\right] = \sum_{y=0}^{\infty}y(y-1)\frac{e^{-\lambda}\lambda^y}{y!} = \sum_{y=2}^{\infty}\frac{e^{-\lambda}\lambda^y}{(y-2)!} = \lambda^2 e^{-\lambda}\sum_{y=2}^{\infty}\frac{\lambda^{y-2}}{(y-2)!} = \lambda^2 e^{-\lambda}e^{\lambda} = \lambda^2$$

$$\Rightarrow E\left[Y^2\right] = E\left[Y(Y-1)\right] + E(Y) = \lambda^2 + \lambda$$

$$\Rightarrow V(Y) = E\left[Y^2\right] - \left[E(Y)\right]^2 = \lambda^2 + \lambda - \lambda^2 = \lambda \qquad \sigma = \sqrt{\lambda}$$
Poisson Distribution – MGF & PGF

$$M(t) = E\left[e^{tY}\right] = \sum_{y=0}^{\infty}e^{ty}\frac{e^{-\lambda}\lambda^y}{y!} = e^{-\lambda}\sum_{y=0}^{\infty}\frac{\left(\lambda e^t\right)^y}{y!} = e^{-\lambda}e^{\lambda e^t} = e^{\lambda\left(e^t - 1\right)}$$

$$P(t) = E\left[t^Y\right] = \sum_{y=0}^{\infty}t^y\frac{e^{-\lambda}\lambda^y}{y!} = e^{-\lambda}\sum_{y=0}^{\infty}\frac{(\lambda t)^y}{y!} = e^{-\lambda}e^{\lambda t} = e^{\lambda(t-1)}$$
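Differentiating this MGF recovers the Poisson moments symbolically (a sketch assuming SymPy):

```python
import sympy as sp

t, lam = sp.symbols('t lam', positive=True)
M = sp.exp(lam * (sp.exp(t) - 1))     # Poisson MGF derived above

EY = sp.diff(M, t).subs(t, 0)         # lam
EY2 = sp.diff(M, t, 2).subs(t, 0)     # lam**2 + lam
print(sp.simplify(EY), sp.simplify(EY2 - EY**2))   # lam, lam
```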
Hypergeometric Distribution
• Finite-population generalization of the Binomial Distribution
• Population:
  – N elements
  – k Successes (elements with the characteristic of interest)
• Sample:
  – n elements
  – Y = # of Successes in the sample (y = 0, 1, …, min(n, k))

$$p(y) = \frac{\dbinom{k}{y}\dbinom{N-k}{n-y}}{\dbinom{N}{n}} \qquad y = 0, 1, \ldots, \min(n, k)$$

$$E(Y) = n\,\frac{k}{N} \quad \text{(proof in Chapter 5)} \qquad V(Y) = n\left(\frac{k}{N}\right)\left(\frac{N-k}{N}\right)\left(\frac{N-n}{N-1}\right) \quad \text{(proof in Chapter 5)}$$
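For numeric checks, note that scipy.stats.hypergeom takes its arguments in the order (population size, # successes in population, sample size), i.e. (N, k, n) in this slide's notation. A sketch assuming scipy:

```python
from scipy.stats import hypergeom

N, k, n = 20, 7, 5           # population size, successes in population, sample size
rv = hypergeom(N, k, n)      # scipy argument order matches (N, k, n) here

print(rv.pmf(2))             # P(Y = 2) = C(7,2)*C(13,3)/C(20,5) ~ 0.3874
print(rv.mean(), n * k / N)  # both 1.75, matching E(Y) = n*k/N
```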
