
Revision Bayesian Inference

You are interested in the proportion $\theta$ of students in your class who smoke. A recent survey
carried out by the Ministry of Health reveals that 40% of all students at the university smoke. You
believe that smoking prevalence among actuarial students is lower than among other university students,
but you are not very sure of this. You think that the beta(2,10) distribution represents your prior
expectation and uncertainty about $\theta$ well.
You also decide to collect some data through a random sample of actuarial students, and it turns out
that 5 out of the 20 students you interviewed are active smokers.
(a)
(i) Derive the classical estimate of the proportion of actuarial students who smoke
The classical estimate is the observed sample proportion: 5 smokers in a sample of 20
students. If we denote by $Y_i$ an indicator variable such that $Y_i = 1$ if the $i$-th student
smokes and 0 otherwise, then the experiment is a series of Bernoulli trials with probability of
success $\theta$ (success = student smokes), i.e. $P(Y_i = 1) = \theta$, and the joint distribution
of $Y_1, \dots, Y_n$ is just $\theta^{\sum y_i} (1-\theta)^{n - \sum y_i}$ for a fixed order of
occurrences. Since the series of successes and failures can occur in $\binom{n}{y}$ ways, where
$y = \sum_{i=1}^{n} y_i$, the probability of observing $y$ smokers is just
$\binom{n}{y} \theta^{y} (1-\theta)^{n-y} = \binom{20}{5} \theta^{5} (1-\theta)^{15}$. The classical
method of estimation involves finding the value of $\theta$ which maximizes the likelihood of
the actual sample. This is often, but not always, achieved by differentiating the likelihood or
the log-likelihood and setting the derivative to zero, which gives:

$$\hat{\theta} = \frac{y}{n} = \frac{5}{20} = 0.25$$
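The differentiation step can be written out as follows. This is only a sketch, assuming the binomial likelihood with 5 smokers out of 20; the symbols $L$ and $\ell$ for the likelihood and log-likelihood are notation chosen here, not taken from the notes.

```latex
\begin{aligned}
L(\theta)     &= \binom{20}{5}\,\theta^{5}(1-\theta)^{15} \\
\ell(\theta)  &= \log\binom{20}{5} + 5\log\theta + 15\log(1-\theta) \\
\ell'(\theta) &= \frac{5}{\theta} - \frac{15}{1-\theta} = 0
  \;\Longrightarrow\; 5(1-\theta) = 15\,\theta
  \;\Longrightarrow\; \hat{\theta} = \frac{5}{20} = 0.25 \\
\ell''(\theta) &= -\frac{5}{\theta^{2}} - \frac{15}{(1-\theta)^{2}} < 0
  \quad\text{so } \hat{\theta} \text{ is a maximum.}
\end{aligned}
```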



(ii) Obtain the posterior estimate or the Bayesian estimate of the proportion of students that
smoke in class.

The Bayesian approach involves combining the prior information about the unknown
parameter with data. Put another way, the Bayesian approach is about updating your
prior belief about $\theta$ with data to produce what is called a posterior estimate.

Our prior belief about $\theta$ is represented by the beta(2,10) distribution and we write:
$\theta \sim$ beta(2,10). So we write
$$p(\theta) = \frac{\Gamma(12)}{\Gamma(2)\,\Gamma(10)}\, \theta^{2-1} (1-\theta)^{10-1}$$

Updating our prior belief about $\theta$ means calculating the conditional distribution of $\theta$,
given that we observed data Y. In class, we saw that $\theta \mid Y$ follows the
beta(5+2, 20-5+10) = beta(7,25) distribution.
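For reference, the update seen in class can be sketched in one line by multiplying the binomial likelihood by the beta(2,10) prior density and dropping every factor that does not depend on $\theta$. The notation $p(\theta \mid y)$ for the posterior density is an assumption made here.

```latex
\begin{aligned}
p(\theta \mid y) &\propto p(y \mid \theta)\, p(\theta) \\
                 &\propto \theta^{5}(1-\theta)^{15} \times \theta^{2-1}(1-\theta)^{10-1} \\
                 &= \theta^{7-1}(1-\theta)^{25-1}
\end{aligned}
```

The last line is the kernel of a beta(7,25) density, so the normalising constant must be $\Gamma(32) / (\Gamma(7)\,\Gamma(25))$.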

Graphically:


Someone wrote a nice set of code to obtain the posterior distribution of a proportion with
a beta prior and binomial data; it has been sent by mail. The graph will be discussed
in class tomorrow.
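A graph of this kind can be drawn with a few lines of base R. The sketch below is not the code sent by mail, only a minimal version under the assumption that the likelihood is rescaled to integrate to 1 so that it fits on the same axes as the two densities.

```r
# Grid of candidate values for theta
theta <- seq(0, 1, length.out = 501)

prior <- dbeta(theta, 2, 10)   # beta(2, 10) prior density
# Binomial likelihood theta^5 (1 - theta)^15, rescaled to integrate to 1;
# as a function of theta this is the beta(6, 16) density
likelihood <- dbeta(theta, 5 + 1, 15 + 1)
posterior <- dbeta(theta, 7, 25)   # beta(7, 25) posterior density

plot(theta, posterior, type = "l", lty = 1,
     xlab = "theta", ylab = "Density",
     main = "beta(2, 10) prior, B(20, 5) data, beta(7, 25) posterior")
lines(theta, prior, lty = 2)
lines(theta, likelihood, lty = 3)
legend("topright", legend = c("Prior", "Likelihood", "Posterior"),
       lty = c(2, 3, 1))
```

The posterior is drawn first because it has the highest peak of the three curves, so the default y-axis range covers all of them.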

(iii) Contrast your answer in part (i) and part (ii)
In the classical approach, $\theta$ is viewed as a constant and Y is a random variable which varies
from sample to sample.
In the Bayesian approach, $\theta$ is viewed as a random variable and Y is a constant. In the Bayesian
world, parameters are said to be dynamic. This does not imply that $\theta$ has no fixed value. It simply
means that as long as $\theta$ is unknown, we can only describe the possible values that $\theta$ can take
with an appropriate probability distribution, which is the posterior probability distribution. The
Bayesian approach rejects ideas such as re-sampling, or "if we were to take several samples", since the
sampling is done once. Bayesian statisticians believe that one way to limit sampling variability is to
combine the sample information with a priori information rather than speculating about unobserved
hypothetical samples.
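One way to see this combination of sample and a priori information is that the mean of the beta(7,25) posterior is a weighted average of the classical estimate and the prior mean. The sketch below uses variable names chosen here for illustration.

```r
a <- 2; b <- 10   # beta(2, 10) prior parameters
n <- 20; y <- 5   # sample size and number of smokers observed

prior_mean <- a / (a + b)             # prior expectation of theta
classical  <- y / n                   # classical (maximum likelihood) estimate
post_mean  <- (a + y) / (a + b + n)   # mean of the beta(7, 25) posterior

# Weight given to the data; the remainder goes to the prior
w <- n / (a + b + n)
weighted <- w * classical + (1 - w) * prior_mean

print(c(prior = prior_mean, classical = classical,
        posterior = post_mean, weighted = weighted))
```

The last two printed values coincide, and the posterior mean lies between the prior mean and the sample proportion.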




[Figure: "beta( 2 , 10 ) prior, B( 20 , 5 ) data, beta( 7 , 25 ) posterior". Prior, likelihood and
posterior curves; x-axis: theta (0.0 to 1.0), y-axis: Density (0 to 5).]

(b) Now consider the posterior density in (a)(ii). Suppose that you are asked to tabulate all the possible
values of $\theta$ and their associated frequencies under this distribution, just as in a frequency table
or a frequency density.
(i) Which value has the highest probability of occurrence, i.e. is most likely to occur?
This is the posterior mode. For the beta(7,25) posterior it is (7-1)/(7+25-2) = 0.2.
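The mode can also be located numerically, as a rough check. The sketch below maximizes the beta(7,25) density with `optimize` and compares the result with the closed form (a-1)/(a+b-2), which holds for a beta(a,b) density when a > 1 and b > 1.

```r
# Posterior density of theta: beta(7, 25)
post_density <- function(theta) dbeta(theta, 7, 25)

# Numerical search for the value of theta with the highest density
mode_numeric <- optimize(post_density, interval = c(0, 1),
                         maximum = TRUE)$maximum

# Closed form for the mode of a beta(a, b) density with a > 1 and b > 1
mode_exact <- (7 - 1) / (7 + 25 - 2)

print(c(numeric = mode_numeric, exact = mode_exact))
```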

(ii) If you were to list all the values of $\theta$ in ascending or descending order (where each
value is listed as many times as its frequency density indicates), which value
would be in the middle?
This is the posterior median.
The median is the value m such that $P(\theta \le m \mid y) \ge 1/2$ and $P(\theta \ge m \mid y) \ge 1/2$.
It is the value m satisfying:

$$\int_0^m p(\theta \mid y)\, d\theta = \int_m^1 p(\theta \mid y)\, d\theta = \frac{1}{2}$$

which gives m ≈ 0.2128

(Using R : qbeta(0.5, 7, 25, ncp = 0, lower.tail = TRUE, log.p = FALSE))
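The "list all the values in order and take the middle one" description can be imitated by simulation. The sketch below draws a large sample from the beta(7,25) posterior, sorts it, and compares the middle value with the exact 50% quantile; the seed and the number of draws are arbitrary choices.

```r
set.seed(1)  # arbitrary seed, only so that the sketch is reproducible

# Simulated "list" of values of theta drawn from the beta(7, 25) posterior
draws <- sort(rbeta(100000, 7, 25))

# Middle of the sorted list versus the exact 50% quantile
median_sim   <- median(draws)
median_exact <- qbeta(0.5, 7, 25)

print(c(simulated = median_sim, exact = median_exact))
```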

(iii) What is the mean value of $\theta$?
This is the posterior mean = 7/(7+25) = 0.21875 ≈ 0.219

(iv) You are now asked to find the shortest range such that the probability that $\theta$ lies within
this range is 90%. This is the highest posterior density (HPD) interval. It will be discussed in class.
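Ahead of the class discussion, one possible way to approximate such an interval numerically is sketched below. It is an assumption about the method, not the approach the course will necessarily take: among all intervals [qbeta(p), qbeta(p + 0.90)] with 90% posterior probability, it searches for the lower tail probability p that gives the smallest width.

```r
# Width of the 90% interval that leaves probability p in the lower tail
width <- function(p) qbeta(p + 0.90, 7, 25) - qbeta(p, 7, 25)

# Search over the admissible lower tail probabilities, 0 to 0.10
opt <- optimize(width, interval = c(0, 0.10))

# End points of the shortest 90% interval found
hpd <- qbeta(c(opt$minimum, opt$minimum + 0.90), 7, 25)

print(hpd)
print(diff(hpd))  # width of the interval
```

Because the beta(7,25) density has a single peak, the shortest 90% interval found this way is the highest posterior density interval.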

(v) You are asked to find another interval [a,b] such that there is a 90% probability that $\theta$
lies in this interval, but with the additional requirement that $P(\theta < a \mid y) = 5\%$ and
$P(\theta > b \mid y) = 5\%$.
This is the central posterior interval.

i.e. we are looking for a such that
$$0.05 = \int_0^a p(\theta \mid y)\, d\theta$$
which gives a = 0.1111
(Using R : qbeta(0.05, 7, 25, ncp = 0, lower.tail = TRUE, log.p = FALSE))

and b such that
$$0.05 = \int_b^1 p(\theta \mid y)\, d\theta$$
which gives b = 0.3466525
(Using R : qbeta(0.05, 7, 25, ncp = 0, lower.tail = FALSE, log.p = FALSE))
Check: calculate the area under the posterior density between point a and point b,

i.e. in R : pbeta(0.1111, 7, 25, ncp = 0, lower.tail = TRUE, log.p = FALSE) = 0.05002301

and pbeta(0.3466525, 7, 25, ncp = 0, lower.tail = TRUE, log.p = FALSE) = 0.9499999,
so that 0.9499999 - 0.05002301 = 0.90
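The two quantiles and the check can also be obtained in one go, since `qbeta` and `pbeta` accept vectors. This is only a compact restatement of the calls above, with the default arguments left out.

```r
# 5% and 95% quantiles of the beta(7, 25) posterior
central <- qbeta(c(0.05, 0.95), 7, 25)

# Posterior probability between the two end points
coverage <- diff(pbeta(central, 7, 25))

print(central)
print(coverage)
```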

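(vi) Test the hypothesis that:
$H_0: \theta \le 0.1$ v/s $H_1: \theta \ge 0.3$

The posterior probabilities of the two hypotheses are:

$P_0 = P(H_0 \mid y) = P(\theta \le 0.1 \mid y) = \int_0^{0.1} p(\theta \mid y)\, d\theta = 0.03056312$
(Using R : pbeta(0.1, 7, 25, ncp = 0, lower.tail = TRUE, log.p = FALSE))

$P_1 = P(\theta \ge 0.3 \mid y) = \int_{0.3}^{1} p(\theta \mid y)\, d\theta = 0.1346445$
(Using R : pbeta(0.3, 7, 25, ncp = 0, lower.tail = FALSE, log.p = FALSE)
or 1 - pbeta(0.3, 7, 25, ncp = 0, lower.tail = TRUE, log.p = FALSE))

A posteriori (after observing the data), $H_1$ is 0.1346445/0.03056312 = 4.4 times as likely as $H_0$.

The corresponding prior probabilities, under the beta(2,10) prior, are:

$\pi_0 = P(H_0) = P(\theta \le 0.1) = \int_0^{0.1} p(\theta)\, d\theta = 0.3026431$

$\pi_1 = P(H_1) = P(\theta \ge 0.3) = \int_{0.3}^{1} p(\theta)\, d\theta = 0.1129901$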
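The four probabilities can be reproduced together as follows. The sketch reads the thresholds 0.1 and 0.3 off the pbeta calls above, and the variable names (including `pi0` and `pi1` for the prior probabilities) are chosen here for illustration.

```r
# Posterior probabilities of the two hypotheses under beta(7, 25)
P0 <- pbeta(0.1, 7, 25)                       # P(theta <= 0.1 | y)
P1 <- pbeta(0.3, 7, 25, lower.tail = FALSE)   # P(theta >= 0.3 | y)

# Prior probabilities of the same two regions under beta(2, 10)
pi0 <- pbeta(0.1, 2, 10)
pi1 <- pbeta(0.3, 2, 10, lower.tail = FALSE)

# How likely H1 is relative to H0, after and before seeing the data
posterior_odds <- P1 / P0
prior_odds     <- pi1 / pi0

print(c(P0 = P0, P1 = P1, pi0 = pi0, pi1 = pi1))
print(c(posterior_odds = posterior_odds, prior_odds = prior_odds))
```

Comparing the two odds shows how much the data shifted the balance between the hypotheses relative to the prior.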
