
MATH5820M Bayesian Statistics and Causality

Coursework
Walter Peralta
ID 201074957
Dr John Paul Gosling

Answer 1. Proposal for a data generating model:


Before making any proposal, we need to take into account everything we know about this
problem. First, because of the way the system is set up, a video recording can be at most
56 minutes long (lengths are rounded to the nearest 10 seconds). We also know that the videos
tend to be as long as possible, since every lecture takes almost one hour. We further assume
that a lecture is recorded only if it lasts more than 15 minutes. This seems plausible, because
no lecture is started with the intention of lasting less than 15 minutes. The range we work
with is therefore (15, 56).
We start by considering a normal distribution as a reasonable likelihood for the data
generating process, since we think that most lectures tend to have a similar length. Because
the video lengths are defined over a bounded range, we use the truncated version of this
distribution: $X_i \mid \mu, \sigma \sim TN(\mu, \sigma)$, with parameters $\mu$ as the mode
and $\sigma$ as the spread of the data.
Before starting our analysis, we need to make a set of assumptions. The first is that the
length of one video does not affect the length of subsequent recordings, so the lengths are
independent of each other. Furthermore, every lecturer has enough time to start their
recording independently of the previous one. These assumptions imply that the video lengths
X form a sequence of i.i.d. variables. Moreover, as we have already noted, most lecturers
will tend to finish their recordings close to the limit the system allows.
Thus, if we define $X$ as the data on the video lengths and $\mu$ as the parameter of our
beliefs about the data, the likelihood can be written as in (1).

\[
l(\mu; X) = \prod_{i} \pi(X_i \mid \mu) \propto \exp\left\{ -\frac{1}{2\sigma^2} \sum_{i} (x_i - \mu)^2 \right\} \tag{1}
\]

This is going to be our proposal for the data generating process.

Answer 2. Elicitation of our beliefs about the parameters and a prior specification:
Let us start by defining the parameter $\mu$ over the range (15, 56). We consider that the value
that splits this range into two parts with equal probability is 35. A reasonable justification
for this value is that most lectures tend to finish after half an hour. We assume that
lecturers tend to get as close as possible to the half hour.

Then, we want to know which value splits the lower range (15, 30) in two, such that there is
an equal chance of being above or below it. This value cannot be close to 15, because it is
unlikely that a lecture would be started if it were to be recorded for only around 15 minutes.
Moreover, whatever the topic might be, a lecture has an introduction and a conclusion of the
main topic. Thus, we give an answer of 25. Finally, the value that splits the upper range
(30, 56) is taken to be 40, considering that many lecturers tend to finish their recordings
as close as possible to the time limit.
Our judgements for the parameters are expressed below:
Range (15, 56)
Median = 35
LQ = 25
UQ = 40
Using the MATCH elicitation software tool, a Beta distribution is fitted according to our
judgements: $(\mu - 15)/41 \sim Be(1.49407, 1.773119)$, shown in Figure 1 (left).
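As a rough check of how closely the fitted Beta distribution reproduces the elicited quartiles, the sketch below approximates the quantiles of the scaled Beta on (15, 56). It is only an illustration: the function names and the grid-based midpoint rule are assumptions of this sketch, not part of the MATCH tool.

```python
def beta_kernel(u, a, b):
    """Unnormalised Beta(a, b) density at u in (0, 1)."""
    return u ** (a - 1.0) * (1.0 - u) ** (b - 1.0)


def scaled_beta_quantiles(probs, a, b, lower, upper, n_grid=20000):
    """Approximate quantiles of lower + (upper - lower) * Beta(a, b).

    Uses a midpoint rule on a regular grid, so no special functions are needed.
    """
    step = 1.0 / n_grid
    mids = [(k + 0.5) * step for k in range(n_grid)]
    weights = [beta_kernel(u, a, b) for u in mids]
    total = sum(weights)
    quantiles = []
    for p in probs:
        running = 0.0
        for u, w in zip(mids, weights):
            running += w
            if running >= p * total:
                # Map the quantile from (0, 1) back to (lower, upper).
                quantiles.append(lower + (upper - lower) * u)
                break
    return quantiles


# Beta parameters reported in the text; the range (15, 56) has width 41.
lq, median, uq = scaled_beta_quantiles(
    [0.25, 0.5, 0.75], 1.49407, 1.773119, 15.0, 56.0
)
print(round(lq, 1), round(median, 1), round(uq, 1))
```

The printed values can be compared with the judgements LQ = 25, Median = 35 and UQ = 40. A fitted distribution matches elicited quantiles only approximately, so small differences are to be expected.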

Figure 1:
Regarding the spread $\sigma$, we are uncertain about its value. As far as we know, every
lecture is scheduled for one hour, but we assume that lectures differ in timing because of
differences in topics and lecturers. Therefore, we define a range that is neither too small
nor too big: the spread is likely to be between 5 and 15 minutes.
We are now able to build the preposterior distribution for the data in order to check our
beliefs. Figure 1 (right) roughly reflects our judgements about the video lengths.
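A preposterior (prior predictive) sample of this kind can be simulated by drawing the parameters from their priors and then drawing a video length from the truncated normal. The sketch below is a minimal illustration under one loud assumption: the text only says the spread is likely to lie between 5 and 15 minutes, so a Uniform(5, 15) prior on $\sigma$ is assumed here. All function names are hypothetical.

```python
import random

LOWER, UPPER = 15.0, 56.0          # range of video lengths from the text
A, B = 1.49407, 1.773119           # Beta parameters from the MATCH fit
SIGMA_LOW, SIGMA_HIGH = 5.0, 15.0  # ASSUMPTION: sigma ~ Uniform(5, 15)


def draw_truncated_normal(rng, mu, sigma, lower=LOWER, upper=UPPER):
    """Rejection sampler: redraw until the normal value falls in (lower, upper)."""
    while True:
        x = rng.gauss(mu, sigma)
        if lower < x < upper:
            return x


def prior_predictive(n_draws, seed=0):
    """Draw video lengths by first sampling (mu, sigma) from their priors."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        # mu follows the scaled Beta prior on (15, 56).
        mu = LOWER + (UPPER - LOWER) * rng.betavariate(A, B)
        # sigma follows the assumed uniform prior.
        sigma = rng.uniform(SIGMA_LOW, SIGMA_HIGH)
        draws.append(draw_truncated_normal(rng, mu, sigma))
    return draws
```

A histogram of `prior_predictive(10000)` could then be compared with what we believe about the video lengths before seeing any data.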

Answer 3. Non-informative prior:


We want a prior that encodes our ignorance. Applying Jeffreys' prior to the likelihood in equation (1), we obtain the non-informative prior in equation (2).

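\[
L(\mu; x) = -\frac{1}{2\sigma^2} (x - \mu)^2 + C
\]

Taking derivatives with respect to $\mu$:

\[
\frac{\partial^2 L}{\partial \mu^2} = -\frac{1}{\sigma^2}, \qquad
I(\mu; x) = \frac{1}{\sigma^2}
\]

\[
\pi(\mu) \propto \sqrt{I(\mu; x)} = \frac{1}{\sigma} \tag{2}
\]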

Answer 4. Numerical integration approach:


Our posterior density is given in (3).

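\[
\pi(\mu, \sigma \mid X_i) \propto \pi(\mu, \sigma)\, \pi(X_i \mid \mu, \sigma)
\]
\[
\propto \left(\frac{\mu - 15}{41}\right)^{0.49407} \left(1 - \frac{\mu - 15}{41}\right)^{0.773119} \pi(\sigma)\,
\frac{\exp\left\{ -\frac{1}{2\sigma^2} (X_i - \mu)^2 \right\}}{\int_{15}^{56} \exp\left\{ -\frac{1}{2\sigma^2} (x - \mu)^2 \right\} dx} \tag{3}
\]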
Looking at our data, we can now start the numerical integration approach using the Metropolis-Hastings algorithm. We work on the log scale to ease the analysis. Using a chain of
100000 samples, we get the trace plot for $\mu$ shown in Figure 2 (left).
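The sampling step described above can be sketched as a random-walk Metropolis-Hastings sampler on the log posterior. This is an illustration rather than the code used for the coursework: the Uniform(5, 15) prior on $\sigma$, the proposal step sizes, the starting point and all function names are assumptions, and the data must be supplied by the reader.

```python
import math
import random

LOWER, UPPER = 15.0, 56.0            # truncation range from the text
A, B = 1.49407, 1.773119             # Beta prior parameters from the MATCH fit
SIGMA_MIN, SIGMA_MAX = 5.0, 15.0     # ASSUMPTION: sigma ~ Uniform(5, 15)


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def log_posterior(mu, sigma, data):
    """Log of prior times truncated-normal likelihood, up to a constant."""
    if not (LOWER < mu < UPPER and SIGMA_MIN < sigma < SIGMA_MAX):
        return -math.inf
    u = (mu - LOWER) / (UPPER - LOWER)
    log_prior = (A - 1.0) * math.log(u) + (B - 1.0) * math.log(1.0 - u)
    # Probability mass of the untruncated normal inside (LOWER, UPPER).
    mass = normal_cdf((UPPER - mu) / sigma) - normal_cdf((LOWER - mu) / sigma)
    log_lik = sum(-0.5 * ((x - mu) / sigma) ** 2 for x in data)
    log_lik -= len(data) * (math.log(sigma) + math.log(mass))
    return log_prior + log_lik


def metropolis_hastings(data, n_iter, step_mu=1.0, step_sigma=0.5, seed=0):
    """Random-walk Metropolis-Hastings; returns a list of (mu, sigma) draws."""
    rng = random.Random(seed)
    mu, sigma = 35.0, 10.0           # ASSUMPTION: arbitrary starting point
    current = log_posterior(mu, sigma, data)
    chain = []
    for _ in range(n_iter):
        mu_new = mu + rng.gauss(0.0, step_mu)
        sigma_new = sigma + rng.gauss(0.0, step_sigma)
        proposed = log_posterior(mu_new, sigma_new, data)
        # Accept with probability min(1, posterior ratio), on the log scale.
        if math.log(1.0 - rng.random()) < proposed - current:
            mu, sigma, current = mu_new, sigma_new, proposed
        chain.append((mu, sigma))
    return chain
```

Because the proposal is symmetric, the acceptance ratio reduces to the ratio of posterior densities, and working with logs avoids numerical underflow when the likelihood is a product over many observations.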

Figure 2:
The trace plot moves around the value $\mu = 31.43583$ and resembles a white noise process.
However, Figure 2 (right) shows some autocorrelation between successive draws, so the chain
has some memory. A similar pattern is observed in Figure 3 with respect to $\sigma$.
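The autocorrelation shown in plots of this kind can be computed directly from the chain. The following is a minimal sketch of the sample autocorrelation function; the function name is hypothetical.

```python
def autocorrelation(chain, max_lag):
    """Sample autocorrelation of a 1-D chain for lags 0..max_lag."""
    n = len(chain)
    mean = sum(chain) / n
    centred = [x - mean for x in chain]
    variance_sum = sum(c * c for c in centred)
    acf = []
    for lag in range(max_lag + 1):
        # Sum of products of values that are `lag` steps apart.
        cross = sum(centred[t] * centred[t + lag] for t in range(n - lag))
        acf.append(cross / variance_sum)
    return acf
```

For a chain that behaves like white noise, the values at lags greater than zero should be close to zero. Values that decay slowly indicate memory between successive draws.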

Figure 3:
Looking at the chain after a burn-in of 5000 (Figure 4), we obtain the following statistics:

Post-mean = 31.43639
95%CI : (29.17826, 33.41413)

Post-variance = 1.24441
95%CI : (1.035652, 1.676541)
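Summaries of this kind can be obtained from a chain by discarding the burn-in and taking empirical quantiles. The sketch below assumes an equal-tailed interval and a simple index-based quantile rule; both are assumptions of this illustration, as is the function name.

```python
def summarise_chain(chain, burn_in, level=0.95):
    """Posterior mean and equal-tailed credible interval after burn-in."""
    kept = sorted(chain[burn_in:])
    n = len(kept)
    mean = sum(kept) / n
    tail = (1.0 - level) / 2.0
    # Index-based empirical quantiles of the sorted draws.
    lower = kept[int(tail * (n - 1))]
    upper = kept[int((1.0 - tail) * (n - 1))]
    return mean, (lower, upper)
```

For example, `summarise_chain(mu_chain, burn_in=5000)` would return the posterior mean of a hypothetical list `mu_chain` together with a 95% interval.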

Figure 4:

Answer 5. Predictive sampling:


Comparing the preposterior with the predictive distribution in Figure 5, we can observe how
our beliefs have changed in the light of the data. The statistics for the predictive distribution
are Mean = 31.43432 and Variance = 3.200284.

Figure 5: Preposterior and Predictive Distributions


We now expect the video lengths to have a smaller spread than we first thought.
However, our initial beliefs about the mode were fairly close to the mean of the predictive distribution. This means that we expect a mean of around 30 minutes.
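Predictive sampling of this kind can be sketched as follows: for every posterior draw of $(\mu, \sigma)$, one new video length is simulated from the truncated normal. The function names are hypothetical, and the posterior draws are assumed to come from an MCMC chain after burn-in.

```python
import random


def posterior_predictive(posterior_draws, lower=15.0, upper=56.0, seed=0):
    """One simulated video length per posterior draw of (mu, sigma)."""
    rng = random.Random(seed)
    simulated = []
    for mu, sigma in posterior_draws:
        # Rejection step: keep only values inside the truncation range.
        x = rng.gauss(mu, sigma)
        while not (lower < x < upper):
            x = rng.gauss(mu, sigma)
        simulated.append(x)
    return simulated


def mean_and_variance(values):
    """Sample mean and unbiased sample variance."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, variance
```

Applying `mean_and_variance` to the simulated lengths gives summary statistics of the same kind as the predictive mean and variance quoted above.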

Answer 6. Networks:
We add the following explanatory variables to the DAG in Figure 6:
Z9 = late bus
Z10 = preparation of the lecture

Figure 6: DAG
a) The lecturer's bedtime, as well as whether he is tired, affects the start of the lecture.
Being tired can in turn be affected by a late bedtime. Commuting to the university also
influences whether he is on time, and thus the lecture starting time. Moreover, if the lecturer is
tired, this influences the preparation of the lecture, which can also be affected
by the difficulty of the topic.


In turn, the preparation of the class may affect the students' interest in the lecture.
If a class is not well prepared, the students' interest suffers, and it may
also be affected by whether or not they went out the night before.
The students' interest also influences the lecturer: if they don't
show any enthusiasm in listening to him, he will lose interest as well. And if he is thinking about
the late-night TV show, he might also lose interest in the topic and tend to finish the
lecture early so as to get home before his favourite show starts.
b) Given our proposed causal network, let us see which adjustment we are able to use.
We observe that
$X \to Z_4 \to Y \leftarrow Z_8$
There is one chain $X \to Z_4 \to Y$ and one collider $Z_4 \to Y \leftarrow Z_8$. The back-door adjustment
cannot be used, because there is no indirect path from X to Y.
If we use the front-door adjustment, we observe that all directed paths from X to Y
are intercepted by $Z_4$. We also observe that there is no unblocked back-door path from X to S, as
$Y \not\perp S$. Finally, there are no back-door paths from S to Y to consider, as we have already seen
with the back-door adjustment. Therefore, S satisfies the front-door condition.
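For reference, when a set S satisfies the front-door condition, the causal effect of X on Y can be written in terms of observational quantities. The expression below is the standard front-door adjustment formula for discrete variables. It is not taken from the coursework itself; $x'$ is a dummy summation index, and S is assumed to be the mediating set (here presumably $Z_4$).

\[
P(y \mid do(x)) = \sum_{s} P(s \mid x) \sum_{x'} P(y \mid x', s)\, P(x')
\]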
