Rteacher 2.1, Lecture 2 (23/10/2016)
2. Probability
2.1. Introducing probability
Probability theory is a branch of mathematics which rigorously describes uncertain or random systems and processes.
It has its roots in the 16th and 17th centuries with the work of Cardano, Fermat and Pascal, but it is also an area of
modern development and application.
Statistical theory employs probability theory in designing methods for learning about the real world through data.
Modern applications of probability and statistical theory are enormously varied: for example, determining the location
of genes in the human genome which predispose to diseases such as diabetes, heart disease and cancer; evaluating
insurance risks and pension benefits; predicting global climate change and extreme weather conditions; and designing
optimal strategies for trading in the financial markets.
Put simply, probability measures the likelihood of some event occurring:
Probability 0 means that the event is impossible, whereas
Probability 1 means that the event is certain.
The larger the probability, the more likely the event.
Event Notation
Further definitions will follow, but for now consider:
A : an event describing some outcome of the experiment.
Pr(A) : the probability of event A occurring, with 0 ≤ Pr(A) ≤ 1.
Experiment 1: Roll three dice. Let
A = {There is at least one 6 rolled}.
By symmetry, we argue that Pr(A) ≈ 0.42.
Experiment 2: Measure the height of a randomly selected student. Let
B = {Height less than 1.6m}.
Empirically, we might find that Pr(B) ≈ 0.25 .
Experiment 3: Observe the weather in Leeds tomorrow. Let
C = {It will rain at some point during the day}.
Subjectively, I predict Pr(C) ≈ 0.2 .
Here we see several ways to assign probability:
i. By symmetry: considering all the possible outcomes from the experiment and computing the proportion which meet condition A, assuming all outcomes are equally likely.
ii. Empirically: repeatedly performing the random experiment under constant conditions and computing the relative frequency with which B occurs.
iii. Subjectively: based on my experience, some informal reasoning, and possibly also past weather data.
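The symmetry and empirical approaches can be compared in code. Below is a minimal sketch (the function name is my own) which estimates Pr(A) for Experiment 1 by simulating many rolls of three dice and compares the relative frequency with the symmetry answer 1 − (5/6)³ = 91/216 ≈ 0.42.

```python
import random

def at_least_one_six(n_trials=100_000, seed=1):
    """Estimate Pr(at least one 6 when rolling three dice) empirically."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        rolls = [rng.randint(1, 6) for _ in range(3)]
        if 6 in rolls:
            hits += 1
    return hits / n_trials

exact = 1 - (5 / 6) ** 3        # by symmetry: 91/216
estimate = at_least_one_six()   # empirically
print(f"empirical: {estimate:.3f}, by symmetry: {exact:.3f}")
```

With 100,000 trials the empirical estimate typically agrees with the symmetry answer to two decimal places.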
2.2. Set theory
2.2.1. Sample space and events
Let the sample space, Ω, the Greek letter capital "omega", be the set of all possible outcomes of an experiment, and
let ω, small "omega", be an element of Ω, ω ∈ Ω . We call ω an elementary event. Let |Ω| denote the number of
elements in Ω.
More generally, an event is a set of outcomes of an experiment. We often denote an event with an uppercase letter
A , B , C, … The set A can be empty, A = ∅ , giving an impossible event, Pr(∅) = 0 , or can equal the sample
space, A = Ω , giving a certain event, Pr(Ω) = 1 .
Example
Experiment: Toss three coins.
Ω = {(H, H, H), (H, H, T), (H, T, H), (H, T, T), (T, H, H), (T, H, T), (T, T, H), (T, T, T)}
Thus |Ω| = 8 .
Suppose A = {There are at least two heads} = {(H, H, H), (H, H, T), (H, T, H), (T, H, H)} .
Assuming the coins are fair, that is, they are as likely to land "Heads" as they are to land "Tails", then we can perform
the following probability calculation: |A| = 4 and so
Pr(A) = |A| / |Ω| = 4/8 = 1/2.
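This classical calculation can be checked by enumerating the sample space directly; a minimal sketch:

```python
from itertools import product

# All 8 outcomes of tossing three coins
omega = list(product("HT", repeat=3))

# A = {there are at least two heads}
A = [w for w in omega if w.count("H") >= 2]

pr_A = len(A) / len(omega)
print(len(omega), len(A), pr_A)  # 8 4 0.5
```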
Example
Experiment: Roll two dice.
Ω = {(1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6),
(2, 1), (2, 2), (2, 3), (2, 4), (2, 5), (2, 6),
(3, 1), (3, 2), (3, 3), (3, 4), (3, 5), (3, 6),
(4, 1), (4, 2), (4, 3), (4, 4), (4, 5), (4, 6),
(5, 1), (5, 2), (5, 3), (5, 4), (5, 5), (5, 6),
(6, 1), (6, 2), (6, 3), (6, 4), (6, 5), (6, 6)}
Thus |Ω| = 36 .
Suppose A is an event with |A| = 3 . Then, assuming the dice are fair,
Pr(A) = |A| / |Ω| = 3/36 = 1/12.
Example
Experiment: Measure the height of a randomly selected student. Thus Ω = ?
We shall return to this later in this module.
2.2.2. Practice questions
2.2.2.1.
I roll a dice and flip a coin.
a. What is the sample space, Ω, for this experiment?
b. How many elementary events does it contain?
c. Assuming the dice and coin are both fair:
i. What is the probability of a "Head" and an even number?
ii. What is the probability of a "Head" or an even number, or both?
iii. What is the probability of a "Head" or an even number, but not both?
2.2.2.2.
I roll two dice, one after the other.
a. How many elementary events are there in the set S where the second dice shows a higher number than the first
dice?
b. How many elementary events are there in the set T where one or both of the dice shows a "1" or a "6"?
c. Assuming both dice are fair,
i. what is the probability of the event S , above?
ii. what is the probability of the event T , above?
iii. what is the probability that events S and T both occur?
2.2.3. Venn diagrams
It is useful to show the relationships between events using Venn diagrams, such as the following.
Points represent outcomes, with
the box representing the sample space, and
the shaded region representing the outcomes in an event.
2.2.4. Operations with events
Note that Ωᶜ = ∅ and ∅ᶜ = Ω ; also A ∪ Aᶜ = Ω and A ∩ Aᶜ = ∅ .
The operations of union and intersection can be further combined to give various set identities, for example:
A ∪ B = B ∪ A
} Commutative laws
A ∩ B = B ∩ A
A ∪ (B ∪ C) = (A ∪ B) ∪ C
} Associative laws
A ∩ (B ∩ C) = (A ∩ B) ∩ C
(A ∪ B) ∩ C = (A ∩ C) ∪ (B ∩ C)
} Distributive laws
(A ∩ B) ∪ C = (A ∪ C) ∩ (B ∪ C)
We can show, intuitively, that these laws are true by carefully constructing a set of Venn diagrams (but formal proof is
more rigorous and powerful). As an example, consider the second distributive law. Drawing Venn diagrams of both the
left-hand side and the right-hand side shows that the law is true. Starting with the left-hand side:
Now the right-hand side:
We see that the areas shaded in panels c and f are the same, and so we can claim that
(A ∩ B) ∪ C = (A ∪ C) ∩ (B ∪ C) is a true statement.
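Beyond Venn diagrams, one can also spot-check such an identity exhaustively over every subset of a small universe; a sketch (the choice of universe is arbitrary):

```python
from itertools import chain, combinations

universe = {1, 2, 3, 4}

def subsets(s):
    """All subsets of s, as frozensets."""
    s = list(s)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(s, r) for r in range(len(s) + 1))]

# Check the second distributive law for every triple of subsets
all_subsets = subsets(universe)
for A in all_subsets:
    for B in all_subsets:
        for C in all_subsets:
            assert (A & B) | C == (A | C) & (B | C)
print("law holds on all", len(all_subsets) ** 3, "triples")
```

This is not a proof for arbitrary sets, but a failing triple here would immediately refute the identity.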
2.3. The axioms and basic rules of probability
2.3.1. Kolmogorov's axioms
Kolmogorov's axioms of probability are:
K1: Pr(A) ≥ 0 for any event A ,
K2: Pr(Ω) = 1 for any sample space Ω and
K3: Pr(A ∪ B) = Pr(A) + Pr(B) for mutually exclusive events A and B (that is, where A ∩ B = ∅ ).
Clearly, these are very basic statements, but they are sufficient to allow many complex rules to be derived.
Consider the following basic rules:
a. Pr(Aᶜ) = 1 − Pr(A).
Proof: Starting with the set relation
A ∪ Aᶜ = Ω,
consider the probability of the left and right sides:
Pr(A ∪ Aᶜ) = Pr(Ω).
Applying Kolmogorov axioms K3 and K2 we obtain
Pr(A) + Pr(Aᶜ) = 1,
which leads to the required result
Pr(Aᶜ) = 1 − Pr(A).
b. Pr(∅) = 0 .
Proof: Start by noting that ∅ = Ωᶜ , then by result (a) above with A = Ω , we have
Pr(∅) = 1 − Pr(Ω) = 1 − 1 = 0, as required,
where the final step uses axiom K2.
c. If A ⊆ B then Pr(A) ≤ Pr(B).
Proof: Again, start with a set relation,
B = A ∪ (B ∩ Aᶜ),
then using K3, since A ∩ (B ∩ Aᶜ) = ∅ , gives
Pr(B) = Pr(A) + Pr(B ∩ Aᶜ)
and since Pr(B ∩ Aᶜ) ≥ 0 , by K1 we get
Pr(B) ≥ Pr(A),
as required.
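These basic rules can be sanity-checked numerically on a small equally-likely sample space; a sketch, with the events chosen for illustration:

```python
omega = set(range(1, 7))          # one fair die

def pr(event):
    """Classical probability: |event| / |omega|."""
    return len(event & omega) / len(omega)

A = {2, 4, 6}                     # even numbers
B = {2, 3, 4, 5, 6}               # at least 2, so A is a subset of B

assert pr(omega - A) == 1 - pr(A)   # rule (a): complement
assert pr(set()) == 0               # rule (b): impossible event
assert pr(A) <= pr(B)               # rule (c): monotonicity
print(pr(A), pr(B))
```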
2.3.2. The addition rule for general events
A more important rule which can be derived from the axioms of probability is known as the Addition Rule for General
Events:
Pr(A ∪ B) = Pr(A) + Pr(B) − Pr(A ∩ B).
The key step is to realise that we can subdivide the events in the following two ways.
Start with
A ∪ B = B ∪ (A ∩ Bᶜ).
Using K3, since B ∩ (A ∩ Bᶜ) = ∅ , this gives
Pr(A ∪ B) = Pr(B) + Pr(A ∩ Bᶜ). (*)
Also, using the set relation
A = (A ∩ B) ∪ (A ∩ Bᶜ)
and K3 with (A ∩ B) ∩ (A ∩ Bᶜ) = ∅ , we get
Pr(A) = Pr(A ∩ B) + Pr(A ∩ Bᶜ). (**)
Rearranging (**) as Pr(A ∩ Bᶜ) = Pr(A) − Pr(A ∩ B) and substituting into (*) gives
Pr(A ∪ B) = Pr(B) + Pr(A) − Pr(A ∩ B),
as required.
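The Addition Rule can be verified by enumeration for concrete events; a sketch using the two-dice sample space, with exact arithmetic via fractions:

```python
from itertools import product
from fractions import Fraction

omega = list(product(range(1, 7), repeat=2))  # 36 equally likely outcomes

def pr(event):
    """Classical probability as an exact fraction."""
    return Fraction(len(event), len(omega))

A = {w for w in omega if w[0] == 6}        # first die shows a 6
B = {w for w in omega if sum(w) >= 10}     # total is at least 10

# Addition Rule for General Events
assert pr(A | B) == pr(A) + pr(B) - pr(A & B)
print(pr(A | B))  # 1/4
```

Here |A| = 6, |B| = 6 and |A ∩ B| = 3, so both sides equal 9/36 = 1/4.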
The Addition Rule for General Events generalises to more than two events. For example, consider three events, A , B
and C. Then we can prove the following result:
Pr(A ∪ B ∪ C) = Pr(A) + Pr(B) + Pr(C) − Pr(A ∩ B) − Pr(A ∩ C) − Pr(B ∩ C) + Pr(A ∩ B ∩ C).
2.3.3. Practice Questions
2.3.3.1.
In a standard pack of 52 playing cards there are four suits: Clubs, Diamonds, Hearts and Spades, and there are
thirteen cards in each suit: Ace, 2, 3, 4, 5, 6, 7, 8, 9, 10, Jack, Queen, King. Hearts and Diamonds are red, and Clubs and
Spades are black. The "Royal" cards are the Jacks, Queens and Kings. I draw a card from the pack at random, each
card having an equal probability of being drawn. What are the probabilities of each of the following events:
i. I draw the King of Spades?
ii. I draw a King or a Spade?
iii. I draw a black card?
iv. I draw a royal card?
v. I draw a royal card or a Heart?
vi. I draw a royal Heart?
2.3.3.2.
Suppose I drew the Jack of Diamonds in the previous question. I now draw a second card without replacing the first
card. Does this affect the probabilities calculated above? Select those which decrease.
Solution: iv and v.
2.4. Classical probability
In the classical approach, the sample space Ω is finite and each elementary event ωᵢ is assumed to be equally likely,
so that
Pr(ωᵢ) = 1/|Ω| for i = 1, 2, … , N = |Ω|.
Note that axioms K1 and K2 are clearly true.
Consider an event A , then using K3,
Pr(A) = ∑ Pr(ωᵢ), the sum taken over all i with ωᵢ ∈ A,
that is, the sum over all outcomes belonging to event A , which can be written as
Pr(A) = |A| / |Ω|,
that is, the number of outcomes in the event of interest divided by the number of outcomes in the sample space. We
saw examples of this earlier.
So, in the classical approach, it is vital to be able to count the number of outcomes in events and in the sample space;
this topic is called combinatorics.
2.4.1. Basic combinatorics
The multiplication principle says that if an experiment has k stages with n₁ possible outcomes at the first stage, n₂ at
the second, …, and nₖ at the kth stage, then the total number of possible outcomes of the experiment is
|Ω| = n₁ × n₂ × ⋯ × nₖ.
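The multiplication principle corresponds directly to taking a Cartesian product of the stages; a minimal sketch (the three-stage experiment here is my own example):

```python
from itertools import product
from math import prod

stages = [6, 6, 2]   # e.g. roll two dice, then flip a coin
outcomes = list(product(range(6), range(6), range(2)))

# |Ω| = n1 × n2 × n3 = 6 × 6 × 2 = 72
assert len(outcomes) == prod(stages)
print(len(outcomes))  # 72
```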
This principle also allows us to break down events into stages and to count the number of outcomes in the event as a
product of the outcomes of the stages.
Suppose we have n distinct objects, then the total number of ordered arrangements, or permutations, is
Pn = n × (n − 1) × (n − 2) × ⋯ × 2 × 1 = n!
and we say "n factorial".
Example
How many different arrangements of the letters a, b , c and d are possible?
Here n = 4 and so there are P₄ = 4! = 4 × 3 × 2 × 1 = 24 arrangements.
Here they are:
Note that each letter occurs only once in each arrangement.
Suppose now that we only select r of the n objects and permute these. Then the number of permutations of r objects
selected from n objects is
ⁿPᵣ = n × (n − 1) × ⋯ × (n − r + 1) = n! / (n − r)!
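The standard library can both compute this count and generate the arrangements themselves; a quick check of the formula (the values of n and r here are arbitrary):

```python
from math import factorial, perm
from itertools import permutations

n, r = 5, 3
# nPr = n! / (n - r)!
assert perm(n, r) == factorial(n) // factorial(n - r) == 60

# itertools generates the ordered arrangements themselves
arrangements = list(permutations("abcde", r))
assert len(arrangements) == perm(n, r)
print(arrangements[:3])  # [('a', 'b', 'c'), ('a', 'b', 'd'), ('a', 'b', 'e')]
```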
Example
How many different arrangements of exactly two of the four letters a, b , c and d are possible?
Here n = 4 and r = 2, so there are ⁴P₂ = 4!/(4 − 2)! = 12 arrangements:
ab, ac, ad, ba, bc, bd, ca, cb, cd, da, db, dc.
Note that ab and ba both appear because they are distinct arrangements.
Example
Suppose 20 people enter a 100m race. How many ways are there of distributing a Gold, Silver and Bronze?
Here n = 20 and r = 3, so there are ²⁰P₃ = 20!/(20 − 3)! = 20 × 19 × 18 = 6840 ways.
Suppose instead that the order of selection does not matter. Then the number of combinations of r objects selected
from n objects is
ⁿCᵣ = n! / (r!(n − r)!).
Example
How many different ways are there of selecting exactly two of the four letters a, b , c and d ?
Here n = 4 and r = 2, so there are ⁴C₂ = 4!/(2! 2!) = 6 selections:
ab, ac, ad, bc, bd, cd.
Note that ba, selecting b and a, does not appear because it is the same as ab, selecting a and b .
Example
A football captain has to choose four other players to complete his team from a squad of 20. How many combinations
are possible?
²⁰C₄ = 20! / (4!(20 − 4)!) = (20 × 19 × 18 × 17) / (4 × 3 × 2 × 1) = 4845.
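The combination counts in these examples can be confirmed with the standard library; a sketch:

```python
from math import comb
from itertools import combinations

# Two of the four letters a, b, c, d: 4C2 = 6 unordered selections
assert comb(4, 2) == 6
assert list(combinations("abcd", 2)) == [
    ("a", "b"), ("a", "c"), ("a", "d"),
    ("b", "c"), ("b", "d"), ("c", "d"),
]

# Four players from a squad of 20: 20C4 = 4845
print(comb(20, 4))  # 4845
```

Note that `itertools.combinations`, unlike `itertools.permutations`, yields each selection only once, exactly as in the letter example above.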
Suppose now that we are again interested in permutations of n objects, but that there are r of one type and (n − r)
of a second type. Then, the number of such permutations is again
ⁿCᵣ = n! / (r!(n − r)!).
We shall see later in the module that these coefficients also arise when dealing with the binomial distribution, and
hence are sometimes called binomial coefficients.