Rteacher 2.1, Lecture 2 (23/10/2016)
2. Probability
2.1. Introducing probability
Probability theory is a branch of mathematics which rigorously describes uncertain or random systems and processes.
It has its roots in the 16th and 17th centuries with the work of Cardano, Fermat and Pascal, but it is also an area of
modern development and application.
Statistical theory employs probability theory in designing methods for learning about the real world through data.
Modern applications of probability and statistical theory are enormously varied: for example, determining the location
of genes in the human genome which predispose to diseases such as diabetes, heart disease and cancer; evaluating
insurance risks and pension benefits; predicting global climate change and extreme weather conditions; and designing
optimal strategies for trading in the financial markets.
Put simply, probability measures the likelihood of some event occurring:
Probability 0 means that the event is impossible, whereas
Probability 1 means that the event is certain.
The larger the probability, the more likely the event.
Event Notation
Further definitions will follow, but for now consider:
A : an event describing some outcome of the experiment.
Pr(A) : the probability of event A occurring, with 0 ≤ Pr(A) ≤ 1.
Experiment 1: Roll three dice. Let
A = {There is at least one 6 rolled}.
By symmetry, we argue that Pr(A) ≈ 0.42.
Experiment 2: Measure the height of a randomly selected student. Let
B = {Height less than 1.6m}.
Empirically, we might find that Pr(B) ≈ 0.25 .
Experiment 3: Observe the weather in Leeds tomorrow. Let
C = {It will rain at some point during the day}.
Subjectively, I predict Pr(C) ≈ 0.2 .
Here we see several ways to assign probability:
i. By symmetry: considering all the possible outcomes from the experiment and computing the proportion which meet condition A, assuming all outcomes are equally likely.
ii. Empirically: repeatedly performing the random experiment under constant conditions and computing the relative frequency with which B occurs.
iii. Subjectively: based on my experience, some informal reasoning, and possibly also past weather data.
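The symmetry and empirical approaches can be compared in code. Below is a minimal sketch (the function name is my own) which estimates Pr(A) for Experiment 1 by simulating many rolls of three dice and compares the relative frequency with the symmetry answer 1 − (5/6)³ = 91/216 ≈ 0.42.

```python
import random

def at_least_one_six(n_trials=100_000, seed=1):
    """Estimate Pr(at least one 6 when rolling three dice) empirically."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        rolls = [rng.randint(1, 6) for _ in range(3)]
        if 6 in rolls:
            hits += 1
    return hits / n_trials

exact = 1 - (5 / 6) ** 3        # by symmetry: 91/216
estimate = at_least_one_six()   # empirically
print(f"empirical: {estimate:.3f}, by symmetry: {exact:.3f}")
```

With 100,000 trials the empirical estimate typically agrees with the symmetry answer to two decimal places.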
2.2. Set theory
2.2.1. Sample space and events
Let the sample space, Ω, the Greek letter capital "omega", be the set of all possible outcomes of an experiment, and
let ω, small "omega", be an element of Ω, ω ∈ Ω . We call ω an elementary event. Let |Ω| denote the number of
elements in Ω.
More generally, an event is a set of outcomes of an experiment. We often denote an event with an uppercase letter
A , B , C, … The set A can be empty, A = ∅ , giving an impossible event, Pr(∅) = 0 , or can equal the sample
space, A = Ω , giving a certain event, Pr(Ω) = 1 .
Example
Experiment: Toss three coins.
Ω = {(H, H, H), (H, H, T), (H, T, H), (H, T, T), (T, H, H), (T, H, T), (T, T, H), (T, T, T)}
Thus |Ω| = 8 .
Suppose A = {There are at least two heads} = {(H, H, H), (H, H, T), (H, T, H), (T, H, H)} .
Assuming the coins are fair, that is, they are as likely to land "Heads" as they are to land "Tails", then we can perform
the following probability calculation: |A| = 4 and so
Pr(A) = |A| / |Ω| = 4/8 = 1/2.
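This classical calculation can be checked by enumerating the sample space directly; a minimal sketch:

```python
from itertools import product

# All 8 outcomes of tossing three coins
omega = list(product("HT", repeat=3))

# A = {there are at least two heads}
A = [w for w in omega if w.count("H") >= 2]

pr_A = len(A) / len(omega)
print(len(omega), len(A), pr_A)  # 8 4 0.5
```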
Example
Experiment: Roll two dice.
Ω = {(1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6),
(2, 1), (2, 2), (2, 3), (2, 4), (2, 5), (2, 6),
(3, 1), (3, 2), (3, 3), (3, 4), (3, 5), (3, 6),
(4, 1), (4, 2), (4, 3), (4, 4), (4, 5), (4, 6),
(5, 1), (5, 2), (5, 3), (5, 4), (5, 5), (5, 6),
(6, 1), (6, 2), (6, 3), (6, 4), (6, 5), (6, 6)}
Thus |Ω| = 36 .
Suppose A is an event with |A| = 3 . Then, assuming the dice are fair,
Pr(A) = |A| / |Ω| = 3/36 = 1/12.
Example
Experiment: Measure the height of a randomly selected student. Thus Ω = ?
We shall return to this later in this module.
2.2.2. Practice questions
2.2.2.1.
I roll a dice and flip a coin.
a. What is the sample space, Ω, for this experiment?
b. How many elementary events does it contain?
c. Assuming the dice and coin are both fair:
i. What is the probability of a "Head" and an even number?
ii. What is the probability of a "Head" or an even number, or both?
iii. What is the probability of a "Head" or an even number, but not both?
2.2.2.2.
I roll two dice, one after the other.
a. How many elementary events are there in the set S where the second dice shows a higher number than the first
dice?
b. How many elementary events are there in the set T where one or both of the dice shows a "1" or a "6"?
c. Assuming both dice are fair,
i. what is the probability of the event S , above?
ii. what is the probability of the event T , above?
iii. what is the probability that events S and T both occur?
2.2.3. Venn diagrams
It is useful to show the relationships between events using Venn diagrams, such as the following.
Points represent outcomes, with
the box representing the sample space, and
the shaded region representing the outcomes in an event.
2.2.4. Operations with events
Note that Ωᶜ = ∅ and ∅ᶜ = Ω ; also A ∪ Aᶜ = Ω and A ∩ Aᶜ = ∅ .
The operations of union and intersection can be further combined to give various set identities, for example:
A ∪ B = B ∪ A
} Commutative laws
A ∩ B = B ∩ A
A ∪ (B ∪ C) = (A ∪ B) ∪ C
} Associative laws
A ∩ (B ∩ C) = (A ∩ B) ∩ C
(A ∪ B) ∩ C = (A ∩ C) ∪ (B ∩ C)
} Distributive laws
(A ∩ B) ∪ C = (A ∪ C) ∩ (B ∪ C)
We can show, intuitively, that these laws are true by carefully constructing a set of Venn diagrams (but formal proof is
more rigorous and powerful). As an example, consider the second distributive law. Drawing Venn diagrams of both the
left-hand side and the right-hand side shows that the law is true. Starting with the left-hand side:
Now the right-hand side:
We see that the areas shaded in panels c and f are the same, and so we can claim that
(A ∩ B) ∪ C = (A ∪ C) ∩ (B ∪ C) is a true statement.
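Beyond Venn diagrams, one can also spot-check such an identity exhaustively over every subset of a small universe; a sketch (the choice of universe is arbitrary):

```python
from itertools import chain, combinations

universe = {1, 2, 3, 4}

def subsets(s):
    """All subsets of s, as frozensets."""
    s = list(s)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(s, r) for r in range(len(s) + 1))]

# Check the second distributive law for every triple of subsets
all_subsets = subsets(universe)
for A in all_subsets:
    for B in all_subsets:
        for C in all_subsets:
            assert (A & B) | C == (A | C) & (B | C)
print("law holds on all", len(all_subsets) ** 3, "triples")
```

This is not a proof for arbitrary sets, but a failing triple here would immediately refute the identity.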
2.3. The axioms and basic rules of probability
2.3.1. Kolmogorov's axioms
Kolmogorov's axioms of probability are:
K1: Pr(A) ≥ 0 for any event A ,
K2: Pr(Ω) = 1 for any sample space Ω and
K3: Pr(A ∪ B) = Pr(A) + Pr(B) for mutually exclusive events A and B (that is, where A ∩ B = ∅ ).
Clearly, these are very basic statements, but they are sufficient to allow many complex rules to be derived.
Consider the following basic rules:
a. Pr(Aᶜ) = 1 − Pr(A).
Proof: Starting with the set relation
A ∪ Aᶜ = Ω,
consider the probability of the left and right sides:
Pr(A ∪ Aᶜ) = Pr(Ω).
Applying Kolmogorov axioms K3 and K2 we obtain
Pr(A) + Pr(Aᶜ) = 1,
which leads to the required result
Pr(Aᶜ) = 1 − Pr(A).
b. Pr(∅) = 0 .
Proof: Start by noting that ∅ = Ωᶜ , then by result (a) above with A = Ω , we have
Pr(∅) = 1 − Pr(Ω) = 1 − 1 = 0, as required,
where the final step uses axiom K2.
c. If A ⊆ B then Pr(A) ≤ Pr(B).
Proof: Again, start with a set relation,
B = A ∪ (B ∩ Aᶜ),
then using K3, since A ∩ (B ∩ Aᶜ) = ∅ , gives
Pr(B) = Pr(A) + Pr(B ∩ Aᶜ)
and since Pr(B ∩ Aᶜ) ≥ 0 , by K1 we get
Pr(B) ≥ Pr(A),
as required.
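These basic rules can be sanity-checked numerically on a small equally-likely sample space; a sketch, with the events chosen for illustration:

```python
omega = set(range(1, 7))          # one fair die

def pr(event):
    """Classical probability: |event| / |omega|."""
    return len(event & omega) / len(omega)

A = {2, 4, 6}                     # even numbers
B = {2, 3, 4, 5, 6}               # at least 2, so A is a subset of B

assert pr(omega - A) == 1 - pr(A)   # rule (a): complement
assert pr(set()) == 0               # rule (b): impossible event
assert pr(A) <= pr(B)               # rule (c): monotonicity
print(pr(A), pr(B))
```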
2.3.2. The addition rule for general events
A more important rule which can be derived from the axioms of probability is known as the Addition Rule for General
Events:
Pr(A ∪ B) = Pr(A) + Pr(B) − Pr(A ∩ B).
The key step is to realise that we can subdivide the events in the following two ways.
Start with
A ∪ B = B ∪ (A ∩ Bᶜ).
Using K3, since B ∩ (A ∩ Bᶜ) = ∅ , this gives
Pr(A ∪ B) = Pr(B) + Pr(A ∩ Bᶜ). (*)
Also, using the set relation
A = (A ∩ B) ∪ (A ∩ Bᶜ)
and K3 with (A ∩ B) ∩ (A ∩ Bᶜ) = ∅ , we get
Pr(A) = Pr(A ∩ B) + Pr(A ∩ Bᶜ). (**)
Rearranging (**) as Pr(A ∩ Bᶜ) = Pr(A) − Pr(A ∩ B) and substituting into (*) gives
Pr(A ∪ B) = Pr(B) + Pr(A) − Pr(A ∩ B),
as required.
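The Addition Rule can be verified by enumeration for concrete events; a sketch using the two-dice sample space, with exact arithmetic via fractions:

```python
from itertools import product
from fractions import Fraction

omega = list(product(range(1, 7), repeat=2))  # 36 equally likely outcomes

def pr(event):
    """Classical probability as an exact fraction."""
    return Fraction(len(event), len(omega))

A = {w for w in omega if w[0] == 6}        # first die shows a 6
B = {w for w in omega if sum(w) >= 10}     # total is at least 10

# Addition Rule for General Events
assert pr(A | B) == pr(A) + pr(B) - pr(A & B)
print(pr(A | B))  # 1/4
```

Here |A| = 6, |B| = 6 and |A ∩ B| = 3, so both sides equal 9/36 = 1/4.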
The Addition Rule for General Events generalises to more than two events. For example, consider three events, A , B
and C. Then we can prove the following result:
Pr(A ∪ B ∪ C) = Pr(A) + Pr(B) + Pr(C) − Pr(A ∩ B) − Pr(A ∩ C) − Pr(B ∩ C) + Pr(A ∩ B ∩ C).
2.3.3. Practice Questions
2.3.3.1.
In a standard pack of 52 playing cards there are four suits: Clubs, Diamonds, Hearts and Spades, and there are
thirteen cards in each suit: Ace, 2, 3, 4, 5, 6, 7, 8, 9, 10, Jack, Queen, King. Hearts and Diamonds are red, and Clubs and
Spades are black. The "Royal" cards are the Jacks, Queens and Kings. I draw a card from the pack at random, each
card having an equal probability of being drawn. What are the probabilities of each of the following events:
i. I draw the King of Spades?
ii. I draw a King or a Spade?
iii. I draw a black card?
iv. I draw a royal card?
v. I draw a royal card or a Heart?
vi. I draw a royal Heart?
2.3.3.2.
Suppose I drew the Jack of Diamonds in the previous question. I now draw a second card without replacing the first
card. Does this affect the probabilities calculated above? Select those which decrease.
Solution: iv and v.
2.4. Classical probability
In the classical approach, the sample space Ω is finite and each elementary event ωᵢ is assumed to be equally likely,
so that
Pr(ωᵢ) = 1/|Ω| for i = 1, 2, … , N = |Ω|.
Note that axioms K1 and K2 are clearly true.
Consider an event A , then using K3,
Pr(A) = ∑ Pr(ωᵢ), the sum taken over all i with ωᵢ ∈ A,
that is, the sum over all outcomes belonging to event A , which can be written as
Pr(A) = |A| / |Ω|,
that is, the number of outcomes in the event of interest divided by the number of outcomes in the sample space. We
saw examples of this earlier.
So, in the classical approach, it is vital to be able to count the number of outcomes in events and in the sample space;
this topic is called combinatorics.
2.4.1. Basic combinatorics
The multiplication principle says that if an experiment has k stages with n₁ possible outcomes at the first stage, n₂ at
the second, …, and nₖ at the kth stage, then the total number of possible outcomes of the experiment is
|Ω| = n₁ × n₂ × ⋯ × nₖ.
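The multiplication principle corresponds directly to taking a Cartesian product of the stages; a minimal sketch (the three-stage experiment here is my own example):

```python
from itertools import product
from math import prod

stages = [6, 6, 2]   # e.g. roll two dice, then flip a coin
outcomes = list(product(range(6), range(6), range(2)))

# |Ω| = n1 × n2 × n3 = 6 × 6 × 2 = 72
assert len(outcomes) == prod(stages)
print(len(outcomes))  # 72
```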
This principle also allows us to break down events into stages and to count the number of outcomes in the event as a
product of the outcomes of the stages.
Suppose we have n distinct objects, then the total number of ordered arrangements, or permutations, is
Pn = n × (n − 1) × (n − 2) × ⋯ × 2 × 1 = n!
and we say "n factorial".
Example
How many different arrangements of the letters a, b , c and d are possible?
Here n = 4 and so there are P₄ = 4! = 4 × 3 × 2 × 1 = 24 arrangements.
Here they are:
Note that each letter occurs only once in each arrangement.
Suppose now that we only select r of the n objects and permute these. Then the number of permutations of r objects
selected from n objects is
ⁿPᵣ = n × (n − 1) × ⋯ × (n − r + 1) = n! / (n − r)!
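The standard library can both compute this count and generate the arrangements themselves; a quick check of the formula (the values of n and r here are arbitrary):

```python
from math import factorial, perm
from itertools import permutations

n, r = 5, 3
# nPr = n! / (n - r)!
assert perm(n, r) == factorial(n) // factorial(n - r) == 60

# itertools generates the ordered arrangements themselves
arrangements = list(permutations("abcde", r))
assert len(arrangements) == perm(n, r)
print(arrangements[:3])  # [('a', 'b', 'c'), ('a', 'b', 'd'), ('a', 'b', 'e')]
```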
Example
How many different arrangements of exactly two of the four letters a, b , c and d are possible?
Here n = 4 and r = 2, so there are ⁴P₂ = 4!/(4 − 2)! = 12 arrangements:
ab, ac, ad, ba, bc, bd, ca, cb, cd, da, db, dc.
Note that ab and ba both appear because they are distinct arrangements.
Example
Suppose 20 people enter a 100m race. How many ways are there of distributing a Gold, Silver and Bronze?
Here n = 20 and r = 3, so there are ²⁰P₃ = 20!/(20 − 3)! = 20 × 19 × 18 = 6840 ways.
Suppose instead that the order of selection does not matter. Then the number of combinations of r objects selected
from n objects is
ⁿCᵣ = n! / (r!(n − r)!).
Example
How many different ways are there of selecting exactly two of the four letters a, b , c and d ?
Here n = 4 and r = 2, so there are ⁴C₂ = 4!/(2! 2!) = 6 selections:
ab, ac, ad, bc, bd, cd.
Note that ba, selecting b and a, does not appear because it is the same as ab, selecting a and b .
Example
A football captain has to choose four other players to complete his team from a squad of 20. How many combinations
are possible?
²⁰C₄ = 20! / (4!(20 − 4)!) = (20 × 19 × 18 × 17) / (4 × 3 × 2 × 1) = 4845.
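The combination counts in these examples can be confirmed with the standard library; a sketch:

```python
from math import comb
from itertools import combinations

# Two of the four letters a, b, c, d: 4C2 = 6 unordered selections
assert comb(4, 2) == 6
assert list(combinations("abcd", 2)) == [
    ("a", "b"), ("a", "c"), ("a", "d"),
    ("b", "c"), ("b", "d"), ("c", "d"),
]

# Four players from a squad of 20: 20C4 = 4845
print(comb(20, 4))  # 4845
```

Note that `itertools.combinations`, unlike `itertools.permutations`, yields each selection only once, exactly as in the letter example above.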
Suppose now that we are again interested in permutations of n objects, but that there are r of one type and (n − r)
of a second type. Then, the number of such permutations is again
ⁿCᵣ = n! / (r!(n − r)!).
We shall see later in the module that these coefficients also arise when dealing with the binomial distribution, and
hence are sometimes called binomial coefficients.