
Probability and Stochastic Processes for Engineers, 525.414, 1/4/05
Module 1
The Axioms of Probability

SET THEORY
We begin our discussion of set theory with a number of definitions and specifications of
nomenclature. A set is a collection of objects called elements. A subset, B, of a set A is a set such
that all of the elements of B are contained in A. The set notation is as follows:
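$$A = \{\zeta_1, \zeta_2, \ldots, \zeta_n\},$$

where the Greek letters zeta, $\zeta_i$, are the elements of the set A. Alternatively one can write

$$\zeta_i \in A; \quad \text{"$\zeta_i$ is an element of $A$"}$$

or

$$\zeta_i \notin A; \quad \text{"$\zeta_i$ is not an element of $A$"}$$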


We denote the empty set, the one that contains no elements, as $\{\}$. This is also known as the
null set. If a set, A, contains n elements, then the total number of subsets is $2^n$. This is without
regard to order. For example, if the set consists of $\{a, b\}$, then the possible subsets are
$\{\}$, $\{a\}$, $\{b\}$, $\{a, b\}$. Note that the subset $\{a, b\}$ is contained within A and is
indistinguishable from $\{b, a\}$.

For example, denote the faces of a die by $f_i$. These faces are elements of the set
$S = \{f_1, f_2, \ldots, f_6\}$. Here $n = 6$, thus S has $2^6 = 64$ subsets,

$$\{\},\ \{f_1\},\ \ldots,\ \{f_6\},\ \{f_1, f_2\},\ \ldots,\ \{f_1, f_2, f_3\},\ \ldots,\ S.$$

These subsets are represented by their elements.
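The subset count can be checked by brute force. The following is a minimal sketch, not part of the original notes; the function name `power_set` and the string labels for the die faces are illustrative assumptions.

```python
from itertools import chain, combinations

def power_set(elements):
    """Return every subset of `elements` as a list of frozensets."""
    items = list(elements)
    # Take all combinations of size 0, 1, ..., n and chain them together.
    subsets = chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1)
    )
    return [frozenset(s) for s in subsets]

# Hypothetical labels for the six faces of the die.
die = {"f1", "f2", "f3", "f4", "f5", "f6"}
die_subsets = power_set(die)

print(len(die_subsets))  # 64, i.e. 2**6
```

Because each subset is stored as a `frozenset`, order is ignored, so `{a, b}` and `{b, a}` are the same subset, as stated above.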


Another example is to toss a coin twice. The outcomes are the set $S = \{hh, ht, th, tt\}$.
There are $2^4 = 16$ subsets, for example

$A = \{\text{heads at first toss}\} = \{hh, ht\}$;
$B = \{\text{only one head}\} = \{th, ht\}$;
$C = \{\text{head at first toss, tail on second}\} = \{ht\}$;
$D = \{\text{no head or tail}\} = \{\}$, etc.
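As a small illustration (a sketch with assumed variable names, not taken from the notes), the four events of the two-toss experiment can be built by filtering the sample space on the property that defines each one:

```python
from itertools import product

# Sample space for two tosses: 'hh', 'ht', 'th', 'tt'.
S = {"".join(tosses) for tosses in product("ht", repeat=2)}

A = {s for s in S if s[0] == "h"}                    # heads at first toss
B = {s for s in S if s.count("h") == 1}              # only one head
C = {s for s in S if s == "ht"}                      # head first, tail second
D = {s for s in S if "h" not in s and "t" not in s}  # no head or tail

print(sorted(A), sorted(B), sorted(C), sorted(D))
```

The event D comes out empty because every outcome contains at least one head or one tail.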

These subsets are represented by their properties. The above examples dealt with discrete events;
an example of a continuum of events might be the set of all points on the square of dimension
$(0, T)$, where $S = \{\text{set of all points in square}\}$ (see sketch).

[Sketch: the square $0 \le x \le T$, $0 \le y \le T$ in the x-y plane]

A possible subset of S might be the points such that $x \le y$, the shaded portion in this sketch:

[Sketch: the square with the region $x < y$, above the line $x = y$, shaded]

Set Operations
We can represent sets graphically in terms of what are known as Venn diagrams. For
example the statement

$$A \subset B \quad \text{"$A$ is a subset of $B$"}$$

could be represented in terms of a Venn diagram as shown below

[Sketch: Venn diagram with the region A drawn inside the region B]

Properties of Set Operations

There are a number of properties associated with set operations, such as

Transitivity: If $C \subset B$ and $B \subset A$ then $C \subset A$
Equality: $A = B$ IFF $A \subset B$ and $B \subset A$

You should be able to demonstrate these properties using Venn diagrams. We also define a
number of operations:
Union: "$A + B$" or "$A \cup B$" or "$A \vee B$".

[Sketch: Venn diagram of $A \cup B$ within S]

This is the set whose elements are in set A or the set B, or in both sets A and B. This union
operation is associative and commutative:

Associative: $(A + B) + C = A + (B + C)$;
Commutative: $A + B = B + A$.

Note that if $B \subset A$, then $B \cup A = A$, and that $A \cup A = A$ and $A \cup \{\} = A$. We also make the
definition:

Intersection: "$AB$" or "$A \cap B$" or "$A \wedge B$".

[Sketch: Venn diagram of $A \cap B$]

This is the set with elements that are common to both A and B. This operation is commutative,
associative, and distributive:

Commutative: $AB = BA$;
Associative: $(A \cap B) \cap C = A \cap (B \cap C)$;
Distributive: $A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$.

Note that if $A \subset B$, then $A \cap B = A$. As a result, $A \cap A = A$, $\{\} \cap A = \{\}$, $A \cap S = A$.
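As a complement to the Venn diagrams, the union and intersection properties can be checked exhaustively on a small set. This is a sketch under stated assumptions: the three-element set `S` and the helper `subsets` are hypothetical, and Python's `|` and `&` operators stand in for union and intersection.

```python
from itertools import combinations, product

def subsets(universe):
    """All subsets of `universe`, each as a frozenset."""
    items = sorted(universe)
    return [frozenset(c)
            for k in range(len(items) + 1)
            for c in combinations(items, k)]

S = frozenset({1, 2, 3})

# Test every ordered triple (A, B, C) of subsets of S.
all_hold = all(
    (A | B) | C == A | (B | C)                # union is associative
    and A | B == B | A                        # union is commutative
    and (A & B) & C == A & (B & C)            # intersection is associative
    and A & B == B & A                        # intersection is commutative
    and A & (B | C) == (A & B) | (A & C)      # intersection distributes over union
    for A, B, C in product(subsets(S), repeat=3)
)

print(all_hold)  # True
```

An exhaustive check on one small set is not a proof, but it is a quick way to catch a mis-stated identity.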


In this course we will be using commas in very specific ways. Depending on the context,
they can mean very different things. See "A Note on the Use of Commas" on the Mathematical
Reference Tools page.
We now discuss a number of definitions. Two sets, A and B, are said to be mutually
exclusive or disjoint if they have no common elements. This property is written

$$AB = \{\}, \quad \text{or} \quad A \cap B = \{\}.$$

A partition of a set is a collection of mutually exclusive and exhaustive subsets of the original
set. For example, a partition of the set, S, would be denoted

$$[A_1, A_2, \ldots, A_n],$$

where

$$A_1 + A_2 + \cdots + A_n = S; \qquad A_i A_j = \{\} \ \text{ for } i \ne j.$$
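The two conditions for a partition (exhaustive and mutually exclusive) translate directly into a check. The function name `is_partition` and the die-face integers below are illustrative assumptions, not notation from the notes.

```python
def is_partition(blocks, S):
    """True if `blocks` are mutually exclusive and their union is S."""
    blocks = [set(b) for b in blocks]
    # Exhaustive: the union of all blocks recovers S.
    exhaustive = set().union(*blocks) == set(S)
    # Mutually exclusive: every pair of distinct blocks is disjoint.
    exclusive = all(
        blocks[i].isdisjoint(blocks[j])
        for i in range(len(blocks))
        for j in range(i + 1, len(blocks))
    )
    return exhaustive and exclusive

S = {1, 2, 3, 4, 5, 6}
print(is_partition([{1, 3, 5}, {2, 4, 6}], S))   # True: odd and even faces
print(is_partition([{1, 2, 3}, {3, 4, 5, 6}], S))  # False: the blocks share 3
```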

The complement, $\bar{A}$, of a set A is the set consisting of all elements of S that are not in A. It follows
that

$$A + \bar{A} = S; \qquad A\bar{A} = \{\}; \qquad \bar{S} = \{\}.$$

An important relationship is known as De Morgan's law, the statement of which is

$$\overline{A + B} = \bar{A}\,\bar{B}; \qquad \overline{AB} = \bar{A} + \bar{B}.$$

You can see a graphical depiction of this relationship in "De Morgan's Law" on the
Mathematical Reference Tools page. It follows directly from this law that if, in a set identity, unions and
intersections are interchanged and all sets are replaced by their complements, then the identity is
preserved.

original              replacement
$A$                   $\bar{A}$
$\bar{A}$             $A$
$\cup$ or $+$         $\cap$ or $\cdot$
$\cap$ or $\cdot$     $\cup$ or $+$
$S$                   $\{\}$
$\{\}$                $S$
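De Morgan's law and the interchange rule can be spot-checked numerically. The sketch below assumes a six-element space and three arbitrary events chosen only for illustration; it verifies both forms of the law and then checks that the distributive identity $A(B + C) = AB + AC$ still holds after the replacements are applied.

```python
S = frozenset(range(1, 7))     # illustrative space: die faces as integers
A = frozenset({2, 4, 6})
B = frozenset({1, 2, 3})
C = frozenset({3, 4})

def complement(X):
    """Elements of S that are not in X."""
    return S - X

# The two forms of De Morgan's law.
de_morgan_union = complement(A | B) == complement(A) & complement(B)
de_morgan_product = complement(A & B) == complement(A) | complement(B)

# An identity and the identity obtained by swapping unions with
# intersections and every set with its complement.
identity = A & (B | C) == (A & B) | (A & C)
dual = (complement(A) | (complement(B) & complement(C))
        == (complement(A) | complement(B)) & (complement(A) | complement(C)))

print(de_morgan_union, de_morgan_product, identity, dual)  # True True True True
```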

PROBABILITY SPACE

We define the following:

space, S: the certain event;
elements of S: experimental outcomes;
subsets of S: events.

As an example, for the casting of the die, the possible outcomes are the six faces of the die. The
space is denoted as

$$S = \{f_1, f_2, f_3, f_4, f_5, f_6\},$$

and there are $2^6 = 64$ possible subsets of this space.


As an exercise, you are encouraged to consider the following: For the experiment of tossing a
coin twice, where the possible outcomes are $\{hh, ht, th, tt\}$, enumerate all the possible subsets
(there are $2^4 = 16$ such subsets). Returning to the die, we may choose to define a subset as
$\{\text{even}\} = \{f_2, f_4, f_6\}$, where the event $\{\text{even}\}$ consists of three outcomes, $f_2$, $f_4$, $f_6$.

We define a trial as a single performance of an experiment. At each trial we observe a single
outcome, $\zeta_i$. The event A occurs during this trial if it contains the element (outcome) $\zeta_i$.

Axioms of the Theory of Probability


Definition: $P(A)$ = probability that we assign to the event A.

The Axioms of Probability

I     $P(A) \ge 0$
II    $P(S) = 1$
III   If $AB = \{\}$, then $P(A + B) = P(A) + P(B)$

Note that the last axiom deals with mutual exclusivity of the events A and B.
Two important properties that follow from these axioms are that

$$P\{\} = 0,$$

and, for any A,

$$P(A) = 1 - P(\bar{A}).$$

Proof of the second property relies on the fact that $P(A + \bar{A}) = 1$, i.e., the certain event (axiom
#2), and $P(A + \bar{A}) = P(A) + P(\bar{A})$ because of the mutual exclusivity of the two events (axiom
#3). The first property then follows by setting $A = S$. Note that $P(A) = 1$ does not imply that A is
the certain event and that $P(A) = 0$ does not imply that A is the empty set, $\{\}$ (see the
"Equality of events" discussion that follows).
Now in general, the probability of the union of two events is given by

$$P(A + B) = P(A) + P(B) - P(AB).$$

As an exercise, draw the Venn diagrams for the following proof of this relationship.
The proof is given by writing $A + B$ and $B$ as unions of mutually exclusive events:

$$A + B = A + \bar{A}B; \qquad B = AB + \bar{A}B.$$

From axiom #3,

$$P(A + B) = P(A) + P(\bar{A}B);$$
$$P(B) = P(AB) + P(\bar{A}B).$$

Eliminating $P(\bar{A}B)$ from these two equations yields the desired result.
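A numerical check of this result may help. The sketch below assumes a fair die with equally likely faces, an assumption made only for illustration (the notes have not assigned probabilities to the die at this point), and uses exact fractions to avoid rounding.

```python
from fractions import Fraction

S = frozenset(range(1, 7))     # die faces as integers

def P(event):
    # Assumption: all six outcomes are equally likely.
    return Fraction(len(event & S), len(S))

A = frozenset({2, 4, 6})       # "even"
B = frozenset({1, 2, 3})       # "three or less"
not_A = S - A

lhs = P(A | B)
rhs = P(A) + P(B) - P(A & B)

# The two decompositions used in the proof.
step_1 = P(A | B) == P(A) + P(not_A & B)
step_2 = P(B) == P(A & B) + P(not_A & B)

print(lhs, rhs)  # 5/6 5/6
```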


Example
Consider a subset of the event A, $B \subset A$. Then $A = B + A\bar{B}$, with $B$ and $A\bar{B}$ mutually
exclusive, so $P(A) = P(B) + P(A\bar{B})$. But from axiom #1, $P(A\bar{B}) \ge 0$, so

$$P(A) \ge P(B).$$

Equality of events
The events A and B are equal if A and B consist of the same elements. The events A
and B are equal with probability 1 if $P(A\bar{B} + \bar{A}B) = 0$. What are the implications of this? Well,
if $P(A) = P(B)$ then A and B are equal in probability. This does not mean that A and B are
equal. In fact, they may be mutually exclusive. Moreover, "$P(A) = P(B)$" tells us nothing about
$P(AB)$. But if $P(A\bar{B} + \bar{A}B) = 0$, then $P(A\bar{B}) = P(\bar{A}B) = 0$ and we conclude that

$$P(A) = P(B) = P(AB).$$

Class, $\mathcal{F}$, of events
For one reason or another, we may not be interested in all possible subsets of a set. For
example, in the casting of a die, we may be interested only in the showing of an even number of
spots. In this case, the class, $\mathcal{F}$, of subsets that we would consider consists of the events
$\{\}$, $\{\text{even}\}$, $\{\text{odd}\}$, $S$. We thus would assign probabilities only to these four events. In other
cases, it may not be possible to assign probabilities to all possible outcomes, e.g., the probability
of choosing a particular real number (the set of real numbers is uncountably infinite). We thus
make the following definition:

Field: A field, $\mathcal{F}$, is a non-empty class of sets such that

If $A \in \mathcal{F}$ then $\bar{A} \in \mathcal{F}$.
If $A \in \mathcal{F}$ and $B \in \mathcal{F}$ then $A \cup B \in \mathcal{F}$.

It follows that

If $A \in \mathcal{F}$ and $B \in \mathcal{F}$ then $A \cap B \in \mathcal{F}$.

A field also contains the certain event, S, as well as the impossible event, $\{\}$.
Finally, all sets that can be written as unions or intersections of finitely many sets in $\mathcal{F}$ are also in $\mathcal{F}$.
Borel Fields
Suppose $A_1, A_2, \ldots, A_n, \ldots$ is an infinite sequence of sets in $\mathcal{F}$. If all unions and intersections
of these sets also belong to $\mathcal{F}$, then $\mathcal{F}$ is a Borel field. The class of all subsets of S is a Borel
field. Suppose $\mathcal{C}$ is a class of subsets of S, but is not a Borel field. If we attach other subsets of
S, we can form a field with $\mathcal{C}$ as a subset. In fact there exists a smallest Borel field containing all
the elements of $\mathcal{C}$.
Example

$$S = \{a, b, c, d\}; \qquad \mathcal{C}: \ \{a\}, \{b\}.$$

The smallest (Borel) field containing $\{a\}$, $\{b\}$ consists of the sets

$$\{\},\ \{a\},\ \{b\},\ \{a, b\},\ \{c, d\},\ \{b, c, d\},\ \{a, c, d\},\ S.$$
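For a finite space, the smallest field containing a given class can be found by repeatedly adding complements and pairwise unions until nothing new appears. The following sketch uses a hypothetical helper name, `generated_field`, and applies only to finite S, where a field and a Borel field coincide.

```python
def generated_field(S, generators):
    """Smallest class containing `generators` that is closed under
    complement and union (finite S only)."""
    S = frozenset(S)
    field = {frozenset(), S} | {frozenset(g) for g in generators}
    changed = True
    while changed:
        changed = False
        current = list(field)
        for A in current:
            # Candidates: the complement of A and its union with every member.
            candidates = [S - A] + [A | B for B in current]
            for X in candidates:
                if X not in field:
                    field.add(X)
                    changed = True
    return field

field = generated_field("abcd", [{"a"}, {"b"}])
for event in sorted(field, key=lambda e: (len(e), sorted(e))):
    print(sorted(event))
```

Closure under intersection does not need to be enforced separately; it follows from closure under complement and union by De Morgan's law.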


In probability theory, events are certain subsets of S forming a Borel field. This allows us to
assign probabilities to finite unions and intersections of events and also to their limits.
Specifically, for mutually exclusive events $A_1, A_2, \ldots, A_n$, repeated application of the third axiom leads to

$$P(A_1 + A_2 + \cdots + A_n) = P(A_1) + P(A_2) + \cdots + P(A_n).$$

Extension to infinitely many sets does not follow directly. Rather, we cite the axiom of infinite
additivity:

Axiom of Infinite Additivity, IIIa: If the events $A_1, A_2, \ldots$ are mutually exclusive, then

$$P(A_1 + A_2 + \cdots) = P(A_1) + P(A_2) + \cdots$$

Axiomatic Definition of an Experiment


We define an experiment in terms of the following concepts:
1) the set, S, of all experimental outcomes
2) the Borel field of all events in S
3) the probabilities of these events
An example is the coin toss, $S = \{h, t\}$. The events are the four sets

$$\{\},\ \{h\},\ \{t\},\ \{h, t\}.$$

Question: Is this a Borel field? Ans: Yes, it contains complements, unions and intersections of the
elementary events. We assign probabilities as follows:

$$P(\{\}) = 0; \qquad P(h) = p; \qquad P(t) = q; \qquad P(S) = p + q = 1.$$

These concepts have addressed discrete events, but can be extended to the real line. For example,
suppose $\alpha(x)$ is a function such that

$$\int_{-\infty}^{\infty} \alpha(x)\,dx = 1; \qquad \alpha(x) \ge 0.$$

We can define the probability of the event $\{x \le x_i\}$ by the integral

$$P\{x \le x_i\} = \int_{-\infty}^{x_i} \alpha(x)\,dx.$$

Likewise,

$$P\{x_1 < x \le x_2\} = \int_{x_1}^{x_2} \alpha(x)\,dx.$$

Moreover, because $\{x \le x_1\}$ and $\{x_1 < x \le x_2\}$ are mutually exclusive, and

$$\{x \le x_1\} \cup \{x_1 < x \le x_2\} = \{x \le x_2\},$$

then

$$P\{x \le x_1\} + P\{x_1 < x \le x_2\} = P\{x \le x_2\}.$$
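To make the additivity statement concrete, one can pick a specific density and integrate numerically. Everything in this sketch is an assumption made for illustration: the density $e^{-x}$ for $x \ge 0$ (zero otherwise), the midpoint-rule integrator, and the values of $x_1$ and $x_2$. None of it comes from the notes.

```python
import math

def alpha(x):
    """Hypothetical density: exp(-x) for x >= 0, zero otherwise."""
    return math.exp(-x) if x >= 0 else 0.0

def integrate(f, a, b, n=20_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def prob_le(x):
    # P{x <= x_i}; this density vanishes below 0, so integration starts there.
    return integrate(alpha, 0.0, x) if x > 0 else 0.0

def prob_between(x1, x2):
    # P{x1 < x <= x2}
    return integrate(alpha, x1, x2)

x1, x2 = 0.5, 2.0
lhs = prob_le(x1) + prob_between(x1, x2)
rhs = prob_le(x2)
print(abs(lhs - rhs) < 1e-8)  # True
```

For this particular density the integral has the closed form $1 - e^{-x}$, which gives an independent check on the numerical result.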

CONDITIONAL PROBABILITY

The probability of a given event provided that another event has occurred is called a conditional
probability. For example, the conditional probability of the event A, given that event M has
occurred, is written

$$P(A \mid M) = \frac{P(AM)}{P(M)}; \qquad P(M) \ne 0.$$

Some properties of this relationship follow: If M is a subset of A ($M \subset A$), then $P(A \mid M) = 1$
(because $AM = M$). If on the other hand, A is a subset of M, then

$$P(A \mid M) = \frac{P(A)}{P(M)} \ge P(A)$$

(because $AM = A$).
Question: Do conditional probabilities satisfy the Axioms of Probability?
1) Because $P(AM) \ge 0$ and $P(M) > 0$, then $P(A \mid M) \ge 0$: 1st Axiom OK
2) Because $M \subset S$ (certain event), $SM = M$; then $P(S \mid M) = 1$: 2nd Axiom OK
3) Suppose A and B are mutually exclusive events. Then AM and BM are also mutually
exclusive:

$$P((A + B) \mid M) = \frac{P((A + B)M)}{P(M)} = \frac{P(AM) + P(BM)}{P(M)} = P(A \mid M) + P(B \mid M):$$

3rd Axiom OK

We conclude that all results involving probabilities hold also for conditional probabilities.
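The three checks can be repeated on a concrete case. This sketch assumes a fair die and conditions on the event "even"; the helper names `P` and `P_given` and the choice of events are illustrative only.

```python
from fractions import Fraction

S = frozenset(range(1, 7))     # die faces as integers

def P(event):
    # Assumption: a fair die, all faces equally likely.
    return Fraction(len(event), len(S))

def P_given(A, M):
    """Conditional probability P(A | M) = P(AM) / P(M)."""
    if P(M) == 0:
        raise ValueError("P(M) must be nonzero")
    return P(A & M) / P(M)

M = frozenset({2, 4, 6})       # conditioning event: "even"
A = frozenset({2})
B = frozenset({4, 5})          # A and B are mutually exclusive

axiom_1 = P_given(A, M) >= 0
axiom_2 = P_given(S, M) == 1
axiom_3 = P_given(A | B, M) == P_given(A, M) + P_given(B, M)

print(P_given(A, M), P_given(B, M), P_given(A | B, M))  # 1/3 1/3 2/3
```

Here A is a subset of M, so the example also shows $P(A \mid M) = P(A)/P(M) = 1/3$, which exceeds $P(A) = 1/6$.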
Total Probability and Bayes' Theorem (See biographical note on the Glossary page)
For $U = [A_1, A_2, \ldots, A_n]$ a partition of S and B an arbitrary event,

$$P(B) = P(B \mid A_1) P(A_1) + \cdots + P(B \mid A_n) P(A_n). \qquad (1)$$

Proof:

$$B = BS = B(A_1 + A_2 + \cdots + A_n) = BA_1 + BA_2 + \cdots + BA_n.$$

The terms $BA_i$ are mutually exclusive because the $A_i$ form a partition. Therefore

$$P(B) = P(BA_1) + P(BA_2) + \cdots + P(BA_n), \qquad (2)$$

and Eq. 1 follows because

$$P(BA_i) = P(B \mid A_i) P(A_i).$$

Furthermore, because

$$P(BA_i) = P(A_i \mid B) P(B),$$

equating the two expressions for $P(BA_i)$ gives

$$P(A_i \mid B) = P(B \mid A_i) \frac{P(A_i)}{P(B)},$$

and substituting Eq. 1 for $P(B)$ we have:

Bayes' Theorem

$$P(A_i \mid B) = \frac{P(B \mid A_i) P(A_i)}{P(B \mid A_1) P(A_1) + \cdots + P(B \mid A_n) P(A_n)}.$$

Nomenclature:
$P(A_i)$: a priori probabilities (before the fact)
$P(A_i \mid B)$: a posteriori probabilities (after the fact)

Example
Consider four bins containing the indicated total number of components, some good and some
defective (see table). The experiment consists of randomly picking a bin and from it choosing
one component. What is the probability that the chosen component is defective, $P(D)$?

Bin    Total    Good (g)    Defective (d)
B1     2000     1900        100
B2      500      300        200
B3     1000      900        100
B4     1000      900        100

Because the chance of choosing each bin is equal,

$$P(B_1) = P(B_2) = P(B_3) = P(B_4) = \frac{1}{4}.$$

From inspection, we see that the conditional probabilities of defective components are

$$P(D \mid B_1) = \frac{100}{2000} = 0.05; \qquad P(D \mid B_2) = \frac{200}{500} = 0.40;$$
$$P(D \mid B_3) = \frac{100}{1000} = 0.10; \qquad P(D \mid B_4) = \frac{100}{1000} = 0.10.$$

And since $[B_1, B_2, B_3, B_4]$ forms a partition of S,

$$P(D) = P(D \mid B_1) P(B_1) + \cdots + P(D \mid B_4) P(B_4)$$
$$= 0.05 \cdot \frac{1}{4} + 0.40 \cdot \frac{1}{4} + 0.10 \cdot \frac{1}{4} + 0.10 \cdot \frac{1}{4},$$
$$P(D) = 0.1625.$$

Now suppose that we examine the selected component and find that it is defective. What is the
probability that it came from bin #2, i.e., what is $P(B_2 \mid D)$?
From the above, we have

$$P(D) = 0.1625; \qquad P(D \mid B_2) = 0.40; \qquad P(B_2) = 0.25.$$

Furthermore, $P(D) P(B_2 \mid D) = P(B_2) P(D \mid B_2)$,
or rearranging this equation, we have

$$P(B_2 \mid D) = \frac{P(B_2) P(D \mid B_2)}{P(D)} = \frac{(0.25)(0.40)}{0.1625} = 0.615.$$

The a priori probability of selecting bin #2 is 0.25, and the a posteriori probability of a defective
part having come from bin #2 is 0.615.
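The bin calculation can be reproduced in a few lines. This is a sketch using the bin contents given in the example; the dictionary layout and variable names are assumptions, and exact fractions are used so the intermediate values are not rounded.

```python
from fractions import Fraction

# (total components, defective components) for each bin, from the example.
bins = {
    "B1": (2000, 100),
    "B2": (500, 200),
    "B3": (1000, 100),
    "B4": (1000, 100),
}

# A priori probabilities: each bin is equally likely to be picked.
prior = {name: Fraction(1, len(bins)) for name in bins}

# P(D | B_i): fraction of defective components in each bin.
likelihood = {name: Fraction(d, total) for name, (total, d) in bins.items()}

# Total probability, Eq. 1.
p_defective = sum(likelihood[b] * prior[b] for b in bins)

# Bayes' theorem: a posteriori probability of each bin given a defective part.
posterior = {b: likelihood[b] * prior[b] / p_defective for b in bins}

print(float(p_defective))                  # 0.1625
print(round(float(posterior["B2"]), 3))    # 0.615
```

The a posteriori probabilities sum to 1, since the bins form a partition.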

Independence
The events A and B are said to be independent if

$$P(AB) = P(A) P(B).$$

Example
Consider two trains, X and Y, arriving in a station between 8 AM and 8:20 AM. Train X
stops for 4 min and train Y for 5 min. We assume that the arrival times are independent. The
space of arrival times is illustrated in the sketch.

[Sketch: space of arrival times, the square $0 \le x \le 20$, $0 \le y \le 20$, with X on the horizontal axis and Y on the vertical axis]

If we define the event $A = \{\text{train X arrives in interval } (t_1, t_2)\}$, then it seems intuitive that the
probability of this event be proportional to the length of this interval;

$$P(A) = \frac{t_2 - t_1}{20} = P\{t_1 \le X \le t_2\}.$$

Similarly for train Y we define $B = \{\text{train Y arrives in interval } (t_3, t_4)\} = \{t_3 \le Y \le t_4\}$,
and infer that

$$P(B) = \frac{t_4 - t_3}{20}.$$

These events are illustrated in the following sketch:

[Sketch: the vertical strip $t_1 \le x \le t_2$ (event A) and the horizontal strip $t_3 \le y \le t_4$ (event B) in the square of arrival times]

Note that this may appear to be a Venn diagram, but it is not. Why? Based on our assumption of
independence, we have the following:

$$P(AB) = P(A) P(B) = \frac{(t_2 - t_1)(t_4 - t_3)}{400}.$$

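Question: Why is this probability not the probability that both trains are in the station at the same
time?
Now consider the event $C = \{x \le y\}$, i.e., train X arrives before train Y. This region of the
space of arrival times is illustrated below.

[Sketch: the square of arrival times divided by the line $x = y$, with the region $y > x$ above the line and $y < x$ below it]

The region $y > x$ is a triangle of area 200 within the square of area 400, so from the preceding, it's obvious that

$$P(C) = \frac{200}{400}.$$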

Finally, we ask, "What is the probability that the trains meet?" (The commuter knows from
experience that this probability is zero.) To attack this problem consider the sketches below.
These time lines are, in fact, variations on the theme of Venn diagrams.

[Sketch: time lines showing train X in the station from $x$ to $x + 4$ and train Y in the station from $y$ to $y + 5$]

To have any overlap between these two periods, we must satisfy the two following constraints:

$$y < x + 4 \quad \text{(train Y arrives before train X departs)};$$
$$x \le y + 5 \quad \text{(train X arrives before train Y departs)}.$$

The combination of these two requirements gives the event $D = \{-4 < x - y \le 5\}$. This region is
shown shaded in the sketch below,

[Sketch: the square of arrival times with the band between the lines $y = x + 4$ and $y = x - 5$ shaded]

and the resulting probability of this event is

$$P\{-4 < x - y \le 5\} = \frac{159.5}{400}.$$
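The value 159.5/400 can be cross-checked two ways: by subtracting the two corner triangles from the square, and by simulation. The sketch below assumes arrival times that are independent and uniform on the 20-minute window, which is how the proportional-to-length probabilities above are read here; the function name `simulate`, the seed and the trial count are arbitrary choices.

```python
import random
from fractions import Fraction

T = 20                  # length of the arrival window, minutes
STOP_X, STOP_Y = 4, 5   # minutes each train stays in the station

# Exact: square minus the corner triangle above y = x + 4 (legs of 16)
# and the corner triangle below y = x - 5 (legs of 15).
area_D = (Fraction(T * T)
          - Fraction((T - STOP_X) ** 2, 2)
          - Fraction((T - STOP_Y) ** 2, 2))
p_exact = area_D / (T * T)

def simulate(n_trials, seed=0):
    """Monte Carlo estimate of the probability that the trains meet."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        x = rng.uniform(0, T)   # arrival time of train X
        y = rng.uniform(0, T)   # arrival time of train Y
        if -STOP_X < x - y <= STOP_Y:
            hits += 1
    return hits / n_trials

p_estimate = simulate(200_000)
print(float(area_D), float(p_exact))   # 159.5 0.39875
print(p_estimate)                      # close to 0.39875
```

The simulated value fluctuates with the seed and the number of trials, so only approximate agreement with the exact area ratio should be expected.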

End Module 1
