1. Random Experiments
The basic notion in probability is that of a random experiment: an experiment whose outcome
cannot be determined in advance, but is nevertheless still subject to analysis. Examples of random
experiments are:
(1) tossing a die,
(2) measuring the amount of rainfall in Brisbane in January,
(3) counting the number of calls arriving at a telephone exchange during a fixed time period,
(4) selecting a random sample of fifty people and observing the number of left-handers,
(5) choosing at random ten people and measuring their height.
Example 1. (Coin Tossing) The most fundamental stochastic experiment is the experiment where a
coin is tossed a number of times, say n times. Indeed, much of probability theory can be based on this
simple experiment, as we shall see in subsequent chapters.
1.1. Sample Space. Although we cannot predict the outcome of a random experiment with certainty
we usually can specify a set of possible outcomes. This gives the first ingredient in our model for a
random experiment.
Definition 1.1. The sample space of a random experiment is the set of all possible outcomes of the
experiment.
Examples of random experiments with their sample spaces are:
(1) Cast two dice consecutively,
Ω = {(1, 1), (1, 2), ..., (1, 6), (2, 1), ..., (6, 6)}.
(2) The lifetime of a machine (in days), Ω = R+ = positive real numbers.
(3) The number of arriving calls at an exchange during a specified time interval,
Ω = {0, 1, ...} = Z+.
(4) The heights of 10 selected people,
Ω = {(x1, ..., x10) : xi ≥ 0, i = 1, ..., 10} = R+^10.
Here (x1, ..., x10) represents the outcome that the height of the first selected person is x1, the height
of the second person is x2, etc.
Notice that for modelling purposes it is often easier to take the sample space larger than necessary.
For example the actual lifetime of a machine would certainly not span the entire positive real axis.
And the heights of the 10 selected people would not exceed 3 metres.
1.2. Events. Often we are not interested in a single outcome but in whether or not one of a group
of outcomes occurs. Such subsets of the sample space are called events. Events will be denoted by
capital letters A, B, C, .... We say that event A occurs if the outcome of the experiment is one of the
elements in A.
Examples of events are:
(1) The event that the sum of two dice is 10 or more,
A = {(4, 6), (5, 5), (5, 6), (6, 4), (6, 5), (6, 6)}.
(2) The event that a machine lives less than 1000 days,
A = [0, 1000).
(3) The event that out of fifty selected people, five are left-handed, A = {5}.
RANDOM VARIABLES
Example 2. (Coin Tossing) Suppose that a coin is tossed 3 times, and that we record every head and
tail (not only the number of heads or tails). The sample space can then be written as
Ω = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT},
where, for example, HTH means that the first toss is heads, the second tails, and the third heads.
An alternative sample space is the set {0, 1}3 of binary vectors of length 3, e.g., HTH corresponds to
(1,0,1), and THH to (0,1,1).
The event A that the third toss is heads is
A = {HHH, HTH, THH, TTH}.
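The sample space and event above are small enough to enumerate directly. The following sketch (our own illustration, not part of the notes) lists Ω for three tosses and the event that the third toss is heads:

```python
from itertools import product

# Sample space for three coin tosses: all strings over {H, T} of length 3.
omega = ["".join(seq) for seq in product("HT", repeat=3)]

# Event A: the third toss is heads.
A = [w for w in omega if w[2] == "H"]

print(len(omega))   # 8
print(sorted(A))    # ['HHH', 'HTH', 'THH', 'TTH']
```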
Since events are sets, we can apply the usual set operations to them:
(1) the set A ∪ B (A union B) is the event that A or B or both occur,
(2) the set A ∩ B (A intersection B) is the event that A and B both occur,
(3) the event A^c (A complement) is the event that A does not occur,
(4) if A ⊂ B (A is a subset of B) then event A is said to imply event B.
Two events A and B which have no outcomes in common, that is, A ∩ B = ∅, are called disjoint events.
Example 3. Suppose we cast two dice consecutively.
The sample space is Ω = {(1, 1), (1, 2), ..., (1, 6), (2, 1), ..., (6, 6)}. Let A = {(6, 1), ..., (6, 6)} be the
event that the first die is 6, and let B = {(1, 6), (2, 6), ..., (6, 6)} be the event that the second die is 6. Then
A ∩ B = {(6, 1), ..., (6, 6)} ∩ {(1, 6), ..., (6, 6)} = {(6, 6)} is the event that both dice are 6.
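Since events are just sets, the set operations above map directly onto set types in code. A small sketch of ours, reusing the two-dice events A and B of Example 3:

```python
from itertools import product

omega = set(product(range(1, 7), repeat=2))   # two dice cast consecutively
A = {(6, j) for j in range(1, 7)}             # event: first die is 6
B = {(i, 6) for i in range(1, 7)}             # event: second die is 6

print(A & B)       # {(6, 6)} -- both dice are 6
print(len(A | B))  # 11 outcomes where at least one die is 6
```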
2. Probability
The next ingredient in the model for a random experiment is the specification of the probability of
the events. It tells us how likely it is that a particular event will occur.
Definition 2.1. A probability P is a rule (function) which assigns a nonnegative number to each event,
and which satisfies the following axioms:
Axiom 1. P(A) ≥ 0.
Axiom 2. P(Ω) = 1.
Axiom 3. For any sequence A1, A2, ... of disjoint events we have P(∪i Ai) = Σi P(Ai).
For a countable sample space Ω = {a1, a2, ...}, the easiest way to specify a probability
is to specify first the probability pi of each elementary event {ai} and then to define
P(A) = Σ_{i: ai ∈ A} pi,
for all A ⊂ Ω.
Theorem 2.3. (Equilikely Principle) If Ω has a finite number of outcomes, and all are equally likely,
then the probability of each event A is defined as
P(A) = |A| / |Ω|.
Thus for such sample spaces the calculation of probabilities reduces to counting the number of outcomes
(in A and Ω).
When the sample space is not countable, for example Ω = R+, it is said to be continuous.
Example 5. We draw at random a point in the interval [0, 1]. Each point is equally likely to be drawn.
How do we specify the model for this experiment?
Sol. The sample space is obviously Ω = [0, 1], which is a continuous sample space. We cannot define
P via the elementary events {x}, x ∈ [0, 1], because each of these events must have probability 0 (!).
However, we can define P as follows: for each 0 ≤ a ≤ b ≤ 1, let
P([a, b]) = b − a.
This completely specifies P. In particular, we can find the probability that the point falls into any
(sufficiently nice) set A as the length of that set.
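A quick Monte Carlo sanity check of this model (our own sketch; the subinterval [0.25, 0.6] is an arbitrary choice): the empirical frequency of landing in [a, b] should approach b − a.

```python
import random

random.seed(1)              # fixed seed for reproducibility
a, b = 0.25, 0.6            # arbitrary subinterval of [0, 1]
n = 100_000
hits = sum(a <= random.random() <= b for _ in range(n))
print(hits / n)             # close to b - a = 0.35
```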
3. Random Variables
Specifying a model for a random experiment via a complete description of Ω and P may not always
be convenient or necessary. In practice we are only interested in various observations (i.e., numerical
measurements) of the experiment. We include these into our modelling process via the introduction of
random variables.
Formally a random variable is a function from the sample space to R. Here are concrete examples.
Example 6. Two balls are drawn in succession without replacement from an urn containing 4 red balls
and 3 black balls. The possible outcomes and the values x of the random variable X, where X is the
number of red balls, are
Sample Space   x
RR             2
RB             1
BR             1
BB             0
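The table can be checked by brute-force enumeration of all ordered draws (a sketch of ours, treating the seven balls as distinguishable):

```python
from fractions import Fraction
from itertools import permutations

balls = ["R"] * 4 + ["B"] * 3             # 4 red, 3 black
draws = list(permutations(range(7), 2))   # ordered draws without replacement

def reds(draw):
    return sum(balls[i] == "R" for i in draw)

pmf = {x: Fraction(sum(reds(d) == x for d in draws), len(draws)) for x in (0, 1, 2)}
print(pmf[2], pmf[1], pmf[0])             # 2/7 4/7 1/7
```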
Example 7. (Sum of two dice) Suppose we toss two fair dice and note their sum. If we throw the dice
one-by-one and observe each throw, the sample space is Ω = {(1, 1), ..., (6, 6)}. The function X, defined
by X(i, j) = i + j, is a random variable, which maps the outcome (i, j) to the sum i + j. For example
we take the set of all outcomes whose sum is 8. A natural notation for this set is to write {X = 8}.
Since this set has 5 outcomes, and all outcomes in Ω are equally likely, we have
P({X = 8}) = 5/36.
This notation is very suggestive and convenient. From a non-mathematical viewpoint we can interpret
X as a random variable, that is, a variable that can take on several values with certain
probabilities. In particular it is not difficult to check that
P({X = x}) = (6 − |7 − x|)/36, x = 2, ..., 12.
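Both P({X = 8}) = 5/36 and the closed form for P({X = x}) can be verified by enumerating the 36 outcomes (an illustrative check of ours):

```python
from fractions import Fraction
from itertools import product

omega = list(product(range(1, 7), repeat=2))   # 36 equally likely outcomes

p8 = Fraction(sum(i + j == 8 for i, j in omega), len(omega))
print(p8)                                      # 5/36

# Verify P({X = x}) = (6 - |7 - x|)/36 for every x = 2, ..., 12.
for x in range(2, 13):
    px = Fraction(sum(i + j == x for i, j in omega), len(omega))
    assert px == Fraction(6 - abs(7 - x), 36)
```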
Although random variables are, mathematically speaking, functions, it is often convenient to view
random variables as observations of a random experiment that has not yet been carried out. In other
words, a random variable is considered as a measurement that becomes available once we carry out the
random experiment, e.g., tomorrow. However, all the thinking about the experiment and measurements
can be done today. For example, we can specify today exactly the probabilities pertaining to the random
variables.
We usually denote random variables with capital letters from the last part of the alphabet, e.g.
X, X1, X2, ..., Y, Z. Random variables allow us to use natural and intuitive notations for certain events,
such as {X = 10}, {X > 1000}, etc.
We give some more examples of random variables without specifying the sample space.
The set of all possible values a random variable X can take is called the range of X. We further
distinguish between discrete and continuous random variables.
3.1. Discrete and Continuous Random Variables. Discrete random variables can only take isolated values. For example: a count can only take non-negative integer values.
Continuous random variables can take values in an interval. For example: rainfall measurements,
lifetimes of components, lengths, ... are (at least in principle) continuous.
4. Probability Distribution
Let X be a random variable. We would like to specify the probabilities of events such as {X = x}
and {a X b}.
If we can specify all probabilities involving X, we say that we have specified the probability distribution of X.
One way to specify the probability distribution is to give the probabilities of all events of the form
{X ≤ x}, x ∈ R. This leads to the following definition.
Definition 4.1. The cumulative distribution function (cdf ) of a random variable X is the function
F : R [0, 1] defined by
F(x) := P(X ≤ x), x ∈ R.
The following properties for F are a direct consequence of the three Axioms for Probability.
Theorem 4.2. 1. F is right-continuous: lim_{h↓0} F(x + h) = F(x),
2. lim_{x→−∞} F(x) = 0; lim_{x→∞} F(x) = 1,
3. F is increasing,
4. 0 ≤ F(x) ≤ 1.
Any function F with the above properties can be used to specify the distribution of a random
variable X. Suppose that X has cdf F . Then the probability that X takes a value in the interval (a, b]
(excluding a, including b) is given by
P(a < X ≤ b) = F(b) − F(a).
Namely, P(X ≤ b) = P({X ≤ a} ∪ {a < X ≤ b}), where the events {X ≤ a} and {a < X ≤ b} are
disjoint. Thus, by the sum rule:
F(b) = F(a) + P(a < X ≤ b),
which leads to the result above. Note however that
P(a ≤ X ≤ b) = F(b) − F(a) + P(X = a)
             = F(b) − F(a) + F(a) − lim_{h↓0} F(a − h).
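As a concrete instance of P(a < X ≤ b) = F(b) − F(a), take the cdf F(x) = 1 − e^(−x) for x ≥ 0 (an exponential cdf, chosen here purely for illustration):

```python
import math

def F(x):
    """cdf F(x) = 1 - e^{-x} for x >= 0, and 0 for x < 0."""
    return 1 - math.exp(-x) if x >= 0 else 0.0

a, b = 1.0, 2.0
p = F(b) - F(a)        # P(a < X <= b)
print(round(p, 4))     # 0.2325  (= e^{-1} - e^{-2})
```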
In practice we will specify the distribution of a random variable in a different way, whereby we make
the distinction between discrete and continuous random variables.
Consider, for example, the function
F(x) = 1/(1 + e^{−x}).
Then lim_{x→∞} F(x) = 1. Also
F′(x) = e^{−x}/(1 + e^{−x})^2 > 0,
so F is increasing.
Since Σ_{x=0}^{7} f(x) = 1, summing the probabilities gives
10k^2 + 9k − 1 = 0,
which gives
k = 1/10 or k = −1.
Since a probability cannot be negative, k = 1/10.
P(X < 6) = P(X = 0) + P(X = 1) + ... + P(X = 5) = 81/100.
P(X ≥ 6) = 1 − P(X < 6) = 1 − 81/100 = 19/100.
(iii) The distribution function is given in the following table.

x   P(X = x)    F(x) = P(X ≤ x)
0   0           0
1   k = 1/10    1/10
2   2k          k + 2k = 3k = 3/10
3   2k          3k + 2k = 5k = 5/10
4   3k          5k + 3k = 8k = 4/5
5   k^2         8k + k^2 = 81/100
6   2k^2        8k + k^2 + 2k^2 = 8k + 3k^2 = 83/100
7   7k^2 + k    9k + 10k^2 = 1
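The value k = 1/10 and the entries of this table can be verified exactly with rational arithmetic (a check of ours):

```python
from fractions import Fraction

k = Fraction(1, 10)
f = {0: Fraction(0), 1: k, 2: 2*k, 3: 2*k, 4: 3*k,
     5: k**2, 6: 2*k**2, 7: 7*k**2 + k}

assert sum(f.values()) == 1            # probabilities sum to 1

p_lt6 = sum(f[x] for x in range(6))
print(p_lt6, 1 - p_lt6)                # 81/100 19/100
```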
Example 12. If a car agency sells 50% of its inventory of a certain foreign car equipped with side
airbags, find a formula for the probability distribution of the number of cars with side airbags among
the next 4 cars sold by the agency. Also find F (x).
Sol. Since the probability of selling an automobile with side airbags is 0.5, the 2^4 = 16 points in the
sample space are equally likely to occur. Let X denote the number of car models with side airbags,
the probability distribution is
f(x) = P(X = x) = C(4, x) (1/2)^x (1/2)^{4−x} = (1/16) C(4, x), for x = 0, 1, 2, 3, 4.
Direct calculations give f(0) = 1/16, f(1) = 1/4, f(2) = 3/8, f(3) = 1/4, and f(4) = 1/16. Therefore,
F(0) = f(0) = 1/16,
F(1) = f(0) + f(1) = 5/16,
F(2) = f(0) + f(1) + f(2) = 11/16,
F(3) = f(0) + f(1) + f(2) + f(3) = 15/16,
F(4) = f(0) + f(1) + f(2) + f(3) + f(4) = 1.
Hence
F(x) = 0,     x < 0,
       1/16,  0 ≤ x < 1,
       5/16,  1 ≤ x < 2,
       11/16, 2 ≤ x < 3,
       15/16, 3 ≤ x < 4,
       1,     x ≥ 4.
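The pmf and cdf of Example 12 can be reproduced exactly with binomial coefficients (our sketch):

```python
from fractions import Fraction
from math import comb

f = {x: Fraction(comb(4, x), 16) for x in range(5)}         # binomial(4, 1/2) pmf
F = {x: sum(f[i] for i in range(x + 1)) for x in range(5)}  # cumulative sums

print(f[1], f[2])    # 1/4 3/8
print(F[2], F[3])    # 11/16 15/16
assert F[4] == 1
```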
It is often helpful to look at a probability distribution in graphic form. One might plot the points
(x, f(x)) of the previous Example. Instead of plotting the points (x, f(x)), we more frequently construct
rectangles, as in the second Figure. Here the rectangles are constructed so that their bases of equal width
are centered at each value x and their heights are equal to the corresponding probabilities given by
f(x). Such a Figure is called a probability histogram.
F(x) = 0,   x < 0,
       x/2, 0 ≤ x ≤ 2,
       1,   x > 2.
By differentiating F we find
f(x) = 1/2, 0 ≤ x ≤ 2,
       0,   otherwise.
Note that this density is constant on the interval [0, 2] (and zero elsewhere), reflecting that each point
in [0, 2] is equally likely. Note also that we have modeled this random experiment using a continuous
random variable and its pdf (and cdf).
Describing an experiment via a random variable and its pdf, pmf or cdf seems much easier than
describing the experiment by giving the probability space. In fact, we have not used a probability
space in the above examples.
Example. Consider the density
f(x) = x^2/3, −1 < x < 2 (and 0 elsewhere).
We check that
∫_{−1}^{2} f(x) dx = ∫_{−1}^{2} (x^2/3) dx = 8/9 + 1/9 = 1,
and
P(0 < X ≤ 1) = ∫_{0}^{1} (t^2/3) dt = 1/9.
For −1 ≤ x < 2, the cdf is
F(x) = ∫_{−1}^{x} f(t) dt = ∫_{−1}^{x} (t^2/3) dt = (x^3 + 1)/9.
Therefore
F(x) = 0,             x < −1,
       (x^3 + 1)/9,   −1 ≤ x < 2,
       1,             x ≥ 2.
We can also use the cumulative distribution function F(x) to find the probability P(0 < X ≤ 1):
P(0 < X ≤ 1) = F(1) − F(0) = 2/9 − 1/9 = 1/9.
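Since the antiderivative of t^2/3 is t^3/9, all three computations above reduce to evaluating t^3/9 at the endpoints, which we can check exactly (an illustration of ours):

```python
from fractions import Fraction

def G(t):
    """Antiderivative t^3/9 of the density f(t) = t^2/3."""
    return Fraction(t) ** 3 / 9

assert G(2) - G(-1) == 1               # total probability over (-1, 2)
assert G(1) - G(0) == Fraction(1, 9)   # P(0 < X <= 1)
print(G(1) - G(0))                     # 1/9
```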
Example 15. The Department of Energy (DOE) puts projects out on bid and generally estimates what
a reasonable bid should be. Call the estimate b. The DOE has determined that the density function of
the winning (low) bid is
f(x) = 5/(8b), 2b/5 < x < 2b,
       0,      elsewhere.
Find F(x) and use it to determine the probability that the winning bid is less than the DOE's preliminary
estimate b.
Sol. For 2b/5 ≤ x ≤ 2b,
F(x) = ∫_{2b/5}^{x} 5/(8b) dt = 5x/(8b) − 1/4.
Thus
F(x) = 0,               x < 2b/5,
       5x/(8b) − 1/4,   2b/5 ≤ x < 2b,
       1,               x ≥ 2b.
To determine the probability that the winning bid is less than the preliminary bid estimate b, we have
P(X < b) = F(b) = 5/8 − 1/4 = 3/8.
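With exact arithmetic one can confirm F(2b/5) = 0, F(2b) = 1 and F(b) = 3/8 for any positive b (the value b = 100 below is an arbitrary choice of ours):

```python
from fractions import Fraction

b = Fraction(100)          # arbitrary positive estimate

def F(x):
    """cdf 5x/(8b) - 1/4 on [2b/5, 2b]."""
    return Fraction(5, 8) * x / b - Fraction(1, 4)

assert F(2 * b / 5) == 0 and F(2 * b) == 1
print(F(b))                # 3/8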
3. P(X = x, Y = y) = f(x, y).
For any region A in the xy-plane,
P[(X, Y) ∈ A] = Σ Σ_{(x,y) ∈ A} f(x, y).
Example 16. Two ballpoint pens are selected at random from a box that contains 3 blue pens, 2 red
pens, and 3 green pens. If X is the number of blue pens selected and Y is the number of red pens
selected, find
(a) the joint probability function f (x, y),
(b) P [(X, Y ) A], where A is the region {(x, y)|x + y 1}.
Sol. The possible pairs of values (x, y) are (0, 0), (0, 1), (1, 0), (1, 1), (0, 2), and (2, 0).
(a) Now, f(0, 1), for example, represents the probability that one red and one green pen are selected. The
total number of equally likely ways of selecting any 2 pens from the 8 is C(8, 2) = 28. The number of ways
of selecting 1 red from 2 red pens and 1 green from 3 green pens is C(2, 1) C(3, 1) = 6.
Hence, f(0, 1) = 6/28 = 3/14.
Similar calculations yield the probabilities for the other cases, which are presented in following Table.
Note that the probabilities sum to 1.
The joint probability distribution can be represented by the formula
f(x, y) = C(3, x) C(2, y) C(3, 2 − x − y) / C(8, 2),
for x = 0, 1, 2; y = 0, 1, 2; and 0 ≤ x + y ≤ 2.

f(x, y)   y = 0   y = 1   y = 2
x = 0     3/28    3/14    1/28
x = 1     9/28    3/14    0
x = 2     3/28    0       0
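The table and formula agree, and part (b) follows by summing f over {(x, y) : x + y ≤ 1}; the notes do not print the value of (b), so the number below is computed, not quoted (our sketch):

```python
from fractions import Fraction
from math import comb

def f(x, y):
    """Joint pmf C(3,x) C(2,y) C(3,2-x-y) / C(8,2) on 0 <= x+y <= 2."""
    if 0 <= x + y <= 2:
        return Fraction(comb(3, x) * comb(2, y) * comb(3, 2 - x - y), comb(8, 2))
    return Fraction(0)

assert sum(f(x, y) for x in range(3) for y in range(3)) == 1
pb = sum(f(x, y) for x in range(3) for y in range(3) if x + y <= 1)
print(pb)    # 9/14
```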
2. ∫∫ f(x, y) dx dy = 1,
3. P[(X, Y) ∈ A] = ∫∫_A f(x, y) dx dy, for any region A in the xy-plane.
Example 17. A privately owned business operates both a drive-in facility and a walk-in facility. On
a randomly selected day, let X and Y , respectively, be the proportions of the time that the drive-in and
the walk-in facilities are in use, and suppose that the joint density function of these random variables
is
f(x, y) = (2/5)(2x + 3y), 0 ≤ x ≤ 1, 0 ≤ y ≤ 1,
          0, elsewhere.
(a) Verify that f(x, y) is a pdf.
(b) Find P [(X, Y ) A], where A = {(x, y)|0 < x < 1/2, 1/4 < y < 1/2}.
Sol. (a) The integration of f(x, y) over the whole region is
∫∫ f(x, y) dx dy = ∫_0^1 ∫_0^1 (2/5)(2x + 3y) dx dy
                 = ∫_0^1 (2/5)(1 + 3y) dy
                 = 1.
(b) We write
P[(X, Y) ∈ A] = P(0 < X < 1/2, 1/4 < Y < 1/2)
             = ∫_{1/4}^{1/2} ∫_0^{1/2} (2/5)(2x + 3y) dx dy
             = ∫_{1/4}^{1/2} (1/10 + 3y/5) dy
             = 13/160.
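Both integrals can be sanity-checked numerically with a midpoint rule, which happens to be exact here because the integrand is linear in x and in y (an illustrative sketch of ours):

```python
def f(x, y):
    return 0.4 * (2 * x + 3 * y)   # (2/5)(2x + 3y) on the unit square

def integrate(g, x0, x1, y0, y1, n=200):
    """Midpoint rule on an n-by-n grid over [x0, x1] x [y0, y1]."""
    hx, hy = (x1 - x0) / n, (y1 - y0) / n
    return sum(g(x0 + (i + 0.5) * hx, y0 + (j + 0.5) * hy)
               for i in range(n) for j in range(n)) * hx * hy

print(round(integrate(f, 0, 1, 0, 1), 6))          # 1.0
print(round(integrate(f, 0, 0.5, 0.25, 0.5), 6))   # 0.08125  (= 13/160)
```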
5.1. Marginal Distributions. Given the joint probability distribution f (x, y) of the discrete random
variables X and Y , the probability distribution g(x) of X alone is obtained by summing f (x, y) over the
values of Y . Similarly, the probability distribution h(y) of Y alone is obtained by summing f (x, y) over
the values of X. We define g(x) and h(y) to be the marginal distributions of X and Y , respectively.
When X and Y are continuous random variables, summations are replaced by integrals. We can now
make the following general definition.
Definition 5.3. The marginal distributions of X alone and of Y alone are
g(x) = Σ_y f(x, y)   and   h(y) = Σ_x f(x, y)
in the discrete case, and
g(x) = ∫ f(x, y) dy   and   h(y) = ∫ f(x, y) dx
in the continuous case.
Sol. By definition,
g(x) = ∫ f(x, y) dy = ∫_0^1 (2/5)(2x + 3y) dy = (4x + 3)/5, 0 ≤ x ≤ 1,
h(y) = ∫ f(x, y) dx = ∫_0^1 (2/5)(2x + 3y) dx = 2(1 + 3y)/5, 0 ≤ y ≤ 1.
Example. Consider the joint density f(x, y) = 10xy^2, 0 < x < y < 1 (and 0 elsewhere).
(a) The marginal density of X is
g(x) = ∫ f(x, y) dy = ∫_x^1 10xy^2 dy = (10x/3)(1 − x^3), 0 < x < 1.
The conditional density of Y given X = x is
f(y|x) = f(x, y)/g(x) = 10xy^2 / [(10x/3)(1 − x^3)] = 3y^2/(1 − x^3), x < y < 1.
(b) P(Y > 1/2 | X = 0.25) = ∫_{1/2}^1 f(y|x = 0.25) dy = ∫_{1/2}^1 3y^2/(1 − 0.25^3) dy = 8/9.
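The conditional probability 8/9 follows directly from the antiderivative y^3/(1 − x^3) of the conditional density; a one-line check of ours:

```python
x = 0.25
# Antiderivative of 3y^2/(1 - x^3) in y is y^3/(1 - x^3); evaluate on [1/2, 1].
p = (1.0 ** 3 - 0.5 ** 3) / (1 - x ** 3)
print(round(p, 6))    # 0.888889, i.e. 8/9
```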
Definition 5.5. Let X and Y be two random variables, discrete or continuous, with joint probability
distribution f (x, y) and marginal distributions g(x) and h(y), respectively. The random variables X
and Y are said to be statistically independent if and only if
f (x, y) = g(x)h(y)
for all (x, y) within their range.
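For instance, in two independent tosses of a fair coin the factorization f(x, y) = g(x)h(y) can be checked outcome by outcome (a toy illustration of ours, not an example from the notes):

```python
from fractions import Fraction
from itertools import product

omega = list(product((0, 1), repeat=2))   # two fair coin tosses

def P(event):
    """Equilikely probability of an event (a predicate on outcomes)."""
    return Fraction(sum(1 for w in omega if event(w)), len(omega))

for x, y in product((0, 1), repeat=2):
    joint = P(lambda w: w == (x, y))
    marg = P(lambda w: w[0] == x) * P(lambda w: w[1] == y)
    assert joint == marg                  # f(x, y) = g(x) h(y)
print("X and Y are independent")
```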
Exercises
(1) Let W be a random variable giving the number of heads minus the number of tails in three
tosses of a coin. List the elements of the sample space for the three tosses of the coin and to
each sample point assign a value w of W .
(2) A coin is flipped until 3 heads in succession occur. List only those elements of the sample space
that require 6 or fewer tosses. Is this a discrete sample space? Explain.
(3) Determine the value c so that each of the following functions can serve as a probability distribution of the discrete random variable X:
(a) f(x) = c(x^2 + 4), for x = 0, 1, 2, 3;
(b) f(x) = c C(2, x) C(3, 3 − x), for x = 0, 1, 2.
(4) Suppose that you are offered a deal based on the outcome of a roll of a die. If you roll a 6, you
win $10. If you roll a 4 or 5, you win $5. If you roll a 1, 2, or 3, you pay $6. Define the Random
Variable X. List the values that X may take on and construct a probability mass function.
(5) The shelf life, in days, for bottles of a certain prescribed medicine is a random variable having
the density function
f(x) = 20000/(x + 100)^3, x > 0,
       0, elsewhere.
Find the probability that a bottle of this medicine will have a shelf life of
(a) at least 200 days;
(b) anywhere from 80 to 120 days.
(6) An investment firm offers its customers municipal bonds that mature after varying numbers of
years. Given that the cumulative distribution function of T , the number of years to maturity
for a randomly selected bond, is
F(t) = 0,   t < 1,
       1/4, 1 ≤ t < 3,
       1/2, 3 ≤ t < 5,
       3/4, 5 ≤ t < 7,
       1,   t ≥ 7.
Find (a) P(T = 5); (b) P(T > 3); (c) P(1.4 < T < 6); (d) P(T ≤ 5 | T ≥ 2).
(7) A shipment of 7 television sets contains 2 defective sets. A hotel makes a random purchase of
3 of the sets. If X is the number of defective sets purchased by the hotel, find the probability
distribution of X. Express the results graphically as a probability histogram. Also find the
cumulative distribution function of the random variable X representing the number of defectives
and construct a graph.
(8) The waiting time, in hours, between successive speeders spotted by a radar unit is a continuous
random variable with cumulative distribution function
F(x) = 0,             x < 0,
       1 − e^{−8x},   x ≥ 0.
Find the probability of waiting less than 12 minutes between successive speeders
(a) using the cumulative distribution function of X; (b) using the probability density function
of X.
(9) Consider the density function
f(x) = k√x, 0 < x < 1,
       0, elsewhere.
(a) Evaluate k. (b) Find F (x) and use it to evaluate P (0.3 < X < 0.6).
(10) From a box containing 4 black balls and 2 green balls, 3 balls are drawn in succession, each
ball being replaced in the box before the next draw is made. Find the probability distribution
for the number of green balls.
(11) The probability distribution of a random variable X is given by
f(x) = k sin(πx/5), 0 ≤ x ≤ 5.
Determine the constant k, median, and quartiles of the distribution.
(12) The time to failure in hours of an important piece of electronic equipment used in a manufactured DVD player has the density function
f(x) = (1/2000) exp(−x/2000), x ≥ 0,
       0, x < 0.
(a) Find F (x).
(b) Determine the probability that the component (and thus the DVD player) lasts more than
1000 hours before the component needs to be replaced.
(c) Determine the probability that the component fails before 2000 hours.
(13) Let X be a continuous random variable with pdf given by
f(x) = ax,        0 ≤ x < 1,
       a,         1 ≤ x < 2,
       −ax + 3a,  2 ≤ x < 3,
       0,         otherwise.
(i) Find a, (ii) determine cdf F (x), (iii) if x1 , x2 and x3 are three independent observations
from X, what is the probability that exactly one of these numbers is larger than 1.5.
(14) Suppose that the life in hours of a certain part of radio tube is a continuous random variable
X with pdf given by
f(x) = 100/x^2, x > 100,
       0, otherwise.
(i) What is the probability that all three such tubes in a given radio set will have to be
replaced during the first 150 hours of operation?
(ii) What is the probability that none of the three original tubes will have to be replaced
during that 150 hours of operation?
(iii) What is the probability that a tube will last less than 200 hours if it is known that the tube is
still functioning after 150 hours of service?
(15) If the joint probability distribution of X and Y is given by
f(x, y) = (x + y)/30, for x = 0, 1, 2, 3; y = 0, 1, 2,
find (a) P(X ≤ 2, Y = 1); (b) P(X > 2, Y ≤ 1); (c) P(X > Y); (d) P(X + Y = 4).
(16) Given the joint density function
f(x, y) = x(1 + 3y^2)/4, 0 < x < 2, 0 < y < 1,
          0, otherwise.
Find g(x), h(y), f (x|y), and evaluate P (1/4 < X < 1/2 | Y = 1/3).