
1

STAT 12049 Week 7


Discrete random variables
and
discrete distributions
2
Objectives
On completion of this week's material you
should be able to:
- define a random variable, a discrete random
  variable and a probability distribution
- describe conditions under which the
  binomial distribution should be used
- describe conditions under which the Poisson
  distribution should be used
- calculate probabilities using the binomial and
  Poisson distributions
3
Discrete random variables and discrete
distributions
It is often argued that a significant problem
with management is their failure to understand
uncertainty and variation in a system or
process.
This uncertainty needs to be incorporated into
any analysis where decisions regarding people
and processes are made.
Over the next two weeks we will discuss how
to model such uncertainty.
4
Deming's Red Bead Experiment
This experiment is a useful way of illustrating
how causes are often associated with certain
occurrences when no association is justified.
You are given a jar with a large number of
beads in it: 90% are white and 10% are red.
This jar represents a production process
where a white bead is an acceptable item
and a red bead is a defect.
5
The results of the process are simulated by
each participant taking repeated samples of 50
beads (using a paddle).
The first participant, Joe, finds 3 red beads
among his 50. He is praised for a job well
done, since this is better than the 10%
average.
Mary now has her turn and finds 2 red beads
among the 50. Her defect rate is considerably
smaller and so she is promoted to line
supervisor.
6
John finds 8 defectives in the 50 and is
scolded for his poor performance.
This illustrates how easy it is to blame
workers for problems with the system.
Although this experiment seems a little
contrived, this practice occurs frequently in
the real world.
7
Management in this experiment has missed
the fact that the defects are inherent in the
process and the outcomes are generated by the
random draws from the beads. The workers
have no influence!
There are, of course, instances when
workers are to blame for defects (errors,
sickness, inattentiveness etc) but usually
these cases are rare compared to the
number of defects.
8
Most variations and defects are a result of
the system, so it is the system that needs
improvement; the workers do not need
reprimanding.
Deming believes that 94% of problems
belong to the system and thus are the
responsibility of management and only 6%
are due to the operator.
9
We often need some way of measuring
how likely a certain event is. This week
we begin to do this.
For example in the Red Bead Experiment,
it is highly unlikely that 15 or more defects
will occur in one sample but not unlikely
for 7 or more defects to occur.
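The claim above can be checked numerically. The following is a minimal sketch, assuming each of the 50 beads in a sample is red with probability 0.10 independently of the others (reasonable when the jar is large), so that the number of red beads follows the binomial distribution covered later in this week's material. The function names `binomial_pmf` and `prob_at_least` are illustrative and not part of the course material.

```python
from math import comb


def binomial_pmf(x, n, prob_success):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * prob_success**x * (1 - prob_success) ** (n - x)


def prob_at_least(k, n, prob_success):
    """P(X >= k), found by summing the upper tail of the distribution."""
    return sum(binomial_pmf(x, n, prob_success) for x in range(k, n + 1))


# Assumption: 50 beads per sample, each red with probability 0.10.
p_7_or_more = prob_at_least(7, 50, 0.10)
p_15_or_more = prob_at_least(15, 50, 0.10)

print(f"P(7 or more red beads)  = {p_7_or_more:.4f}")
print(f"P(15 or more red beads) = {p_15_or_more:.6f}")
```

Under these assumptions, 7 or more red beads turns up in roughly one sample out of every four or five, while 15 or more is extremely rare.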
10
A random variable is a variable whose
outcome is uncertain.
We can often predict the value that a random
variable will take, before the activity that leads
to this variable occurs.
For example we are uncertain about the
weight of a box of cereal until we actually
measure it, but past experience may allow us
to predict the weight with a degree of
certainty.
11
A random variable that can only take on
distinct values (often whole numbers) is a
discrete random variable.
We denote a random variable with a capital
letter such as X.
The possible outcomes of this random
variable are indicated by lower case letters.
12
If, for example, a random variable represents
the number of flaws in a product, then its
possible outcomes will be:
x=0 (no flaws),
x=1 (exactly 1 flaw),
x=2 (exactly 2 flaws) etc.
If a random variable is the outcome of a quality
inspection, then its possible values could be:
x=0 (good) and
x=1 (defective).
13
Probabilities are always non-negative numbers
between 0 and 1.
The sum of the probabilities for all possible
outcomes must always be 1.
Often these values come from past
experience.
For example, if we know that in the past 98%
of items have been acceptable and 2%
defective, then the probabilities are:
P(item is good)=P(X=0)=0.98 and
P(item is defective)=P(X=1)=0.02.
14
Example
Let X be the outcome of a roll of a fair die.
Then we know that
P(X=1) = P(X=2) = ... = P(X=6) = 1/6
15
Example
If we toss a fair coin three times and let X be
the number of heads then we can create a
table of probabilities for the four possible
outcomes: 0, 1, 2 and 3.
x         0     1     2     3
P(X=x)    1/8   3/8   3/8   1/8
16
Let us see how these probabilities were
calculated by first listing all the possibilities:
TTT (x=0)
TTH, THT, HTT (x=1)
THH, HTH, HHT (x=2)
HHH (x=3)
Since there are eight possibilities, occurring 1,
3, 3 and 1 times respectively, the probabilities
are 1/8, 3/8, 3/8 and 1/8 respectively.
Note that these probabilities are all non-negative
and they sum to one.
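The listing above can be reproduced by enumeration. This is a small sketch that generates all eight equally likely sequences of three tosses and counts the heads in each; the variable names are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 2 * 2 * 2 = 8 equally likely sequences of three tosses.
outcomes = list(product("HT", repeat=3))

# Count how many sequences give each number of heads.
head_counts = Counter(sequence.count("H") for sequence in outcomes)

# Convert the counts to exact probabilities.
distribution = {
    x: Fraction(count, len(outcomes)) for x, count in sorted(head_counts.items())
}

for x, prob in distribution.items():
    print(f"P(X={x}) = {prob}")
```

Using `Fraction` keeps the probabilities exact, so they can be compared directly with the 1/8, 3/8, 3/8 and 1/8 in the table.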
17
Example
Let X be the number of defects found in
the inspection of a microwave oven
component. From previous years'
inspections, we know that the following
distribution describes these defects:
x         0     1     2      3
P(X=x)    0.7   0.2   0.06   0.04
18
We can calculate probabilities of related
events such as the probability of at most one
defect as:
$$P(X \le 1) = P(X = 0 \text{ or } X = 1) = P(X = 0) + P(X = 1) = 0.7 + 0.2 = 0.9$$
The values x=0, 1, ... are basic outcomes as
they cannot occur at the same time.
The probability of the related event (X ≤ 1) is
the sum of the probabilities.
19
Another example is the probability of at least
one defect:
$$P(X \ge 1) = P(X = 1 \text{ or } X = 2 \text{ or } X = 3) = P(X = 1) + P(X = 2) + P(X = 3) = 0.2 + 0.06 + 0.04 = 0.3$$
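The same "add up the basic outcomes" idea can be written as a short sketch. The dictionary holds the defect distribution from the microwave oven example; the helper `event_probability` is a hypothetical name introduced here for illustration.

```python
# Distribution of the number of defects, from the example.
defect_distribution = {0: 0.7, 1: 0.2, 2: 0.06, 3: 0.04}


def event_probability(distribution, condition):
    """Sum the probabilities of all basic outcomes that satisfy the condition."""
    return sum(prob for x, prob in distribution.items() if condition(x))


p_at_most_one = event_probability(defect_distribution, lambda x: x <= 1)
p_at_least_one = event_probability(defect_distribution, lambda x: x >= 1)

print(f"P(X <= 1) = {p_at_most_one:.2f}")
print(f"P(X >= 1) = {p_at_least_one:.2f}")
```

Summing works here only because the basic outcomes cannot occur at the same time.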
20
The mean and standard deviation are often
used to describe a distribution.
The mean measures the central location and
the standard deviation the variability or
dispersion.
In general for a discrete random variable, X,
with possible outcomes x_1, x_2, ... and
associated probabilities P(X=x_1), P(X=x_2), ...,
the mean is given by
$$\mu = \sum_i x_i P(X = x_i) = x_1 P(X = x_1) + x_2 P(X = x_2) + \cdots$$
21
Similarly, the standard deviation is given by
$$\sigma = \sqrt{\sum_i (x_i - \mu)^2 P(X = x_i)} = \sqrt{(x_1 - \mu)^2 P(X = x_1) + (x_2 - \mu)^2 P(X = x_2) + \cdots}$$
22
For the distribution given in the example, the
mean is
$$\mu = 0(0.7) + 1(0.2) + 2(0.06) + 3(0.04) = 0.44$$
and the standard deviation is
$$\sigma = \sqrt{(0 - 0.44)^2 (0.7) + \cdots + (3 - 0.44)^2 (0.04)} = \sqrt{0.6064} = 0.7787$$
(to 4 decimal places).
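The two calculations can be reproduced with a few lines of code. This is a sketch that applies the definitions of the mean and standard deviation of a discrete distribution to the defect table from the example; the variable names are illustrative only.

```python
from math import sqrt

# Distribution of the number of defects, from the example.
defect_distribution = {0: 0.7, 1: 0.2, 2: 0.06, 3: 0.04}

# Mean: each outcome weighted by its probability.
mean = sum(x * prob for x, prob in defect_distribution.items())

# Variance: squared distance from the mean, weighted by probability.
variance = sum((x - mean) ** 2 * prob for x, prob in defect_distribution.items())

# Standard deviation: square root of the variance.
std_dev = sqrt(variance)

print(f"mean               = {mean:.2f}")
print(f"variance           = {variance:.4f}")
print(f"standard deviation = {std_dev:.4f}")
```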
23
The mean and standard deviation of a
distribution are different from the mean and
standard deviation of a sample.
With a distribution there is no data involved so
we call these parameters of the distribution
and use Greek letters to denote them.
The probability distribution is used to describe
the population.
With a sample we use Roman letters for the mean
and standard deviation of this sample (x̄ and
s).
24
x̄ and s are estimates of the parameters of the
distribution, μ and σ.
For larger sample sizes (relative to the
population), the values x̄ and s will converge
to μ and σ.
25
The Binomial Distribution
The following three conditions must be met in
order for a binomial distribution to be
appropriate:
1. An experiment (a trial) is conducted that can
result in only one of two possibilities.
These are usually called success (S) and
failure (F).
The probability of success is given by P(S) = θ
and the probability of failure by P(F) = 1 − θ.
26
2. You conduct n such trials. These trials
must be independent (the result of one will
not affect the result of another).
3. The random variable, X, is the number of
successes among these n trials.
27
Success is whatever characteristic is
being studied.
It could be either a positive (an item
coming off a production line is defect free)
or a negative (a road accident involves
fatalities).
28
The possible values that X can take are 0, 1,
2, ..., n.
The probabilities of these n+1 outcomes are
given by the formula
$$P(X = x) = \frac{n!}{x!\,(n - x)!}\,\theta^{x} (1 - \theta)^{n - x}$$
where x! (read "x factorial") is defined as
x! = x(x−1)···2·1
We refer to this as the binomial distribution.
29
The mean of the binomial distribution is
given by
$$\mu = n\theta$$
and the standard deviation by
$$\sigma = \sqrt{n\theta(1 - \theta)}$$
30
Example
A production process is known to result in 5%
of the items being defective. Thus θ = 0.05
and 1 − θ = 0.95.
If we sample 5 items, what is the probability
of getting exactly 1 defective item?
Using the binomial formula
$$P(X = x) = \frac{n!}{x!\,(n - x)!}\,\theta^{x} (1 - \theta)^{n - x}$$
we get
$$P(X = 1) = \frac{5!}{1!\,4!}\,(0.05)^{1} (0.95)^{4} = 0.2036$$
31
What is the probability of getting exactly 2
defective items?
$$P(X = 2) = \frac{5!}{2!\,3!}\,(0.05)^{2} (0.95)^{3} = 0.0214$$
What is the probability that we get at least one
defective item?
$$P(X \ge 1) = P(X = 1) + P(X = 2) + \cdots + P(X = 5)$$
32
We could calculate each of the individual
probabilities and add them together, or we
could use the fact that all the possible
probabilities must add to one:
$$P(X \ge 1) = 1 - P(X = 0) = 1 - \frac{5!}{0!\,5!}\,(0.05)^{0} (0.95)^{5} = 0.2262$$
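The three probabilities in this example can be checked with a short sketch of the binomial formula. The function name `binomial_pmf` and the variable names are illustrative; `math.comb(n, x)` computes n!/(x!(n−x)!).

```python
from math import comb


def binomial_pmf(x, n, prob_success):
    """P(X = x) for n independent trials with the given success probability."""
    return comb(n, x) * prob_success**x * (1 - prob_success) ** (n - x)


n_trials = 5          # items sampled
prob_defective = 0.05  # a "success" here is a defective item

p_exactly_1 = binomial_pmf(1, n_trials, prob_defective)
p_exactly_2 = binomial_pmf(2, n_trials, prob_defective)

# Complement rule: all the probabilities must add to one.
p_at_least_1 = 1 - binomial_pmf(0, n_trials, prob_defective)

print(f"P(X = 1)  = {p_exactly_1:.4f}")
print(f"P(X = 2)  = {p_exactly_2:.4f}")
print(f"P(X >= 1) = {p_at_least_1:.4f}")
```

The complement rule needs one evaluation of the formula instead of five, which matters more as n grows.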
33
The mean and standard deviation of this
distribution would be:
$$\mu = n\theta = 5(0.05) = 0.25$$
$$\sigma = \sqrt{n\theta(1 - \theta)} = \sqrt{5(0.05)(0.95)} = 0.4873$$
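As a cross-check, the shortcut formulas for the binomial mean and standard deviation should agree with the general definitions applied to all six probabilities. The sketch below computes both for the example (5 trials, success probability 0.05); the variable names are illustrative only.

```python
from math import comb, sqrt

n_trials = 5
prob_success = 0.05

# Shortcut formulas for the binomial distribution.
mean_formula = n_trials * prob_success
sd_formula = sqrt(n_trials * prob_success * (1 - prob_success))

# General definitions, applied to the full table of probabilities.
pmf = {
    x: comb(n_trials, x) * prob_success**x * (1 - prob_success) ** (n_trials - x)
    for x in range(n_trials + 1)
}
mean_from_pmf = sum(x * prob for x, prob in pmf.items())
sd_from_pmf = sqrt(sum((x - mean_from_pmf) ** 2 * prob for x, prob in pmf.items()))

print(f"mean: formula {mean_formula:.4f}, from table {mean_from_pmf:.4f}")
print(f"sd:   formula {sd_formula:.4f}, from table {sd_from_pmf:.4f}")
```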
34
Using the binomial distribution formula works
well for a small number of trials.
In our example we had only 5 trials and could
easily produce a table of all 6 probabilities (for
x = 0, 1, ..., 5).
Imagine if n were 20 (or even more): we would
not want to evaluate the binomial formula so
many times!
An alternative is to use tabulated values of the
binomial distribution probabilities. (See the
table on page 198 of Ledolter).
35
Given n and θ, these tables give the
probability P(X ≤ x).
Returning to our example we could use the
tables to find the probability (as before!):
$$P(X \ge 1) = 1 - P(X \le 0) = 1 - 0.7738 = 0.2262$$
or other probabilities such as
$$P(X \le 1) = 0.9774$$
$$P(X = 3) = P(X \le 3) - P(X \le 2) = 1.0000 - 0.9988 = 0.0012$$
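Where printed tables are not to hand, the cumulative probabilities can be generated directly. This is a sketch for the example (5 trials, success probability 0.05); the function name `binomial_cdf` is illustrative, and the 4-decimal rounding is an assumption chosen to mimic a printed table.

```python
from math import comb


def binomial_cdf(x, n, prob_success):
    """P(X <= x): the sum of the binomial probabilities for 0, 1, ..., x."""
    return sum(
        comb(n, k) * prob_success**k * (1 - prob_success) ** (n - k)
        for k in range(x + 1)
    )


n_trials, prob_success = 5, 0.05

# One row of a cumulative table, rounded to 4 decimal places.
table_row = {
    x: round(binomial_cdf(x, n_trials, prob_success), 4)
    for x in range(n_trials + 1)
}
print(table_row)

# Probabilities derived from cumulative values.
p_at_least_1 = 1 - binomial_cdf(0, n_trials, prob_success)
p_exactly_3 = binomial_cdf(3, n_trials, prob_success) - binomial_cdf(2, n_trials, prob_success)
print(f"P(X >= 1) = {p_at_least_1:.4f}")
print(f"P(X = 3)  = {p_exactly_3:.6f}")
```

Differencing the rounded table entries (1.0000 − 0.9988) gives 0.0012, whereas the unrounded calculation gives about 0.0011. The gap is rounding error carried over from the table.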
36
The Poisson Distribution
We use the Poisson distribution when we
have information about the average rate at
which something is occurring.
For example the average number of calls to a
switchboard per hour or the average number
of defective items coming off a production line
each day.
We define our variable, X, to be the number
of successes in a certain interval.
37
Success is whatever characteristic we are
interested in examining, so it could be either a
positive (average number of babies born per
day) or a negative (the average number of
deaths from cancer each month).
The possible outcomes for X are x = 0, 1, 2, ...
(all the non-negative whole numbers).
38
Poisson probabilities are found by
$$P(X = x) = \frac{\lambda^{x} e^{-\lambda}}{x!}$$
where λ is the rate (the Poisson parameter)
and e is a known constant (e = 2.718282 to 6
decimal places).
The mean and standard deviation of a
Poisson distribution are μ = λ and σ = √λ.
39
Example
A production process is known to generate
pocket calculator components, with an average
of 5 defective components per hour (λ = 5).
The probability of getting eight or more
defective components is:
$$P(X \ge 8) = 1 - P(X \le 7) = 1 - 0.867 = 0.133$$
(using the tables on pages 202 and 203 of
Ledolter).
40
The probability of getting no defective
components would be
$$P(X = 0) = \frac{5^{0} e^{-5}}{0!} = 0.0067$$
using the Poisson formula.
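Both Poisson results from this example can be checked with a short sketch of the Poisson formula, using a rate of 5 defective components per hour. The function name `poisson_pmf` and the variable names are illustrative only.

```python
from math import exp, factorial


def poisson_pmf(x, rate):
    """P(X = x) for a Poisson distribution with the given average rate."""
    return rate**x * exp(-rate) / factorial(x)


rate_per_hour = 5  # average number of defective components per hour

# P(X >= 8) = 1 - P(X <= 7), summing the formula for x = 0, 1, ..., 7.
p_7_or_fewer = sum(poisson_pmf(x, rate_per_hour) for x in range(8))
p_8_or_more = 1 - p_7_or_fewer

# P(X = 0) directly from the formula.
p_none = poisson_pmf(0, rate_per_hour)

print(f"P(X <= 7) = {p_7_or_fewer:.3f}")
print(f"P(X >= 8) = {p_8_or_more:.3f}")
print(f"P(X = 0)  = {p_none:.4f}")
```

The complement is used for the upper tail because X has no largest possible value, so the tail cannot be summed term by term to completion.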
41
Many statistical software packages will
calculate binomial and Poisson probabilities
for you.
Tables exist in other textbooks, which cover
more parameter values. (For example see
the STAT 11048 text Introduction to Business
Statistics by Weiers).
42
We have examined the Binomial and Poisson
distributions since we will need these to
produce control charts and to discuss
acceptance sampling later in the term.
After covering this week's material, ensure
that you are able to identify which distribution
is appropriate for the situation you are
investigating: the type of distribution you
have will determine the type of control chart
you will use.
43
Complete this week's recommended exercises
from the study guide.
