
Comparing Distributions II:
Bayes Rule and Acceptance Sampling

By Peter Woolf (pwoolf@umich.edu)
University of Michigan

Michigan Chemical Process Dynamics and Controls Open Textbook
version 1.0
Creative Commons
From the last lecture, we found that variations in product yield were significantly related to runny feed.

One solution is to find a way to identify runny feed before it is fed into the process and avoid it.
Runnyfeedometer(TM)
(Image from http://controls.engin.umich.edu/wiki/index.php/PHandViscositySensors)
You develop an offline tool to detect runny feed using a cone and plate viscometer. The test is inexpensive, but not always accurate due to inhomogeneous feed.

You have a more accurate way of measuring runny feed, but it is slow and expensive, so maybe you can get away with multiple reads on the Runnyfeedometer(TM)?


Experimental Data: 100 known runny and 100 known normal samples tested in the Runnyfeedometer(TM)

P(+ test | runny)  = 98:100   (true positive)
P(- test | runny)  =  2:100   (false negative)
P(+ test | normal) =  3:100   (false positive)
P(- test | normal) = 97:100   (true negative)
Question: What are the odds that exactly 9 of 10 tests on a runny sample come back positive?

P(+ test | runny) = 98:100
P(- test | runny) = 2:100
There are 10 possible combinations:
$\binom{10}{1} = \frac{10!}{1!\,(10-1)!} = 10$

Probability of a particular outcome:
$(0.98)(0.98)(0.98)(0.98)(0.98)(0.98)(0.98)(0.98)(0.98)(0.02) = (0.98)^9(0.02)$

Overall probability = (probability of a particular outcome) x (# of combinations)
$= 10 \times (0.98)^9 (0.02)^1 = 0.1667$

Possible results:
{+,+,+,+,+,+,+,+,+,-}
{+,+,+,+,+,+,+,+,-,+}
{+,+,+,+,+,+,+,-,+,+}
{+,+,+,+,+,+,-,+,+,+}
{+,+,+,+,+,-,+,+,+,+}
{+,+,+,+,-,+,+,+,+,+}
{+,+,+,-,+,+,+,+,+,+}
{+,+,-,+,+,+,+,+,+,+}
{+,-,+,+,+,+,+,+,+,+}
{-,+,+,+,+,+,+,+,+,+}
Note: enumerating the outcomes by hand gets hard if 2 or more tests can come back negative.
In our case:
P(+ test | runny) = 98:100 = p
P(- test | runny) = 2:100 = (1-p)
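A quick check of the 0.1667 result above, as a minimal Mathematica sketch (the slide itself does not show this code):

  (* Probability that exactly 9 of 10 tests on a runny sample come back positive, with P(+ test | runny) = 0.98 *)
  Binomial[10, 9]*0.98^9*0.02^1
  (* 0.16675 *)

  (* The same value from the built-in binomial distribution *)
  PDF[BinomialDistribution[10, 0.98], 9]
  (* 0.16675 *)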
Binomial Distribution
Describes the probability of obtaining k events from N
independent samples of a binary outcome with known
probability.
Examples:
Odds of getting 20 heads from 30 coin tosses
Odds of finding 3 broken bolts in a box of 100

$p_{\mathrm{binomial}}(k, N, p) = \binom{N}{k} p^k (1-p)^{N-k} = \frac{N!}{k!\,(N-k)!}\, p^k (1-p)^{N-k}$
In Mathematica
Probability of exactly 5 heads out of 10 tosses
Probability of 0-5 heads out of 10 tosses
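The original slide shows these two calculations as Mathematica screenshots; a minimal sketch of what they compute, assuming a fair coin (p = 0.5) and N = 10 tosses:

  (* Probability of exactly 5 heads out of 10 tosses *)
  PDF[BinomialDistribution[10, 0.5], 5]
  (* 0.246094 *)

  (* Probability of 0-5 heads out of 10 tosses *)
  CDF[BinomialDistribution[10, 0.5], 5]
  (* 0.623047 *)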
Probability test: What are the odds of getting 5 heads out of 10
coin tosses?
(a) 25%
(b) 50%
(c) 62%
Probability test: What are the odds of getting 5 heads out of 10 tosses?

Probability of exactly 5 heads out of 10 tosses: 25%
Probability of 0-5 heads out of 10 tosses: 62%

(a) 25%  -> okay: the probability of exactly 5 heads
(b) 50%  -> no
(c) 62%  -> okay: the probability of 5 or fewer heads


P(+ test | runny) = 98:100
P(- test | runny) = 2:100
P(+ test | normal) = 3:100
P(- test | normal) = 97:100

Given these data, what acceptance sampling criterion would be required to correctly identify a normal sample with 99.99% confidence?

Example acceptance sampling criterion:
Accept the feed if, out of 10 tests, 3 or fewer come back positive.

Translation: we want the following quantity:
P(normal | 3 or fewer positive results from 10 tests)
Using our binomial distribution we can calculate a related quantity
(0 in 10 positive: very likely normal; 10 in 10 positive: very likely runny)
[Plot: P(x) vs. x, the number of positive tests out of 10]
Using our binomial distribution we can calculate a related
quantity
P(3 or fewer positive results from 10 tests | normal)

$= \sum_{i=0}^{3} \frac{10!}{i!\,(10-i)!}\, p^i (1-p)^{10-i}$

where
i = # of positive results
p = probability of a positive result given a normal feed = 0.03
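A minimal Mathematica sketch of this sum, evaluated both directly and with the built-in cumulative distribution function:

  p = 0.03;
  Sum[10!/(i!*(10 - i)!)*p^i*(1 - p)^(10 - i), {i, 0, 3}]
  (* 0.999853 *)

  (* Equivalently, P(3 or fewer positives | normal) is the binomial CDF at 3 *)
  CDF[BinomialDistribution[10, 0.03], 3]
  (* 0.999853 *)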
If the feed is normal, 3 or fewer of the 10 tests will come back positive with probability of about 0.9998!

But that is not the same as what we want:
P(normal | 3 or fewer positive results from 10 tests)
Three Probability Definitions

1. Joint Probability:        $P(A, B)$

2. Conditional Probability:  $P(A \mid B)$

3. Marginalization:          $P(A) = \sum_{i=1}^{n} P(A \mid B_i) P(B_i)$
Three Probability Definitions
1. Joint Probability: $P(A, B)$

What is the probability of drawing an ace first and then a jack from a deck of 52 cards?
$\left(\frac{4}{52}\right)\left(\frac{4}{51}\right)$

What is the probability of a protein being highly expressed and phosphorylated?
(# highly expressed and phosphorylated proteins) / (total proteins)

What is the probability that valves A and B both fail?
(# times A & B fail) / (total observations)
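A quick numeric check of the ace-then-jack example in Mathematica:

  N[(4/52)*(4/51)]
  (* 0.00603318, i.e. about a 0.6% chance *)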
Three Probability Definitions
2. Conditional Probability: $P(A \mid B)$

What is the probability of drawing an ace given that you just drew a jack from a deck of 52 cards?
$\frac{4}{51}$

What is the probability of a protein being highly expressed given that it is phosphorylated?
(# highly expressed and phosphorylated proteins) / (total phosphorylated proteins)

What is the probability that valve A fails given that B has failed?
(# times A & B fail) / (total observations where B fails)
Three Probability Definitions
3. Marginalization: $P(A) = \sum_{i=1}^{n} P(A \mid B_i) P(B_i)$

What is the probability of drawing an ace given that you just drew one other card from a deck of 52 cards?

P(ace) = P(ace | previous card was an ace) P(previous card was an ace)
       + P(ace | previous card was not an ace) P(previous card was not an ace)

$P(\text{ace}) = \left(\frac{3}{51}\right)\left(\frac{4}{52}\right) + \left(\frac{4}{51}\right)\left(\frac{48}{52}\right)$
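A quick numeric check of this marginalization in Mathematica: the sum collapses back to 4/52 = 1/13, the unconditional probability of drawing an ace, as it should.

  (3/51)*(4/52) + (4/51)*(48/52)
  (* 1/13 *)
  N[(3/51)*(4/52) + (4/51)*(48/52)]
  (* 0.0769231 *)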


Probability Algebra

P(A, B) = P(A | B) P(B)    (in general)
P(A, B) = P(A) P(B)        (if A and B are independent)

P(A, B) = P(A | B) P(B) = P(B | A) P(A)

Bayes Rule:
$P(A \mid B) = \dfrac{P(B \mid A)\, P(A)}{P(B)}$
We want the following:
P(normal | 3 or fewer positive results from 10 tests)

Bayes Rule: $P(A \mid B) = \dfrac{P(B \mid A)\, P(A)}{P(B)}$

P(normal | 3 or fewer positive results from 10 tests) =
  P(3 or fewer positive results from 10 tests | normal) x P(normal) / P(3 or fewer positive results from 10 tests)

The likelihood P(3 or fewer positive | normal) comes from the binomial distribution, P(normal) is the prior, and the denominator is found by marginalization.
P(3 or fewer positive results from 10 tests | normal): from the binomial distribution,
$\sum_{i=0}^{3} \frac{10!}{i!\,(10-i)!}\, p^i (1-p)^{10-i} = 0.9998$   (with p = 0.03)

P(normal): from prior observations, what are the odds of getting a batch of normal feed? From previous data, normal feed was found in 19 of 25 samples, so a first approximation could be 0.76.
P(3 or fewer positive results from 10 tests): found by marginalizing over runny and normal,
$P(A) = \sum_{i=1}^{n} P(A \mid B_i) P(B_i)$
= P(3 or fewer of 10 positive | runny) P(runny) + P(3 or fewer of 10 positive | normal) P(normal)

P(3 or fewer of 10 positive | runny): since P(+ test | runny) = 98:100, essentially 0% of the time will a runny sample yield 3 or fewer positives.

P(runny) = 1 - P(normal) = 0.24
P(normal | 3 or fewer positive results from 10 tests) =
  P(3 or fewer positive results from 10 tests | normal) x P(normal) / P(3 or fewer positive results from 10 tests)

P(3 or fewer positive results from 10 tests | normal) $= \sum_{i=0}^{3} \frac{10!}{i!\,(10-i)!}\, p^i (1-p)^{10-i} = 0.9998$

P(3 or fewer positive results from 10 tests): found by marginalizing over runny and normal,
= P(3 or fewer of 10 positive | runny) P(runny) + P(3 or fewer of 10 positive | normal) P(normal)
= (0)(0.24) + (0.9998)(0.76) = 0.75988
P(normal | 3 or fewer positive results from 10 tests) = (0.9998)(0.76) / 0.75988, which is essentially 1.

With this acceptance sampling criterion, accepted feed is essentially never runny: runny feed is screened out essentially 100% of the time. It may be stricter than necessary!
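The whole calculation can be scripted; below is a minimal Mathematica sketch using the measured test rates and the 0.76 prior from the slides above (the variable names are illustrative, not from the original):

  pPosGivenNormal = 0.03;  (* P(+ test | normal) *)
  pPosGivenRunny = 0.98;   (* P(+ test | runny) *)
  priorNormal = 0.76;      (* P(normal), from 19 of 25 prior samples *)

  likNormal = CDF[BinomialDistribution[10, pPosGivenNormal], 3];  (* P(3 or fewer positive | normal), ~0.9999 *)
  likRunny = CDF[BinomialDistribution[10, pPosGivenRunny], 3];    (* P(3 or fewer positive | runny), ~10^-10 *)

  evidence = likNormal*priorNormal + likRunny*(1 - priorNormal);  (* marginalization, ~0.7599 *)
  likNormal*priorNormal/evidence
  (* ~1, matching the result above: accepted feed is essentially never runny *)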
Test different acceptance sampling criteria by repeating the calculation of
P(normal | 3 or fewer positive results from 10 tests) =
  P(3 or fewer positive results from 10 tests | normal) x P(normal) / P(3 or fewer positive results from 10 tests)
with other cutoffs in place of 3.

An acceptance sampling criterion of up to 6 positive tests will still identify normal feeds >99.99% of the time.

Remember:
0 in 10 positive: very likely normal
10 in 10 positive: very likely runny
accepting anything from 0 to 10 positive: no information (the posterior is just the prior)
--> 0 to 6 positive: likely normal
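A sketch of this cutoff comparison in Mathematica, computing P(normal | k or fewer of 10 tests positive) for every cutoff k with the same rates and prior (posteriorNormal is an illustrative helper name, not from the original slides):

  posteriorNormal[k_] := Module[{likNormal, likRunny},
    likNormal = CDF[BinomialDistribution[10, 0.03], k];  (* P(k or fewer positive | normal) *)
    likRunny = CDF[BinomialDistribution[10, 0.98], k];   (* P(k or fewer positive | runny) *)
    likNormal*0.76/(likNormal*0.76 + likRunny*0.24)]

  Table[{k, posteriorNormal[k]}, {k, 0, 10}]
  (* Cutoffs k = 0 through 6 all give a posterior above 0.9999. At k = 10 every batch is accepted, *)
  (* the tests carry no information, and the posterior falls back to the 0.76 prior. *)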



Analysis result:
If 6 or fewer of 10 tests report positive, then I am >99.99% sure the feed is normal.

Acceptance criterion:
If 6 or fewer of 10 tests are positive, use the feed; otherwise, reject the feed.
Q: What are the odds of rejecting normal feed?

P(normal | 7 or more positive results from 10 tests) =
  P(7 or more positive results from 10 tests | normal) x P(normal) / P(7 or more positive results from 10 tests)

Very rarely: a normal feed almost never produces 7 or more positive tests out of 10, so it is almost never rejected.
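A minimal Mathematica sketch of both quantities, using the same rates and prior as before:

  (* Probability that a normal batch is rejected, i.e. gives 7 or more positives out of 10 *)
  1 - CDF[BinomialDistribution[10, 0.03], 6]
  (* ~2.4*10^-9 *)

  (* Fraction of rejected batches that were actually normal: P(normal | 7 or more positive) *)
  pRejectNormal = (1 - CDF[BinomialDistribution[10, 0.03], 6])*0.76;
  pRejectRunny = (1 - CDF[BinomialDistribution[10, 0.98], 6])*0.24;
  pRejectNormal/(pRejectNormal + pRejectRunny)
  (* ~7.7*10^-9 *)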
Take Home Messages

Acceptance sampling provides an easy-to-implement way to eliminate a source of variation.

Basic probability rules like Bayes Rule help you rearrange your expressions into quantities you can actually solve.
