ctl.mit.edu
Zippy Bright
Zippy Bright manufactures electric toothbrushes that are sold
through large retail outlets. Zippy Bright is concerned with
how variable the sales are at different stores. They requested
and received a year of weekly sales data on their premier
product, the XP219, for three stores from one of their
retailers, Sellco.
What can we say about the weekly sales in store A?
Week  Unit Sales    Week  Unit Sales    Week  Unit Sales    Week  Unit Sales    Week  Unit Sales
   1           1      11           3      21           2      31           1      41           2
   2           5      12           2      22           4      32           2      42           3
   3           3      13           3      23           4      33           3      43           3
   4           2      14           4      24           3      34           4      44           3
   5           3      15           2      25           4      35           5      45           4
   6           3      16           1      26           1      36           5      46           1
   7           3      17           3      27           2      37           1      47           2
   8           2      18           4      28           3      38           5      48           3
   9           5      19           4      29           4      39           5      49           4
  10           2      20           3      30           4      40           1      50           2
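[Figure: histogram of weekly unit sales at store A, showing the number of weeks and the share of weeks for each sales level: 1 unit 14%, 2 units 22%, 3 units 30%, 4 units 22%, 5 units 12%]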
Probability Table
Value   Probability   Cumulative Probability
1       14%           14%
2       22%           36%
3       30%           66%
4       22%           88%
5       12%           100%
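The probability table can be rebuilt from the raw weekly data by counting how often each sales value occurs and dividing by the number of weeks. The following is a minimal sketch of that step; the variable names (`weekly_sales`, `counts`, `probabilities`, `cumulative`) are illustrative and not part of the original slides.

```python
from collections import Counter
from itertools import accumulate

# Weekly unit sales for store A, weeks 1 to 50, copied from the sales table.
weekly_sales = [
    1, 5, 3, 2, 3, 3, 3, 2, 5, 2,   # weeks 1-10
    3, 2, 3, 4, 2, 1, 3, 4, 4, 3,   # weeks 11-20
    2, 4, 4, 3, 4, 1, 2, 3, 4, 4,   # weeks 21-30
    1, 2, 3, 4, 5, 5, 1, 5, 5, 1,   # weeks 31-40
    2, 3, 3, 3, 4, 1, 2, 3, 4, 2,   # weeks 41-50
]

counts = Counter(weekly_sales)            # number of weeks at each sales level
n_weeks = len(weekly_sales)
values = sorted(counts)

# Relative frequency of each value, used here as its probability.
probabilities = [counts[v] / n_weeks for v in values]
# Running total of the probabilities gives the cumulative column.
cumulative = list(accumulate(probabilities))

print("Value  Probability  Cumulative")
for v, p, c in zip(values, probabilities, cumulative):
    print(f"{v:>5}  {p:>11.0%}  {c:>10.0%}")
```

Treating relative frequencies as probabilities assumes that the 50 observed weeks are representative of future weeks.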
[Figure: cumulative probability by sales per week, rising from 14% at 1 unit through 36%, 66%, and 88% to 100% at 5 units]
Basic Probability
Probability Theory
Notation
Events:
A = Sales = 5 units
B = Sales ≥ 4 units
C = Sales are an Odd number
D = Sales are ≤ 2 units
P(A′) = probability of the complement of A, i.e., the probability that some event other than A occurs
P(A′) = 1 - P(A) = 1 - 0.12 = 0.88
P(B′) = 1 - P(B) = 1 - 0.34 = 0.66
P(C′) = 1 - P(C) = 1 - 0.56 = 0.44
Value   Probability   Cumulative Probability
1       .14           .14
2       .22           .36
3       .30           .66
4       .22           .88
5       .12           1.00
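The complement calculations can be checked directly against the probability table. The sketch below is illustrative only: the helper names `prob` and `complement` are hypothetical, and event B is read as "sales of at least 4 units", an assumption that is consistent with P(B) = 0.34.

```python
# Probability table for weekly unit sales (value -> probability).
pmf = {1: 0.14, 2: 0.22, 3: 0.30, 4: 0.22, 5: 0.12}


def prob(event):
    """Probability that weekly sales satisfy the predicate `event`."""
    return sum(p for value, p in pmf.items() if event(value))


def complement(event):
    """Probability that `event` does not occur: 1 - P(event)."""
    return 1 - prob(event)


def is_a(sales):
    return sales == 5          # A: sales = 5 units


def is_b(sales):
    return sales >= 4          # B: sales >= 4 units (assumed reading)


def is_c(sales):
    return sales % 2 == 1      # C: sales are an odd number


print(round(complement(is_a), 2))  # 0.88
print(round(complement(is_b), 2))  # 0.66
print(round(complement(is_c), 2))  # 0.44
```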
Four Laws of Probability
P(Sales > 6) = 0
P(Sales is a prime number) = P(2, 3, or 5) = 0.22 + 0.30 + 0.12 = 0.64
P(Sales < 6) = 1
P(Sales < 1) = 0
Conditional Probability
P(A|B) = Probability that Event A occurs, GIVEN THAT Event B has occurred.
e.g., P(D|C) = P[(Sales ≤ 2) Given That (Sales = 1, 3, or 5)]
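A conditional probability restricts attention to the outcomes in the conditioning event and rescales by its probability: P(A|B) = P(A and B) / P(B). The sketch below works through P(D|C) on the sales table. It assumes that D is read as "sales of 2 units or fewer"; the helper names `prob` and `conditional` are hypothetical.

```python
# Probability table for weekly unit sales (value -> probability).
pmf = {1: 0.14, 2: 0.22, 3: 0.30, 4: 0.22, 5: 0.12}


def prob(event):
    """Probability that weekly sales satisfy the predicate `event`."""
    return sum(p for value, p in pmf.items() if event(value))


def conditional(event, given):
    """P(event | given) = P(event and given) / P(given)."""
    p_given = prob(given)
    if p_given == 0:
        raise ValueError("conditioning event has probability zero")
    return prob(lambda s: event(s) and given(s)) / p_given


def is_c(sales):
    return sales % 2 == 1      # C: sales are an odd number (1, 3, or 5)


def is_d(sales):
    return sales <= 2          # D: sales <= 2 units (assumed reading)


# Only sales = 1 is both odd and at most 2, so P(D and C) = 0.14.
p_d_given_c = conditional(is_d, is_c)
print(round(p_d_given_c, 2))   # 0.14 / 0.56 = 0.25
```

Under this reading, knowing that sales were odd lowers the probability of D from 0.36 to 0.25.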
Independence
Characterizing Uncertainty
Characterizing a Distribution
Several ways to characterize a distribution:
Mode = 3
Median = 3
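Expected Value
[Figure: histogram of weekly unit sales with probabilities 14%, 22%, 30%, 22%, and 12% for sales of 1 to 5 units]
Notation
\[ E[X] = \bar{x} = \mu = \sum_{i=1}^{n} p_i x_i \]
x_i   p_i   p_i x_i
1     .14   .14
2     .22   .44
3     .30   .90
4     .22   .88
5     .12   .60
      E[X] = 2.96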
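The expected value is a probability-weighted sum, and for relative frequencies it coincides with the plain average of the observations. A short sketch of both routes follows; the names `pmf` and `weeks_by_sales` are illustrative, and the week counts are the table probabilities multiplied by 50 weeks.

```python
# Probability table for weekly unit sales (value -> probability).
pmf = {1: 0.14, 2: 0.22, 3: 0.30, 4: 0.22, 5: 0.12}

# Number of weeks observed at each sales level (probability x 50 weeks).
weeks_by_sales = {1: 7, 2: 11, 3: 15, 4: 11, 5: 6}

# Route 1: E[X] as the sum of p_i * x_i.
expected_value = sum(p * x for x, p in pmf.items())

# Route 2: total units sold divided by the number of weeks.
total_units = sum(x * weeks for x, weeks in weeks_by_sales.items())
total_weeks = sum(weeks_by_sales.values())
sample_mean = total_units / total_weeks

print(f"E[X]        = {expected_value:.2f}")  # 2.96
print(f"sample mean = {sample_mean:.2f}")     # 2.96
```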
Spread Metrics
Range: maximum value minus minimum value
Inner Quartiles: the 75th percentile value minus the 25th percentile value
Variance: expectation of the squared deviation around the mean
\[ \mathrm{Var}[X] = \sigma^2 = \sum_{i=1}^{n} p_i (x_i - \bar{x})^2 = \sum_{i=1}^{n} p_i (x_i - \mu)^2 \]
Range = Max - Min = 5 - 1 = 4
Inner Quartile = 75th Percentile - 25th Percentile = 4 - 2 = 2
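The range and the inner quartile spread can be read off the cumulative probabilities. The sketch below uses one common convention, which is an assumption here: the q-th percentile is the smallest value whose cumulative probability reaches q. Other conventions interpolate between values and can give slightly different answers. The function name `percentile` is hypothetical.

```python
# Probability table for weekly unit sales (value -> probability).
pmf = {1: 0.14, 2: 0.22, 3: 0.30, 4: 0.22, 5: 0.12}


def percentile(pmf, q):
    """Smallest value whose cumulative probability is at least q (0 < q <= 1)."""
    cumulative = 0.0
    for value in sorted(pmf):
        cumulative += pmf[value]
        # Small tolerance guards against floating-point round-off.
        if cumulative >= q - 1e-12:
            return value
    return max(pmf)


value_range = max(pmf) - min(pmf)          # 5 - 1
p25 = percentile(pmf, 0.25)                # cumulative 0.36 at value 2
p75 = percentile(pmf, 0.75)                # cumulative 0.88 at value 4
inner_quartile = p75 - p25

print(value_range, p25, p75, inner_quartile)  # 4 2 4 2
```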
Spread - Variance
Variance: expectation of the squared deviation around the mean
Also called the Second Moment around the mean
\[ \mathrm{Var}[X] = \sigma^2 = \sum_{i=1}^{n} p_i (x_i - \bar{x})^2 = \sum_{i=1}^{n} p_i (x_i - \mu)^2 \]
x_i   p_i   p_i x_i   x_i - μ   (x_i - μ)²   p_i (x_i - μ)²
1     .14   .14       -1.96     3.84         0.5376
2     .22   .44       -0.96     0.92         0.2024
3     .30   .90        0.04     0.0016       0.00048
4     .22   .88        1.04     1.08         0.2376
5     .12   .60        2.04     4.16         0.4992
      μ = 2.96                               σ² = 1.48
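The variance table can be reproduced in a few lines. This is a minimal sketch with illustrative variable names; it keeps full precision in the squared deviations, so the unrounded variance comes out as 1.4784, which rounds to the 1.48 shown in the table.

```python
import math

# Probability table for weekly unit sales (value -> probability).
pmf = {1: 0.14, 2: 0.22, 3: 0.30, 4: 0.22, 5: 0.12}

# Mean: probability-weighted sum of the values.
mean = sum(p * x for x, p in pmf.items())

# Variance: probability-weighted sum of squared deviations from the mean.
variance = sum(p * (x - mean) ** 2 for x, p in pmf.items())

# Standard deviation and coefficient of variation follow from the variance.
std_dev = math.sqrt(variance)
cv = std_dev / mean

print(f"mean     = {mean:.2f}")      # 2.96
print(f"variance = {variance:.4f}")  # 1.4784
print(f"std dev  = {std_dev:.2f}")   # 1.22
print(f"CV       = {cv:.2f}")        # 0.41
```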
Minimum = 1
25th Pct = 2
μ = Mean = 2.96
50th Pct = Median = 3
Mode = 3
75th Pct = 4
Maximum = 5
Range = 4
Inner Quartile = 2
σ² = Variance = 1.48
σ = Standard Deviation = 1.215
CV = Coefficient of Variation = 0.411
Probability Distributions
Where do they come from?
Which is better?
Discrete Uniform Distribution
N possible values
Each value has equal probability, i.e., p_i = 1/N
Ex: Rolling a die
[Figure: probability of X against random variable X, comparing Uniform [1,6] with Poisson (mean = 1.5)]
a = Minimum
b = Maximum
n = # of values = b - a + 1
Metrics
Mean = (a + b) / 2
Median = (a + b) / 2
Mode: N/A
Variance = ((b - a + 1)² - 1) / 12
\[ P[X = x] = f(x \mid a, b) = \begin{cases} \dfrac{1}{n} & \text{for } a \le x \le b \\ 0 & \text{otherwise} \end{cases} \]
i     1     2     3     4     5     6
x_i   1     2     3     4     5     6
p_i   1/6   1/6   1/6   1/6   1/6   1/6
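The closed-form mean and variance of the discrete uniform distribution can be checked against a direct calculation for the die example. The sketch below uses exact fractions to avoid rounding; the function name `discrete_uniform` is hypothetical.

```python
from fractions import Fraction


def discrete_uniform(a, b):
    """Probability mass function of a discrete uniform on the integers a..b."""
    n = b - a + 1                      # number of possible values
    return {x: Fraction(1, n) for x in range(a, b + 1)}


a, b = 1, 6                            # a fair six-sided die
pmf = discrete_uniform(a, b)

# Direct calculation from the definition of mean and variance.
mean = sum(p * x for x, p in pmf.items())
variance = sum(p * (x - mean) ** 2 for x, p in pmf.items())

# Closed-form metrics from the slide.
formula_mean = Fraction(a + b, 2)
formula_variance = Fraction((b - a + 1) ** 2 - 1, 12)

print(mean, formula_mean)              # 7/2 7/2
print(variance, formula_variance)      # 35/12 35/12
```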
Poisson Distribution
Widely used to model arrivals, slow-moving inventory, etc.
Discrete distribution that cannot take negative values
Notation: P(λ)
\[ P[X = x] = f(x \mid \lambda) = \begin{cases} \dfrac{e^{-\lambda} \lambda^{x}}{x!} & \text{for } x = 0, 1, 2, \ldots \\ 0 & \text{otherwise} \end{cases} \]
λ = mean = variance
Suppose λ = 0.5
P[X=0] = (e^(-0.5) × 0.5^0)/(0!) = (0.607)(1)/1 = 0.61
P[X=1] = (e^(-0.5) × 0.5^1)/(1!) = (0.607)(0.5)/1 = 0.30
P[X=2] = (e^(-0.5) × 0.5^2)/(2!) = (0.607)(0.25)/2 = 0.08
P[X=3] = (e^(-0.5) × 0.5^3)/(3!) = (0.607)(0.125)/6 = 0.01
P[X=4] = (e^(-0.5) × 0.5^4)/(4!) = (0.607)(0.0625)/24 ≈ 0.002
P[X=5] = (e^(-0.5) × 0.5^5)/(5!) = (0.607)(0.0312)/120 ≈ 0.0002
x   P[X = x]
0   61%
1   30%
2   8%
3   1%
4   0.2%
5   0.02%
[Figure: Poisson probability mass function for λ = 0.5, x = 0 to 5]
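The worked values for a Poisson distribution with a mean of 0.5 can be reproduced with the standard library. The sketch below also checks numerically that the mean and the variance both equal the rate parameter, by summing over the first 30 values of x (the truncation point is an arbitrary choice; the remaining tail is negligible for this rate). The function name `poisson_pmf` is hypothetical.

```python
import math


def poisson_pmf(x, lam):
    """P[X = x] for a Poisson distribution with mean `lam`."""
    if x < 0 or x != int(x):
        return 0.0                     # only non-negative integers have mass
    return math.exp(-lam) * lam ** x / math.factorial(int(x))


lam = 0.5
for x in range(6):
    print(f"P[X={x}] = {poisson_pmf(x, lam):.4f}")

# Numerical check of mean = variance = lam on a truncated support.
support = range(30)
mean = sum(x * poisson_pmf(x, lam) for x in support)
variance = sum((x - mean) ** 2 * poisson_pmf(x, lam) for x in support)
print(f"mean = {mean:.4f}, variance = {variance:.4f}")  # both 0.5000
```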
[Figure: Poisson probability mass functions for λ = 0.75, 2, 5, and 10, plotted for values of x up to 20]
Poisson Distribution
You are running the customer complaint
center for Zippy Bright. Customer
complaint calls come in ~P(2.2) per minute, i.e., following a Poisson distribution with a mean of λ = 2.2 calls per minute.
\[ P[X = x] = f(x \mid \lambda) = \begin{cases} \dfrac{e^{-\lambda} \lambda^{x}}{x!} & \text{for } x = 0, 1, 2, \ldots \\ 0 & \text{otherwise} \end{cases} \]
\[ P[X \le x] = \sum_{k=0}^{x} \frac{e^{-\lambda} \lambda^{k}}{k!} \]
1. What is the probability that no calls will come in over the next minute?
P[X=0] = (e^(-2.2) × 2.2^0)/(0!) = (0.1108)(1)/1 = 0.11 or 11%
2. What is the probability that 2 or fewer calls will come in over the next minute?
P[X=0] = (e^(-2.2) × 2.2^0)/(0!) = (0.1108)(1)/1 = 0.11 or 11%
P[X=1] = (e^(-2.2) × 2.2^1)/(1!) = (0.1108)(2.2)/1 = 0.24 or 24%
P[X=2] = (e^(-2.2) × 2.2^2)/(2!) = (0.1108)(4.84)/2 = 0.27 or 27%
P[X≤2] = 0.11 + 0.24 + 0.27 = 0.62 or 62%
3. What is the probability that at least 1 call will come in over the next minute?
P[X>0] = 1 - P[X=0] = 1 - 0.11 = 0.89 or 89%
Spreadsheet        Prob 1 (P[X = 0])           Prob 2 (P[X ≤ 2])
Microsoft Excel    =POISSON.DIST(0, 2.2, 0)    =POISSON.DIST(2, 2.2, 1)
Google Sheets      =POISSON(0, 2.2, 0)         =POISSON(2, 2.2, 1)
LibreOffice Calc   =POISSON(0; 2.2; 0)         =POISSON(2; 2.2; 1)
General form in LibreOffice Calc: =POISSON(Number; Mean; C)
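The three complaint-center answers can also be cross-checked outside a spreadsheet. The sketch below uses only the Python standard library; the helper names `poisson_pmf` and `poisson_cdf` are hypothetical and play the same roles as the spreadsheet functions with the cumulative flag set to 0 and 1.

```python
import math


def poisson_pmf(x, lam):
    """P[X = x] for a Poisson distribution with mean `lam` (x a non-negative integer)."""
    return math.exp(-lam) * lam ** x / math.factorial(x)


def poisson_cdf(x, lam):
    """P[X <= x]: sum of the probability mass from 0 up to and including x."""
    return sum(poisson_pmf(k, lam) for k in range(x + 1))


lam = 2.2                                   # mean complaint calls per minute

p_none = poisson_pmf(0, lam)                # question 1: no calls
p_two_or_fewer = poisson_cdf(2, lam)        # question 2: 2 or fewer calls
p_at_least_one = 1 - p_none                 # question 3: complement of no calls

print(f"P[X = 0]  = {p_none:.2f}")          # 0.11
print(f"P[X <= 2] = {p_two_or_fewer:.2f}")  # 0.62
print(f"P[X >= 1] = {p_at_least_one:.2f}")  # 0.89
```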
\[ P(A \mid B) = \frac{P(A \text{ and } B)}{P(B)} = \frac{P(A \cap B)}{P(B)} \]
\[ 0 \le P(A) \le 1 \]
Central Tendency
Mode: value that appears most frequently
Median: value in the middle of a distribution, separating the lower half from the higher half
Mean (μ): sum of values multiplied by their probability (expected value)
Spread
Range: maximum value minus minimum value
Inner Quartiles: 75th percentile value minus the 25th percentile value
Variance (σ²): expectation of the squared deviation around the mean
Standard Deviation (σ): square root of the variance
Coefficient of Variation (CV): standard deviation over the mean = σ/μ
\[ E[X] = \bar{x} = \mu = \sum_{i=1}^{n} p_i x_i \]
\[ \mathrm{Var}[X] = \sigma^2 = \sum_{i=1}^{n} p_i (x_i - \bar{x})^2 = \sum_{i=1}^{n} p_i (x_i - \mu)^2 \]
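Discrete Uniform
Probability Mass Function
\[ P[X = x] = f(x \mid a, b) = \begin{cases} \dfrac{1}{n} & \text{for } a \le x \le b \\ 0 & \text{otherwise} \end{cases} \]
Poisson
Probability Mass Function
\[ P[X = x] = f(x \mid \lambda) = \begin{cases} \dfrac{e^{-\lambda} \lambda^{x}}{x!} & \text{for } x = 0, 1, 2, \ldots \\ 0 & \text{otherwise} \end{cases} \]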