Engineering Statistics
Winter 2006/2007
Introduction
Notes and Reference Materials
• Reference Texts:
– Montgomery, D.C. and Runger, G.C., “Applied Statistics and Probability for Engineers”, First Ed., Wiley, 1994.
– Box, G.E.P., Hunter, W.G. and Hunter, J.S., “Statistics for Experimenters”, Wiley, 1978.
– Eriksson, L., Johansson, E., Kettaneh-Wold, N. and Wold, S., “Multi- and Megavariate Data Analysis: Principles and Applications”, Umetrics AB, Sweden, 2001.
Part 1: Objectives
• Focus:
The presentation of the material will emphasize the use of
statistical techniques to extract information from measured data,
and the use of this information for decision making. The focus of
the course will be on the practical use of various statistical
techniques, and this will sometimes demand a close look at the
mathematics underlying the techniques so as to understand their
strengths and limitations.
We will discuss several examples in class, but many additional
examples are covered in the reference texts.
Outline
• Outline:
Review
– measures of position (mean, median, mode)
– measures of spread (variance, standard deviation, …)
– measures of uncertainty (confidence intervals, hypothesis tests)
Statistical Process Control
– philosophy
– SPC charts
Outline (Cont.)
• PART II
1) Design of Experiments (DOE)
- Concepts behind DOE
- Randomization and blocking
- Factorial Designs
- Fractional Factorial Designs
- Response Surface Designs
- Optimal Designs
2) Introduction to Multivariate Statistics
- Introduction to Principal Component Analysis (PCA)
- Troubleshooting processes using plant data
- Multivariate SPC
- Industrial applications
- Introduction to Partial Least Squares (PLS)
Review - Visualizing Data
Histogram
[Figure: histogram of End Sulphur (x-axis) against relative frequency (y-axis, 0 to 0.18) for the Pre DRC and DRC series, with the Pre DRC target, the DRC target, and the constraint marked.]
$$\text{relative frequency} = \frac{\text{frequency}}{n}$$
Histogram 2
Bins 1    Frequency    Cumulative %
84.5      4            10.26%
89.5      0            10.26%
94.5      4            20.51%
99.5      9            43.59%
104.5     10           69.23%
109.5     6            84.62%
114.5     3            92.31%
More      3            100.00%
[Figure: histogram of Bins 1 showing Frequency as bars (left axis, 0 to 12) and Cumulative % as a line (right axis, 0% to 120%).]
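The relative frequencies and the Cumulative % column can be reproduced from the bin counts alone. The following is a minimal Python sketch using the counts from the Histogram 2 table; the variable names are illustrative and not part of the course notes.

```python
# Bin labels and counts copied from the "Histogram 2" table.
bins = ["84.5", "89.5", "94.5", "99.5", "104.5", "109.5", "114.5", "More"]
counts = [4, 0, 4, 9, 10, 6, 3, 3]

n = sum(counts)  # total number of observations

# relative frequency = frequency / n
rel_freq = [c / n for c in counts]

# Cumulative % is the running total of the counts, expressed as a percentage of n.
cumulative = []
running = 0
for c in counts:
    running += c
    cumulative.append(100.0 * running / n)

for b, c, rf, cp in zip(bins, counts, rel_freq, cumulative):
    print(f"{b:>6} {c:>3} {rf:7.4f} {cp:7.2f}%")
```

The relative frequencies sum to 1, which is why the last cumulative entry is always 100%.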
Scatter Plot
[Figure: scatter plot of Y (30 to 55) against X (5 to 20) for the series X2, with the fitted line Linear (X2): y = 0.2239x + 37.349.]
Probability distributions
$$\int_{-\infty}^{\infty} f(x)\,dx = 1$$
$$P(x_1 \le X \le x_2) = \int_{x_1}^{x_2} f(u)\,du$$
$$F(x) = P(X \le x) = \int_{-\infty}^{x} f(u)\,du$$
$$f(x) = \frac{dF(x)}{dx}$$
The Normal Distribution
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$
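The density properties listed under "Probability distributions" can be checked numerically for the normal distribution. The sketch below assumes the standard normal density formula with μ = 0 and σ = 1 as illustrative defaults; the helper names (`normal_pdf`, `normal_cdf`, `integrate`) are hypothetical and not from the course notes.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Normal density f(x) with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def normal_cdf(x, mu=0.0, sigma=1.0):
    """F(x) = P(X <= x), written with the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def integrate(f, a, b, steps=20000):
    """Composite trapezoidal rule for the integral of f over [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for k in range(1, steps):
        total += f(a + k * h)
    return total * h

# Total area under the density (tails beyond +/- 10 sigma are negligible).
area = integrate(normal_pdf, -10.0, 10.0)

# P(x1 <= X <= x2) as an integral of f and as F(x2) - F(x1).
x1, x2 = -1.0, 1.0
p_integral = integrate(normal_pdf, x1, x2)
p_cdf = normal_cdf(x2) - normal_cdf(x1)

# f(x) = dF/dx, checked with a central finite difference at x = 0.5.
eps = 1e-5
slope = (normal_cdf(0.5 + eps) - normal_cdf(0.5 - eps)) / (2 * eps)

print(area, p_integral, p_cdf, slope, normal_pdf(0.5))
```

The two routes to P(-1 ≤ X ≤ 1) agree to within the integration error, and the finite-difference slope of F matches f at the chosen point.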
Example
Example (Cont.)
Mean
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$
Another statistical estimator of the mean is the median. It has larger variance than the sample mean, but is more robust to outliers.
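The robustness of the median can be seen by adding a single gross error to a small sample. This is a minimal sketch using Python's standard `statistics` module; the data values are hypothetical and chosen only for illustration.

```python
import statistics

# Hypothetical measurements, for illustration only.
data = [9.8, 10.1, 10.0, 9.9, 10.2]
with_outlier = data + [25.0]  # one gross error added to the sample

mean_clean = statistics.mean(data)
median_clean = statistics.median(data)
mean_outlier = statistics.mean(with_outlier)
median_outlier = statistics.median(with_outlier)

print("clean:       ", mean_clean, median_clean)
print("with outlier:", mean_outlier, median_outlier)
```

The single outlier moves the mean by about 2.5 units, while the median shifts by only 0.05.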
Variance
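Continuous case:
$$V(X) = \sigma_x^2 = E\left[(X-\mu)^2\right] = \int_{-\infty}^{\infty} (x-\mu)^2 f(x)\,dx$$
Discrete case:
$$V(X) = \sigma_x^2 = E\left[(X-\mu)^2\right] = \sum_{x} (x-\mu)^2 f(x)$$
Standard deviation: $\sigma = +\sqrt{\sigma^2}$
Sample variance:
$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}$$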
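The sample variance formula with the n − 1 divisor can be written out directly and compared with the standard library. This is a minimal sketch; the sample values are hypothetical.

```python
import math
import statistics

# Hypothetical sample, for illustration only.
x = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
n = len(x)
x_bar = sum(x) / n

# Sample variance with the n - 1 divisor, as in the formula for s^2.
s2 = sum((xi - x_bar) ** 2 for xi in x) / (n - 1)
s = math.sqrt(s2)  # the standard deviation is the positive square root

# statistics.variance and statistics.stdev use the same n - 1 divisor.
print(s2, statistics.variance(x))
print(s, statistics.stdev(x))
```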
Expected Value
$$E[g(X)] = \int_{-\infty}^{\infty} g(x)\, f(x)\,dx$$
• The mean and variance are special cases of this general definition
$$E[cX] = c\,E[X]$$
$$E[X + Y] = E[X] + E[Y]$$
$$V(cX) = c^2\, V(X)$$
$$V(X + Y) = V(X) + V(Y) + 2\,\mathrm{Cov}(X, Y)$$
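These rules also hold exactly for sample means, variances, and covariances, so they can be checked on a small data set. Note that the covariance term enters the variance of a sum with a factor of 2. The sketch below uses hypothetical paired data and a hypothetical helper `cov`.

```python
import statistics

# Hypothetical paired observations, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 6.0]
c = 3.0

def cov(a, b):
    """Sample covariance with the n - 1 divisor."""
    a_bar, b_bar = statistics.mean(a), statistics.mean(b)
    return sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b)) / (len(a) - 1)

total = [xi + yi for xi, yi in zip(x, y)]   # observations of X + Y
scaled = [c * xi for xi in x]               # observations of cX

# E[X + Y] = E[X] + E[Y]
lhs_mean = statistics.mean(total)
rhs_mean = statistics.mean(x) + statistics.mean(y)

# V(X + Y) = V(X) + V(Y) + 2 Cov(X, Y)
lhs_var = statistics.variance(total)
rhs_var = statistics.variance(x) + statistics.variance(y) + 2 * cov(x, y)

print(lhs_mean, rhs_mean)
print(lhs_var, rhs_var)
print(statistics.variance(scaled), c ** 2 * statistics.variance(x))
```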
Solution
n = 24 bottles
μ = 340 grams
σ = 1.2 grams
The expected value of the mass of beverage in an individual bottle is μ = 340 grams. Applying the definition of expected value to a batch of n bottles yields:
$$E(nX) = n\,E(X)$$
Solution (Cont.)
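The standard deviation is the square root of the variance, with the variance defined as (p.8.4):
$$\sigma^2(X) = E\left[(X - \mu)^2\right] = E\left\{[X - E(X)]^2\right\}$$
Replacing X with nX gives the variance for the batch:
$$\sigma^2(nX) = E\left\{[nX - nE(X)]^2\right\} = n^2\, E\left\{[X - E(X)]^2\right\} = n^2\, \sigma^2(X)$$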
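The numbers for this example follow from the two results above. This is a minimal sketch that treats the batch mass as the scaled variable nX, as the derivation does; a batch modelled as a sum of n independently filled bottles would be a different calculation. The variable names are illustrative.

```python
import math

n = 24        # bottles in the batch
mu = 340.0    # mean mass per bottle, grams
sigma = 1.2   # standard deviation per bottle, grams

expected_batch = n * mu                 # E(nX) = n E(X)
variance_batch = n ** 2 * sigma ** 2    # V(nX) = n^2 V(X)
std_batch = math.sqrt(variance_batch)   # equals n * sigma

print(expected_batch, variance_batch, std_batch)
```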
Covariance
• Population covariance:
$$\sigma_{XY}^2 = E\{(X - \mu_X)(Y - \mu_Y)\} = E(XY) - \mu_X \mu_Y$$
• Sample covariance:
$$\hat{\sigma}_{XY}^2 = s_{XY}^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n-1}$$
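The two expressions for the population covariance, E{(X − μ_X)(Y − μ_Y)} and E(XY) − μ_X μ_Y, can be compared on a small discrete example. The joint probability mass function below is hypothetical and chosen only for illustration.

```python
# Hypothetical joint probability mass function p(x, y), for illustration only.
pmf = {
    (0, 0): 0.4,
    (0, 1): 0.1,
    (1, 0): 0.1,
    (1, 1): 0.4,
}

mu_x = sum(p * x for (x, y), p in pmf.items())
mu_y = sum(p * y for (x, y), p in pmf.items())

# Definition: E{(X - mu_X)(Y - mu_Y)}
cov_definition = sum(p * (x - mu_x) * (y - mu_y) for (x, y), p in pmf.items())

# Shortcut: E(XY) - mu_X * mu_Y
e_xy = sum(p * x * y for (x, y), p in pmf.items())
cov_shortcut = e_xy - mu_x * mu_y

print(cov_definition, cov_shortcut)
```

Both routes give the same value (0.15 for this distribution, up to floating-point rounding).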
Correlation
• Population correlation:
$$\rho = \frac{\sigma_{XY}^2}{\sigma_X \sigma_Y}$$
• Sample correlation:
$$r = \frac{\sum_{i=1}^{n} (y_i - \bar{y})(x_i - \bar{x})}{\sqrt{\left(\sum_{i=1}^{n} (x_i - \bar{x})^2\right)\left(\sum_{i=1}^{n} (y_i - \bar{y})^2\right)}}$$
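The sample covariance and sample correlation can be computed from the same three sums of deviations. This is a minimal sketch with hypothetical paired data. The n − 1 divisors cancel in r, so only the raw sums are needed for the correlation.

```python
import math

# Hypothetical paired data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Sums of cross-products and squared deviations.
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)

s_xy = sxy / (n - 1)             # sample covariance
r = sxy / math.sqrt(sxx * syy)   # sample correlation, always in [-1, 1]

print(s_xy, r)
```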