Documenti di Didattica
Documenti di Professioni
Documenti di Cultura
Multi-Factored Experiments
Engineering
g g 9516
DOE - I
Introduction
1
Design of Engineering Experiments
Introduction
Assumptions
• You have
– a first course in statistics
– heard of the normal distribution
– know about the mean and variance
– have done some regression analysis or heard of it
– know something about ANOVA or heard of it
• Have used Windows or Mac based computers
• Have done or will be conducting experiments
• Have not heard of factorial designs, fractional
factorial designs, RSM, and DACE.
L. M. Lye DOE Course 4
2
Some major players in DOE
• Sir Ronald A. Fisher - pioneer
– invented ANOVA and used of statistics in experimental
design while
hile working
orking at Rothamsted Agricultural
Agric lt ral
Experiment Station, London, England.
• George E. P. Box - married Fisher’s daughter
– still active (86 years old)
– developed response surface methodology (1951)
– plus many other contributions to statistics
• Others
– Raymond Myers, J. S. Hunter, W. G. Hunter, Yates,
Montgomery, Finney, etc..
L. M. Lye DOE Course 5
3
References
• D. G. Montgomery (2012): Design and Analysis
of Experiments, 8th Edition, John Wiley and Sons
– one of the best book in the market. Uses Design-Expert
g p
software for illustrations. Uses letters for Factors.
• G. E. P. Box, W. G. Hunter, and J. S. Hunter
(2005): Statistics for Experimenters: An
Introduction to Design, Data Analysis, and Model
Building, John Wiley and Sons. 2nd Edition
– Classic text with lots of examples. No computer aided
solutions. Uses numbers for Factors.
• Journal of Quality Technology, Technometrics,
American Statistician, discipline specific journals
L. M. Lye DOE Course 7
4
Strategy of Experimentation
• Strategy of experimentation
– Best guess approach (trial and error)
• can continue indefinitely
• cannot guarantee best solution has been found
– One-factor-at-a-time (OFAT) approach
• inefficient (requires many test runs)
• fails to consider any possible interaction between factors
– Factorial approach (invented in the 1920’s)
• F t varied
Factors i d together
t th
• Correct, modern, and most efficient approach
• Can determine how factors interact
• Used extensively in industrial R and D, and for process
improvement.
L. M. Lye DOE Course 9
5
Statistical Design of Experiments
• All experiments should be designed experiments
• Unfortunately,
Unfortunately some experiments are poorly
designed - valuable resources are used
ineffectively and results inconclusive
• Statistically designed experiments permit
efficiency and economy, and the use of statistical
methods in examining the data result in scientific
objectivity when drawing conclusions.
6
• In general, by using DOE, we can:
– Learn about the process we are investigating
– Screen important
p variables
– Build a mathematical model
– Obtain prediction equations
– Optimize the response (if required)
7
INPUTS OUTPUTS
(Factors) (Responses)
X variables Y variables
People
Materials
responses related
Policies to producing a
A Blending of produce
Inputs which
Generates responses related
Procedures
Corresponding to completing a task
Outputs
Methods
INPUTS OUTPUTS
(Factors) (Responses)
X variables Y variables
Type of
cement
compressive
Percent water
strength
g
PROCESS:
Type of
modulus of elasticity
Additives
Percent
Discovering modulus of rupture
Additives
Optimal
Concrete
Mixing Time Mixture Poisson's ratio
Curing
Conditions
8
INPUTS OUTPUTS
(Factors) (Responses)
X variables Y variables
Type of Raw
Material
Mold
p
Temperature
% shrinkage from
Holding Time
mold size
Manufacturing
Injection number of defective
Gate Size
Molded Parts parts
Screw Speed
INPUTS OUTPUTS
(Factors) (Responses)
X variables Y variables
Impermeable layer
(mm)
IInitial
iti l storage
t PROCESS:
(mm)
Coefficient of R-square:
Infiltration Predicted vs
Observed Fits
Rainfall-Runoff
Model
Coefficient of
Recession
Calibration
Soil Moisture
Capacity
(mm)
Model Calibration
Initial Soil Moisture
(mm)
9
INPUTS OUTPUTS
(Factors) (Responses)
X variables Y variables
Brand:
Cheap vs Costlyy
Taste:
PROCESS: Scale of 1 to 10
Time:
4 min vs 6 min
10
More examples
• Cooking
– Factors: amount of cooking wine, oyster sauce, sesame oil
– Response: Taste of stewed chicken
• Sexual Pleasure
– Factors: marijuana, screech, sauna
– Response: Pleasure experienced in subsequent you know what
• Basketball
– Factors: Distance from basket, type of shot, location on floor
– Response: Number of shots made (out of 10) with basketball
• Skiing
– Factors: Ski type, temperature, type of wax
– Response: Time to go down ski slope
Basic Principles
• Statistical design
g of experiments
p (DOE)
( )
– the process of planning experiments so that
appropriate data can be analyzed by statistical
methods that results in valid, objective, and
meaningful conclusions from the data
– involves two aspects: design and statistical
analysis
11
• Every experiment involves a sequence of
activities:
– Conjecture - hypothesis that motivates the
experiment
– Experiment - the test performed to investigate
the conjecture
– Analysis - the statistical analysis of the data
from the experiment
– Conclusion - what has been learned about the
original conjecture from the experiment.
12
Guidelines for Designing Experiments
• Recognition of and statement of the problem
– need to develop all ideas about the objectives of the
experiment - get input from everybody - use team
approach.
• Choice of factors, levels, ranges, and response
variables.
– Need to use engineering judgment or prior test results.
• Choice of experimental design
– sample size, replicates, run order, randomization,
software to use, design of data collection forms.
13
Using Statistical Techniques in
Experimentation - things to keep in mind
• Use non-statistical knowledge of the problem
– physical laws, background knowledge
• Keep the design and analysis as simple as possible
– Don’t use complex, sophisticated statistical techniques
– If design is good, analysis is relatively straightforward
– If design is bad - even the most complex and elegant
statistics cannot save the situation
• Recognize the difference between practical and
statistical significance
– statistical significance ≠ practically significance
L. M. Lye DOE Course 27
14
DOE (II)
Factorial vs OFAT
15
• The ability to gain competitive advantage requires
extreme care in the design and conduct of
experiments.
i t Special
S i l attention
tt ti mustt be b paidid to
t joint
j i t
effects and estimates of variability that are provided
by factorial experiments.
Factorial Designs
• In a factorial experiment, all
ibl combinations
possible bi ti off
factor levels are tested
• The golf experiment:
– Type of driver (over or regular)
– Type of ball (balata or 3-piece)
– Walking vs. riding a cart
– yp of beverage
Type g (Beer
( vs water))
– Time of round (am or pm)
– Weather
– Type of golf spike
– Etc, etc, etc…
16
Factorial Design
17
Erroneous Impressions About Factorial
Experiments
• Wasteful and do not compensate the extra effort with
additional useful information - this folklore presumes that
one knows
k (not
( t assumes)) that
th t factors
f t independently
i d d tl
influence the responses (i.e. there are no factor
interactions) and that each factor has a linear effect on the
response - almost any reasonable type of experimentation
will identify optimum levels of the factors
• Information on the factor effects becomes available only
after the entire eexperiment
periment is completed
completed. Takes too long.
long
Actually, factorial experiments can be blocked and
conducted sequentially so that data from each block can be
analyzed as they are obtained.
18
One-factor-at-a-time experiments (OFAT)
• OFAT experiments are regarded as easier to implement,
more easily understood, and more economical than
f t i l experiments.
factorial i t B Better
tt than
th trial
t i l andd error.
• OFAT experiments are believed to provide the optimum
combinations of the factor levels.
• Unfortunately, each of these presumptions can generally be
shown to be false except under very special circumstances.
• The key reasons why OFAT should not be conducted
except under very special circumstances are:
– Do not provide adequate information on interactions
– Do not provide efficient estimates of the effects
19
Example: Factorial vs OFAT
Factorial OFAT
high high
Factor B B
low
low
l
low hi h
high l
low hi h
high
Factor A A
20
D E S I G N -E A S E P l o t
In te ra c ti o n G ra p h
L n (f ) k /D
- 3 .4 2 0 3 8 2
2
X = A : R e y n o l d 's #
Y = B : k/ D
- 3 .6
64155
D e si g n P o i n t s
B - 0 .0 0 0
B + 0 .0 0 1
Ln(f)
- 3 .8 6 2 7 2
2
- 4 .0 8 3 8 9
- 4 .3 0 5 0 7 2
4 .0 0 0 4 .5 0 0 5 .0 0 0 5 .5 0 0 6 .0 0 0
R e yn o l d 's #
L. M. Lye DOE Course 41
D E S I G N -E A S E P l o t
0 .00 1 0
L n (f)
L n (f )
X = A : R e y n o l d 's #
Y = B : k/ D
D e si g n P o i n t s
0 .00 0 8
-3 . 5 6 7 8 3 -3 . 8 6 2 7 2
k/D
0 .00 0 6
-3 . 7 1 5 2 8
-4 . 0 1 0 1 7
0 .00 0 3
-4 . 1 5 7 6 2
0 .00 0 1
4 .0 0 0 4 .5 0 0 5 .0 0 0 5 .5 0 0 6 .00 0
R e yn o l d 's #
21
D E S I G N -E A S E P l o t
L n (f )
X = A : R e y n o l d 's #
Y = B : k/ D
-3 . 4 2 0 3 8
-3 . 6 4 1 5 5
-3 . 8 6 2 7 2
-4 . 0 8 3 8 9
Ln(f)
-4 . 3 0 5 0 7
0.0010
0.0008
0.0006 6.000
k/ D 5.500
0.0003 5.000
4.500
0.0001 4.000
R e y n o l d 's #
22
D E S I G N -E X P E R T P l o t
In te ra c ti o n G ra p h
L o g 1 0 (f ) B : k /D
- 1.49 5
X = A: RE
Y = B : k/ D
D e si g n P o i n t s - 1.56 7
B - 0 .0 0 0
B + 0 .0 0 1
Log10(f)
- 1.63 9
- 1.71 2
- 1.78 4
A: R E
D E S I G N -E X P E R T P l o t
L o g 1 0 (f )
X = A : RE
Y = B : k/ D
-1 . 5 5 4
-1 . 6 1 1
-1 . 6 6 8
-1 . 7 2 5
Log10(f)
-1 . 7 8 3
0.0008828
0.0007414
B :0 k/
. 0D
006000
0.0004586 5.707
5.354
5.000
4.646
0.0003172 4.293
A: RE
23
D E S I G N -E X P E R T P l o t
0 .0 0 0 8 8 2 8
L o g 1 0 (f)
L o g 1 0 (f )
D e si g n P o i n t s
X = A: RE
Y = B : k/ D
0 .0 0 0 7 4 1 4
-1 . 6 6 8 -1 . 7 0 6
B: k/D
0 .0 0 0 6 0 0 0
- 1 . 5 9 2- 1 . 6 3 0
0 .0 0 0 4 5 8 6
-1 . 7 4 4
0 .0 0 0 3 1 7 2
4 .2 9 3 4 .6 4 6 5 .0 0 0 5 .3 5 4 5 .7 0 7
A: R E
D E S I G N -E X P E R T P l o t
L o g 1 0 (f )
P re d i c te d vs . A c tu a l
- 1 .4 9 4
- 1 .5 6 6
Predicted
- 1 .6 3 9
- 1 .7 1 1
- 1 .7 8 3
- 1 .7 8 3 - 1 .7 1 1 - 1 .6 3 9 - 1 .5 6 6 - 1 .4 9 4
A c tu a l
24
DOE (III)
Basic Concepts
25
Portland Cement Formulation
Observation Modified Mortar Unmodified Mortar
(sample), j (Formulation 1) y1 j (Formulation 2) y 2j
1 16 85
16.85 17 50
17.50
2 16.40 17.63
3 17.21 18.25
4 16.35 18.00
5 16.52 17.86
6 17.04 17.75
7 16.96 18.22
8 17.15 17.90
9 16.59 17.96
10 16.57 18.15
L. M. Lye DOE Course 51
18.3
17.3
16.3
Form 1 Form 2
26
Box Plots
Boxplots of Form 1 and Form 2
(means are indicated by solid circles)
18 5
18.5
17.5
16.5
Form 1 Form 2
27
The Hypothesis Testing Framework
Estimation of Parameters
1 n
y = ∑ yi estimates the population mean μ
n i =1
1 n
S =
2
∑
n − 1 i =1
( yi − y ) 2 estimates the variance σ 2
28
Summary Statistics
Formulation 1 Formulation 2
y1 = 16.76 y2 = 17.92
S12 = 0.100 S 22 = 0.061
S1 = 0.316
0 316 S 2 = 0.247
0 247
n1 = 10 n2 = 10
29
How the Two-Sample t-Test Works:
Use S12 and S 22 to estimate σ 12 and σ 22
y1 − y2
The previous ratio becomes
S12 S22
+
n1 n2
However, we have the case where σ 12 = σ 22 = σ 2
Pool the individual sample variances:
(n1 − 1) S12 + (n2 − 1) S22
S =2
n1 + n2 − 2
p
30
The Two-Sample (Pooled) t-Test
(n1 − 1) S12 + (n2 − 1) S 22 9(0.100) + 9(0.061)
S p2 = = = 0.081
n1 + n2 − 2 10 + 10 − 2
S p = 0.284
y1 − y2 16.76 − 17.92
t0 = = = −9.13
1 1 1 1
Sp + 0.284 +
n1 n2 10 10
31
The Two-Sample (Pooled) t-Test
• A value of t0 between
–2.101 and 2.101 is
consistent with
equality of means
• It is possible for the
means to be equal and
t0 to exceed either
2.101 or –2.101, but it
would be a “rare
event” … leads to the
conclusion that the
means are different
• Could also use the
P-value approach
L. M. Lye DOE Course 63
32
Minitab Two-Sample t-Test Results
Two-Sample T-Test and CI: Form 1, Form 2
Two-sample T for Form 1 vs Form 2
Checking Assumptions –
The Normal Probability Plot
Tension Bond Strength Data
ML Estimates
Form 1
99
Form 2
95 Goodness of Fit
AD*
90
1.209
80 1.387
70
Percent
60
50
40
30
20
10
5
33
Importance of the t-Test
• Provides an objective framework for simple
comparative experiments
• Could be used to test all relevant hypotheses
in a two-level factorial design, because all
of these hypotheses involve the mean
response at one “side” of the cube versus
the mean response at the opposite “side” of
the cube
L. M. Lye DOE Course 67
34
An Example
• Consider an investigation into the formulation of a
new “synthetic” fiber that will be used to make ropes
• The response variable is tensile strength
• The experimenter wants to determine the “best” level
of cotton (in wt %) to combine with the synthetics
• Cotton content can vary between 10 – 40 wt %; some
non-linearity in the response is anticipated
• The experimenter
p chooses 5 levels of cotton
“content”; 15, 20, 25, 30, and 35 wt %
• The experiment is replicated 5 times – runs made in
random order
An Example
35
The Analysis of Variance
36
Models for the Data
∑∑ ( y
i =1 j =1
ij − y.. ) 2 = ∑∑ [( yi. − y.. ) + ( yij − yi. )]2
i =1 j =1
a a n
= n∑ ( yi. − y.. ) 2 + ∑∑ ( yij − yi. ) 2
i =1 i =1 j =1
SST = SSTreatments + SS E
L. M. Lye DOE Course 74
37
The Analysis of Variance
SST = SSTreatments + SS E
• A large value of SSTreatments reflects large differences in
treatment means
• A small value of SSTreatments likely indicates no differences in
treatment means
• Formal statistical hypotheses
yp are:
H 0 : μ1 = μ 2 = L = μ a
H1 : At least one mean is different
38
The Analysis of Variance is
Summarized in a Table
39
The Reference Distribution:
X = A : C o tto n W e ig h t %
D e si g n P o i n t s
2 0 .5
2 2
2 2
Stre ngth
16
2
1 1 .5
2
7 2
15 20 25 30 35
A : C o tto n W e i g h t %
L. M. Lye DOE Course 80
40
Model Adequacy Checking in the ANOVA
95
N orm al % probability
90
= yij − yi. 80
70
50
• Design-Expert generates 30
the residuals 20
10
useful 1
R e s id u a l
41
Other Important Residual Plots
5.2 5.2
2.95 2.95
2
R es iduals
R es iduals
0.7 0.7
2
2
-1.55 -1.55
2
2
2
-3.8
-3.8
1 4 7 10 13 16 19 22 25
9.80 12.75 15.70 18.65 21.60
Predicted R un N um ber
42
Design-Expert Output
Treatment Means (Adjusted, If Necessary)
Estimated Standard
Mean Error
1-15 9.80 1.27
22-20
20 15.40 1.27
3-25 17.60 1.27
4-30 21.60 1.27
5-35 10.80 1.27
43
The Regression Model
25
20.5
Strength = 62.611 - 2 2
9.011* Wt % + 2 2
Strength
0.481* Wt %^2 - 16
7.600E-003 * Wt %^3
2
This is an empirical model of 11.5
2
the experimental results
7 2
D E S I G N -E X P E R T P l o t
O n e F a c to r P lo t
D e si ra b i l i t y 1 .0 0 0
X = A: A Pr e d ic t 0 . 7 7 2 5
X 2 8 .2 3
D e si g n P o i n t s 0 .7 5 0 0 5
5
Desirability
0 .5 0 0 0
0 .2 5 0 0
6
6
0 .0 0 0 0
1 5 .0 0 2 0 .0 0 2 5 .0 0 3 0 .0 0 3 5 .0 0
A: A
44
L. M. Lye DOE Course 89
45
DOE (IV)
General Factorials
46
Some Basic Definitions
40 + 52 20 + 30
A = y A + − y A− = − = 21
2 2
30 + 52 20 + 40
B = yB + − yB − = − = 11
2 2
52 + 20 30 + 40
L. M. Lye
AB = −DOE Course = −1 93
2 2
50 + 12 20 + 40
A = y A+ − y A− = − =1
2 2
40 + 12 20 + 50
B = yB + − yB − = − = −9
2 2
12 + 20 40 + 50
AB = − = −29
2 2
L. M. Lye DOE Course 94
47
Regression Model & The
Associated Response
Surface
y = β 0 + β1 x1 + β 2 x2
+ β12 x1 x2 + ε
The least squares fit is
yˆ = 35.5 + 10.5 x1 + 5.5 x2
+0.5
0 5 x1 x2
≅ 35.5 + 10.5 x1 + 5.5 x2
Interaction is actually
a form of curvature
48
Example: Battery Life Experiment
49
Statistical (effects) model:
⎧ i = 1,, 2,...,
, ,a
⎪
yijk = μ + τ i + β j + (τβ )ij + ε ijk ⎨ j = 1, 2,..., b
⎪k = 1, 2,..., n
⎩
a b n a b
SST = SS A + SS B + SS AB + SS E
df breakdown:
abn − 1 = a − 1 + b − 1 + (a − 1)(b − 1) + ab(n − 1)
50
ANOVA Table – Fixed Effects Case
Design-Expert Output
Response: Life
ANOVA for Selected Factorial Model
Analysis of variance table [Partial sum of squares]
Sum of Mean F
Source Squares DF Square Value Prob > F
Model 59416.22 8 7427.03 11.00 < 0.0001
A 10683.72 2 5341.86 7.91 0.0020
B 39118.72 2 19559.36 28.97 < 0.0001
AB 9613.78 4 2403.44 3.56 0.0186
Pure E 18230.75 27 675.21
C Total 77646.97 35
51
Residual Analysis
DE S IG N-E X P E RT P l o t Normal plot of residuals DE S IG N-E X P E RT P l o t Residuals vs. Predicted
L i fe
L i fe
45.25
99
95
18.75
N orm al % probability
90
80
R es iduals
70
50 -7.75
30
20
10 -34.25
5
-60.75
Predicted
R es idual
Residual Analysis
DE S IG N-E X P E RT P l o t Residuals vs. Run
L i fe
45.25
18.75
R es iduals
-7.75
-34.25
-60.75
1 6 11 16 21 26 31 36
R un N um ber
52
Residual Analysis
DE S IG N-E X P E RT P l o t Residuals vs. Material DE S IG N-E X P E RT P l o t Residuals vs. Temperature
L i fe L i fe
45.25 45.25
18.75 18 75
18.75
R es iduals
R es iduals
-7.75 -7.75
-34.25
-34.25
-60.75
-60.75
1 2 3
1 2 3
Tem perature
Material
Interaction Plot
DE S IG N-E X P E RT P l o t Interaction Graph
L i fe A: Mate ria l
188
X = B : T e m p e ra tu re
Y = A : M a te ri a l
A1 A1 146
A2 A2
A3 A3
L ife
104
2
62
2
20
15 70 125
B: Te m pe ra tu re
53
Quantitative and Qualitative Factors
"Sequential Model Sum of Squares": Select the highest order polynomial where the
additional terms are significant.
54
Quantitative and Qualitative Factors
A = Material type Candidate model
t
terms from
f Design-
D i
B = Linear effect of Temperature
Expert:
B2 = Quadratic effect of Intercept
Temperature A
B
AB = Material type – TempLinear
B2
AB2 = Material type - TempQuad AB
B3
B3 = Cubic effect of
AB2
Temperature (Aliased)
Sum of Mean F
Source Squares DF Square Value Prob > F
Linear 9689.83 5 1937.97 2.87 0.0333 Suggested
2FI 7374.75 3 2458.25 3.64 0.0252
Quadratic 7298.69 2 3649.35 5.40 0.0106
Cubic 0.00 0 Aliased
Pure Error 18230.75 27 675.21
"Lack of Fit Tests": Want the selected model to have insignificant lack-of-fit.
55
Quantitative and Qualitative Factors
Model Summary
y Statistics
"Model Summary Statistics": Focus on the model maximizing the "Adjusted R-Squared"
and the "Predicted R-Squared".
56
Regression Model Summary of Results
Final Equation in Terms of Actual Factors:
Material
M i l A1
Life =
+169.38017
-2.50145 * Temperature
+0.012851 * Temperature2
Material A2
Life =
+159.62397
-0.17335 * Temperature
p
-5.66116E-003 * Temperature2
Material A3
Life =
+132.76240
+0.90289 * Temperature
-0.010248 * Temperature2
X = B : T e m p e ra tu re
Y = A : M a te ri a l
A1 A1 146
A2 A2
A3 A3
Life
104
2
62
2
20
B: Tem perature
L. M. Lye DOE Course 114
57
Factorials with More Than
Two Factors
• Basic procedure is similar to the two-factor case;
all abc…kn treatment combinations are run in
random order
• ANOVA identity is also similar:
SST = SS A + SS B + L + SS AB + SS AC + L
+ SS ABC + L + SS ABLK + SS E
58