Using Data and Statistical Tools For Operations Management

CHAPTER 7.
USING DATA AND STATISTICAL

TOOLS FOR OPERATIONS IMPROVEMENT
An Integrated Approach to
Improving Quality and Efficiency
Daniel B. McLaughlin
Julie M. Hays
Healthcare Operations
Management
Copyright 2008 Health Administration Press. All rights reserved. 7-2
Chapter 7. Using Data and Statistical
Tools for Operations Management
Data Collection
Graphical Tools
Mathematical Descriptions
Probability and Probability Distributions
Confidence Intervals, Hypothesis Tests
ANOVA/MANOVA /MANCOVA
Regression
Data Collection
Validity: A valid study has no logic,
sampling, or measurement errors.
- Logic
- Selection or sampling
- Measurement
Data Collection
[Diagram created in Inspiration by Inspiration Software, Inc.]
Data Collection
Logic
Why are the data needed?
What will the data be used for?
What questions are going to be asked of the
data?
Are the patterns of the past going to be
repeated in the future?
Data Collection
Selection or Sampling
Census versus sample
Nonrandom methods
Simple random sampling
Stratified sampling
Systematic or sequential sampling
Cluster or area sampling
Sample size
Data Collection
Measurement
Reliability
- Would the measurement be the same if we repeated it?
Accuracy
- Does the measurement measure what we want it to measure (i.e., say = do)?
Precision
- How precise should the measurements be?
[Figure: three targets labeled "Reliable, but not accurate," "Reliable and accurate," and "Not reliable, but accurate"]
Graphical Tools
Mapping
Visual representations of data
Histograms and Pareto charts
Stem plots, dot plots
Box (and whisker) plots
Normal probability plots
Graphical Tools
Histograms and Pareto Charts
[Figure: two charts. Left: histogram of length of hospital stay (days, in bins from 1-2 to 17-18) versus frequency. Right: Pareto chart of frequency by diagnosis category (Heart Disease, Delivery, Pneumonia, Malignant Neoplasms, Psychoses, Fractures).]
Microsoft Excel screen shots reprinted with permission from Microsoft Corporation.
Graphical Tools
Dot Plots
[Figure: dot plot of length of hospital stay in days ("Dotplot of C1"), with the axis marked from 3 to 18 days]
Produced with Minitab Statistical Software
Graphical Tools
Turnip Graph
Percentage of diabetic Medicare enrollees receiving eye
exams among 306 hospital referral regions (2001)
Source: Wennberg, J. E. 2005. Data from the Dartmouth Atlas Project. Figure copyrighted by the Trustees of
Dartmouth College. Used with permission.
Graphical Tools
Normal Probability Plots
[Figure: normal probability plot for length of hospital stay, showing expected cumulative probability versus observed cumulative probability, both axes from 0.00 to 1.00]
Produced with SPSS for Windows
Graphical Tools
Scatter Plots
Microsoft Excel screen shots reprinted with permission from Microsoft Corporation.
[Figure: four scatter plots of Y versus X]
Strong negative correlation: r = -0.86
Strong positive correlation: r = 0.91
Positive correlation: r = 0.70
No correlation: r = 0.06
Mean
The mean is the arithmetic average of the population:

$$\mu = \frac{\sum x}{N}, \quad \text{where } x = \text{individual values and } N = \text{number of values in the population.}$$

The population mean can be estimated from a sample:

$$\bar{x} = \frac{\sum x}{n}, \quad \text{where } n = \text{number of values in the sample.}$$

For our simple data set,

$$\bar{x} = \frac{3 + 6 + 8 + 5 + 3}{5} = 5.$$
Median and Mode
The median is the middle value of the sample or
population. If the data are arranged into an array
(an ordered data set):
3, 3, 5, 6, 8

5 would be the middle value or median.
The mode is the most frequently occurring value.
In the above example, the value 3 occurs more
often (two times) than any other value, so 3 would
be the mode.
Range and Mean Absolute Deviation
The range is the difference between the high and low values in a data set.

$$\text{Range} = x_{\text{high}} - x_{\text{low}} = 8 - 3 = 5$$

The mean absolute deviation (MAD) is the average of the absolute value of the differences from the mean.

$$\text{MAD} = \frac{\sum |x - \bar{x}|}{n} = \frac{2 + 2 + 0 + 1 + 3}{5} = \frac{8}{5} = 1.6$$
Variance, Standard Deviation
The variance is the average squared difference from the mean.

$$\text{Population variance} = \sigma^2 = \frac{\sum (x - \mu)^2}{N} = \frac{4 + 4 + 0 + 1 + 9}{5} = \frac{18}{5} = 3.6$$

$$\text{Sample variance} = s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} = \frac{4 + 4 + 0 + 1 + 9}{5 - 1} = \frac{18}{4} = 4.5$$

The standard deviation is the square root of the variance.

$$\text{Population standard deviation} = \sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}} = \sqrt{\frac{18}{5}} = \sqrt{3.6} = 1.9$$

$$\text{Sample standard deviation} = s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} = \sqrt{\frac{18}{4}} = \sqrt{4.5} = 2.1$$
Coefficient of Variation
The coefficient of variation (CV) is a measure of the relative variation in the data. It is the standard deviation divided by the mean.

$$CV = \frac{\sigma}{\mu} \text{ or } \frac{s}{\bar{x}} = \frac{1.9}{5} \approx 0.4$$
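The descriptive measures above can be checked with a few lines of Python. This is a minimal sketch using only the standard library `statistics` module; the variable names are illustrative and not part of the original slides.

```python
import statistics

# The simple data set used in the slides.
data = [3, 6, 8, 5, 3]

mean = statistics.mean(data)        # arithmetic average
median = statistics.median(data)    # middle value of the ordered array
mode = statistics.mode(data)        # most frequently occurring value
data_range = max(data) - min(data)  # high value minus low value

# Mean absolute deviation: average absolute difference from the mean.
mad = sum(abs(x - mean) for x in data) / len(data)

pop_var = statistics.pvariance(data)  # divides by N
samp_var = statistics.variance(data)  # divides by n - 1
pop_sd = statistics.pstdev(data)
samp_sd = statistics.stdev(data)

# Coefficient of variation, here using the population standard deviation.
cv = pop_sd / mean

print(mean, median, mode, data_range, mad)
print(pop_var, samp_var, round(pop_sd, 1), round(samp_sd, 1), round(cv, 1))
```

The rounded values agree with the hand calculations on the slides (mean 5, MAD 1.6, variances 3.6 and 4.5, standard deviations 1.9 and 2.1, CV about 0.4).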


Probability and Probability
Distributions
Determination of probabilities
Properties of probabilities
Probability distributions
Discrete probability distributions
Continuous probability distributions
Determination of Probabilities
Observed Probability
Observed probability is the relative frequency of an event: the number of times the event occurred divided by the total number of trials.

$$P(A) = \frac{r}{n} = \frac{\text{Number of times A occurred}}{\text{Total number of observations, trials, or experiments}}$$

$$P(\text{drug is effective}) = \frac{r}{n} = \frac{\text{Number of times patients are cured}}{\text{Total number of patients given the drug}}$$
Theoretical Probability
Theoretical probability is the theoretical relative frequency of an event: the theoretical number of times an event will occur divided by the total number of possible outcomes.

$$P(A) = \frac{r}{n} = \frac{\text{Number of times A could occur}}{\text{Total number of possible outcomes}}$$

$$P(\text{card is a spade}) = \frac{\text{Number of spades in the deck}}{\text{Total number of cards in the deck}} = \frac{13}{52} = 0.25$$
Opinion Probability
Opinion probability is a subjective determination of the number of times an event will occur divided by the imaginary total number of possible outcomes or trials.

$$P(A) = \frac{r}{n} = \frac{\text{Opinion of number of times an event will occur}}{\text{Theoretical total}}$$

$$P(\text{Secretariat winning the Belmont Stakes}) = \frac{r}{n} = \frac{\text{Opinion on the number of times Secretariat would win the Belmont}}{\text{Imaginary total number of times the Belmont would be run}}$$
Properties of Probabilities
Bounds on Probability
Probabilities must always be ≥ 0, and an event that cannot occur has a probability of 0.

$$P(A) = \frac{\text{Least number of times A could occur}}{\text{Any number}} = \frac{0}{\text{Any number}} = 0$$

Probabilities must always be ≤ 1.

$$P(A) = \frac{\text{Greatest number of times A could occur}}{n} = \frac{n}{n} = 1$$

$$0 \le P(A) \le 1$$

P(A) + P(A') = 1 and 1 − P(A') = P(A), where A' is not A.
Multiplicative Property
For two independent events, the probability of both A and B occurring, or the intersection (∩) of A and B, is the probability of A occurring times the probability of B occurring.

$$P(A \text{ and } B \text{ occurring}) = P(A \cap B) = P(A) \times P(B)$$
Multiplicative Property
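[Figure: probability tree starting with a coin toss (H or T), followed by a die toss (1 to 6)]

Coin Toss   Die Toss   Probability
H           1          1/12
H           2          1/12
H           3          1/12    P(H ∩ 3) = 1/12
H           4          1/12
H           5          1/12
H           6          1/12
T           1          1/12
T           2          1/12
T           3          1/12
T           4          1/12
T           5          1/12
T           6          1/12

P(H) = 1/2, P(3) = 1/6, P(H) × P(3) = 1/2 × 1/6 = 1/12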
Additive Property
For two events, the probability of A or B occurring, or the union (∪) of A with B, is the probability of A occurring plus the probability of B occurring, minus the probability of both A and B occurring.

$$P(A \text{ or } B \text{ occurring}) = P(A \cup B) = P(A) + P(B) - P(A \cap B)$$
Additive Property
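[Figure: probability tree starting with a coin toss (H or T), followed by a die toss (1 to 6)]

Coin Toss   Die Toss   Probability
H           1          1/12
H           2          1/12
H           3          1/12
H           4          1/12
H           5          1/12
H           6          1/12
T           1          1/12
T           2          1/12
T           3          1/12
T           4          1/12
T           5          1/12
T           6          1/12

P(H) = 1/2, P(3) = 1/6, P(H ∪ 3) = P(H) + P(3) − P(H ∩ 3) = 1/2 + 1/6 − 1/12 = 7/12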
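The coin-and-die example can be verified by enumerating the twelve equally likely outcomes. The following is an illustrative sketch only; the helper names (`prob`, `is_heads`, `is_three`) are assumptions introduced for this example.

```python
from fractions import Fraction
from itertools import product

# Sample space: every (coin, die) pair is equally likely (1/12 each).
outcomes = list(product("HT", range(1, 7)))


def prob(event):
    """Probability of an event given as a predicate over (coin, die) outcomes."""
    favorable = sum(1 for outcome in outcomes if event(outcome))
    return Fraction(favorable, len(outcomes))


def is_heads(outcome):
    return outcome[0] == "H"


def is_three(outcome):
    return outcome[1] == 3


p_h = prob(is_heads)
p_3 = prob(is_three)
p_h_and_3 = prob(lambda o: is_heads(o) and is_three(o))
p_h_or_3 = prob(lambda o: is_heads(o) or is_three(o))

# Multiplicative property (the coin and the die are independent).
assert p_h_and_3 == p_h * p_3
# Additive property.
assert p_h_or_3 == p_h + p_3 - p_h_and_3

print(p_h, p_3, p_h_and_3, p_h_or_3)
```

Using `Fraction` keeps the arithmetic exact, so the results can be compared directly with the 1/12 and 7/12 shown on the slides.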
Conditional Probability
The probability of an event A occurring, given the additional information that event B has occurred:

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$$
Contingency Table for ER Wait Times
               ≤30 minute wait   >30 minute wait   Total
Friday night   20                30                50
Other times    40                10                50
Total          60                40                100
Conditional Probability
Note that:

$$P(A \cap B) = P(A \mid B)\,P(B) = P(B \mid A)\,P(A)$$

and if one event has no effect on the other event (the events are independent), then

$$P(A \mid B) = P(A) \quad \text{and} \quad P(A \cap B) = P(A)\,P(B)$$

Bayes' theorem

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{P(B \mid A)\,P(A)}{P(B)} = \frac{P(B \mid A)\,P(A)}{P(B \mid A)\,P(A) + P(B \mid A')\,P(A')}$$
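As a worked illustration, the ER wait-time contingency table can be used to compute a conditional probability and then to reverse it with Bayes' theorem. This is a hedged sketch: the dictionary layout and the function `p` are assumptions made for this example, and only the four cell counts come from the slide.

```python
from fractions import Fraction

# Cell counts from the ER wait-time contingency table.
counts = {
    ("friday", "short"): 20,  # Friday night, wait of 30 minutes or less
    ("friday", "long"): 30,   # Friday night, wait of more than 30 minutes
    ("other", "short"): 40,
    ("other", "long"): 10,
}
total = sum(counts.values())


def p(time=None, wait=None):
    """Joint or marginal probability; None means 'any value' for that factor."""
    n = sum(
        c
        for (t, w), c in counts.items()
        if (time is None or t == time) and (wait is None or w == wait)
    )
    return Fraction(n, total)


# Conditional probability: P(long wait | Friday night) = P(long and Friday) / P(Friday).
p_long_given_friday = p("friday", "long") / p(time="friday")
p_long_given_other = p("other", "long") / p(time="other")

# Bayes' theorem: P(Friday night | long wait).
numerator = p_long_given_friday * p(time="friday")
p_friday_given_long = numerator / (numerator + p_long_given_other * p(time="other"))

print(p_long_given_friday, p_friday_given_long)
```

The Bayes result equals the direct reading of the table (30 of the 40 long waits occurred on Friday night), which is a useful sanity check.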
Probability Distributions
Discrete Probability Distributions
The binomial distribution describes the number of times a binary event will occur in a sequence of events.

$$P(x) = \frac{n!}{x!\,(n - x)!}\, p^x (1 - p)^{n - x}$$

[Figure: binomial probabilities for the number of heads (0 to 3) in 3 tosses]

The Poisson distribution is used to model the number of events in a specific period.

$$P(x) = \frac{\lambda^x e^{-\lambda}}{x!}$$

[Figure: Poisson probabilities for the number of patient arrivals (1 to 11) in 1 hour]
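Both discrete distributions are straightforward to evaluate directly from their formulas. The sketch below uses only the standard library; the arrival rate of 4 patients per hour is a hypothetical value chosen for illustration, since the slides do not state the rate behind the Poisson chart.

```python
import math


def binomial_pmf(x, n, p):
    """P(x successes in n independent trials, each with success probability p)."""
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)


def poisson_pmf(x, lam):
    """P(x events in a period when events occur at an average rate lam per period)."""
    return lam**x * math.exp(-lam) / math.factorial(x)


# Number of heads in 3 tosses of a fair coin.
heads = [binomial_pmf(x, 3, 0.5) for x in range(4)]

# Hypothetical arrival rate (an assumption, not taken from the slides).
arrival_rate = 4.0
arrivals = [poisson_pmf(x, arrival_rate) for x in range(12)]

print(heads)
print([round(prob, 3) for prob in arrivals])
```

The binomial values (0.125, 0.375, 0.375, 0.125) match the shape of the coin-toss chart, with the most likely outcomes being one or two heads.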
Continuous Probability Distributions
In the uniform distribution, the probability of occurrence is the same for all outcomes.

$$P(x) = \frac{1}{b - a} \quad \text{for } a \le x \le b$$

[Figure: uniform distribution, P(X) constant between a and b]

The triangular distribution is described by the mode (c), minimum (a), and maximum (b) values.

$$P(x) = \begin{cases} \dfrac{2(x - a)}{(b - a)(c - a)} & \text{for } a \le x \le c \\[2ex] \dfrac{2(b - x)}{(b - a)(b - c)} & \text{for } c \le x \le b \end{cases}$$

[Figure: triangular distribution with Min = 0.0, Mode = 0.5, Max = 2.0]
Exponential Distribution
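The exponential distribution is used to model the time between occurrences of an event (for example, arrivals) when the event occurs at an average rate λ.

$$P(x) = \lambda e^{-\lambda x} \quad \text{for } x \ge 0$$

μ = mean = 1/λ, median = ln(2)/λ, mode = 0, and σ = 1/λ

[Figure: exponential distribution with lambda = 2, P(X) versus X for X from 0 to 2]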
Normal Distribution
The normal distribution, $x \sim N(\mu, \sigma^2)$, is commonly observed in the world and provides a reasonable approximation for many randomly distributed variables.

$$P(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x - \mu)^2 / 2\sigma^2}$$

[Figure: three normal curves, P(X) versus X from -5 to 5: μ = 0, σ = 1.0; μ = 0, σ = 2.5; μ = 2, σ = 0.7]
Standard Normal Distribution
The standard normal distribution, or z distribution, is the normal distribution with μ = 0 and σ = 1.0. Any normal distribution can be transformed to a standard normal distribution by:

$$z = \frac{x - \mu}{\sigma}$$

[Figure: standard normal curve, P(X) versus X from -5 to 5, μ = 0, σ = 1.0]

z-score limits   Proportion within the limits (if normally distributed)
+/- 1 z          0.680
+/- 2 z          0.950
+/- 3 z          0.997
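The proportions in the z-score table can be reproduced from the standard normal cumulative distribution function. This is a small sketch using `statistics.NormalDist` from the Python standard library; the function names are illustrative assumptions.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0.0, sigma=1.0)


def z_score(x, mu, sigma):
    """Transform a value from N(mu, sigma^2) to the standard normal scale."""
    return (x - mu) / sigma


def proportion_within(k):
    """Proportion of a normal population lying within +/- k standard deviations."""
    return std_normal.cdf(k) - std_normal.cdf(-k)


for k in (1, 2, 3):
    print(k, round(proportion_within(k), 4))
```

The computed proportions are roughly 0.683, 0.954, and 0.997. The 0.950 shown for the ±2 limits is the customary rounding; exactly 95% corresponds to limits of about ±1.96.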
Confidence Intervals, Hypothesis Testing
Central Limit Theorem
Hypothesis testing
Type I (α) and Type II (β) errors
T-tests
Proportions
Practical significance versus statistical
significance
Confidence Intervals, Hypothesis Testing
Central Limit Theorem
As the sample size becomes large, the sampling distribution of the mean approaches normality, no matter what the distribution of the original variable, and

$$\mu_{\bar{x}} = \mu \quad \text{and} \quad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$
Sampling Distribution Simulation
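A sampling distribution simulation of the kind referred to here can be sketched in a few lines. The example below is an assumption-laden illustration, not the simulation used in the original presentation: it draws repeated samples from a uniform population (which is clearly not normal) and compares the mean and standard deviation of the sample means with the values predicted by the central limit theorem.

```python
import math
import random
import statistics

random.seed(42)  # fixed seed so the run is repeatable

n = 30              # size of each sample
num_samples = 2000  # number of samples drawn

# Population: uniform on [0, 1), with mean 0.5 and standard deviation sqrt(1/12).
pop_mean = 0.5
pop_sd = math.sqrt(1 / 12)

sample_means = [
    statistics.mean([random.random() for _ in range(n)])
    for _ in range(num_samples)
]

mean_of_means = statistics.mean(sample_means)
sd_of_means = statistics.stdev(sample_means)
predicted_sd = pop_sd / math.sqrt(n)  # sigma / sqrt(n)

print(round(mean_of_means, 3), round(sd_of_means, 4), round(predicted_sd, 4))
```

Plotting a histogram of `sample_means` would show the familiar bell shape even though the underlying population is flat.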
Confidence Intervals
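Confidence interval for the true value of the population mean:

$$\bar{x} - z_{\alpha/2}\,\sigma_{\bar{x}} \le \mu \le \bar{x} + z_{\alpha/2}\,\sigma_{\bar{x}}$$

$$\bar{x} - z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$

[Figure: standard normal curve (Z from -3 to 3) with the central 95% marked and 2.5% in each tail]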
Hypothesis Testing
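Belief or null hypothesis, Ho: $\mu = b$
Alternate belief or hypothesis, Ha: $\mu \neq b$
Decision rule: If $|z| > z^*$, reject the null hypothesis, where

$$z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}}$$

[Figure: standard normal curve (Z from -3 to 3) with rejection regions Z > Z* and Z < -Z*, and the region -Z* < Z < Z* (95% confidence) between them]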
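The confidence interval and the two-sided z-test can be sketched as follows. All of the numbers in the usage lines (a sample mean of 32 minutes from 36 patients, a known σ of 6, and a hypothesized mean of 30) are hypothetical values invented for illustration, as are the function names.

```python
import math
from statistics import NormalDist


def critical_z(confidence=0.95):
    """Two-sided critical value z* for the given confidence level."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def z_confidence_interval(x_bar, sigma, n, confidence=0.95):
    """Confidence interval for the population mean when sigma is known."""
    margin = critical_z(confidence) * sigma / math.sqrt(n)
    return x_bar - margin, x_bar + margin


def z_test(x_bar, b, sigma, n, confidence=0.95):
    """Return (z, reject) for Ho: mu = b against Ha: mu != b."""
    z = (x_bar - b) / (sigma / math.sqrt(n))
    return z, abs(z) > critical_z(confidence)


# Hypothetical example values (not from the slides).
low, high = z_confidence_interval(x_bar=32.0, sigma=6.0, n=36)
z, reject = z_test(x_bar=32.0, b=30.0, sigma=6.0, n=36)

print(round(low, 2), round(high, 2), round(z, 2), reject)
```

With these made-up numbers the interval is roughly 30.04 to 33.96, which excludes 30, so the test and the interval lead to the same conclusion, as they should at the same confidence level.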
Hypothesis Testing
Type I (α) and Type II (β) Errors
Ho: $\mu_1 = \mu_2$
Ha: $\mu_1 \neq \mu_2$

Type I and Type II Error: Clinic Wait Time Example

                                              Reality
Assessment or guess                           Wait times at the two clinics      Wait times at the two clinics
                                              are the same (μ1 = μ2)             are NOT the same (μ1 ≠ μ2)
Wait times at the two clinics                 (no error)                         Type II or β error
are the same (μ1 = μ2)
Wait times at the two clinics                 Type I or α error                  (no error)
are NOT the same (μ1 ≠ μ2)
Equal Variance t-Test
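t-tests are used to test hypotheses about two means.
Ho: $\mu_1 = \mu_2$
Ha: $\mu_1 \neq \mu_2$
Decision rule: If $|t| > t^*$, reject Ho, where

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} \quad \text{and} \quad s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$

Confidence interval

$$(\bar{x}_1 - \bar{x}_2) - t^* s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} \le \mu_1 - \mu_2 \le (\bar{x}_1 - \bar{x}_2) + t^* s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$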
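The pooled (equal variance) t statistic can be computed by hand from the formulas above. The sketch below uses two small sets of clinic wait times that are entirely hypothetical, and it takes the critical value t* = 2.306 (two-sided, 95%, 8 degrees of freedom) from a standard t table rather than computing it.

```python
import math
import statistics


def pooled_t(sample1, sample2):
    """Equal-variance t statistic for Ho: mu1 = mu2, plus degrees of freedom."""
    n1, n2 = len(sample1), len(sample2)
    s1_sq = statistics.variance(sample1)  # sample variance, divides by n - 1
    s2_sq = statistics.variance(sample2)
    # Pooled variance: weighted average of the two sample variances.
    sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)
    std_error = math.sqrt(sp_sq) * math.sqrt(1 / n1 + 1 / n2)
    t = (statistics.mean(sample1) - statistics.mean(sample2)) / std_error
    return t, n1 + n2 - 2


# Hypothetical wait times in minutes at two clinics (invented for illustration).
clinic_a = [20, 25, 30, 35, 40]
clinic_b = [30, 35, 40, 45, 50]

t, df = pooled_t(clinic_a, clinic_b)
t_star = 2.306  # two-sided 95% critical value for df = 8, from a t table
reject = abs(t) > t_star

print(round(t, 3), df, reject)
```

Here the ten-minute difference in means is not quite large enough, relative to the variability in such small samples, to reject Ho at the 95% level.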

Proportions
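Ho: $\pi_1 = \pi_2$
Ha: $\pi_1 \neq \pi_2$
Decision rule: If $|z| > z^*$, reject Ho, where

$$z = \frac{(p_1 - p_2) - (\pi_1 - \pi_2)}{\sqrt{\dfrac{p(1 - p)}{n_1} + \dfrac{p(1 - p)}{n_2}}} \quad \text{and} \quad p = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2}$$

Confidence interval

$$(p_1 - p_2) - z^* \sqrt{\frac{p(1 - p)}{n_1} + \frac{p(1 - p)}{n_2}} \le \pi_1 - \pi_2 \le (p_1 - p_2) + z^* \sqrt{\frac{p(1 - p)}{n_1} + \frac{p(1 - p)}{n_2}}$$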
Practical Significance Versus
Statistical Significance
Basic confidence interval
statistic − [(z*) × (s.e. statistic)] ≤ parameter ≤ statistic + [(z*) × (s.e. statistic)]
As n increases, s.e. decreases and the confidence interval gets narrower.
Large samples may give statistically
significant results that are not practically
significant.
ANOVA/MANOVA/MANCOVA
One-way ANalysis Of VAriance (ANOVA) is used
to test hypotheses about three or more levels of
treatment. A t-test will give the same information
as an ANOVA when there are only two treatment
levels of interest.
Two-way and higher ANOVAs are used when
there is more than one type of treatment variable
of interest.
MANOVA/MANCOVA are used when there is
more than one outcome or dependent variable of
interest.
Regression
Simple linear regression: used to describe the relationship between two variables
Multiple regression: used to describe the relationship between multiple predictor variables and a single dependent variable
General linear model
Artificial neural networks
Design of experiments
What Is the Equation of a Line?
Algebra

$$y = mx + b$$

Statistics

$$Y = bX + a$$

Where

$$b = \text{slope} = \frac{\text{rise}}{\text{run}} = \frac{\Delta y}{\Delta x}$$

$$a = y\text{-intercept} = y \text{ when } x = 0$$
Problem
Student A owns a health insurance firm and
wants us to determine the cost (price would
be a more difficult problem) of providing
healthcare to insured individuals.
Seeing the Future
Experiences
are irrelevant
Experiences
are relevant
Judgment: To what degree are
these experiences still
relevant?
Data
Deductive reasoning versus inductive reasoning
What Is the Cost of Healthcare
Related To?
Quantitative
______________
______________
______________
______________
______________
______________
Qualitative
_____________
_____________
_____________
_____________
_____________
_____________
Selection
Define population
Census or sample
Type of sample
Measurement: accurate, reliable, precise?
X = number of dependents; Y = annual
healthcare expense ($1,000)
Is the study valid?
How do we create knowledge from data?
Data
Number of Dependents (X)   Annual Healthcare Expense, $1,000 (Y)
0                          3
1                          2
2                          6
3                          7
4                          7
Scatterplot
[Figure: scatter plot of Y, annual healthcare cost ($1,000), versus X, number of dependents, with four candidate lines drawn through the data]
y = x + 3
y = 1.2x + 2
y = 5
y = 1.3x + 2.4

Scatterplot Questions
Which is the best line on the scatterplot?
How would you define best (e.g., must be
quantifiable)?
Professor's Model

$$\hat{Y} = bX + a$$

$$\hat{Y} = \text{cost estimate (\$1,000)}$$

$$a = Y\text{-intercept} = 3$$

$$b = \text{slope} = \frac{\Delta Y}{\Delta X} = 1$$

$$\hat{Y} = 1X + 3 = \text{knowledge}$$

Model Comparison

X       Y    Yhat = X + 3   Prof's e = Y - Yhat   Student 1 e   Student 2 e
0       3    3               0                     1.0           0.6
1       2    4              -2                    -1.2          -1.7
2       6    5               1                     1.6           1.0
3       7    6               1                     1.4           0.7
4       7    7               0                     0.2          -0.6
(sum)                        0                     3             0

Student 1: Yhat = 1.2(X) + 2
Student 2: Yhat = 1.3(X) + 2.4
Good Model
A good model must be unbiased.
Σe = 0
Is that enough? What else? Does this remind you of σ²?

How do we get rid of signs?
Model Comparison

X       Y    Yhat = X + 3   e = Y - Yhat   e^2   Student 1 e^2
0       3    3               0             0     1.00
1       2    4              -2             4     1.44
2       6    5               1             1     2.56
3       7    6               1             1     1.96
4       7    7               0             0     0.04
(sum)   25   25              0             6     7
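The comparison of candidate lines can be reproduced numerically. The sketch below, with illustrative names, computes the sum of errors (a check for bias) and the sum of squared errors for the three lines discussed in the slides.

```python
# Data from the slides: X = number of dependents, Y = annual expense ($1,000).
X = [0, 1, 2, 3, 4]
Y = [3, 2, 6, 7, 7]

# Candidate lines as (slope, intercept) pairs.
models = {
    "Professor: Yhat = X + 3": (1.0, 3.0),
    "Student 1: Yhat = 1.2X + 2": (1.2, 2.0),
    "Student 2: Yhat = 1.3X + 2.4": (1.3, 2.4),
}


def residuals(slope, intercept):
    """Errors e = Y - Yhat for a given line."""
    return [y - (slope * x + intercept) for x, y in zip(X, Y)]


summary = {}
for name, (slope, intercept) in models.items():
    e = residuals(slope, intercept)
    sum_e = sum(e)                       # zero for an unbiased line
    sum_e_sq = sum(err**2 for err in e)  # squaring removes the signs
    summary[name] = (sum_e, sum_e_sq)
    print(f"{name}: sum of e = {sum_e:.2f}, sum of e^2 = {sum_e_sq:.2f}")
```

The professor's line and Student 2's line are both unbiased, but Student 2's line has the smaller sum of squared errors (5.1 versus 6), while Student 1's line is biased (errors sum to 3).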
Least Squares Technique
Gauss proved that if you use:

$$b = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sum (X - \bar{X})^2} \quad \text{and} \quad a = \bar{Y} - b\bar{X}$$

You are guaranteed that Σe = 0 and Σe² is a minimum.

Yhat = 1.3X + 2.4, Σe = 0, and Σe² = 5.1.
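The least squares formulas can be applied directly to the five data points. This is a minimal sketch in plain Python; the function name `least_squares` is an assumption for this example.

```python
def least_squares(xs, ys):
    """Return (a, b) for the least squares line Yhat = a + b * X."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx          # slope
    a = y_bar - b * x_bar  # intercept
    return a, b


# Data from the slides.
X = [0, 1, 2, 3, 4]
Y = [3, 2, 6, 7, 7]

a, b = least_squares(X, Y)
errors = [y - (a + b * x) for x, y in zip(X, Y)]
sum_e = sum(errors)
sse = sum(e**2 for e in errors)

print(round(a, 2), round(b, 2), round(sum_e, 10), round(sse, 2))
```

For these data the cross-product sum is 13 and the sum of squared X deviations is 10, giving b = 1.3 and a = 5 − 1.3 × 2 = 2.4, in agreement with the line quoted on the slide.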
Coefficient of Determination
Are we better off making estimates by using information (X = number of dependents) and having created knowledge (Yhat = 1.3X + 2.4) than using no information or knowledge (i.e., is the model better)?

How would you estimate without using our
knowledge (our model)?
Sum of Squares Total

X       Y    Yhat = Ybar   e = Y - Ybar   SSTO (Y - Ybar)^2
0       3    5             -2             4
1       2    5             -3             9
2       6    5              1             1
3       7    5              2             4
4       7    5              2             4
(sum)   25   25             0             22
Note that this method is unbiased.
Graph
[Figure: scatter plot of Y, annual healthcare cost ($1,000), versus number of dependents, with the horizontal line y = 5]

Errors
[Figure: the same scatter plot of Y, annual healthcare costs ($1,000), with the errors of each point marked]

Sum of Squares Error
Yhat = 1.3X + 2.4; e = Y - Yhat; SSE = Σe² = Σ(Y - Yhat)²; SSTO = Σ(Y - Ybar)²

X       Y    Yhat   e      e^2    Ybar   Y - Ybar   (Y - Ybar)^2
0       3    2.4     0.6   0.36   5      -2         4
1       2    3.7    -1.7   2.89   5      -3         9
2       6    5.0     1.0   1.00   5       1         1
3       7    6.3     0.7   0.49   5       2         4
4       7    7.6    -0.6   0.36   5       2         4
(sum)   25   25      0     5.1    25      0         22

What is the percentage of improvement when we use knowledge gained from our model?

$$\%\ \text{improvement} = \frac{\text{Old error level} - \text{New error level}}{\text{Old error level}} = \frac{22 - 5.1}{22} = \frac{16.9}{22} \times 100 = 77\%$$

r² = coefficient of determination = 77%
r² = 0.77

Another Viewpoint
Variation in healthcare cost is either explained by knowledge (the model) or not explained.
Explained and Unexplained Error
[Figure: scatter plot of Y, annual healthcare costs ($1,000), with each point's deviation from the mean split into an explained part (dashed line) and an unexplained part (solid line)]
Sum of Squares Regression
Yhat = 1.3X + 2.4; e = Y - Yhat; SSE = Σe² = Σ(Y - Yhat)²; SSTO = Σ(Y - Ybar)²; SSR = Σ(Yhat - Ybar)²

X       Y    Yhat   e      e^2    Ybar   Y - Ybar   (Y - Ybar)^2   Yhat - Ybar   (Yhat - Ybar)^2
0       3    2.4     0.6   0.36   5      -2         4              -2.6          6.76
1       2    3.7    -1.7   2.89   5      -3         9              -1.3          1.69
2       6    5.0     1.0   1.00   5       1         1               0            0
3       7    6.3     0.7   0.49   5       2         4               1.3          1.69
4       7    7.6    -0.6   0.36   5       2         4               2.6          6.76
(sum)   25   25      0     5.1    25      0         22              0            16.9

$$r^2 = \frac{\text{Explained}}{\text{Total}} = \frac{SSR}{SSTO} = \frac{16.9}{22.0} = 0.77$$

Note: r² is not based on statistics or probability; it is just a percentage.
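The three sums of squares and the coefficient of determination can be computed together, which also confirms that the total variation splits into explained and unexplained parts. This is an illustrative sketch using the fitted line from the slides; the variable names are assumptions.

```python
# Data and fitted line from the slides.
X = [0, 1, 2, 3, 4]
Y = [3, 2, 6, 7, 7]
a, b = 2.4, 1.3  # Yhat = 1.3X + 2.4

y_bar = sum(Y) / len(Y)
y_hat = [a + b * x for x in X]

ssto = sum((y - y_bar) ** 2 for y in Y)             # total variation
sse = sum((y - yh) ** 2 for y, yh in zip(Y, y_hat))  # unexplained variation
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)         # explained variation

r_squared = ssr / ssto
improvement = (ssto - sse) / ssto  # same quantity, seen as error reduction

print(round(ssto, 2), round(sse, 2), round(ssr, 2), round(r_squared, 2))
```

Because SSTO = SSR + SSE for a least squares line, the "percentage improvement" view and the "explained over total" view give the same r² of about 0.77.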
Correlation Coefficient
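$$r = \pm\sqrt{r^2}$$

r = Correlation coefficient
= Measure of the strength of the linear relationship between two variables

$$-1 \le r \le 1$$

[Figure: two scatter plots, one with all points on a rising line (r = +1) and one with all points on a falling line (r = -1)]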
Correlation Coefficient Examples
r = 0.9 r = 0.0
r = 0.5
Questions:
If r² is low, does that mean there is no relationship between your variables?

If r² is high (close to 1), does that mean you always get useful predictions from your model?

If r² is high, does that mean your model has a good fit?

r² and Curves

[Figure: scatter plot of Y versus X in which the points follow a curve]
Can we fit a straight line to this?
Yes, and we are guaranteed that the errors sum to zero and that the sum of squared errors is a minimum.
However, a curve would be better.
Excel Output
SUMMARY OUTPUT

Regression Statistics
Multiple R          0.8765
R Square            0.7682
Adjusted R Square   0.6909
Standard Error      0.8790
Observations        5

ANOVA
             df   SS       MS       F        Significance F
Regression   1    7.6818   7.6818   9.9412   0.0511
Residual     3    2.3182   0.7727
Total        4    10

                                        Coefficients   Standard Error   t Stat    P-value   Lower 95%   Upper 95%   Lower 90.0%   Upper 90.0%
Intercept                               -0.9545        1.0162           -0.9393   0.4169    -4.1885     2.2794      -3.3460       1.4369
Y - $1,000 Annual Health Care Expense    0.5909        0.1874            3.1530   0.0511    -0.0055     1.1873       0.1499       1.0320

RESIDUAL OUTPUT
Observation   Predicted X - Number of Dependents   Residuals   Standard Residuals
1             0.8182                               -0.8182     -1.0747
2             0.2273                                0.7727      1.0150
3             2.5909                               -0.5909     -0.7762
4             3.1818                               -0.1818     -0.2388
5             3.1818                                0.8182      1.0747

PROBABILITY OUTPUT
Percentile   X - Number of Dependents
10           0
30           1
50           2
70           3
90           4

To get this sheet, go to Tools -> Data Analysis -> Regression. If you don't have Data Analysis listed in your tools, see Excel help "Install and Use the Analysis ToolPak."
Residual Plot
[Figure: residuals (from -1.0 to 1.0) versus Y, annual healthcare expense ($1,000)]
Line Fit Plot
[Figure: X, number of dependents, and predicted X, number of dependents, versus Y, annual healthcare expense ($1,000)]
Normal Probability Plot
[Figure: X, number of dependents, versus sample percentile (0 to 100)]
F Test
If $F^* > F_{(1-\alpha;\,1;\,n-2)}$, reject $H_0\colon \beta = 0$ (in this case), where

$$F^* = \frac{MSR}{MSE} = \frac{SSR/1}{SSE/(n - 2)}$$

MSR/MSE ≈ 1 suggests β = 0

MSR/MSE big suggests β ≠ 0
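For the healthcare cost example, the F statistic follows directly from the sums of squares computed earlier (SSR = 16.9, SSE = 5.1, n = 5). The sketch below is illustrative; the critical value 10.13 for F(0.95; 1; 3) is taken from a standard F table rather than computed.

```python
def f_statistic(ssr, sse, n):
    """F* = MSR / MSE for simple linear regression (1 and n - 2 degrees of freedom)."""
    msr = ssr / 1
    mse = sse / (n - 2)
    return msr / mse


# Sums of squares from the healthcare cost example in the slides.
f_star = f_statistic(ssr=16.9, sse=5.1, n=5)

f_critical = 10.13  # F(0.95; 1; 3), from a standard F table
reject = f_star > f_critical

print(round(f_star, 4), reject)
```

The result, about 9.94, falls just short of the 95% critical value, which is consistent with the Significance F of 0.0511 in the Excel output: with only five observations, the slope is not quite statistically significant at the 5% level.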
Assumptions of Linear
Regression
Linear regression is based on several
assumptions. If these assumptions are
violated, the resulting model will be
misleading. The principal assumptions are:
- The dependent and independent variables are
linearly related.
- The errors associated with the model are not
serially correlated.
- The errors are normally distributed and have
constant variance.
Transformations
If the variables are not linearly related or the
assumptions of regression are violated, the variables
can be transformed to produce a possibly better
model.

X     Y    Transform X -> X^2
-3    9    9
-2    4    4
-1    1    1
 0    0    0
 1    1    1
 2    4    4
 3    9    9

[Figure: scatter plot of Y versus X², both axes from 0 to 10; the points fall on a straight line]
Multiple Regression
Multiple independent variables are used to
predict a single dependent variable to
improve the model.
$Y = \alpha + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_k X_k + \varepsilon$
Multicollinearity can be a problem.
General Linear Model
The most general of all linear models
Multiple predictor variables:
- Metric
- Categorical
- Both
Multiple dependent variables:
- Metric
- Categorical
- Both
Can be used to build complex models
Artificial Neural Networks
Neural Networks
Large amounts of data
No explanation of
how/why
Used to predict
outcomes
Traditional Models
Limited amount of data
Model explains
how/why
Used to predict
outcomes
Outline for Analyses
1. Define the problem/question.
2. Determine what data will be needed to address the problem/question.
3. Collect the data.
4. Graph the data.
5. Analyze the data using the appropriate tool.
6. Fix the problem.
7. Evaluate the effectiveness of the fix.
8. Start again.
Choice of Statistical Technique
Independent Variable   Dependent Variable   Mathematical          Graphical
Categorical, One       Categorical, One     χ²
                       Categorical, Many    χ² (layered)
                       Metric, One          t-Test                Histogram type
                       Metric, Many         MANOVA                Box plot
Categorical, Many      Categorical, One     χ²
                       Categorical, Many    χ² (layered)
                       Metric, One          ANOVA                 Box plots
                       Metric, Many         MANOVA
                       Both                 GLM
Metric, One            Categorical, One     Logit
                       Categorical, Many    GLM
                       Metric, One          Simple regression     Scatterplot
                       Metric, Many         GLM
                       Both                 MANCOVA
Metric, Many           Categorical, One     Logit
                       Categorical, Many    GLM
                       Metric, One          Multiple regression
                       Metric, Many         GLM
                       Both                 GLM; neural net
Both                   Categorical, One     ANCOVA
                       Categorical, Many    MANCOVA
                       Metric, One          Simple regression
                       Metric, Many         Multiple regression
                       Both                 GLM; neural net
