Multiple Imputation
in Education Research
Paul T. von Hippel
Harvard Graduate School of Education
Larsen G-06
Wednesday, April 22, 1:00-2:30 pm
I. Background
II. New Results
I. Background
Education Data
Listwise Deletion
Regression Imputation
Multiple Imputation
Rubin 1987
Steps:
1. Replication
2. Imputation
3. Analysis
4. Recombination
MI Point Estimate
MI Standard Error
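The recombination step uses Rubin's (1987) rules: the MI point estimate is the average of the M estimates, and the MI standard error combines within- and between-imputation variance. A minimal sketch with hypothetical numbers; in practice `est` and `se` would come from analyzing each of the M completed data sets:

```python
# Rubin's (1987) rules for combining estimates from M imputed data sets.
# The estimates and standard errors below are made-up illustrative values.
import numpy as np

est = np.array([2.1, 1.9, 2.3, 2.0, 2.2])       # point estimates from M = 5 analyses
se = np.array([0.50, 0.48, 0.52, 0.49, 0.51])   # their standard errors

M = len(est)
point = est.mean()                  # MI point estimate: average of the M estimates
W = (se ** 2).mean()                # within-imputation variance
B = est.var(ddof=1)                 # between-imputation variance
T = W + (1 + 1 / M) * B             # total variance (Rubin's formula)
mi_se = np.sqrt(T)                  # MI standard error
```

The `(1 + 1/M)` factor inflates the between-imputation variance to account for using a finite number of imputations.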
How large must M be?
Often enough (Rubin 1987; von Hippel 2005)
M = 3 to 10
SAS's MI procedure
Multivariate normal
Normal
Linear
Alternating regression
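The normal-model idea behind such procedures can be sketched in a few lines: regress the incomplete variable on the complete one, then fill each missing value with its prediction plus a random normal residual. This is a simplified illustration on simulated data; a full MI procedure such as SAS's PROC MI would also draw the regression parameters themselves from their posterior distribution rather than fixing them at their estimates.

```python
# Simplified normal-model imputation: predicted value + random residual.
# All data are simulated; parameter draws are omitted for brevity.
import numpy as np

rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
y = 2.0 + 1.5 * x + rng.normal(size=n)
miss = rng.random(n) < 0.3                 # 30% of y missing at random
y_obs = np.where(miss, np.nan, y)

obs = ~miss
b1, b0 = np.polyfit(x[obs], y_obs[obs], 1)               # OLS on observed cases
resid_sd = np.std(y_obs[obs] - (b0 + b1 * x[obs]), ddof=2)

M = 5
imputations = []
for m in range(M):
    y_imp = y_obs.copy()
    # key point: add a random residual, not just the regression prediction
    y_imp[miss] = b0 + b1 * x[miss] + rng.normal(scale=resid_sd, size=miss.sum())
    imputations.append(y_imp)
```

Omitting the random residual (plain regression imputation) would understate the variance of y and overstate its correlation with x.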
IIA. Non-normality
Horton, Lipsitz & Parzen 2003; Allison 2005; Bollen & Barb 1981
Often OK: impute as though normal
Bad: try to make the data normal
Editing imputations
Principle: the imputed data need not look like the original data; what matters is that imputed estimates = original estimates
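This principle can be illustrated with a binary variable imputed as though normal, the case discussed by Horton, Lipsitz & Parzen (2003). Editing the imputations to look plausible, i.e. rounding them to 0/1, biases the estimated proportion, while the implausible unrounded imputations reproduce it. A toy simulation with an assumed true proportion of 0.2:

```python
# Binary variable imputed as though normal: leaving imputations unrounded
# preserves the mean; rounding them to 0/1 biases it. Simulated illustration.
import numpy as np

rng = np.random.default_rng(1)
p = 0.2                                    # assumed true proportion of 1s
n = 100_000                                # number of missing values to impute
# normal-model imputations: mean p, variance p(1 - p)
imp = rng.normal(loc=p, scale=np.sqrt(p * (1 - p)), size=n)

unrounded_mean = imp.mean()                # close to 0.20: unbiased
rounded_mean = (imp > 0.5).mean()          # noticeably above 0.20: biased upward
```

The imputed values themselves are impossible (fractions, negatives), but the estimate computed from them is right; rounding makes the data plausible and the estimate wrong.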
IIB. Missing Y
Missing Ys are useless for regression
But cases with missing Ys have information about X
Little 1992
IIB. Missing Y:
Multiple Imputation, then Deletion (MID)
Steps:
1. Replication
2. Imputation
2′. Deletion [of cases with imputed Y]
3. Analysis
4. Recombination
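The MID steps above can be sketched on simulated data with an assumed true slope of 2. Missing X is imputed and kept, but cases whose Y would be imputed are deleted before each analysis, since imputed Ys add only noise to the regression (Little 1992):

```python
# Sketch of Multiple Imputation, then Deletion (MID) on simulated data.
import numpy as np

rng = np.random.default_rng(2)
n = 1000
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)
x_miss = rng.random(n) < 0.25              # 25% of X missing at random
y_miss = rng.random(n) < 0.25              # 25% of Y missing at random

# imputation model for X given Y, fit on fully observed cases
both = ~x_miss & ~y_miss
c1, c0 = np.polyfit(y[both], x[both], 1)
sd = np.std(x[both] - (c0 + c1 * y[both]), ddof=2)

M, slopes = 5, []
for m in range(M):
    x_imp = x.copy()
    x_imp[x_miss] = c0 + c1 * y[x_miss] + rng.normal(scale=sd, size=x_miss.sum())
    # Y would be imputed here too, but MID deletes imputed-Y cases before
    # analysis, so only cases with observed Y enter the regression
    keep = ~y_miss
    slope, _ = np.polyfit(x_imp[keep], y[keep], 1)
    slopes.append(slope)

mid_slope = np.mean(slopes)                # recombination: average the M slopes
```

The imputed Xs let the missing-X cases contribute; the deleted imputed-Y cases would have contributed nothing but noise.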
IIC. Non-linearity
Principle: the imputed data need not look like the real data; what matters is that imputed statistics = real statistics
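A simulated example of why this matters for non-linear terms. If Y depends on X squared but X is imputed from a linear model and then squared ("passive" imputation), the imputed statistic, the slope of Y on X squared, is attenuated, even though the imputed Xs themselves look plausible. The common remedy, not shown here, is to treat the squared term as just another variable in the imputation model.

```python
# Passive imputation of a quadratic term attenuates the imputed statistic.
# Simulated data; true slope of y on x**2 is 1.
import numpy as np

rng = np.random.default_rng(3)
n = 2000
x = rng.normal(size=n)
y = x ** 2 + rng.normal(scale=0.5, size=n)
miss = rng.random(n) < 0.4                 # 40% of x missing at random

# passive approach: impute x from a linear regression on y, then square it
obs = ~miss
b1, b0 = np.polyfit(y[obs], x[obs], 1)
sd = np.std(x[obs] - (b0 + b1 * y[obs]), ddof=2)
x_imp = x.copy()
x_imp[miss] = b0 + b1 * y[miss] + rng.normal(scale=sd, size=miss.sum())

slope_passive, _ = np.polyfit(x_imp ** 2, y, 1)   # attenuated toward zero
slope_true, _ = np.polyfit(x ** 2, y, 1)          # ~1.0 with the real data
```

The linear imputation model for X cannot see the quadratic relationship, so squaring its imputations destroys it: plausible data, wrong statistic.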
Conclusions
Plausible estimates more important than plausible data
Normal imputations are versatile, but messy
Future research and software
Resampling (approximate Bayesian bootstrap)
Alternatives to imputation: full-information maximum likelihood estimation
Data quality
References
Assumption:
Ignorable missingness
In MI: impute 3-10 values from this distribution.
In ML: integrate across the full distribution of possible Y values.
[Figure: predictive density of possible Y values, plotted against Weight from 75 to 225]
Like MI with an infinite number of imputations.
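The MI-vs-ML contrast can be made concrete with a toy calculation. Suppose the predictive distribution of a missing value is Normal(150, 25), echoing the Weight axis of the figure; these numbers are illustrative, not from the talk. ML effectively uses the exact integral of Y squared over that distribution, while MI averages a few random draws; as the number of imputations grows, the average converges to the integral.

```python
# MI averages draws from the predictive distribution; ML integrates over it.
# With enough draws the MI average converges to the ML integral.
import numpy as np

mu, sigma = 150.0, 25.0            # assumed predictive distribution of a missing value
exact = mu ** 2 + sigma ** 2       # exact E[Y**2], what ML effectively uses

rng = np.random.default_rng(4)
mi_small = np.mean(rng.normal(mu, sigma, size=5) ** 2)          # M = 5: noisy
mi_large = np.mean(rng.normal(mu, sigma, size=1_000_000) ** 2)  # large M: ~exact
```

The gap between `mi_small` and `exact` is exactly the extra noise that the `(1 + 1/M)` term in Rubin's variance formula accounts for.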
ML in AMOS [screenshot: how to run and how to view results]