Arnulfo Perez
Saturday, February 21, 2015
Executive summary
For cars lighter than 2.61 (wt, in units of 1000 lb), mileage is always better with a manual transmission, and
for cars heavier than 3.32, mileage is always better with an automatic transmission.
Analysis
Let's start by examining the relation between weight and mileage per gallon. There is an inverse correlation,
but there are some outliers or possibly a non-linear relation. Because the question asks to compare automatic vs
manual transmission, I proceed to segment the data by this attribute. The manual data show a clear inverse
relation between mileage and weight, except for two outliers that have mpg > 30. These points correspond to
4 cylinder manual transmission vehicles.
The automatic transmission data do not show a clear relation between mileage and weight, but seem to have
three outliers that have weight > 5. These points correspond to 8 cylinder automatic transmission vehicles.
Therefore, I will divide the data into three groups: those with mpg > 30, those with wt > 5, and the rest, which
will be used as the test points to fit the linear regression models.
testData <- mtcars[(mtcars$mpg<=30)&(mtcars$wt<=5),]
testAutomatic <- testData[testData$am==0,]
testManual <- testData[testData$am==1,]
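The segmentation described above can be checked with a short sketch. It assumes only the mtcars data set that ships with base R; the names highMpg and heavy are hypothetical and are introduced here purely for illustration, they are not part of the original analysis.

```r
# mtcars ships with base R (package 'datasets'); wt is in units of 1000 lb
data(mtcars)

# Overall linear association between weight and mileage
cor(mtcars$wt, mtcars$mpg)

# The three groups, using the same cut-offs as the filter above
# (highMpg and heavy are hypothetical names used only in this sketch)
highMpg  <- mtcars[mtcars$mpg > 30, ]
heavy    <- mtcars[mtcars$wt > 5, ]
testData <- mtcars[(mtcars$mpg <= 30) & (mtcars$wt <= 5), ]

# Cylinder count and transmission (am: 0 = automatic, 1 = manual)
# of the cars that are set aside
highMpg[, c("mpg", "wt", "cyl", "am")]
heavy[, c("mpg", "wt", "cyl", "am")]

# Sanity check: the groups do not overlap and together cover every car
stopifnot(nrow(highMpg) + nrow(heavy) + nrow(testData) == nrow(mtcars))
```

Listing the excluded rows makes it easy to confirm which vehicles were set aside before the models are fitted.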
The linear models are generated using lm():
fitAutomatic <- lm(mpg ~ wt + cyl, data=testAutomatic)
fitManual <- lm(mpg ~ wt + cyl, data=testManual)
# Global test of model assumptions
library(gvlma)
gvAutomatic <- gvlma(fitAutomatic)
gvManual <- gvlma(fitManual)
gvAutomatic
##
## Call:
## lm(formula = mpg ~ wt + cyl, data = testAutomatic)
##
## Coefficients:
## (Intercept)           wt          cyl
##     28.0718       0.5855      -1.7722
##
## Level of Significance = 0.05
##
## Call:
## gvlma(x = fitAutomatic)
##
##                      Value p-value                Decision
## Global Stat        1.19177  0.8795 Assumptions acceptable.
## Skewness           0.41090  0.5215 Assumptions acceptable.
## Kurtosis           0.40468  0.5247 Assumptions acceptable.
## Link Function      0.31453  0.5749 Assumptions acceptable.
## Heteroscedasticity 0.06166  0.8039 Assumptions acceptable.
gvManual
##
## Call:
## lm(formula = mpg ~ wt + cyl, data = testManual)
##
## Coefficients:
## (Intercept)           wt          cyl
##     40.6192      -5.9706      -0.6241
##
##                      Value p-value                Decision
## Global Stat        2.23487  0.6927 Assumptions acceptable.
## Skewness           0.05182  0.8199 Assumptions acceptable.
## Kurtosis           0.46779  0.4940 Assumptions acceptable.
## Link Function      0.95039  0.3296 Assumptions acceptable.
## Heteroscedasticity 0.76488  0.3818 Assumptions acceptable.
According to the linear regression models, the expected difference in mileage between manual and automatic
transmission (manual minus automatic) is given by the following coefficients:
## (Intercept)          wt         cyl
##   12.547405   -6.556098    1.148119
From the model it can be seen that for cars lighter than 2.61, mileage is always better with a manual
transmission, and for cars heavier than 3.32, mileage is always better with an automatic transmission.
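The two weight thresholds can be reproduced from the reported coefficients. The sketch below is an illustration, not necessarily the author's own code: it re-enters the rounded coefficients printed above, and breakEvenWeight is a hypothetical helper. It solves intercept + wt * w + cyl * c = 0 for the weight w at which the manual-minus-automatic difference is zero, for c = 4, 6 and 8 cylinders.

```r
# Coefficients as printed in the model output above (rounded values)
coefAutomatic <- c("(Intercept)" = 28.0718, wt = 0.5855,  cyl = -1.7722)
coefManual    <- c("(Intercept)" = 40.6192, wt = -5.9706, cyl = -0.6241)

# Expected mpg difference (manual minus automatic), coefficient by coefficient
coefDiff <- coefManual - coefAutomatic
print(coefDiff)

# Hypothetical helper: weight at which the expected difference is zero
# for a given number of cylinders
breakEvenWeight <- function(cyl) {
  unname(-(coefDiff["(Intercept)"] + coefDiff["cyl"] * cyl) / coefDiff["wt"])
}

breakEven <- sapply(c(4, 6, 8), breakEvenWeight)
names(breakEven) <- c("cyl4", "cyl6", "cyl8")
print(round(breakEven, 2))
```

Because the difference falls as weight rises and grows with the number of cylinders, the 4 cylinder break-even weight bounds the region where manual is better for every cylinder count, and the 8 cylinder break-even weight bounds the region where automatic is better for every cylinder count. With these rounded inputs the two bounds come out close to the 2.61 and 3.32 quoted in the text.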
References
Harold V. Henderson and Paul F. Velleman, Building Multiple Regression Models Interactively,
Biometrics, Vol. 37, No. 2 (Jun., 1981), pp. 391-411. Published by: International Biometric Society.
Stable URL: http://www.jstor.org/stable/2530428
Appendix
[Figure: scatter plot of mpg against wt]
[Figure: Studentized Residuals(fitManual) against t Quantiles]
[Figure: Studentized Residuals(fitAutomatic) against t Quantiles]