
Department of Industrial and Systems Engineering, IIT Kharagpur

Subject: Applied Multivariate Statistical Modeling I (IM60061)


Home Assignment VII on MLR
Prepared by Prof J Maiti of ISE, IIT Kharagpur

1. i) Define Multiple Linear Regression (MLR).


ii) Give a pictorial representation of MLR.
2. State the assumptions of MLR.
3. i) Define the design matrix in MLR.
ii) State the properties of the design matrix.

4. Prove that the OLS estimate of the regression coefficient vector is $\hat{\beta} = (X^T X)^{-1} X^T y$.
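The estimator in question 4 can be checked numerically before attempting the proof. The following is a minimal sketch, assuming NumPy is available; the data values and the helper name `ols_coefficients` are hypothetical and chosen only for illustration. It solves the normal equations $(X^T X)\hat{\beta} = X^T y$ and cross-checks the result against NumPy's least-squares solver.

```python
import numpy as np

def ols_coefficients(X, y):
    """Return the OLS estimate by solving the normal equations (X^T X) b = X^T y."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Solving the linear system avoids forming the inverse explicitly.
    return np.linalg.solve(X.T @ X, X.T @ y)

# Hypothetical data: 6 observations, an intercept column and two predictors.
X = np.array([
    [1.0, 2.0, 1.0],
    [1.0, 3.0, 4.0],
    [1.0, 5.0, 2.0],
    [1.0, 7.0, 6.0],
    [1.0, 8.0, 3.0],
    [1.0, 9.0, 7.0],
])
y = np.array([4.1, 7.9, 8.2, 14.1, 12.8, 17.2])

beta_hat = ols_coefficients(X, y)

# Cross-check against NumPy's general least-squares routine.
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

# At the OLS solution the residuals are orthogonal to every column of X.
residuals = y - X @ beta_hat
print(beta_hat)
print(X.T @ residuals)
```

The orthogonality of the residuals to the columns of $X$ is the first-order condition from which the closed form is usually derived.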

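5. Prove that [where $H_{n \times n}$ = hat matrix]
a) $\hat{\epsilon} = (I - H) y$
b) $SSE = \hat{\epsilon}^T \hat{\epsilon} = y^T (I - H) y$
c) $(I - H)$ is a symmetric and idempotent matrix
d) $E(\hat{\beta}) = \beta$
e) $Cov(\hat{\beta}) = \sigma^2 (X^T X)^{-1}$
f) $S_e^2 = \dfrac{y^T (I - H) y}{n - p - 1}$

6. If $\hat{\beta}$ is $N_{p+1}(\beta, \sigma^2 C)$, a multivariate normal, where $C = (X^T X)^{-1}$, then prove that
i) the $100(1-\alpha)\%$ confidence region (CR) for $\beta$ is
$(\hat{\beta} - \beta)^T (X^T X) (\hat{\beta} - \beta) \le (p+1)\, S_e^2\, F_{p+1,\, n-p-1}^{(\alpha)}$
Hints: (see Johnson & Wichern, p. 371)
ii) the $100(1-\alpha)\%$ simultaneous confidence interval (SCI) for $\beta_j$ is
$\hat{\beta}_j \pm \sqrt{S_e^2 C_{jj}}\, \sqrt{(p+1)\, F_{p+1,\, n-p-1}^{(\alpha)}}$, $(j = 0, 1, \ldots, p)$
Hints: (see Johnson & Wichern, p. 371).

7. If $(\hat{\beta}_j - \beta_j) / SE(\hat{\beta}_j)$ follows a t-distribution with $(n-p-1)$ dof, then prove that the $100(1-\alpha)\%$ CI for $\beta_j$ is
$\hat{\beta}_j - t_{n-p-1}^{(\alpha/2)}\, S_e \sqrt{C_{jj}} \le \beta_j \le \hat{\beta}_j + t_{n-p-1}^{(\alpha/2)}\, S_e \sqrt{C_{jj}}$.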

8. Show that:
i) $E(\hat{\epsilon}) = 0$
ii) $Cov(\hat{\epsilon}) = \sigma^2 (I - H)$ [here $H$ = hat matrix]
[Hints: take $\hat{\epsilon} = (I - H) y$]
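The hat-matrix identities used in the residual results above can be verified numerically. This is a minimal sketch, assuming NumPy and a small hypothetical data set (the numbers carry no meaning beyond illustration). It builds $H = X (X^T X)^{-1} X^T$, forms $I - H$, and checks symmetry, idempotency, the residual identity, and the quadratic form for SSE.

```python
import numpy as np

# Hypothetical design matrix (intercept + 2 predictors) and response vector.
X = np.array([
    [1.0, 2.0, 1.0],
    [1.0, 3.0, 4.0],
    [1.0, 5.0, 2.0],
    [1.0, 7.0, 6.0],
    [1.0, 8.0, 3.0],
    [1.0, 9.0, 7.0],
])
y = np.array([4.1, 7.9, 8.2, 14.1, 12.8, 17.2])
n, k = X.shape  # k = p + 1 columns, including the intercept

# Hat matrix H = X (X^T X)^{-1} X^T, computed via a linear solve.
H = X @ np.linalg.solve(X.T @ X, X.T)
M = np.eye(n) - H  # the matrix I - H

# Residuals computed two ways: y - X beta_hat and (I - H) y.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
resid_direct = y - X @ beta_hat
resid_via_M = M @ y

# SSE computed two ways: e^T e and y^T (I - H) y.
sse_direct = float(resid_direct @ resid_direct)
sse_quadratic = float(y @ M @ y)

is_symmetric = np.allclose(M, M.T)
is_idempotent = np.allclose(M @ M, M)
annihilates_X = np.allclose(M @ X, 0.0)  # (I - H) X = 0
trace_M = float(np.trace(M))             # equals n - (p + 1)

print(is_symmetric, is_idempotent, annihilates_X, trace_M)
```

The property $(I - H) X = 0$ is what makes $E(\hat{\epsilon}) = (I - H) X \beta = 0$, and idempotency is what reduces $Cov(\hat{\epsilon})$ to $\sigma^2 (I - H)$.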

9. i) What is $R^2$?
ii) Show that $R^2 = 1 - \dfrac{SSE}{SST}$.
iii) State the limitation of $R^2$ as a measure of model adequacy in MLR.
iv) How does the adjusted $R^2$ ($R_a^2$) overcome the limitation of $R^2$?
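A small numerical comparison may help with parts iii) and iv). The sketch below uses hypothetical values of SST, SSE, n and p (none come from the assignment) and the standard definitions $R^2 = 1 - SSE/SST$ and $R_a^2 = 1 - \dfrac{SSE/(n-p-1)}{SST/(n-1)}$. It illustrates a case where adding a weak predictor raises $R^2$ slightly while $R_a^2$ falls.

```python
def r_squared(sse, sst):
    """Coefficient of determination: 1 - SSE/SST."""
    return 1.0 - sse / sst

def adjusted_r_squared(sse, sst, n, p):
    """Adjusted R^2 for n observations and p predictors (intercept excluded)."""
    return 1.0 - (sse / (n - p - 1)) / (sst / (n - 1))

# Hypothetical example: same response, n = 30, SST = 200.
n, sst = 30, 200.0
sse_a, p_a = 80.0, 2   # model A: two predictors
sse_b, p_b = 79.5, 3   # model B: one extra, nearly useless predictor

r2_a, r2_b = r_squared(sse_a, sst), r_squared(sse_b, sst)
adj_a = adjusted_r_squared(sse_a, sst, n, p_a)
adj_b = adjusted_r_squared(sse_b, sst, n, p_b)

print(f"R2:     A = {r2_a:.4f}, B = {r2_b:.4f}")
print(f"adj R2: A = {adj_a:.4f}, B = {adj_b:.4f}")
```

Because SSE cannot increase when a predictor is added, $R^2$ never decreases; the degrees-of-freedom correction in $R_a^2$ penalizes the extra term.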

10. i) Develop the ANOVA table for MLR.
ii) What hypotheses are tested in MLR?
iii) How is an individual regression parameter tested in MLR?
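The structure of the ANOVA table can be sketched in a few lines. The function name `anova_table` and the input values below are hypothetical; the layout assumes the usual decomposition SST = SSR + SSE with p, n - p - 1 and n - 1 degrees of freedom for a model with p predictors and an intercept.

```python
def anova_table(sst, sse, n, p):
    """Build the MLR ANOVA table from SST, SSE, sample size n and p predictors."""
    ssr = sst - sse                      # regression sum of squares
    df_reg, df_err, df_tot = p, n - p - 1, n - 1
    msr = ssr / df_reg                   # mean square for regression
    mse = sse / df_err                   # mean square error
    return {
        "Regression": {"SS": ssr, "DoF": df_reg, "MS": msr, "F": msr / mse},
        "Error": {"SS": sse, "DoF": df_err, "MS": mse},
        "Total": {"SS": sst, "DoF": df_tot},
    }

# Hypothetical inputs: SST = 1000, SSE = 400, n = 25 observations, p = 4 predictors.
table = anova_table(1000.0, 400.0, 25, 4)
for source, row in table.items():
    print(source, row)
```

The F ratio is compared with the upper percentile of the F distribution with (p, n - p - 1) degrees of freedom to test the overall hypothesis that all slope coefficients are zero.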
11. Test the following using residuals of MLR
a) Test of linearity
b) Test of homoskedasticity
c) Test of uncorrelated error terms
d) Normality of error terms
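For part c), one commonly used check is the Durbin-Watson statistic; it is named here as an illustration and is not necessarily the method the course expects. The sketch below computes $d = \sum_{t=2}^{n} (e_t - e_{t-1})^2 / \sum_{t=1}^{n} e_t^2$ on two hypothetical residual sequences.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for a time-ordered sequence of residuals."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e * e for e in residuals)
    return numerator / denominator

# Hypothetical residuals that drift steadily (positive autocorrelation).
trending = [-3.0, -2.0, -1.0, 1.0, 2.0, 3.0]
# Hypothetical residuals that flip sign every step (negative autocorrelation).
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]

dw_trending = durbin_watson(trending)
dw_alternating = durbin_watson(alternating)
print(dw_trending, dw_alternating)
```

Values near 2 are consistent with uncorrelated errors; values well below 2 suggest positive autocorrelation and values well above 2 suggest negative autocorrelation.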
12. Explain the remedial measures against the violation of assumptions of MLR.
13. i) What are the diagnostic issues involved in MLR?
ii) Define outliers and leverage points.
iii) Explain influential observations.
iv) What are good & bad leverage points?
14. What are the measures of influence of observations in MLR?
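Two widely used measures are the leverage $h_{ii}$ (diagonal of the hat matrix) and Cook's distance. The sketch below is one possible illustration, assuming NumPy; the function name `influence_measures` and the data are hypothetical. Cook's distance is computed as $D_i = \dfrac{e_i^2}{(p+1)\, S_e^2} \cdot \dfrac{h_{ii}}{(1 - h_{ii})^2}$.

```python
import numpy as np

def influence_measures(X, y):
    """Return leverages h_ii and Cook's distances for an OLS fit of y on X."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, k = X.shape                           # k = p + 1, intercept included
    H = X @ np.linalg.solve(X.T @ X, X.T)    # hat matrix
    h = np.diag(H)                           # leverages
    resid = y - H @ y                        # residuals (I - H) y
    mse = float(resid @ resid) / (n - k)     # S_e^2
    cooks_d = (resid ** 2 / (k * mse)) * h / (1.0 - h) ** 2
    return h, cooks_d

# Hypothetical simple regression with one extreme x value (last observation).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 20.0])
y = np.array([1.1, 1.9, 3.2, 3.9, 5.1, 12.0])
X = np.column_stack([np.ones_like(x), x])

leverages, cooks = influence_measures(X, y)
print(leverages)
print(cooks)
```

The leverages always sum to the number of columns of $X$, so an observation whose $h_{ii}$ is far above the average $(p+1)/n$ stands out as a leverage point.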
15. i) Define multicollinearity.
ii) Explain the following w.r.t. multicollinearity.
a) Variance inflation factor (VIF)
b) Tolerance statistics
c) Eigenvalue structure
d) Multicollinearity condition number
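Items a) and b) can be illustrated with a short computation. The sketch below assumes NumPy and uses hypothetical predictors in which the third column is nearly the sum of the first two; the function name `variance_inflation_factors` is also hypothetical. Each VIF is obtained as $1 / (1 - R_j^2)$, where $R_j^2$ comes from regressing predictor $j$ on the remaining predictors, and the tolerance is its reciprocal.

```python
import numpy as np

def variance_inflation_factors(Z):
    """VIF for each column of Z (n x p predictor matrix without intercept)."""
    Z = np.asarray(Z, dtype=float)
    n, p = Z.shape
    vifs = []
    for j in range(p):
        target = Z[:, j]
        # Regress predictor j on an intercept plus all other predictors.
        others = np.column_stack([np.ones(n), np.delete(Z, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ coef
        sse = float(resid @ resid)
        sst = float(((target - target.mean()) ** 2).sum())
        r2_j = 1.0 - sse / sst
        vifs.append(1.0 / (1.0 - r2_j))
    return np.array(vifs)

# Hypothetical predictors: x3 is almost x1 + x2, so collinearity is severe.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0, 8.0, 7.0])
noise = np.array([0.1, -0.2, 0.05, 0.15, -0.1, 0.2, -0.15, -0.05])
Z = np.column_stack([x1, x2, x1 + x2 + noise])

vif = variance_inflation_factors(Z)
tolerance = 1.0 / vif

# Equivalent route: VIFs are the diagonal of the inverse correlation matrix.
vif_from_corr = np.diag(np.linalg.inv(np.corrcoef(Z, rowvar=False)))
print(vif, tolerance)
```

The agreement between the two routes reflects the identity that the $j$-th diagonal element of the inverse correlation matrix equals $1/(1 - R_j^2)$.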
16. For a dependence relationship using a regression model, the following statistics are
obtained from a secondary data source that used 90 direct observations:

$X^T X = \begin{bmatrix} 90 & 180 & 780 \\ 180 & 420 & 1450 \\ 780 & 1450 & 9600 \end{bmatrix}$; $X^T y = \begin{bmatrix} 4800 \\ 9600 \\ 3970 \end{bmatrix}$; $SST = 7928$; $SSE = 6405$.

(a) Compute the regression coefficients ($\hat{\beta}$).
(b) Obtain the sampling distribution of $\hat{\beta}$.
(c) Compute the ANOVA table and test model adequacy.
(d) Test the regression coefficients ($\beta$).
(e) Comment on the results.

17. The amount of water needed in a day (24 hours) by a city depends on many factors.
The city is divided into 3 zones based on population density and industrialization. The
number of households and the household size also vary. Another important consideration is
the season of the year: every month is assigned a seasonality index. The city
commissioner is interested in measuring the relationship between the water consumed in a
particular zone of the city and the number of households, household size, and seasonality
index. As the city is expanding in both size and density, a model is required to predict
future demand as well as its distribution over zones and seasons.
i) Develop a multiple linear regression (MLR) model for the water distribution system.
ii) Suppose n = 50 observations were collected. The MINITAB MLR output is partially
shown in Tables 1 and 2 below. Fill in the blank spaces in Tables 1 and 2. Also
comment on the overall fit of the MLR model.
Table-1: ANOVA table for the MLR model

Source        SS      DoF   MS    F    Remarks
Regression    ?       ?     ?     ?
Error         1501    ?     ?     ?
Total         7500    ?     ?

Table-2: Parameter estimates

Variables   Coefficients   Standard Error (SE)   t    Significance level
X1          1.30           0.75                  ?    ?
X2          2.65           0.43                  ?    ?
X3          0.81           0.83                  ?    ?
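The t column of Table 2 follows from dividing each coefficient by its standard error. The sketch below shows that arithmetic under two stated assumptions: the model has p = 3 predictors (because Table 2 lists three variables), so the error degrees of freedom are n - p - 1 = 46, and the two-sided 5% critical value is taken as roughly 2.01 from a t-table. It does not compute exact significance levels.

```python
# Coefficients and standard errors as listed in Table 2.
estimates = {
    "X1": (1.30, 0.75),
    "X2": (2.65, 0.43),
    "X3": (0.81, 0.83),
}

n, p = 50, 3                 # p = 3 is an assumption based on Table 2
dof_error = n - p - 1        # degrees of freedom for the t-test

# Approximate two-sided 5% critical value for 46 dof (assumed, from a t-table).
T_CRIT = 2.01

t_ratios = {name: coef / se for name, (coef, se) in estimates.items()}
significant = {name: abs(t) > T_CRIT for name, t in t_ratios.items()}

for name in estimates:
    print(f"{name}: t = {t_ratios[name]:.3f}, significant at 5%: {significant[name]}")
```

Under these assumptions only the coefficient whose |t| exceeds the critical value would be judged significant at the 5% level; exact significance levels require the t-distribution with 46 dof.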

18. A company manufactures worm gears of two different sizes as per AGMA guidelines.
One of the stages of the manufacturing process is centrifugal casting, where molten
metal from the furnace (input) is poured into the centrifugal casting machine and the worm
wheel is the output. One of the important quality variables of the worm wheel is
hardness, which depends on the input molten metal characteristics, namely %Cu,
%Sn, %Ni and %P, and on the casting process variables, namely preheat temperature (°C),
rpm, and casting time (min) including cooling.
i. Develop a multiple linear regression (MLR) model for worm wheel manufacturing.
ii. Suppose n = 100 observations were collected. The MINITAB MLR output is
partially shown in Tables 1 and 2 below. Fill in the blank spaces in Tables 1 and 2.
Also comment on the overall fit of the MLR model.
Table-1: ANOVA table for the MLR model

Source        SS       DoF   MS    Remarks
Regression    ?        7     ?
Error         1251     ?     ?
Total         35951    ?     ?

Table-2: Parameter estimates

Variables   Coefficients   Standard Error (SE)   t    Significance level
X1          0.06           1.75                  ?    ?
X2          -0.65          2.43                  ?    ?
X3          -5.81          3.83                  ?    ?
X4          1280           43.74                 ?    ?
X5          0.27           0.10                  ?    ?
X6          -0.026         0.24                  ?    ?
X7          -0.021         0.02                  ?    ?

**END**
