
STAT 3008: Applied Linear Regression

2014-15 Term 2
Assignment #3
Due: April 8th, 2015 (Wednesday) at 5:00pm
This assignment covers material from Chapters 4 to 6 of the lecture notes.
** Please submit a hard copy of the R code and output for Problem 3 and Problem 4.
You need to show your calculations in detail in order to obtain full marks.
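Problem 1 (Section 4.2 Misspecification of Mean Function) [20 points]: The kinetic energy of an object (y) is related to its velocity (x) through

\[ y = \beta_0 + \beta_1 x^2 + e, \qquad e \sim N(0, \sigma_0^2). \]

Suppose we fit the data {(x_i, y_i), i = 1, ..., n} based on

\[ y = \gamma_0 + \gamma_1 x + \varepsilon, \qquad \varepsilon \sim N(0, \sigma^2). \]

(a) [4 points] Show that $E(\hat{\gamma}) = (X'X)^{-1} X' X_2 \beta$, with $\gamma = (\gamma_0, \gamma_1)'$, $\beta = (\beta_0, \beta_1)'$,

\[
X = \begin{pmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{pmatrix}
\quad \text{and} \quad
X_2 = \begin{pmatrix} 1 & x_1^2 \\ 1 & x_2^2 \\ \vdots & \vdots \\ 1 & x_n^2 \end{pmatrix}.
\]

(b) [11 points] Express $E(\hat{\gamma}_0)$ and $E(\hat{\gamma}_1)$ in terms of $\sum_{i=1}^{n} x_i$, $\sum_{i=1}^{n} x_i^2$, $\sum_{i=1}^{n} x_i^3$, $n$, $\beta_0$ and $\beta_1$.

(c) [5 points] Given that

\[
\lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} x_i = 0, \qquad
\lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} x_i^2 = \sigma_x^2 > 0 \qquad \text{and} \qquad
\lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} x_i^3 = \kappa_x \sigma_x^3
\]

with $\kappa_x \neq 0$, express $E(\hat{\gamma}_0)$ and $E(\hat{\gamma}_1)$ in terms of $\beta_0$, $\beta_1$, $\sigma_x^2$ and $\kappa_x$ as $n \to \infty$. Show that $\hat{\gamma}_0$ is NOT a consistent estimator for $\beta_0$, and $\hat{\gamma}_1$ is NOT a consistent estimator for $\beta_1$.
(That is, $\lim_{n \to \infty} \hat{\gamma}_0 \neq \beta_0$ and $\lim_{n \to \infty} \hat{\gamma}_1 \neq \beta_1$.)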
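The bias in Problem 1 can be seen numerically before it is derived. The following is a minimal Python sketch, not part of the assignment: it writes the true mean as beta0 + beta1 * x^2 and the fitted line as gamma0 + gamma1 * x (symbol names assumed here), and every numeric value in it is hypothetical. Because OLS is linear in y, the expected value of the fitted coefficients equals the OLS fit applied to the noiseless means E(y_i).

```python
# Hypothetical values, chosen only for illustration.
beta0, beta1 = 2.0, 3.0                  # true model: y = beta0 + beta1 * x**2 + e
x = [1.0, 2.0, 3.0, 4.0, 5.0]            # hypothetical velocities
n = len(x)

# Noiseless means E(y_i) under the true (quadratic) model.
mean_y = [beta0 + beta1 * xi ** 2 for xi in x]

# Simple linear regression of E(y) on x gives E(gamma_hat).
x_bar = sum(x) / n
m_bar = sum(mean_y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxm = sum((xi - x_bar) * (mi - m_bar) for xi, mi in zip(x, mean_y))

expected_slope = sxm / sxx                          # E(gamma1_hat)
expected_intercept = m_bar - expected_slope * x_bar  # E(gamma0_hat)

print("E(gamma0_hat) =", expected_intercept, "vs beta0 =", beta0)
print("E(gamma1_hat) =", expected_slope, "vs beta1 =", beta1)
```

With these made-up inputs, neither expected coefficient matches the corresponding true coefficient, which is the pattern parts (b) and (c) ask you to establish algebraically.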

Problem 2 (Section 4.1) [14 points]: Consider the data {(u_i, v_i, y_i), i = 1, 2, ..., n} with $\bar{u} = \bar{v} = 0$ and $SUV = \sum_{i=1}^{n} u_i v_i = 0$. The data are fitted by a multiple linear regression

\[ E(Y \mid U = u, V = v) = \beta_0 + \beta_1 u + \beta_2 v. \]

(a) [8 points] Show that the OLS estimates are $\hat{\beta}_1 = SUY / SUU$, $\hat{\beta}_2 = SVY / SVV$ and $\hat{\beta}_0 = \bar{y}$.
(b) [6 points] Suppose that a simple linear regression $E(Y \mid U = u) = \beta_0 + \beta_1 u$ is fitted to the data. Are the OLS estimates $\hat{\beta}_0$ and $\hat{\beta}_1$ the same as the corresponding estimates in part (a)?
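A quick numerical check of the closed forms in Problem 2 may help before writing the proof. This is a minimal Python sketch on hypothetical data that satisfy the stated conditions (both predictor means zero and sum of u_i * v_i equal to zero). It evaluates the three formulas and then confirms that the resulting residuals satisfy the normal equations of the two-predictor model.

```python
# Hypothetical data with u_bar = v_bar = 0 and sum(u_i * v_i) = 0.
u = [-1.0, 1.0, -1.0, 1.0]
v = [-1.0, -1.0, 1.0, 1.0]
y = [1.0, 3.0, 2.0, 6.0]
n = len(y)

y_bar = sum(y) / n
suu = sum(ui * ui for ui in u)                       # SUU (u_bar = 0)
svv = sum(vi * vi for vi in v)                       # SVV (v_bar = 0)
suy = sum(ui * (yi - y_bar) for ui, yi in zip(u, y))  # SUY
svy = sum(vi * (yi - y_bar) for vi, yi in zip(v, y))  # SVY

# Closed-form estimates claimed in part (a).
b0, b1, b2 = y_bar, suy / suu, svy / svv

# OLS estimates are characterised by residuals orthogonal to 1, u and v.
residuals = [yi - (b0 + b1 * ui + b2 * vi) for ui, vi, yi in zip(u, v, y)]
normal_eqs = [
    sum(residuals),
    sum(r * ui for r, ui in zip(residuals, u)),
    sum(r * vi for r, vi in zip(residuals, v)),
]

print("estimates:", b0, b1, b2)
print("normal equations (all should be 0):", normal_eqs)
```

The intercept and the slope for u use only y_bar, SUY and SUU, which are the same quantities a simple regression of y on u would use.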


Problem 3 (Weighted Least Squares and Lack-of-Fit Test) [26 points]: To study the
inheritance of sweet peas from one generation to the next, the diameters of peas
from parent plants and their offspring plants are recorded for comparison. First, parent
plants are classified according to the diameters of their peas, from 0.15 inches to 0.21
inches (7 size classes).
The experimenter then distributed the 7 size classes of seeds to his friends. Some time later, the
diameters of peas from the offspring plants were recorded by his friends, with the sample
average and sample standard deviation of the pea diameters presented in the table below:

(a) [3 points] Draw a scatterplot of average offspring diameter ($\bar{y}_j$) vs parent diameter ($x_j$)
using the plot function.
(b) [5 points] Using the lm function, compute the weighted least squares estimates by
regressing the average offspring diameter on the parent diameter, with weights $w_j = 1/SD_j$.
(c) [6 points] Suppose we are interested in whether there is perfect inheritance in the
diameter. That is, the size of the offspring pea is the same as the size of the parent pea.
Test the hypothesis of perfect inheritance, i.e., $H_0: \beta_1 = 1$ vs $H_1: \beta_1 \neq 1$.
(d) [12 points] The table above was summarized from the original data with 198 offspring
plants. Suppose we consider the same simple linear regression, but on every single plant:

\[ y_i = \beta_0 + \beta_1 x_i + e_i \qquad \text{for } i = 1, 2, \ldots, 198. \]

The residual sum of squares is RSS = 4.2617. Based on the information above, construct
the ANOVA table for the lack-of-fit test as in Chapter 5 page 28, and perform the
corresponding lack-of-fit test at $\alpha = 0.05$.
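The arithmetic behind parts (b) to (d) can be sketched in a few lines. The assignment asks for R output, so the Python below is only an illustration of the formulas, under loud assumptions: the offspring means, standard deviations and group sizes are hypothetical placeholders (the table is not reproduced here), and the lack-of-fit decomposition is the standard one (pure-error sum of squares from the within-class standard deviations), which may be laid out differently in the lecture notes.

```python
from math import sqrt


def wls_line(x, y, w):
    """Weighted least squares fit of y = b0 + b1 * x with weights w."""
    sw = sum(w)
    xw = sum(wi * xi for wi, xi in zip(w, x)) / sw   # weighted mean of x
    yw = sum(wi * yi for wi, yi in zip(w, y)) / sw   # weighted mean of y
    sxx = sum(wi * (xi - xw) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xw) * (yi - yw) for wi, xi, yi in zip(w, x, y))
    b1 = sxy / sxx
    b0 = yw - b1 * xw
    rss = sum(wi * (yi - b0 - b1 * xi) ** 2 for wi, xi, yi in zip(w, x, y))
    se_b1 = sqrt(rss / (len(x) - 2) / sxx)           # standard error of b1
    return b0, b1, se_b1


def lack_of_fit_f(rss, group_sizes, group_sds, n_params=2):
    """F statistic: lack-of-fit mean square over pure-error mean square."""
    ss_pe = sum((nj - 1) * sj ** 2 for nj, sj in zip(group_sizes, group_sds))
    df_pe = sum(group_sizes) - len(group_sizes)
    ss_lof = rss - ss_pe
    df_lof = len(group_sizes) - n_params
    return (ss_lof / df_lof) / (ss_pe / df_pe)


# Parent size classes from the problem; the other two lists are HYPOTHETICAL.
parent = [0.15, 0.16, 0.17, 0.18, 0.19, 0.20, 0.21]
offspring_mean = [0.155, 0.158, 0.161, 0.162, 0.166, 0.167, 0.172]
offspring_sd = [0.020, 0.019, 0.020, 0.018, 0.017, 0.020, 0.020]

weights = [1.0 / s for s in offspring_sd]            # w_j = 1 / SD_j, as stated
b0, b1, se_b1 = wls_line(parent, offspring_mean, weights)
t_stat = (b1 - 1.0) / se_b1   # for H0: beta1 = 1, compare with t on 7 - 2 = 5 df
print(b0, b1, se_b1, t_stat)
```

For part (d), `lack_of_fit_f` would be called with RSS = 4.2617 together with the class sizes and standard deviations from the table; the resulting statistic is compared with an F distribution on (7 - 2, 198 - 7) degrees of freedom.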


Problem 4 (Dummy Variable) [20 points]: The data file salary.txt (in the alr3 package)
contains salary and other characteristics of all faculty in a small Midwestern college in the early
1980s. Below is the description of selected variables in the data file:
Predictor   Variable   Description
Sex         S          1 for female and 0 for male
Rank        R          1 = Assistant Professor, 2 = Associate Professor, 3 = Full Professor
Year        X          Number of years in current rank
Salary      Y          Annual salary (in US dollars)

Define $U_{2i}$ and $U_{3i}$ to be the dummy variables for Associate Professor and Full Professor
respectively.
(a) [10 points] Assuming that the impact of the number of years in current rank (Year, X) is the
same for both sexes and all ranks, we construct a linear model to explain the salary by the
3 other variables:

\[ E(Y \mid S = s, R = j, X = x) = \eta x + \beta_0 + \beta_1 s + \sum_{j=2}^{3} \left( \beta_{0j} U_j + \beta_{1j} U_j s \right). \]

Compute the OLS estimates for the parameters $(\eta, \beta_0, \beta_1, \beta_{02}, \beta_{03}, \beta_{12}, \beta_{13})$ and $\sigma^2$.
(b) [7 points] Test the hypothesis that the salaries of males and females are the same
for all 3 ranks. (Consider the model in part (a) as the alternative hypothesis, and the
following model as the null hypothesis:

\[ E(Y \mid R = j, X = x) = \eta x + \beta_0 + \sum_{j=2}^{3} \beta_{0j} U_j. \])

(c) [3 points] Based on the results in part (b), do you think there is sex discrimination in
salary?
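To make the dummy coding in Problem 4 concrete, here is a minimal Python sketch of how one row of regressors could be built and how a nested-model F statistic is formed from two residual sums of squares. It does not read salary.txt; the column order, the function names, and all numeric inputs to the F statistic are hypothetical and only illustrate the standard formula F = [(RSS_null - RSS_full) / (p_full - p_null)] / [RSS_full / (n - p_full)].

```python
def design_row(s, rank, x):
    """Regressors for one faculty member under the part (a) model.

    Column order (an assumption of this sketch):
    [x, 1, s, U2, U3, U2*s, U3*s]
    """
    u2 = 1 if rank == 2 else 0   # Associate Professor dummy
    u3 = 1 if rank == 3 else 0   # Full Professor dummy
    return [x, 1, s, u2, u3, u2 * s, u3 * s]


def nested_f(rss_null, rss_full, p_null, p_full, n):
    """F statistic for a null model nested inside a full model."""
    df_num = p_full - p_null     # number of coefficients set to zero under H0
    df_den = n - p_full          # residual degrees of freedom of the full model
    return ((rss_null - rss_full) / df_num) / (rss_full / df_den)


# A female (s = 1) Full Professor (rank 3) with 5 years in current rank.
print(design_row(1, 3, 5))

# HYPOTHETICAL residual sums of squares and sample size, for illustration only.
# The null model in part (b) has 4 mean parameters, the full model has 7.
print(nested_f(rss_null=110.0, rss_full=80.0, p_null=4, p_full=7, n=47))
```

Assistant Professors are the baseline rank here: both dummies are zero for them, so the sex effect for that rank is carried by the coefficient of s alone.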


Problem 5 [20 points] (Polynomial Regression): Consider the scatterplot below with data
{(x_i, y_i), i = 1, 2, ..., 24}:

Suppose we want to fit the data based on a quadratic regression, $E(Y \mid X = x) = \beta_0 + \beta_1 x + \beta_2 x^2$.
In matrix form,

\[
Y = X\beta + e
\quad \Rightarrow \quad
\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_{24} \end{pmatrix}
=
\begin{pmatrix} 1 & x_1 & x_1^2 \\ 1 & x_2 & x_2^2 \\ \vdots & \vdots & \vdots \\ 1 & x_{24} & x_{24}^2 \end{pmatrix}
\begin{pmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \end{pmatrix}
+
\begin{pmatrix} e_1 \\ e_2 \\ \vdots \\ e_{24} \end{pmatrix}.
\]

Given that $\bar{x} = 5.833333$, $\bar{y} = 56.6275$, $\sum_{i=1}^{24} y_i^2 = Y'Y = 89882.2642$,

\[
X'X = \begin{pmatrix} 24 & 140 & 1164 \\ 140 & 1164 & 10568 \\ 1164 & 10568 & 98268 \end{pmatrix},
\]

\[
(X'X)^{-1} = \begin{pmatrix} 0.4500197 & -0.2426193 & 0.0207614 \\ -0.2426193 & 0.1671811 & -0.0151052 \\ 0.0207614 & -0.0151052 & 0.0013887 \end{pmatrix},
\qquad
X'Y = \begin{pmatrix} 1359.06 \\ 6322.83 \\ 47452.65 \end{pmatrix}.
\]

(a) [6 points] What are the OLS estimates for $\beta_0$, $\beta_1$ and $\beta_2$?
(b) [6 points] Compute the RSS and show that $\hat{\sigma} \approx 13.98$.
(c) [5 points] Suppose we are interested in maximizing the response y. Based on the results
above, estimate the optimal value of x such that the response is maximized, and
construct a 95% C.I. for that optimal value of x.
(d) [3 points] Suppose x is an experimental predictor. Before performing the experiment to
obtain the response y, it is known that the optimal value of x is somewhere in the
middle of the interval [1.0, 10.0]. Briefly comment on whether the current choice of
predictor values {x_i, i = 1, 2, ..., 24} is reasonable.
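The quantities in Problem 5 follow from the summary matrices alone, which the sketch below illustrates in plain Python. It starts from X'X, X'Y and Y'Y as quoted in the problem and inverts X'X itself, so it does not rely on the printed inverse. The delta-method standard error at the end is one possible route to the interval in part (c) and is an assumption of this sketch; the lecture notes may use a different construction.

```python
from math import sqrt

# Summary statistics quoted in Problem 5 (n = 24 observations).
n = 24
XtX = [[24.0, 140.0, 1164.0],
       [140.0, 1164.0, 10568.0],
       [1164.0, 10568.0, 98268.0]]
XtY = [1359.06, 6322.83, 47452.65]
YtY = 89882.2642


def inverse_3x3(m):
    """Invert a 3x3 matrix through its cofactors (adequate at this size)."""
    (a, b, c), (d, e, f), (g, h, i) = m
    cof = [[e * i - f * h, -(d * i - f * g), d * h - e * g],
           [-(b * i - c * h), a * i - c * g, -(a * h - b * g)],
           [b * f - c * e, -(a * f - c * d), a * e - b * d]]
    det = a * cof[0][0] + b * cof[0][1] + c * cof[0][2]
    # The inverse is the transposed cofactor matrix divided by the determinant.
    return [[cof[col][row] / det for col in range(3)] for row in range(3)]


XtX_inv = inverse_3x3(XtX)

# (a) beta_hat = (X'X)^{-1} X'Y
beta = [sum(XtX_inv[r][k] * XtY[k] for k in range(3)) for r in range(3)]

# (b) RSS = Y'Y - beta_hat' X'Y, with n - 3 residual degrees of freedom
rss = YtY - sum(bk * tk for bk, tk in zip(beta, XtY))
sigma_hat = sqrt(rss / (n - 3))          # the problem states this is about 13.98

# (c) stationary point of the fitted quadratic (a maximum when beta[2] < 0)
x_opt = -beta[1] / (2.0 * beta[2])

# Delta-method standard error of x_opt (an assumption of this sketch).
grad = [0.0, -1.0 / (2.0 * beta[2]), beta[1] / (2.0 * beta[2] ** 2)]
var_x_opt = (rss / (n - 3)) * sum(
    grad[r] * XtX_inv[r][k] * grad[k] for r in range(3) for k in range(3)
)
se_x_opt = sqrt(var_x_opt)

print("beta_hat:", beta)
print("RSS:", rss, "sigma_hat:", sigma_hat)
print("x_opt:", x_opt, "delta-method s.e.:", se_x_opt)
```

An approximate 95% interval would then be x_opt plus or minus the 0.975 quantile of the t distribution on 21 degrees of freedom times the standard error.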
- End of the Assignment -

