
Assignment 1, due Sept. 19 2014
(4)

1. Find the equation of the line which passes through the points (1,1) and
(3,5).
slope = rise/run, so slope = (5-1)/(3-1) = 2
The y-intercept is -1, because a decrease of 1 in the horizontal direction
from the point (1,1) leads to a change of -2 in the vertical direction.


Equivalently, the equation of the line is $y = \beta_0 + 2x$, so substituting the first
point gives $1 = \beta_0 + 2(1)$, and solving gives $\beta_0 = -1$.
The equation of the line is $y = -1 + 2x$.
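
The slope and intercept above can be checked numerically. The following is a minimal R sketch; the variable names are illustrative and are not part of the original solution.

```r
# Two points from question 1 (variable names are illustrative).
x <- c(1, 3)
y <- c(1, 5)

slope <- diff(y) / diff(x)          # rise over run
intercept <- y[1] - slope * x[1]    # solve y = b0 + slope * x for b0

cat("slope =", slope, " intercept =", intercept, "\n")
```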

(4)

2. Suppose you are given three data points (1,4), (2,6) and (3,7) and the
line y = 2 + 3x. Give the three residuals and their sum of squares.

The points on the line corresponding to the given x values are
shown below, as are the residuals obtained by subtraction, and their
squares.

x    2 + 3x    e = y - (2 + 3x)    e^2
1    5         -1                  1
2    8         -2                  4
3    11        -4                  16

The sum of squares of the residuals is 1 + 4 + 16 = 21.
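
The same table can be reproduced in R. This is a small sketch under the assumption that the data are stored in vectors named `x` and `y`; those names are not from the original.

```r
# Data points and the candidate line y = 2 + 3x from question 2.
x <- c(1, 2, 3)
y <- c(4, 6, 7)

fitted <- 2 + 3 * x    # points on the line at the given x values
e <- y - fitted        # residuals
sse <- sum(e^2)        # sum of squared residuals

print(data.frame(x, fitted, e, e2 = e^2))
cat("sum of squared residuals =", sse, "\n")
```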


3. Some data gives the summaries: $n = 20$, $\sum x_i y_i = 100$, $\sum x_i = 20$ and
$\sum y_i = 10$. Suppose that the response $y$ is temperature in degrees
Celsius.

(2)

(2)

(a) What is $S_{xy}$, the sum of corrected cross products?

$S_{xy} = \sum x_i y_i - \sum x_i \sum y_i / n = 100 - 20(10)/20 = 90$
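
The shortcut formula is easy to evaluate directly from the summaries. A minimal R sketch follows; the variable names `sum_xy`, `sum_x` and `sum_y` are chosen here for illustration.

```r
# Summaries given in question 3 (names are illustrative).
n <- 20
sum_xy <- 100
sum_x <- 20
sum_y <- 10

# Corrected sum of cross products from the shortcut formula.
Sxy <- sum_xy - sum_x * sum_y / n
cat("Sxy =", Sxy, "\n")
```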

(b) If the response was converted to temperature in degrees Fahrenheit,
so that $y' = 32 + 1.8y$, what is $\sum y'_i$?

(2)

$\sum y'_i = \sum (32 + 1.8 y_i) = 20(32) + 1.8 \sum y_i = 640 + 1.8(10) = 658$

(c) If the response was converted to temperature in degrees Fahrenheit,
so that $y' = 32 + 1.8y$, what is the sum of corrected cross products
$S'_{xy}$?

https://www.coursehero.com/file/10709401/sol114/

From above note that $\bar{y}' = 32 + 1.8\bar{y}$.

Therefore $y'_i - \bar{y}' = 1.8(y_i - \bar{y})$.

So $S'_{xy} = \sum (x_i - \bar{x})(y'_i - \bar{y}') = \sum (x_i - \bar{x}) 1.8 (y_i - \bar{y}) = 1.8 S_{xy} = 1.8(90) = 162$.
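
The effect of the unit change can be illustrated numerically. The sketch below first evaluates the sum in (b) from the summaries, then checks the scaling property in (c) on simulated data. The simulated `x` and `y` are hypothetical (the raw data for this question are not given), and the helper `Sxy()` is defined here only for the illustration.

```r
# Part (b): sum of the Fahrenheit responses from the given summaries.
n <- 20
sum_y <- 10
sum_y_f <- n * 32 + 1.8 * sum_y

# Part (c): hypothetical data, used only to illustrate the scaling property.
set.seed(1)
x <- rnorm(n)
y <- rnorm(n)
y_f <- 32 + 1.8 * y    # Celsius to Fahrenheit

# Corrected sum of cross products (helper defined for this sketch).
Sxy <- function(x, y) sum((x - mean(x)) * (y - mean(y)))

# The shift by 32 cancels and the factor 1.8 carries through.
scaled_ok <- isTRUE(all.equal(Sxy(x, y_f), 1.8 * Sxy(x, y)))
cat("sum of y' =", sum_y_f, " scaling holds:", scaled_ok, "\n")
```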
(6)

4. In a simple linear regression, the sum of squares function is

$S(\beta_0, \beta_1) = 3500 - 700\beta_0 - 740\beta_1 + 100\beta_0\beta_1 + 50\beta_0^2 + 54\beta_1^2$.


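Find the least squares values for $\beta_0$ and $\beta_1$.

Using calculus,

$\partial S / \partial \beta_0 = -700 + 100\beta_1 + 100\beta_0$

and

$\partial S / \partial \beta_1 = -740 + 100\beta_0 + 108\beta_1$.

Setting both derivatives to zero and rearranging after division by 100 and 4
respectively gives the two equations

$\beta_0 + \beta_1 = 7$

and

$25\beta_0 + 27\beta_1 = 185$.

Substituting $\beta_0 = 7 - \beta_1$ from the first equation into the second gives

$25(7 - \beta_1) + 27\beta_1 = 185$

or

$2\beta_1 = 185 - 175$

or

$\beta_1 = 10/2 = 5$.

Substituting this back in the first equation gives

$\beta_0 = 7 - 5 = 2$.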
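
The two equations obtained by setting the derivatives to zero form a linear system, so the solution can be checked with `solve()`. This is a sketch only; the names `A`, `b`, `beta` and the function `S` are introduced here for illustration.

```r
# Sum of squares function from question 4.
S <- function(b0, b1) {
  3500 - 700 * b0 - 740 * b1 + 100 * b0 * b1 + 50 * b0^2 + 54 * b1^2
}

# Normal equations: 100*b0 + 100*b1 = 700 and 100*b0 + 108*b1 = 740.
A <- matrix(c(100, 100,
              100, 108), nrow = 2, byrow = TRUE)
b <- c(700, 740)

beta <- solve(A, b)               # (beta0, beta1)
S_min <- S(beta[1], beta[2])      # value of S at the solution

cat("beta0 =", beta[1], " beta1 =", beta[2], " S =", S_min, "\n")
```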

This can also be solved by completing the square (twice).



Considering $S$ as a function of $\beta_0$ allows one to write

$S = 50(\beta_0 - M)^2 + D$

where

$M = \frac{700 - 100\beta_1}{100} = 7 - \beta_1$

and

$D = -50M^2 + 3500 - 740\beta_1 + 54\beta_1^2$.


It follows that whatever the choice of $\beta_1$, the best choice of $\beta_0$ is
$\beta_0 = 7 - \beta_1$.
Substituting this in $S$ gives $S$ as a function of $\beta_1$ only, and

$S = D = -50(7 - \beta_1)^2 + 3500 - 740\beta_1 + 54\beta_1^2$.

Centering this quadratic gives

$S = 4(\beta_1 - m)^2 + d$

where $m$ can be determined by equating the linear terms in the
centered and uncentered versions:

$-4(2)m = (2)7(50) - 740$

or

$m = 5$.

This is the best choice for $\beta_1$, and substituting above gives

$\beta_0 = 7 - 5 = 2$.
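
The completed square can be checked numerically by evaluating $S$ along $\beta_0 = 7 - \beta_1$ on a grid. In the sketch below the constant 950 is obtained by expanding the quadratic here and is not stated in the original solution; the grid and variable names are also illustrative.

```r
# Sum of squares function from question 4.
S <- function(b0, b1) {
  3500 - 700 * b0 - 740 * b1 + 100 * b0 * b1 + 50 * b0^2 + 54 * b1^2
}

b1 <- seq(-10, 20, by = 0.5)        # illustrative grid of beta1 values
profile <- S(7 - b1, b1)            # S with beta0 set to its best value 7 - beta1
centered <- 4 * (b1 - 5)^2 + 950    # centered form with m = 5, d = 950 (derived here)

same <- isTRUE(all.equal(profile, centered))
b1_best <- b1[which.min(profile)]
cat("forms agree:", same, " minimising beta1 on the grid:", b1_best, "\n")
```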


5. A random sample of 13 elementary school students is selected, and each
student is measured on a creativity score (x) using a well-defined testing
instrument and on a task score (y) using a new instrument. The task
score is the mean time taken to perform several hand-eye coordination
tasks. The data are listed below.

Use R to do the following questions. Make sure your output is integrated
into your responses. (Cut and paste as necessary.)

(4)

(a) Plot Tasks versus Creativity and comment on the form and strength
of the association. Be sure to label the axes.

STUDENT   CREATIVITY(X)   TASKS(Y)
AE        28              4.5
FR        35              3.9
HT        37              3.9
IO        50              6.1
DP        69              4.3
YR        84              8.8
QD        40              2.1
SW        65              5.5
DF        29              5.7
ER        42              3.0
RR        51              7.1
TG        45              7.3
EF        31              3.3

[Figure: scatterplot of tasks (vertical axis) versus creativity (horizontal axis)]

(8)

There is a weak positive association between tasks and
creativity scores.
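
A minimal sketch of how the data might be entered and plotted with labelled axes is given below. The vector names match those used in the solution's program; the axis label text and title are choices made here.

```r
# Data from question 5, in student order.
creativity <- c(28, 35, 37, 50, 69, 84, 40, 65, 29, 42, 51, 45, 31)
tasks <- c(4.5, 3.9, 3.9, 6.1, 4.3, 8.8, 2.1, 5.5, 5.7, 3.0, 7.1, 7.3, 3.3)

# Scatterplot with explicit axis labels (label text is illustrative).
plot(creativity, tasks,
     xlab = "Creativity score (x)",
     ylab = "Task score (y)",
     main = "Tasks versus Creativity")
```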
(b) Calculate the summaries $S_{xx}$, $S_{xy}$, $S_{yy}$ and $\bar{X}$ and $\bar{Y}$.

A program was written for the entire question, as shown below.
The solutions are inserted in each part.
> creat.ass

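function(creativity, tasks){
  # (a) scatterplot on the current device; axis labels default to the argument names
  plot(creativity, tasks)

  # (b) corrected sums of squares and cross products, and the sample means
  # (crossprod() returns a 1 x 1 matrix, hence the [,1] labels in the output)
  ssxx <- crossprod(creativity - mean(creativity))
  ssxy <- crossprod(creativity - mean(creativity), tasks - mean(tasks))
  ssyy <- crossprod(tasks - mean(tasks))
  xbar <- mean(creativity)
  ybar <- mean(tasks)

  # (c) correlation coefficient
  r <- ssxy / sqrt(ssxx * ssyy)

  # (d) least squares slope and intercept
  b1hat <- ssxy / ssxx
  b0hat <- ybar - b1hat * xbar

  # (e) scatterplot with the fitted line, written to the file "tcplot"
  postscript("tcplot", horizontal = FALSE)
  plot(creativity, tasks)
  abline(b0hat, b1hat)
  dev.off()

  # (f) residuals, their sum, and their correlation with creativity
  yhat <- b0hat + b1hat * creativity
  e <- tasks - yhat
  esum <- sum(e)
  rex <- cor(e, creativity)

  # (g) residual plot with a reference line at zero, written to "res.creat"
  postscript("res.creat", horizontal = FALSE)
  plot(creativity, e)
  abline(h = 0)
  dev.off()

  # (h) residual, total and regression sums of squares
  ssres <- ssyy - ssxy^2 / ssxx
  tss <- ssyy
  ssreg <- ssxy^2 / ssxx

  # (i) coefficient of determination
  R2 <- ssreg / tss

  return(list(ssxx = ssxx, ssxy = ssxy, ssyy = ssyy, ybar = ybar, xbar = xbar, r = r,
              b0hat = b0hat, b1hat = b1hat, e = e, esum = esum, rex = rex,
              ssres = ssres, tss = tss, ssreg = ssreg, R2 = R2))
}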
$ssxx
         [,1]
[1,] 3463.077

$ssxy
         [,1]
[1,] 220.0923

$ssyy
         [,1]
[1,] 44.53077

$ybar
[1] 5.038462

$xbar
[1] 46.61538

(c) Use these data summaries to calculate the correlation coefficient.
Does the value agree with your visual assessment in (a)?


(2)

The correlation of 0.56 confirms the weak positive association in
the plot.

$r
          [,1]
[1,] 0.5604588
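
As a cross-check, the correlation computed from the summaries can be compared with R's built-in `cor()`. This is a sketch; the names `Sxx`, `Sxy`, `Syy`, `r_summaries` and `r_builtin` are introduced here.

```r
# Data from question 5.
creativity <- c(28, 35, 37, 50, 69, 84, 40, 65, 29, 42, 51, 45, 31)
tasks <- c(4.5, 3.9, 3.9, 6.1, 4.3, 8.8, 2.1, 5.5, 5.7, 3.0, 7.1, 7.3, 3.3)

# Corrected sums of squares and cross products.
Sxx <- sum((creativity - mean(creativity))^2)
Syy <- sum((tasks - mean(tasks))^2)
Sxy <- sum((creativity - mean(creativity)) * (tasks - mean(tasks)))

r_summaries <- Sxy / sqrt(Sxx * Syy)    # from the data summaries
r_builtin <- cor(creativity, tasks)     # built-in Pearson correlation

cat("r from summaries:", r_summaries, " cor():", r_builtin, "\n")
```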

(2)

(d) Use these summaries to calculate the least squares values for the
intercept and slope.
$b0hat
         [,1]
[1,] 2.075869

$b1hat
           [,1]
[1,] 0.06355398
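
The hand-computed estimates can be compared with a fit from `lm()`. The following sketch is a cross-check only and is not the method used in the solution; the object name `fit` is chosen here.

```r
# Data from question 5.
creativity <- c(28, 35, 37, 50, 69, 84, 40, 65, 29, 42, 51, 45, 31)
tasks <- c(4.5, 3.9, 3.9, 6.1, 4.3, 8.8, 2.1, 5.5, 5.7, 3.0, 7.1, 7.3, 3.3)

# Simple linear regression of tasks on creativity.
fit <- lm(tasks ~ creativity)
b <- unname(coef(fit))    # b[1] is the intercept, b[2] is the slope

cat("intercept =", b[1], " slope =", b[2], "\n")
```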


(1)

(3)

(e) Add the least squares line to the plot in (a).

See the program above: the line is added with abline(b0hat, b1hat).

(f) Obtain the residuals, $e_i = y_i - \hat{y}_i$. Calculate their sample mean to
verify it is zero, and the correlation with X to verify it is also zero.

$e
 [1]  0.6446202 -0.4002577 -0.5273656  0.8464327 -2.1610928  1.3855975
 [7] -2.5180275 -0.7068769  1.7810662 -1.7451355  1.7828787  2.3642026
[13] -0.7460418

$esum
[1] 8.881784e-16

$rex
[1] -2.420007e-16

The sum of the residuals and their correlation with creativity are
essentially zero.
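
These two properties can also be checked from an `lm()` fit. The sketch below is a cross-check under the same data; the names `fit` and `e` are chosen here. It also shows that the residuals are not uncorrelated with the response: their correlation with tasks equals $\sqrt{1 - R^2}$, roughly 0.83 for these data.

```r
# Data from question 5.
creativity <- c(28, 35, 37, 50, 69, 84, 40, 65, 29, 42, 51, 45, 31)
tasks <- c(4.5, 3.9, 3.9, 6.1, 4.3, 8.8, 2.1, 5.5, 5.7, 3.0, 7.1, 7.3, 3.3)

fit <- lm(tasks ~ creativity)
e <- resid(fit)                  # least squares residuals

mean_e <- mean(e)                # zero up to rounding error
cor_ex <- cor(e, creativity)     # zero up to rounding error
cor_ey <- cor(e, tasks)          # not zero: equals sqrt(1 - R^2)

cat("mean(e) =", mean_e, " cor(e, x) =", cor_ex, " cor(e, y) =", cor_ey, "\n")
```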
(g) Plot the residuals versus X. Do the residuals look random?


(2)

[Figure: plot of the residuals versus creativity]

The residuals look random.

(6)

(h) Obtain the residual, regression and total sums of squares, using the
data summaries.


$ssres
         [,1]
[1,] 30.54303

$tss
         [,1]
[1,] 44.53077

$ssreg
         [,1]
[1,] 13.98774
(2)

(i) What is the value of the coefficient of determination, i.e. what
proportion of the variation in Tasks is explained by Creativity?

$R2
          [,1]
[1,] 0.3141141

About 31% of the variation in Tasks is explained by Creativity.
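
The sums of squares and the coefficient of determination can be cross-checked against `anova()` and `summary()` applied to an `lm()` fit. This is a sketch, not the solution's own method; the names `fit`, `ss`, `ssreg`, `ssres`, `tss` and `R2` are set up here to mirror the program's output.

```r
# Data from question 5.
creativity <- c(28, 35, 37, 50, 69, 84, 40, 65, 29, 42, 51, 45, 31)
tasks <- c(4.5, 3.9, 3.9, 6.1, 4.3, 8.8, 2.1, 5.5, 5.7, 3.0, 7.1, 7.3, 3.3)

fit <- lm(tasks ~ creativity)

# The ANOVA table lists the regression row first, then the residual row.
ss <- anova(fit)[["Sum Sq"]]
ssreg <- ss[1]
ssres <- ss[2]
tss <- ssreg + ssres

R2 <- summary(fit)$r.squared    # equals ssreg / tss

cat("ssreg =", ssreg, " ssres =", ssres, " tss =", tss, " R2 =", R2, "\n")
```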
