
Problem 5.1, 5.5:

The purpose of the problem is to see whether there is a correlation, first between the pairs (X, Y) and (X, Z), and then among X, Y, and Z together. The sample includes 3 variables, each with 5 observations.

Simple Statistics

Variable        Mean      Std Dev         Sum    Minimum    Maximum
Y            7.80000      4.54973    39.00000    3.00000   13.00000
Z           10.20000      4.32435    51.00000    5.00000   15.00000
X            4.60000      2.88097    23.00000    1.00000    8.00000

Pearson Correlation Coefficients, N = 5
Prob > |r| under H0: Rho=0
(p-values in parentheses)

               X
Y        0.96509 (0.0078)
Z       -0.97525 (0.0047)

There is a strong positive correlation between X and Y (r = 0.97, p = 0.0078) and a strong negative correlation between X and Z (r = -0.98, p = 0.0047).

3 Variables: X Y Z

Simple Statistics

Variable        Mean      Std Dev         Sum    Minimum    Maximum
X            4.60000      2.88097    23.00000    1.00000    8.00000
Y            7.80000      4.54973    39.00000    3.00000   13.00000
Z           10.20000      4.32435    51.00000    5.00000   15.00000

Pearson Correlation Coefficients, N = 5
Prob > |r| under H0: Rho=0
(p-values in parentheses)

          X                    Y                    Z
X     1.00000             0.96509 (0.0078)    -0.97525 (0.0047)
Y     0.96509 (0.0078)    1.00000             -0.96317 (0.0084)
Z    -0.97525 (0.0047)   -0.96317 (0.0084)     1.00000

There is a strong positive correlation between X and Y (r = 0.97, p = 0.0078), a strong negative correlation between X and Z (r = -0.98, p = 0.0047), and a strong negative correlation between Y and Z (r = -0.96, p = 0.0084).

Number of Observations Read    5
Number of Observations Used    5

Analysis of Variance

Source             DF   Sum of Squares   Mean Square   F Value   Pr > F
Model               1         77.11928      77.11928     40.73   0.0078
Error               3          5.68072       1.89357
Corrected Total     4         82.80000

Root MSE           1.37607   R-Square   0.9314
Dependent Mean     7.80000   Adj R-Sq   0.9085
Coeff Var         17.64195

Parameter Estimates

Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept    1              0.78916          1.25920      0.63     0.5753
X            1              1.52410          0.23882      6.38     0.0078

The equation of the regression line to predict Y from X is Y = 0.79 + 1.52X. This line has a coefficient of determination of 0.93, meaning that about 93% of the scatter (variation) in Y can be explained by using X to predict Y, and an s.e.e. of 1.38, which is roughly the average amount each of our Y predictions should be off.
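
As a quick check of the fitted line, a DATA step along the following lines (a minimal sketch; the data set name pred and the chosen X values are only illustrative) evaluates the equation at a few X values:

data pred;
   * evaluate the fitted line Y-hat = 0.78916 + 1.52410*X;
   do X = 1, 4, 8;
      Yhat = 0.78916 + 1.52410*X;
      output;
   end;
run;

proc print data=pred noobs;
run;

For X = 4, for example, this gives a Yhat of about 6.89, close to the observed Y = 7.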

data d;
input X Y Z;
DATALINES;
1 3 15
7 13 7
8 12 5
3 4 14
4 7 10
;
ODS GRAPHICS ON;
ODS RTF FILE='D.RTF';
PROC CORR ;
VAR X;
WITH Y Z;
RUN;
PROC CORR ;
VAR X Y Z;
RUN;
PROC REG;
MODEL Y=X;
RUN;
ODS GRAPHICS OFF;
ODS RTF CLOSE;
RUN;
Problem 5.3:

2 Variables: AGE SBP

Simple Statistics

Variable        Mean      Std Dev         Sum    Minimum    Maximum
AGE         30.00000     13.03840   180.00000   15.00000   50.00000
SBP        132.66667     14.00952   796.00000  116.00000  150.00000

Pearson Correlation Coefficients, N = 6
Prob > |r| under H0: Rho=0
(p-values in parentheses)

          AGE                 SBP
AGE       1.00000             0.95258 (0.0033)
SBP       0.95258 (0.0033)    1.00000

The purpose of the problem was to compute the correlation between age and SBP. The sample consisted of 6 people, recording each person's age and SBP. There is a strong positive correlation between a person's age and SBP (r = 0.95, p = 0.0033).
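
If one also wanted to predict SBP from AGE (not asked for here, so the step below is only a sketch that reuses the data set P from the program that follows), a regression could be added after the PROC CORR step:

proc reg data=P;
   * simple linear regression of SBP on AGE;
   model SBP = AGE;
run;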

DATA P;
INPUT AGE SBP;
DATALINES;
15 116
20 120
25 130
30 132
40 150
50 148
;
ODS GRAPHICS ON;
ODS RTF FILE='P.RTF';
PROC CORR;
VAR AGE SBP;
RUN;
ODS GRAPHICS OFF;
ODS RTF CLOSE;
RUN;

Hot dog problem:


Variable        N        Mean      Std Dev      Minimum      Maximum     Skewness
Attendance     10     7214.90      1679.23      4534.00      9821.00    0.0179365
Sales          10     5183.80      1242.42      3216.00      7001.00    0.0228907

2 Variables: Attendance Sales

Pearson Correlation Coefficients, N = 10
Prob > |r| under H0: Rho=0
(p-values in parentheses)

              Attendance           Sales
Attendance    1.00000              0.93748 (<.0001)
Sales         0.93748 (<.0001)     1.00000

Number of Observations Read    10
Number of Observations Used    10

Analysis of Variance

Source             DF   Sum of Squares   Mean Square   F Value   Pr > F
Model               1         12209634      12209634     58.04   <.0001
Error               8          1682814        210352
Corrected Total     9         13892448

Root MSE           458.64114   R-Square   0.8789
Dependent Mean    5183.80000   Adj R-Sq   0.8637
Coeff Var            8.84759

Parameter Estimates

Variable      DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept      1            179.41977        672.68015      0.27     0.7964
Attendance     1              0.69362          0.09104      7.62    <.0001

a) The correlation coefficient is r = 0.94, p < .0001, meaning that there is a strong positive correlation between attendance and hot dog sales.
b) The regression line to predict sales from attendance is Sales = 179.42 + 0.69(Attendance).
c) The s.e.e. was around 458.6.
d) Predicted hot dog sales when attendance is 7,000: 179.42 + 0.69(7000) = 5009.42, or about 5,035 using the unrounded coefficients (a worked check appears after the summary in f).
e) Yes.
f) The daily attendance and the number of hot dog sales at a local ball park were studied over a period of ten games. The study was conducted to see whether hot dog sales could be predicted from the attendance at the game. Attendance for this sample ranged from 4,534 to 9,821, with a mean attendance of 7,214.9 and s = 1,679.2. There was a strong positive correlation between hot dog sales and attendance (r = 0.94), so it was decided to conduct a pilot study that would perform a regression analysis on the data to predict hot dog sales from attendance. The model was significant, and the predictor equation for hot dog sales is:
Sales = 179.42 + 0.69(Attendance), valid for attendance between 4,534 and 9,821. The coefficient of determination (R-square) indicates that around 87.9% of the variation about the line can be explained by using daily attendance to predict hot dog sales. The s.e.e. was around 458.6, which is roughly the average amount each of our predictions should be off.
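
To reproduce the point prediction in (d), a short scoring step such as the one below (the data set name predict_sales is only illustrative) plugs an attendance of 7,000 into the fitted equation; the rounded coefficients give 179.42 + 0.69(7000) = 5009.42, while the full-precision estimates give about 5,035:

data predict_sales;
   * predicted hot dog sales at attendance 7000, using the full-precision estimates;
   Attendance = 7000;
   Sales_hat  = 179.41977 + 0.69362*Attendance;
run;

proc print data=predict_sales noobs;
run;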
data hotdogs;
input Attendance Sales ;
datalines;
8747 6845
5857 4168
8360 5348
6945 5687
8688 6007
4534 3216
7450 5018
5874 4652
9821 7001
5873 3896
;
ods graphics on;
ods rtf file = 'hotdogs.rtf';
proc means n mean stddev min max skew ;
var Attendance Sales ;
run;
proc corr nosimple ;
var Attendance Sales ;
run;
proc reg ;
model Sales = Attendance ;
run;
ods graphics off;
ods rtf close ;
run;

Taxes problem:
Variable      N          Mean       Std Dev       Minimum       Maximum     Skewness
income       11    75.2363636    48.1865598     9.1000000   150.7000000    0.2919147
taxes        11    21.5181818     7.2821450    10.2000000    30.4000000    0.3645622

2 Variables: income taxes

Pearson Correlation Coefficients, N = 11
Prob > |r| under H0: Rho=0
(p-values in parentheses)

          income              taxes
income    1.00000             0.83416 (0.0014)
taxes     0.83416 (0.0014)    1.00000
Number of Observations Read    11
Number of Observations Used    11

Analysis of Variance

Source             DF   Sum of Squares   Mean Square   F Value   Pr > F
Model               1        368.98798     368.98798     20.59   0.0014
Error               9        161.30839      17.92315
Corrected Total    10        530.29636

Root MSE           4.23357   R-Square   0.6958
Dependent Mean    21.51818   Adj R-Sq   0.6620
Coeff Var         19.67441

Parameter Estimates

Variable     DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept     1             12.03382          2.44923      4.91     0.0008
income        1              0.12606          0.02778      4.54     0.0014

a) r = 0.83: there is a strong positive correlation between taxes and income.
b) R-square = 0.70, meaning that around 70% of the variation about the line can be explained by using income to predict taxes.
c) r = 0.83 and p = 0.0014, meaning that the correlation between income and taxes is statistically significant.
d) The regression line is taxes = 12.03 + 0.126(income).
e) s.e.e. = 4.23, which is roughly the average amount each of our predictions should be off.
f) When gross income is $80,000 (income = 80 if income is recorded in thousands of dollars, as the data suggest), the predicted tax is 12.03 + 0.126(80), about 22.1, i.e. roughly $22,100 (a worked check appears after the summary paragraph below).

The purpose of the study was to see whether taxes could be predicted from an individual's income. A sample of 11 individuals' 2009 income and tax returns was collected. Income for this sample ranged from 9.1 to 150.7, and taxes had a mean of 21.52 with s = 7.28. There was a positive correlation between income and taxes (r = 0.83). The model was significant, and the predictor equation for taxes is: taxes = 12.03 + 0.126(income), valid for incomes between 9.1 and 150.7. The coefficient of determination (R-square) indicates that around 70% of the variation about the line can be explained by using income to predict taxes. The s.e.e. was around 4.23, which is roughly the average amount each of our predictions should be off.
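
The prediction in (f) can be checked with a short scoring step. The sketch below assumes income and taxes are recorded in thousands of dollars, so a gross income of $80,000 enters the model as income = 80; the data set name predict_tax is only illustrative.

data predict_tax;
   * predicted taxes (in thousands of dollars) for a gross income of $80,000;
   income  = 80;
   tax_hat = 12.03382 + 0.12606*income;
run;

proc print data=predict_tax noobs;
run;

This gives a tax_hat of about 22.1, i.e. roughly $22,100.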

data tax;
input income taxes ;
datalines;
38.7 16.0
80.5 20.1
14.8 11.1
47.3 24.3
9.1 10.2
150.7 30.4
55.9 27.3
110.2 27.9
73.2 16.2
146.8 29.8
100.4 23.4
;
ods graphics on;
ods rtf file = 'tax.rtf';
proc means n mean stddev min max skew ;
var income taxes ;
run;
proc corr nosimple ;
var income taxes ;
run;
proc reg ;
model taxes=income ;
run;
ods graphics off;
ods rtf close ;
run;
