
SIMPLE REGRESSION AND CORRELATION ANALYSIS

Maria Nerissa C. Gomopas


REGRESSION ANALYSIS

REGRESSION ANALYSIS is concerned with the problem of estimation and forecasting. If you are given a series of values for two variables, where the values of one variable depend on the other, it is possible to estimate the value of the dependent variable corresponding to a given value of the independent variable.

Example: Given the weights of persons corresponding to specific heights, estimate the possible weight of a person whose height is …

REGRESSION ANALYSIS is used when predicting the behavior of a variable. The regression equation explains the amount of variation observable in the dependent variable y. It is the equation of a straight line in the form:

y = a + bx

where
y = criterion measure (dependent variable)
x = predictor (independent variable)
a = y-intercept, the point where the regression line crosses the y-axis
b = beta weight or slope of the line

The constants a and b of the line y = a + bx are computed from the data as:

a = [(Σy)(Σx²) – (Σx)(Σxy)] / [n(Σx²) – (Σx)²]
b = [n(Σxy) – (Σx)(Σy)] / [n(Σx²) – (Σx)²]

where n = number of pairs
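
The two formulas for a and b can be sketched in Python as follows. This is only an illustration: the function name `regression_coefficients` and the three sample points are assumptions made for this sketch and do not come from the slides.

```python
def regression_coefficients(n, sum_x, sum_y, sum_xy, sum_x2):
    """Return (a, b) for the line y = a + bx from the five summary totals."""
    denominator = n * sum_x2 - sum_x ** 2
    if denominator == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    a = (sum_y * sum_x2 - sum_x * sum_xy) / denominator
    b = (n * sum_xy - sum_x * sum_y) / denominator
    return a, b


# Illustrative data (not from the slides): the points (1, 3), (2, 5), (3, 7)
# lie exactly on the line y = 1 + 2x.
xs = [1, 2, 3]
ys = [3, 5, 7]
a, b = regression_coefficients(
    n=len(xs),
    sum_x=sum(xs),
    sum_y=sum(ys),
    sum_xy=sum(x * y for x, y in zip(xs, ys)),
    sum_x2=sum(x * x for x in xs),
)
print(a, b)
```

Because the sample points lie exactly on a line, the function should recover an intercept of 1 and a slope of 2.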


Example 1.
The table shows the final grades of 10 students in
Algebra and Statistics.

Algebra (x)      75  80  93  65  87  71  98  68  84  77
Statistics (y)   82  78  86  72  91  80  95  72  89  74

What is a student's expected grade in Statistics if his grade in Algebra is 78? 85?
Solution

No. of Pairs   Algebra (x)   Statistics (y)   xy      x²
1              75            82               6150    5625
2              80            78               6240    6400
3              93            86               7998    8649
4              65            72               4680    4225
5              87            91               7917    7569
6              71            80               5680    5041
7              98            95               9310    9604
8              68            72               4896    4624
9              84            89               7476    7056
10             77            74               5698    5929
n = 10   Σx = 798   Σy = 819   Σxy = 66045   Σx² = 64722

a = [819(64722) – 798(66045)] / [10(64722) – (798)²] = 29.129 or 29.13
b = [10(66045) – 798(819)] / [10(64722) – (798)²] = 0.661 or 0.66

y = a + bx
y = 29.13 + 0.66x

Thus, if the Algebra grade is 78 (x = 78):
y = 29.13 + 0.66(78)
y = 80.61 or 81
Then the expected Statistics grade is 81.

If the Algebra grade is 85 (x = 85):
y = 29.13 + 0.66(85)
y = 85.23 or 85
Then the expected Statistics grade is 85.
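
As a cross-check of this example, the sketch below recomputes a and b from the ten grade pairs. It uses the deviation form b = Σ(x – x̄)(y – ȳ) / Σ(x – x̄)² and a = ȳ – b·x̄, which is algebraically equivalent to the summation formulas but is not the form used in the slides. The variable and function names are assumptions made for this sketch.

```python
algebra = [75, 80, 93, 65, 87, 71, 98, 68, 84, 77]            # x
statistics_grades = [82, 78, 86, 72, 91, 80, 95, 72, 89, 74]  # y

n = len(algebra)
mean_x = sum(algebra) / n
mean_y = sum(statistics_grades) / n

# Sums of cross-products and squared deviations from the means
s_xy = sum((x - mean_x) * (y - mean_y)
           for x, y in zip(algebra, statistics_grades))
s_xx = sum((x - mean_x) ** 2 for x in algebra)

b = s_xy / s_xx          # slope
a = mean_y - b * mean_x  # intercept


def predict(x):
    """Expected Statistics grade for an Algebra grade x."""
    return a + b * x


print(round(a, 3), round(b, 3))
for grade in (78, 85):
    print(grade, round(predict(grade), 2))
```

With unrounded coefficients the predictions come out at about 80.71 and 85.34, slightly different from the hand computation that uses a = 29.13 and b = 0.66, but they still round to 81 and 85.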

The graphical representation of the data is called a scatter plot or scatter diagram. The line in the scatter diagram is called the trend line. The line can be used to estimate the expected grades in Statistics if the grades in Algebra are 78 and 85.

[Figure: scatter plot with trend line. x-axis: Grades in Algebra; y-axis: Grades in Statistics]

Example 2.
The data in the table represent the membership of a university mathematics club during the past 5 years.

Year (x)   Membership (y)
2008       25
2009       30
2010       32
2011       45
2012       50

Find the equation in the form y = a + bx to predict the membership 3 years from now.
Solution: Let Σx = 0 by coding the years as x = year – 2010.

Year   x    Membership (y)   xy    x²
2008   -2   25               -50   4
2009   -1   30               -30   1
2010    0   32                 0   0
2011    1   45                45   1
2012    2   50               100   4
Σx = 0   Σy = 182   Σxy = 65   Σx² = 10

a = [(Σy)(Σx²) – (Σx)(Σxy)] / [n(Σx²) – (Σx)²]
b = [n(Σxy) – (Σx)(Σy)] / [n(Σx²) – (Σx)²]

Since Σx = 0:

a = [(Σy)(Σx²) – (0)(Σxy)] / [n(Σx²) – (0)²] = Σy / n
b = [n(Σxy) – (0)(Σy)] / [n(Σx²) – (0)²] = Σxy / Σx²

n = 5   Σy = 182   Σxy = 65   Σx² = 10

a = Σy / n = 182 / 5 = 36.4
b = Σxy / Σx² = 65 / 10 = 6.5

y = 36.4 + 6.5x

Three years from now is 2015, so x = 5:
y = 36.4 + 6.5(5) = 68.9 or 69

Three years from now, the club would have 69 members.
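
The coded-year shortcut can be sketched in Python as follows. The sketch assumes an odd number of equally spaced years, so that subtracting the middle year makes the coded values sum to zero. The variable names are assumptions made for this sketch.

```python
years = [2008, 2009, 2010, 2011, 2012]
membership = [25, 30, 32, 45, 50]

# Code the years around the middle year so that the coded x values sum to 0
middle_year = years[len(years) // 2]
coded_x = [year - middle_year for year in years]
assert sum(coded_x) == 0

n = len(years)
# With Σx = 0 the general formulas reduce to a = Σy / n and b = Σxy / Σx²
a = sum(membership) / n
b = sum(x * y for x, y in zip(coded_x, membership)) / sum(x * x for x in coded_x)

target_year = years[-1] + 3  # three years after the last observed year
forecast = a + b * (target_year - middle_year)
print(a, b, target_year, forecast)
```

The forecast for 2015 comes out at about 68.9, which rounds to the 69 members stated in the solution.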


The trend line can be used to estimate the number of members in 2015. From the figure, the estimated membership in 2015 is between 65 and 70.

[Figure: scatter plot with trend line. x-axis: Year; y-axis: Membership]

CORRELATION ANALYSIS

CORRELATION AND REGRESSION ANALYSIS are closely interrelated topics. Regression analysis deals with the estimation of one variable based on the changes and movements of another variable. Correlation analysis is concerned with the relationship between the changes in such variables. It is a method of measuring the strength of the relationship between the two variables.

The following are examples of correlated variables.

1. A student's mental ability and academic performance in school are related.

2. There is a close relationship between reading comprehension and mathematical ability.

3. In the linear equation y = x + 1, the higher the value assigned to x, the higher the corresponding value of the dependent variable y.

4. In Physics, the larger the force exerted to push a body, the greater the acceleration of the body will be.

Pearson Product-Moment Correlation Coefficient

The most common statistical tool for measuring the linear relationship between two random variables, x and y, is the linear correlation coefficient commonly called the Pearson Product-Moment Correlation Coefficient, or Pearson r for short. This formula was developed and perfected by Karl Pearson.

r                  Verbal Interpretation
0.00 to ±0.20      Slight correlation
±0.21 to ±0.40     Low correlation
±0.41 to ±0.60     Moderate correlation
±0.61 to ±0.80     High correlation
±0.81 to ±1.00     Very high correlation
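
The interpretation table can be expressed as a small lookup function. This is a sketch only. The function name `interpret_r` is an assumption, and so is the handling of values that fall between two tabulated ranges, such as 0.205: here each range is treated as running up to and including its upper bound.

```python
def interpret_r(r):
    """Map a correlation coefficient to the verbal interpretation in the table."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    size = abs(r)  # the sign gives direction only; the label depends on magnitude
    if size <= 0.20:
        return "Slight correlation"
    if size <= 0.40:
        return "Low correlation"
    if size <= 0.60:
        return "Moderate correlation"
    if size <= 0.80:
        return "High correlation"
    return "Very high correlation"


for value in (0.15, -0.35, 0.727, -0.95):
    print(value, interpret_r(value))
```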


Example 1.
Test the hypothesis that there is no correlation between Mental Ability and English Proficiency at the 5% level of significance.

x = Mental Ability, y = English Proficiency

x     y       x     y       x     y
50    200     48    185     48    184
54    198     47    197     53    190
50    200     44    183     54    191
51    203     44    171     33    170
49    186     46    179     34    168
46    205     45    185
Solution
1. State the null hypothesis.
   H₀: There is no significant correlation between Mental Ability and English Proficiency.
2. Level of significance: α = 5%
3. Pearson r will be used to test the hypothesis.
4. Computation

Mental Ability (x)   English Proficiency (y)   xy        x²       y²
50                   200                       10,000    2,500    40,000
54                   198                       10,692    2,916    39,204
50                   200                       10,000    2,500    40,000
51                   203                       10,353    2,601    41,209
49                   186                        9,114    2,401    34,596
46                   205                        9,430    2,116    42,025
48                   185                        8,880    2,304    34,225
47                   197                        9,259    2,209    38,809
44                   183                        8,052    1,936    33,489
44                   171                        7,524    1,936    29,241
46                   179                        8,234    2,116    32,041
45                   185                        8,325    2,025    34,225
48                   184                        8,832    2,304    33,856
53                   190                       10,070    2,809    36,100
54                   191                       10,314    2,916    36,481
33                   170                        5,610    1,089    28,900
34                   168                        5,712    1,156    28,224
Σx = 796   Σy = 3,195   Σxy = 150,401   Σx² = 37,834   Σy² = 602,625



r = [n(Σxy) – (Σx)(Σy)] / √{[n(Σx²) – (Σx)²] [n(Σy²) – (Σy)²]}

r = [17(150,401) – (796)(3,195)] / √{[17(37,834) – (796)²] [17(602,625) – (3,195)²]}

r = 13,597 / √[(9,562)(36,600)]

r = 0.727

5. df = N – 2 = 17 – 2 = 15
6. Tabular value = 0.482
7. Reject the null hypothesis because the computed value, 0.727, is greater than the tabular value, 0.482.
8. There is a significant linear relationship between Mental Ability and English Proficiency. Since r = 0.727 is between ±0.61 and ±0.80, there is a high correlation between the two variables.
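
The computation and the decision rule in this example can be checked with a short Python sketch. The function name `pearson_r` and the variable names are assumptions made for this sketch. The critical value 0.482 is copied from the solution rather than looked up by the code.

```python
from math import sqrt

mental_ability = [50, 54, 50, 51, 49, 46, 48, 47, 44,
                  44, 46, 45, 48, 53, 54, 33, 34]            # x
english = [200, 198, 200, 203, 186, 205, 185, 197, 183,
           171, 179, 185, 184, 190, 191, 170, 168]           # y


def pearson_r(xs, ys):
    """Pearson r from the raw-score (summation) formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


r = pearson_r(mental_ability, english)
df = len(mental_ability) - 2
critical_value = 0.482  # tabular value quoted in the solution (df = 15, 5% level)
reject_null = abs(r) > critical_value
print(round(r, 3), df, reject_null)
```

The sketch should reproduce r = 0.727 with df = 15, and the comparison against 0.482 leads to rejecting the null hypothesis, as in steps 5 to 7.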
SCATTER PLOT OF DATA IN EXAMPLE 1
[Figure: scatter plot of the data in Example 1]

CORRELATION

• Measures the relative strength of the linear relationship between two variables
• Unit-less
• Ranges between –1 and 1
• The closer to –1, the stronger the negative linear relationship
• The closer to 1, the stronger the positive linear relationship
• The closer to 0, the weaker the linear relationship
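
The unit-less property and the sign convention can be illustrated with the Algebra and Statistics grades from the regression example. In this sketch the rescaling of x (multiplying by 10 and adding 5) is an arbitrary change of units chosen only for illustration, and the function name `pearson_r` is an assumption.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson r computed from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)


algebra = [75, 80, 93, 65, 87, 71, 98, 68, 84, 77]
statistics_grades = [82, 78, 86, 72, 91, 80, 95, 72, 89, 74]

r = pearson_r(algebra, statistics_grades)

# Changing the units of x (an arbitrary positive linear rescaling) leaves r unchanged
rescaled_algebra = [10 * x + 5 for x in algebra]
r_rescaled = pearson_r(rescaled_algebra, statistics_grades)

# Reversing the direction of y flips the sign of r but not its size
reversed_statistics = [-y for y in statistics_grades]
r_reversed = pearson_r(algebra, reversed_statistics)

print(r, r_rescaled, r_reversed)
```

The first two printed values should agree up to floating-point rounding, and the third should be the same number with the opposite sign.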
SCATTER PLOTS OF DATA WITH VARIOUS CORRELATION COEFFICIENTS
[Figure: six scatter plots of Y against X. Top row: r = -1, r = -.6, r = 0. Bottom row: r = +1, r = +.3, r = 0]



LINEAR CORRELATION
[Figure: scatter plots of Y against X illustrating strong relationships, weak relationships, and no relationship]
