
02/12/20

REGRESSION ANALYSIS:
Regression analysis is concerned with the formulation and determination of algebraic expressions for the relationship between two variables. These expressions are also known as "regression lines". The regression lines, i.e. the exact algebraic forms of the relationship, are then used to predict the value of one variable from that of the other. The variable whose value is to be predicted is called the dependent or explained variable, and the variable used for prediction is called the independent or explanatory variable.

DIFFERENCE BETWEEN CORRELATION AND REGRESSION ANALYSIS:
1. The correlation coefficient is a quantitative measure of the extent or degree of linear relationship between two variables, whereas regression describes an average relationship between two variables.
2. Correlation does not necessarily establish a cause-and-effect relationship. In regression analysis, however, the variables are explicitly assigned the roles of cause (independent variable) and effect (dependent variable).
3. In correlation analysis, the correlation coefficient $r_{xy} = r_{yx}$ measures the linear relationship between the variables $x$ and $y$ symmetrically. In regression analysis, on the other hand, the identity of the variables (dependent or independent) is important, and in general the regression coefficients differ: $b_{xy} \neq b_{yx}$.

4. The correlation coefficient $r_{xy}$ is a relative measure of the linear relationship between $x$ and $y$, while the regression coefficients $b_{xy}$ and $b_{yx}$ are absolute measures of the change in the value of one variable corresponding to a unit change in the value of the other.
5. Correlation analysis is confined to the study of the linear relationship between two variables, whereas regression analysis deals with both linear and non-linear relationships between variables.

REGRESSION LINES:
In a simple regression analysis there are two regression lines:
$x$ (dependent variable) on $y$ (independent variable)
and
$y$ (dependent variable) on $x$ (independent variable).
A regression line is the straight line that gives the best fit, in the least-squares sense, to the given data.
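
To make "best fit in the least-squares sense" concrete, the following sketch writes the line of $y$ on $x$ as $y = a + bx$ and minimises the sum of squared vertical deviations. The symbols $a$, $b$ and $S$ are notation assumed here for illustration only. For the line of $x$ on $y$ the roles of the two variables are exchanged and horizontal deviations are minimised instead.

```latex
\begin{align*}
S(a, b) &= \sum_{i=1}^{n} \left( y_i - a - b x_i \right)^2 \\
\frac{\partial S}{\partial a} = 0 &\;\Rightarrow\; \sum y_i = n a + b \sum x_i \\
\frac{\partial S}{\partial b} = 0 &\;\Rightarrow\; \sum x_i y_i = a \sum x_i + b \sum x_i^2 \\
b &= \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - \left( \sum x_i \right)^2},
\qquad a = \bar{y} - b \bar{x}
\end{align*}
```

The last line shows that the fitted line passes through the point of means $(\bar{x}, \bar{y})$.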

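REGRESSION COEFFICIENTS:
Let $\bar{x}$ and $\bar{y}$ be the means of the series $X$ and $Y$, and let $\sigma_x$, $\sigma_y$ be their standard deviations.
Then the regression coefficient of $y$ on $x$, in the line $Y - \bar{y} = b_{yx}(X - \bar{x})$, is
$b_{yx} = r \frac{\sigma_y}{\sigma_x}$,
and the regression coefficient of $x$ on $y$, in the line $X - \bar{x} = b_{xy}(Y - \bar{y})$, is
$b_{xy} = r \frac{\sigma_x}{\sigma_y}$.
Hence $b_{xy} \times b_{yx} = r^2$, and the correlation coefficient between $x$ and $y$ for the given lines of regression is
$r = \pm\sqrt{b_{xy} \times b_{yx}}$,
where $r$ takes the common sign of the two regression coefficients.

NOTE:
1. The mean of $x$ ($\bar{x}$) and the mean of $y$ ($\bar{y}$) lie on both regression lines, i.e. the lines intersect at $(\bar{x}, \bar{y})$.
2. The correlation coefficient $r$ between $x$ and $y$ for the given lines of regression is negative when both regression coefficients $b_{xy}$ and $b_{yx}$ are negative.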
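
The relations between the two regression coefficients and the correlation coefficient can be checked numerically. The sketch below is a minimal illustration, assuming population standard deviations (division by n). The function name `regression_summary` and the small data set are hypothetical and chosen only for this example.

```python
import math

def regression_summary(xs, ys):
    """Return (b_yx, b_xy, r) using population standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and of cross-products about the means
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sigma_x = math.sqrt(sxx / n)
    sigma_y = math.sqrt(syy / n)
    r = sxy / (n * sigma_x * sigma_y)
    b_yx = r * sigma_y / sigma_x   # slope of the line of y on x
    b_xy = r * sigma_x / sigma_y   # slope of the line of x on y
    return b_yx, b_xy, r

# Hypothetical data, for illustration only
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]
b_yx, b_xy, r = regression_summary(xs, ys)
print(f"b_yx = {b_yx:.4f}, b_xy = {b_xy:.4f}, r = {r:.4f}")
print(f"b_xy * b_yx = {b_xy * b_yx:.4f}, r^2 = {r * r:.4f}")
```

The two slopes differ, while their product equals $r^2$, which is why $r$ can be recovered from the regression coefficients alone.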

EXAMPLE:
Find the correlation coefficient between $x$ and $y$ when the lines of regression are $2x - 9y + 6 = 0$ and $x - 2y + 1 = 0$.

SOLUTION:
Let the line of regression of $x$ on $y$ be $2x - 9y + 6 = 0$. Then the line of regression of $y$ on $x$ is $x - 2y + 1 = 0$. Then
$2x - 9y + 6 = 0 \Rightarrow x + 3 = \frac{9}{2} y \Rightarrow b_{xy} = \frac{9}{2}$,
and
$x - 2y + 1 = 0 \Rightarrow y = \frac{1}{2} x + \frac{1}{2} \Rightarrow b_{yx} = \frac{1}{2}$.
So
$r = \sqrt{b_{xy} b_{yx}} = \sqrt{\frac{9}{4}} = \frac{3}{2} > 1$, which is not possible. Thus, our choice of regression lines is incorrect.

Therefore, the line of regression of $x$ on $y$ is $x - 2y + 1 = 0$ and the line of regression of $y$ on $x$ is $2x - 9y + 6 = 0$. Then
$x - 2y + 1 = 0 \Rightarrow x + 1 = 2y \Rightarrow b_{xy} = 2$,
and
$2x - 9y + 6 = 0 \Rightarrow y = \frac{2}{9} x + \frac{6}{9} \Rightarrow b_{yx} = \frac{2}{9}$.
So
$r = \sqrt{b_{xy} b_{yx}} = \sqrt{\frac{4}{9}} = \frac{2}{3}$.
Hence, the correlation coefficient between $x$ and $y$ is $2/3$.
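
The trial-and-error step in this solution (try one assignment of the two lines, reject it if the product of the coefficients exceeds 1) can be sketched in code. This is a minimal sketch under the assumption that each line is given as coefficients `(a, b, c)` of `a*x + b*y + c = 0` with `a` and `b` non-zero. The helper names `slopes`, `consistent_readings` and `intersection` are hypothetical. Because both lines pass through the point of means, the same sketch also recovers $(\bar{x}, \bar{y})$ as their intersection.

```python
from fractions import Fraction
import math

def slopes(line):
    """For a*x + b*y + c = 0 return (slope read as y on x, slope read as x on y)."""
    a, b, _ = (Fraction(v) for v in line)
    return -a / b, -b / a

def consistent_readings(line1, line2):
    """Try both assignments of the lines; keep those with 0 <= b_yx * b_xy <= 1."""
    results = []
    for y_on_x, x_on_y in ((line1, line2), (line2, line1)):
        b_yx = slopes(y_on_x)[0]
        b_xy = slopes(x_on_y)[1]
        product = b_yx * b_xy
        if 0 <= product <= 1:
            # r carries the common sign of the two regression coefficients
            sign = -1 if b_yx < 0 else 1
            results.append((sign * math.sqrt(product), b_yx, b_xy))
    return results

def intersection(line1, line2):
    """Point where the two lines meet, i.e. (mean of x, mean of y)."""
    a1, b1, c1 = (Fraction(v) for v in line1)
    a2, b2, c2 = (Fraction(v) for v in line2)
    det = a1 * b2 - a2 * b1
    if det == 0:
        raise ValueError("lines are parallel or identical")
    return (b1 * c2 - b2 * c1) / det, (a2 * c1 - a1 * c2) / det

readings = consistent_readings((2, -9, 6), (1, -2, 1))
means = intersection((2, -9, 6), (1, -2, 1))
print(readings)
print(means)
```

For the two lines of this example only one assignment survives the check, with $b_{yx} = 2/9$ and $b_{xy} = 2$, in agreement with the hand calculation.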



EXAMPLE:
If the regression equations are $x + 0.87y - 19.13 = 0$ and $0.50x + y - 11.64 = 0$, find (a) the mean of $x$, (b) the mean of $y$, and (c) the correlation coefficient between $x$ and $y$.

SOLUTION:
Since the mean of $x$ and the mean of $y$ lie on both regression lines, we have
$\bar{x} + 0.87\bar{y} - 19.13 = 0$   (1)
$0.50\bar{x} + \bar{y} - 11.64 = 0$   (2)
Solving (1) and (2), we get $\bar{x} \approx 15.93$ and $\bar{y} \approx 3.67$.
Taking (2) as the line of $y$ on $x$, the regression coefficient of $y$ on $x$ is $b_{yx} = -0.50$. Taking (1) as the line of $x$ on $y$, the regression coefficient of $x$ on $y$ is $b_{xy} = -0.87$. (The opposite assignment would give $b_{xy} b_{yx} > 1$, which is not possible.)
Therefore, since both regression coefficients are negative, the correlation coefficient is
$r = -\sqrt{b_{xy} b_{yx}} = -\sqrt{(-0.50)(-0.87)} = -\sqrt{0.435} \approx -0.66$.

EXAMPLE:
The following table records the scores made by salesmen on an intelligence test and their weekly sales:

Salesman:       1     2     3     4     5     6     7     8     9     10
Test score:     40    70    50    60    80    50    90    40    60    60
Sales ('000):   2.5   6.0   4.5   5.0   4.5   2.0   5.5   3.0   4.5   3.0

Calculate the regression line of sales on test scores and estimate the most probable weekly sales volume if a salesman makes a score of 70.

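SOLUTION:
Let $x$ denote the test score and $y$ the sales. Mean of test scores: $\bar{x} = 600/10 = 60$. Mean of sales: $\bar{y} = 40.5/10 = 4.05$. Put $u = x - \bar{x}$ and $v = y - \bar{y}$.

x       y       u      v       uv      u^2     v^2
40      2.5    -20    -1.55    31      400     2.4025
70      6.0     10     1.95    19.5    100     3.8025
50      4.5    -10     0.45    -4.5    100     0.2025
60      5.0      0     0.95     0        0     0.9025
80      4.5     20     0.45     9      400     0.2025
50      2.0    -10    -2.05    20.5    100     4.2025
90      5.5     30     1.45    43.5    900     2.1025
40      3.0    -20    -1.05    21      400     1.1025
60      4.5      0     0.45     0        0     0.2025
60      3.0      0    -1.05     0        0     1.1025
Total:
600     40.5     0     0.00   140     2400    16.225

$\sigma_x^2 = \frac{\sum (x - \bar{x})^2}{n} = \frac{\sum u^2}{n} = \frac{2400}{10} = 240$
$\sigma_y^2 = \frac{\sum (y - \bar{y})^2}{n} = \frac{\sum v^2}{n} = \frac{16.225}{10} = 1.6225$

$r = \frac{n \sum uv - \sum u \sum v}{\sqrt{n \sum u^2 - (\sum u)^2} \sqrt{n \sum v^2 - (\sum v)^2}} = 0.70946$
$b_{yx} = r \frac{\sigma_y}{\sigma_x} = 0.058$

The regression line of $y$ on $x$ is
$y - \bar{y} = b_{yx}(x - \bar{x}) \Rightarrow y - 4.05 = 0.058(x - 60) \Rightarrow y = 0.058x + 0.57$.
When $x = 70$, $y = 4.63$.
Thus, the most probable weekly sales volume is 4.63 (in thousands).

EXAMPLE:
Obtain the line of regression of $y$ on $x$ for the given data:

x:   1.53    1.78    2.60    2.95    3.42
y:   33.50   36.30   40.00   45.80   53.50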

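SOLUTION:
The regression line of $y$ on $x$ is given by $y - \bar{y} = b_{yx}(x - \bar{x})$, where
$b_{yx} = r \frac{\sigma_y}{\sigma_x} = \frac{n \sum xy - \sum x \sum y}{n \sum x^2 - (\sum x)^2}$, $\bar{x} = \frac{\sum x}{n}$, $\bar{y} = \frac{\sum y}{n}$.

x       y        x^2        xy
1.53    33.50    2.3409     51.255
1.78    36.30    3.1684     64.614
2.60    40.00    6.7600     104.000
2.95    45.80    8.7025     135.110
3.42    53.50    11.6964    182.970
Total:
12.28   209.10   32.6682    537.949

With $n = 5$:
$b_{yx} = \frac{5(537.949) - (12.28)(209.1)}{5(32.6682) - (12.28)^2} = \frac{121.997}{12.543} = 9.726$, $\bar{x} = \frac{12.28}{5} = 2.456$, $\bar{y} = \frac{209.1}{5} = 41.82$.

Therefore, $y - 41.82 = 9.726(x - 2.456) \Rightarrow y = 17.933 + 9.726x$.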
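
Both least-squares examples above (sales on test scores, and the five-point data set) rest on the same slope formula $b_{yx} = \frac{n \sum xy - \sum x \sum y}{n \sum x^2 - (\sum x)^2}$. The sketch below is one minimal way to check the arithmetic. The function name `fit_y_on_x` is hypothetical, and the code works with unrounded values, so the last digits can differ slightly from a hand calculation that rounds the slope first.

```python
def fit_y_on_x(xs, ys):
    """Return (b_yx, intercept) of the least-squares line y = intercept + b_yx * x."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    b_yx = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    # The line passes through the point of means (mean of x, mean of y)
    intercept = sum_y / n - b_yx * sum_x / n
    return b_yx, intercept

# Salesmen example: test scores (x) and sales in thousands (y)
scores = [40, 70, 50, 60, 80, 50, 90, 40, 60, 60]
sales = [2.5, 6.0, 4.5, 5.0, 4.5, 2.0, 5.5, 3.0, 4.5, 3.0]
b_sales, a_sales = fit_y_on_x(scores, sales)
predicted_sales = a_sales + b_sales * 70
print(f"sales on score: slope = {b_sales:.4f}, prediction at 70 = {predicted_sales:.2f}")

# Five-point example
xs = [1.53, 1.78, 2.60, 2.95, 3.42]
ys = [33.50, 36.30, 40.00, 45.80, 53.50]
b_yx, a = fit_y_on_x(xs, ys)
print(f"y on x: slope = {b_yx:.3f}, intercept = {a:.3f}")
```

For the salesmen data the unrounded slope is 140/2400, which gives an intercept of 0.55 rather than the 0.57 obtained after rounding the slope to 0.058. The prediction at a score of 70 is 4.63 either way.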
