
Summary for Correlation & Regression

- Given two variables X and Y, we want to study the relationship between them.

Do X and Y have a strong linear relationship?

Two tools: the correlation coefficient, r, and the scatter diagram.

- r only indicates the strength of a linear relationship; it DOES NOT confirm that the relationship is linear.
- A scatter diagram is a more reliable way to investigate a possible relationship between X and Y.
- If possible, always check the scatter diagram first to see whether a linear relationship is plausible, then use the value of r for a more precise measure of its strength.
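
To illustrate why r alone cannot confirm linearity, here is a minimal Python sketch. The data (y = x² for x = 1, ..., 10) and the helper name `pearson_r` are made up for illustration and are not part of these notes.

```python
import math

def pearson_r(xs, ys):
    """Product moment correlation coefficient of paired data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: an exactly quadratic (non-linear) relationship.
xs = list(range(1, 11))
ys = [x ** 2 for x in xs]

r = pearson_r(xs, ys)
print(round(r, 3))  # above 0.97, although the points lie on a curve
```

A scatter diagram of these points would show the curvature immediately, which the value of r by itself hides.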

Finding the equation of the linear relationship between X and Y

- Find the equation of the least squares regression line to describe the linear relationship between X and Y.
- Two questions to answer:
  1. Which model should I use: X on Y or Y on X?
  2. When given a model, say Y on X, when is it all right to use X on Y instead?

   X                              Y                              Correct model to use   Alright to use the other model?
1  Controlled                     Random                         Y on X                 No
2  Random                         Controlled                     X on Y                 No
3  Random                         Random (subject of interest)   Y on X                 Provided r is very close to +1 or -1
4  Random (subject of interest)   Random                         X on Y                 Provided r is very close to +1 or -1

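y on x: $y - \bar{y} = b(x - \bar{x}) \;\Rightarrow\; y = \underbrace{\bar{y} - b\bar{x}}_{a} + bx$

x on y: $x - \bar{x} = b'(y - \bar{y}) \;\Rightarrow\; x = \underbrace{\bar{x} - b'\bar{y}}_{a'} + b'y$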

(The formulae for b and b' can be found in MF15. Usually we use the G.C. to do the computation.)
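
For readers without a G.C. to hand, the two regression lines can be sketched in Python. This assumes the standard least squares formulas b = Sxy/Sxx and b' = Sxy/Syy (expected, but not checked here, to agree with MF15). The function name `regression_lines` and the data are hypothetical.

```python
def regression_lines(xs, ys):
    """Return ((a, b), (a_prime, b_prime)) for y = a + b*x and x = a' + b'*y."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    b = sxy / sxx                      # gradient of y on x
    a = y_bar - b * x_bar              # intercept of y on x
    b_prime = sxy / syy                # gradient of x on y
    a_prime = x_bar - b_prime * y_bar  # intercept of x on y
    return (a, b), (a_prime, b_prime)

# Hypothetical sample data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

(a, b), (a_prime, b_prime) = regression_lines(xs, ys)
print(f"y on x: y = {a:.2f} + {b:.2f}x")
print(f"x on y: x = {a_prime:.2f} + {b_prime:.2f}y")
print(f"b * b' = {b * b_prime:.2f}")  # equals r squared
```

The product b·b' equals r², so the two lines coincide only when r = ±1. This is why using the other model is acceptable only when r is very close to +1 or -1.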

- Any least squares regression line always passes through the point $(\bar{x}, \bar{y})$.
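
A short worked check of this fact, using the same symbols as the two line equations in these notes: substituting the mean of one variable into either line returns the mean of the other.

```latex
\begin{align*}
\text{$y$ on $x$:}\quad & y - \bar{y} = b(x - \bar{x}), \quad x = \bar{x} \;\Rightarrow\; y = \bar{y} \\
\text{$x$ on $y$:}\quad & x - \bar{x} = b'(y - \bar{y}), \quad y = \bar{y} \;\Rightarrow\; x = \bar{x}
\end{align*}
```

So the two regression lines always intersect at $(\bar{x}, \bar{y})$.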

What if X and Y do not follow a linear relationship?

- Try linearisation: transform the equation between y and x into a linear form.
- Apply the concepts of regression and correlation to the new linear equation.
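
As one possible illustration of linearisation, suppose the relationship is assumed to be y = A·e^(Bx). This model is chosen here only as an example. Taking logarithms gives ln y = ln A + Bx, which is linear in x. The sketch below fits ln y on x; the helper `fit_line` and the data are hypothetical.

```python
import math

def fit_line(us, vs):
    """Least squares line v = c + m*u; returns (c, m)."""
    n = len(us)
    u_bar = sum(us) / n
    v_bar = sum(vs) / n
    suv = sum((u - u_bar) * (v - v_bar) for u, v in zip(us, vs))
    suu = sum((u - u_bar) ** 2 for u in us)
    m = suv / suu
    return v_bar - m * u_bar, m

# Hypothetical data generated exactly from y = 2 * e^(0.5 x).
xs = [0, 1, 2, 3, 4]
ys = [2 * math.exp(0.5 * x) for x in xs]

# Linearise: regress ln y on x, then transform the intercept back.
ln_ys = [math.log(y) for y in ys]
c, m = fit_line(xs, ln_ys)
A_hat = math.exp(c)  # estimate of A, since c = ln A
B_hat = m            # estimate of B

print(round(A_hat, 6), round(B_hat, 6))
```

Because the made-up data follow the assumed model exactly, the fit recovers A = 2 and B = 0.5 up to rounding. With real data, check the scatter diagram of ln y against x before trusting the fit.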

Prediction / Estimation

Suppose we have a model y on x (similarly for x on y):

Interpolation: predicting a value of y when the given x value lies within the range of x values in our sample data (similarly for predicting x given y).

Extrapolation: predicting a value of y when the given x value lies OUTSIDE the range of x values in our sample data (similarly for predicting x given y).

- Since our data only give us information about the variables within the observed range of values, extrapolation should be avoided whenever possible.
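
The distinction can be made explicit in code. The sketch below predicts y from a y-on-x line and labels each prediction as interpolation or extrapolation. The function name `predict_y` and the data are hypothetical.

```python
def predict_y(x_new, xs, ys):
    """Predict y at x_new from the y-on-x line; also say which kind of prediction it is."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    a = y_bar - b * x_bar
    # Inside the observed range of x: interpolation. Outside: extrapolation.
    kind = "interpolation" if min(xs) <= x_new <= max(xs) else "extrapolation"
    return a + b * x_new, kind

# Hypothetical sample data with x between 1 and 5.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

y_in, kind_in = predict_y(3.5, xs, ys)
y_out, kind_out = predict_y(10, xs, ys)
print(round(y_in, 2), kind_in)
print(round(y_out, 2), kind_out)
```

The second prediction is still computed, but the label is a reminder that the data say nothing about whether the linear trend continues as far as x = 10.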
