Example
Life expectancy (years)     66  54  43  42  49  45  64  61  61  66
Birth rate (per thousand)   30  38  38  43  34  42  31  32  26  34
Steps:
1. Construct a SCATTERPLOT on your calculator
after identifying the explanatory variable (EV) and the response variable (RV).
∴ y = ______
4. INTERPRET the COEFFICIENTS of the regression line, i.e. the slope and intercept.
For the regression equation y = a + bx:
The slope (b) estimates the average change (increase/decrease) in the response
variable (y) for each one-unit increase in the explanatory variable (x).
The intercept (a) estimates the average value of the response variable (y) when
the explanatory variable (x) equals 0.
Slope –
Intercept –
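As a check on the calculator output, the coefficients for the life expectancy example can also be computed directly. The following is a minimal Python sketch, not part of the original handout. It assumes the ten data pairs as read from the example table, and the function name `least_squares` is purely illustrative.

```python
# Data pairs as read from the example table (assumed: 10 countries).
birth_rate = [30, 38, 38, 43, 34, 42, 31, 32, 26, 34]       # EV, x
life_expectancy = [66, 54, 43, 42, 49, 45, 64, 61, 61, 66]  # RV, y


def least_squares(x, y):
    """Return (a, b) for the least squares line y = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b = s_xy / s_xx        # slope
    a = y_bar - b * x_bar  # intercept
    return a, b


a, b = least_squares(birth_rate, life_expectancy)
print(f"intercept a = {a:.2f}, slope b = {b:.2f}")
# prints: intercept a = 105.37, slope b = -1.44
```

Read this way, the slope says that life expectancy falls by about 1.44 years, on average, for each extra birth per thousand. The intercept (105.37 years at a birth rate of 0) lies far outside the observed birth rates, so it has no practical meaning here.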
When using a regression line to make predictions, we must be aware that strictly
speaking, the equation we have found applies only to the range of data values used
to derive the equation.
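The caution above can be made concrete with a small sketch. It assumes the regression equation quoted in the report for the life expectancy example (Life expectancy = 105.37 - 1.44 × Birth rate) and the birth rates read from the example table. The function name `predict_life_expectancy` is hypothetical.

```python
# Birth rates as read from the example table (assumed data).
birth_rate = [30, 38, 38, 43, 34, 42, 31, 32, 26, 34]

# Rounded coefficients as quoted in the handout's report.
A, B = 105.37, -1.44


def predict_life_expectancy(x):
    """Return (prediction, within_range) for a birth rate x."""
    within_range = min(birth_rate) <= x <= max(birth_rate)
    return A + B * x, within_range


for x in (35, 10):
    y_hat, within_range = predict_life_expectancy(x)
    note = "within the data range" if within_range else "outside the data range"
    print(f"birth rate {x}: predicted life expectancy {y_hat:.2f} years ({note})")
```

A birth rate of 35 lies inside the observed range (26 to 43), so that prediction is reasonably safe. A birth rate of 10 lies well outside it, so that prediction relies on the line holding where there is no data to support it.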
7. RESIDUALS
Residuals (errors of prediction) are the vertical distances between the individual data
points and the regression line. To calculate, use:
Residual = actual y-value – predicted y-value
Data points above the regression line have a positive residual
Data points below the regression line have a negative residual
Data points on the regression line have a zero residual
NB: For a least squares regression line, the residuals always sum to 0 (or very close to 0 after rounding).
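The definitions above can be checked on the life expectancy example. This is a minimal sketch, assuming the data pairs as read from the example table. The line is fitted at full precision, because the zero-sum property holds for the fitted coefficients rather than for heavily rounded ones.

```python
# Data pairs as read from the example table (assumed data).
x = [30, 38, 38, 43, 34, 42, 31, 32, 26, 34]  # birth rate
y = [66, 54, 43, 42, 49, 45, 64, 61, 61, 66]  # life expectancy

# Least squares slope and intercept, kept at full precision.
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
b = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
     / sum((xi - x_bar) ** 2 for xi in x))
a = y_bar - b * x_bar

# Residual = actual y-value - predicted y-value
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]

for xi, yi, res in zip(x, y, residuals):
    position = "above" if res > 0 else "below" if res < 0 else "on"
    print(f"x={xi}, y={yi}: residual {res:+.2f} ({position} the line)")

print(f"sum of residuals = {sum(residuals):.6f}")
```

The sum is zero apart from floating-point rounding, even though individual residuals are several years in size.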
Example 1
The equation of a regression line that enables hand span to be predicted from
height is: Hand span = 2.9 + 0.33 × Height
A person is 160 cm tall and has an actual hand span of 58.5 cm.
Using this regression equation, their predicted hand span is?
The residual value for this person is?
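One way to check the two answers is to carry out the arithmetic directly. The sketch below uses only the values given in the example. The variable names are illustrative.

```python
height_cm = 160          # given height
actual_span_cm = 58.5    # given actual hand span

# Hand span = 2.9 + 0.33 × Height
predicted_span_cm = 2.9 + 0.33 * height_cm

# Residual = actual y-value - predicted y-value
residual_cm = actual_span_cm - predicted_span_cm

print(f"predicted hand span = {predicted_span_cm:.1f} cm")  # 55.7 cm
print(f"residual = {residual_cm:.1f} cm")                   # 2.8 cm
```

The residual is positive, so this person's data point lies above the regression line.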
Testing the assumption of linearity.
A residual plot is a better test of linearity than the scatterplot alone. We plot the
residual of each data point against the explanatory variable (x-axis). Because the mean
of the residuals is always zero, the horizontal line at zero is a useful reference: it
corresponds to the regression line.
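For a plot done by hand, the points to be plotted are the pairs (x, residual). The sketch below computes them for the life expectancy example, assuming the data pairs as read from the example table, and prints a rough text version of the plot. The layout is an illustrative choice: the plot is turned on its side, with x running down the page and the residual running across, and `|` marks the zero line.

```python
# Data pairs as read from the example table (assumed data).
birth_rate = [30, 38, 38, 43, 34, 42, 31, 32, 26, 34]
life_expectancy = [66, 54, 43, 42, 49, 45, 64, 61, 61, 66]

# Least squares slope and intercept.
n = len(birth_rate)
x_bar = sum(birth_rate) / n
y_bar = sum(life_expectancy) / n
b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(birth_rate, life_expectancy))
     / sum((x - x_bar) ** 2 for x in birth_rate))
a = y_bar - b * x_bar

# One (x, residual) point per data value, ordered by x.
points = sorted((x, y - (a + b * x)) for x, y in zip(birth_rate, life_expectancy))

HALF_WIDTH = 12  # columns either side of the zero line; one column per year
for x, residual in points:
    row = [" "] * (2 * HALF_WIDTH + 1)
    row[HALF_WIDTH] = "|"  # zero line
    # Clamp so an unusually large residual cannot fall off the row.
    column = HALF_WIDTH + max(-HALF_WIDTH, min(HALF_WIDTH, round(residual)))
    row[column] = "*"
    print(f"{x:>3} {''.join(row)} {residual:+.2f}")
```

The stars fall on both sides of the zero line with no obvious curve or trend as x increases, which is what a linear model should produce.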
A residual plot can be done by hand or on your calculator.
Using your calculator:
Interpreting residuals
No clear pattern (points scattered randomly about the zero line) indicates that the current model is appropriate.
A clear pattern (for example, a curve) indicates that another model may be more appropriate.
Conclusion
The lack of a clear pattern in the residual plot
confirms the assumption of a linear association.
y=a+bx is an appropriate model.
Conclusion
The residual plot indicates a distinct pattern
suggesting that a non-linear model could be
more appropriate.
8. Write a REPORT on findings (combine all the above information together).
From the scatterplot we see that there is a strong, negative, linear relationship
between life expectancy and birth rate, r = -0.8069. There are no obvious outliers.
The equation of the least squares regression line is: Life expectancy = 105.37 - 1.44 ×
Birth rate. The slope predicts that, on average, life expectancy decreases by 1.44
years for an increase in birth rate of one birth per 1000 people. The coefficient of
determination (r² = 0.6511) indicates that 65.11% of the variation in life expectancy is
explained by the variation in birth rate. The residual plot shows no clear pattern, which
confirms that a linear equation is appropriate for describing the relationship between
life expectancy and birth rate.
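The correlation coefficient and coefficient of determination quoted in the report can be reproduced from the data. This is a minimal sketch, assuming the data pairs as read from the example table. It uses the standard formula for Pearson's r.

```python
import math

# Data pairs as read from the example table (assumed data).
birth_rate = [30, 38, 38, 43, 34, 42, 31, 32, 26, 34]
life_expectancy = [66, 54, 43, 42, 49, 45, 64, 61, 61, 66]

n = len(birth_rate)
x_bar = sum(birth_rate) / n
y_bar = sum(life_expectancy) / n

# Sums of squares and cross-products about the means.
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(birth_rate, life_expectancy))
s_xx = sum((x - x_bar) ** 2 for x in birth_rate)
s_yy = sum((y - y_bar) ** 2 for y in life_expectancy)

r = s_xy / math.sqrt(s_xx * s_yy)  # Pearson's correlation coefficient
r_squared = r * r                  # coefficient of determination

print(f"r = {r:.4f}")            # r = -0.8069
print(f"r^2 = {r_squared:.4f}")  # r^2 = 0.6511
```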
Example 2
A student fits a least squares line to a set of bivariate data as
shown in the scatterplot opposite.
The residual plot for this least squares line would look like:
4C & 4D