Sense of Regression

Making Sense of Regression Results
Kwamina Banson
Socio-Economics Department
BNARI Seminar Room
30 July 2009
Linear Regression: Introduction
Interpreting SPSS regression output

Coefficients for independent variables
Fit of the regression: R Square
Statistical significance
How to reject the null hypothesis
Multivariate regressions
Academic Performance of Junior High Sch.
What is SPSS?
SPSS is a computer program used for a wide variety of statistical analyses. Its name has been expanded both as Statistical Package for the Social Sciences and as Statistical Product and Service Solutions. In addition to statistical analysis, data management and data documentation are features of the base software. Statistics included in the base software:
Descriptive statistics: Cross tabulation, Frequencies, Descriptives, Explore, Descriptive Ratio Statistics
Bivariate statistics: Means, t-test, ANOVA, Correlation (bivariate, partial, distances), Nonparametric tests
Prediction for numerical outcomes: Linear regression
Prediction for identifying groups: Factor analysis, cluster analysis (two-step, K-means, hierarchical), Discriminant

y = mx + b, where m is the slope of the line (the coefficient) and b is the y-intercept (the constant).
[Figure: scatterplot of Graduation Rate (vertical axis, 0 to 100) against Average SAT Score (horizontal axis, 0 to 1600) with the fitted regression line. Rsq = 0.3454. How tight is the fit?]
An SPSS regression output includes two key tables for interpreting your results:
A Coefficients table that contains the y-intercept (or constant) of the regression, a coefficient for every independent variable, and the standard error of that coefficient.
A Model Summary table that gives you information on the fit of your regression.
Interpreting SPSS regression output: Coefficients

Coefficients(a)

                        Unstandardized Coefficients   Standardized Coefficients
Model                   B           Std. Error        Beta                        t        Sig.
1  (Constant)           4.236       7.048                                         .601     .549
   Average SAT Score    5.88E-02    .007              .588                        8.778    .000

a. Dependent Variable: Graduation Rate
Here, we will ONLY LOOK AT UNSTANDARDIZED COEFFICIENTS!

The y-intercept is 4.2% with a standard error of 7.0%. The coefficient for SAT scores is 0.059 percentage points per SAT point, with a standard error of 0.007.

The y-intercept or constant is the predicted value of the dependent variable when the independent variable takes on the value of zero. This basic model predicts that when a college admits a class of students who averaged zero on their SAT, 4.2% of them will graduate. The constant is not the most helpful statistic.

The coefficient of an independent variable is the predicted change in the dependent variable associated with a one-unit increase in the independent variable. A college whose students' SAT scores are one point higher on average is predicted to have a graduation rate that is 0.059 percentage points higher. An increase of 200 points in the average SAT score corresponds to a predicted rise of (200)(0.059) = 11.8 percentage points in the graduation rate.
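The arithmetic above can be reproduced with a short sketch. The two coefficients are taken from the Coefficients table. The function name and the SAT values of 1000 and 1200 are illustrative assumptions, not part of the original analysis.

```python
# Unstandardized coefficients copied from the Coefficients table.
INTERCEPT = 4.236   # b: (Constant), in percent
SLOPE = 5.88e-02    # m: coefficient for Average SAT Score

def predict_graduation_rate(avg_sat):
    """Apply y = mx + b to an average SAT score (hypothetical helper)."""
    return SLOPE * avg_sat + INTERCEPT

# Prediction when the independent variable is zero: this is the constant.
at_zero = predict_graduation_rate(0)

# Predicted difference between two colleges 200 SAT points apart.
# The values 1000 and 1200 are arbitrary; any 200-point gap gives the same gain.
gain_200 = predict_graduation_rate(1200) - predict_graduation_rate(1000)

print(f"Predicted rate at SAT = 0: {at_zero:.1f}%")
print(f"Predicted gain for +200 SAT points: {gain_200:.1f} percentage points")
```

Because the model is a straight line, the gain depends only on the size of the gap, not on where the two scores sit.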
Interpreting SPSS regression output: Fit of the Regression

Model Summary

Model   R         R Square   Adjusted R Square   Std. Error of the Estimate
1       .588(a)   .345       .341                12.45%

a. Predictors: (Constant), Average SAT Score
The R Square measures how closely a regression line fits the data in a scatter plot. It can range from zero (no explanatory power) to one (perfect prediction).
An R Square of 0.345 means that differences in SAT scores can explain 35% of the variation in college graduation rates.
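To make the definition concrete, here is a minimal sketch of how R Square can be computed for a one-variable regression: one minus the ratio of unexplained variation to total variation. The function names and the data are hypothetical and invented for illustration. They are not the data behind the SPSS output.

```python
def fit_line(xs, ys):
    """Least-squares slope m and intercept b for y = mx + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    b = mean_y - m * mean_x
    return m, b

def r_square(xs, ys):
    """R Square = 1 - (residual sum of squares / total sum of squares)."""
    m, b = fit_line(xs, ys)
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Hypothetical data, for illustration only.
sat = [850, 900, 1000, 1100, 1200, 1300]
grad = [40.0, 62.0, 50.0, 71.0, 60.0, 85.0]
rsq = r_square(sat, grad)

# Points lying exactly on a line give perfect prediction.
perfect = r_square([1, 2, 3, 4], [3, 5, 7, 9])

print(f"R Square for the scattered data: {rsq:.3f}")
print(f"R Square for points on a line:   {perfect:.3f}")
```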
Statistical Significance
What would the null hypothesis look like in a scatterplot?
If the independent variable has no effect on the dependent variable, the scatterplot should look random, the regression line should be flat, and its slope should be zero.
Null hypothesis: The regression coefficient for an independent variable equals zero.
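One way to see how the t and Sig. columns of the Coefficients table relate to this null hypothesis is sketched below. It makes two assumptions that are not in the slides: a conventional 0.05 significance level, and a normal approximation to the t distribution, which is reasonable only for large samples. SPSS itself uses the t distribution, so its Sig. values can differ slightly. The function names are hypothetical.

```python
import math

def two_sided_p_normal(t):
    """Approximate two-sided p-value for a t statistic.

    Uses the standard normal distribution as a large-sample
    stand-in for the t distribution (an assumption of this sketch).
    """
    return math.erfc(abs(t) / math.sqrt(2))

def reject_null(t, alpha=0.05):
    """Reject 'coefficient equals zero' when the p-value is below alpha."""
    return two_sided_p_normal(t) < alpha

# t statistics as reported in the Coefficients table.
t_constant = 0.601
t_sat = 8.778

for name, t in [("(Constant)", t_constant), ("Average SAT Score", t_sat)]:
    p = two_sided_p_normal(t)
    verdict = "reject" if reject_null(t) else "fail to reject"
    print(f"{name}: t = {t}, approx. p = {p:.3f}, {verdict} the null hypothesis")
```

Under these assumptions the SAT coefficient is statistically significant and the constant is not, which agrees with the Sig. column.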
Multivariate Regressions
A multivariate regression uses more than one independent variable (including potential confounds) to explain variation in a dependent variable.
The coefficient for each independent variable reports its effect on the dependent variable (DV), holding constant all of the other independent variables (IVs) in the regression.
Thought experiment: consider how factors such as class size, the school feeding program, and teacher credentials affect the academic performance of Junior High Schools.
Let's perform a regression analysis using ap2000 as the outcome variable and the variables acs_JH, meals and full as predictors.

(ap2000): the academic performance of the school
(acs_JH): the average class size in Junior High School
(meals): the percentage of students receiving free meals, which is an indicator of poverty
(full): the percentage of teachers who have full teaching credentials
We expect that better academic performance would be associated with lower class size, fewer students receiving free meals, and a higher percentage of teachers having full teaching credentials.
Coefficients(a)

                 Unstandardized Coefficients   Standardized Coefficients
Model            B           Std. Error        Beta                        t          Sig.
1  (Constant)    906.739     28.265                                        32.080     .000
   ACS_JH        -2.682      1.394             -.064                       -1.924     .055
   MEALS         -3.702      .154              -.808                       -24.038    .000
   FULL          .109        .091              .041                        1.197      .232

a. Dependent Variable: AP2000
Model Summary

Model   R         R Square   Adjusted R Square   Std. Error of the Estimate
1       .821(a)   .674       .671                64.153

a. Predictors: (Constant), FULL, ACS_JH, MEALS

An R Square of 0.674 means that differences in ACS_JH, MEALS and FULL can explain 67% of the variation in academic performance.
The coefficient for average class size (acs_JH, b=-2.682) is not significant at the 0.05 level (p=0.055), but it is negative, which would indicate that larger class sizes are related to lower academic performance. This is what we would expect.
Next, the effect of meals (b=-3.702, p=.000) is significant, and its coefficient is negative, indicating that the greater the proportion of students receiving free meals, the lower the academic performance.
Please note that we are not saying that free meals are causing lower academic performance. The meals variable is highly related to income level and functions more as a proxy for poverty. Thus, higher levels of poverty are associated with lower academic performance. This result also makes sense.
Finally, the percentage of teachers with full credentials (full, b=0.109, p=.232) seems to be unrelated to academic performance. This would seem to indicate that the percentage of teachers with full credentials is not an important factor in predicting academic performance. This result was somewhat unexpected.
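The idea of "holding constant all of the other IVs" can be illustrated by plugging values into the fitted equation. The coefficients below are copied from the Coefficients table. The function name and the example school (class size 30, 50% free meals, 80% fully credentialed teachers) are hypothetical values chosen only for illustration.

```python
# Unstandardized coefficients (B) from the multivariate Coefficients table.
CONSTANT = 906.739
COEFFS = {"acs_JH": -2.682, "meals": -3.702, "full": 0.109}

def predict_ap2000(acs_jh, meals, full):
    """Predicted AP2000 score from the fitted equation (hypothetical helper)."""
    return (CONSTANT
            + COEFFS["acs_JH"] * acs_jh
            + COEFFS["meals"] * meals
            + COEFFS["full"] * full)

# Two hypothetical schools that differ only in the share of free meals.
base = predict_ap2000(acs_jh=30, meals=50, full=80)
more_meals = predict_ap2000(acs_jh=30, meals=60, full=80)

# With class size and credentials held constant, the difference is
# 10 times the meals coefficient.
difference = more_meals - base
print(f"Change in predicted AP2000 for +10 points of meals: {difference:.2f}")
```

The difference is the same whatever values are chosen for class size and credentials, as long as both schools share them.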
Should we take these results and write them up for publication?
From these results, we would conclude that:

lower class sizes are related to higher performance,
fewer students receiving free meals is associated with higher performance, and
the percentage of teachers with full credentials was not related to academic performance in the schools.

Before we write this up for publication, we should do a number of checks to make sure we can firmly stand behind these results. We start by

getting more familiar with the data file,
doing preliminary data checking, and
looking for errors in the data.
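As a rough illustration of what preliminary data checking might look like, the sketch below flags values that are missing or outside a plausible range. The records, the function name and the chosen ranges (percentages between 0 and 100, positive class sizes) are all assumptions made up for this example. They are not taken from the actual data file.

```python
def find_problems(record):
    """Return a list of descriptions of suspicious values in one school record."""
    problems = []
    # meals and full are percentages, so they should lie between 0 and 100.
    for name in ("meals", "full"):
        value = record.get(name)
        if value is None:
            problems.append(f"{name}: missing")
        elif not 0 <= value <= 100:
            problems.append(f"{name}: {value} is outside 0-100")
    # An average class size must be a positive number.
    acs = record.get("acs_JH")
    if acs is None:
        problems.append("acs_JH: missing")
    elif acs <= 0:
        problems.append(f"acs_JH: {acs} is not positive")
    return problems

# Hypothetical records, invented for illustration only.
schools = [
    {"ap2000": 690, "acs_JH": 28, "meals": 45, "full": 92},
    {"ap2000": 540, "acs_JH": -21, "meals": 88, "full": 76},
    {"ap2000": 610, "acs_JH": 31, "meals": 130, "full": None},
]

for index, school in enumerate(schools):
    for problem in find_problems(school):
        print(f"school {index}: {problem}")
```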
