Correlational Study

Educational Research
Chapter 7
Correlational Research
Gay, Mills, and Airasian
Topics to Be Discussed
- Definition, purpose, and limitation of correlational research
- Correlation coefficients and their significance
- Process of conducting correlational research
- Relationship studies
- Prediction studies
Correlational Research
- Definition
  - Whether and to what degree variables are related
- Purpose
  - Determine relationships
  - Make predictions
- Limitation
  - Cannot indicate cause and effect
Objectives 1.1, 1.2, & 1.3
The Process
- Problem selection
  - Variables to be correlated are selected on the basis of some rationale
    - Math attitudes and math achievement
    - Teachers' sense of efficacy and their effectiveness
  - Increases the ability to meaningfully interpret results
  - Inefficiency and difficulty interpreting the results from a shotgun approach
Objective 2.1
The Process
- Participant and instrument selection
  - Minimum of 30 subjects
  - Instruments must be valid and reliable
    - Higher validity and reliability require smaller samples
    - Lower validity and reliability require larger samples
- Design and procedures
  - Collect data on two or more variables for each subject
- Data analysis
  - Compute the appropriate correlation coefficient
Objectives 2.2 & 2.3
Correlation Coefficients
- A correlation coefficient identifies the size and direction of a relationship
  - Size/magnitude
    - Ranges from 0.00 to 1.00
  - Direction
    - Positive or negative
Objectives 3.1, 3.2, & 3.3
- Interpreting the size of correlations
  - General rule
    - Less than .35 is a low correlation
    - Between .36 and .65 is a moderate correlation
    - Above .66 is a high correlation
  - Predictions
    - Between .60 and .70 is adequate for group predictions
    - Above .80 is adequate for individual predictions
Objective 3.5
- Interpreting the size of correlations (cont.)
  - Criterion-related validity
    - Above .60 for affective scales is adequate
    - Above .80 for tests is minimally acceptable
  - Inter-rater reliability
    - Above .90 is very good
    - Between .80 and .89 is acceptable
    - Between .70 and .79 is minimally acceptable
    - Lower than .69 is problematic
Objective 3.5
- Interpreting the direction of correlations
  - Direction
    - Positive
      - High scores on the predictor are associated with high scores on the criterion
      - Low scores on the predictor are associated with low scores on the criterion
    - Negative
      - High scores on the predictor are associated with low scores on the criterion
      - Low scores on the predictor are associated with high scores on the criterion
  - Positive or negative does not mean good or bad
Objective 3.3
- Interpreting the size and direction of correlations using the general rule
  - +.95 is a strong positive correlation
  - +.50 is a moderate positive correlation
  - +.20 is a low positive correlation
  - -.26 is a low negative correlation
  - -.49 is a moderate negative correlation
  - -.95 is a strong negative correlation
- Which of the correlations above is the strongest, the first or last?
Objectives 3.3 & 3.5
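The general rule can be sketched as a small function. This is only an illustration: the function name `describe_correlation` is hypothetical, the labels follow the general rule (so +.95 comes out as "high" rather than "strong"), and the handling of values that fall exactly between the stated bands (.35 and between .65 and .66) is an assumption, since the rule leaves those gaps open.

```python
def describe_correlation(r):
    """Label a correlation coefficient by size and direction (general rule)."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie between -1.00 and +1.00")
    size = abs(r)  # size ignores the sign
    if size < 0.35:
        label = "low"
    elif size <= 0.65:  # assumption: boundary values are treated as moderate
        label = "moderate"
    else:
        label = "high"
    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "no direction"
    return f"{label} {direction}"


for r in (0.95, 0.50, 0.20, -0.26, -0.49, -0.95):
    print(f"{r:+.2f}: {describe_correlation(r)}")
```

Because size is judged on the absolute value, +.95 and -.95 receive the same size label; they differ only in direction.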
Scatterplots
- Graphical presentations of correlations
  - Example of predicting from an attitude scale (EX 1) to an achievement test (EX 2)
  - Predictor variable (EX 1) is on the horizontal axis
  - Criterion variable (EX 2) is on the vertical axis
Objective 3.4
An Example of a Scatterplot
[Figure: scatterplot of ex2 (vertical axis) against ex1 (horizontal axis) with a linear regression line, ex2 = 11.23 + 0.72 * ex1, R-Square = 0.66]
Objective 3.4
- Common variance
  - Definition
    - The extent to which variables vary in a systematic manner
    - Interpreted as the percentage of variance in the criterion variable explained by the predictor variable
  - Computation
    - The squared correlation coefficient, r²
  - Examples
    - If r = .50 then r² = .25
      - 25% of the variance in the criterion can be explained by the predictor
    - If r = .70 then r² = .49
      - 49% of the variance in the criterion can be explained by the predictor
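The two examples can be reproduced with a one-line computation. A minimal sketch; the function name `common_variance` is hypothetical.

```python
def common_variance(r):
    """Return the proportion of variance two variables share (r squared)."""
    return r ** 2


for r in (0.50, 0.70):
    shared = common_variance(r)
    print(f"r = {r:.2f}: r squared = {shared:.2f} ({shared:.0%} of the variance)")
```

Squaring removes the sign, so a negative correlation of the same size accounts for the same share of variance.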
Statistical Significance
- Statistical significance
  - Is the observed coefficient different from 0.00?
    - Does the correlation represent a true relationship?
    - Is the correlation only the result of chance?
- Determining statistical significance
  - Consult a table of the critical values of r
    - See Table A.2 in Appendix A
  - Three common levels of significance
    - .01 (1 chance out of 100)
    - .05 (5 chances out of 100)
    - .10 (10 chances out of 100)
Statistical Significance
- Sample size and statistical significance
  - Small samples require higher correlations for significance
  - Large samples require lower correlations for significance
- Practical significance and statistical significance
  - Small correlation coefficients can be statistically significant even though they have little practical significance
    - +.20
      - Statistically significant at the .05 level if the sample is about 100
      - Little or no practical significance because it is very low and predicts only .04 of the variation in the criterion scores
    - -.30
      - Statistically significant at the .05 level if the sample is about 40
      - Little or no practical significance because it is low and predicts only .09 of the variation in the criterion scores
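The dependence of significance on sample size can be illustrated numerically. The sketch below does not use the table of critical values mentioned in the slides; it substitutes Fisher's z transformation, a large-sample normal approximation, so its p-values are approximate and may differ slightly from a table lookup. The function name `approx_p_value` is hypothetical.

```python
import math
from statistics import NormalDist


def approx_p_value(r, n):
    """Approximate two-tailed p-value for testing whether r differs from 0.

    Uses Fisher's z transformation (a normal approximation), not an exact test.
    """
    if n <= 3 or not -1.0 < r < 1.0:
        raise ValueError("need n > 3 and -1 < r < 1")
    z = math.atanh(r) * math.sqrt(n - 3)  # roughly standard normal if the true correlation is 0
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


# The same coefficient, +.20, in a small and a large sample
for n in (30, 100):
    p = approx_p_value(0.20, n)
    print(f"r = +.20, n = {n}: p is about {p:.3f}; variance explained = {0.20 ** 2:.2f}")
```

Under this approximation +.20 falls below the .05 level with 100 subjects but not with 30, while the share of variance explained stays at .04 in both cases.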
Relationship Studies
- General purpose
  - Gain insight into variables that are related to other variables relevant to educators
    - Achievement
    - Self-esteem
    - Self-concept
- Two specific purposes
  - Suggest subsequent interest in establishing cause and effect between variables found to be related
  - Control for variables related to the dependent variable in experimental studies
Conducting Relationship Studies

- Identify a set of variables
  - Limit to those variables logically related to the criterion
  - Avoid the shotgun approach
    - Possibility of erroneous relationships
    - Issues related to determining statistical significance
- Identify a population and select a sample
- Identify appropriate instruments for measuring each variable
- Collect data for each instrument from each subject
- Compute the appropriate correlation coefficient
Objective 6.1
Types of Correlation Coefficients

- The type of correlation coefficient depends on the measurement level of the variables
  - Pearson r: continuous predictor and criterion variables
    - Math attitude and math achievement
  - Spearman rho: ranked or ordinal predictor and criterion variables
    - Rank in class and rank on a final exam
  - Phi coefficient: dichotomous predictor and criterion variables
    - Gender and pass/fail status on a high stakes test
- See Table 7.2
Objectives 7.1, 7.2, & 7.3
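The three coefficients can be sketched in plain Python. This is an illustration under stated assumptions: the function names are hypothetical, Spearman rho is computed as Pearson r on ranks (ties share the average rank), and all sample data are made up for the demonstration.

```python
import math


def pearson_r(x, y):
    """Pearson r for two equal-length lists of continuous scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """Rank values from 1 (smallest); tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def spearman_rho(x, y):
    """Spearman rho: Pearson r computed on the ranks of the scores."""
    return pearson_r(ranks(x), ranks(y))


def phi_coefficient(a, b, c, d):
    """Phi for a 2x2 table of counts [[a, b], [c, d]]."""
    return (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))


# Hypothetical data, for illustration only
class_rank = [1, 2, 3, 4, 5]
exam_rank = [2, 1, 3, 5, 4]
print(round(spearman_rho(class_rank, exam_rank), 2))
print(round(phi_coefficient(20, 10, 10, 20), 2))
```

Phi is numerically the same as Pearson r computed on two variables coded 0 and 1, which is why it fits the dichotomous case.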
Linear and Curvilinear Relationships

- Linear relationships
  - Plots of the scores on two variables are best described by a straight line
    - Math scores and science scores
    - Teacher efficacy and teacher effectiveness
- Curvilinear relationships
  - Plots of scores on two variables are best described by curved (nonlinear) functions
    - Age and athletic ability
    - Anxiety and achievement
  - Estimated by the eta correlation
Objectives 8.1, 8.2, & 8.3
An Example of a Linear Relationship

[Figure: scatterplot of fp (vertical axis) against ex1 (horizontal axis) with a linear regression line, fp = 0.39 + 0.01 * ex1, R-Square = 0.80]
Objective 8.4
An Example of a Curvilinear Relationship

[Figure: scatterplot of score (vertical axis, 0 to 100) against study (horizontal axis, 0 to 10) with an LLR smoother]
Objective 8.4
Factors that Influence Correlations

- Sample size
  - The larger the sample, the higher the likelihood of a high correlation
- Analysis of subgroups
  - If the total sample consists of males and females, each gender represents a subgroup
  - Results across subgroups can be different because they are being obscured by the analysis of the data for the total sample
  - Reduces the size of the sample
  - Potentially reduces variation in the scores
Objective 9.1
Factors that Influence Correlations

- Variation
  - The greater the variation in scores, the higher the likelihood of a strong correlation
  - The lower the variation in scores, the higher the likelihood of a weak correlation
- Attenuation
  - Correlation coefficients are lower when the instruments being used have low reliability
  - A correction for attenuation is available
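The slides do not give the correction itself. A commonly used form is Spearman's correction, which divides the observed correlation by the square root of the product of the two reliabilities. The sketch below assumes that form; the function name and the sample values are hypothetical.

```python
import math


def correct_for_attenuation(r_xy, reliability_x, reliability_y):
    """Estimate the correlation between true scores from an observed r.

    Assumes Spearman's correction: r_xy / sqrt(reliability_x * reliability_y).
    """
    if not (0 < reliability_x <= 1 and 0 < reliability_y <= 1):
        raise ValueError("reliabilities must be greater than 0 and at most 1")
    return r_xy / math.sqrt(reliability_x * reliability_y)


# Hypothetical values: observed r = .40, reliabilities of .80 and .50
print(round(correct_for_attenuation(0.40, 0.80, 0.50), 3))
```

With perfectly reliable instruments (both reliabilities equal to 1) the corrected value equals the observed one; the lower the reliabilities, the larger the upward adjustment.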
Prediction Studies
- Attempts to describe the predictive relationships between or among variables
  - The predictor variable is the variable from which the researcher is predicting
  - The criterion variable is the variable to which the researcher is predicting
Prediction Studies
- Three purposes
  - Facilitates decisions about individuals to help a selection decision
  - Tests variables believed to be good predictors of a criterion
  - Determines the predictive validity of an instrument
Objective 11.1
Prediction Studies
- Single and multiple predictors
  - Linear regression: one predictor and one criterion
    - Y = a + bX
    - r²
  - Multiple regression: more than one predictor and one criterion
    - Y = a + b1X1 + b2X2 + ... + biXi
    - R² or the coefficient of determination
Objective 11.4
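For the single-predictor case, the constants a and b in Y = a + bX can be estimated by least squares. A minimal sketch; the function names and the scores are hypothetical and serve only to show the arithmetic.

```python
def fit_line(x, y):
    """Return (a, b) for the least-squares line Y = a + bX."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
    a = my - b * mx
    return a, b


def r_squared(x, y, a, b):
    """Proportion of variance in Y explained by the predictions a + bX."""
    my = sum(y) / len(y)
    ss_residual = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_total = sum((yi - my) ** 2 for yi in y)
    return 1 - ss_residual / ss_total


# Hypothetical predictor and criterion scores, for illustration only
attitude = [1, 2, 3, 4, 5]
achievement = [2, 4, 5, 4, 5]

a, b = fit_line(attitude, achievement)
print(f"Y = {a:.2f} + {b:.2f}X, r squared = {r_squared(attitude, achievement, a, b):.2f}")
```

Multiple regression follows the same idea with one slope per predictor, but solving for several slopes at once needs matrix methods that this sketch leaves out.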
Conducting a Prediction Study

- Identify a set of variables
  - Limit to those variables logically related to the criterion
- Identify a population and select a sample
- Identify appropriate instruments for measuring each variable
  - Ensure appropriate levels of validity and reliability
- Collect data for each instrument from each subject
  - Typically data are collected at different points in time
- Compute the results
  - The multiple regression coefficient
  - The multiple regression equation (i.e., the prediction equation)

- Issues of concern
  - Shrinkage: the tendency of a prediction equation to become less accurate when used with a group other than the one on which the equation was originally developed
  - Cross validation: validation of a prediction equation with another group of subjects to identify problematic variables
Objective 11.3
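Shrinkage and cross validation can be illustrated by developing a prediction equation on one group and applying it unchanged to a second group. The sketch below is an assumption-laden toy: both groups, the function names, and the use of explained variance as the accuracy measure are hypothetical choices made for the demonstration.

```python
def develop_equation(x, y):
    """Fit Y = a + bX by least squares on the development group."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
    return my - b * mx, b


def variance_explained(x, y, a, b):
    """Share of the variance in y accounted for by the fixed equation a + bX."""
    my = sum(y) / len(y)
    ss_residual = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_total = sum((yi - my) ** 2 for yi in y)
    return 1 - ss_residual / ss_total


# Hypothetical groups, for illustration only
original_x, original_y = [1, 2, 3, 4, 5], [2, 4, 5, 4, 5]
new_x, new_y = [1, 2, 3, 4, 5], [3, 3, 6, 4, 4]

a, b = develop_equation(original_x, original_y)
in_original = variance_explained(original_x, original_y, a, b)
in_new = variance_explained(new_x, new_y, a, b)  # same equation, different group
print(f"original group: {in_original:.2f}, new group: {in_new:.2f}")
```

The drop in accuracy from the original group to the new group is the shrinkage that cross validation is meant to reveal.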

- Issues of concern (cont.)
  - Errors of measurement (e.g., low validity or reliability) diminish the accuracy of the prediction
  - Intervening variables can influence the predictive process if there is too much time between collecting the predictor and criterion variables
  - Criterion variables defined in general terms (e.g., teacher effectiveness, success in school) tend to have lower prediction accuracy than those defined very narrowly (e.g., overall GPA, test scores)
Objective 11.5
Differences between Types of Studies

- Correlational research is a general category that is usually discussed in terms of two variables
- Relationship studies develop insight into the relationships between several variables
  - The measurement of all variables occurs at about the same time
- Predictive studies involve the predictive relationships between or among variables
  - The predictor variables are collected long before the criterion variable
Other Correlation Analyses

- Path analysis
  - Investigates the patterns of relationships among a number of variables
  - Results in a diagram that indicates the specific manner by which variables are related (i.e., paths) and the strength of those relationships
  - An extension of this analysis is structural equation modeling (SEM)
    - Clarifies the direct and indirect relationships among variables based on underlying theoretical constructs
    - More precise than path analysis
    - Often known as LISREL for the first computer program used to conduct this analysis
Objective 13.1

- Discriminant function analysis
  - Similar to multiple regression except that the criterion variable is categorical
  - Typically used to predict group membership
    - High or low anxiety
    - Achievers or non-achievers
Objective 13.2

- Canonical correlation
  - An extension of multiple regression in which more than one predictor variable and more than one criterion variable are used
- Factor analysis
  - A correlational analysis used to take a large number of variables and group them into a smaller number of clusters of similar variables called factors
A Checklist of Questions
- Was the correct correlation coefficient used?
- Are the validity and reliability of the instruments acceptable?
- Is there a restricted range of scores?
- How large is the sample?
Statistical Assessment of Relationships

Data: Are the data quantitative or nominal?
- Quantitative: Do you have more than two predictor variables?
  - No: Correlation Analysis: r
  - Yes: Regression Analysis: R
- Nominal: Do you have more than two predictor variables?
  - No: Chi-Square Analysis: χ²
  - Yes: Log-Linear Analysis or Logistic Regression
The Correlation Coefficient
for Association among Quantitative Variables

Scatterplot
A graph in which the x axis indicates the scores on the predictor variable and the y axis represents the scores on the outcome variable. A point is plotted for each individual at the intersection of their scores.

Regression Line
A line in which the squared distances of the points from the line are minimized (least squares method).

[Figure: scatterplot of College GPA (y axis, 1.0 to 4.0) against High School GPA (x axis, 1.0 to 4.0) with a regression line]
Linear Relationships and Nonlinear Relationships

[Figure: five scatterplot panels of Y against X, labeled Positive Linear, Negative Linear, Curvilinear, Curvilinear, and Independent]
The Pearson Correlation Coefficient
Calculation

Participant   Esteem 1   Esteem 2
1             4          4
2             4          3
3             3          2
4             2          2
5             2          1
Mean          3          2.4

z score for participant 1 on Esteem 1: (4 - 3) / S_esteem1
S_esteem1 = 0.8, S_esteem2 = 1.04
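From z scores:

$$ r = \frac{\sum Z_X Z_Y}{N - 1} $$

From deviation scores, with X = Esteem 1 and Y = Esteem 2:

$$ r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\left[\sum (X - \bar{X})^2\right]\left[\sum (Y - \bar{Y})^2\right]}} = \frac{(4-3)(4-2.4) + \dots}{\sqrt{\left[(4-3)^2 + \dots\right]\left[(4-2.4)^2 + \dots\right]}} $$

Task 1: compute r

$$ r = \frac{\sum XY - \dfrac{(\sum X)(\sum Y)}{N}}{\sqrt{\left(\sum X^2 - \dfrac{(\sum X)^2}{N}\right)\left(\sum Y^2 - \dfrac{(\sum Y)^2}{N}\right)}} $$

where

$$ \sum X = 4+4+3+2+2, \quad \sum Y = 4+3+2+2+1, \quad N = 5 $$

$$ \sum XY = 4 \times 4 + 4 \times 3 + \dots, \quad \sum X^2 = 4 \times 4 + 4 \times 4 + 3 \times 3 + \dots, \quad \sum Y^2 = 4 \times 4 + 3 \times 3 + 2 \times 2 + \dots $$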
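Task 1 can be checked with a few lines of Python. The sketch uses the sums-of-products form of the Pearson formula (sum of XY, sum of X, sum of Y, and the sums of squares); the score lists are read off the sums shown on the slide (4+4+3+2+2 for Esteem 1 and 4+3+2+2+1 for Esteem 2), and the variable names are hypothetical.

```python
import math

esteem1 = [4, 4, 3, 2, 2]  # X
esteem2 = [4, 3, 2, 2, 1]  # Y
n = len(esteem1)

sum_x, sum_y = sum(esteem1), sum(esteem2)
sum_xy = sum(x * y for x, y in zip(esteem1, esteem2))
sum_x2 = sum(x * x for x in esteem1)
sum_y2 = sum(y * y for y in esteem2)

# Numerator: sum of cross products adjusted for the means
numerator = sum_xy - sum_x * sum_y / n
# Denominator: square root of the product of the two adjusted sums of squares
denominator = math.sqrt((sum_x2 - sum_x ** 2 / n) * (sum_y2 - sum_y ** 2 / n))

r = numerator / denominator
print(round(r, 3))  # 0.877
```

The numerator works out to 4 and the two sums of squares to 4 and 5.2, so r is 4 divided by the square root of 20.8.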
Interpretation of r
-1 ≤ r ≤ 1
If the relationship between X and Y is positive: 0 < r ≤ 1
If the relationship between X and Y is negative: -1 ≤ r < 0
If the p-value associated with r is < .05:
The variables X and Y are significantly correlated with each other.
Positively: 0 < r < 1; negatively: -1 < r < 0
If the p-value associated with r is > .05:
There is no significant correlation between X and Y.
Reporting Correlations
r(number of participants) = correlation coefficient, p < p value.
As predicted by the research hypothesis, the variables of optimism and reported health behavior were (significantly) positively correlated in the sample (the data), r(20) = .52, p < .01.
Limitations
1. Cases in which X and Y have a curvilinear relationship (r = 0)
2. Cases in which the range of the variables is restricted (restriction of range). Example: SAT scores and college GPA
3. Cases in which the data have outliers (|r| > .99)
Limitation (visual)
[Figure: three scatterplot panels, labeled Curvilinear, Small Range, and Outlier]
The Chi-square Statistic
for Association among nominal variables

                        Yes          No           Total
Northerner  observed    30 (.15)     70 (.35)     100 (.50)
            expected    45 (.225)    55 (.275)
Southerner  observed    60 (.30)     40 (.20)     100 (.50)
            expected    45 (.225)    55 (.275)
Total                   90 (.45)     110 (.55)    200 (1.00)

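Expected frequency:

$$ f_e = \frac{\text{Row marginal} \times \text{Column marginal}}{N} $$

Task 2: compute χ²

$$ \chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} = \frac{(30-45)^2}{45} + \frac{(70-55)^2}{55} + \frac{(60-45)^2}{45} + \frac{(40-55)^2}{55} $$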
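Task 2 can be worked through in code. The sketch takes the observed counts from the Northerner/Southerner table, derives each expected frequency as row marginal times column marginal divided by N (45 for each "Yes" cell and 55 for each "No" cell), and sums the squared differences divided by the expected frequency. Variable names are hypothetical.

```python
observed = [
    [30, 70],  # Northerner: Yes, No
    [60, 40],  # Southerner: Yes, No
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(observed):
    for j, f_o in enumerate(row):
        f_e = row_totals[i] * col_totals[j] / n  # expected frequency for this cell
        chi_square += (f_o - f_e) ** 2 / f_e

# Degrees of freedom: (rows - 1) x (columns - 1)
df = (len(observed) - 1) * (len(observed[0]) - 1)
print(round(chi_square, 2), df)  # 18.18 1
```

The resulting value is then compared with the tabled critical value for the chosen alpha and df.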
Interpretation of χ²
Go to Table E in Appendix E.
Degrees of freedom (df): (levels of variable 1 - 1) × (levels of variable 2 - 1)
Number of participants (N)
See the value at the intersection between alpha p < .05 and df.
If χ² is greater than the value in Table E, the contingency table differs significantly from the expectation.
If χ² is less than the value in Table E, the contingency table does not differ significantly from the expectation.
Reporting Chi-Square Statistic

χ²(degrees of freedom (df), N = number of participants) = chi-square value, p < p value

As predicted by the research hypothesis, the southerners were more likely to approve of a policeman striking an adult male citizen who was being questioned as a suspect in a murder case, χ²(1, N = 30) = 34.23, p < .01.
