
SW388R7
Data Analysis & Computers II

Assumption of Homoscedasticity
Slide 1

Homoscedasticity
(aka homogeneity or uniformity of variance)

Transformations

Assumption of normality script

Practice problems
Assumption of Homoscedasticity
Slide 2

• Homoscedasticity refers to the assumption that
the dependent variable exhibits similar amounts of
variance across the range of values for an
independent variable.

• The test for homoscedasticity requires that the
independent variable be non-metric and the
dependent variable be metric (interval, or ordinal
treated as metric). When the independent variable is
metric, we will convert it to a categorical
(non-metric) variable before we conduct the test.
When the independent variable is ordinal, we will use
its categories the same way we would for a
non-metric variable.
Evaluating homoscedasticity
Slide 3

• Homoscedasticity is evaluated for pairs of variables.

• There are both graphical and statistical methods for
evaluating homoscedasticity.

• The graphical method is a boxplot.

• The statistical method is the Levene statistic, which
SPSS computes for the test of homogeneity of
variances.

• Neither of the methods is absolutely definitive.


Assumption of Homoscedasticity: The boxplot
Slide 4

Each red box shows the middle
50% of the cases for the group,
indicating how spread out the
group of scores is.

If the variance across
the groups is equal, the
height of the red boxes
will be similar across the
groups.

If the heights of the red
boxes are different, the
plot suggests that the
variance across groups
is not homogeneous.

The married group is
more spread out than
the other groups,
suggesting unequal
variance.

[Figure: boxplot of the dependent variable by MARITAL STATUS, vertical axis from -1 to 5. Group sizes: MARRIED N = 138, WIDOWED N = 20, DIVORCED N = 42, SEPARATED N = 11, NEVER MARRIED N = 56.]
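
The height of each box is the interquartile range of the group, so the visual comparison can also be made numerically. The sketch below is only an illustration: the function name `box_height` and the scores are hypothetical (not the course data), and it uses the "inclusive" quartile method of Python's `statistics` module, which may differ slightly from the quartile convention SPSS uses when it draws a boxplot.

```python
from statistics import quantiles


def box_height(values):
    """Return the interquartile range (Q3 - Q1), i.e. the height of the box."""
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1


# Hypothetical scores for two groups (assumed values, not the course dataset).
groups = {
    "married": [0, 1, 1, 2, 3, 3, 4, 4],
    "widowed": [1, 1, 1, 1, 2, 2, 2, 2],
}

# One box height per group; clearly unequal heights hint at unequal variance.
heights = {name: box_height(scores) for name, scores in groups.items()}

for name, height in heights.items():
    print(f"{name}: box height = {height}")
```

A taller box for one group than for the others is the numeric counterpart of the pattern described for the married group.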
Assumption of Homoscedasticity: Levene test of the homogeneity of variance
Slide 5
Test of Homogeneity of Variances

RS HIGHEST DEGREE
Levene Statistic   df1   df2   Sig.
5.239              4     262   .000

The null hypothesis for the test of homogeneity of
variance states that the variance of the dependent
variable is equal across groups defined by the
independent variable, i.e., the variance is homogeneous.

Since the probability associated with the Levene Statistic
(<0.001) is less than or equal to the level of significance,
we reject the null hypothesis and conclude that the
variance is not homogeneous.

To satisfy the assumption, we need a Levene Statistic
that is not statistically significant.
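
To make the statistic less of a black box, here is a minimal sketch of how a Levene statistic can be computed by hand: it is a one-way ANOVA F ratio applied to the absolute deviations of each score from its group center. The sketch assumes the classic form of the test, with deviations taken from the group mean (SciPy's `scipy.stats.levene` offers this form as `center='mean'`). The function name `levene_statistic` and the two small groups are hypothetical, not the course data.

```python
from statistics import fmean


def levene_statistic(*groups):
    """Return (W, df1, df2) for Levene's test using deviations from group means."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)

    # Step 1: absolute deviation of every score from its own group mean.
    deviations = []
    for g in groups:
        center = fmean(g)
        deviations.append([abs(y - center) for y in g])

    # Step 2: one-way ANOVA on those deviations.
    group_means = [fmean(z) for z in deviations]
    grand_mean = fmean([v for z in deviations for v in z])
    between = sum(len(z) * (m - grand_mean) ** 2
                  for z, m in zip(deviations, group_means))
    within = sum((v - m) ** 2
                 for z, m in zip(deviations, group_means) for v in z)

    df1, df2 = k - 1, n_total - k
    return (between / df1) / (within / df2), df1, df2


# Hypothetical scores: same mean, very different spread.
narrow = [4, 5, 5, 6]
wide = [1, 3, 7, 9]
w, df1, df2 = levene_statistic(narrow, wide)
print(f"W = {w:.3f}, df1 = {df1}, df2 = {df2}")
```

The degrees of freedom follow the same rule as in the SPSS table: df1 is the number of groups minus one and df2 is the number of cases minus the number of groups, so df1 = 4 and df2 = 262 correspond to 5 groups and 267 cases. The Sig. value comes from an F distribution with those degrees of freedom, which this sketch does not compute.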
Transformations
Slide 6

• When the assumption of homoscedasticity is not
supported, we can transform the dependent variable
and test the transformed variable for
homoscedasticity. If the transformed variable
demonstrates homoscedasticity, we can substitute it
in our analysis.

• We use the same three common transformations that
we used for normality: the logarithmic
transformation, the square root transformation, and
the inverse transformation.

• All of these change the measuring scale on the
horizontal axis of a histogram. Each transformed
value is computed directly from the original value,
so the transformed variable carries the same
information as the original variable on a rescaled
axis.
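
As a small illustration of the three transformations, the sketch below applies each one to the same list of scores. The function names and the scores are hypothetical, and the values are kept strictly positive so that no constant has to be added before taking the logarithm or the inverse.

```python
import math


def log_transform(x):
    """Base-10 logarithm of a positive value."""
    return math.log10(x)


def square_root_transform(x):
    """Square root of a non-negative value."""
    return math.sqrt(x)


def inverse_transform(x):
    """Reciprocal of a non-zero value."""
    return 1 / x


# Hypothetical positive scores with one large value (assumed, not course data).
original = [1, 4, 9, 100]

logged = [log_transform(x) for x in original]
rooted = [square_root_transform(x) for x in original]
inverted = [inverse_transform(x) for x in original]

print("log10:  ", logged)
print("sqrt:   ", rooted)
print("inverse:", inverted)
```

All three pull the large value closer to the rest, which is how they can reduce differences in spread between groups. The logarithm and square root keep the cases in their original order, while the inverse reverses the order, which matters when interpreting the direction of a relationship.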
When transformations do not work
Slide 7

• When none of the transformations results in
homoscedasticity for the variables in the relationship,
including that variable in the analysis will reduce our
effectiveness at identifying statistical relationships,
i.e., we lose power to detect relationships. In
addition, estimated values of the dependent variable
based on our analysis may be biased or systematically
incorrect.
Problem 1
Slide 8
Request a boxplot
Slide 9

The boxplot provides a visual
image of the distribution of the
dependent variable for the
groups defined by the
independent variable.

To request a boxplot, choose
the Boxplot… command from
the Graphs menu.
Specify the type of boxplot
Slide 10

First, click on the Simple style of boxplot to highlight
it with a rectangle around the thumbnail drawing.

Second, click on the Define button to specify the
variables to be plotted.
Specify the dependent variable
Slide 11

First, click on the dependent variable to highlight it.

Second, click on the right arrow button to move the
dependent variable to the Variable text box.
Specify the independent variable
Slide 12

First, click on the independent variable to highlight
it.

Second, click on the right arrow button to move the
independent variable to the Category Axis text box.
Complete the request for the boxplot
Slide 13

To complete the
request for the
boxplot, click on
the OK button.
The boxplot
Slide 14

Each red box shows the middle
50% of the cases for the group,
indicating how spread out the
group of scores is.

If the variance across
the groups is equal, the
height of the red boxes
will be similar across the
groups.

If the heights of the red
boxes are different, the
plot suggests that the
variance across groups
is not homogeneous.

The married group is
more spread out than
the other groups,
suggesting unequal
variance.
Request the test for homogeneity of variance
Slide 15

To compute the Levene test for
homogeneity of variance,
select the Compare Means |
One-Way ANOVA… command
from the Analyze menu.
Specify the independent variable
Slide 16

First, click on the independent variable to highlight
it.

Second, click on the right arrow button to move the
independent variable to the Factor text box.
Specify the dependent variable
Slide 17

First, click on the dependent variable to highlight it.

Second, click on the right arrow button to move the
dependent variable to the Dependent List text box.
The homogeneity of variance test is an option
Slide 18

Click on the Options…
button to open the options
dialog box.
Specify the homogeneity of variance test
Slide 19

First, mark the checkbox for the Homogeneity of
variance test. All of the other checkboxes can be
cleared.

Second, click on the Continue button to close the
options dialog box.
Complete the request for output
Slide 20

Click on the OK button to
complete the request for
the homogeneity of
variance test through the
One-Way ANOVA procedure.
Interpreting the homogeneity of variance test
Slide 21

Test of Homogeneity of Variances

RS HIGHEST DEGREE
Levene Statistic   df1   df2   Sig.
5.239              4     262   .000

The null hypothesis for the test of homogeneity of
variance states that the variance of the dependent
variable is equal across groups defined by the
independent variable, i.e., the variance is homogeneous.

Since the probability associated with the Levene Statistic
(<0.001) is less than or equal to the level of significance,
we reject the null hypothesis and conclude that the
variance is not homogeneous.
Problem 1 - Answer
Slide 22
Script for the assumption of homoscedasticity
Slide 23

First, move the variables to the
list boxes based on the role that
the variable plays in the analysis
and its level of measurement.

Second, click on the Assumption of
homogeneity option button to request that
SPSS produce the output needed to evaluate
the assumption of homoscedasticity.

Third, mark the checkboxes
for the transformations that
we want to test in evaluating
the assumption.

Fourth, click on
the OK button to
produce the output.
Script output for testing homoscedasticity
Slide 24

The script produces the
same output that we
computed manually; in this
example, the test of
homogeneity of variances.

While we do not need it to
answer this problem, the same
output is produced for each of
the transformed variables.
Problem 2
Slide 25
Computing the logarithmic transformation
Slide 26

To compute the logarithmic
transformation for the variable,
we select the Compute…
command from the Transform
menu.
Specifying the variable name and function
Slide 27

First, in the target variable text box, type the
name for the log transformation variable, “logdegre”.

Second, scroll down the list of functions to
find LG10, which calculates logarithmic
values using a base of 10. (The logarithmic
values are the power to which 10 is raised
to produce the original number.)

Third, click on the up arrow button to move the
highlighted function to the Numeric Expression
text box.
Adding the variable name to the function
Slide 28

First, scroll down the list of
variables to locate the
variable we want to transform.
Click on its name so that it is
highlighted.

Second, click on the right arrow
button. SPSS will replace the
highlighted text in the function (?)
with the name of the variable.
Preventing illegal logarithmic values
Slide 29

The log of zero is not defined mathematically. If
we have zeros for the data values of some cases,
as we do for this variable, we add a constant to all
cases so that no case will have a value of zero.

To solve this problem, we
add 1 to the degree
variable in the function.

Click on the OK
button to complete
the compute request.
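
The same guard can be illustrated outside SPSS. In the sketch below, Python's `math.log10` raises an error for zero, and adding a constant of 1 before taking the logarithm avoids it, in the spirit of the LG10 expression with 1 added to the degree variable. The function name `safe_log10` and the list of degree values are hypothetical.

```python
import math


def safe_log10(value, constant=1):
    """Base-10 log after adding a constant so that zeros become legal."""
    return math.log10(value + constant)


# The raw logarithm of zero is undefined, so Python refuses to compute it.
try:
    math.log10(0)
    zero_is_defined = True
except ValueError:
    zero_is_defined = False

# Hypothetical degree codes that include zero (assumed values).
degree = [0, 1, 2, 3, 4]
logdegre = [safe_log10(d) for d in degree]

print("log10(0) defined:", zero_is_defined)
print("logdegre:", logdegre)
```

A case with an original value of zero ends up with a transformed value of zero, because the base-10 log of 1 is 0.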
The transformed variable
Slide 30

The transformed variable that we
asked SPSS to compute is shown in the
data editor in a column to the right of the
other variables in the dataset.

Once we have the transformed
variable computed, we repeat the
“Boxplot” analysis using this variable.
The boxplot
Slide 31

In this boxplot, the spread is the same for 3 of the 5
groups, which is an improvement over the original boxplot.

However, it is difficult to judge whether or not the problem
is solved based solely on the graphic.
The homogeneity of variance test
Slide 32

Test of Homogeneity of Variances

LOGDEGRE
Levene Statistic   df1   df2   Sig.
2.151              4     262   .075

The null hypothesis for the test of homogeneity of
variance states that the variance of the transformed
dependent variable is equal across groups defined by the
independent variable, i.e., the variance is homogeneous.

Since the probability associated with the Levene Statistic
(0.075) is greater than the level of significance, we fail
to reject the null hypothesis and conclude that the
variance is homogeneous.
Problem 2 - Answer
Slide 33
Homogeneity of variance test from the script
Slide 34

The script for homoscedasticity creates the
transformed dependent variables and tests
them for homogeneity of variance.
Problem 3
Slide 35
Categorizing the interval independent variable
Slide 36

In this problem, the independent
variable, occupational prestige
score, is interval level. To conduct
the test, we will recode the variable
into four categories.

First, select the
Categorize Variables
command from the
Transform menu.
Specifications for categorizing the variable
Slide 37

First, move the
variable to be
transformed, prestg80,
to the “Create
Categories for” list box.

Second, specify the number of
categories to create. In this example,
we accept the default of 4, which will
divide the cases into quartiles using
the values of prestg80.

Third, click on the
OK button to
produce the
categories.
The categorized metric variable
Slide 38

In the data editor, we see that SPSS
has created a new variable, named by
prepending an “n” to the variable
name. The values for this variable
range from 1 to 4, one value for each
quartile.

We use this variable, nprestg8, as the
factor variable in the one-way analysis
of variance, which produces the
Levene test of homogeneity.
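
The recoding of a metric variable into four groups can be sketched as follows. This is only an approximation of the idea: the function name `categorize_into_quartiles` and the prestige scores are hypothetical, and the cut points come from Python's `statistics.quantiles` with the "inclusive" method, so ties and boundary cases may be placed differently from the SPSS Categorize Variables command.

```python
from bisect import bisect_left
from statistics import quantiles


def categorize_into_quartiles(values):
    """Assign each value a group code from 1 to 4 based on quartile cut points."""
    cuts = quantiles(values, n=4, method="inclusive")  # Q1, median, Q3
    # bisect_left counts how many cut points lie strictly below the value,
    # so values at or below Q1 get code 1 and values above Q3 get code 4.
    return [bisect_left(cuts, v) + 1 for v in values]


# Hypothetical occupational prestige scores (assumed values, not course data).
prestige = [17, 22, 29, 32, 36, 40, 44, 46, 51, 60, 66, 75]
prestige_group = categorize_into_quartiles(prestige)

for score, group in zip(prestige, prestige_group):
    print(score, "->", group)
```

The resulting group codes play the role of the factor variable in the test of homogeneity of variance.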
Homogeneity of variance test
Slide 39

Using this variable, nprestg8, as the
factor variable in the one-way analysis
of variance, we find the output for the
Test of Homogeneity of Variances has
a probability less than the level of
significance specified for the problem,
so we reject the null hypothesis of
homogeneous variance.
Problem 3 - Answer
Slide 40
The script with a metric independent variable
Slide 41

When we test the assumption of
homoscedasticity with a metric
independent variable, we must be
careful to put the interval level
variable in the list box for metric
independent variables. The script will
convert variables in this list into
quartiles when it does the test for
homogeneity of variance.
Other problems on homoscedasticity assumption
Slide 42

• A problem may ask about the assumption of
homoscedasticity for a non-metric dependent variable.
The answer will be “An inappropriate application of a
statistic” since variance is not computed for a
non-metric variable.

• A problem may ask about the assumption of
homoscedasticity for an ordinal level dependent
variable. If the variable or transformed variable
satisfies the assumption of homogeneity of variance,
the correct answer to the question is “True with
caution” since we may be required to defend treating
ordinal variables as metric.
Steps in answering questions about the assumption of homoscedasticity – question 1
Slide 43

Question: variance in dependent variable is homoscedastic?

1. Is the dependent variable metric?
   No: Incorrect application of a statistic.
   Yes: go to step 2.

2. Is the independent variable non-metric?
   No: recode the independent variable, then go to step 3.
   Yes: go to step 3.

3. Does the Levene statistic support the assumption of
   homoscedasticity?
   No: False.
   Yes: go to step 4.

4. Is the dependent variable ordinal level?
   No: True.
   Yes: True with caution.
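
The decision steps for this question can be written as a short function. This is a sketch under stated assumptions: the function name `answer_question_1`, the level labels, and the example values (including the 0.01 level of significance) are hypothetical, and the Sig. value passed in is assumed to come from a Levene test run on a non-metric or already recoded independent variable.

```python
def answer_question_1(dv_level, levene_sig, alpha):
    """Answer 'variance in the dependent variable is homoscedastic?'.

    dv_level   -- 'nominal', 'ordinal', or 'interval' (assumed labels)
    levene_sig -- Sig. value reported for the Levene statistic
    alpha      -- level of significance specified for the problem
    """
    # Variance is not computed for a non-metric dependent variable.
    if dv_level == "nominal":
        return "Incorrect application of a statistic"

    # A significant Levene statistic rejects the hypothesis of equal variances.
    if levene_sig <= alpha:
        return "False"

    # Assumption supported; ordinal dependent variables call for caution.
    return "True with caution" if dv_level == "ordinal" else "True"


# Hypothetical examples with an assumed alpha of 0.01.
examples = [
    answer_question_1("ordinal", 0.0004, 0.01),
    answer_question_1("ordinal", 0.075, 0.01),
    answer_question_1("interval", 0.075, 0.01),
    answer_question_1("nominal", 0.075, 0.01),
]
for result in examples:
    print(result)
```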


Steps in answering questions about the assumption of homoscedasticity – question 2
Slide 44

Question: variance in dependent variable is NOT
homoscedastic, but transformation is?

1. Is the dependent variable metric, and is the
   independent variable non-metric or categorized?
   No: Incorrect application of a statistic.
   Yes: go to step 2.

2. Does the Levene statistic support the assumption of
   homoscedasticity for the original variable?
   Yes: False (assumption satisfied by original variable).
   No: go to step 3.

3. Does the Levene statistic support the assumption of
   homoscedasticity for the transformed variable?
   No: False.
   Yes: go to step 4.

4. Is the dependent variable ordinal level?
   No: True.
   Yes: True with caution.
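
The second question adds one more check, on the transformed variable. The sketch below follows the same steps. The function name `answer_question_2`, the level labels, and the example values (including the 0.01 level of significance) are hypothetical, and both Sig. values are assumed to come from Levene tests run on a non-metric or categorized independent variable.

```python
def answer_question_2(dv_level, sig_original, sig_transformed, alpha):
    """Answer 'original variable is NOT homoscedastic, but the transformation is?'.

    dv_level        -- 'nominal', 'ordinal', or 'interval' (assumed labels)
    sig_original    -- Levene Sig. value for the original dependent variable
    sig_transformed -- Levene Sig. value for the transformed dependent variable
    alpha           -- level of significance specified for the problem
    """
    # Variance is not computed for a non-metric dependent variable.
    if dv_level == "nominal":
        return "Incorrect application of a statistic"

    # The original variable already satisfies the assumption.
    if sig_original > alpha:
        return "False"

    # The transformation does not fix the problem.
    if sig_transformed <= alpha:
        return "False"

    # The transformation works; ordinal dependent variables call for caution.
    return "True with caution" if dv_level == "ordinal" else "True"


# Hypothetical example: original significant, transformed not, alpha assumed 0.01.
result = answer_question_2("ordinal", 0.0004, 0.075, 0.01)
print(result)
```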
