
COMPUTER-AIDED DATA

ANALYSIS FOR RESEARCH


Gabino P. Petilos, Ph.D.
Education Supervisor II
Commission on Higher Education
Regional Office VIII
ROLE OF STATISTICS IN RESEARCH

Flow of the research process: PROBLEM → Identification of Variables (Variable A, Variable B, Variable C, ... etc.) → Measurement of Variables (direct or indirect) → Collection of Data → Analysis of Data → Interpretation of Data


Variable
A characteristic or attribute of persons or
objects that can assume different values

Constant
A characteristic or attribute of persons
that does not change
Illustration 1.

Population: Public School Teachers in Region 8

• Gender
• Age
• Length of Service
• Marital Status
• Salary
• Educational Attainment
• Work Performance
• Religion
• Level of Motivation
• Attitude towards Teaching
• Tenure
Illustration 2.

Population: Senior High School Students currently enrolled in Region 8
• Gender
• Age
• Attitude towards Schooling
• Level of Interest in Math
• Math Ability
• Reading Ability
• Type of High School Enrolled in
• Curriculum (Strand)
Variables may be:

 Quantitative – a variable that


assumes numerical values

 Qualitative – a variable that cannot


assume numerical values (only
attributes)
Quantitative Variables

 Age
 Length of Service
 Salary
 Work Performance (scores)
 Level of Motivation (scores)
 Attitude towards Teaching
(scores)
Qualitative Variables
• Gender
• Marital Status
• Educational Attainment
• Religion
• Tenure
Classification of Variables
 According to FUNCTIONAL RELATIONSHIP
Independent Variable (or variate) –
variable that explains the variation of
another variable;

Dependent Variable (response or


criterion variable) – variable that is
influenced by another variable
Temporal Sequence

The independent variable X precedes the dependent variable Y:  X → Y
Classification of Variables

 According to CONTINUITY OF SCALE


Continuous Variable – variable that can assume an
unlimited number of intermediate
values within a specified range of
values.

Examples: Height, Weight, Age, Teaching Experience


Attitude towards Teaching

Discrete Variable – variable that can take on only


designated values (finite or
countable)

Examples: Number of Children in a Family,


Number of service vehicles of government
agencies
Classification of Variables

 According to SCALE OF MEASUREMENT


Measurement – the assignment of numbers to the categories of a variable according to rules (may be an arbitrary rule or a standard rule)
Illustrations:
Variable: Sex
Categories: Male and Female
Measurement (Arbitrary): Assign 1 if sex is male
Assign 0 if sex is female
Variable: Age
Categories: Years
Measurement (standard): Use the number of years as
measure of age

Variable: Attitude towards Teaching


Categories: Very Positive to Very Negative
Measurement (Arbitrary): Use a 5 point scale
SCALES OF MEASUREMENT
 NOMINAL SCALE
 ORDINAL SCALE
 INTERVAL SCALE
 RATIO SCALE
NOMINAL Scale (KEY WORD: LABEL)

• Establishes equivalence or difference


between the attributes of the objects or
respondents

• In this scale, numbers are used as mere


labels of the categories of the variable;

• The numbers cannot be meaningfully


ordered
NOMINAL Scale (KEY WORD: LABEL)

Examples: Sex: 1 – male; 0 – female


Here, 1 & 0 serve only as labels of the
categories of sex.

Religion: 1 – Roman Catholic


2 – Protestant
3 – INC
4 – Others

Marital Status:
1 – single; 2 – married; 3 – others
ORDINAL Scale (KEY WORD: RANK)

• This scale possesses all characteristics of


the nominal scale, i.e., numbers are
used as labels

• The numbers can be meaningfully


ordered

• But difference between successive


categories may not be equal
ORDINAL Scale (KEY WORD: RANK)

Examples: SES: 1 – Low; 2 – Average; 3 - High

Educational Attainment:
3 – Doctorate Degree Holder
2 – Master’s Degree Holder
1 – Bachelor’s Degree Holder

Salary Grade: 1, 2, 3, ... , 30


INTERVAL Scale (KEY WORDS: EQUAL INTERVAL)

• This scale possesses all characteristics


of the ordinal scale, i.e., numbers are
used as labels and they can be
meaningfully ranked.

• The differences between successive


categories can be assumed equal

• But there is no true zero point of the


scale
INTERVAL Scale (KEY WORDS: EQUAL INTERVAL)

Examples: IQ Score

Temperature in Degree Celsius

Achievement Score in Standardized


Tests
RATIO Scale (KEY WORDS: TRUE ZERO POINT)

• This scale possesses all characteristics


of the interval scale, i.e., numbers are
used as labels, they can be
meaningfully ranked, and differences
between successive categories are
equal.

• The scale has a true zero point (the


point that indicates the complete
absence of the characteristic being
measured.)
RATIO Scale (KEY WORDS: TRUE ZERO POINT)

Examples: Age
Height
Weight
Work Experience
Number of children in a family
Number of times absent in a year
Relationship of the concepts scale, variable, data:

Nominal Scale: 1 – male; 0 – female
Nominal Variable: Sex
Nominal Data: 20 males and 30 females
Relationship of the concepts scale, variable, data:

Ordinal Scale: 3 – High; 2 – Average; 1 – Low
Ordinal Variable: SES
Ordinal Data: High SES = 10; Average SES = 20; Low SES = 30

Ordinal Scale: 9, 10, 11, 12
Ordinal Variable: Salary Grade
Ordinal Data: 9, 10, 11, 12
Relationship of the concepts scale, variable, data:

Interval Scale: 100, 105, 102, 110, 115
Interval Variable: IQ Scores
Interval Data: 100, 105, 102, 110, 115

Interval Scale: 36, 37, 36, 35, 38
Interval Variable: Temperature in Degrees Celsius
Interval Data: 36, 37, 36, 35, 38
Relationship of the concepts scale, variable, data:

Ratio Scale: 110, 115, 150, 160
Ratio Variable: Height (in cm)
Ratio Data: 110, 115, 150, 160

Ratio Scale: 49, 50, 50, 53, 55
Ratio Variable: Weight (in kg)
Ratio Data: 49, 50, 50, 53, 55
Remark: It is possible to downgrade data from
higher level of measurement to lower
level but NOT THE OTHER WAY
AROUND

Interval/Ratio Data: scores on a standardized test – 60, 85, 90
Ordinal Data: ranked scores – 1, 2, 3
Nominal Data: scores categorized as Pass–Fail – Passed = 2, Failed = 1
Remark: However, downgrading of data results
in loss of information and we will be
constrained to use a less powerful
statistical test.

Power of a statistical test is the probability of correctly rejecting a false null hypothesis.
INDEPENDENT SAMPLES
DISTINCT POPULATIONS

[Diagram] Three distinct populations – P1 (Parents), P2 (Teachers), P3 (Students) – each yielding its own sample: S1 (parents), S2 (teachers), S3 (students).

Independent or uncorrelated samples


DEPENDENT SAMPLES
(Repeated Measures Design)

Pretest Scores Intervention Posttest Scores

Dependent or correlated samples


DEPENDENT SAMPLES
(Matched Groups Design)

Population of
Paired Subjects

Sample of
Paired Subjects

Sample 1 Sample 2

Dependent or correlated samples


Probability – a measure of the likelihood of occurrence of events, which ranges in value from 0 (if an event cannot occur) to 1 (if the event is sure to occur).

The concept of probability is important in inferential statistics since we often attach a measure of reliability to the inference we make from the sample to the population.
The p-value (probability value) associated with a test statistic is used as the basis for deciding whether the null hypothesis is rejected or not.

P-value or attained level of significance is


defined as the probability of obtaining a value of
a test statistic as extreme or more extreme than
the one observed when the null hypothesis is
true.
Illustration:

Research Hypothesis: A given coin is biased

Null Hypothesis: The Coin is Fair; i.e.,


P(H) = ½ and P(T) = 1/2

One way of testing the null hypothesis is to toss the coin 12 times, say, and observe the number of heads H (or tails T) among the resulting outcomes.
Illustration:

If the null hypothesis is true, an outcome such


as 6H (or 6T) is not surprising (likely to
occur if Ho is TRUE) and tends to SUPPORT
the null hypothesis.

Note that if the null hypothesis is true, the


probability of occurrence of 6T or 6H will be
very high.
Illustration:

Moreover, if the null hypothesis is true, an


outcome such as 0H (or 12T) is surprising
(unlikely to occur) and CASTS DOUBT on the
truth of the null hypothesis.

Also, if the null hypothesis is true, the


probability of occurrence of 12T and 0H (0T and
12H) should be very low.
Tabled probability of outcomes if the coin is tossed 12 times:

Number of Heads H    Expected Frequency of Occurrence    P(H = h) if Null Hypothesis is True
12                   1                                   .00024
11                   12                                  .0029
10                   66                                  .0161
9                    220                                 .0537
8                    495                                 .1208
7                    792                                 .1936
6                    924                                 .2256
5                    792                                 .1936
4                    495                                 .1208
3                    220                                 .0537
2                    66                                  .0161
1                    12                                  .0029
0                    1                                   .00024
Total number of outcomes = 4096                          Total probability = 1.000

Note: If the null hypothesis is true, P(H) = P(T) = ½ or 0.50


Note that the probability of obtaining 10 heads (or a value more extreme) if the null hypothesis is TRUE is

P(H ≥ 10) = P(10H) + P(11H) + P(12H)
          = .0161 + .0029 + .00024
          = .019, which is interpreted as "highly unlikely"

On the other hand, the probability of obtaining 6 heads or a value more extreme than 6 heads is

P(H ≥ 6) = P(6H) + P(7H) + P(8H) + P(9H) + P(10H) + P(11H) + P(12H)
         = .2256 + .1936 + .1208 + .0537 + .0161 + .0029 + .00024
         = .6127, which is interpreted as "likely"
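These tail probabilities can be checked numerically. The sketch below is a minimal illustration assuming Python with SciPy is available; it is not part of the original SPSS workflow.

```python
# Minimal sketch (Python + SciPy assumed): probabilities for 12 tosses of a
# fair coin, and the tail probabilities used above as p-values.
from scipy.stats import binom

n, p = 12, 0.5

# P(H = h) for every possible number of heads under the null hypothesis
for h in range(n + 1):
    print(h, round(binom.pmf(h, n, p), 5))

# P(H >= 10): 10 heads or a more extreme result
print(binom.sf(9, n, p))   # survival function P(H > 9), roughly .019

# P(H >= 6): 6 heads or a more extreme result
print(binom.sf(5, n, p))   # roughly .613
```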
Problem:

What probability of an outcome (criterion) are we going to use as evidence that the hypothesis is FALSE and must be rejected? This probability is set by the researcher and must be "small". This probability is called the "level of significance", denoted by α.

Thus, if the probability of an outcome is less than α, we decide to reject the hypothesis.
Decision Rule for rejecting a null hypothesis:

If the null hypothesis is TRUE, a small probability associated with an outcome casts doubt on the truth of the null hypothesis, and if this probability is smaller than or equal to the criterion set by the researcher, the null hypothesis is rejected.

Example: Suppose α = .05

If p-value = .049 < α, H0 is rejected. (Any p-value which is less than or equal to α leads to the rejection of H0.)

If p-value = .051 > α, H0 is not rejected. (Any p-value which is greater than α leads to the non-rejection of H0.)
Decision Rule for rejecting a null hypothesis:

If data are manually analyzed, we use a statistical


table and the decision rule is:

Reject the null hypothesis if and only if the


computed value of the test statistic is greater than
or equal to the critical (tabular) value.
Decision Rule for rejecting a null hypothesis:

If data are analyzed using computer software, p-values are automatically generated and the decision rule is:

Reject the null hypothesis if and only if the p-value associated with the computed test statistic is less than or equal to the specified level of significance α.
Equivalence between the two rules:

If the computed value is greater than the critical value, it will fall in the rejection region when plotted; hence we reject the null hypothesis.

Note that this computed value will cut off an area to the right (the p-value) that is less than the given level of significance α. Hence, we have to reject the null hypothesis when the p-value ≤ α.
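As a worked illustration of this equivalence, the sketch below (assuming Python with SciPy) applies both decision rules to a hypothetical two-tailed t-test with df = 8 and α = .05; the computed t-value is illustrative only.

```python
# Minimal sketch (Python + SciPy assumed): equivalence of the critical-value
# rule and the p-value rule for a two-tailed t-test, df = 8, alpha = .05.
from scipy.stats import t

alpha, df = 0.05, 8
t_computed = 3.281                       # hypothetical computed test statistic

t_critical = t.ppf(1 - alpha / 2, df)    # critical (tabular) value, about 2.306
p_value = 2 * t.sf(abs(t_computed), df)  # two-tailed p-value, about .011

# Rule 1 (manual analysis): reject H0 if computed value >= critical value
print(t_computed >= t_critical)          # True -> reject H0

# Rule 2 (computer output): reject H0 if p-value <= alpha
print(p_value <= alpha)                  # True -> reject H0
```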
Some Statistical Tests Used for Comparing Groups:
(Experimental & Causal Comparative Research Designs)

Interval or Ratio data (population means are compared):
• 2 independent groups – t-test for independent samples
• at least 3 independent groups – One-Way ANOVA
• 2 dependent/correlated groups – t-test for dependent samples
• at least 3 dependent/correlated groups – Repeated Measures ANOVA

Ordinal data (population medians are compared):
• 2 independent groups – Wilcoxon Rank Sum Test
• at least 3 independent groups – Kruskal-Wallis One-Way ANOVA
• 2 dependent/correlated groups – Wilcoxon Signed Ranks Test
• at least 3 dependent/correlated groups – Friedman's Two-Way ANOVA

Nominal data (population proportions are compared):
• 2 independent groups – Chi-square Test or Fisher Exact Test
• at least 3 independent groups – Chi-square Test
• 2 dependent/correlated groups – McNemar Test
• at least 3 dependent/correlated groups – Cochran's Q Test
THE t-TEST FOR TWO
INDEPENDENT SAMPLES
Assumptions (t-test for independent samples):

1. Both sampled populations have relative


frequency distributions that are approximately
normal.

2. The population variances are equal


(Homogeneity of variance assumption)

Homogeneous population

Heterogeneous population
Assumptions:

3. The samples are randomly and independently


selected from the populations.

The manner of selecting an element in the


population does not influence the manner of
selecting the succeeding element.

4. Data are at least interval


TEST STATISTIC:

$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{S_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} \qquad \text{or} \qquad t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$
Where:
x̄1 is the mean of group 1; s1² is the variance of group 1
x̄2 is the mean of group 2; s2² is the variance of group 2
Sp² is the pooled variance defined by

$$S_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$
ILLUSTRATION:
Research Question: Is there a significant difference
between the mean math achievement
scores of students exposed to the
constructivist model and those
exposed to the lecture method?

Null hypothesis: There is no significant difference between


the mean achievement scores of
students exposed to the two methods of
teaching mathematics.
ALTERNATIVE HYPOTHESIS:

Directional: The mean achievement score of students


exposed to the constructivist model is
significantly higher than those exposed to the
lecture method.

Non-Directional: There is a significant difference between


the mean achievement scores of students
exposed to the constructivist model and
those exposed to the lecture method.

Note: A directional alternative hypothesis requires a one-


tailed test while a non-directional one requires a two-
tailed test.
Hypothetical Data

Group 1 Group 2
(Constructivist) (Lecture Method)
x1 x2
84 79
81 81
89 78
85 82
85 80

What are the variables of the study?


 Method of Teaching – independent variable
 Achievement Score – dependent variable
Variable name – one word alphanumeric
Example: method

Variable Label: Example:


Method of Teaching

Value labels (labels of the categories of a


qualitative variable)

Example: Method of Teaching is qualitative

Group Code Label


1 1 Constructivist
2 2 Lecture Method

Measure:
Method – Nominal
Score - Scale
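Before walking through the SPSS dialogs, the same analysis can be checked numerically. The sketch below assumes Python with SciPy is available and reuses the hypothetical scores above; it mirrors, but is not, the SPSS procedure.

```python
# Minimal sketch (Python + SciPy assumed): Levene's test and the pooled-variance
# t-test for the constructivist vs. lecture-method scores.
from scipy import stats

constructivist = [84, 81, 89, 85, 85]   # Group 1
lecture        = [79, 81, 78, 82, 80]   # Group 2

# Homogeneity of variance assumption; center='mean' mirrors the mean-based
# Levene statistic (roughly F = 0.44 here)
print(stats.levene(constructivist, lecture, center='mean'))

# t-test for two independent samples with pooled variance (equal_var=True)
t_stat, p_two_tailed = stats.ttest_ind(constructivist, lecture, equal_var=True)
print(t_stat, p_two_tailed)   # roughly t = 3.28, p = .011 (two-tailed)
```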
To analyze
the data:
Define your Test Variable
(dependent variable) and
your Grouping Variable
(independent Variable) by
CLICKING the button
with arrow

Provide the value labels for


method of teaching
(qualitative variable) by
CLICKING the button
“Define Groups”
Click Define Groups

Type 1 for Group 1


and 2 for Group 2

then

Click Continue

After defining the


value labels, click OK
Homogeneity of Variance Assumption:

This assumption requires that the population variances of


the groups compared are equal. Thus, the null hypothesis
to be tested is:


H0: σ1² = σ2²   If the null hypothesis is not rejected, report the pooled variance estimate t-value.

H1: σ1² ≠ σ2²   If the null hypothesis is rejected, report the separate variance estimate t-value.

The test statistic for testing the above null hypothesis is Levene's Test. If the p-value associated with the F-value is less than α = 0.05, say, reject H0; otherwise do not reject H0.
Output1: Descriptive Statistics (Highlight the number of cases N, Means
(rounded to one decimal place), and the standard deviation
(rounded to two decimal places)
Output 2:
1. Levene’s Test gives information if the Homogeneity of Variance
Assumption of the population variances is satisfied.
H0: σ1² = σ2²
Value of the test statistic: F = 0.439 and the p-value = .526 (not significant).
Hence, we can assume that the population variances are equal and the computed t value to be reported is 3.281 with 8 degrees of freedom (pooled variance estimate).

*** If the F value is significant, we report the t-value with 6.2 degrees
of freedom (separate variance estimate).

Output 2:
2. To test the null hypothesis H0: μ1 = μ2 of equality between the population means, we report the computed t-value of 3.281 with df = 8. Since the associated p-value of the test statistic is .011 (two-tailed) < α = 0.05 (and the corresponding one-tailed p-value, .011/2 ≈ .006, is also < α = 0.05), the null hypothesis is rejected.

There is sufficient evidence to show that the mean score posted by the students exposed to the constructivist model (84.8) is significantly higher than the mean posted by the students exposed to the lecture method (80.0), t = 3.281; p-value = .011.

P-value
THE T-TEST FOR
COMPARING TWO
DEPENDENT SAMPLES
TEST STATISTIC:

$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{\sum D^2 - \dfrac{\left(\sum D\right)^2}{n}}{n(n-1)}}}$$

Where:

x1 is the mean of group 1;


x2 is the mean of group 2;
n is the total number of paired data/observations;
D is the difference between paired scores/data
ILLUSTRATION:
Management training programs are often instituted in order to teach supervisory skills and thereby increase productivity. Suppose a set of examinations was administered to each of ten supervisors before such a training program began, and similar examinations were administered at the end of the program. The examinations are designed to measure supervisory skills, with higher scores indicating increased skills.
The results of the test are shown in the table below. Is the training program effective?
Supervisor 1 2 3 4 5 6 7 8 9 10
Score Before 63 93 84 72 65 72 91 84 71 80

Score After 78 92 91 80 69 85 99 82 81 87
NULL HYPOTHESIS:

There is no significant difference between the mean scores


on supervisory skills posted by the participants before and
after the training.

ALTERNATIVE HYPOTHESIS:
The mean posttest score on supervisory skills is
significantly higher than the mean pretest score.

(There is a significant improvement in the mean score on


supervisory skills after the training).
WORKSHEET:

BEFORE AFTER
63 78
93 92
84 91
72 80
65 69
72 85
91 99
84 82
71 81
80 87
x̄ = 77.5   x̄ = 84.4
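The paired analysis can also be reproduced numerically. The sketch below assumes Python with SciPy and uses the before/after scores above.

```python
# Minimal sketch (Python + SciPy assumed): paired-samples t-test for the
# before/after supervisory-skills scores.
from scipy import stats

before = [63, 93, 84, 72, 65, 72, 91, 84, 71, 80]
after  = [78, 92, 91, 80, 69, 85, 99, 82, 81, 87]

t_stat, p_two_tailed = stats.ttest_rel(after, before)
print(t_stat, p_two_tailed)   # roughly t = 4.02, df = 9, p = .003 (two-tailed)

# For the directional alternative (posttest mean > pretest mean),
# the one-tailed p-value here is half the two-tailed value.
print(p_two_tailed / 2)
```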
Data view

Click Variable View:


Under the column Name: Type pre_test, then
post_test as variable names

Variable Labels: Type “Pretest Score on


Supervisory Skills” for pre_test and “Posttest
Score on Supervisory Skills” for post_test and

Click: “Scale” under the column measure

Variable view
To analyze
the data:
Data view
Click the arrow button
to transfer the pretest
variable to the right
box.

Do the same for the


posttest variable

Click ok.
P-value

Based on the results of the analysis, the posttest mean score on


supervisory skills (84.4) is significantly higher than the pretest
mean (77.5), t = 4.022, df = 9, p = .003.
ONE WAY ANALYSIS OF
VARIANCE
What factors explain the variation of teachers’
emotional exhaustion (level of burnout)?

Let the circle below represent the total variation of the


teachers’ emotional exhaustion scores and suppose we
identify only one variable, marital status, as the factor that
contributes to such variation.

This total variation may be partitioned into two sources:


1. due to marital status
2. due to all other factors different from marital status
(lumped together and is called error)
PARTITION OF THE TOTAL VARIATION:

Explained Variation
(due to Marital Status)

Unexplained Variation (Error)


(Due to all other factors marital status excluded)

Note that the proportion of total variance explained by marital


status is very large compared to the proportion unexplained
(error variance). If marital status explains a large proportion of
the total variance, it means that the emotional exhaustion of
the teachers would differ across the categories of marital
status.
PARTITION OF THE TOTAL VARIATION:

Explained Variation
(due to Marital Status)

Unexplained Variation (Error)


(Due to all other factors marital status excluded)

On the other hand, if marital status explains only a small proportion of the total variance, it means that marital status is not an important factor and the emotional exhaustion scores are expected to be comparable across the categories of marital status.
The test statistic that we can use to test the equality of
the mean exhaustion scores across categories of
marital status is the ratio of the variance due to marital
status divided by the variance due to error, i.e.,

F = (Variance due to Marital Status) / (Variance due to Error)

A large value of F would occur when the numerator is


large compared to the denominator. Hence, large
values of F (much greater than 1.0) would lead to the
rejection of the null hypothesis.
Research Question:
Does teachers' emotional exhaustion vary across categories of marital status?

Null Hypothesis:
There is no significant difference in the mean level of
emotional exhaustion among teachers with different marital
status.

Alternative Hypothesis:

There is a significant difference in the level of emotional


exhaustion among teachers with different marital status.
Sample Data:

Group 1 (single):          34 38 35 31 34 34 34 27 34 34 35 34 31 30 29 31 26
Group 2 (married):         34 22 30 16 30 14 22 30 35 14 23 35 16 25 17
Group 3 (widow/separated): 13 21 28 13 14 16 25 24 19 14 17 14

What are the variables of the study?
• Marital Status – independent variable
• Emotional Exhaustion Score – dependent variable

(Encode the data using SPSS.)
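A numerical check of the one-way ANOVA is sketched below, assuming Python with SciPy; the three group lists follow the column reading of the table above.

```python
# Minimal sketch (Python + SciPy assumed): Levene's test and one-way ANOVA for
# the emotional-exhaustion scores of the three marital-status groups.
from scipy import stats

single    = [34, 38, 35, 31, 34, 34, 34, 27, 34, 34, 35, 34, 31, 30, 29, 31, 26]
married   = [34, 22, 30, 16, 30, 14, 22, 30, 35, 14, 23, 35, 16, 25, 17]
widow_sep = [13, 21, 28, 13, 14, 16, 25, 24, 19, 14, 17, 14]

# Homogeneity of variance across the three groups
print(stats.levene(single, married, widow_sep, center='mean'))

# One-way ANOVA: F = variance due to marital status / variance due to error
f_stat, p_value = stats.f_oneway(single, married, widow_sep)
print(f_stat, p_value)
```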
Encoded Data Using SPSS. Note that we only have two columns since there are only
two variables (Data View worksheet).
Click “Variable View” then do the following:
1. Under the column Name: Type e_score; then ms

2. Type the variable Label under the column “Label”:


for e_score - Emotional Exhaustion Score
for ms - Marital Status

3. Indicate the labels of the values of the categorical variable by clicking the
right portion of the cell under the column Values:
Thus: Value: 1 Label: Single - then click “Add”
Value: 2 Label: Married - then click “Add”
Value: 3 Label: Widow/Separated - then click “Add”

4. Under the column “Measure”,


click “Scale” for “e-score” since it is interval
click “Nominal” for “ms” since it is categorical
To analyze
the data:
Click “One way ANOVA” and declare the dependent variable by clicking the
button for “Dependent List” as well as the independent variable for the “Factor”.
The result is shown in the exhibit below.

For the desired output:


1. Click Options
For the desired output:
1. Click Options, click “Descriptive”;
Click “Homogeneity of variance test”
2. Click “Continue” to go back to the previous window, then click “Post Hoc”
for the pairwise comparison test.
3. Click "Scheffe" – the most conservative pairwise comparison test
4. Click “Continue” to go back to the previous window.
5. Click “ok” to generate the outputs.
Output 1: descriptive statistics

Output 2: Test for equality of population variances

P-value
Output3: Test of equality among the population means

P-value

Output4: Pairwise comparison of two means


Output5: Clustering of the means
REPEATED MEASURES
ANALYSIS OF VARIANCE
Assumptions:

1. Scores in each population are normally distributed around the population mean μ

2. The population variances are equal (Homogeneity of variance assumption)

3. The correlations among pairs of levels of the repeated variable are constant (Assumption of Sphericity).

4. Data are at least interval


Research Question:
Does repeated experience with the Licensure Examination for
Teachers (LET) lead to better scores, even without any
intervening study?
Null Hypothesis:
There is no significant difference in the mean scores of the students on the four sessions of taking the LET examination.

Alternative Hypothesis:
There is a significant difference in the mean scores of the students on the four sessions of taking the LET examination.
Data: Scores on four practice sessions

Session 1 Session 2 Session 3 Session 4


178 180 179 183
156 160 163 165
182 179 185 188
145 152 154 165
113 110 116 118
162 165 164 170
134 129 132 136
152 160 162 168
145 149 151 154
178 180 176 183
129 132 138 140
149 146 152 154
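A numerical check is sketched below, assuming Python with pandas and statsmodels; AnovaRM reports the sphericity-assumed F value (it does not apply Greenhouse-Geisser or Huynh-Feldt corrections), and the column names are illustrative.

```python
# Minimal sketch (Python + pandas + statsmodels assumed): repeated-measures
# ANOVA on the four LET practice sessions.
import pandas as pd
from statsmodels.stats.anova import AnovaRM

scores = [
    [178, 180, 179, 183], [156, 160, 163, 165], [182, 179, 185, 188],
    [145, 152, 154, 165], [113, 110, 116, 118], [162, 165, 164, 170],
    [134, 129, 132, 136], [152, 160, 162, 168], [145, 149, 151, 154],
    [178, 180, 176, 183], [129, 132, 138, 140], [149, 146, 152, 154],
]

# Reshape to long format: one row per (subject, session) observation
long = pd.DataFrame(
    [(subj, sess + 1, score)
     for subj, row in enumerate(scores, start=1)
     for sess, score in enumerate(row)],
    columns=["subject", "session", "score"],
)

result = AnovaRM(long, depvar="score", subject="subject", within=["session"]).fit()
print(result)   # F statistic and p-value for the Session effect
```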
Data encoded using SPSS (Data View Worksheet)
Variable View Worksheet
1. Under the column “Name”, type Session1, Session2, Session3, Session4
2. Under the column “Measure”, click “Scale” for all variables
(The result is shown in the second exhibit below.)
To Analyze: Click Analyze, General Linear Model, Repeated Measures…
Result after clicking “Repeated Measures”
1. Type the within subjects variable: Session
2. Type the number of levels, 4 then click “Add”
3. Click Define (to obtain the exhibit at the right)
4. Transfer the 4 levels of Session to the right, one after the other by clicking the
arrow
Result after transferring the levels of “session” to the right:

NEXT: Click “Options”; Transfer the variable “Session” to the box “Display Means
for”;
Check “Compare Means”; Click “Bonferroni”; You may click “Descriptive” if you
want to show the means per session. (The result is shown in the exhibit at the
right), Click “Continue” to go back to the previous window; click “ok”.
Reading the outputs

The mean scores and standard deviations per session are indicated. Note
that the mean scores tend to increase as the number of times of taking
the LET examination increases.
P-value

Mauchly’s Test of Sphericity tests equality of variances of the


differences of paired scores in all combination of treatment levels.

If significant, the sphericity assumption has not been met; we can use a corrected ANOVA depending on the estimated value of sphericity (epsilon), or we can use an alternative method of analyzing the data.

If not significant, the assumption of sphericity has been met.

In the output, Mauchly's test is not significant, χ² = 8.238, p = .146.
When the estimated sphericity (epsilon in the output) is > .75, use Huynh-Feldt
values; When < .75, use Greenhouse-Geisser values. Since sphericity can be
assumed, the ANOVA table will be as follows:

P-value

Based on the results of the analysis, the null hypothesis is rejected,


F = 22.238, p < .001. The mean scores on the four sessions are
significantly different.
The pairwise comparison test using Bonferroni technique indicates
that the mean score posted in the first session is significantly lower
than the means during the 3rd and 4th sessions. The mean scores
during the 2nd and 3rd session are significantly lower than the mean
score during the 4th session. Overall, the mean score during the 4th
session is significantly higher than in the first three sessions.
WILCOXON RANK SUM TEST
(MANN WHITNEY U TEST)
Click “Analyze”  “Nonparametric Tests” to generate the
window shown at the right.

Using the window at the right, Click “Fields”


Transfer the variables to the appropriate boxes
Click “Run”
Output 1 (IBM SPSS): P-value

The null hypothesis is rejected because the p-value is .016 (< α = .05).
Using the Legacy Dialogs

Click “2 Independent Samples”


Using the Legacy Dialogs

Transfer the variables to the appropriate boxes; Click “Define Groups”


and encode the codes for “Methods of Teaching”; Click “Continue”;
then Click “OK”
Output 2 Using the Legacy Dialogs: P-value

Based on the medians, it can be said that the median score posted by
the students exposed to the constructivist model is higher than those
exposed to the lecture method and this is significant at the .05 level of
significance, z = 2.312, p = .016. Hence, the null hypothesis is
rejected.
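The same comparison can be sketched in Python with SciPy (assumed below); the p-value may differ slightly from the SPSS output depending on how ties and exact vs. asymptotic methods are handled.

```python
# Minimal sketch (Python + SciPy assumed): Wilcoxon Rank Sum / Mann-Whitney U
# test for the constructivist vs. lecture-method scores used earlier.
from scipy import stats

constructivist = [84, 81, 89, 85, 85]
lecture        = [79, 81, 78, 82, 80]

u_stat, p_value = stats.mannwhitneyu(constructivist, lecture,
                                     alternative='two-sided')
print(u_stat, p_value)   # SPSS reports z = 2.312, p = .016
```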
WILCOXON SIGNED RANKS TEST
Click “Analyze”  “Nonparametric Tests” to generate the
window shown at the right.

Using the window at the right, Click “Fields”


Transfer the variables to the appropriate boxes
Click “Run”
Output 1 (IBM SPSS):
P-value

The null hypothesis is rejected because the p-value = .012 (< α = .05).
Using the Legacy Dialogs

Click “2 Related Samples”


Using the Legacy Dialogs

Transfer the paired variables to the appropriate boxes one at a time


(or by highlighting both and clicking the arrow button; Click “Options”
then check “Quartiles” to generate the medians; then Click “OK”
Output 2 Using the Legacy Dialogs:

P-value

The results indicate that the median posttest score (83.5) is higher than the median pre-test score (76) and this is significant at the .05 level of significance, z = 2.501, p = .012. Hence, the null hypothesis is rejected.
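A numerical sketch, assuming Python with SciPy, using the paired supervisory-skills scores from the earlier example:

```python
# Minimal sketch (Python + SciPy assumed): Wilcoxon Signed Ranks test for the
# paired before/after supervisory-skills scores.
from scipy import stats

before = [63, 93, 84, 72, 65, 72, 91, 84, 71, 80]
after  = [78, 92, 91, 80, 69, 85, 99, 82, 81, 87]

stat, p_value = stats.wilcoxon(after, before)
print(stat, p_value)
# SPSS reports the result as z = 2.501, p = .012; SciPy reports the signed-rank
# sum, and the two-tailed p-value should be comparable.
```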
KRUSKAL WALLIS ONE WAY
ANOVA
Click “Analyze”  “Nonparametric Tests” to generate the
window shown at the right.

Using the window at the right, Click “Fields”


Transfer the variables to the appropriate boxes
Click “Run”
Output 1 (IBM SPSS):

P-value

The null hypothesis is rejected because the p-value


is less than .001 (highly significant)
Using the Legacy Dialogs

Click “K Independent Samples”


Using the Legacy Dialogs

Transfer the variables to the appropriate boxes one at a time; Click


“Define Range” then encode the smallest and highest codes for the
categorical independent variable; Click “Continue”; then Click “OK”
Using the Legacy Dialogs

To generate the median scores of the three groups, click “Analyze” 


“Compare Means”  “Means”. Next, transfer the variables to the
appropriate boxes, transfer the statistics at the right (Mean, Number
of Cases, Standard Deviation) to the left box, select “Median” statistic
from the left and transfer it to the right box; click “continue”; click
“OK”.
Output 2 Using the Legacy Dialogs:

P-value

The Kruskal-Wallis test indicates that the median emotional exhaustion scores of the three groups are significantly different, χ² = 22.535, d.f. = 2, p < .001. Hence, the null hypothesis is rejected.
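A numerical sketch, assuming Python with SciPy, using the emotional-exhaustion data from the one-way ANOVA example; the pairwise follow-up anticipates the Bonferroni procedure described on the next slides.

```python
# Minimal sketch (Python + SciPy assumed): Kruskal-Wallis one-way ANOVA and
# Bonferroni-adjusted pairwise Mann-Whitney follow-ups (criterion .05/3 = .0167).
from scipy import stats

single    = [34, 38, 35, 31, 34, 34, 34, 27, 34, 34, 35, 34, 31, 30, 29, 31, 26]
married   = [34, 22, 30, 16, 30, 14, 22, 30, 35, 14, 23, 35, 16, 25, 17]
widow_sep = [13, 21, 28, 13, 14, 16, 25, 24, 19, 14, 17, 14]

h_stat, p_value = stats.kruskal(single, married, widow_sep)
print(h_stat, p_value)   # SPSS reports chi-square = 22.535, d.f. = 2, p < .001

pairs = {"single vs married": (single, married),
         "single vs widow/separated": (single, widow_sep),
         "married vs widow/separated": (married, widow_sep)}
for name, (a, b) in pairs.items():
    print(name, stats.mannwhitneyu(a, b, alternative='two-sided'))
```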
Pairwise Comparison Test:

We can run the Wilcoxon Rank Sum Test for each


pair of groups. Since there are three groups, we
will run this test three times (G1 vs G2), (G1 vs.
G3), (G2 vs. G3).

To control the level of significance α = .05, we divide this value by the number of pairwise comparisons (3). Thus, .05/3 = 0.0167. Two groups will be declared significantly different if the p-value of the test statistic is less than 0.0167. This is called the Bonferroni technique for comparing groups.
Pairwise Comparison Test:

To control the error rate α = 0.05, we first divide this value by the number of pairwise comparisons (3). Thus,

0.05 / 3 = 0.0167

Hence, p-values less than 0.0167 indicate that the two groups are significantly different.
Output 1:

The median emotional scores of the single and married teachers are
significantly different, z = 2.877, p = .004 (< .0167).
Output 2:

The median emotional scores of the single and widowed/separated


teachers are significantly different, z = 4.442, p < .001 (< .0167).
Output 3:

The median emotional scores of the married and widowed/separated


teachers are significantly different, z = 2.502, p = .012 (< .0167).
FRIEDMAN’S TWO WAY ANOVA
Click “Analyze”  “Nonparametric Tests” to generate the
window shown at the right.

Using the window at the right, Click “Fields”


Transfer the variables to the appropriate boxes by
clicking each variable and pressing the arrow button.
Click “Run”
Output 1 (IBM SPSS):
P-value

The null hypothesis is rejected because the p-value is less than .001 (< α = .05).
Using the Legacy Dialogs

Click “K Related Samples”


Using the Legacy Dialogs

Transfer the paired variables to the appropriate boxes one at a time


(or by highlighting all and clicking the arrow button; click “Statistics”
and check quartiles to generate the medians, click “Continue” then
Click “OK”
Output 2 Using the Legacy Dialogs:

Results show that the median scores on the four LET practice sessions are significantly different, χ² = 25.900, d.f. = 3, p < .001. Hence, the null hypothesis is rejected.
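A numerical sketch, assuming Python with SciPy, using the four LET practice sessions (the same data as the repeated-measures ANOVA example):

```python
# Minimal sketch (Python + SciPy assumed): Friedman's two-way ANOVA by ranks on
# the four related LET practice sessions.
from scipy import stats

session1 = [178, 156, 182, 145, 113, 162, 134, 152, 145, 178, 129, 149]
session2 = [180, 160, 179, 152, 110, 165, 129, 160, 149, 180, 132, 146]
session3 = [179, 163, 185, 154, 116, 164, 132, 162, 151, 176, 138, 152]
session4 = [183, 165, 188, 165, 118, 170, 136, 168, 154, 183, 140, 154]

chi2_stat, p_value = stats.friedmanchisquare(session1, session2, session3, session4)
print(chi2_stat, p_value)   # roughly chi-square = 25.9, d.f. = 3, p < .001
```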
Pairwise Comparison Test Using the Bonferroni Approach

Compare two groups at a time using the Wilcoxon Signed Ranks Test

Click “Analyze” 
“Nonparametric
Tests”  “Legacy
Dialogues” ”2
Related Samples”
Pairwise Comparison Test Using the Bonferroni Approach

Select the pair of variables to be compared and transfer them to the


right box by clicking the arrow button.

Once done,
Click “OK”.
Outputs:

Criterion for declaring significantly different pairs using the Bonferroni technique:

0.05 / 6 = 0.0083

Hence, two groups are significantly different if the p-value of the test statistic is less than 0.0083.

significant significant significant


THE CHI-SQUARE TEST
for Homogeneity of Samples
TWO INDEPENDENT SAMPLES
Example :

In a study conducted on the use of seat


belts in preventing fatalities, records of the
last 100 vehicular accidents were
reviewed. These 100 accidents involved
238 persons. Each person was classified
as using or not using seat belts when the
accident happened and as injured fatally or
a survivor.
Research Question:
Is there a significant difference in the proportion
of persons who are fatally injured between those
who wear seatbelts and those who do not?

Null Hypothesis:
There is no significant difference in the
proportion of persons who are fatally injured
between those who wear seatbelts and those
who do not.
Alternative Hypothesis (Non-directional):
There is a significant difference in the proportion
of persons who are fatally injured between those
who wear seatbelts and those who do not.

Alternative Hypothesis (Directional):


The proportion of persons who are fatally injured
is higher for those who do not wear seatbelts
than those who wear seatbelts.
Data:

Injured Fatally?   Wearing Seat Belt?              Total
                   Yes (1)          No (0)
Yes (1)            9                88              97
No (0)             23               118             141
Total              32               206             238
Test Statistic:

$$\chi^2 = \sum_{\text{over all cells}} \frac{(o - e)^2}{e}$$

where o is the observed frequency and e is the expected frequency.

For contingency tables, the degrees of freedom is given by

df = (r − 1)(c − 1)

where r is the number of categories of the row variable and c is the number of categories of the column variable.
The tabled data is actually a summary of the
responses of the respondents which can be
encoded directly using SPSS

• Use codes for the


responses: 1 – Yes; 0 – No

• Encode the frequency


counts in a column

• Put the corresponding


codes for the two variables

• So that SPSS will read the


data as frequency counts,
we have to set on the
“weight cases” command
1. Click “Data”; then “Weight Cases”
to show the window on top right.
2. Highlight the data “Freq”, Click “Weight cases by” then Click
the arrow button; Click “ok”
Check if “Weight On” already appears at the bottom right corner
of the worksheet
To Analyze the data: Click “Analyze” ”Descriptive Statistics”
 “Crosstabs” to obtain the window at the right.

Transfer the row variable “Injured Fatally” and the column


variable “Wearing Seatbelts” by highlighting them one at a
time and clicking the arrow button.
Click “Statistics” then check “Chi-
square”, then click “Continue”;
Click “Cells” then check “Column”;
then Click “Continue”; then Click
“OK”
Output (Descriptive Statistics):

Note that 28.1% of those who wore seatbelts were


injured fatally while 42.7% of those who did not wear
seatbelts were also injured fatally. Is 42.7% significantly
higher than 28.1%?
Output (Inferential Statistics):

If there is no cell with an expected frequency of less than 5, we report the Pearson Chi-square value. Since χ² = 2.443 and the associated probability is .118 (which is greater than α = .05), the null hypothesis is NOT rejected. There is not sufficient evidence to conclude that wearing seat belts significantly reduces fatal injuries.
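A numerical sketch, assuming Python with SciPy, entering the seat-belt data as a 2 × 2 contingency table:

```python
# Minimal sketch (Python + SciPy assumed): chi-square test of homogeneity for
# the seat-belt data.
from scipy.stats import chi2_contingency

#             belt: yes   belt: no
table = [[ 9,  88],   # injured fatally: yes
         [23, 118]]   # injured fatally: no

# correction=False gives the Pearson chi-square reported in the slides;
# the default applies Yates' continuity correction for 2 x 2 tables.
chi2_stat, p, df, expected = chi2_contingency(table, correction=False)
print(chi2_stat, df, p)   # roughly chi-square = 2.44, df = 1, p = .118
print(expected)           # expected frequencies (all at least 5 here)
```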
Assumption on the use of the Chi-square Test:

“No more than 20% of the cells must have expected


frequencies of less than 5”

For 2×2 tables, if the smallest expected frequency is less than 5, or for a very small sample size (N ≤ 20), use the FISHER EXACT PROBABILITY TEST.
In the SPSS output shown in the previous slide, there is no cell with
expected frequency of less than 5, so reporting the Chi-square value is
correct. If there is a cell with expected frequency of less than 5, we
have to report the Fisher’s Exact Test. Note that the Fisher’s Exact
Probability is .084 which is greater than 0.05 criterion. Hence the null
hypothesis is also NOT rejected.
FISHER EXACT PROBABILITY TEST

Do male and female presidents of the 10 State Colleges and


Universities (SUCs) differ on their opinion regarding the integration of
CHED supervised tertiary schools to the different SUCs in the region?
Each president was asked if he/she is for or against the planned
integration of the schools to the SUCs. The data are shown in the table
below

Based on the data, 4 out of 5 or 80% of the male presidents are in favour while only 2 out of 5 or 40% of the female presidents are in favour.
FISHER EXACT PROBABILITY TEST

Null Hypothesis:
There is no significant difference between the proportion of male and
female SUC presidents who are in favor of integrating the CHED
supervised tertiary schools to the different SUCs

Alternative Hypothesis (Directional):


The proportion of male presidents who are in favour of the integration
is significantly higher than the female presidents.

Alternative Hypothesis (Non-Directional):


There is a significant difference between the proportion of male and
female SUC presidents who are in favor of integrating the CHED
supervised tertiary schools to the different SUCs
FISHER EXACT PROBABILITY TEST

Hence, the probability (one-tailed) of obtaining a frequency


distribution as extreme or more extreme than the one
observed when the null hypothesis is true is given by

P = 0.238 + 0.0238 = 0.262

Since 0.262 is greater than α = 0.05, the null hypothesis is not rejected. This means that the proportions of male and female presidents who are in favor of the integration of CHED supervised tertiary schools to the SUCs are not significantly different.
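A numerical sketch, assuming Python with SciPy, using the 2 × 2 table of presidents' opinions:

```python
# Minimal sketch (Python + SciPy assumed): Fisher exact test for the SUC
# presidents' opinions (4 of 5 males in favour, 2 of 5 females in favour).
from scipy.stats import fisher_exact

#             in favour   against
table = [[4, 1],   # male presidents
         [2, 3]]   # female presidents

# One-tailed (directional) test, matching the hand computation P = 0.262
odds_ratio, p_one_tailed = fisher_exact(table, alternative='greater')
print(p_one_tailed)   # roughly 0.262

# Two-tailed (non-directional) test
print(fisher_exact(table, alternative='two-sided'))
```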
FISHER EXACT PROBABILITY TEST
SPSS OUTPUT:
THREE INDEPENDENT SAMPLES
Example :

Suppose we want to compare the effectiveness of three methods of teaching physics, namely the Lecture Method (Method 1), the Modular Method (Method 2) and Using CAI Materials (Method 3). Students are randomly assigned to three groups, and the three teaching methods are randomly assigned to the three groups as well.

The dependent variable in this case is the performance of the


students in the final examination in Physics. If the data are scores,
the One way ANOVA will be applicable. But suppose the
performance of the student is categorized into one of the following
categories: Below Satisfactory (score of 74 and below), Fair (75 –
79), Satisfactory (80 – 84), and Above Satisfactory (85 and above).
THREE INDEPENDENT SAMPLES

It would be interesting to determine how many of the


students in each group would have scores falling within
each of these four categories. A comparison of the
frequencies or proportions can be done descriptively. But
if we want to test whether the proportions within each
category of the dependent variable differ significantly
from one another, the Chi-Square test of significance can
be used.
THREE INDEPENDENT SAMPLES
Ho: The distribution of grades of students exposed to the
three teaching methods will not differ significantly. (or
There is no significant difference between the
proportion of students in each of the three groups
who obtained Above Satisfactory ratings)

Ha: The distribution of grades of students exposed to the


three teaching methods will differ significantly. (or
There is a significant difference between the
proportion of students in each of the three groups
who obtained Above Satisfactory ratings)
THREE INDEPENDENT SAMPLES

Performance Category       Method of Teaching                      TOTAL
                           Lecture (3)   Modular (2)   CAI (1)
Above Satisfactory (4)     9             20            18          47
Satisfactory (3)           12            18            21          51
Fair (2)                   15            10            8           33
Below Satisfactory (1)     24            12            6           42
TOTAL                      60            60            53          173
The tabled data is actually a summary of the
responses of the respondents which can be
encoded directly using SPSS
• Use codes for the responses:
• Perf_Cat: 1 – Above Satisfactory
2 – Satisfactory
3 – Fair
4 – Below Satisfactory
• Method: 1 – Lecture
2 – Modular
3 – CAI

• Encode the frequency counts in a


column

• Put the corresponding codes for the


two variables

• So that SPSS will read the data as


frequency counts, we have to set on
the “weight cases” command
1. Click “Data”; then “Weight Cases”
to show the window on top right.
2. Highlight the data “Freq”, Click “Weight cases by” then Click
the arrow button; Click “OK”
Check if “Weight On” already appears at the bottom right corner
of the worksheet
To Analyze the data: Click “Analyze” ”Descriptive Statistics”
 “Crosstabs” to obtain the window at the right.

Transfer the row variable “Performance Category” and the


column variable “Method of Teaching” by highlighting them
one at a time and clicking the arrow button.
Click “Statistics”  check “Chi-
square”; click “Continue”;
Click “Cells”  check “Column”;
Click “Continue”; then Click “OK”
Output (Descriptive Statistics):

Among the three groups, those exposed to CAI method


posted the highest proportion (34.0%) followed by those
exposed to Modular (33.3%), then by those exposed to
the Lecture Method (15.0%).
Output (Inferential Statistics):

If the number of cells with an expected frequency of less than 5 is only 20% or less, we report the Pearson Chi-square value. Since χ² = 20.647 and the associated probability is p = .002 (which is less than α = .05), the null hypothesis is rejected.

If more than 20% of the cells have expected frequencies of less than 5,
either we collapse categories (provided the resulting category has
meaning) or we gather more data.
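A numerical sketch, assuming Python with SciPy, entering the 4 × 3 table directly:

```python
# Minimal sketch (Python + SciPy assumed): chi-square test of homogeneity for
# the performance-by-teaching-method table.
from scipy.stats import chi2_contingency

#             Lecture  Modular  CAI
table = [[ 9, 20, 18],   # Above Satisfactory
         [12, 18, 21],   # Satisfactory
         [15, 10,  8],   # Fair
         [24, 12,  6]]   # Below Satisfactory

chi2_stat, p, df, expected = chi2_contingency(table)
print(chi2_stat, df, p)   # roughly chi-square = 20.6, df = 6, p = .002

# Rule of thumb check: no more than 20% of cells should have an expected
# frequency below 5.
print((expected < 5).mean())
```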
MCNEMAR TEST
McNemar Test

The McNemar test for the significance of changes


is applicable to “before-and-after” designs in
which each subject is its own control and in which
the measurements are made on either a nominal
or ordinal scale.
McNemar Test

To test the significance of any observed change


using this test, a fourfold table of frequencies is
used to represent the first and second sets of
responses from the same individuals. In this
table, + and – signs are used to denote different
responses arranged as shown below:
McNemar Test
                 After
                 −        +
Before    +      A        B
          −      C        D

A - denotes the number of individuals whose responses were


POSITIVE on the first measure and NEGATIVE on the second
measure;

D - the number of individuals whose responses changed from


NEGATIVE to POSITIVE

B and C are the respondents who responded the same (POSITIVE


for B and NEGATIVE for C) on both measures.
EXAMPLE:

How consistent are people in their voting habits? Do people vote for
the same party from election to election? Below are the results of a
poll in which people were asked if they had voted for NP or LP in each
of the last two presidential elections.

1992 Elections     1998 Elections              Total
                   LP          NP
NP                 27          117              144
LP                 178         23               201
Total              205         140              345

Note: % of people who voted for NP in 1992 was 144/345 = 41.74%
      % of people who voted for NP in 1998 was 140/345 = 40.58%
Ho: There is no significant difference between the
proportion of those who voted for NP during
the 1992 elections and those who voted the
same party affiliation during the 1998
elections.

H1: The proportion of those who voted for NP


during the 1998 elections was significantly
lower than those who voted for NP during the
1992 elections.
We use the following codes for the party affiliation:
1 – NP; 2 – LP and encode the data as follows:

1992 1998 FREQ


1 2 27
1 1 117
2 2 178
2 1 23

We encode the data in similar manner as we did for the Chi-square test
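A numerical sketch, assuming Python with statsmodels, using the 2 × 2 concordance table built from the frequencies above:

```python
# Minimal sketch (Python + statsmodels assumed): McNemar test for the change in
# NP votes between the 1992 and 1998 elections.
from statsmodels.stats.contingency_tables import mcnemar

#               1998: NP   1998: LP
table = [[117,  27],   # 1992: NP
         [ 23, 178]]   # 1992: LP

# exact=True applies the binomial test to the discordant pairs (27 and 23)
result = mcnemar(table, exact=True)
print(result.statistic, result.pvalue)
```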
To Analyze the data: Click "Analyze" → "Nonparametric Tests" → "Legacy Dialogs" → "2 Related Samples" to obtain the window shown in the next slide.
Transfer the paired variables to the right; Uncheck
“Wilcoxon”, Check “McNemar”, then Click “OK”.
IBM SPSS

LEGACY DIALOGS
COCHRAN’S Q TEST
Assumptions:

1. Responses are binary and from k matched


groups.

2. The subjects are independent of one another and


were selected at random from a larger population.

3. The sample size is sufficiently "large". (A rule of thumb: the number of subjects for which the responses are not all 0's or 1's, n, should be ≥ 4 and nk ≥ 24. This assumption is not required for the exact binomial McNemar Test.)
Example:

An experimental study was conducted to


identify the method of teaching that would best
improve the conceptual understanding of the
students in Chemistry. Fifteen (15) sets of matched
individuals were selected and randomly assigned to
the three groups. The dependent variable of the
study was the students’ performance in the test to be
given after the experiment. Each student’s
performance was coded as 1 if the student passes
the test and 0 if he fails. The data are shown below.
DATA (1 – PASS; 0 – FAIL)

Pair   METHOD A   METHOD B   METHOD C
1      1          1          1
2      1          0          1
3      0          0          0
4      1          1          0
5      0          0          1
6      0          0          0
7      1          0          0
8      1          1          0
9      1          0          1
10     1          0          0
11     0          0          0
12     1          0          0
13     1          1          1
14     1          0          1
15     0          0          0
Ho: There is no significant difference in the
proportion of subjects who pass the test in each
of the three groups.

H1: There is a significant difference in the proportion


of subjects who pass the test in each of the
three groups
Click “Analyze”  “Nonparametric Tests” to generate the
window shown at the right.

Using the window at the right, Click “Fields”


Transfer the paired variables to the right, then click “RUN”.
IBM SPSS
Using the Legacy Dialogues:

Transfer the paired variables


to the right; Uncheck
“Friedman” then Check
“Cochran’s Q”  Click
“Statistics”  Check
“Descriptives”  Click “OK”
LEGACY DIALOGS

Based on the output, the three proportions (66.7%, 26.7%, and 40.0%)
are significantly different, Q = 6.222, d.f. = 2, p-value = .045.
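The Q statistic can be reproduced directly from its formula. The sketch below assumes Python with NumPy and SciPy and uses the 15 matched triples above.

```python
# Minimal sketch (Python + NumPy + SciPy assumed): Cochran's Q for the matched
# pass/fail data (1 = pass, 0 = fail).
import numpy as np
from scipy.stats import chi2

data = np.array([
    [1, 1, 1], [1, 0, 1], [0, 0, 0], [1, 1, 0], [0, 0, 1],
    [0, 0, 0], [1, 0, 0], [1, 1, 0], [1, 0, 1], [1, 0, 0],
    [0, 0, 0], [1, 0, 0], [1, 1, 1], [1, 0, 1], [0, 0, 0],
])

k = data.shape[1]               # number of related groups (3 methods)
col_totals = data.sum(axis=0)   # passes per method: 10, 4, 6
row_totals = data.sum(axis=1)   # passes per matched set
grand_total = data.sum()        # T = 20

q = (k - 1) * (k * (col_totals ** 2).sum() - grand_total ** 2) \
    / (k * grand_total - (row_totals ** 2).sum())
p_value = chi2.sf(q, df=k - 1)
print(q, p_value)               # roughly Q = 6.22, d.f. = 2, p = .045
```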
PAIRWISE COMPARISON USING THE MCNEMAR TEST:

Click “Analyze”  Click “Nonparametric Tests”  “Legacy Dialogues”  “2


Related samples”. Then transfer the paired variables to the right one at a
time. Check “McNemar”, then click “OK”.
Outputs:

Using the Bonferroni technique with the criterion set at .05/3 = 0.0167, no two groups are significantly different. Note that the Cochran's Q Test was significant, yet McNemar using the Bonferroni approach failed to detect two groups which are significantly different.

McNemar is less powerful since only the discordant pairs are used in the computation of the test statistic.
ALTERNATIVE TO MCNEMAR: Minimum Required Difference (MRD)

Where k = number of groups; N = number of paired cases


ALTERNATIVE TO MCNEMAR: Minimum Required Difference (MRD)

Pair    METHOD A   METHOD B   METHOD C   Sum   Sum²
1       1          1          1          3     9
2       1          0          1          2     4
3       0          0          0          0     0
4       1          1          0          2     4
5       0          0          1          1     1
6       0          0          0          0     0
7       1          0          0          1     1
8       1          1          0          2     4
9       1          0          1          2     4
10      1          0          0          1     1
11      0          0          0          0     0
12      1          0          0          1     1
13      1          1          1          3     9
14      1          0          1          2     4
15      0          0          0          0     0
Total   10         4          6          20    42
%       66.7%      26.7%      40.0%

$$T = \sum_{i=1}^{N}\sum_{j=1}^{k} Y_{i,j} = 20 \qquad R = \sum_{i=1}^{N}\left(\sum_{j=1}^{k} Y_{i,j}\right)^2 = 42$$

$$\alpha_{adj} = \frac{.05}{3} = 0.0167 \qquad z_{adj} = 2.13 \;\text{(obtained using the normal table)}$$
ALTERNATIVE TO MCNEMAR: Minimum Required Difference (MRD)

$$MRD = z_{adj}\sqrt{\frac{2(kT - R)}{N^2\,k(k-1)}} = 2.13\sqrt{\frac{2\left[(3)(20) - 42\right]}{15^2\,(3)(2)}} = .3478 = 34.78\%$$
Comparison i j Absolute Minimum Decision
Difference Required
Absolute
Difference

A vs. B 66.7 26.7 40.0 34.7 Reject


A vs. C 66.7 40.0 26.7 34.7 Do not reject
B vs. C 26.7 40.0 13.3 34.7 Do not reject

Conclusion: Only the effects of Method A and Method B are significantly different.
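The MRD procedure above can be reproduced numerically. The sketch below assumes Python with NumPy and follows the formula on the preceding slides.

```python
# Minimal sketch (Python + NumPy assumed): Minimum Required Difference (MRD)
# applied to the same matched pass/fail data.
import numpy as np

data = np.array([
    [1, 1, 1], [1, 0, 1], [0, 0, 0], [1, 1, 0], [0, 0, 1],
    [0, 0, 0], [1, 0, 0], [1, 1, 0], [1, 0, 1], [1, 0, 0],
    [0, 0, 0], [1, 0, 0], [1, 1, 1], [1, 0, 1], [0, 0, 0],
])
N, k = data.shape
T = data.sum()                      # grand total = 20
R = (data.sum(axis=1) ** 2).sum()   # sum of squared row totals = 42
z_adj = 2.13                        # from the normal table for alpha_adj = .0167

mrd = z_adj * np.sqrt(2 * (k * T - R) / (N ** 2 * k * (k - 1)))
print(mrd)                          # roughly 0.3478, i.e. 34.78 percentage points

props = data.mean(axis=0) * 100     # 66.7%, 26.7%, 40.0%
for i in range(k):
    for j in range(i + 1, k):
        diff = abs(props[i] - props[j])
        print(i, j, round(diff, 1),
              "Reject" if diff > mrd * 100 else "Do not reject")
```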
Thank you !
Establishing Relationship Between Variables:
(Descriptive Correlational Research Designs)

• Variable X at least interval, Variable Y at least interval – Pearson's r
• Variable X at least ordinal, Variable Y at least ordinal – Spearman's Rho
• Variable X nominal (dichotomous), Variable Y at least interval – Point-Biserial
• Variable X nominal (at least three categories), Variable Y at least interval (dependent variable) – Eta (measure of non-linear correlation)
• Variable X categorical, Variable Y categorical – Chi-square based measures: Contingency Coefficient (r × r tables), Cramer's Coefficient (r × c tables), Phi Coefficient (2 × 2 tables)
• Variable X categorical (unordered), Variable Y categorical (unordered) – PRE-based measure: Lambda Coefficient
• Variable X categorical (ordered), Variable Y categorical (ordered) – PRE-based measure: Gamma
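For the first three measures in the list, a minimal numerical sketch assuming Python with SciPy (the small data vectors are made up purely for illustration):

```python
# Minimal sketch (Python + SciPy assumed): Pearson's r, Spearman's rho, and the
# point-biserial correlation on illustrative data.
from scipy import stats

x = [2, 4, 5, 7, 9]
y = [10, 12, 15, 18, 24]
print(stats.pearsonr(x, y))            # interval by interval: Pearson's r
print(stats.spearmanr(x, y))           # ordinal by ordinal: Spearman's rho

group = [0, 0, 1, 1, 1]                # dichotomous nominal variable
print(stats.pointbiserialr(group, y))  # nominal (dichotomous) by interval
```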
