
California State University, Northridge Department of Psychology Statistical Methods Lab: PSY320L

SPSS 12.0 COURSE READER

Developed and Written by: Janice C. McMurray, M.A. Teaching Assistant

Supervising Professor: Gary S. Katz, Ph.D.

Contents
1 Using SPSS
  1.1 SPSS Availability in Sierra Hall
  1.2 Finding Course Files on the Internet
2 Becoming Familiar with SPSS
  2.1 Introduction
  2.2 Two Views
  2.3 Entering Data
3 Data Screening
  3.1 Descriptive Information
    3.1.1 SPSS Explore
    3.1.2 Output from Explore
    3.1.3 Linearity and Homoscedasticity: Two-Variable Relationships
    3.1.4 What if the Data are Not Normal?
  3.2 Measures of Central Tendency and Variability
    3.2.1 Analyze Central Tendency and Variability of Scores
4 Data Management
  4.1 Sort Your Data
  4.2 Correct Data Entry Errors
  4.3 Analyzing Linear Transformations of Your Data
    4.3.1 Linear Transformations of a Current Variable Using Addition
    4.3.2 Linear Transformations of a Current Variable Using Subtraction
    4.3.3 Linear Transformations of a Current Variable Using Multiplication
    4.3.4 Linear Transformations of a Current Variable Using Division
    4.3.5 Linear Transformations of a Variable Using Subtraction and Division
5 Percentiles
  5.1 Definition
  5.2 Computing Percentiles
  5.3 Select Cases
    5.3.1 Select Cases if a Defined Condition is Satisfied
    5.3.2 Select a Random Sample of Cases
6 Examining Relationships Between Variables
  6.1 Producing Scatterplots
    6.1.1 Matrix Scatterplots
    6.1.2 Simple Scatterplots
  6.2 Analyzing Correlation Coefficients
7 Predicting Outcomes
  7.1 The Linear Regression Analysis
8 Testing Hypotheses by Comparing Means
  8.1 The One-Sample t Test
  8.2 The Two Independent Samples t Test
  8.3 One-Way Analysis of Variance (ANOVA)
  8.4 General Linear Model (GLM)

Chapter 1
Using SPSS

1.1 SPSS Availability in Sierra Hall


SPSS is loaded on all the computers in the open computer lab at the east end of the third floor (SH392). Lab hours are Monday - Thursday 7:45 am - 11 pm, Friday 7:45 am - 5 pm. There are limited hours available with a tutor in this lab (SH341). Tutor lab hours will be posted on the window of the door into the lab. Additional tutoring is available in SH385 (there are no computers in that room). The tutoring schedule will be posted on the door into the room.

1.2 Finding Course Files on the Internet


- Open Internet Explorer.
- All files used for this course may be retrieved at the Course Web Site: http://www.csun.edu/~gk45683/psy320/
- To open a file from the Course Web Site, click once on that file.
- A pop-up File Download screen will appear.
- Choose Open and the file will open on your Desktop.

Chapter 2
Becoming Familiar with SPSS
2.1 Introduction
SPSS for Windows is just like any other Windows program. Many familiar Windows features are available in SPSS (e.g., menu bar, tool bar, cut, copy, and paste). Many features of the data editor are similar to those found in spreadsheet applications (e.g., MS Excel). SPSS is designed to run analyses with:
- Subjects entered one per row, going down the screen, and
- Variables entered one per column, going across the screen.

subject   anxiety   tension   trial 1   trial 2   trial 3
1         1         1         18        14        12
2         1         1         19        12        8
3         1         1         14        10        6
4         1         2         16        12        10
5         1         2         12        8         6

2.2 Two Views

Take a moment now to click on the Variable View tab near the bottom left of the screen (Figure 2.2, next page) to see the difference in the set-up screen. Before entering new data, we need to define each variable in the data set. Not only may we define variables, but we may also define levels of these variables. There are a few restrictions in naming your variables:
- Variable names must begin with a letter, not a number
- Variable names cannot end with a period
- Variable names that end with an underscore should be avoided
- The length of the name (in the Name column) cannot exceed eight characters
- Blanks and special characters cannot be used
- Each variable name must be unique

Click back to Data View now. Move your mouse WITHOUT CLICKING over a few of the variable names at the top of your columns. SPSS will show you the variable labels defined for this data set.


Figure 2.2: SPSS Data View

2.3 Entering Data

Open a blank worksheet by clicking: File > New > Data. Using the data from our lab survey, define your variables using the Variable View.

Chapter 3
Data Screening
Before any analysis is performed, you must do everything possible to ensure a data file is clean (e.g., meets the necessary assumptions of a statistical procedure). It is also important to organize your input so it is easier to extract meaning from the data.

3.1 Descriptive Information

3.1.1 SPSS Explore

The best method for data screening via descriptive information is the Explore procedure. To start, download a course file and click: Analyze > Descriptive Statistics > Explore (Figure 3.1).

Figure 3.1: Explore Command

The box on the left lists all the variables from your data set (Figure 3.2).
- Variables categorized by numbers have a # in front of them (e.g., age).
- Variables categorized by words have an A in front of them (e.g., gender).
- Click once on a variable to highlight it, and then click on the arrow button to move it over to the Dependent List.

Figure 3.2: Explore Dialog Box

Click on the rectangular Statistics button (Figure 3.2). Check:
- Descriptives (Figure 3.3)
- Outliers; by checking Outliers, SPSS will produce the five highest and lowest scores with their case numbers.
Click Continue.

Figure 3.3: Statistics Dialog Box


Click on the Plots button (Figure 3.2). Check:
- Histogram (Figure 3.4)
- Uncheck Stem-and-leaf (now that computers have the ability to generate high-resolution charts, it is not useful).
- Leave Factor levels together checked.
Click Continue.

Figure 3.4: Plots Dialog Box

Click on the Options button. Choose (Figure 3.5):
- Exclude cases pairwise (each statistic is computed from all cases that have valid values on the variables it involves). You can always go back and exclude cases listwise if you like (all data are completely discarded for subjects with a missing value on any variable at all).
Click Continue, then OK.

Figure 3.5: Options Dialog Box

Print your graphs using the menu bar: File, Print. Make sure that All Visible Output is selected when you print. Once you have printed, close the SPSS Viewer Window.

3.1.2 Output from Explore

Case Processing Summary is produced first (Figure 3.6). This table shows the total number of cases for each variable along with the number of cases that you are missing.

Figure 3.6: Explore Output

Case Processing Summary
                                        Cases
                            Valid           Missing         Total
                            N     Percent   N     Percent   N     Percent
cups of coffee drink / day  75    100.0%    0     .0%       75    100.0%

Descriptives is produced next (Figure 3.7), showing the mean, median, variance, standard deviation, and range for each variable.

Figure 3.7: Explore Output

Descriptives (cups of coffee drink / day)
                                          Statistic   Std. Error
Mean                                      .7067       .11032
95% Confidence      Lower Bound           .4868
Interval for Mean   Upper Bound           .9265
5% Trimmed Mean                           .6037
Median                                    .0000
Variance                                  .913
Std. Deviation                            .95540
Minimum                                   .00
Maximum                                   4.00
Range                                     4.00
Interquartile Range                       1.0000
Skewness                                  1.295       .277
Kurtosis                                  1.131       .548
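For readers who want to verify where these Explore descriptives come from, the core values are simple arithmetic. A minimal Python sketch on made-up cup counts (not the course data set):

```python
import statistics

data = [0, 0, 0, 1, 1, 1, 2, 2, 3, 4]  # hypothetical cups-of-coffee counts

mean = statistics.mean(data)
median = statistics.median(data)
sd = statistics.stdev(data)      # sample standard deviation, as SPSS reports
rng = max(data) - min(data)      # range = maximum minus minimum

print(mean, median, round(sd, 3), rng)
```

On real data, these should match the Statistic column of the Descriptives table up to rounding.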

Next, the Extreme Values table is produced (Figure 3.8), showing the five highest and lowest values along with their case numbers. This is very useful in detecting outliers.

Figure 3.8: Explore Output

Extreme Values (cups of coffee drink / day)
            Case Number   Value
Highest  1  41            4.00
         2  30            3.00
         3  71            3.00
         4  75            3.00
         5  52            .a
Lowest   1  36            .00
         2  24            .00
         3  25            .00
         4  29            .00
         5  21            .b
a. Only a partial list of cases with the value 2 is shown in the table of upper extremes.
b. Only a partial list of cases with the value 0 is shown in the table of lower extremes.

The Histograms and Boxplots are produced next (Figures 3.9 and 3.10). It is always a good idea to have a picture, and these are also useful in detecting outliers.

Figure 3.9: Explore Output (histogram of cups of coffee drink / day; Std. Dev = .96, Mean = .7, N = 75.00)

Figure 3.10: Explore Output (boxplot of cups of coffee drink / day; outliers flagged at cases 41, 30, 71, and 75; N = 75)


3.1.3 Linearity and Homoscedasticity: Two-Variable Relationships

(Covered later in this course.) To check for linearity and homoscedasticity, use scatterplots by clicking: Graphs > Scatter > Matrix (Figure 3.11), then Define.
- Click and move your variables over (Figure 3.12). (If you notice something strange, use Simple to get a larger view of the problem.)
- OK

Figure 3.11: Scatterplot Box

Figure 3.12: Scatterplot Matrix Dialog Box

3.1.4 What if the Data are Not Normal?
- Rely on the robustness of the method of analysis (the "do nothing" approach)
- Transform the variables (covered in later statistics courses)


3.2 Measures of Central Tendency and Variability

These measures include the mean, the median, the mode, the range, and the standard deviation.

3.2.1 Analyze Central Tendency and Variability of Scores

The best method for looking at these measures in univariate statistics is Frequencies (Figure 3.13). Frequencies is an efficient means of computing descriptive statistics for continuous variables. Download a course file and click: Analyze > Descriptive Statistics > Frequencies. Click once on each variable you are interested in and arrow it over to the Variable(s) box (Figure 3.14).

Figure 3.13: Frequencies Command


Figure 3.14: Frequencies Dialog Box

Choose your methods to analyze variability of scores by clicking on the Statistics button (Figure 3.14):
- Under Central Tendency, check Mean, Median, and Mode (Figure 3.15)
- Under Dispersion, check Standard Deviation, Range, Minimum, and Maximum
- Continue
- BE SURE TO uncheck Display frequency tables (Figure 3.14)

Figure 3.15: Statistics Dialog Box


Click on the Charts button (Figure 3.14):
- Select Histograms (Figure 3.16)
- Check With normal curve (to get an overlaid drawing of a normal curve)
- Continue, then OK

Figure 3.16: Charts Dialog Box

SPSS output will show a statistics table (Figure 3.17) and a histogram (Figure 3.18).

Figure 3.17: Frequencies Output

Statistics (Group IQ Test Score)
N Valid          88
N Missing        0
Mean             100.26
Median           100.00
Mode             95
Std. Deviation   12.985
Range            62
Minimum          75
Maximum          137

Figure 3.18: Frequencies Histogram (Group IQ Test Score; Std. Dev = 12.98, Mean = 100.3, N = 88.00)


Chapter 4
Data Management
4.1 Sort Your Data
Many times, it is helpful to sort your data set into a certain order. This makes it easier to extract meaning from your data. To do this, download a course file and click: Data > Sort Cases.
- Click once on the variable (column) you want sorted and arrow it over to the Variable box (Figure 4.1)
- Select Descending (5, 4, 3, 2, 1) or Ascending (1, 2, 3, 4, 5) for Sort Order
- OK

Figure 4.1: Sort Cases Dialog Box

Take a look at the variable (column) in your data set: it is sorted!

4.2 Correct Data Entry Errors

Change any number in your data set (click once in the cell and type in the new number). In essence, you are creating an outlier. Run the Frequencies analysis on your revised data set to see how the outlier skews your analysis (Figure 3.13): Analyze > Descriptive Statistics > Frequencies > OK.

4.3 Analyzing Linear Transformations of Your Data

Remember to change the number you revised in Section 4.2 back to the original number.

4.3.1 Linear Transformations of a Current Variable Using Addition

To increase the values of a variable using addition, click: Transform > Compute.
- In the Target Variable box, type in the name for your new variable (Figure 4.2)
- Click once on the variable to be increased and arrow it over to the Numeric Expression box
- After the variable, type in the amount of increase (e.g., +50)
- OK

Figure 4.2: Compute Variable Dialog Box

Scroll over in your Data View and take a look at the new column you have just defined. Run the Frequencies analysis on your new variable (Section 3.2.1): Analyze > Descriptive Statistics > Frequencies; click once on each variable you are interested in, arrow it over to the Variable(s) box, then OK.

4.3.2 Linear Transformations of a Current Variable Using Subtraction

To decrease the values of a variable using subtraction, click: Transform > Compute > Reset.
- In the Target Variable box, type in the name for your new variable
- Click once on the variable to be decreased and arrow it over to the Numeric Expression box
- After the variable, type in the amount of decrease (e.g., iq-30)
- OK
Scroll over in your Data View and take a look at the new column you have just defined. Run the Frequencies analysis on your new variable (Section 3.2.1).

4.3.3 Linear Transformations of a Current Variable Using Multiplication

To increase the values of a variable using multiplication, click: Transform > Compute > Reset.
- In the Target Variable box, type in the name for your new variable
- Click once on the variable to be increased and arrow it over to the Numeric Expression box
- After the variable, type in the factor of increase (e.g., iq*5)
- OK
Scroll over in your Data View and take a look at the new column you have just defined. Run the Frequencies analysis on your new variable (Section 3.2.1).

4.3.4 Linear Transformations of a Current Variable Using Division

To decrease the values of a variable using division, click: Transform > Compute > Reset.
- In the Target Variable box, type in the name for your new variable
- Click once on the variable to be decreased and arrow it over to the Numeric Expression box
- After the variable, type in the divisor (e.g., iq/3)
- OK
Scroll over in your Data View and take a look at the new column you have just defined. Run the Frequencies analysis on your new variable (Section 3.2.1).

4.3.5 Linear Transformations of a New Variable Using Subtraction and Division

To decrease the values of a variable using subtraction and division, click: Transform > Compute > Reset.
- In the Target Variable box, type in the name for your new variable
- Click once on the variable to be decreased and arrow it over to the Numeric Expression box
- After the variable, type in the compound decrease using parentheses, e.g., (iq-100)/13
- OK
Scroll over in your Data View and take a look at the new column you have just defined. Run the Frequencies analysis on your new variable (Section 3.2.1).
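All of the transformations in this section are linear (new = a*old + b), so they change the mean and standard deviation in a predictable way: adding a constant shifts the mean but not the SD, while dividing rescales both. A minimal Python sketch of the (iq-100)/13 expression, using invented IQ scores rather than the course data:

```python
import statistics

iq = [85, 92, 100, 104, 113, 126]  # hypothetical IQ scores

# Mirror the Numeric Expression (iq-100)/13: subtract 100, divide by 13.
ziq = [(x - 100) / 13 for x in iq]

# The new mean is (old mean - 100)/13; the new SD is the old SD / 13.
print(statistics.mean(iq), statistics.stdev(iq))
print(statistics.mean(ziq), statistics.stdev(ziq))
```

Running Frequencies before and after the transformation in SPSS should show exactly this pattern in the Mean and Std. Deviation rows.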


Chapter 5
Percentiles
5.1 Definition
A percentile is the point (not a percentage) below which a specified percentage of the observations fall. For example, if a student beats 91% of the class on an exam, he/she is at the 91st percentile (even though he/she may have earned an 86%).
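Quartile cut points can be checked by hand. A Python sketch on hypothetical exam scores; note that there are several common interpolation rules, so SPSS's default may differ slightly in small samples:

```python
import statistics

scores = [75, 81, 86, 88, 91, 95, 95, 100, 104, 110, 118, 127]  # hypothetical exam scores

# n=4 returns the three quartile cut points: the 25th, 50th, and 75th
# percentiles. method="inclusive" interpolates within the observed data.
q1, q2, q3 = statistics.quantiles(scores, n=4, method="inclusive")
print(q1, q2, q3)

# The 50th percentile is, by definition, the median.
assert q2 == statistics.median(scores)
```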

5.2 Computing Percentiles

We will be using the Frequencies analysis again, but now we will be adding percentile values to our analysis (Figure 3.15). Click: Analyze > Descriptive Statistics > Frequencies.
- Click and arrow over the variable you are testing (Figure 3.14)
- Uncheck Display frequency tables (save trees! we don't need this info)
- Choose your methods to analyze the spread of scores by clicking the Statistics button (Figure 5.1):
  - Central Tendency: check Mean
  - Dispersion: check Standard Deviation
  - Percentile Values: check Quartiles (this will give you the 25th, 50th, and 75th percentiles); check Percentiles, then add the 1st, 5th, 10th, 90th, 95th, and 99th percentiles by typing each number into the white box and clicking Add
- Continue

Figure 5.1: Statistics Dialog Box


Choose your graph by clicking on the Charts button (Figure 3.16): select Histogram, check With normal curve, then Continue and OK.

5.3 Select Cases

Select Cases allows SPSS to perform analyses on only certain cases. Cases can be selected by criteria you may want to consider in your study, such as gender, age, IQ, having ADHD or not, etc. Cases may be selected several ways. It is helpful to first go to the Variable View of your data set and double-check how you numerically defined the levels you want to consider (for gender, 1 = male and 2 = female), Figure 5.2.

Figure 5.2: Value Labels Dialog Box

5.3.1 Select Cases if a Defined Condition is Satisfied

To tell SPSS to analyze only the cases that match certain criteria, click: Data > Select Cases.
- Click the radio button If condition is satisfied (Figure 5.3)
- Click the If button
- Click and arrow over the variable you want to analyze (Figure 5.4)
- Type in an equation that defines the way(s) you want to limit your variable (e.g., dropout=0 or dropout=1)
- Continue, then OK


Figure 5.3: Select Cases Dialogue Box

Figure 5.4: Select Cases II Dialog Box

Run your percentile analysis (Section 5.2)
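Conceptually, Select Cases with a defined condition is row filtering: keep only the subjects whose value satisfies the condition. A minimal Python sketch of the same idea; the dropout variable and its values are hypothetical:

```python
# Each dictionary is one row (one subject), as in the SPSS Data View.
rows = [
    {"subject": 1, "dropout": 0, "iq": 104},
    {"subject": 2, "dropout": 1, "iq": 91},
    {"subject": 3, "dropout": 0, "iq": 118},
]

# Keep only the cases satisfying the condition dropout = 0,
# analogous to "If condition is satisfied" in Select Cases.
selected = [r for r in rows if r["dropout"] == 0]
print([r["subject"] for r in selected])
```

Any analysis run afterward sees only the selected rows, which is exactly what SPSS does with the unselected cases filtered out.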

5.3.2 Select a Random Sample of Cases

To have SPSS randomly select varying samples within your variable, click: Data > Select Cases.
- Click the radio button Random sample of cases (Figure 5.5)
- Click the Sample button
- Click the radio button Exactly (Figure 5.6)
- Define the exact number of cases by clicking in the first box (10), and the From the first ... cases value (88) by clicking in the second box
- Continue, then OK

Figure 5.5: Select Cases Dialog Box

Figure 5.6: Random Sample Dialog Box

Run your percentile analysis. Repeat Section 5.3.2 as many times as you want to obtain several random samples.
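Drawing "exactly 10 from the first 88 cases" is sampling without replacement. A Python sketch of the idea (the seed is only there to make the illustration reproducible; SPSS draws a fresh sample each time, which is why repeating the procedure gives different samples):

```python
import random

cases = list(range(1, 89))  # case numbers 1 through 88, as in the example

random.seed(320)            # fixed seed: reproducible for illustration only
sample = random.sample(cases, 10)  # exactly 10 distinct cases, no repeats
print(sorted(sample))
```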


Chapter 6
Examining Relationships Between Variables
6.1 Producing Scatterplots

6.1.1 Matrix Scatterplots

A matrix scatterplot is a collection of all possible scatterplots that can be produced to show the relationships between the variables in your study. To create a matrix scatterplot, open a course file and click: Graphs > Scatter > Matrix (Figure 3.11), then Define.
- Click and arrow over ALL the variables (Figure 6.1)
- OK

Figure 6.1: Scatterplot Matrix Dialog Box

6.1.2 Simple Scatterplots

A simple scatterplot shows the relationship between two variables. To create a simple scatterplot, click: Graphs > Scatter > Simple (Figure 3.11), then Define.
- Click and arrow over one variable to the Y-Axis box, and another variable to the X-Axis box (Figure 6.2). When running analyses, you should place your independent variable (input variable, variable that is independent of the subject, variable you manipulate) on the X-Axis, and your dependent variable (output variable, variable that is dependent on the subject, variable you measure) on the Y-Axis.
- If you want to look at differences in levels of a variable, move it over into Set Markers by so you can plot different shapes or colors for the levels (e.g., gender: males vs. females).
- OK

Figure 6.2: Simple Scatterplot Dialog Box

6.2 Analyzing Correlation Coefficients

The correlation coefficient, Pearson r, is a quantitative measure of the relationship between two interval or ratio variables, and the value it gives us falls between -1 and +1. The Pearson r tells us the direction of the relationship (either positive or negative) and indicates its strength (close to 0 means no relationship; close to -1 or +1 means a strong relationship). What we want to know about the variables is: when one increases in value, does the other? As a country's expenditure on health care increases, does its citizens' life expectancy also increase? To calculate a correlation coefficient, click: Analyze > Correlate > Bivariate.
- Click and arrow over the two variables from your first simple scatterplot from Section 6.1.2 (Figure 6.3)
- Make sure Pearson, Two-tailed, and Flag significant correlations are checked
- OK

Figure 6.3: Bivariate Correlations Dialog Box

The SPSS output will tell you several important things about the relationship between the two variables (Figure 6.4).

Figure 6.4: Correlation Coefficient Output

Correlations
                                      ADD Score   9th grade GPA
ADD Score       Pearson Correlation   1           -.615**
                Sig. (2-tailed)       .           .000
                N                     88          88
9th grade GPA   Pearson Correlation   -.615**     1
                Sig. (2-tailed)       .000        .
                N                     88          88
**. Correlation is significant at the 0.01 level (2-tailed).

First, you should look at whether or not the two variables are significantly related to each other (shown with arrows above). If your Sig. (significance) value is less than .05 (p < .05), then your two variables are significantly related to each other, and we reject the null hypothesis (which states the two are not related). In this case, p = .000, which is < .05, so we reject H0. *Report this as: There is a significant relationship between ADD scores and 9th grade GPA, p = .000 [be sure to use your p value]. Second, by looking at the Pearson Correlation value (circled above), you can tell whether the relationship between the two variables is negative or positive. The circled Pearson Correlation value in Figure 6.4, -.615, shows a negative relationship.
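Pearson r is the sum of cross-products of deviations, scaled by the two variables' spreads. A minimal Python sketch on made-up paired scores (e.g., hours studied vs. exam score; not the ADD/GPA data):

```python
import math

# Hypothetical paired observations.
x = [2, 4, 5, 7, 9]
y = [60, 65, 70, 78, 85]

n = len(x)
mx, my = sum(x) / n, sum(y) / n

# r = sum((x-mx)(y-my)) / sqrt(sum((x-mx)^2) * sum((y-my)^2))
num = sum((a - mx) * (b - my) for a, b in zip(x, y))
den = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
r = num / den
print(round(r, 3))
```

A positive r means the two variables rise together; a negative r (as with ADD scores and GPA above) means one falls as the other rises.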

Chapter 7
Predicting Outcomes
7.1 The Linear Regression Analysis
When we are interested in whether the relationship between one (or more) variables and an outcome is strong enough that we can predict the outcome, we use regression analysis. For example, does stress predict poor mental health? To run a regression in SPSS, download a course file and click: Analyze > Regression > Linear.
- Click and arrow over your dependent (Y) and independent (X) variables (Figure 7.1)
- OK

Figure 7.1: Linear Regression Dialog Box

Similar to the correlation analysis, this output will tell us several things. First, you should look at the ANOVA table to see whether the dependent (outcome) variable is significantly predicted by the independent (predictor) variable (Figure 7.2). Again, if your Sig. (significance) value is less than .05 (p < .05), your independent variable significantly predicts your dependent variable and you reject H0 (which states that there is zero prediction). The ANOVA table also reports the value of the F test. *Report this as: Brain size significantly predicts IQ, F (1, 38) = 5.573, p = .023 [use your numbers].

In this case, a person's brain size (number of pixels on MRI) significantly predicts their full scale IQ (FSIQ). In other cases, we may look at whether cell phone use predicts cancer, or whether successfully earning a college degree increases the chances of success in marriage.

Figure 7.2: Regression ANOVA Table

ANOVA(b)
Model 1            Sum of Squares   df   Mean Square   F       Sig.
Regression         2892.989         1    2892.989      5.573   .023(a)
Residual           19724.911        38   519.077
Total              22617.900        39
a. Predictors: (Constant), Number of pixels on MRI
b. Dependent Variable: FSIQ

Second, by looking at the Coefficients table (Figure 7.3), we can see whether the prediction of the dependent variable is negative (a decrease) or positive (an increase). Third, from the slope of the regression line (.0001192 in this case), we can tell that for a one-unit increase in a person's brain size (number of pixels on MRI) we predict a .0001192-point increase in full scale IQ.

Figure 7.3: Coefficients Table

Coefficients(a)
Model 1                   Unstandardized          Standardized
                          B           Std. Error  Beta    t       Sig.
(Constant)                5.168       46.008              .112    .911
Number of pixels on MRI   1.192E-04   .000        .358    2.361   .023
a. Dependent Variable: FSIQ

The B column holds the intercept (Constant row) and the slope. (E-04 means move the decimal point 4 places left, so 1.192E-04 = .0001192.)

Fourth, by looking at the Model Summary (Figure 7.4), we see the r² value and the standard error of the estimate, which tell us how much of the outcome our prediction captures and how variable the errors of prediction are. In this case, r² = .128, meaning that 12.8% of the variability in IQ is accounted for by the number of pixels on the MRI (brain size) and the other 87.2% is related to other factors.

Figure 7.4: Regression Model Summary

Model Summary
Model 1   R         R Square   Adjusted R Square   Std. Error of the Estimate
          .358(a)   .128       .105                22.7833
a. Predictors: (Constant), Number of pixels on MRI
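The slope, intercept, and r² above come from the least-squares formulas, which can be checked by hand. A minimal Python sketch on invented predictor/outcome pairs (not the MRI data set):

```python
# Hypothetical predictor (x) and outcome (y) values.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
mx, my = sum(x) / n, sum(y) / n

# Least-squares slope b = sum((x-mx)(y-my)) / sum((x-mx)^2); intercept a = my - b*mx.
sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
sxx = sum((a - mx) ** 2 for a in x)
slope = sxy / sxx
intercept = my - slope * mx

# r^2 = 1 - SS(residual)/SS(total): the proportion of variability in y
# accounted for by the regression line.
ss_tot = sum((b - my) ** 2 for b in y)
ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
r2 = 1 - ss_res / ss_tot

print(round(slope, 3), round(intercept, 3), round(r2, 3))
```

The slope plays the same role as the B value in Figure 7.3, and r2 matches the R Square column of the Model Summary.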


Chapter 8
Testing Hypotheses by Comparing Means
8.1 The One-Sample t Test
Here we are analyzing whether our sample comes from a particular population. For example, do the children being tested actually come from a population of ADHD children? We can also apply this logic to repeated measures on the same person to tell whether a treatment is effective: does family therapy intervention result in weight gain in anorexic teenage girls? We perform this analysis by opening a course file and clicking: Analyze > Compare Means > One-Sample T Test.
- Click and arrow over the variable you want to test (Figure 8.1)
- Type in your Test Value (μ)
- OK

Figure 8.1: One-Sample T Test Dialog Box

The output tables show several things (Figure 8.2):

Figure 8.2: One-Sample T Test Output

One-Sample Statistics
                      N    Mean     Std. Deviation   Std. Error Mean
Group IQ Test Score   88   100.26   12.985           1.384

One-Sample Test (Test Value = 100)
                                                            95% CI of the Difference
                      t      df   Sig. (2-tailed)   Mean Difference   Lower   Upper
Group IQ Test Score   .189   87   .851              .26               -2.49   3.01


This output shows: N = 88, Mean = 100.26, Standard Deviation = 12.985, t value (not T-score) = .189, df = 87, p = .851. Keep in mind the value of the mean (μ) you are testing (in this case, it was 100), Figure 8.1. We see that our significance value of .851 is NOT < .05, so we retain H0 and state that there is no significant difference between our sample mean and the population mean: our sample is consistent with a population whose mean is 100. The df (degrees of freedom) is obtained by starting with the number of observations we have, n = 88, and subtracting the number of parameters we estimate, 1 (a single mean).
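The t value above is just the mean difference divided by the standard error of the mean. A Python sketch using the rounded summary numbers from Figure 8.2 (small rounding differences from the SPSS output, which works from the raw scores, are expected):

```python
import math

# Summary statistics from the one-sample output above.
n = 88
mean = 100.26
sd = 12.985
mu = 100  # the Test Value

se = sd / math.sqrt(n)   # standard error of the mean
t = (mean - mu) / se     # one-sample t statistic
df = n - 1               # one parameter (the mean) is estimated

print(round(se, 3), round(t, 3), df)
```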

8.2 The Two Independent Samples t Test

Here we are no longer comparing scores from the same person; rather, we are comparing samples from two different (independent) groups. For example, we may want to study whether 12-year-old boys are more socially inept than 12-year-old girls. In this case, we would need a sample of boys and a second, independent sample of girls. Since we are no longer comparing one person to him/herself, we need to ensure that our two groups are similar enough (homogeneous enough) to allow a comparison. We must meet this assumption in order to use an Independent Samples t Test as our analysis. SPSS tests homogeneity between two independent groups with Levene's Test for equality of variances. One of the most common uses of the t test involves testing the difference between the means of two independent groups. In a memory study, we might want to compare levels of retention for a group of college students asked to recall a list of visually presented nouns and a group asked to recall a list of orally presented nouns. In SPSS we perform an Independent Samples t Test by opening a course file and clicking: Analyze > Compare Means > Independent-Samples T Test.
- Click and arrow over the Test Variable whose effect you want to study (Figure 8.3)
- Click and arrow over the Grouping Variable from which you will derive your two groups (Figure 8.3)

Figure 8.3: Independent-Samples T Test Dialog Box

Click Define Groups to define each of your groups (check your data set to see how you originally defined your groups), Figure 8.4. Then click Continue, and OK.

Figure 8.4: Define Groups


Your output tells you whether your two independent samples come from the same population (Figure 8.5).

Figure 8.5: Independent Samples T Test Output

Group Statistics
AROUSAL   Group Membership   N    Mean    Std. Deviation   Std. Error Mean
          Homophobic         29   24.07   10.060           1.868
          Nonhomophobic      35   14.29   11.498           1.944

Independent Samples Test (AROUSAL)
                              Levene's Test     t-test for Equality of Means
                                                                              Mean    Std. Error   95% CI of the Difference
                              F      Sig.    t       df       Sig. (2-tailed) Diff.   Diff.        Lower   Upper
Equal variances assumed       .144   .706    3.583   62       .001            9.78    2.730        4.326   15.241
Equal variances not assumed                  3.629   61.796   .001            9.78    2.696        4.394   15.172

First, we check the significance level of Levene's Test for equality of variances. Opposite of what we usually want to see in a significance level (p < .05), here we do NOT want the significance level to be < .001. Here, the significance is .706, so we use the top line of the output table, Equal variances assumed, for our analysis. (Were the significance actually < .001, we would use the SPSS-adjusted second line, Equal variances not assumed, for our analysis.)

Now that we have determined which line of our analysis we will be using, we look at whether or not the means of our two groups are equal. In this case our Sig. (2-tailed) is .001, which is < .05 and therefore significant. This tells us that the means of our two groups are significantly different from each other, so we reject H0 (which states that they are not different from each other: μ1 = μ2). We also see the degrees of freedom = 62 (our total observations, n = 64, minus the two parameters we estimate: the two group means), and the t value. Finally, to discover which group mean is greater than the other, we look at the first output table, the Group Statistics. Here we see that the mean for the Homophobic group, M = 24.07, is greater than the mean for the Nonhomophobic group, M = 14.29. *Report this as: Homophobic males (M = 24.07, SD = 10.060) had significantly higher arousal than nonhomophobic males (M = 14.29, SD = 11.498), t (62) = 3.583, p = .001.
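With equal variances assumed, the t statistic uses a pooled variance weighted by each group's degrees of freedom. A Python sketch reproducing the equal-variances line from the rounded summary statistics in Figure 8.5 (expect small rounding differences from SPSS, which works from the raw scores):

```python
import math

# Group summary statistics from the output above.
n1, m1, s1 = 29, 24.07, 10.060   # Homophobic
n2, m2, s2 = 35, 14.29, 11.498   # Nonhomophobic

# Pooled variance: the two sample variances, weighted by their df.
sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
se = math.sqrt(sp2 * (1 / n1 + 1 / n2))  # standard error of the difference
t = (m1 - m2) / se
df = n1 + n2 - 2  # two parameters (the two group means) are estimated

print(round(t, 3), df)
```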


8.3 One-Way Analysis of Variance (ANOVA)

This is the most popular analysis in psychological research for two main reasons. First, instead of comparing just two means (as with the Independent Samples t Test), ANOVA can compare any number of means to see if they differ. Second, factorial extensions of ANOVA can examine multiple independent variables at once: not only can we see an effect of each individual variable, but we can also see any interaction effects between the variables. The one-way ANOVA is performed by opening a course file and clicking: Analyze > Compare Means > One-Way ANOVA.
- Click and arrow over the Factor variable with the levels you want to compare (Figure 8.6)
- Click and arrow over the Dependent variable you want to study the effect on

Figure 8.6: One-Way ANOVA Dialog Box

Click the Options button (Figure 8.7):
- Check Descriptive
- Check Homogeneity-of-variance
- Check Means plot
- Leave Exclude cases analysis by analysis checked
- Continue, then OK


Figure 8.7: One-Way ANOVA Options Box

The Descriptives output appears first and answers questions such as, "Which group scored higher?" This is the first question you answer whenever you have a significant ANOVA analysis (F test), Figure 8.8.

Figure 8.8: SPSS ANOVA Output

Descriptives (Group IQ Test Score)
                                                    95% CI for Mean
               N    Mean     Std. Dev.  Std. Error  Lower    Upper    Min   Max
college prep   14   103.79   12.479     3.335       96.58    110.99   81    127
general        64   100.89   13.054     1.632       97.63    104.15   79    137
remedial       10   91.30    10.034     3.173       84.12    98.48    75    105
Total          88   100.26   12.985     1.384       97.51    103.01   75    137

Next, we see the test of homogeneity, Levene's test, which should NOT be significant at the .001 level (Figure 8.9).

Figure 8.9: SPSS ANOVA Output

Test of Homogeneity of Variances (Group IQ Test Score)
Levene Statistic   df1   df2   Sig.
.547               2     85    .581


Third, we see the ANOVA summary table (Figure 8.10). In this case, our p value is .049, which is significant, < .05 (although just barely).

Figure 8.10: SPSS ANOVA Output

ANOVA (Group IQ Test Score)
                 Sum of Squares   df   Mean Square   F       Sig.
Between Groups   1002.297         2    501.149       3.117   .049
Within Groups    13666.692        85   160.785
Total            14668.989        87

*Report this as: There is a significant difference in IQ test scores among the levels of English language skills a person has in the 9th grade, F (2, 85) = 3.117, p = .049. From this test, we only know that at least one group is different; we do not yet know which group(s). Our next question is, "Which group scored highest?" and we find this information in the Descriptives table (Figure 8.8) by looking at the means. *Report this as: The 9th grade group with the college prep level language skills had the highest IQ (M = 103.79, SD = 12.479), the group with the general level language skills had the next highest IQ (M = 100.89, SD = 13.054), and the group with the remedial level language skills had the lowest IQ (M = 91.30, SD = 10.034). The means plot is a pictorial rendering of the data in the Descriptives table.
[Means plot: Mean of Group IQ Test Score (roughly 90 to 106) by English Level 9th Grade (college prep, general, remedial).]
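The F value in Figure 8.10 comes from splitting the total variability into between-groups and within-groups sums of squares. A minimal Python sketch on invented scores for three groups (not the course data set):

```python
# Hypothetical scores for three groups.
groups = {
    "college prep": [104, 110, 98, 108],
    "general":      [100, 96, 102, 98],
    "remedial":     [90, 94, 88, 92],
}

all_scores = [x for g in groups.values() for x in g]
grand_mean = sum(all_scores) / len(all_scores)

# Between-groups SS: how far each group mean sits from the grand mean.
ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2
                 for g in groups.values())
# Within-groups SS: how far scores sit from their own group mean.
ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g)
                for g in groups.values())

df_between = len(groups) - 1              # k - 1 groups
df_within = len(all_scores) - len(groups)  # N - k
F = (ss_between / df_between) / (ss_within / df_within)
print(round(F, 2))
```

A large F means the group means differ by more than the within-group noise would suggest, which is exactly what the significant .049 result above indicates.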


8.4 General Linear Model (GLM)

The one-way ANOVA in the last section examined the effect of a single independent variable on one dependent variable. Now we will be analyzing mean differences on our dependent variable between and across two (or more) independent variables. We can choose from two analyses in order to look at these mean differences:
1. General Linear Model (GLM) Univariate Analysis: we use this method when our analysis has no repeated measures.
2. General Linear Model (GLM) Repeated Measures Analysis: we use this method when our analysis involves any repeated measures (any repeated tests).
Looking for a repeated measures variable is the first and key thing to check when deciding which analysis to use. For example, we may give an ADHD child a continuous performance test (CPT) and score how he or she performs. Then, we may administer a specific level of Ritalin to the child and have him or her re-take the same CPT an hour later. If you are not sure whether or not your data set contains any repeated measures, there is an easy way for you to tell. Think of each row in your Data View as one subject. If your Data View looks like Figure 8.11, you can see that each subject has a single score for each variable and is a member of only one group:

Figure 8.11: Data Set Containing Non-Repeated Measures Data


If, on the other hand, your Data View looks like this, you can see that each subject was measured several times, repeatedly:

Figure 8.12: Data Set Containing Repeated Measures Data
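The layout distinction can also be illustrated outside SPSS. This is a minimal sketch in Python with hypothetical variable names (day1, day2, day4), not part of the SPSS procedure: a repeated-measures ("wide") data set has one row per subject and one column per administration of the same test, and it can always be unstacked into one row per subject-per-occasion.

```python
# Hypothetical repeated-measures data in the "wide" layout of Figure 8.12:
# one row per subject, one column per administration of the same test.
wide = [
    {"subject": 1, "day1": 25, "day2": 23, "day4": 19},
    {"subject": 2, "day1": 25, "day2": 24, "day4": 20},
]

# The same data in a "long" layout: one row per subject-per-occasion.
# A non-repeated design (Figure 8.11) is already one row per subject.
long_rows = [
    {"subject": row["subject"], "occasion": col, "score": row[col]}
    for row in wide
    for col in ("day1", "day2", "day4")
]

print(len(long_rows))  # 2 subjects x 3 occasions = 6 rows
```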

The GLM Factorial Analysis is performed by opening a course file and clicking:

Analyze → General Linear Model → Univariate (Figure 8.13)
Arrow over the Dependent Variable (what you want to measure) into the Dependent Variable box (Figure 8.13)
Arrow over the Fixed Factor Variables (your independent variables, what you want to measure the effect of) into the Fixed Factor(s) box
Arrow over the Random Factor Variable (an independent variable that only some subjects received as a treatment) into the Random Factor(s) box
Arrow over the Covariate Variable (a variable whose effects you want to statistically control for) into the Covariate(s) box

Figure 8.13: GLM Univariate Dialog Box


Click on the Plots button (Figure 8.14)
Arrow your top variable over into the Horizontal Axis box
Arrow your next variable over into the Separate Lines box
Click on the Add button
Arrow your bottom variable over into the Horizontal Axis box
Arrow your top variable over into the Separate Lines box
Click on the Add button
[It is a good idea to create separate plots showing each of the contrasts you are measuring between each set of variables.]
Continue

Figure 8.14: GLM Univariate Profile Plots

Click on the Post Hoc button (Figure 8.15)
[Only independent variables with at least three levels will appear in the Factor(s) box.]
Arrow all of your variables over into the Post Hoc Tests for box
Click in the Scheffe box
[This is the most conservative and flexible adjustment we can make in order to avoid making a Type I error in our analysis.]
Continue


Figure 8.15: Univariate: Post Hoc Multiple Comparisons

Click on the Options button (Figure 8.16)
Click in the Descriptive statistics box; do not click anything else
Continue
OK

Figure 8.16: Univariate: Options


Your GLM output will tell you several things. First, we will look at whether our analysis showed there to be a significant relationship among the variables (Figure 8.17). Figure 8.17: Tests of Between-Subjects Effects
Tests of Between-Subjects Effects
Dependent Variable: SCORE

Source               Type III SS   df   Mean Square         F   Sig.
Corrected Model       12053.250a    3      4017.750    10.327   .001
Intercept             66822.250     1     66822.250   171.761   .000
SCREEN (*1)           10609.000     1     10609.000    27.270   .000
LIQUID (*2)            1024.000     1      1024.000     2.632   .131
SCREEN * LIQUID (*3)    420.250     1       420.250     1.080   .319
Error                  4668.500    12       389.042
Total                 83544.000    16
Corrected Total       16721.750    15

a. R Squared = .721 (Adjusted R Squared = .651)
[*1 is significant; *2 and *3 are both not significant.]
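The F ratios and R Squared in Figure 8.17 can be reproduced by hand from the Type III Sums of Squares and degrees of freedom, which is a useful check that you are reading the table correctly. A sketch in Python, with the values copied from the output above:

```python
# Each F is the effect's Mean Square divided by the error Mean Square
# (Mean Square = Sum of Squares / df).
ss = {"SCREEN": 10609.0, "LIQUID": 1024.0, "SCREEN*LIQUID": 420.25}
df = {"SCREEN": 1, "LIQUID": 1, "SCREEN*LIQUID": 1}
ms_error = 4668.5 / 12          # Error SS / Error df = 389.042

for effect in ss:
    f_ratio = (ss[effect] / df[effect]) / ms_error
    print(effect, round(f_ratio, 3))
# SCREEN 27.27, LIQUID 2.632, SCREEN*LIQUID 1.08

# R Squared = Corrected Model SS / Corrected Total SS
r_squared = 12053.25 / 16721.75
print(round(r_squared, 3))  # 0.721
```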

*1 This line tells us whether there is a significant difference in scores (our dependent variable) between the course and fine screens (our first independent variable), averaged across the concentration of liquid (our second independent variable). This is the main effect of screen.

*2 This line tells us whether there is a significant difference in scores between low and high concentrations of liquid, averaged across the screen type. This is the main effect of liquid.

*3 This line tells us whether the pattern of differences in scores between concentrations of liquid is different between the course and fine screens. This is the interaction effect of screen and liquid.

If one of our p values is significant, our next question is, "Which level of the significant variable(s) is higher?" To answer this question we look at the means in the Descriptive Statistics table (Figure 8.18). For this analysis, we see that at each concentration of liquid, the mean is highest for the fine screen condition.

Figure 8.18: Descriptive Statistics
Descriptive Statistics
Dependent Variable: SCORE

SCREEN   LIQUID       Mean   Std. Deviation    N
Course   Low       41.7500         25.55223    4
         High      36.0000         17.83255    4
         Total     38.8750         20.62895    8
Fine     Low      103.5000         18.91208    4
         High      77.2500         15.08587    4
         Total     90.3750         21.15884    8
Total    Low       72.6250         39.01991    8
         High      56.6250         26.83248    8
         Total     64.6250         33.38837   16


The GLM Repeated Measures Analysis is used when our study involves any repeated measures (any repeated tests). The Repeated Measures Analysis is performed by opening a course file and clicking:

Analyze → General Linear Model → Repeated Measures (Figure 8.19)
Type the name of your repeated measures variable into the top box
Type in the number of levels of the variable at which you are looking
Click Add
Define

Figure 8.19: Repeated Measures Define Factor(s)

In the Repeated Measures Dialog Box (Figure 8.20), you must choose which of your variables to arrow over into each level of your repeated measure. To do this, click on the first variable you want to arrow over, then click the little arrow button; repeat this until the appropriate variables have all been arrowed over. Figure 8.20: Repeated Measures Dialog Box


Click on the Options button
Click in the Descriptive Statistics box
Continue
OK

By looking at the Tests of Between-Subjects Effects table (Figure 8.21), we can see that we do have a significant effect on the means in this analysis.

Figure 8.21: Tests of Between-Subjects Effects
Tests of Between-Subjects Effects
Measure: MEASURE_1   Transformed Variable: Average

Source      Type III SS   df   Mean Square        F   Sig.
Intercept     36226.082    1     36226.082   85.781   .000
Error          6334.611   15       422.307
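The F in Figure 8.21 follows the same logic as before: the Intercept Mean Square divided by the Error Mean Square. A quick check in Python, using only the values from the table above:

```python
# F = Intercept Mean Square / Error Mean Square
ms_intercept = 36226.082 / 1    # Intercept SS / df
ms_error = 6334.611 / 15        # Error SS / df = 422.307

f_ratio = ms_intercept / ms_error
print(round(f_ratio, 3))  # 85.781
```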

To see in which direction the means were affected, we look at the Descriptive Statistics table (Figure 8.22). Figure 8.22: Descriptive Statistics
Descriptive Statistics

           Mean   Std. Deviation    N
DAY1     25.000           .0000    16
DAY2     23.938          1.8786    16
DAY4     19.000          6.6332    16
DAY5     15.562          7.5097    16
DAY6     13.813          7.7908    16
DAY7     12.625          7.4911    16
DAY8     11.313          7.5958    16
DAY10    10.313          7.2730    16
DAY11     9.500          7.0048    16
DAY12     8.375          6.2276    16
DAY13     7.813          5.8790    16
DAY14     7.500          5.8310    16
DAY15     6.813          5.5644    16
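Figures 8.21 and 8.22 are consistent with each other. Assuming the standard identity for the between-subjects intercept term in a repeated measures ANOVA (Intercept SS = N × k × grand mean², with N = 16 subjects and k = 13 days), recomputing it from the rounded day means in Figure 8.22 lands within rounding error of the 36226.082 reported in Figure 8.21. A sketch in Python:

```python
# Day means copied from Figure 8.22 (each based on N = 16 subjects).
means = [25.000, 23.938, 19.000, 15.562, 13.813, 12.625, 11.313,
         10.313, 9.500, 8.375, 7.813, 7.500, 6.813]

n_subjects = 16
k_levels = len(means)                # 13 measurement days
grand_mean = sum(means) / k_levels   # about 13.20

# Between-subjects Intercept SS = N * k * grand_mean^2
# (within rounding error of the 36226.082 reported in Figure 8.21)
ss_intercept = n_subjects * k_levels * grand_mean ** 2
print(ss_intercept)
```

The means also fall steadily from DAY1 to DAY15, which answers the question of direction the text poses.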

Use APA format for reporting the results of your analyses. You will need to use this format in PSY321, in all upper-division psychology courses, in your own research, and when you are published.
