Chapter 12
Psyc 201 — Research Methods in Psychology — Spring 2019
Lecture Outline
‣ Describing Single Variables
‣ Describing Statistical Relationships
‣ Expressing Your Results
‣ Conducting Your Analyses
Statistics
‣ Why do we need statistics?
‣ Statistics help us organize our world and make sense of it!
‣ It is much easier to understand our natural world if we can
objectively interpret what we are seeing.
‣ Statistics can be broken into two broad categories
‣ Descriptive Statistics help us summarize and display our data
‣ Inferential Statistics help us understand if our results are “statistically
significant”; that is, are our results really different?
Describing Data
‣ What can we say about the following data?
22 22 15 19 24 20 22 18 21 21
21 20 18 16 21 23 19 20 22 22
23 23 24 22 23 22 22 21 19 24
21 21 21 22 22 23 20 20 18 16
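One way to start describing these 40 scores is to count how often each value occurs. The sketch below builds a simple frequency table (score, frequency, relative frequency) in Python; the function name `frequency_table` is made up for illustration.

```python
from collections import Counter

# The 40 scores listed above.
scores = [
    22, 22, 15, 19, 24, 20, 22, 18, 21, 21,
    21, 20, 18, 16, 21, 23, 19, 20, 22, 22,
    23, 23, 24, 22, 23, 22, 22, 21, 19, 24,
    21, 21, 21, 22, 22, 23, 20, 20, 18, 16,
]

def frequency_table(values):
    """Return (score, frequency, relative frequency) rows, highest score first."""
    counts = Counter(values)
    n = len(values)
    rows = []
    # Walk every integer score in the observed range so that
    # scores with a frequency of 0 (here, 17) still get a row.
    for score in range(max(values), min(values) - 1, -1):
        freq = counts.get(score, 0)
        rows.append((score, freq, freq / n))
    return rows

table = frequency_table(scores)
for score, freq, rel in table:
    print(f"{score:>5} {freq:>5} {rel:>8.3f}")
```

The printed rows can be compared against the frequencies in Table 12.3.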
‣ Sometimes the data do not suit a simple frequency table; i.e., there are too many
0 frequency values.
149 155 170 170 170 172 181 182 183 188
191 194 196 198 198 201 210 222 229 244
‣ In this case, we can create grouped intervals for the frequency table.
‣ Intervals always have equal width, and always start on a multiple of the width.
Interval Freq. Rel. Freq.
240 - 259 1 .050
220 - 239 2 .100
200 - 219 2 .100
180 - 199 9 .450
160 - 179 4 .200
140 - 159 2 .100
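The grouping rule (equal-width intervals that start on a multiple of the width) can be sketched in a few lines of Python. The function name and the choice of `width=20` are illustrative assumptions that match the intervals shown for this data set.

```python
from collections import Counter

values = [149, 155, 170, 170, 170, 172, 181, 182, 183, 188,
          191, 194, 196, 198, 198, 201, 210, 222, 229, 244]

def grouped_frequency_table(data, width):
    """Group integer data into equal-width intervals starting on a multiple of width."""
    # Integer division maps each value to the lower bound of its interval.
    counts = Counter((x // width) * width for x in data)
    lowest = (min(data) // width) * width
    highest = (max(data) // width) * width
    n = len(data)
    rows = []
    for start in range(highest, lowest - 1, -width):
        freq = counts.get(start, 0)
        rows.append((start, start + width - 1, freq, freq / n))
    return rows

grouped = grouped_frequency_table(values, width=20)
for low, high, freq, rel in grouped:
    print(f"{low} - {high} {freq:>3} {rel:.3f}")
```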
‣ Plotting the frequency of each score allows us to easily see the
shape of the distribution.
Outliers
‣ An outlier is defined as an extreme score that is much
higher or lower than the rest of the scores in the distribution.
‣ Not all extreme scores are outliers… some may be an
important part of our data, such as looking at the ages of
students in a class.
‣ Other times, outliers can be scores that are due to errors,
misunderstandings, equipment failures, etc.
‣ It is important to deal with these types of outliers as they may interfere
with the analysis of our data.
‣ We can use our Frequency Table to …
‣ In a symmetric, unimodal distribution, the mode, median, and mean will
all be equal.
‣ If you have a skewed distribution, the mean will be pulled toward the
skew, and the median will be between the mean and the mode.
Figure: Negatively skewed distribution, with the mode, median, and mean marked.
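As a quick check of the skew rule, the sketch below computes the three measures of central tendency for the 40 self-esteem scores from earlier in the lecture, using Python's standard `statistics` module. The data are rebuilt here from the frequency counts so the block runs on its own.

```python
from statistics import mean, median, mode

# Self-esteem scores rebuilt from their frequencies (score: count).
frequencies = {24: 3, 23: 5, 22: 10, 21: 8, 20: 5, 19: 3, 18: 3, 16: 2, 15: 1}
scores = [score for score, count in frequencies.items() for _ in range(count)]

m = mean(scores)       # arithmetic average
mdn = median(scores)   # middle value of the sorted scores
mo = mode(scores)      # most frequent score

print(f"mean = {m:.3f}, median = {mdn}, mode = {mo}")
```

For these data the mean falls below the median, which falls below the mode. That ordering is what the rule above predicts for a negatively skewed distribution.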
Measures of Variability
‣ Although measures of central tendency are useful, they do not tell us
how spread out the scores are.
‣ Two distributions can have the same mode, median, and mean, yet one
shows much more variability than the other.
Figure 12.3. Hypothetical distributions with the
same central tendencies, but with low variability
(top) and high variability (bottom)
Restrictions on Measures of Variability
‣ There are many different measurements of variability,
including the range, median absolute deviation (MAD), and
the standard deviation (SD)
‣ Just like measures of central tendency, however, the scale of
measurement of your data dictates which measure of
variability you can use.
‣ Nominal data has no measure of variability.
‣ Ordinal data can use the range and the MAD.
‣ Interval and Ratio data can use the range, MAD, and SD
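The three measures named above can be computed as follows for the 40 self-esteem scores. This is a minimal sketch: the helper names are made up, and the sample standard deviation (dividing by N − 1, as `statistics.stdev` does) is an assumption, since the slides do not say which version is intended.

```python
from statistics import median, stdev

# Self-esteem scores rebuilt from their frequencies (score: count).
frequencies = {24: 3, 23: 5, 22: 10, 21: 8, 20: 5, 19: 3, 18: 3, 16: 2, 15: 1}
scores = [score for score, count in frequencies.items() for _ in range(count)]

def value_range(values):
    """Range: highest score minus lowest score."""
    return max(values) - min(values)

def median_absolute_deviation(values):
    """MAD: median of the absolute distances from the median."""
    center = median(values)
    return median(abs(v - center) for v in values)

rng = value_range(scores)
mad = median_absolute_deviation(scores)
sd = stdev(scores)  # sample SD (N - 1 in the denominator); an assumption

print(f"range = {rng}, MAD = {mad}, SD = {sd:.2f}")
```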
Measures of Location
‣ Finally, descriptive statistics allow us to determine the
location of a value within a distribution.
‣ Two measures of location include the percentile rank and the
z-score.
‣ The percentile rank can be used for ordinal, interval, and ratio
data.
‣ The z-score can only be used for interval and ratio data.
Percentile Rank
‣ The precise definition of percentile rank (PR) is a bit contentious.
‣ We will use the definition of the percentage of
values that are at or below a specific value.
‣ The easiest way to compute the percentile rank is to expand our
frequency table just a bit… we are going to include the cumulative
frequency (C.F.) and the cumulative relative frequency (C.R.F.).
‣ The percentile rank is just the cumulative relative frequency (C.R.F.)
on the same line as our value, expressed as a percentage; e.g., PR(23) = 92.5
Table 12.3. Frequency Distribution of Scores on the Rosenberg
Self-Esteem Scale Including Cumulative Frequencies
Self-Esteem Freq. Rel. Freq. C.F. C.R.F.
24 3 .075 40 1.000
23 5 .125 37 .925
22 10 .250 32 .800
21 8 .200 22 .550
20 5 .125 14 .350
19 3 .075 9 .225
18 3 .075 6 .150
17 0 .000 3 .075
16 2 .050 3 .075
15 1 .025 1 .025
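Using the "at or below" definition adopted here, the percentile rank can be computed directly from the raw scores. The sketch below is one possible implementation; the function name `percentile_rank` is illustrative.

```python
# Self-esteem scores rebuilt from the frequencies in Table 12.3 (score: count).
frequencies = {24: 3, 23: 5, 22: 10, 21: 8, 20: 5, 19: 3, 18: 3, 16: 2, 15: 1}
scores = [score for score, count in frequencies.items() for _ in range(count)]

def percentile_rank(values, x):
    """Percentage of values that are at or below x."""
    at_or_below = sum(1 for v in values if v <= x)  # cumulative frequency
    return 100 * at_or_below / len(values)

pr_23 = percentile_rank(scores, 23)
print(f"PR(23) = {pr_23}")
```

The count of values at or below 23 is the cumulative frequency (37), and dividing by N = 40 gives the cumulative relative frequency of .925.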
Standard Normal Distribution Illustrated
‣ If our data is normally distributed, then our z-scores follow a
very specific pattern that can be used for inferential testing
and other statistics.
‣ The area under the curve between the z-score values of 0.0 and +1.0 in
a normal distribution is always 0.3413 of the total area.
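The z-score formula itself does not appear in this part of the text, so the sketch below assumes the standard definition, z = (X − M) / SD, with the sample standard deviation. It also checks the 0.3413 area using `statistics.NormalDist` from Python's standard library.

```python
from statistics import NormalDist, fmean, stdev

# Self-esteem scores rebuilt from the frequencies in Table 12.3 (score: count).
frequencies = {24: 3, 23: 5, 22: 10, 21: 8, 20: 5, 19: 3, 18: 3, 16: 2, 15: 1}
scores = [score for score, count in frequencies.items() for _ in range(count)]

def z_score(x, mean, sd):
    """Distance of x from the mean, in standard deviation units (assumed definition)."""
    return (x - mean) / sd

m = fmean(scores)
sd = stdev(scores)  # sample SD; using N - 1 is an assumption
z_24 = z_score(24, m, sd)

# Area under the standard normal curve between z = 0.0 and z = +1.0.
standard_normal = NormalDist(mu=0.0, sigma=1.0)
area = standard_normal.cdf(1.0) - standard_normal.cdf(0.0)

print(f"z(24) = {z_24:.2f}, area between z = 0 and z = +1: {area:.4f}")
```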
Condition Mean SD
Education 4.83 1.52
Exposure 3.47 1.77
Control 5.56 1.21
Figure 12.4. Mean fear ratings following treatment (y-axis: Fear Rating;
x-axis: Type of Treatment, with bars for Education, Exposure, and Control)
Effect Size
‣ Another important measurement of the difference between
groups is what is called effect size.
‣ Similar to a z-score, effect size expresses the distance
between group means in units of standard deviation.
‣ The most widely used measure of effect size is Cohen’s d
\[ d = \frac{M_1 - M_2}{SD} \]
‣ Technically, it is the difference between populations, as
estimated by the samples.
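As a worked example, the sketch below applies Cohen's d to the Education and Exposure conditions from the fear-rating table. The slides do not say which SD goes in the denominator; this sketch assumes a pooled SD with equal group sizes, which is one common choice, so treat the result as illustrative only.

```python
from math import sqrt

def pooled_sd_equal_n(sd1, sd2):
    """Pooled SD for two groups of equal size (an assumption, not from the slides)."""
    return sqrt((sd1 ** 2 + sd2 ** 2) / 2)

def cohens_d(m1, m2, sd):
    """Difference between two means in standard deviation units."""
    return (m1 - m2) / sd

# Means and SDs from the fear-rating table.
education_mean, education_sd = 4.83, 1.52
exposure_mean, exposure_sd = 3.47, 1.77

sd_pooled = pooled_sd_equal_n(education_sd, exposure_sd)
d = cohens_d(education_mean, exposure_mean, sd_pooled)

print(f"pooled SD = {sd_pooled:.2f}, d = {d:.2f}")
```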
Presenting Relationships
‣ Typically, correlations or relationships are plotted either using a line
graph (like the previous graph) or by using a scatter plot.
‣ Line graphs are used when we can organize our data into a small
number of distinct values (such as quartiles)
‣ Scatterplots are used when there are a large number of values on the
x-axis.
Correlation Example
Table columns: X, zX, Y, zY, zX × zY
MX = 4, MY = 40, r = 0.53
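The column layout of this example (X, zX, Y, zY, zX × zY) suggests computing Pearson's r as the mean of the z-score cross-products. The sketch below follows that approach. The data are hypothetical, because the rows of the original table are not available here, and the population SD (dividing by N) is an assumption that makes the mean of the products equal r.

```python
from statistics import fmean, pstdev

def z_scores(values):
    """Convert raw scores to z-scores using the population SD (divide by N)."""
    m = fmean(values)
    sd = pstdev(values)
    return [(v - m) / sd for v in values]

def pearson_r(xs, ys):
    """Pearson's r as the mean of the products of paired z-scores."""
    products = [zx * zy for zx, zy in zip(z_scores(xs), z_scores(ys))]
    return fmean(products)

# Hypothetical paired scores, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
print(f"r = {r:.2f}")
```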
APA Format
‣ When publishing results, Psychology tends to follow the
American Psychological Association (APA) format for text,
tables, and figures.
‣ If possible, one should try to present results
‣ within the text first (cheapest option), but only good for small amounts of
data
‣ within a table next, for moderate amounts of data, and if there are no
interactions within the data
‣ within a figure last… usually reserved for large amounts of data or if there
is an interaction in the data.
Correlation Matrix
Bar Graphs
‣ Bar graphs are used when the IV is either nominal or ordinal data.
‣ If the IV is nominal, the order of the conditions is somewhat arbitrary.
Scatterplots
‣ Scatterplots are used to plot correlations (the relationship between
two variables).
‣ Each point represents an individual score, as opposed to a mean.
‣ The dotted line in this graph indicates the regression line, which is
the line of best fit.
Prepare Your Data
‣ There are several stages to your data analysis.
‣ First, make sure that your data is anonymized prior to any analyses.
‣ Back-up your data!
‣ Check your raw data for consistency… Are you missing scores? Do you
have obvious errors? Do you need to exclude a participant?
‣ Finally, create your processed data file.
‣ This is typically in a specific format that can be read into a statistical analysis
package such as Excel, SPSS, or R.
If you exclude any data, do not throw them away or delete them because you or another researcher might want to see
them later. Instead, set them aside and keep notes about why you decided to exclude them because you will need
to report this information.
Now you are ready to enter your data in a spreadsheet program or, if it is already in a computer file, to format
it for analysis. You can use a general spreadsheet program like Microsoft Excel or a statistical analysis program
like SPSS to create your data file. (Data files created in one program can usually be converted to work with other
programs.) The most common format is for each row to represent a participant and for each column to represent
a variable (with the variable name at the top of each column). A sample data file is shown in Table 12.6. The first
column contains participant identification numbers. This is followed by columns containing demographic
information (sex and age), independent variables (mood, four self-esteem items, and the total of the four self-esteem
items), and finally dependent variables (intentions and attitudes). Categorical variables can usually be entered as
category labels (e.g., “M” and “F” for male and female) or as numbers (e.g., “0” for negative mood and “1” for
positive mood). Although category labels are often clearer, some analyses might require numbers. SPSS allows
you to enter numbers but also attach a category label to each number.
Sample Data File
Table 12.6 Sample Data File
ID SEX AGE MOOD SE1 SE2 SE3 SE4 TOTAL INT ATT
1 M 20 1 2 3 2 3 10 6 5
2 F 22 1 1 0 2 1 4 4 4
3 F 19 0 2 2 2 2 8 2 3
4 F 24 0 3 3 2 3 11 5 6
If you have multiple-response measures (such as the self-esteem measure in Table 12.6), you could combine the
items by hand and then enter the total score in your spreadsheet. However, it is much better to enter each response
as a separate variable in the spreadsheet, as with the self-esteem measure in Table 12.6, and use the software to
combine them (e.g., using the “AVERAGE” function in Excel or the “Compute” function in SPSS). Not only is
this approach more accurate, but it allows you to detect and correct errors, to assess internal consistency, and to
analyze individual responses if you decide to do so later.
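The same idea applies in any software. As a minimal sketch, the Python below reads the rows of Table 12.6 from an in-memory CSV (without the TOTAL column) and lets the program combine the four self-esteem items. The CSV text and the variable names are illustrative assumptions, not part of the original data file.

```python
import csv
import io

# Rows of Table 12.6, with the TOTAL column left out so software can compute it.
RAW_DATA = """ID,SEX,AGE,MOOD,SE1,SE2,SE3,SE4,INT,ATT
1,M,20,1,2,3,2,3,6,5
2,F,22,1,1,0,2,1,4,4
3,F,19,0,2,2,2,2,2,3
4,F,24,0,3,3,2,3,5,6
"""

ITEM_COLUMNS = ["SE1", "SE2", "SE3", "SE4"]

participants = list(csv.DictReader(io.StringIO(RAW_DATA)))
for row in participants:
    items = [int(row[name]) for name in ITEM_COLUMNS]
    row["TOTAL"] = sum(items)                # combined score, as in Table 12.6
    row["AVERAGE"] = sum(items) / len(items)  # alternative way to combine items

totals = [row["TOTAL"] for row in participants]
print(totals)
```

Because each item is kept as its own column, a data-entry error in a single response can still be found and fixed after the totals are computed.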
Preliminary Analyses
‣ It is very important to look at your data using preliminary data
analyses prior to doing any inferential analyses.
‣ First, you should always graph your data… this will tell you if
there is something odd, such as a multimodal distribution.
‣ Then, you can look for internal consistency of your data using
Cronbach’s α or Cohen’s κ.
‣ Finally, compute your descriptive statistics on the individual
conditions, and identify any outliers… and then decide what to
do with them.
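To make the internal-consistency step concrete, the sketch below computes Cronbach's α for the four self-esteem items in Table 12.6. The formula is not given in the lecture; this uses the standard definition, α = k/(k − 1) × (1 − Σ item variances / variance of the total scores), with sample variances. Four participants are far too few for a meaningful estimate, so the result only illustrates the arithmetic.

```python
from statistics import variance

# Self-esteem item responses from Table 12.6 (one list per item, one entry per participant).
items = {
    "SE1": [2, 1, 2, 3],
    "SE2": [3, 0, 2, 3],
    "SE3": [2, 2, 2, 2],
    "SE4": [3, 1, 2, 3],
}

def cronbach_alpha(item_scores):
    """Cronbach's alpha from a dict of item name -> list of participant scores."""
    columns = list(item_scores.values())
    k = len(columns)
    # Total score for each participant (sum across items).
    totals = [sum(responses) for responses in zip(*columns)]
    sum_item_variances = sum(variance(col) for col in columns)
    return (k / (k - 1)) * (1 - sum_item_variances / variance(totals))

alpha = cronbach_alpha(items)
print(f"alpha = {alpha:.2f}")
```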