
COMM 301: Empirical Research in Communication

Descriptive Statistics: Measures of Variability


Descriptive Statistics: Describing Data with Numbers - PART 2
Measures of Variability
Variability refers to how dispersed the data points in a distribution are, and how similar or
different each data point is from the other data points.
There are three common measures of variability: range, variance, and standard deviation.
Range
What is it?
The range is the distance or difference between the lowest score and the highest score.
How to find the range?
For any given set of scores, subtract the lowest score from the highest score.
For example, a data set has 10 data points (on a 9-point scale):
5, 4, 5, 5, 4, 5, 6, 4, 1, 9
The range would be: 9 - 1 = 8.
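The subtraction above can be sketched in a few lines of Python. The function name `score_range` is a hypothetical helper chosen for illustration; the data are the ten scores from the example.

```python
def score_range(values):
    """Return the highest score minus the lowest score."""
    return max(values) - min(values)

# The ten data points from the example (9-point scale).
scores = [5, 4, 5, 5, 4, 5, 6, 4, 1, 9]

print(score_range(scores))  # 9 - 1 = 8
```

Because only the two most extreme scores enter the calculation, changing the single score of 9 to 6 would shrink the range from 8 to 5 even though the other nine scores are unchanged.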
When is it used?
Type of question answered
What does this set of scores look like?
How variable are the scores in the distribution?
Type of data
Variables: One (1) continuous variable
Measurement levels: Interval, ratio
What do you need to know?
All the above, plus
the range is affected by extreme scores;
recognition of SPSS output.
How to report descriptive statistics?
See the following example.
Variance and Standard Deviation
What are they?
The variance is a measure of how much a set of data varies about its mean. Variance is a
key concept, and it lies at the heart of much of statistics.
Taking the standard deviation undoes the squaring that we did to find the variance. The
standard deviation is therefore a sort of "average distance" of each point from the
mean, and it is a very important concept for the normal distribution.
The standard deviation is the square root of the variance.
How to find the variance and standard deviation?
For a given population or universe of scores, the formula for variance is
\[
\text{variance} = \frac{(\text{first score} - \text{mean score})^2 + (\text{second score} - \text{mean score})^2 + \dots + (\text{last score} - \text{mean score})^2}{\text{number of scores}}
\]
If the set of scores is a sample, then the formula for variance is
\[
\text{variance} = \frac{(\text{first score} - \text{mean score})^2 + (\text{second score} - \text{mean score})^2 + \dots + (\text{last score} - \text{mean score})^2}{\text{number of scores} - 1}
\]
The standard deviation is the square root of the variance.
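As a minimal sketch of the two formulas, the Python below computes the variance with either divisor and then takes the square root. The function names and the `sample` flag are assumptions made for illustration, and the data are borrowed from the range example earlier in this handout rather than from any variance example in the source.

```python
import math

def variance(values, sample=False):
    """Sum of squared deviations from the mean, divided by N (population)
    or by N - 1 (sample)."""
    n = len(values)
    if sample and n < 2:
        raise ValueError("a sample variance needs at least two scores")
    mean = sum(values) / n
    squared_deviations = [(x - mean) ** 2 for x in values]
    divisor = n - 1 if sample else n
    return sum(squared_deviations) / divisor

def standard_deviation(values, sample=False):
    """Square root of the variance."""
    return math.sqrt(variance(values, sample))

# Scores reused from the range example (assumption: used here only as sample input).
scores = [5, 4, 5, 5, 4, 5, 6, 4, 1, 9]

print(round(variance(scores), 4))                      # population variance
print(round(variance(scores, sample=True), 4))         # sample variance
print(round(standard_deviation(scores, sample=True), 4))
```

For these ten scores the mean is 4.8 and the squared deviations sum to 35.6, so the population variance is 35.6 / 10 and the sample variance is 35.6 / 9. Dividing by one fewer score always makes the sample figure slightly larger.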
When are they used?
Type of question answered
What does this set of scores look like?
How variable are the scores in the distribution?
Type of data
Variables: One (1) continuous variable
Measurement levels: Interval, ratio
What do you need to know?
All the above, plus
variance and standard deviation are meaningful only for continuous variables,
measured with interval or ratio scales;
recognition of SPSS output.
How to report descriptive statistics?
See the following example.
Procedures in SPSS
Analyze > Descriptive Statistics > Frequencies …
Select variable.
Select the appropriate chart.
Click "Statistics".
Check "Mean", "Median", "Mode".
Check "Minimum", "Maximum", "Std. deviation", "Variance", "Range".
Click "Paste".
Go to the syntax file. Highlight the appropriate section, and click the Run (▶) button.
Result (example from lect13_4 data set; also show bar chart example)
Statistics
EXAMSCOR
N (Valid)                  20
N (Missing)                0
Mean                       79.2000
Median                     80.0000
Mode                       80.00
Std. Deviation             7.34560
Variance                   53.95789
Skewness                   .811
Std. Error of Skewness     .512
Kurtosis                   2.498
Std. Error of Kurtosis     .992
Range                      34.00
Minimum                    66.00
Maximum                    100.00
[Figure: histogram of EXAMSCOR. X axis: EXAMSCOR, 65.0 to 100.0; Y axis: Frequency. Std. Dev = 7.35, Mean = 79.2, N = 20]
Report
In this group of 20 students, the mean score was 79.20 (standard deviation = 7.35).
The range was 34, with the highest score = 100, and the lowest score = 66.
Describing Data with Pictures
There are several ways to describe data. One way is to use pictures. These techniques
include frequency distributions, histograms and bar graphs. We will focus on histograms and
bar graphs.
Histograms
What are they?
A histogram shows in picture form how many times a given score appears in a data set.
There are two main axes on the histogram. The horizontal axis (the X axis) is where the
scores are represented. Typically the scores are grouped into score intervals. Each score
interval is represented by one rectangular bar, and the mid-point of the score interval is
highlighted. The rectangular bars touch each other, because the data are continuous.
The vertical axis (the Y axis) indicates the frequency of those scores occurring.
A tall bar indicates a high frequency of occurrence, meaning that the score occurs many
times. A short bar indicates a low frequency of occurrence, meaning that the score occurs
few times.
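The grouping of scores into intervals can be sketched as follows. This is only an illustration under stated assumptions: the function `frequency_by_interval`, the starting point of 65, the interval width of 5, and the twelve scores are all invented for the example and do not come from the course data.

```python
import math

def frequency_by_interval(scores, start, width):
    """Count the scores falling in each interval [lower, lower + width),
    keyed by the interval's mid-point."""
    counts = {}
    for score in scores:
        index = math.floor((score - start) / width)
        midpoint = start + (index + 0.5) * width
        counts[midpoint] = counts.get(midpoint, 0) + 1
    return dict(sorted(counts.items()))

# Hypothetical scores, invented for illustration only.
scores = [68, 72, 75, 78, 80, 80, 82, 85, 85, 86, 88, 90]
table = frequency_by_interval(scores, start=65, width=5)

# A text version of the histogram: one row per interval, one '#' per score.
for midpoint, count in table.items():
    print(f"{midpoint:5.1f} | {'#' * count}")
```

Each row plays the role of one rectangular bar: the label is the mid-point of the score interval and the number of `#` marks is the frequency.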
Example
[Figure: histogram of raw score in course. X axis: Raw score in course, interval mid-points 67.5 to 90.0; Y axis: Frequency. Std. Dev = 5.20, Mean = 84.1, N = 20]
Report
The raw scores of the 20 students ranged from 66.89 to 91.01, with a mean raw score
of 84.1. A histogram depicting the continuum of scores reveals that the peak score interval is
between 83.75 and 88.75. The distribution of the scores is negatively skewed, with most of
the scores on the higher ranges.
Bar Charts
What are they?
A bar chart shows in picture form how many times a given category appears in a data set.
Bar charts are almost identical to histograms, except that they deal with categorical data,
rather than continuous data.
The horizontal axis (the X axis) is where the categories are represented. Each category is
represented by one bar. The vertical axis (the Y axis) indicates the frequency of membership
in a category. A tall bar indicates a high frequency of membership, meaning that the category
has many members. A short bar indicates a low frequency of membership, meaning that the
category has few members.
Example
[Figure: bar chart of raw grade for course. X axis: Raw grade for course (D, C, C-plus, B minus, B, B-plus, A minus); Y axis: Frequency]
Report
The above bar chart shows the distribution of the raw grades earned by the students.
One student earned a grade of A-; four students earned a grade of B+; ten students earned a
grade of B; two students earned a grade of B-; one student earned a grade of C+; one student
earned a grade of C; one student earned a grade of D.
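The tally behind such a bar chart can be sketched with Python's `collections.Counter`. The list of grades below is rebuilt from the counts in the report above (one entry per student); the variable names and the text rendering are assumptions made for illustration.

```python
from collections import Counter

# One entry per student, following the counts given in the report.
grades = (["A-"] * 1 + ["B+"] * 4 + ["B"] * 10 + ["B-"] * 2
          + ["C+"] * 1 + ["C"] * 1 + ["D"] * 1)

# Categories in the order they would appear along the horizontal axis.
grade_order = ["D", "C", "C+", "B-", "B", "B+", "A-"]

frequency = Counter(grades)

# A text version of the bar chart: one row per category, one '#' per student.
for grade in grade_order:
    print(f"{grade:>2} | {'#' * frequency[grade]}")
```

Unlike the histogram, no grouping into intervals is needed: each distinct grade is already its own category, so the rows stand apart rather than forming a continuous scale.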
Normal Distribution and Its Measures
What are they?
When a set of data is collected, the data forms a distribution, meaning that each data point
has a particular value along some dimension. Together the data points can form a visual
shape of the distribution, as depicted by the histogram and bar chart.
The shape of the distribution can be normal, looking like a bell. It means that most of the data
points are clustered near one set of middle scores, and that the data points gradually and
symmetrically decrease in frequency in both directions away from the middle.
Distributions can be non-normal. One of the ways distributions can be non-normal is by being
skewed. Skew refers to the asymmetry of a distribution's tails. If one of the distribution's tails
is stretched out at one end, and compressed at the other end, then the distribution is skewed.
Kurtosis is also used to check the normality of a data set (see the notes on skewness and
kurtosis that follow).
Cf. Standard normal distribution
Note on skewness
1) Skewness characterizes the degree of asymmetry of a distribution around its mean.
2) Positive skewness indicates a distribution with an asymmetric tail extending towards
more positive values (right skewed).
3) Negative skewness indicates a distribution with an asymmetric tail extending towards
more negative values (left skewed).
4) Most often, the median is used as a measure of central tendency when data sets are
skewed.
5) Normal distributions will have a skewness value of approximately zero.
6) Typically, the skewness value will range from negative 3 to positive 3.
7) As a rule of thumb, if skewness is beyond +/- 3 (more accurately, if the absolute
value of skewness is more than twice the standard error of skewness [ses]), consider
using the median rather than (or along with) the mean. But this is a rule of thumb. Use of
particular statistics is at the judgment of a researcher.
8) Example: sometimes, the existence of super rich people like Bill Gates makes the
distribution highly positively skewed, which makes the median a better choice for
describing the data than the mean.
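The "twice the standard error" rule of thumb can be sketched in Python as below. This assumes the adjusted sample-skewness formula and the standard-error formula commonly used by statistical packages; the function names are hypothetical, and the income figures are invented purely to mimic the one-very-rich-person example.

```python
import math
import statistics

def sample_skewness(values):
    """Adjusted sample skewness (assumed to be the form statistical packages report)."""
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
    return n / ((n - 1) * (n - 2)) * sum(((x - mean) / s) ** 3 for x in values)

def standard_error_of_skewness(n):
    """Standard error of skewness for a sample of n scores."""
    return math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))

def is_markedly_skewed(skewness, n):
    """Rule of thumb: |skewness| is more than twice its standard error."""
    return abs(skewness) > 2 * standard_error_of_skewness(n)

# Hypothetical incomes (in thousands), invented for illustration only.
incomes = [30, 35, 40, 45, 50, 1000]

print(statistics.mean(incomes), statistics.median(incomes))
print(is_markedly_skewed(sample_skewness(incomes), len(incomes)))
```

With one extreme income the mean is pulled far above every typical value while the median stays among them, which is why the median is the safer summary here. The standard error depends only on the sample size, so for 20 scores it works out to about .512.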
Note on kurtosis
1) Kurtosis characterizes the relative peakedness or flatness of a distribution (i.e., how
narrow or broad the distribution is) compared to the normal distribution.
2) Positive kurtosis indicates a relatively peaked distribution (or relatively wider tails
than the normal distribution).
3) Negative kurtosis indicates a relatively flat distribution (or relatively less tail than
the normal distribution).
4) Normal distributions produce a kurtosis statistic of about 0 (in fact, it is 3, but
kurtosis is standardized by subtracting 3, so it becomes 0).
5) Kurtosis values whose absolute value is more than twice the standard error of
kurtosis [sek], regardless of sign, probably differ from the normal distribution to a
significant degree.
6) Leptokurtic: very highly peaked (highly positive kurtosis score); Platykurtic:
somewhat flattened (negative kurtosis).
Cf. Show how kurtosis varies according to data manipulation (use sample normal data).
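A minimal sketch of the kurtosis check follows. It assumes the standard-error formulas commonly used by statistical packages, and it uses the simple moment-based excess kurtosis (fourth moment over squared variance, minus 3) rather than any package's bias-adjusted version; all function names are hypothetical.

```python
import math

def standard_error_of_skewness(n):
    """Standard error of skewness for a sample of n scores."""
    return math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))

def standard_error_of_kurtosis(n):
    """Standard error of kurtosis, derived from the standard error of skewness."""
    ses = standard_error_of_skewness(n)
    return 2 * ses * math.sqrt((n * n - 1) / ((n - 3) * (n + 5)))

def excess_kurtosis(values):
    """Moment-based kurtosis with 3 subtracted, so a normal curve scores about 0."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    return m4 / (m2 ** 2) - 3

def departs_from_normal(kurtosis, n):
    """Rule of thumb: |kurtosis| is more than twice its standard error."""
    return abs(kurtosis) > 2 * standard_error_of_kurtosis(n)

# Evenly spread scores give a flat (platykurtic) shape, so the value is negative.
print(excess_kurtosis([1, 2, 3, 4, 5]))
print(departs_from_normal(2.498, 20), departs_from_normal(0.001, 569))
```

As with skewness, the standard error depends only on the number of scores: it is about .992 for 20 scores and about .204 for 569 scores, so the same kurtosis value can be notable in a large sample and unremarkable in a small one.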
Example of Normal distribution (lecture 14_2 dataset)
Statistics
VAR00001
N (Valid)                  569
N (Missing)                0
Mean                       4.9982
Std. Deviation             1.72849
Variance                   2.98767
Skewness                   .003
Std. Error of Skewness     .102
Kurtosis                   .001
Std. Error of Kurtosis     .204
[Figure: histogram of VAR00001. X axis: VAR00001, 1.0 to 9.0; Y axis: Frequency. Std. Dev = 1.73, Mean = 5.0, N = 569]
Report
Analysis of the distribution shows that the data are almost normally distributed (skewness =
.003; kurtosis = .001).
Example of Non-normal distribution
Statistics
EXAMSCOR
N (Valid)                  142
Mean                       96.9366
Std. Deviation             5.49431
Variance                   30.18744
Skewness                   -3.108
Std. Error of Skewness     .203
[Figure: histogram of EXAMSCOR. X axis: EXAMSCOR, 70.0 to 100.0; Y axis: Frequency. Std. Dev = 5.49, Mean = 96.9, N = 142]
Report
Analysis of the distribution shows that the data are highly negatively skewed (skewness =
-3.11). This suggests the non-normality of the data distribution.