
Descriptive Statistics

Chapter Ten
Dr Nek Kamal Yeop Yunus
Faculty of Business & Economics
Sultan Idris Education University


Statistics vs. Parameters


A parameter is a characteristic of a population.
It is a numerical or graphic way to summarize data obtained from the
population.

A statistic is a characteristic of a sample.
It is a numerical or graphic way to summarize data obtained from a
sample.

Types of Numerical Data


There are two fundamental types of numerical data:
1) Categorical data: obtained by determining the frequency of
   occurrences in each of several categories
2) Quantitative data: obtained by determining placement on a scale
   that indicates amount or degree

Techniques for Summarizing Quantitative Data
Frequency Distributions
Histograms/Stem and Leaf Plots
Distribution curves
Averages/Spread
Variability/Correlations

Frequency Polygons
A frequency distribution places data in order by listing scores from
high to low, together with how often each score occurs (Table 10.1)
Grouping the scores into intervals results in a grouped frequency
distribution (Table 10.2)
Since a table is not very visual, a graphical display called a
frequency polygon can help (Figure 10.1)
Frequency polygons can be positively or negatively skewed
(Figures 10.2 and 10.3)
They can be useful in comparing two or more groups (Figure 10.4)

Example of a Frequency Distribution (Table 10.1)

Raw Score    Frequency
64           2
63           1
61           2
59           2
56           2
52           1
51           2
38           4
36           3
34           5
31           5
29           5
27           5
25           1
24           2
21           2
17           2
15           1
6            2
3            1
             n = 50

Technically, the table should include all scores, including those for which there
are zero frequencies. We have eliminated those to simplify the presentation.

Example of a Grouped Frequency Distribution (Table 10.2)

Raw Score
(Intervals of Five)    Frequency
64                     2
63                     1
61                     2
59                     2
56                     2
52                     1
51                     2
38                     4
36                     3
34                     5
31                     5
29                     5
27                     5
25                     1
24                     2
21                     2
17                     2
15                     1
6                      2
3                      1
                       n = 50
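
As a rough illustration of grouping, the sketch below bins the raw scores from Table 10.1 into intervals of five. The interval boundaries (0-4, 5-9, and so on) and the function name `group_frequencies` are assumptions made for this example, not taken from the slides, and intervals with zero frequency are simply left out.

```python
from collections import Counter

# Raw scores and their frequencies, as listed in Table 10.1.
table_10_1 = {
    64: 2, 63: 1, 61: 2, 59: 2, 56: 2, 52: 1, 51: 2, 38: 4, 36: 3, 34: 5,
    31: 5, 29: 5, 27: 5, 25: 1, 24: 2, 21: 2, 17: 2, 15: 1, 6: 2, 3: 1,
}


def group_frequencies(freqs, width=5):
    """Sum frequencies into intervals of `width`, assumed to start at 0."""
    grouped = Counter()
    for score, count in freqs.items():
        low = (score // width) * width          # lower bound of the interval
        grouped[(low, low + width - 1)] += count
    return dict(grouped)


grouped = group_frequencies(table_10_1)

# Print intervals from high to low, matching the ordering used in the tables.
for low, high in sorted(grouped, reverse=True):
    print(f"{low:>2}-{high:<2}  {grouped[(low, high)]}")
```

The grouped frequencies still add up to n = 50; only the level of detail changes.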

Example of a Frequency Polygon (Figure 10.1)

Example of a Positively Skewed Polygon (Figure 10.2)

Example of a Negatively Skewed Polygon (Figure 10.3)

Two Frequency Polygons Compared (Figure 10.4)

Histograms and Stem-and-Leaf Plots
A histogram is a bar graph used to display quantitative data at the
interval or ratio level of measurement (Table 10.2)
A stem-and-leaf plot (stemplot) looks like a histogram, except that
instead of bars it shows the actual values in each category
Stem-and-leaf plots are helpful for comparing and contrasting
two distributions (Table 10.1)
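
A minimal sketch of how a stem-and-leaf plot could be built, using the highest scores from Table 10.1 (each repeated according to its frequency). The function name `stem_and_leaf` and the choice of the tens digit as the stem are assumptions for illustration.

```python
from collections import defaultdict


def stem_and_leaf(scores):
    """Map each stem (tens digit) to its sorted leaves (ones digits)."""
    plot = defaultdict(list)
    for score in sorted(scores):
        plot[score // 10].append(score % 10)
    return dict(plot)


# Scores of 51 and above from Table 10.1, repeated by frequency.
scores = [64, 64, 63, 61, 61, 59, 59, 56, 56, 52, 51, 51]
plot = stem_and_leaf(scores)

# One row per stem, highest stem first.
for stem in sorted(plot, reverse=True):
    leaves = " ".join(str(leaf) for leaf in plot[stem])
    print(f"{stem} | {leaves}")
```

Each row has the same length as the corresponding histogram bar would have, but the individual values remain readable.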

Histogram of Data in Table 10.2 (Figure 10.5)

The Normal Curve
A distribution curve shows a generalized, smoothed distribution of
scores, in contrast to the straight line segments of a frequency polygon
The distribution of data often tends to follow a specific shape called a
normal distribution (see Figure 10.6)
This distribution is bell shaped, and the following averages can be
plotted on it:
Mean
Median
Mode

*These measures of central tendency enable one to summarize the data in
a frequency distribution with a single number

The Normal Curve (Figure 10.6)

Example of the Mode, Median and Mean in a Distribution (Table 10.3)

Raw Score    Frequency
98           1
97           1
91           2
85           1
80           5
77           7
72           5
65           3
64           7
62           10
58           3
45           2
33           1
11           1
5            1
             n = 50

Mode = 62; median = 64.5; mean = 66.7
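
The three averages reported for Table 10.3 can be checked with a short sketch like the following, which expands the frequency table into individual scores and uses Python's standard `statistics` module. The variable names are illustrative only.

```python
from statistics import mean, median, mode

# Raw scores and frequencies from Table 10.3.
table_10_3 = {
    98: 1, 97: 1, 91: 2, 85: 1, 80: 5, 77: 7, 72: 5, 65: 3,
    64: 7, 62: 10, 58: 3, 45: 2, 33: 1, 11: 1, 5: 1,
}

# Expand the table into one entry per observation (n = 50).
scores = [score for score, freq in table_10_3.items() for _ in range(freq)]

score_mode = mode(scores)      # most frequent score
score_median = median(scores)  # midpoint of the ordered scores
score_mean = mean(scores)      # arithmetic average

print(f"mode = {score_mode}; median = {score_median}; mean = {score_mean:.1f}")
```

Because n is even, the median is the average of the 25th and 26th ordered scores (64 and 65).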

Averages Can Be Misleading (Figure 10.7)

Different Distributions Compared (Figure 10.8)

Variability
Variability refers to the extent to which the scores on a quantitative
variable in a distribution are spread out.
The range is the difference between the highest and lowest scores in a
distribution.
A five-number summary reports the lowest score, the first quartile, the
median, the third quartile, and the highest score.
Five-number summaries are often portrayed graphically with box plots.
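
A small sketch of the range and a five-number summary, using the ten raw scores that appear in Table 10.5. Quartiles can be defined in several ways; the "inclusive" method of `statistics.quantiles` used here is an assumption and may differ slightly from the convention in the textbook.

```python
from statistics import quantiles

# The ten raw scores listed in Table 10.5.
scores = [85, 80, 70, 60, 55, 50, 45, 40, 30, 25]


def five_number_summary(data):
    """Return (lowest, Q1, median, Q3, highest) for a list of scores."""
    q1, q2, q3 = quantiles(data, n=4, method="inclusive")
    return min(data), q1, q2, q3, max(data)


score_range = max(scores) - min(scores)
summary = five_number_summary(scores)

print("range:", score_range)
print("five-number summary:", summary)
```

These five values are exactly what a box plot draws: the ends of the whiskers, the edges of the box, and the line inside it.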

Box plots (Figure 10.9)

Standard Deviation
The standard deviation (SD) is considered the most useful index of
variability.
It is a single number that represents the spread of a distribution.
See p. 348 to calculate the mean of the distribution.
Table 10.5 illustrates the calculation of the SD of a distribution.
If a distribution is normal, then the mean plus or minus 3 SD will
encompass about 99.7% of all scores in the distribution.

Calculation of the Standard Deviation of a Distribution (Table 10.5)

Raw Score ($X$)    Mean ($\bar{X}$)    $X - \bar{X}$    $(X - \bar{X})^2$
85                 54                  31               961
80                 54                  26               676
70                 54                  16               256
60                 54                  6                36
55                 54                  1                1
50                 54                  -4               16
45                 54                  -9               81
40                 54                  -14              196
30                 54                  -24              576
25                 54                  -29              841

Variance: $SD^2 = \frac{\sum (X - \bar{X})^2}{n} = \frac{3640}{10} = 364$

Standard deviation: $SD = \sqrt{\frac{\sum (X - \bar{X})^2}{n}} = \sqrt{364} \approx 19.08$
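
The steps of Table 10.5 can be sketched in a few lines of Python. This follows the table in dividing by n (the population form) rather than n - 1; the variable names are illustrative.

```python
from math import sqrt

# Raw scores from Table 10.5.
scores = [85, 80, 70, 60, 55, 50, 45, 40, 30, 25]

n = len(scores)
mean = sum(scores) / n                            # 54

# Deviation of each score from the mean, then squared.
deviations = [x - mean for x in scores]
squared = [d ** 2 for d in deviations]

variance = sum(squared) / n                       # sum of squares divided by n
sd = sqrt(variance)

print(f"mean = {mean}, variance = {variance}, SD = {sd:.2f}")
```

The standard library functions `statistics.pvariance` and `statistics.pstdev` compute the same divide-by-n quantities directly.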

Standard Deviations for Boys' and Men's Basketball Teams (Figure 10.10)

Facts about the Normal Distribution
50% of all the observations fall on each side of the mean.
(Figure 10.11)
68% of scores fall within 1 SD of the mean in a normal distribution.
27% of the observations fall between 1 and 2 SD from the mean, so about
95% fall within 2 SD.
99.7% of all scores fall within 3 SD of the mean. (Figure 10.12)
This is often referred to as the 68-95-99.7 rule.
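
The percentages in the 68-95-99.7 rule can be checked against the standard normal curve. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `share_within` is an assumption for this example.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)


def share_within(k):
    """Proportion of a normal distribution within k SD of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


within_1, within_2, within_3 = (share_within(k) for k in (1, 2, 3))

print(f"within 1 SD: {within_1:.1%}")
print(f"within 2 SD: {within_2:.1%}")
print(f"within 3 SD: {within_3:.1%}")
print(f"between 1 and 2 SD: {within_2 - within_1:.1%}")
```

The proportions come out at roughly 0.683, 0.954, and 0.997, and the difference between the first two gives the 27% figure quoted above.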

Fifty Percent of All Scores in a Normal Curve Fall on Each Side of the
Mean (Figure 10.11)

Probabilities Under the Normal Curve (Figure 10.12)

Standard Scores
Standard scores use a common scale to indicate how an individual
compares to other individuals in a group.
The simplest form of a standard score is a z score.
A z score expresses how far a raw score is from the mean in standard
deviation units. (see Figure 10.13)
Standard scores provide a better basis for comparing performance on
different measures than do raw scores.
A probability is a percentage stated in decimal form and refers to the
likelihood of an event occurring.
T scores are z scores expressed in a different form
(T = z score x 10 + 50).
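
As a brief sketch of the two conversions, the functions below turn a raw score into a z score and a z score into a T score. The example reuses the mean (54) and variance (364) from Table 10.5; the function names are hypothetical.

```python
from math import sqrt


def z_score(raw, mean, sd):
    """Distance of a raw score from the mean, in standard deviation units."""
    return (raw - mean) / sd


def t_score(z):
    """Rescale a z score so the mean is 50 and the SD is 10."""
    return z * 10 + 50


# Mean and SD of the distribution in Table 10.5.
mean, sd = 54, sqrt(364)

z = z_score(85, mean, sd)   # the highest raw score in Table 10.5
t = t_score(z)

print(f"z = {z:.2f}, T = {t:.1f}")
```

A raw score equal to the mean always gives z = 0 and T = 50, whatever the original scale.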

Probability Areas Between the Mean and Different Z Scores (Figure 10.13)

Examples of Standard Scores (Figure 10.14)

Correlation
Researchers seek to determine whether a relationship exists between two
or more quantitative variables.
A scatterplot is a pictorial representation of the relationship between
two quantitative variables. (see Figure 10.15)
Outliers are scores that fall considerably outside most of the other
scores in a distribution or pattern.
They indicate an unusual exception to a general pattern.
(see Figure 10.16)

Correlation coefficients express the degree of relationship between two
sets of scores:
Pearson Product-Moment Correlation Coefficient (r), for linear relationships
Eta, for curvilinear relationships
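
A minimal sketch of the Pearson product-moment correlation coefficient, written out from its definition. The data in Table 10.7 are not reproduced in these slides, so the two lists below are hypothetical values invented purely for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations, and sums of squared deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical paired scores (not from the slides).
variable_a = [1, 2, 3, 4, 5]
variable_b = [52, 55, 61, 64, 70]

r = pearson_r(variable_a, variable_b)
print(f"r = {r:.2f}")
```

Values near +1 or -1 indicate a strong linear relationship, and values near 0 a weak one. Python 3.10 and later also provide `statistics.correlation` for the same calculation.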

Scatterplot of Data from Table 10.7 (Figure 10.15)

Relationship Between Family Cohesiveness and School Achievement in a
Hypothetical Group of Students (Figure 10.16)

Examples of Scatterplots (Figure 10.17)

A Perfect Negative Correlation (Figure 10.18)

Positive and Negative Correlations (Figure 10.19)

Examples of Nonlinear (Curvilinear) Relationships (Figure 10.20)

Techniques for Summarizing Categorical Data
The Frequency Table
Bar Graphs and Pie Charts
The Crossbreak Table

Frequency and Percentage of Responses to Questionnaire

Response                     Frequency    Percentage of Total (%)
Lecture                      15           30
Class discussions            10           20
Demonstrations               8            16
Audiovisual presentations    6            12
Seatwork                     5            10
Oral reports                 4            8
Library research             2            4
Total                        50           100
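
The percentage column of a frequency table is each frequency divided by the total. A short sketch using the questionnaire responses above (variable names are illustrative):

```python
# Frequencies from the questionnaire table.
responses = {
    "Lecture": 15,
    "Class discussions": 10,
    "Demonstrations": 8,
    "Audiovisual presentations": 6,
    "Seatwork": 5,
    "Oral reports": 4,
    "Library research": 2,
}

total = sum(responses.values())

# Convert each frequency to a percentage of the total.
percentages = {name: 100 * freq / total for name, freq in responses.items()}

for name, pct in percentages.items():
    print(f"{name:<27}{responses[name]:>3}{pct:>6.0f}%")
```

The same percentages determine the bar heights in a bar graph and the slice sizes in a pie chart.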

Example of a Bar Graph (Figure 10.21)

Example of Pie Chart (Figure 10.22)

Any questions?

Thank You
