
Measures of

Central
Tendency
Dr. Tapan Kr. Dutta
Panskura Banamali College
Please set all mobile
phones to silent mode
during this session

Thank you
Predict Mean,
Median and Mode
from the given
dataset by the
shortcut method
Standard Notation

Measure        Sample        Population
Mean           $\bar{X}$     $\mu$
Stand. Dev.    $S$           $\sigma$
Variance       $S^2$         $\sigma^2$
Size           $n$           $N$
Numerical Data
Properties & Measures

Numerical Data Properties
  Central Tendency: Mean, Median, Mode
  Variation: Range, Variance, Standard Deviation
  Shape: Skew
Example - 1
Several different
measures of
central tendency
are defined below.
The mean of a sample or a
population is computed by adding all
of the observations and dividing by
the number of observations.
For example, for a sample of five
women with weights X1, X2, X3, X4 and X5,
the mean weight equals
(X1 + X2 + X3 + X4 + X5)/5.
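The computation above can be sketched in Python. The five weights are made-up values used only for illustration; the slides do not give the actual measurements.

```python
# Hypothetical weights for the five women; these values are not from the slides.
weights = [100, 110, 120, 130, 150]

def mean(values):
    """Add all observations and divide by the number of observations."""
    return sum(values) / len(values)

sample_mean = mean(weights)
print(sample_mean)  # 122.0
```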
To find the median, we arrange the
observations in order from smallest
to largest value. If there is an odd
number of observations, the median
is the middle value. If there is an
even number of observations, the
median is the average of the two
middle values.
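A minimal sketch of the odd and even cases, using Python's standard `statistics` module. The two small samples are hypothetical and chosen only to show both cases.

```python
import statistics

# Hypothetical observations; statistics.median sorts the data internally.
odd_sample = [7, 1, 5]        # ordered: 1, 5, 7 -> the middle value is 5
even_sample = [7, 1, 5, 3]    # ordered: 1, 3, 5, 7 -> average of 3 and 5

odd_median = statistics.median(odd_sample)
even_median = statistics.median(even_sample)

print(odd_median, even_median)  # 5 4.0
```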
The mode is the most
frequently appearing value
in the population or
sample. For example, if we draw
a sample of five women
and measure their weights,
the mode is the weight
that occurs most often.
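One possible way to find the most frequent value is to count occurrences with `collections.Counter`. The weights below are hypothetical, and the helper name `modes` is an illustrative choice, not something from the slides.

```python
from collections import Counter

def modes(values):
    """Return every value that occurs with the highest frequency."""
    counts = Counter(values)
    highest = max(counts.values())
    return [value for value, count in counts.items() if count == highest]

# Hypothetical weights for the five women; these values are not from the slides.
weights = [120, 135, 120, 150, 142]
print(modes(weights))  # [120]
```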
How do I know
which measure of
central tendency
to use?
MEAN
Use the mean to describe the middle of a
set of data that does not have an outlier.
Advantages:
Most popular measure in fields such as
business, engineering and computer
science.
It is unique - there is only one answer.
Useful when comparing sets of data.
Disadvantages:
Affected by extreme values (outliers)
Mean
1. Measure of Central Tendency
2. Most Common Measure
3. Acts as Balance Point
4. Affected by Extreme Values
(Outliers)
5. Formula (Sample Mean)

$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{X_1 + X_2 + \cdots + X_n}{n}$$
MEDIAN
Use the median to describe the middle of a
set of data that does have an outlier.
Advantages:
Extreme values (outliers) do not affect
the median as strongly as they do the mean.
Useful when comparing sets of data.
It is unique - there is only one answer.
Disadvantages:
Not as popular as mean.
Median
1. Measure of Central Tendency
2. Middle Value In Ordered Sequence
If Odd n, Middle Value of Sequence
If Even n, Average of 2 Middle Values

3. Position of Median in Sequence

$$\text{Positioning Point} = \frac{n + 1}{2}$$
4. Not Affected by Extreme Values
Median Example
Odd-Sized Sample
Raw Data:  24.1  22.6  21.5  23.7  22.6
Ordered:   21.5  22.6  22.6  23.7  24.1
Position:  1     2     3     4     5

$$\text{Positioning Point} = \frac{n + 1}{2} = \frac{5 + 1}{2} = 3$$

Median = 22.6
Median Example
Even-Sized Sample
Raw Data:  10.3  4.9  8.9  11.7  6.3   7.7
Ordered:   4.9   6.3  7.7  8.9   10.3  11.7
Position:  1     2    3    4     5     6

$$\text{Positioning Point} = \frac{n + 1}{2} = \frac{6 + 1}{2} = 3.5$$

$$\text{Median} = \frac{7.7 + 8.9}{2} = 8.3$$
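The positioning-point rule used in the two median examples can be sketched as follows. The function name `median_by_position` is an illustrative choice; positions are 1-based, as on the slides.

```python
import math

def median_by_position(values):
    """Return (positioning point, median) using the (n + 1) / 2 rule."""
    ordered = sorted(values)
    n = len(ordered)
    position = (n + 1) / 2                        # 1-based positioning point
    lower = ordered[math.floor(position) - 1]     # value at or just below the point
    upper = ordered[math.ceil(position) - 1]      # value at or just above the point
    return position, (lower + upper) / 2

odd_position, odd_median = median_by_position([24.1, 22.6, 21.5, 23.7, 22.6])
even_position, even_median = median_by_position([10.3, 4.9, 8.9, 11.7, 6.3, 7.7])

print(odd_position, odd_median)               # 3.0 22.6
print(even_position, round(even_median, 1))   # 3.5 8.3
```

When the positioning point is a whole number, the lower and upper values coincide, so the same code covers both the odd and the even case.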
MODE
Use the mode when the data is non-numeric or when
asked to choose the most popular item.
Advantages:
Extreme values (outliers) do not affect the mode.
Disadvantages:
Not as popular as mean and median.
Not necessarily unique - may be more than one
answer
When no values repeat in the data set, the mode is
every value and is useless.
When there is more than one mode, it is difficult to
interpret and/or compare.
Mode
1. Measure of Central Tendency
2. Value That Occurs Most Often
3. Not Affected by Extreme Values
4. May Be No Mode or Several Modes
5. May Be Used for Numerical &
Categorical Data
Mode Example
No Mode
Raw Data: 10.3 4.9 8.9 11.7 6.3 7.7
One Mode
Raw Data: 6.3 4.9 8.9 6.3 4.9 4.9
More Than 1 Mode
Raw Data: 21 28 28 41 43 43
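The three cases above can be checked with `statistics.multimode` (available in Python 3.8 and later), which returns every value tied for the highest count. Note that when no value repeats it returns all of the values, so the "no mode" case has to be detected separately; the check used here is one possible convention.

```python
import statistics

no_mode = [10.3, 4.9, 8.9, 11.7, 6.3, 7.7]
one_mode = [6.3, 4.9, 8.9, 6.3, 4.9, 4.9]
two_modes = [21, 28, 28, 41, 43, 43]

for data in (no_mode, one_mode, two_modes):
    candidates = statistics.multimode(data)
    # If every value is a "mode", nothing repeats and the mode is not informative.
    if len(candidates) == len(data):
        print("no mode")
    else:
        print(candidates)
```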
Central Tendency Solution*
Median
Raw Data:  17  16  21  18  13  16  12  11
Ordered:   11  12  13  16  16  17  18  21
Position:  1   2   3   4   5   6   7   8

$$\text{Positioning Point} = \frac{n + 1}{2} = \frac{8 + 1}{2} = 4.5$$

$$\text{Median} = \frac{16 + 16}{2} = 16$$
What will happen to the
measures of central tendency if
we add the same amount to all
data values, or multiply each
data value by the same amount?
Data                                                                         Mean   Mode   Median
Original Data Set:                 6, 7, 8, 10, 12, 14, 14, 15, 16, 20       12.2   14     13
Add 3 to each data value:          9, 10, 11, 13, 15, 17, 17, 18, 19, 23     15.2   17     16
Multiply 2 times each data value:  12, 14, 16, 20, 24, 28, 28, 30, 32, 40    24.4   28     26

When added: Since all values are shifted by the same
amount, the measures of central tendency all shift
by that amount. If you add 3 to each data value,
you will add 3 to the mean, mode and median.
When multiplied: Since all values are multiplied by the
same factor, the measures of central
tendency are affected in the same way. If you multiply
each data value by 2, you will multiply the mean,
mode and median by 2.
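The table can be reproduced with a short Python sketch using the standard `statistics` module. The helper name `summarize` is an illustrative choice.

```python
import statistics

original = [6, 7, 8, 10, 12, 14, 14, 15, 16, 20]

def summarize(values):
    """Return (mean, mode, median) for a list of numbers."""
    return (statistics.mean(values),
            statistics.mode(values),
            statistics.median(values))

shifted = [x + 3 for x in original]   # add 3 to each data value
scaled = [x * 2 for x in original]    # multiply each data value by 2

for label, values in [("original", original), ("add 3", shifted), ("times 2", scaled)]:
    print(label, summarize(values))
```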
Summary of
Central Tendency Measures

Measure    Equation                  Description
Mean       $\sum X_i / n$            Balance Point
Median     $(n + 1)/2$ Position      Middle Value When Ordered
Mode       none                      Most Frequent
Range
1. Measure of Dispersion
2. Difference Between Largest &
Smallest Observations
$$\text{Range} = X_{\text{largest}} - X_{\text{smallest}}$$

3. Ignores How Data Are Distributed

[Figure: two data sets, each plotted on an axis from 7 to 10]
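A small sketch of why the range ignores how the data are distributed. Both data sets are hypothetical: they share the same smallest and largest values, so the range cannot tell them apart.

```python
def data_range(values):
    """Difference between the largest and smallest observations."""
    return max(values) - min(values)

# Hypothetical data sets with the same extremes but different distributions.
clustered = [7, 10, 10, 10, 10, 10]
spread_out = [7, 8, 8, 9, 9, 10]

print(data_range(clustered), data_range(spread_out))  # 3 3
```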
Variance &
Standard Deviation
1. Measures of Dispersion
2. Most Common Measures
3. Consider How Data Are Distributed
4. Show Variation About Mean ($\bar{X}$ or $\mu$)

[Figure: data plotted on an axis from 4 to 12, with $\bar{X} = 8.3$ marked]
Sample Variance Formula

$$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1} = \frac{(X_1 - \bar{X})^2 + (X_2 - \bar{X})^2 + \cdots + (X_n - \bar{X})^2}{n - 1}$$

$n - 1$ in denominator! (Use $N$ if Population Variance)
Sample Standard Deviation
Formula
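
$$S = \sqrt{S^2} = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}} = \sqrt{\frac{(X_1 - \bar{X})^2 + (X_2 - \bar{X})^2 + \cdots + (X_n - \bar{X})^2}{n - 1}}$$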
Variance Example
Raw Data: 10.3 4.9 8.9 11.7 6.3 7.7

$$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1} \quad \text{where} \quad \bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = 8.3$$

$$S^2 = \frac{(10.3 - 8.3)^2 + (4.9 - 8.3)^2 + \cdots + (7.7 - 8.3)^2}{6 - 1} = 6.368$$
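The variance example can be followed step by step in Python. This is a sketch of the sample formula with n - 1 in the denominator; the variable names are illustrative.

```python
import statistics

data = [10.3, 4.9, 8.9, 11.7, 6.3, 7.7]
n = len(data)

x_bar = sum(data) / n                                  # sample mean
squared_deviations = [(x - x_bar) ** 2 for x in data]  # (X_i - X_bar)^2 for each value
sample_variance = sum(squared_deviations) / (n - 1)    # n - 1 in the denominator

print(round(x_bar, 1), round(sample_variance, 3))      # 8.3 6.368

# The standard library uses the same n - 1 convention for statistics.variance.
library_variance = statistics.variance(data)
```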
Exercise
You're a financial
analyst. You have
collected the following
closing stock prices of
new stock issues: 17,
16, 21, 18, 13, 16, 12,
11.
What are the variance
and standard deviation
of the stock prices?
Variation Solution*
Sample Variance
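Raw Data: 17 16 21 18 13 16 12 11

$$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1} \quad \text{where} \quad \bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = 15.5$$

$$S^2 = \frac{(17 - 15.5)^2 + (16 - 15.5)^2 + \cdots + (11 - 15.5)^2}{8 - 1} = 11.14$$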
Variation Solution*
Sample Standard Deviation

$$S = \sqrt{S^2} = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}} = \sqrt{11.14} = 3.34$$
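The stock-price solution can be checked with the standard `statistics` module, whose `variance` and `stdev` functions use the sample (n - 1) formulas. A minimal sketch:

```python
import statistics

prices = [17, 16, 21, 18, 13, 16, 12, 11]

mean_price = statistics.mean(prices)            # 15.5
sample_variance = statistics.variance(prices)   # divides by n - 1
sample_std_dev = statistics.stdev(prices)       # square root of the sample variance

print(round(sample_variance, 2), round(sample_std_dev, 2))  # 11.14 3.34
```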
68, 59, 57, 64, 52, 60, 62, 57, 61,
61, 71, 51, 59, 54, 65, 67, 54, 62,
58, 60, 54, 62, 65, 71, 63, 60, 61,
56, 67, 64, 57, 61, 60, 62, 59, 57,
64, 58, 61, 63, 62, 62, 60, 58, 67,
63, 64, 61, 60, 65, 67, 70, 58, 51,
61, 62, 65, 52, 60, 55, 63, 62, 60,
67, 55, 62, 61, 64, 57, 59.
From the data above, the
following frequency
table can be made.

Sample Size (N) = 70
Highest score value = 71
Lowest score value = 51
Size of the class interval (i) = 3
Range of the scores (R) = 71 - 51 + 1 = 21
Number of classes (K):

$$K = \frac{R}{i} = \frac{21}{3} = 7$$
Class           Mid value   Tally Marks                     Frequency   Class Boundary   x    fx
Interval
51 - 53         52          ||||                            4           50.5 - 53.5      -3   -12
54 - 56         55          ||||| |                         6 = fb      53.5 - 56.5      -2   -12
57 - 59         58          ||||| ||||| |||                 13          56.5 - 59.5      -1   -13
LL 60 - 62 UL   61 = Am     ||||| ||||| ||||| ||||| |||||   25 = fm     59.5 - 62.5       0     0
63 - 65         64          ||||| ||||| |||                 13          62.5 - 65.5      +1   +13
66 - 68         67          ||||| |                         6 = fa      65.5 - 68.5      +2   +12
69 - 71         70          |||                             3           68.5 - 71.5      +3    +9
                                                            N = 70                       Σfx = -3
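Mean = Am + (Σfx / N) × i
= 61 + (-3 / 70) × 3
= 60.87
Median = L + ((N/2 - cf) / fm) × i, where L = 59.5 is the lower boundary of the
median class and cf = 4 + 6 + 13 = 23 is the cumulative frequency below it
= 59.5 + ((35 - 23) / 25) × 3
= 60.94
Mode = 3 Median - 2 Mean
= 3 × 60.94 - 2 × 60.87
= 182.82 - 121.74
= 61.08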
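The grouped-data calculation can be sketched in Python as a check on the table. This assumes the step-deviation (shortcut) mean around the assumed mean Am = 61, the usual grouped-median interpolation, and the empirical relation Mode = 3 Median - 2 Mean; the variable names are illustrative.

```python
scores = [
    68, 59, 57, 64, 52, 60, 62, 57, 61, 61, 71, 51, 59, 54, 65, 67, 54, 62,
    58, 60, 54, 62, 65, 71, 63, 60, 61, 56, 67, 64, 57, 61, 60, 62, 59, 57,
    64, 58, 61, 63, 62, 62, 60, 58, 67, 63, 64, 61, 60, 65, 67, 70, 58, 51,
    61, 62, 65, 52, 60, 55, 63, 62, 60, 67, 55, 62, 61, 64, 57, 59,
]

i = 3                                    # size of the class interval
lowest, highest = min(scores), max(scores)
R = highest - lowest + 1                 # inclusive range of the scores
K = R // i                               # number of classes
N = len(scores)

# Build (lower limit, upper limit, frequency) for each class.
classes = []
for lower in range(lowest, highest + 1, i):
    upper = lower + i - 1
    f = sum(lower <= s <= upper for s in scores)
    classes.append((lower, upper, f))

freqs = [f for _, _, f in classes]       # [4, 6, 13, 25, 13, 6, 3]
mids = [(lo + up) / 2 for lo, up, _ in classes]

# Shortcut mean: step deviations x = (mid value - Am) / i around the assumed mean.
Am = 61
steps = [round((m - Am) / i) for m in mids]
sum_fx = sum(f * x for f, x in zip(freqs, steps))   # -3
mean = Am + sum_fx / N * i

# Median: interpolate inside the class that contains the N/2-th score.
cumulative = 0
for lo, up, f in classes:
    if cumulative + f >= N / 2:
        L = lo - 0.5                     # lower class boundary
        median = L + (N / 2 - cumulative) / f * i
        break
    cumulative += f

mode = 3 * median - 2 * mean             # empirical relation

print(round(mean, 2), round(median, 2), round(mode, 2))
```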
Example - 2
Measures of Position
Measures of position are different
techniques that divide a set of data
into equal groups.
To determine the measurement of
position, the data must be sorted
from lowest to highest. The
different measures of position are:
Quartiles
The quartiles divide the data set
into four equal parts.
Deciles
The deciles divide the data set into
ten equal parts.
Percentiles
Percentiles divide the data set into
one hundred equal parts.
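Quartiles, deciles and percentiles can be sketched with `statistics.quantiles` (Python 3.8 and later), here applied to the closing stock prices from the earlier exercise. The function returns the cut points between the groups and uses its default "exclusive" interpolation method, which is only one of several conventions; other textbooks may give slightly different values.

```python
import statistics

# Closing stock prices from the exercise in these slides.
prices = [17, 16, 21, 18, 13, 16, 12, 11]

# statistics.quantiles sorts the data itself and returns the n - 1 cut points.
quartiles = statistics.quantiles(prices, n=4)       # 3 cut points, 4 groups
deciles = statistics.quantiles(prices, n=10)        # 9 cut points, 10 groups
percentiles = statistics.quantiles(prices, n=100)   # 99 cut points, 100 groups

# The second quartile is the median.
print(quartiles[1] == statistics.median(prices))    # True
```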
Measures of Dispersion
The measures of dispersion report
on how far the values of the
distribution are from the center.
The measures of dispersion are:
Range
The range is the difference
between the highest and lowest
data of a statistical distribution.
Average Deviation
The average deviation is the
arithmetic mean of the absolute
values of the deviations from
the mean.
Variance
The variance is the
arithmetic mean of the
squared deviations from
the mean.
Standard Deviation
The standard deviation is
the square root of the
variance.
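The three deviation-based measures can be sketched as follows, again using the stock prices from the earlier exercise. This follows the definitions as worded here, so the variance divides by the number of observations; it therefore differs from the sample variance (n - 1 in the denominator) computed earlier for the same data.

```python
import math

prices = [17, 16, 21, 18, 13, 16, 12, 11]
n = len(prices)
mean = sum(prices) / n                                   # 15.5

# Arithmetic mean of the absolute deviations from the mean.
average_deviation = sum(abs(x - mean) for x in prices) / n

# Arithmetic mean of the squared deviations from the mean (divides by n).
variance = sum((x - mean) ** 2 for x in prices) / n

# Square root of the variance.
standard_deviation = math.sqrt(variance)

print(average_deviation, variance, round(standard_deviation, 2))  # 2.625 9.75 3.12
```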