Statistics Chap 2

Central Tendency and Variability
The two most essential features of a distribution
Numerical Data: Properties & Measures
Central Tendency: Mean, Median, Mode
Variation: Range, Variance, Standard Deviation
Shape: Skew
Variables have distributions
A variable is something that changes or has different values (e.g., anger).
A distribution is a collection of measures, usually across people.
Distributions of numbers can be summarized with numbers (called statistics when computed from a sample, parameters when they describe a population).
Central Tendency refers to the Middle of the Distribution
Variability is about the Spread
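Mean
Sum of scores divided by the number of people. The population mean is $\mu$ (mu) and the sample mean is $\bar{X}$ (X-bar).
We calculate the sample mean by:
Arithmetic: $\bar{X} = \frac{\sum X}{N}$, or $\bar{X} = \frac{\sum FX}{n}$ when each value $X$ occurs with frequency $F$
Geometric: $\sqrt[n]{\prod X}$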
Ungrouped Data
[Table: number of children per family in Sleman, No. of Children against Frequency; frequencies include 20 and 15]
The height (to the nearest mm) of each of a number of seedlings:
31 36 40 46 33 33 31 17 20 46 39 29 38 34 37
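As a quick check of the ungrouped-data calculation, the sketch below applies the sum-divided-by-count definition of the mean to the seedling heights listed above. The variable names are illustrative choices, not part of the original slides.

```python
# Seedling heights in mm, copied from the example above.
heights_mm = [31, 36, 40, 46, 33, 33, 31, 17, 20, 46, 39, 29, 38, 34, 37]

# Sample mean: sum of the scores divided by the number of scores.
mean_height = sum(heights_mm) / len(heights_mm)

print(mean_height)  # 34.0
```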
Grouped Data
Example
The heights (in cm) of a group of
students are summarized below. Draw a
histogram and polygon to illustrate
these data
Mean
1. Measure of Central Tendency
2. Most Common Measure
3. Acts as Balance Point
4. Affected by Extreme Values (Outliers)
5. Formula (Sample Mean): $\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$
Deviation from the mean
$x = X - \bar{X}$. Deviations sum to zero.
A deviation score is a score's deviation from the mean.
[Figure: raw scores from 7 to 11 shown with their deviation scores, which start at -2 and centre on 0.]
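The claim that deviations sum to zero can be checked numerically. The sketch below uses a hypothetical set of nine scores between 7 and 11, chosen for illustration because the original data behind the figure are not recoverable.

```python
from statistics import fmean

# Hypothetical raw scores (an assumption for illustration only).
scores = [7, 8, 8, 9, 9, 9, 10, 10, 11]

mean = fmean(scores)

# Deviation score: each raw score minus the mean.
deviations = [x - mean for x in scores]

print(mean)             # 9.0
print(deviations)
print(sum(deviations))  # 0.0
```

With data whose mean is not exactly representable in floating point, the sum may differ from zero by a tiny rounding error.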
Median
Score that separates top 50% from bottom 50%
Ungrouped Data
Even number of scores: the median is halfway between the two middle scores.
Position of Med1 = n/2
Position of Med2 = (n + 2)/2
Med = (Med1 + Med2)/2
1 4 6 8 9 10 17 18 Median is (8 + 9)/2 = 8.5
Odd number of scores: the median is the middle number.
Position of Med = (n + 1)/2
1 4 6 8 9 10 17 Median is 8
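The position rules above translate directly into code. This is a minimal sketch; the function name `median_by_position` is an illustrative choice, and positions are converted from the slides' 1-based counting to Python's 0-based indexing.

```python
def median_by_position(scores):
    """Median of ungrouped data using the position rules for odd and even n."""
    ordered = sorted(scores)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty list is undefined")
    if n % 2 == 1:
        # Odd n: the score at position (n + 1) / 2.
        return ordered[(n + 1) // 2 - 1]
    # Even n: average the scores at positions n / 2 and (n + 2) / 2.
    med1 = ordered[n // 2 - 1]
    med2 = ordered[(n + 2) // 2 - 1]
    return (med1 + med2) / 2


print(median_by_position([1, 4, 6, 8, 9, 10, 17, 18]))  # 8.5
print(median_by_position([1, 4, 6, 8, 9, 10, 17]))      # 8
```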
Median
2. Middle Value In Ordered Sequence
If Odd n, Middle Value of Sequence
If Even n, Average of 2 Middle Values
3. Position of Median in Sequence
Positioning Point = $\frac{n + 1}{2}$
4. Not Affected by Extreme Values
Median Example
Odd-Sized Sample
Raw Data: 24.1 22.6 21.5 23.7 22.6
Ordered:  21.5 22.6 22.6 23.7 24.1
Position: 1    2    3    4    5
Positioning Point = $\frac{n + 1}{2} = \frac{5 + 1}{2} = 3$
Median = 22.6
Median Example
Even-Sized Sample
Raw Data: 10.3 4.9 8.9 11.7 6.3 7.7
Ordered:  4.9 6.3 7.7 8.9 10.3 11.7
Position: 1   2   3   4   5    6
Positioning Point = $\frac{n + 1}{2} = \frac{6 + 1}{2} = 3.5$
Median = $\frac{7.7 + 8.9}{2} = 8.3$
Mode
2. Value That Occurs Most Often
3. Not Affected by Extreme Values
4. May Be No Mode or Several Modes
5. May Be Used for Numerical & Categorical Data
The mode is the most frequently occurring score. For grouped data, it is the midpoint of the most populous class interval. Distributions can be bimodal or multimodal.
Grouped/Classified Data
Mode Example
No Mode
Raw Data: 10.3 4.9 8.9 11.7 6.3 7.7
One Mode
Raw Data: 6.3 4.9 8.9 6.3 4.9 4.9
More Than 1 Mode
Raw Data: 21 28 28 41 43 43
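The three cases above can be reproduced with a small counting routine. This is a sketch, not a standard library function: the name `modes` is hypothetical, and it assumes the slide's convention that a data set in which every value occurs only once has no mode.

```python
from collections import Counter


def modes(values):
    """Return the most frequent value(s), or an empty list if nothing repeats."""
    if not values:
        return []
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        # Assumed convention: no value repeats, so there is no mode.
        return []
    return sorted(v for v, c in counts.items() if c == top)


print(modes([10.3, 4.9, 8.9, 11.7, 6.3, 7.7]))  # []
print(modes([6.3, 4.9, 8.9, 6.3, 4.9, 4.9]))    # [4.9]
print(modes([21, 28, 28, 41, 43, 43]))          # [28, 43]
```

Python's own `statistics.multimode` follows a different convention and would return all six values in the first case.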
Thinking Challenge
You're a financial analyst. You have collected the following closing stock prices of new stock issues: 17, 16, 21, 18, 13, 16, 12, 11.
Describe the stock prices in terms of central tendency.
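One way to work the challenge is with Python's standard `statistics` module, as sketched below. The variable name `prices` is an illustrative choice.

```python
from statistics import mean, median, multimode

# Closing stock prices from the challenge.
prices = [17, 16, 21, 18, 13, 16, 12, 11]

print(mean(prices))       # 15.5
print(median(prices))     # 16.0
print(multimode(prices))  # [16]
```

The mean falls slightly below the median and the mode, which agree at 16.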
ODD & EVEN DATA
Classified Data
Comparison of mean, median and mode
Mode
Good for nominal variables
Good if you need to know most frequent observation
Quick and easy
Median
Good for badly behaved distributions (e.g., skewed or with outliers)
Good for distributions with arbitrary ceiling or floor
Mean
Used for inference as well as description; best estimator of the parameter
Based on all data in the distribution
Generally preferred except for badly behaved distributions. Most commonly used statistic for central tendency.
Best Guess interpretations
Mean: the average signed error will be zero.
Mode: will be exactly right with the greatest frequency.
Median: gives the smallest average absolute error.
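The best-guess interpretations can be illustrated by treating each measure as a single guess for every score and measuring the error. The sketch below reuses the stock prices from the Thinking Challenge; the helper names are hypothetical.

```python
from statistics import fmean, median, multimode

scores = [17, 16, 21, 18, 13, 16, 12, 11]


def mean_signed_error(guess, data):
    """Average of (score - guess), keeping the sign."""
    return fmean([x - guess for x in data])


def mean_absolute_error(guess, data):
    """Average of |score - guess|."""
    return fmean([abs(x - guess) for x in data])


def exact_hits(guess, data):
    """How many scores the guess matches exactly."""
    return sum(1 for x in data if x == guess)


m, md, mo = fmean(scores), median(scores), multimode(scores)[0]

print(mean_signed_error(m, scores))     # 0.0
print(mean_absolute_error(md, scores))  # 2.5
print(mean_absolute_error(m, scores))   # 2.625
print(exact_hits(mo, scores))           # 2
```

Here the median's average absolute error (2.5) is smaller than the mean's (2.625), and no other guess matches more scores exactly than the mode does.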
Shape of a Distribution
Describes how data are distributed
Measures of shape: symmetric or skewed
Left-Skewed: Mean < Median
Symmetric: Mean = Median
Right-Skewed: Median < Mean
Statistics for Business and Economics, 6e, Chap 3-26
Influence of Distribution
Shape
Review
What is central tendency?

Mode
Median
Mean
Review
Range
Average deviation
Variance
Standard Deviation
Z score
Variation
4 Statistics: Range, Average Deviation, Variance, & Standard Deviation
Range = high score minus low score.
12 14 14 16 16 18 20 range = 20 - 12 = 8
Average Deviation: mean of absolute deviations from the median:
$AD = \frac{\sum |X - Md|}{N}$
Note the difference between Hays and undergraduate texts: deviation from the median vs. the mean.
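The range and the average deviation can both be computed for the seven scores in the range example. This sketch follows the median-based definition given above; the variable names are illustrative.

```python
from statistics import median

scores = [12, 14, 14, 16, 16, 18, 20]

# Range: high score minus low score.
score_range = max(scores) - min(scores)

# Average deviation: mean absolute deviation from the median.
md = median(scores)
average_deviation = sum(abs(x - md) for x in scores) / len(scores)

print(score_range)        # 8
print(md)                 # 16
print(average_deviation)  # 2.0
```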
Variance
Population Variance:
$\sigma^2 = \frac{\sum (X - \mu)^2}{N}$
where $\sigma^2$ means population variance, $\mu$ means population mean, and the other terms have their usual meaning.
The variance is equal to the average squared deviation from the mean.
To compute, take each score and subtract the mean. Square the result. Find the average over scores. Ta da! The variance.
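Computing the Variance
(N=5)
      | X  | $\bar{X}$ | $X - \bar{X}$ | $(X - \bar{X})^2$
      | 5  | 15 | -10 | 100
      | 10 | 15 | -5  | 25
      | 15 | 15 | 0   | 0
      | 20 | 15 | 5   | 25
      | 25 | 15 | 10  | 100
Total | 75 |    | 0   | 250
Mean  | 15 |    | 0   | 50
Variance is 50.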
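The worked table can be reproduced in a few lines. The sketch below assumes the five scores 5, 10, 15, 20, 25 (the values whose deviations from a mean of 15 are -10 to 10) and divides by N, as in the population formula.

```python
from statistics import pvariance

scores = [5, 10, 15, 20, 25]
n = len(scores)

mean = sum(scores) / n                                   # 15.0
squared_deviations = [(x - mean) ** 2 for x in scores]   # 100, 25, 0, 25, 100
variance = sum(squared_deviations) / n

print(sum(squared_deviations))  # 250.0
print(variance)                 # 50.0
print(pvariance(scores))        # standard-library population variance, for comparison
```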
Standard Deviation
Variance is average squared deviation from the mean.
To return to original, unsquared units, we just take the square root of the variance. This is the standard deviation.
Population formula:
$\sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}}$
Standard Deviation
Sometimes called the root-mean-square
deviation from the mean. This name
says how to compute it from the inside
out.
Find the deviation (difference between
the score and the mean).
Find the deviations squared.
Find their mean.
Take the square root.
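The four steps above map one-to-one onto code. This is a minimal sketch with a hypothetical function name; it divides by N (the population form) and uses the scores 5, 10, 15, 20, 25 as example input.

```python
import math


def root_mean_square_deviation(scores):
    """Standard deviation computed 'from the inside out'."""
    mean = sum(scores) / len(scores)
    deviations = [x - mean for x in scores]      # 1. deviation from the mean
    squared = [d ** 2 for d in deviations]       # 2. deviations squared
    mean_square = sum(squared) / len(squared)    # 3. their mean (the variance)
    return math.sqrt(mean_square)                # 4. square root


sd = root_mean_square_deviation([5, 10, 15, 20, 25])
print(round(sd, 2))  # 7.07
```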
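Computing the Standard Deviation
(N=5)
      | X  | $\bar{X}$ | $X - \bar{X}$ | $(X - \bar{X})^2$
      | 5  | 15 | -10 | 100
      | 10 | 15 | -5  | 25
      | 15 | 15 | 0   | 0
      | 20 | 15 | 5   | 25
      | 25 | 15 | 10  | 100
Total | 75 |    | 0   | 250
Mean  | 15 |    | 0   | 50
Variance is 50.
SD is $\sqrt{50} = 7.07$.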
Example: Age Distribution
[Figure: histogram titled "Distribution of Age: Central Tendency, Variability, and Shape", with age (0 to 50) on the x-axis and frequency on the y-axis. Annotations: Mode = 21, Median = 23, Mean = 25.73, SD = 6.47 (average distance from mean).]
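Standard or z score
A z score indicates distance from the mean in standard deviation units.
Formula:
Sample: $z = \frac{X - \bar{X}}{S}$
Population: $z = \frac{X - \mu}{\sigma}$
Converting to standard or z scores does not change the shape of the distribution. Z-scores are standardized, not normalized: a skewed distribution remains skewed after conversion.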
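A short sketch of the conversion follows. It assumes the standard deviation is computed with N in the denominator (the population form used earlier in these slides); the function name `z_scores` and the example scores are illustrative.

```python
from statistics import fmean, pstdev


def z_scores(scores):
    """Convert raw scores to z scores: (score - mean) / standard deviation."""
    mean = fmean(scores)
    sd = pstdev(scores)  # assumption: population SD (divide by N)
    if sd == 0:
        raise ValueError("all scores are identical, so z scores are undefined")
    return [(x - mean) / sd for x in scores]


z = z_scores([5, 10, 15, 20, 25])
print([round(v, 2) for v in z])  # [-1.41, -0.71, 0.0, 0.71, 1.41]
```

The z scores have mean 0 and standard deviation 1, but their spacing relative to one another is the same as in the raw scores, which is why the shape of the distribution is unchanged.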
Skewness and Kurtosis
Skewness and kurtosis describe the shape of your data set's distribution. Skewness indicates how symmetrical the data set is, while kurtosis indicates how the weight of the data near the mean compares with the weight in the tails.
Perfectly symmetrical data sets will have a skewness of zero (skewness = 0), and a normally distributed data set will have a kurtosis of approximately three (kurtosis = 3).
EQUATION
Skewness: $g_1 = m_3 / m_2^{3/2}$
Kurtosis: $a_4 = m_4 / m_2^{2}$
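The slides do not define the $m_k$ terms. The sketch below assumes the usual reading, that $m_k$ is the k-th central moment (the average of the deviations from the mean raised to the power k), and computes both measures for ungrouped data. The function names are hypothetical.

```python
def central_moment(values, k):
    """Assumed definition: m_k = average of (x - mean) ** k."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** k for x in values) / n


def skewness(values):
    """g1 = m3 / m2^(3/2); undefined when all values are equal (m2 = 0)."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def kurtosis(values):
    """a4 = m4 / m2^2; undefined when all values are equal (m2 = 0)."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2


data = [5, 10, 15, 20, 25]
print(skewness(data))  # 0.0
print(kurtosis(data))  # 1.7
```

The example data are perfectly symmetric, so the skewness is zero; the kurtosis is below 3 because five evenly spaced values have much lighter tails than a normal distribution.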
Example
Calculation of Skewness on Classified Data
Finally, the skewness is
$g_1 = m_3 / m_2^{3/2} = 2.6933 / 8.5275^{3/2} = 0.1082$
Interpretation
If skewness = 0, the data are perfectly symmetrical. But a skewness of exactly zero is quite unlikely for real-world data, so how can you interpret the skewness number?
Bulmer, M. G., Principles of Statistics (Dover, 1979), a classic, suggests this rule of thumb:
If skewness is less than -1 or greater than +1, the distribution is highly skewed.
If skewness is between -1 and -1/2 or between +1/2 and +1, the distribution is moderately skewed.
If skewness is between -1/2 and +1/2, the distribution is approximately symmetric.
With a skewness of 0.1098, the sample data for student heights are approximately symmetric.
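The rule of thumb can be written as a small lookup. This is a sketch with a hypothetical function name; the rule does not say which category the exact boundary values (1/2 and 1) belong to, so placing them in the milder category is an assumption.

```python
def describe_skewness(g1):
    """Classify a skewness value using the rule of thumb above."""
    size = abs(g1)
    if size > 1:
        return "highly skewed"
    if size > 0.5:
        return "moderately skewed"
    # Assumption: values of exactly 1/2 or less count as approximately symmetric.
    return "approximately symmetric"


print(describe_skewness(0.1098))  # approximately symmetric
print(describe_skewness(-0.8))    # moderately skewed
print(describe_skewness(1.6))     # highly skewed
```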
Calculation of Kurtosis
Influence of Distribution
Shape
