
Basic Business Statistics


CHAPTER 2
NUMERICAL DESCRIPTIVE
MEASURES

© 2003 Prentice-Hall, Inc.


Chapter Topics

Measures of Central Tendency
  Mean, Median, Mode, Geometric Mean
Quartile
Measures of Variation
  Range, Interquartile Range, Variance and Standard Deviation, Coefficient of Variation
Shape
  Symmetric, Skewed, Using Box-and-Whisker Plots


Chapter Topics (continued)

The Empirical Rule and the Bienayme-Chebyshev Rule
Coefficient of Correlation
Pitfalls in Numerical Descriptive Measures and
Ethical Issues


Summary Measures

  Central Tendency: Mean, Median, Mode, Geometric Mean
  Quartile
  Variation: Range, Variance, Standard Deviation, Coefficient of Variation

Measures of Central Tendency

Central Tendency
  Mean
    Sample: $\bar{X} = \dfrac{\sum_{i=1}^{n} X_i}{n}$
    Population: $\mu = \dfrac{\sum_{i=1}^{N} X_i}{N}$
  Median
  Mode
  Geometric Mean: $\bar{X}_G = (X_1 \times X_2 \times \cdots \times X_n)^{1/n}$

Mean (Arithmetic Mean)

Mean (Arithmetic Mean) of Data Values
  Sample mean ($n$ = sample size):
    $$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{X_1 + X_2 + \cdots + X_n}{n}$$
  Population mean ($N$ = population size):
    $$\mu = \frac{\sum_{i=1}^{N} X_i}{N} = \frac{X_1 + X_2 + \cdots + X_N}{N}$$

Mean (Arithmetic Mean) (continued)

The Most Common Measure of Central Tendency
Affected by Extreme Values (Outliers)

[Figure: two dot plots on number lines (0 to 10 and 0 to 14); Mean = 5 for the first, Mean = 6 for the second, which includes an extreme value]
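
The effect of an extreme value on the mean can be checked with a minimal sketch. The two data sets below are hypothetical: they are chosen only so that their means match the Mean = 5 and Mean = 6 reported on the slide, and are not taken from it.

```python
from statistics import mean

# Hypothetical data sets (assumption: not the slide's actual values).
without_outlier = [1, 3, 5, 7, 9]
with_outlier = [1, 3, 5, 7, 14]  # largest value replaced by an extreme value

mean_without = mean(without_outlier)
mean_with = mean(with_outlier)

print(mean_without)  # 5
print(mean_with)     # 6
```

Changing a single observation shifts the mean by one full unit, which is the sensitivity the slide describes.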


Mean (Arithmetic Mean) (continued)

Approximating the Arithmetic Mean
  Used when raw data are not available

  $$\bar{X} = \frac{\sum_{j=1}^{c} m_j f_j}{n}$$

  $n$ = sample size
  $c$ = number of classes in the frequency distribution
  $m_j$ = midpoint of the $j$th class
  $f_j$ = frequency of the $j$th class

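
The approximation from a frequency distribution can be sketched as follows. The function name `grouped_mean` and the five-class distribution are assumptions made up for illustration.

```python
def grouped_mean(midpoints, frequencies):
    """Approximate the mean as sum(m_j * f_j) / n, with n = sum(f_j)."""
    n = sum(frequencies)
    return sum(m * f for m, f in zip(midpoints, frequencies)) / n

# Hypothetical frequency distribution with c = 5 classes.
midpoints = [15, 25, 35, 45, 55]   # m_j
frequencies = [3, 6, 5, 4, 2]      # f_j, so n = 20

approx_mean = grouped_mean(midpoints, frequencies)
print(approx_mean)  # 33.0
```

Every observation in a class is treated as if it sat at the class midpoint, which is why the result is only an approximation of the raw-data mean.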
Median

Robust Measure of Central Tendency
Not Affected by Extreme Values

[Figure: two dot plots on number lines (0 to 10 and 0 to 14); Median = 5 for both]

In an Ordered Array, the Median is the 'Middle' Number
  If n or N is odd, the median is the middle number
  If n or N is even, the median is the average of the 2 middle numbers
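
The odd and even cases can be sketched in a few lines. The data sets are hypothetical, chosen so that the median is 5 with and without an extreme value, as on the slide.

```python
def median(values):
    """Middle value of the ordered array; average of the 2 middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2

# Hypothetical data sets (assumption: not the slide's actual values).
print(median([1, 3, 5, 7, 9]))   # 5
print(median([1, 3, 5, 7, 14]))  # 5, unchanged by the extreme value
print(median([1, 3, 5, 7]))      # 4.0, average of 3 and 5
```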


Mode

A Measure of Central Tendency
Value that Occurs Most Often
Not Affected by Extreme Values
There May Not Be a Mode
There May Be Several Modes
Used for Either Numerical or Categorical Data

[Figure: two dot plots; the first (axis 0 to 6) has No Mode, the second (axis 0 to 14) has Mode = 9]

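
A minimal sketch of the three cases (no mode, one mode, several modes) follows. The helper name `modes`, the data, and the convention that a data set has no mode when every value occurs exactly once are assumptions for illustration; the input is assumed to be non-empty.

```python
from collections import Counter

def modes(values):
    """Return all most-frequent values, or [] when every value occurs once."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []  # assumption: no repeated value means no mode
    return [value for value, count in counts.items() if count == highest]

# Hypothetical data sets.
print(modes([0, 1, 2, 3, 4, 5, 6]))                     # []
print(modes([0, 1, 3, 5, 9, 9, 9, 13, 14]))             # [9]
print(modes(["red", "blue", "red", "green", "blue"]))   # ['red', 'blue']
```

The last call shows that the mode also works for categorical data, where a mean or median would not be defined.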
Geometric Mean

Useful in the Measure of Rate of Change of a Variable Over Time

  $$\bar{X}_G = (X_1 \times X_2 \times \cdots \times X_n)^{1/n}$$

Geometric Mean Rate of Return
  Measures the status of an investment over time

  $$\bar{R}_G = [(1 + R_1) \times (1 + R_2) \times \cdots \times (1 + R_n)]^{1/n} - 1$$

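
A short sketch of the geometric mean rate of return, under assumed figures: an investment that gains 100% in one period and loses 50% in the next. The function name and the returns are hypothetical.

```python
import math

def geometric_mean_return(returns):
    """[(1 + R_1) * ... * (1 + R_n)] ** (1/n) - 1 for per-period returns R_i."""
    growth = math.prod(1 + r for r in returns)
    return growth ** (1 / len(returns)) - 1

# Hypothetical returns: +100% in period 1, -50% in period 2.
returns = [1.00, -0.50]

arithmetic = sum(returns) / len(returns)
geometric = geometric_mean_return(returns)

print(arithmetic)  # 0.25
print(geometric)   # 0.0
```

The investment ends where it started, so the geometric mean of 0% describes its status correctly, while the arithmetic mean of 25% overstates it.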
Quartiles

Split Ordered Data into 4 Quarters

  25% | Q1 | 25% | Q2 | 25% | Q3 | 25%

Position of i-th Quartile:

  $$\text{Position of } Q_i = \frac{i(n+1)}{4}$$

Data in Ordered Array: 11 12 13 16 16 17 18 21 22

  Position of $Q_1 = \frac{1(9+1)}{4} = 2.5$, so $Q_1 = \frac{12 + 13}{2} = 12.5$

Q1 and Q3 are Measures of Noncentral Location
Q2 = Median, a Measure of Central Tendency

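
The position rule can be sketched as below, using the slide's ordered array. Two choices here are assumptions rather than something the slide states: a fractional position is resolved by linear interpolation between its two neighbours (for a position ending in .5 this is their average, as in the slide's Q1), and at least 3 observations are required so that every position falls inside the array.

```python
def quartile(ordered, i):
    """i-th quartile (i = 1, 2, 3) of an ordered array via position i(n + 1)/4."""
    n = len(ordered)
    if n < 3:
        raise ValueError("need at least 3 observations")
    position = i * (n + 1) / 4      # 1-based position in the ordered array
    lower = int(position)           # whole part of the position
    fraction = position - lower
    if fraction == 0:
        return ordered[lower - 1]
    # Assumption: interpolate linearly between the two neighbouring values.
    return ordered[lower - 1] + fraction * (ordered[lower] - ordered[lower - 1])

data = [11, 12, 13, 16, 16, 17, 18, 21, 22]
print(quartile(data, 1))  # 12.5
print(quartile(data, 2))  # 16
print(quartile(data, 3))  # 19.5
```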
Measures of Variation

Variation
  Range
  Interquartile Range
  Variance
    Population Variance
    Sample Variance
  Standard Deviation
    Population Standard Deviation
    Sample Standard Deviation
  Coefficient of Variation

Range

Measure of Variation
Difference between the Largest and the Smallest Observations:

  $$\text{Range} = X_{\text{Largest}} - X_{\text{Smallest}}$$

Ignores How Data are Distributed

[Figure: two dot plots on an axis from 7 to 12 with differently distributed values; Range = 12 - 7 = 5 for both]

Interquartile Range

Measure of Variation
Also Known as Midspread
  Spread in the middle 50%
Difference between the First and Third Quartiles
Not Affected by Extreme Values

Data in Ordered Array: 11 12 13 16 16 17 17 18 21

  $$\text{Interquartile Range} = Q_3 - Q_1 = 17.5 - 12.5 = 5$$
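
The slide's result can be reproduced with Python's standard library. This is a sketch that assumes the `"exclusive"` method of `statistics.quantiles`, which places quartiles at positions based on n + 1 and interpolates between neighbours, matches the slide's rule; for this data set it does.

```python
from statistics import quantiles

data = [11, 12, 13, 16, 16, 17, 17, 18, 21]

# n=4 asks for the three cut points that split the data into quarters.
q1, q2, q3 = quantiles(data, n=4, method="exclusive")
iqr = q3 - q1

print(q1, q3)  # 12.5 17.5
print(iqr)     # 5.0
```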


Variance

Important Measure of Variation
Shows Variation about the Mean
  Sample Variance:
    $$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$$
  Population Variance:
    $$\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}$$

Standard Deviation

Most Important Measure of Variation
Shows Variation about the Mean
Has the Same Units as the Original Data
  Sample Standard Deviation:
    $$S = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$$
  Population Standard Deviation:
    $$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}}$$

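
The sample and population versions of the variance and standard deviation differ only in the divisor, which the following sketch makes explicit. The data set is hypothetical; the results are cross-checked against Python's `statistics` module.

```python
import math
import statistics

# Hypothetical data set (mean = 16).
data = [10, 12, 14, 15, 17, 18, 18, 24]
n = len(data)
mean = sum(data) / n
sum_sq = sum((x - mean) ** 2 for x in data)  # sum of squared deviations = 130.0

sample_variance = sum_sq / (n - 1)       # divide by n - 1
population_variance = sum_sq / n         # divide by N
sample_sd = math.sqrt(sample_variance)
population_sd = math.sqrt(population_variance)

print(round(sample_variance, 4), round(sample_sd, 4))
print(round(population_variance, 4), round(population_sd, 4))
```

The standard deviations are in the same units as the data, while the variances are in squared units.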
Standard Deviation (continued)

Approximating the Standard Deviation
  Used when the raw data are not available and the only source of data is a frequency distribution

  $$S = \sqrt{\frac{\sum_{j=1}^{c} (m_j - \bar{X})^2 f_j}{n - 1}}$$

  $n$ = sample size
  $c$ = number of classes in the frequency distribution
  $m_j$ = midpoint of the $j$th class
  $f_j$ = frequency of the $j$th class

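
A minimal sketch of the grouped-data approximation, assuming the mean is itself approximated from the class midpoints. The function name and the five-class frequency distribution are hypothetical.

```python
import math

def grouped_stdev(midpoints, frequencies):
    """Approximate S from class midpoints m_j and frequencies f_j."""
    n = sum(frequencies)
    # Approximate mean: sum(m_j * f_j) / n.
    mean = sum(m * f for m, f in zip(midpoints, frequencies)) / n
    # Weighted sum of squared deviations of the midpoints from the mean.
    sum_sq = sum((m - mean) ** 2 * f for m, f in zip(midpoints, frequencies))
    return math.sqrt(sum_sq / (n - 1))

# Hypothetical frequency distribution with c = 5 classes, n = 20.
midpoints = [15, 25, 35, 45, 55]
frequencies = [3, 6, 5, 4, 2]

approx_sd = grouped_stdev(midpoints, frequencies)
print(round(approx_sd, 3))  # 12.397
```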
Comparing Standard Deviations

[Figure: three dot plots on an axis from 11 to 21, each with Mean = 15.5]
  Data A: s = 3.338
  Data B: s = .9258
  Data C: s = 4.57


Coefficient of Variation

Measure of Relative Variation
Always in Percentage (%)
Shows Variation Relative to the Mean
Used to Compare Two or More Sets of Data Measured in Different Units
Sensitive to Outliers

  $$CV = \left(\frac{S}{\bar{X}}\right) \cdot 100\%$$

Comparing Coefficient of Variation

Stock A:
  Average price last year = $50
  Standard deviation = $2

Stock B:
  Average price last year = $100
  Standard deviation = $5

Coefficient of Variation:
  Stock A:
    $$CV = \left(\frac{S}{\bar{X}}\right) \cdot 100\% = \left(\frac{\$2}{\$50}\right) \cdot 100\% = 4\%$$
  Stock B:
    $$CV = \left(\frac{S}{\bar{X}}\right) \cdot 100\% = \left(\frac{\$5}{\$100}\right) \cdot 100\% = 5\%$$

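
The two stocks can be compared with a short sketch; the function name is an assumption, while the prices and standard deviations are the slide's.

```python
def coefficient_of_variation(std_dev, mean):
    """CV = (S / mean) * 100, expressed in percent."""
    return 100 * std_dev / mean

cv_a = coefficient_of_variation(std_dev=2, mean=50)    # Stock A
cv_b = coefficient_of_variation(std_dev=5, mean=100)   # Stock B

print(cv_a)  # 4.0
print(cv_b)  # 5.0
```

Stock B varies more relative to its average price (5% against 4%), a comparison that does not depend on the price level of either stock.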
Shape of a Distribution

Describe How Data are Distributed
Measures of Shape
  Symmetric or skewed

  Left-Skewed:  Mean < Median < Mode
  Symmetric:    Mean = Median = Mode
  Right-Skewed: Mode < Median < Mean


Exploratory Data Analysis

Box-and-Whisker
  Graphical display of data using 5-number summary

[Figure: box-and-whisker plot on an axis from 4 to 12, marking X smallest, Q1, Median (Q2), Q3, and X largest]

Distribution Shape & Box-and-Whisker

[Figure: box-and-whisker plots for Left-Skewed, Symmetric, and Right-Skewed distributions, each marking Q1, Q2, and Q3]


Exploratory Data Analysis

Stem-and-leaf display: An exploratory data analysis technique that simultaneously rank orders quantitative data and provides insight about the shape of the distribution.

Stem-and-leaf display
NUMBER OF QUESTIONS ANSWERED CORRECTLY
ON AN APTITUDE TEST
112 72 69 97 107
73 92 76 86 73
126 128 118 127 124
82 104 132 134 83
92 108 96 100 92
115 76 91 102 81
95 141 81 80 106
84 119 113 98 75
68 98 115 106 95
100 85 94 106 119

Stem-and-leaf display
Number of questions Stem-and-Leaf Plot

 Frequency    Stem &  Leaf

     2.00        6 .  89
     6.00        7 .  233566
     8.00        8 .  01123456
    11.00        9 .  12224556788
     9.00       10 .  002466678
     7.00       11 .  2355899
     4.00       12 .  4678
     2.00       13 .  24
     1.00       14 .  1

 Stem width:  10.00
 Each leaf:   1 case(s)

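
The display above can be rebuilt from the 50 aptitude-test scores with a short sketch. The function name `stem_and_leaf` and the output layout are assumptions; the stem width of 10 and one case per leaf follow the display's own legend.

```python
from collections import defaultdict

scores = [
    112, 72, 69, 97, 107, 73, 92, 76, 86, 73,
    126, 128, 118, 127, 124, 82, 104, 132, 134, 83,
    92, 108, 96, 100, 92, 115, 76, 91, 102, 81,
    95, 141, 81, 80, 106, 84, 119, 113, 98, 75,
    68, 98, 115, 106, 95, 100, 85, 94, 106, 119,
]

def stem_and_leaf(values, stem_width=10):
    """Map each stem to its leaves, in rank order, as a string of digits."""
    leaves = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, stem_width)
        leaves[stem].append(str(leaf))
    return {stem: "".join(digits) for stem, digits in sorted(leaves.items())}

display = stem_and_leaf(scores)
for stem, leaf_string in display.items():
    # Frequency, stem, then the leaves for that stem.
    print(f"{len(leaf_string):6.2f} {stem:4d} . {leaf_string}")
```

Sorting before splitting each score into stem and leaf is what rank orders the data; the length of each leaf string gives the frequency column.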
The Empirical Rule

For Bell-Shaped (Approximately Normal) Data Sets, Roughly 68% of the Observations Fall Within 1 Standard Deviation Around the Mean
Roughly 95% of the Observations Fall Within 2 Standard Deviations Around the Mean
Roughly 99.7% of the Observations Fall Within 3 Standard Deviations Around the Mean
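
The three percentages are those of a normal distribution, which can be checked with a small sketch. It assumes an exactly normal population (the idealized bell shape); real data sets will only match roughly.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of a normal population within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(100 * share_within(k), 2))
```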


The Bienayme-Chebyshev Rule

The Percentage of Observations Contained Within Distances of k Standard Deviations Around the Mean Must Be at Least

  $$\left(1 - \frac{1}{k^2}\right) \cdot 100\%$$

Applies regardless of the shape of the data set
  At least 75% of the observations must be contained within distances of 2 standard deviations around the mean
  At least 88.89% of the observations must be contained within distances of 3 standard deviations around the mean
  At least 93.75% of the observations must be contained within distances of 4 standard deviations around the mean
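
The three bounds quoted above follow directly from the formula, as this sketch shows. The function name is an assumption, and the check for k > 1 reflects the fact that the bound is only informative in that range.

```python
def chebyshev_lower_bound(k):
    """Minimum percentage of observations within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return (1 - 1 / k**2) * 100

for k in (2, 3, 4):
    print(k, round(chebyshev_lower_bound(k), 2))
# 2 75.0
# 3 88.89
# 4 93.75
```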


Coefficient of Correlation

Measures the Strength of the Linear Relationship between 2 Quantitative Variables

  $$r = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2 \sum_{i=1}^{n} (Y_i - \bar{Y})^2}}$$
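
The formula can be computed term by term, as in the following sketch. The function name and the three small data sets are hypothetical; they are chosen to give a perfect positive, a perfect negative, and a zero linear relationship.

```python
import math

def correlation(xs, ys):
    """Coefficient of correlation r for paired observations (X_i, Y_i)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    sum_yy = sum((y - mean_y) ** 2 for y in ys)
    return sum_xy / math.sqrt(sum_xx * sum_yy)

x = [1, 2, 3, 4, 5]
r_positive = correlation(x, [2, 4, 6, 8, 10])   # Y rises in step with X
r_negative = correlation(x, [10, 8, 6, 4, 2])   # Y falls in step with X
r_zero = correlation(x, [4, 1, 0, 1, 4])        # curved, not linear

print(r_positive, r_negative, r_zero)
```

In the third data set Y is fully determined by X, yet r is 0, because r measures only the linear part of a relationship.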


Features of Correlation Coefficient

Unit Free
Ranges between –1 and 1
The Closer to –1, the Stronger the Negative Linear Relationship
The Closer to 1, the Stronger the Positive Linear Relationship
The Closer to 0, the Weaker Any Linear Relationship

Scatter Plots of Data with Various Correlation Coefficients

[Figure: five scatter plots of Y against X, with r = -1, r = -.6, r = 0, r = .6, and r = 1]

Pitfalls in Numerical Descriptive Measures and Ethical Issues

Data Analysis is Objective
  Should report the summary measures that best meet the assumptions about the data set
Data Interpretation is Subjective
  Should be done in a fair, neutral and clear manner
Ethical Issues
  Should document both good and bad results
  Presentation should be fair, objective and neutral
  Should not use inappropriate summary measures to distort the facts
