Numerical Descriptive Measures
Sample mean:
$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{X_1 + X_2 + \cdots + X_n}{n}$$
Where n = sample size
X_i = observed values
[Figure: two dot plots on an axis from 11 to 20]
Values 11, 12, 13, 14, 15: Mean = (11 + 12 + 13 + 14 + 15)/5 = 65/5 = 13
Values 11, 12, 13, 14, 20: Mean = (11 + 12 + 13 + 14 + 20)/5 = 70/5 = 14
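The effect of one extreme value on the mean can be checked directly. The following is a minimal sketch using Python's standard `statistics` module on the two five-value data sets above; the variable names are illustrative only.

```python
from statistics import mean

# The two data sets from the slide: identical except for the largest value.
without_extreme = [11, 12, 13, 14, 15]
with_extreme = [11, 12, 13, 14, 20]

mean_without = mean(without_extreme)  # 65 / 5
mean_with = mean(with_extreme)        # 70 / 5

# Changing a single value from 15 to 20 moves the mean from 13 to 14.
print(mean_without, mean_with)
```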
[Figure: two dot plots on an axis from 11 to 20; Median = 13 in both]
$$\text{Median position} = \frac{n + 1}{2} \text{ position in the ordered data}$$
If the number of values is odd, the median is the middle number.
Note that (n + 1)/2 is not the value of the median, only the position of the median in the ranked data.
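The position rule can be sketched in a few lines of Python. The function name and the example values are illustrative. The handling of an even number of values (averaging the two middle values) is the usual convention and is an assumption here, since the text above only states the odd case.

```python
def median_by_position(values):
    """Return the median using the (n + 1) / 2 position rule on ranked data."""
    ranked = sorted(values)
    n = len(ranked)
    position = (n + 1) / 2  # 1-based position in the ranked data, not the median itself
    if n % 2 == 1:
        # Odd n: the position is a whole number, so take the value at that rank.
        return ranked[int(position) - 1]
    # Even n (assumed convention): average the two values around the position.
    lower = ranked[n // 2 - 1]
    upper = ranked[n // 2]
    return (lower + upper) / 2


odd_example = median_by_position([14, 11, 20, 12, 13])  # position 3 of 5
even_example = median_by_position([11, 12, 13, 14])     # position 2.5 of 4
```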
[Figure: two dot plots, one on an axis from 0 to 14 with Mode = 9, one on an axis from 0 to 6 with No Mode]
Copyright © 2017 Pearson Education, Ltd. Chapter 1 - 8
Measures of Central Tendency:
Review Example
DCOVA
House Prices:
$2,000,000
$  500,000
$  300,000
$  100,000
$  100,000
Sum $3,000,000

Mean: ($3,000,000/5) = $600,000
Median: middle value of ranked data = $300,000
Mode: most frequent value = $100,000
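The three measures in this review example can be reproduced with Python's standard `statistics` module. This is a small sketch; only the five prices come from the slide, and the variable names are illustrative.

```python
from statistics import mean, median, mode

# The five house prices from the review example.
house_prices = [2_000_000, 500_000, 300_000, 100_000, 100_000]

price_mean = mean(house_prices)      # sum of 3,000,000 divided by 5
price_median = median(house_prices)  # middle value of the ranked data
price_mode = mode(house_prices)      # most frequent value

print(price_mean, price_median, price_mode)
```

The single $2,000,000 house pulls the mean well above the median, which is why the median is often preferred for data like these.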
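Geometric mean:
$$\bar{X}_G = (X_1 \times X_2 \times \cdots \times X_n)^{1/n}$$
Geometric mean rate of return:
$$\bar{R}_G = [(1 + R_1) \times (1 + R_2) \times \cdots \times (1 + R_n)]^{1/n} - 1$$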
Where Ri is the rate of return in time period i.
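Arithmetic mean rate of return (misleading result):
$$\bar{X} = \frac{(-.5) + (1)}{2} = .25 = 25\%$$
Geometric mean rate of return (more representative result):
$$\bar{R}_G = [(1 + R_1) \times (1 + R_2) \times \cdots \times (1 + R_n)]^{1/n} - 1$$
$$= [(1 + (-.5)) \times (1 + (1))]^{1/2} - 1 = [(.50) \times (2)]^{1/2} - 1 = 1^{1/2} - 1 = 0\%$$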
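The comparison between the two averages can be checked numerically. The sketch below applies the geometric mean rate of return formula to the two period returns used in the example (-50% and +100%); the function name is illustrative.

```python
import math


def geometric_mean_return(rates):
    """Geometric mean rate of return: [(1 + R1)(1 + R2)...(1 + Rn)]^(1/n) - 1."""
    growth = math.prod(1 + r for r in rates)
    return growth ** (1 / len(rates)) - 1


rates = [-0.5, 1.0]  # returns of -50% and +100% in two periods

arithmetic = sum(rates) / len(rates)      # (-0.5 + 1.0) / 2
geometric = geometric_mean_return(rates)  # [(0.5)(2.0)]^(1/2) - 1

print(f"arithmetic: {arithmetic:.0%}, geometric: {geometric:.0%}")
```

A loss of 50% followed by a gain of 100% returns the starting value, so the 0% geometric result describes the overall change while the 25% arithmetic result does not.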
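Arithmetic mean: $$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$$
Median: middle value in the ordered array
Mode: most frequently observed value
Geometric mean: $$\bar{X}_G = (X_1 \times X_2 \times \cdots \times X_n)^{1/n}$$ (rate of change of a variable over time)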
Example:
[Figure: dot plot on an axis from 0 to 14; smallest value 1, largest value 13]
Range = 13 - 1 = 12
[Figure: two dot plots on an axis from 7 to 12; Range = 12 - 7 = 5 for each]
Sensitive to outliers
1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,5
Range = 5 - 1 = 4
1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,120
Range = 120 - 1 = 119
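The sensitivity of the range to a single outlier can be reproduced with the two data sets above. This is a minimal sketch; the helper name `value_range` is illustrative.

```python
def value_range(values):
    """Range = largest value - smallest value."""
    return max(values) - min(values)


# 25 values from the slide: eleven 1s, eight 2s, four 3s, then 4 and 5.
base = [1] * 11 + [2] * 8 + [3] * 4 + [4, 5]
# Same data with the last value replaced by an outlier of 120.
with_outlier = base[:-1] + [120]

range_base = value_range(base)             # 5 - 1
range_outlier = value_range(with_outlier)  # 120 - 1

print(range_base, range_outlier)
```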
Sample variance:
$$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$$
Where $\bar{X}$ = arithmetic mean
n = sample size
X_i = ith value of the variable X
Measures of Variation:
The Sample Standard Deviation
Most commonly used measure of variation.
Shows variation about the mean.
Is the square root of the variance.
Has the same units as the original data.
Sample standard deviation:
$$S = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$$
If the values are all the same (no variation), all these
measures will be zero.
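The sample variance and sample standard deviation can be computed step by step and cross-checked against Python's standard `statistics` module. In the sketch below the data values are illustrative and are not taken from this part of the slides.

```python
import math
from statistics import stdev, variance


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)


data = [11, 12, 13, 14, 15]        # illustrative sample
s_squared = sample_variance(data)  # (4 + 1 + 0 + 1 + 4) / 4
s = math.sqrt(s_squared)           # same units as the original data

# The hand-written formula agrees with the library functions.
matches_library = math.isclose(s_squared, variance(data)) and math.isclose(s, stdev(data))

# With no variation, the variance (and so the standard deviation) is zero.
no_variation = sample_variance([7, 7, 7, 7])
```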
Coefficient of variation:
$$CV = \left(\frac{S}{\bar{X}}\right) \cdot 100\%$$
Measures of Variation:
Comparing Coefficients of Variation
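Stock A:
Mean price last year = $50.
Standard deviation = $5.
$$CV_A = \left(\frac{S}{\bar{X}}\right) \cdot 100\% = \frac{\$5}{\$50} \cdot 100\% = 10\%$$
Stock B:
Mean price last year = $100.
Standard deviation = $5.
$$CV_B = \left(\frac{S}{\bar{X}}\right) \cdot 100\% = \frac{\$5}{\$100} \cdot 100\% = 5\%$$
Both stocks have the same standard deviation, but stock B is less variable relative to its mean price.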
Measures of Variation:
Comparing Coefficients of Variation (cont'd)
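Stock A:
Mean price last year = $50.
Standard deviation = $5.
$$CV_A = \left(\frac{S}{\bar{X}}\right) \cdot 100\% = \frac{\$5}{\$50} \cdot 100\% = 10\%$$
Stock C:
Mean price last year = $8.
Standard deviation = $2.
$$CV_C = \left(\frac{S}{\bar{X}}\right) \cdot 100\% = \frac{\$2}{\$8} \cdot 100\% = 25\%$$
Stock C has a much smaller standard deviation but a much higher coefficient of variation.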
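The three stock comparisons reduce to one small calculation. The sketch below recomputes the coefficient of variation for stocks A, B, and C from the means and standard deviations given on the slides; the function name is illustrative.

```python
def coefficient_of_variation(std_dev, mean_value):
    """CV = (S / X-bar) * 100%, returned as a percentage."""
    return 100 * std_dev / mean_value


cv_a = coefficient_of_variation(5, 50)   # stock A: S = $5, mean = $50
cv_b = coefficient_of_variation(5, 100)  # stock B: S = $5, mean = $100
cv_c = coefficient_of_variation(2, 8)    # stock C: S = $2, mean = $8

print(f"A: {cv_a:.0f}%  B: {cv_b:.0f}%  C: {cv_c:.0f}%")
```

Because the coefficient of variation is unit free, it ranks the stocks by relative variability (C, then A, then B) even though C has the smallest standard deviation in dollars.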
Locating Extreme Outliers:
Z-Score
To compute the Z-score of a data value, subtract the
mean and divide by the standard deviation.
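As a sketch of the calculation just described, the function below computes a Z-score. The numbers in the example (a value of 62 from data with mean 50 and standard deviation 5) are hypothetical and chosen only for illustration.

```python
def z_score(value, mean_value, std_dev):
    """Z = (value - mean) / standard deviation."""
    return (value - mean_value) / std_dev


# Hypothetical numbers: a value of 62 in data with mean 50 and standard deviation 5.
z = z_score(62, 50, 5)  # (62 - 50) / 5

# A value equal to the mean has a Z-score of zero.
z_at_mean = z_score(50, 50, 5)

print(z, z_at_mean)
```

The Z-score expresses how many standard deviations a value lies above (positive) or below (negative) the mean.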
Skewness statistic: < 0 (left-skewed), 0 (symmetric), > 0 (right-skewed)
Sharper peak than bell-shaped (kurtosis > 0)
Bell-shaped (kurtosis = 0)
Flatter than bell-shaped (kurtosis < 0)
Constructing a boxplot.
[Figure: ranked data divided by the quartiles Q1, Q2, and Q3]
The first quartile, Q1, is the value for which 25% of the
values are smaller and 75% are larger.
Q2 is the same as the median (50% of the values are
smaller and 50% are larger).
Only 25% of the values are greater than the third quartile.
(n = 9)
Q1 is in the (9+1)/4 = 2.5 position of the ranked data, so use the value halfway between the 2nd and 3rd values:
Q1 = (12+13)/2 = 12.5.
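The position rule for Q1 can be sketched as follows. The nine data values are assumed for illustration; the text above only fixes the 2nd and 3rd ranked values, 12 and 13. Interpolating for fractional positions other than one half is also an assumption, since the example only covers the halfway case.

```python
def first_quartile(values):
    """Q1 located at the (n + 1) / 4 position of the ranked data."""
    ranked = sorted(values)
    n = len(ranked)
    if n < 3:
        raise ValueError("need at least 3 values to locate Q1 by position")
    position = (n + 1) / 4       # 1-based position in the ranked data
    lower_rank = int(position)   # whole part of the position
    fraction = position - lower_rank
    below = ranked[lower_rank - 1]
    if fraction == 0:
        return below             # position is a whole number
    above = ranked[lower_rank]
    # Fractional position: move that fraction of the way from one value to the next
    # (for a position ending in .5 this is the halfway point).
    return below + fraction * (above - below)


sample = [11, 12, 13, 16, 16, 17, 18, 21, 22]  # assumed data, n = 9
q1 = first_quartile(sample)                    # position 2.5 -> (12 + 13) / 2
```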
Measures like Q1, Q3, and IQR that are not influenced
by outliers are called resistant measures.
Example:
X minimum = 12, Q1 = 30, Median (Q2) = 45, Q3 = 57, X maximum = 70
25% of the values lie in each of the four segments between these points.
Interquartile range = 57 – 30 = 27
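As a small sketch, the five-number summary from this example can be stored and the interquartile range computed from it. The dictionary keys are illustrative names.

```python
# Five-number summary from the boxplot example.
summary = {"minimum": 12, "Q1": 30, "median": 45, "Q3": 57, "maximum": 70}

# Interquartile range: spread of the middle 50% of the values.
iqr = summary["Q3"] - summary["Q1"]

# Overall range, for comparison: it depends only on the two extreme values.
overall_range = summary["maximum"] - summary["minimum"]

print(iqr, overall_range)
```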
[Figure: three boxplot panels, each marked with Q1, Q2, and Q3]
Population mean:
$$\mu = \frac{\sum_{i=1}^{N} X_i}{N} = \frac{X_1 + X_2 + \cdots + X_N}{N}$$
Where μ = population mean
N = population size
Xi = ith value of the variable X
Numerical Descriptive Measures
For a Population: The Variance σ²
Average of squared deviations of values from
the mean.
Population variance:
$$\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}$$
Population standard deviation:
$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}}$$
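The population formulas divide by N rather than by n - 1. The sketch below shows the difference on a small illustrative data set, first treated as a complete population and then as a sample; the data values are not from the slides.

```python
import math
from statistics import pvariance, variance


def population_variance(values):
    """Average of squared deviations from the population mean (divide by N)."""
    size = len(values)
    mu = sum(values) / size
    return sum((x - mu) ** 2 for x in values) / size


values = [11, 12, 13, 14, 15]  # illustrative data

sigma_squared = population_variance(values)  # 10 / 5
sigma = math.sqrt(sigma_squared)

# The same values treated as a sample use n - 1 in the denominator: 10 / 4.
s_squared = variance(values)

# The hand-written formula agrees with the library's population variance.
matches_library = math.isclose(sigma_squared, pvariance(values))
```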
[Figure: bell curve with approximately 68% of the data within μ ± 1σ]
The Empirical Rule
Approximately 95% of the data in a bell-shaped distribution lies within two standard deviations of the mean, or µ ± 2σ.
[Figure: bell curves with approximately 95% of the data within μ ± 2σ and 99.7% within μ ± 3σ]
Using the Empirical Rule
Suppose that the variable Math SAT scores is bell-shaped with a mean of 500 and a standard deviation of 90. Then:
Approximately 68% of all test takers scored between 410 and 590 (500 ± 90).
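The empirical rule intervals for this example can be tabulated with a short sketch. It assumes the usual 68%, 95%, and 99.7% coverages for one, two, and three standard deviations; the function name is illustrative.

```python
def empirical_rule_intervals(mean_value, std_dev):
    """Intervals mean ± k standard deviations for k = 1, 2, 3."""
    coverage = {1: "68%", 2: "95%", 3: "99.7%"}
    return {
        label: (mean_value - k * std_dev, mean_value + k * std_dev)
        for k, label in coverage.items()
    }


# Math SAT example: mean 500, standard deviation 90.
intervals = empirical_rule_intervals(500, 90)

for label, (low, high) in intervals.items():
    print(f"approximately {label} of scores between {low} and {high}")
```

These percentages are approximations that apply only when the distribution is bell-shaped.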
The Covariance.
The Coefficient of Correlation.
Sample covariance:
$$\operatorname{cov}(X, Y) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1}$$
Only concerned with the strength of the relationship.
No causal effect is implied.
Interpreting Covariance
Covariance between two variables:
cov(X,Y) > 0: X and Y tend to move in the same direction.
cov(X,Y) < 0: X and Y tend to move in opposite directions.
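The sign interpretation can be illustrated with a minimal sketch of the sample covariance formula. The three short data series are hypothetical and were chosen so that one pair moves together and the other moves in opposite directions.

```python
def sample_covariance(x, y):
    """Sum of products of deviations from the means, divided by n - 1."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)


x = [1, 2, 3, 4, 5]            # hypothetical data
y_same = [2, 4, 6, 8, 10]      # rises as x rises
y_opposite = [10, 8, 6, 4, 2]  # falls as x rises

cov_same = sample_covariance(x, y_same)          # positive
cov_opposite = sample_covariance(x, y_opposite)  # negative

print(cov_same, cov_opposite)
```

The size of the covariance depends on the units of X and Y, so only its sign is interpreted here.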
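Sample coefficient of correlation:
$$r = \frac{\operatorname{cov}(X, Y)}{S_X S_Y}$$
Where,
$$\operatorname{cov}(X, Y) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1}$$
$$S_X = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$$
$$S_Y = \sqrt{\frac{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}{n - 1}}$$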
Features of the
Coefficient of Correlation
The population coefficient of correlation is referred to as ρ.
The sample coefficient of correlation is referred to as r.
Both ρ and r have the following features:
Unit free.
Range between –1 and 1.
The closer to –1, the stronger the negative linear relationship.
The closer to 1, the stronger the positive linear relationship.
The closer to 0, the weaker the linear relationship.
[Figure: scatter plots of Y against X for r = -1, r = -.6, r = +1, r = +.3, and r = 0]
The Coefficient of Correlation Using
Microsoft Excel Function
Test #1 Score    Test #2 Score
78               82
92               88
86               91
83               90
95               92
85               85
91               89
76               81
88               96
79               77

Correlation Coefficient: 0.7332, from =CORREL(A2:A11,B2:B11)
r = 0.733.
Scatter Plot of Test Scores
[Figure: scatter plot of Test #2 Score against Test #1 Score, both axes running from 70 to 100]
There is a relatively strong positive linear relationship between test score #1 and test score #2.
Students who scored high on the first test tended to score high on the second test.
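The Excel result can be cross-checked outside a spreadsheet. The sketch below computes r for the ten pairs of test scores as the sample covariance divided by the product of the two sample standard deviations; the function names are illustrative.

```python
import math

# The ten pairs of test scores from the Excel example.
test1 = [78, 92, 86, 83, 95, 85, 91, 76, 88, 79]
test2 = [82, 88, 91, 90, 92, 85, 89, 81, 96, 77]


def sample_covariance(x, y):
    """Sum of products of deviations from the means, divided by n - 1."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)


def sample_std(values):
    """Sample standard deviation: the covariance of a variable with itself, square-rooted."""
    return math.sqrt(sample_covariance(values, values))


def correlation(x, y):
    """r = cov(X, Y) / (S_X * S_Y)."""
    return sample_covariance(x, y) / (sample_std(x) * sample_std(y))


r = correlation(test1, test2)
print(round(r, 4))  # 0.7332, matching the CORREL result
```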