Class size    Average score ($\bar Y$)    Standard deviation ($s_Y$)    n
Small         657.4                       19.4                          238
Large         650.0                       17.9                          182
1. Estimation

$\bar Y_{small} - \bar Y_{large} = \frac{1}{n_{small}}\sum_{i=1}^{n_{small}} Y_i \;-\; \frac{1}{n_{large}}\sum_{i=1}^{n_{large}} Y_i$

(this is just notation for the difference in the sample means!)

$= 657.4 - 650.0 = 7.4$
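The difference-in-means estimate can be reproduced in Python; a minimal sketch using the group averages from the table above (summary numbers, not raw microdata):

```python
# Group averages taken from the table above (not raw data)
ybar_small = 657.4   # average test score, small classes
ybar_large = 650.0   # average test score, large classes

# The estimator of E(Y_small) - E(Y_large) is the difference in sample means
diff_in_means = ybar_small - ybar_large
print(round(diff_in_means, 1))  # 7.4
```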
2. Hypothesis testing

Difference-in-means t-statistic (remember this?):

$t = \dfrac{\bar Y_s - \bar Y_l}{SE(\bar Y_s - \bar Y_l)} = \dfrac{\bar Y_s - \bar Y_l}{\sqrt{\dfrac{s_s^2}{n_s} + \dfrac{s_l^2}{n_l}}}$

where $s_s^2 = \dfrac{1}{n_s - 1}\sum_{i=1}^{n_s}(Y_i - \bar Y_s)^2$.
Using the numbers from the table (small: $\bar Y_s = 657.4$, $s_s = 19.4$, $n_s = 238$; large: $\bar Y_l = 650.0$, $s_l = 17.9$, $n_l = 182$):

$t = \dfrac{\bar Y_s - \bar Y_l}{\sqrt{\dfrac{s_s^2}{n_s} + \dfrac{s_l^2}{n_l}}} = \dfrac{657.4 - 650.0}{\sqrt{\dfrac{19.4^2}{238} + \dfrac{17.9^2}{182}}} = \dfrac{7.4}{1.83} = 4.05$
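The arithmetic in the t-statistic can be checked with a few lines of Python; a sketch using the summary statistics above:

```python
import math

# Summary statistics from the table above
n_s, ybar_s, s_s = 238, 657.4, 19.4   # small classes
n_l, ybar_l, s_l = 182, 650.0, 17.9   # large classes

# Standard error of the difference in means, then the t-statistic
se = math.sqrt(s_s**2 / n_s + s_l**2 / n_l)
t = (ybar_s - ybar_l) / se
print(round(se, 2), round(t, 2))  # 1.83 4.05
```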
3. Confidence interval

A 95% confidence interval for the difference between the means is

$(\bar Y_s - \bar Y_l) \pm 1.96\,SE(\bar Y_s - \bar Y_l) = 7.4 \pm 1.96 \times 1.83 = (3.8,\ 11.0)$
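The same interval can be computed directly; a sketch in Python using the summary statistics from the table:

```python
import math

# Same summary statistics as in the t-statistic calculation
n_s, ybar_s, s_s = 238, 657.4, 19.4
n_l, ybar_l, s_l = 182, 650.0, 17.9

diff = ybar_s - ybar_l
se = math.sqrt(s_s**2 / n_s + s_l**2 / n_l)
ci_low, ci_high = diff - 1.96 * se, diff + 1.96 * se
print(round(ci_low, 2), round(ci_high, 2))  # 3.82 10.98
```

Note that the interval excludes zero, consistent with |t| = 4.05 > 1.96.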
Population distribution of Y

The probabilities of different values of Y that occur in the population, for ex. Pr[Y = 650] (when Y is discrete), or the probabilities of sets of these values, for ex. Pr[640 ≤ Y ≤ 660] (when Y is continuous).
variance = $E(Y - \mu_Y)^2 = \sigma_Y^2$
Moments, ctd.

skewness = $\dfrac{E\big[(Y - \mu_Y)^3\big]}{\sigma_Y^3}$ = measure of asymmetry of a distribution

kurtosis = $\dfrac{E\big[(Y - \mu_Y)^4\big]}{\sigma_Y^4}$ = measure of mass in tails
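These moment formulas can be checked by simulation: for a normal population the skewness is 0 and the kurtosis is 3. A sketch using simulated N(0,1) draws (not data from the slides):

```python
import random

random.seed(0)
ys = [random.gauss(0, 1) for _ in range(200_000)]  # draws from N(0,1)

n = len(ys)
mean = sum(ys) / n
sd = (sum((y - mean) ** 2 for y in ys) / n) ** 0.5
# Sample analogues of the population skewness and kurtosis formulas
skewness = sum((y - mean) ** 3 for y in ys) / n / sd**3
kurtosis = sum((y - mean) ** 4 for y in ys) / n / sd**4
```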
So is the correlation:
$corr(X,Z) = \dfrac{cov(X,Z)}{\sqrt{var(X)\,var(Z)}} = \dfrac{\sigma_{XZ}}{\sigma_X \sigma_Z} = r_{XZ}$

- $-1 \le corr(X,Z) \le 1$
- corr(X,Z) = 1 means perfect positive linear association
- corr(X,Z) = −1 means perfect negative linear association
- corr(X,Z) = 0 means no linear association
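A quick simulation illustrates the formula and the bounds; a sketch with made-up variables (Z is built from X plus noise, so the association is positive but not perfect):

```python
import random

random.seed(1)
n = 100_000
xs = [random.gauss(0, 1) for _ in range(n)]
zs = [2 * x + random.gauss(0, 1) for x in xs]   # positive linear association plus noise

# Sample versions of cov, var, and the correlation coefficient
mx, mz = sum(xs) / n, sum(zs) / n
cov = sum((x - mx) * (z - mz) for x, z in zip(xs, zs)) / n
var_x = sum((x - mx) ** 2 for x in xs) / n
var_z = sum((z - mz) ** 2 for z in zs) / n
r = cov / (var_x * var_z) ** 0.5   # corr(X, Z); theory gives 2/sqrt(5) ~ 0.894 here
```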
1. The probability framework for statistical inference
2. Estimation
3. Hypothesis Testing
4. Confidence Intervals
Estimation

a) What are the properties of $\bar Y$?
b) Why should we use $\bar Y$ rather than some other estimator, for example median(Y1, …, Yn)?

The starting point is the sampling distribution of $\bar Y$.
(a) The sampling distribution of $\bar Y$

$\bar Y$ is a random variable, and its properties are determined by the sampling distribution of $\bar Y$.

- The individuals in the sample are drawn at random; thus the values of (Y1, …, Yn) are random.
- The mean and variance of $\bar Y$ are the mean and variance of its sampling distribution, $E(\bar Y)$ and $var(\bar Y)$.
- The concept of the sampling distribution underpins all of econometrics.
The sampling distribution of $\bar Y$, ctd.

Example: Suppose Y takes on 0 or 1 (a Bernoulli random variable) with the probability distribution

Pr[Y = 0] = 0.22, Pr[Y = 1] = 0.78

Then E(Y) = p × 1 + (1 − p) × 0 = p = 0.78, and the sampling distribution of $\bar Y$ depends on n. For example, with n = 2 the sampling distribution of $\bar Y$ is: Pr($\bar Y$ = 0) = 0.22² ≈ 0.05, Pr($\bar Y$ = 1/2) = 2 × 0.22 × 0.78 ≈ 0.34, Pr($\bar Y$ = 1) = 0.78² ≈ 0.61.
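The n = 2 sampling distribution can be enumerated exactly; a sketch in Python for the p = 0.78 Bernoulli example above:

```python
from itertools import product

p = 0.78   # Pr[Y = 1], as in the example
n = 2

# Enumerate all (Y1, Y2) outcomes and accumulate Pr(Ybar = each value)
dist = {}
for draws in product([0, 1], repeat=n):
    prob = 1.0
    for y in draws:
        prob *= p if y == 1 else 1 - p
    ybar = sum(draws) / n
    dist[ybar] = dist.get(ybar, 0.0) + prob
# dist maps each possible Ybar (0, 0.5, 1) to its probability
```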
The sampling distribution of $\bar Y$ when Y is Bernoulli (p = 0.78): [figure omitted]
- If $E(\bar Y)$ = true $\mu_Y$ = 0.78, then $\bar Y$ is an unbiased estimator of $\mu_Y$.
- Does $\bar Y$ become close to $\mu_Y$ when n is large? If so, $\bar Y$ is a consistent estimator of $\mu_Y$.
Mean: $E(\bar Y) = E\!\left(\dfrac{1}{n}\sum_{i=1}^{n} Y_i\right) = \dfrac{1}{n}\sum_{i=1}^{n} E(Y_i) = \mu_Y$

Variance: $var(\bar Y) = E\big[\bar Y - E(\bar Y)\big]^2 = E\big[\bar Y - \mu_Y\big]^2 = E\!\left[\dfrac{1}{n}\sum_{i=1}^{n} Y_i - \mu_Y\right]^2 = E\!\left[\dfrac{1}{n}\sum_{i=1}^{n} (Y_i - \mu_Y)\right]^2$
$var(\bar Y) = E\!\left[\dfrac{1}{n}\sum_{i=1}^{n}(Y_i - \mu_Y)\right]\!\left[\dfrac{1}{n}\sum_{j=1}^{n}(Y_j - \mu_Y)\right]$

$= \dfrac{1}{n^2}\sum_{i=1}^{n}\sum_{j=1}^{n} E\big[(Y_i - \mu_Y)(Y_j - \mu_Y)\big]$

$= \dfrac{1}{n^2}\sum_{i=1}^{n}\sum_{j=1}^{n} cov(Y_i, Y_j)$

$= \dfrac{1}{n^2}\sum_{i=1}^{n} \sigma_Y^2$  (by i.i.d. sampling, $cov(Y_i, Y_j) = 0$ for $i \ne j$)

So $var(\bar Y) = \dfrac{\sigma_Y^2}{n}$.
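The result $var(\bar Y) = \sigma_Y^2/n$ can be verified by Monte Carlo; a sketch with made-up parameters ($\mu$ = 10, $\sigma^2$ = 4, n = 25):

```python
import random

random.seed(2)
mu, sigma2, n, reps = 10.0, 4.0, 25, 20_000

# Draw many samples of size n and record Ybar for each
ybars = []
for _ in range(reps):
    sample = [random.gauss(mu, sigma2 ** 0.5) for _ in range(n)]
    ybars.append(sum(sample) / n)

m = sum(ybars) / reps
var_ybar = sum((y - m) ** 2 for y in ybars) / reps
# theory: E(Ybar) = mu = 10, var(Ybar) = sigma2 / n = 4/25 = 0.16
```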
$E(\bar Y) = \mu_Y$ and $var(\bar Y) = \dfrac{\sigma_Y^2}{n}$.

Implications:
1. $\bar Y$ is an unbiased estimator of $\mu_Y$ (that is, $E(\bar Y) = \mu_Y$).
2. $var(\bar Y)$ is inversely proportional to n.
The sampling distribution of $\bar Y$ when n is large

For small sample sizes, the distribution of $\bar Y$ is complicated, but if n is large, the sampling distribution is simple!

1. As n increases, the distribution of $\bar Y$ becomes more tightly centered around $\mu_Y$ (the Law of Large Numbers).
If (Y1, …, Yn) are i.i.d. and $\sigma_Y^2 < \infty$, then $\bar Y$ is a consistent estimator of $\mu_Y$, that is,

$Pr\big[|\bar Y - \mu_Y| < \epsilon\big] \to 1$ as $n \to \infty$,

which can be written $\bar Y \xrightarrow{p} \mu_Y$ ("$\bar Y \xrightarrow{p} \mu_Y$" means "$\bar Y$ converges in probability to $\mu_Y$").

(The math: as $n \to \infty$, $var(\bar Y) = \dfrac{\sigma_Y^2}{n} \to 0$, which implies $Pr\big[|\bar Y - \mu_Y| < \epsilon\big] \to 1$.)
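The Law of Large Numbers is easy to see by simulation; a sketch using the Bernoulli(p = 0.78) example and an illustrative tolerance $\epsilon$ = 0.05:

```python
import random

random.seed(3)
p, eps, reps = 0.78, 0.05, 1_000

def frac_within(n):
    """Fraction of simulated samples of size n with |Ybar - p| < eps."""
    hits = 0
    for _ in range(reps):
        ybar = sum(random.random() < p for _ in range(n)) / n
        hits += abs(ybar - p) < eps
    return hits / reps

# The probability of being within eps of p rises toward 1 as n grows
small_n, large_n = frac_within(10), frac_within(5_000)
```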
If (Y1, …, Yn) are i.i.d. and $0 < \sigma_Y^2 < \infty$, then when n is large the distribution of $\bar Y$ is well approximated by a normal distribution:

- $\bar Y$ is approximately distributed $N\!\big(\mu_Y, \frac{\sigma_Y^2}{n}\big)$ (normal distribution with mean $\mu_Y$ and variance $\sigma_Y^2/n$).
- That is, the standardized $\bar Y$, $\dfrac{\bar Y - E(\bar Y)}{\sqrt{var(\bar Y)}} = \dfrac{\bar Y - \mu_Y}{\sigma_Y/\sqrt{n}}$, is approximately distributed N(0, 1) (standard normal).

Same example: sampling distribution of $\dfrac{\bar Y - E(\bar Y)}{\sqrt{var(\bar Y)}}$: [figure omitted]
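The normal approximation can be checked by simulation: if the standardized $\bar Y$ is roughly N(0,1), about 5% of samples should fall outside ±1.96. A sketch with the Bernoulli(p = 0.78) example at n = 100 (the approximation is close but not exact at this n, so the tail fraction is only roughly 5%):

```python
import random

random.seed(4)
p, n, reps = 0.78, 100, 20_000
mu, sigma = p, (p * (1 - p)) ** 0.5   # Bernoulli mean and std. dev.

outside = 0
for _ in range(reps):
    ybar = sum(random.random() < p for _ in range(n)) / n
    z = (ybar - mu) / (sigma / n ** 0.5)   # standardized Ybar
    outside += abs(z) > 1.96

frac_outside = outside / reps   # roughly 0.05 if z ~ N(0,1)
```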
Summary: The Sampling Distribution of $\bar Y$

For (Y1, …, Yn) i.i.d. with $0 < \sigma_Y^2 < \infty$:

- The exact (finite sample) sampling distribution of $\bar Y$ has mean $\mu_Y$ ($\bar Y$ is an unbiased estimator of $\mu_Y$) and variance $\sigma_Y^2/n$.
- Other than its mean and variance, the exact distribution of $\bar Y$ is complicated and depends on the distribution of Y (the population distribution).
- When n is large, the sampling distribution simplifies:
  - $\bar Y \xrightarrow{p} \mu_Y$ (Law of Large Numbers)
  - $\dfrac{\bar Y - E(\bar Y)}{\sqrt{var(\bar Y)}}$ is approximately distributed N(0, 1) (CLT)
(b) Why Use $\bar Y$ To Estimate $\mu_Y$?

- $\bar Y$ is unbiased: $E(\bar Y) = \mu_Y$.
- $\bar Y$ is consistent: $\bar Y \xrightarrow{p} \mu_Y$.
- $\bar Y$ solves $\min_m \sum_{i=1}^{n} (Y_i - m)^2$, so $\bar Y$ minimizes the sum of squared residuals.
Optional derivation (also see App. 3.2):

$\dfrac{d}{dm}\sum_{i=1}^{n}(Y_i - m)^2 = \sum_{i=1}^{n}\dfrac{d}{dm}(Y_i - m)^2 = \sum_{i=1}^{n} (-2)(Y_i - m)$

Set the derivative to zero and denote the optimal value of m by $\hat m$:

$\sum_{i=1}^{n} Y_i = \sum_{i=1}^{n}\hat m = n\hat m$, or $\hat m = \dfrac{1}{n}\sum_{i=1}^{n} Y_i = \bar Y$.
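The least-squares property can be confirmed numerically; a sketch with a small made-up data set, comparing the sum of squared residuals at $\bar Y$ against nearby candidate values of m:

```python
ys = [3.0, 7.0, 1.0, 9.0, 5.0]   # any small illustrative data set

def ssr(m):
    """Sum of squared residuals about a candidate value m."""
    return sum((y - m) ** 2 for y in ys)

ybar = sum(ys) / len(ys)   # = 5.0 for this data
# Ybar achieves a strictly smaller SSR than any other candidate
candidates = [ybar + d for d in (-1.0, -0.1, 0.1, 1.0)]
best_is_mean = all(ssr(ybar) < ssr(m) for m in candidates)
```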
Why Use $\bar Y$ To Estimate $\mu_Y$, ctd.

$\bar Y$ has a smaller variance than all other linear unbiased estimators: consider the estimator $\hat\mu_Y = \dfrac{1}{n}\sum_{i=1}^{n} a_i Y_i$, where $\{a_i\}$ are such that $\hat\mu_Y$ is unbiased; then $var(\bar Y) \le var(\hat\mu_Y)$ (proof: SW, Ch. 17).
$\bar Y$ isn't the only estimator of $\mu_Y$; can you think of a time you might want to use the median instead?
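One standard answer: when the data contain extreme outliers, since the median is far less sensitive to them than the mean. A sketch with made-up income figures:

```python
import statistics

incomes = [30_000, 32_000, 35_000, 38_000, 40_000]
with_outlier = incomes + [10_000_000]   # one extreme observation

# How much does each estimator move when the outlier is added?
mean_shift = statistics.mean(with_outlier) - statistics.mean(incomes)
median_shift = statistics.median(with_outlier) - statistics.median(incomes)
# the outlier drags the mean by over a million, the median by only 1500
```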
NEXT STEPS:
1. The probability framework for statistical inference
2. Estimation
3. Hypothesis Testing
4. Confidence intervals
Hypothesis Testing

The hypothesis testing problem (for the mean): make a provisional decision, based on the evidence at hand, whether a null hypothesis is true, or instead that some alternative hypothesis is true. That is, test

H0: E(Y) = $\mu_{Y,0}$ vs. H1: E(Y) > $\mu_{Y,0}$  (1-sided, >)
H0: E(Y) = $\mu_{Y,0}$ vs. H1: E(Y) < $\mu_{Y,0}$  (1-sided, <)
H0: E(Y) = $\mu_{Y,0}$ vs. H1: E(Y) ≠ $\mu_{Y,0}$  (2-sided)
Calculating the p-value based on $\bar Y$:

$p\text{-value} = Pr_{H_0}\big[\,|\bar Y - \mu_{Y,0}| > |\bar Y^{act} - \mu_{Y,0}|\,\big]$

where $\bar Y^{act}$ is the value of $\bar Y$ actually computed with your data.
$p\text{-value} = Pr_{H_0}\big[\,|\bar Y - \mu_{Y,0}| > |\bar Y^{act} - \mu_{Y,0}|\,\big]$

$= Pr_{H_0}\!\left[\left|\dfrac{\bar Y - \mu_{Y,0}}{\sigma_Y/\sqrt{n}}\right| > \left|\dfrac{\bar Y^{act} - \mu_{Y,0}}{\sigma_Y/\sqrt{n}}\right|\right]$

= probability under left + right N(0,1) tails,

where $\sigma_{\bar Y}$ = std. dev. of the distribution of $\bar Y$ = $\sigma_Y/\sqrt{n}$.
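The two-tail N(0,1) probability can be computed from the standard-normal CDF; in Python, `math.erfc` gives it directly. A sketch:

```python
import math

def two_sided_p(z_act):
    """Area under both N(0,1) tails beyond |z_act|, i.e. 2*(1 - Phi(|z_act|))."""
    return math.erfc(abs(z_act) / math.sqrt(2))

# Sanity checks against familiar values
p_at_196 = two_sided_p(1.96)      # about 0.05
p_class_size = two_sided_p(4.05)  # the class-size t-statistic: well under 0.001
```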
In practice, $\sigma_Y$ is unknown, so it must be estimated with the sample variance of Y:

$s_Y^2 = \dfrac{1}{n-1}\sum_{i=1}^{n}(Y_i - \bar Y)^2$

Fact: if (Y1, …, Yn) are i.i.d. and $E(Y^4) < \infty$, then $s_Y^2 \xrightarrow{p} \sigma_Y^2$, because $s_Y^2$ is a sample average; see Appendix 3.3. (Technical note: we assume $E(Y^4) < \infty$ because here the average is not of $Y_i$, but of its square; see App. 3.3.)
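The consistency of the sample variance can be illustrated by simulation; a sketch with a made-up true variance of 9:

```python
import random

random.seed(5)
sigma2 = 9.0   # true population variance (illustrative)

def sample_var(n):
    """Sample variance s_Y^2 of n draws from N(0, sigma2)."""
    ys = [random.gauss(0.0, sigma2 ** 0.5) for _ in range(n)]
    ybar = sum(ys) / n
    return sum((y - ybar) ** 2 for y in ys) / (n - 1)

# With a large n, s_Y^2 lands very close to the true value 9.0
s2_small, s2_large = sample_var(20), sample_var(200_000)
```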
Computing the p-value with $\sigma_Y$ estimated:

$p\text{-value} = Pr_{H_0}\big[\,|\bar Y - \mu_{Y,0}| > |\bar Y^{act} - \mu_{Y,0}|\,\big]$

$= Pr_{H_0}\!\left[\left|\dfrac{\bar Y - \mu_{Y,0}}{\sigma_Y/\sqrt{n}}\right| > \left|\dfrac{\bar Y^{act} - \mu_{Y,0}}{\sigma_Y/\sqrt{n}}\right|\right]$

$\cong Pr_{H_0}\!\left[\left|\dfrac{\bar Y - \mu_{Y,0}}{s_Y/\sqrt{n}}\right| > \left|\dfrac{\bar Y^{act} - \mu_{Y,0}}{s_Y/\sqrt{n}}\right|\right]$ (large n)

so p-value $\cong$ probability under normal tails outside $|t^{act}|$,

where $t = \dfrac{\bar Y - \mu_{Y,0}}{s_Y/\sqrt{n}}$ (the usual t-statistic).
Comments on the Student t distribution, ctd.

2. If the sample size is moderate (several dozen) or large (hundreds or more), the difference between the t-distribution and N(0,1) critical values is negligible. Here are some 5% critical values for 2-sided tests:

degrees of freedom (n − 1)    5% t-distribution critical value
10                            2.23
20                            2.09
30                            2.04
60                            2.00
∞                             1.96
The difference-in-means t-statistic:

$t = \dfrac{\bar Y_s - \bar Y_l}{SE(\bar Y_s - \bar Y_l)} = \dfrac{\bar Y_s - \bar Y_l}{\sqrt{\dfrac{s_s^2}{n_s} + \dfrac{s_l^2}{n_l}}}$
- For n > 30, the t-distribution and N(0,1) are very close (as n grows large, the $t_{n-1}$ distribution converges to N(0,1)).
- The t-distribution is an artifact from days when sample sizes were small and there were no/few computers.
- For historical reasons, statistical software typically uses the t-distribution to compute p-values, but this makes no difference when the sample size is moderate or large.
- For these reasons, in this class we will focus on the large-n approximation given by the CLT.
Confidence Intervals

A 95% confidence interval for $\mu_Y$ is an interval that contains the true value of $\mu_Y$ in 95% of repeated samples.

Note: What is random here? The values of Y1, …, Yn, and thus any functions of them, including the confidence interval. The confidence interval will differ from one sample to the next. The population parameter, $\mu_Y$, is not random; we just don't know it.
In practice, the 95% confidence interval for $\mu_Y$ is the set of values of $\mu_Y$ not rejected at the 5% level:

$\left\{\mu_Y: \left|\dfrac{\bar Y - \mu_Y}{s_Y/\sqrt{n}}\right| \le 1.96\right\} = \left\{\mu_Y: -1.96 \le \dfrac{\bar Y - \mu_Y}{s_Y/\sqrt{n}} \le 1.96\right\}$

$= \left\{\mu_Y: -1.96\,\dfrac{s_Y}{\sqrt{n}} \le \bar Y - \mu_Y \le 1.96\,\dfrac{s_Y}{\sqrt{n}}\right\}$

$= \left\{\mu_Y \in \left(\bar Y - 1.96\,\dfrac{s_Y}{\sqrt{n}},\ \bar Y + 1.96\,\dfrac{s_Y}{\sqrt{n}}\right)\right\}$
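The "95% of repeated samples" claim can be checked by simulation: construct the interval for many samples and count how often it covers the true mean. A sketch with made-up parameters ($\mu$ = 5, $\sigma$ = 2, n = 100):

```python
import random

random.seed(6)
mu, sigma, n, reps = 5.0, 2.0, 100, 2_000

covered = 0
for _ in range(reps):
    ys = [random.gauss(mu, sigma) for _ in range(n)]
    ybar = sum(ys) / n
    s = (sum((y - ybar) ** 2 for y in ys) / (n - 1)) ** 0.5
    half = 1.96 * s / n ** 0.5            # half-width of the 95% CI
    covered += (ybar - half) <= mu <= (ybar + half)

coverage = covered / reps   # should be close to 0.95
```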
Summary:

From the two assumptions of:
1. simple random sampling of a population, that is, {Yi, i = 1, …, n} are i.i.d.
2. $0 < E(Y^4) < \infty$