
Introduction to Econometrics

Chapter 3: Review of Statistics

Copyright 2011 Pearson Addison-Wesley. All rights reserved.


Remember we said econometrics =
disciplined data analysis + statistical inference

Three Main Types of Statistical Methods

Estimation
- Computing the best guess for an unknown characteristic of a population distribution
- Often a function of a sample of data randomly drawn from the population
- e.g. the mean or variance

Hypothesis Testing
- Formulating a hypothesis about a population, then using data to learn/infer whether it is true

Confidence Intervals
- Using a set of data to estimate an interval, or range, for an unknown population characteristic

Review of Probability and Statistics
(SW Chapters 2, 3)

Empirical problem: Class size and educational output

Policy question: Does class size have a significant impact on educational outcomes?
- What is the effect of reducing class size by one student per class on test scores?
- We must use data to find out. (Is there any way to answer this without data?)

The California Test Score Data Set

K-6 and K-8 California school districts (n = 420)

Variables:
5th grade test scores (Stanford-9 achievement test,
combined math and reading), district average
Student-teacher ratio (STR) = number of students in the district divided by the number of full-time-equivalent teachers

Initial look at the data:

This table doesn't tell us anything about the relationship between test scores and the STR.

Do districts with smaller classes have
higher test scores?

Scatterplot of test score v. student-teacher ratio

We need to get some numerical evidence on whether districts with low STRs have higher test scores, but how?

1. Compare average test scores in districts with low STRs to those with high STRs ("estimation")

1. Estimation

$$\bar{Y}_{GroupA} - \bar{Y}_{GroupB} = \frac{1}{n_{GroupA}} \sum_{i=1}^{n_{GroupA}} Y_i \;-\; \frac{1}{n_{GroupB}} \sum_{i=1}^{n_{GroupB}} Y_i$$

Two main questions:

1) Is this a large difference in a statistical sense?
   i.e. Can we say with confidence that the difference is not explained by sampling variation?

2) Is this a large difference in a real-world sense?
   i.e. Is the difference economically significant?
BUT... Why Do We Use $\bar{Y}$ To Estimate $\mu_Y$?

- We want the estimator to get as close as possible to the unknown true value
- We want its sampling distribution to be as tightly centered on the unknown value as possible

Desirable characteristics of an estimator:

- 1. Unbiasedness: if you took many samples, then on average the estimator would give the right answer
- Recall: Central Limit Theorem

Desirable characteristics of an estimator (continued):

- 2. Consistency: the probability that the estimator is within a small interval of the true value approaches 1 as the sample size increases
- i.e. When n is large, uncertainty about the estimator due to random variation in the sample is small
- Recall: Law of Large Numbers

- 3. Variance and efficiency
- Assuming multiple estimators are consistent and unbiased... choose the one with the tightest sampling distribution
- i.e. Minimum variance
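A minimal simulation sketch (not from the slides; the population parameters below are made up) illustrating unbiasedness and consistency of the sample mean:

```python
# Draw many random samples and look at the distribution of the sample mean:
# its average stays at the true mean (unbiasedness) and its spread shrinks
# as n grows (consistency). All numbers here are hypothetical.
import numpy as np

rng = np.random.default_rng(0)
mu_true, sigma_true = 5.0, 2.0   # hypothetical population mean and std dev
n_reps = 10_000                  # number of repeated samples

for n in (10, 100, 1000):
    means = rng.normal(mu_true, sigma_true, size=(n_reps, n)).mean(axis=1)
    print(f"n={n:5d}  avg of sample means={means.mean():.3f}  "
          f"std of sample means={means.std():.3f}")
```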

Your old buddy... the Central Limit Theorem (CLT):

If $(Y_1, \ldots, Y_n)$ are i.i.d. and $0 < \sigma_Y^2 < \infty$, then when $n$ is large the distribution of the sample mean is well approximated by a normal distribution.

- $\bar{Y}$ is approximately distributed $N(\mu_Y, \, \sigma_Y^2 / n)$, i.e. normal with mean $\mu_Y$ and variance $\sigma_Y^2 / n$
- $(\bar{Y} - \mu_Y) / \sigma_{\bar{Y}}$ is approximately distributed $N(0, 1)$ (standard normal)
- As the sample size grows, the variance of $\bar{Y}$ shrinks
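A quick simulation sketch of the CLT (the skewed Exponential(1) population here is an assumption chosen for illustration): standardized sample means behave approximately like N(0,1) draws when n is large.

```python
# Standardize sample means from a non-normal population and compare their
# quantiles with the standard normal. The population choice is hypothetical.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
mu, sigma = 1.0, 1.0          # mean and std dev of an Exponential(1) population
n, n_reps = 200, 5_000

samples = rng.exponential(scale=1.0, size=(n_reps, n))
z = (samples.mean(axis=1) - mu) / (sigma / np.sqrt(n))

for q in (0.025, 0.5, 0.975):
    print(f"q={q}: empirical={np.quantile(z, q):+.2f}  N(0,1)={stats.norm.ppf(q):+.2f}")
```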

$\bar{Y}$ as the Least Squares Estimator of $\mu_Y$

Consider the problem of finding the estimator $m$ that minimizes the sum of squared errors:

$$\sum_{i=1}^{n} (Y_i - m)^2$$

You don't have to know the proof, but it is:

$$\frac{d}{dm} \sum_{i=1}^{n} (Y_i - m)^2 = -2 \sum_{i=1}^{n} (Y_i - m) = -2 \sum_{i=1}^{n} Y_i + 2nm$$

Set this equal to zero:

$$-2 \sum_{i=1}^{n} Y_i + 2nm = 0 \;\Rightarrow\; m = \frac{1}{n} \sum_{i=1}^{n} Y_i$$
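A quick numerical check of this derivation (the data vector is made up for illustration): over a grid of candidate values m, the sum of squared errors is smallest at the sample mean.

```python
# Evaluate the sum of squared errors for many candidate values of m and
# confirm the minimizer coincides with the sample mean. Data are hypothetical.
import numpy as np

y = np.array([3.0, 7.0, 4.5, 6.0, 5.5])
m_grid = np.linspace(y.min(), y.max(), 1001)
sse = ((y[:, None] - m_grid[None, :]) ** 2).sum(axis=0)

print("m minimizing SSE:", m_grid[sse.argmin()])
print("sample mean     :", y.mean())
```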
Initial data analysis: Compare districts with small (STR < 20) and large (STR ≥ 20) class sizes:

Class Size    Average score ($\bar{Y}$)    Standard deviation ($s_Y$)    n
Small         657.4                        19.4                          238
Large         650.0                        17.9                          182

1. Estimation of $\Delta$ = difference between group means

1. Estimation: Is this difference large?

$$\bar{Y}_{small} - \bar{Y}_{large} = \frac{1}{n_{small}} \sum_{i=1}^{n_{small}} Y_i - \frac{1}{n_{large}} \sum_{i=1}^{n_{large}} Y_i = 657.4 - 650.0 = 7.4$$

- The average score for small-class districts is about 1.1% higher than for large-class districts
- The difference between the 60th and 75th percentiles of the test score distribution is 667.6 - 659.4 = 8.2
- Variation within groups is much larger:
  - Std dev across small-class districts = 19.4
  - Std dev across large-class districts = 17.9
OK, so you find that the average test score for students in smaller classes is higher than the average test score for students in larger classes. We can thus conclude that smaller classes lead to higher test scores.

A. True
B. False

2. Hypothesis testing

The hypothesis testing problem (for the mean):

- Based on the sample data, test whether a null hypothesis is true... or instead whether some alternative hypothesis is true.

$H_0: E(Y) = \mu_{Y,0}$ vs. $H_1: E(Y) > \mu_{Y,0}$ (1-sided, alternative is >)

$H_0: E(Y) = \mu_{Y,0}$ vs. $H_1: E(Y) < \mu_{Y,0}$ (1-sided, alternative is <)

$H_0: E(Y) = \mu_{Y,0}$ vs. $H_1: E(Y) \neq \mu_{Y,0}$ (2-sided, alternative is > or <)

Hypothesis testing terminology:

Type I error is the incorrect rejection of a true null hypothesis
- a "false positive"
- We reject the null hypothesis when it is actually true

Type II error is incorrectly retaining a false null hypothesis
- a "false negative"
- We accept the null hypothesis when it is actually false

More hypothesis testing terminology:

- The significance level of a test is a pre-specified probability of incorrectly rejecting the null when the null is true
  - i.e. a pre-specified allowance for the probability of a Type I error
- The critical value of a test statistic is the value of the test statistic needed to reject the null at a given significance level
- The power of the test is the probability that the test correctly rejects the null when it is false
  - As power increases, the probability of a Type II error decreases

Comments on the Student t distribution

If the sample size is moderate (several dozen) or large (hundreds or more), the difference between the t-distribution and N(0,1) critical values is negligible. Here are some 5% critical values for 2-sided tests:

degrees of freedom (n - 1)    5% t-distribution critical value
10                            2.23
20                            2.09
30                            2.04
60                            2.00
∞                             1.96
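These critical values can be reproduced with scipy, as a sketch (scipy.stats.t.ppf is the quantile function of the t-distribution):

```python
# Two-sided 5% critical values: the 97.5th percentile of the t-distribution
# for each degrees-of-freedom value, plus the N(0,1) limit.
from scipy import stats

for df in (10, 20, 30, 60):
    print(f"df={df:3d}  critical value={stats.t.ppf(0.975, df):.2f}")
print(f"N(0,1)   critical value={stats.norm.ppf(0.975):.2f}")
```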

Difference-in-means: unpooled variance

$$t = \frac{\bar{Y}_A - \bar{Y}_B - d_0}{SE(\bar{Y}_A - \bar{Y}_B)} = \frac{\bar{Y}_A - \bar{Y}_B - d_0}{\sqrt{\dfrac{s_A^2}{n_A} + \dfrac{s_B^2}{n_B}}}$$

- where $d_0$ = hypothesized difference between group means (often 0)
- where $SE(\bar{Y}_A - \bar{Y}_B)$ is the standard error of $(\bar{Y}_A - \bar{Y}_B)$; the subscripts A and B refer to the two groups of interest
- where $s^2$ = sample variance for each group
- "unpooled" means we don't assume the two groups have the same variance
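A minimal sketch of this unpooled (Welch) t-statistic computed by hand and cross-checked with scipy; the two group samples below are hypothetical placeholders:

```python
# Unpooled difference-in-means t-statistic; scipy's ttest_ind with
# equal_var=False uses the same standard error. Groups are simulated.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
y_a = rng.normal(657, 19, size=238)   # hypothetical group A scores
y_b = rng.normal(650, 18, size=182)   # hypothetical group B scores

d0 = 0.0
se = np.sqrt(y_a.var(ddof=1) / len(y_a) + y_b.var(ddof=1) / len(y_b))
t_by_hand = (y_a.mean() - y_b.mean() - d0) / se

t_scipy, p_scipy = stats.ttest_ind(y_a, y_b, equal_var=False)
print(t_by_hand, t_scipy, p_scipy)
```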
Difference-in-means: pooled variance

$$t = \frac{\bar{Y}_A - \bar{Y}_B - d_0}{SE(\bar{Y}_A - \bar{Y}_B)} = \frac{\bar{Y}_A - \bar{Y}_B - d_0}{S(Y_i)\sqrt{\dfrac{1}{n_A} + \dfrac{1}{n_B}}}$$

- where $d_0$ = hypothesized difference between group means (often 0)
- where $SE(\bar{Y}_A - \bar{Y}_B)$ is the standard error of $(\bar{Y}_A - \bar{Y}_B)$
- where $S(Y_i)$ = sample standard deviation when both groups are pooled together (this formula is a bit involved...)
- "pooled" means we assume the two groups have the same variance
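The pooled counterpart as a sketch, under the equal-variance assumption; it matches scipy's ttest_ind with equal_var=True (its default). The samples are again hypothetical:

```python
# Pooled difference-in-means t-statistic: combine the two sample variances
# into one pooled standard deviation, then form the standard error.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
y_a = rng.normal(657, 19, size=238)   # hypothetical group A
y_b = rng.normal(650, 19, size=182)   # hypothetical group B

n_a, n_b = len(y_a), len(y_b)
s_pooled = np.sqrt(((n_a - 1) * y_a.var(ddof=1) + (n_b - 1) * y_b.var(ddof=1))
                   / (n_a + n_b - 2))
t_by_hand = (y_a.mean() - y_b.mean()) / (s_pooled * np.sqrt(1 / n_a + 1 / n_b))

t_scipy, _ = stats.ttest_ind(y_a, y_b, equal_var=True)
print(t_by_hand, t_scipy)
```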
We need to get some numerical evidence on whether districts with low STRs have higher test scores, but how?

1. Compare average test scores in districts with low STRs to those with high STRs ("estimation")

2. Test the null hypothesis that the mean test scores in the two types of districts are the same, against the alternative hypothesis that they differ ("hypothesis testing")

Initial data analysis: Compare districts with small (STR < 20) and large (STR ≥ 20) class sizes:

Class Size    Average score ($\bar{Y}$)    Standard deviation ($s_Y$)    n
Small         657.4                        19.4                          238
Large         650.0                        17.9                          182

1. Estimation of $\Delta$ = difference between group means

2. Test the hypothesis that $\Delta$ = 0

Ceteris paribus, which of the following is most accurate?

A. As the difference in average test scores between large and small classrooms increases, you are less likely to reject the null
B. The difference in average test scores between large and small classrooms has no impact on your likelihood of accepting or rejecting the null
C. As variance in large classroom test scores falls, you are less likely to reject the null
D. As variance in large classroom test scores falls, you are more likely to reject the null

Compute the difference-of-means t-statistic:

Size     $\bar{Y}$    $s_Y$    n
small    657.4        19.4     238
large    650.0        17.9     182

$$t = \frac{\bar{Y}_s - \bar{Y}_l - d_0}{\sqrt{\dfrac{s_s^2}{n_s} + \dfrac{s_l^2}{n_l}}} = \frac{657.4 - 650.0 - 0}{\sqrt{\dfrac{19.4^2}{238} + \dfrac{17.9^2}{182}}} = \frac{7.4}{1.83} = 4.05$$

|t| > 1.96, so reject (at the 5% significance level) the null hypothesis that the two means are the same.
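This arithmetic can be checked directly from the summary statistics on the slide:

```python
# Recompute the difference-of-means t-statistic from the reported group
# means, standard deviations, and sample sizes.
import math

y_s, s_s, n_s = 657.4, 19.4, 238   # small-class districts
y_l, s_l, n_l = 650.0, 17.9, 182   # large-class districts

se = math.sqrt(s_s**2 / n_s + s_l**2 / n_l)
t = (y_s - y_l - 0.0) / se
print(f"SE = {se:.2f}, t = {t:.2f}")   # SE is about 1.83, t is about 4.05
```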

OK so you find that the calculated |t| is greater
than 1.96. So you conclude that class size is
economically significant with respect to test scores.

A. True
B. False

More hypothesis testing terminology:

The p-value = the probability of drawing a statistic (e.g. $\bar{Y}$) at least as adverse to the null as the value actually computed, assuming the null hypothesis is true.
- i.e. the probability that the difference between some random sample mean and the hypothesized population mean is GREATER than the difference between the actual sample mean we observe and the hypothesized population mean

Calculating the p-value based on $\bar{Y}$:

$$\text{p-value} = \Pr_{H_0}\left[\, |\bar{Y} - \mu_{Y,0}| > |\bar{Y}^{act} - \mu_{Y,0}| \,\right]$$

where $\bar{Y}^{act}$ is the value of $\bar{Y}$ actually observed.


Calculating the p-value with $\sigma_Y$ known:

- For large n, the p-value = the probability that a N(0,1) random variable falls outside $|(\bar{Y}^{act} - \mu_{Y,0}) / \sigma_{\bar{Y}}|$
- In practice, $\sigma_Y$ is unknown, so it must be estimated
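As a sketch, the large-sample two-sided p-value for the class-size t-statistic computed earlier (t ≈ 4.05), using the standard normal approximation:

```python
# Two-sided p-value under the N(0,1) approximation: the probability that a
# standard normal draw exceeds |t| in absolute value.
from scipy import stats

t_act = 4.05
p_value = 2 * (1 - stats.norm.cdf(abs(t_act)))
print(f"p-value = {p_value:.5f}")   # far below 0.05, so reject at the 5% level
```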

What is the link between the p-value and the significance level?

- The significance level is pre-specified. For example, if the pre-specified significance level is 5%, you reject the null hypothesis if |t| ≥ 1.96. Equivalently, you reject if p ≤ 0.05.
- The p-value is sometimes called the marginal significance level.
- Often, it is better to communicate the p-value than simply whether a test rejects or not: the p-value contains more information than the yes/no statement about whether the test rejects.

Student's t distribution for small samples

- If $Y_i$, i = 1, ..., n is i.i.d. $N(\mu_Y, \sigma_Y^2)$, then the t-statistic has the Student t distribution with n - 1 degrees of freedom.
- The critical values of the Student t distribution are tabulated in the back of all statistics books. Remember the recipe?
  1. Compute the t-statistic
  2. Compute the degrees of freedom, which is n - 1
  3. Look up the 5% critical value
  4. If the t-statistic exceeds (in absolute value) this critical value, reject the null hypothesis.
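A sketch following this recipe for a one-sample test of H0: μ = μ0; the small sample and the value of μ0 below are hypothetical:

```python
# Steps 1-4 of the recipe: t-statistic, degrees of freedom, 5% two-sided
# critical value from the t distribution, and the reject decision.
import numpy as np
from scipy import stats

y = np.array([62.0, 71.0, 58.0, 66.0, 74.0, 69.0, 63.0])   # hypothetical sample
mu0 = 60.0                                                  # hypothetical null value

t = (y.mean() - mu0) / (y.std(ddof=1) / np.sqrt(len(y)))   # step 1
df = len(y) - 1                                            # step 2
crit = stats.t.ppf(0.975, df)                              # step 3
print(f"t = {t:.2f}, critical value = {crit:.2f}, reject = {abs(t) > crit}")  # step 4
```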

Student's t distribution for small samples

- So the Student t distribution is highly relevant when the sample size is very small;
- BUT, for it to be correct, you must be sure that the population distribution of Y is normal.
- In economic data, the normality assumption is rarely credible, e.g.:
  - Earnings
  - Financial returns

3. Confidence Interval

- An X% confidence interval for $\mu_Y$ is an interval that contains the true value of $\mu_Y$ in X% of repeated samples.
- Digression: what is random here? The values of $Y_1, \ldots, Y_n$, and thus any functions of them, including the confidence interval.
- The confidence interval will differ from one sample to the next. The population parameter, $\mu_Y$, is not random; we just don't know it.

95% Confidence Interval

A 95% confidence interval for the difference between the means is:

$$(\bar{Y}_A - \bar{Y}_B) \pm 1.96 \, SE(\bar{Y}_A - \bar{Y}_B)$$

Two equivalent statements:
1) The 95% confidence interval for $\Delta$ doesn't include 0;
2) The hypothesis that $\Delta$ = 0 is rejected at the 5% level.

90% and 99% Confidence Intervals

A 90% confidence interval for the difference between the means is:

$$(\bar{Y}_A - \bar{Y}_B) \pm 1.64 \, SE(\bar{Y}_A - \bar{Y}_B)$$

A 99% confidence interval for the difference between the means is:

$$(\bar{Y}_A - \bar{Y}_B) \pm 2.58 \, SE(\bar{Y}_A - \bar{Y}_B)$$
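As a sketch, these intervals for the class-size difference in means, built from the slide's summary statistics:

```python
# Confidence intervals for the difference in mean test scores
# (small minus large class-size districts) at the 90%, 95%, and 99% levels.
import math

diff = 657.4 - 650.0
se = math.sqrt(19.4**2 / 238 + 17.9**2 / 182)   # about 1.83

for label, z in (("90%", 1.64), ("95%", 1.96), ("99%", 2.58)):
    print(f"{label} CI: ({diff - z * se:.1f}, {diff + z * se:.1f})")
```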

We need to get some numerical evidence on whether districts with low STRs have higher test scores, but how?

1. Compare average test scores in districts with low STRs to those with high STRs ("estimation")

2. Test the null hypothesis that the mean test scores in the two types of districts are the same, against the alternative hypothesis that they differ ("hypothesis testing")

3. Estimate an interval for the difference in the mean test scores, high vs. low STR districts ("confidence interval")
Initial data analysis: Compare districts with small (STR < 20) and large (STR ≥ 20) class sizes:

Class Size    Average score ($\bar{Y}$)    Standard deviation ($s_Y$)    n
Small         657.4                        19.4                          238
Large         650.0                        17.9                          182

1. Estimation of $\Delta$ = difference between group means

2. Test the hypothesis that $\Delta$ = 0

3. Construct a confidence interval for $\Delta$

Summary:

From the two assumptions of:
1. simple random sampling of a population, that is, {$Y_i$, i = 1, ..., n} are i.i.d.
2. 0 < E($Y^4$) < ∞

we developed, for large samples (large n):
- Theory of estimation (sampling distribution of $\bar{Y}$)
- Theory of hypothesis testing (large-n distribution of the t-statistic and computation of the p-value)
- Theory of confidence intervals (constructed by inverting the test statistic)

Are assumptions (1) & (2) plausible in practice? Yes.

Let's go back to the original policy question:

What is the effect on test scores of reducing STR by one student/class?

Have we answered this question?

Excel: We want to know whether studying results in higher exam grades.

1. Compare average test scores for students with low Hours Studied to those with high Hours Studied ("estimation")

2. Test the null hypothesis that the mean test scores in the two student groups are the same, against the alternative hypothesis that they differ ("hypothesis testing")

3. Estimate an interval for the difference in the mean test scores, high vs. low Hours Studied ("confidence interval")
OK so you conduct a hypothesis test and find
that the average exam grade for students who
study a lot is statistically significantly higher
than the average exam grade for students who
study a little. We can thus conclude that
studying more leads to higher exam grades.

A. True
B. False

Internal validity: the statistical inferences about causal effects are valid for the population being studied.

- Often complicated by omitted variables bias and/or selection bias
- Omitted variables bias means there are variables missing that are predictive of Y and correlated with X (Chapter 6)
- Selection bias occurs when there are inherent differences between groups, which are predictive of the outcome, that have not been held constant
Determining causality

- Selection bias can be thought of as an omitted variables bias... just with individuals opting into the groups
- We have selection bias when the average outcome without intervention differs between groups
- Difference in group means = Average causal effect + Selection bias
Determining causality

- To the extent that the selection bias can be fully explained by observable variables, we can use regression analysis.
- What variables might we start with?
- Will this get us fully to ceteris paribus?