
Econ 184: Introduction to Econometrics

(Fall 2015)
Alexandre Poirier
Department of Economics
University of Iowa

8/25/2015

Why Study Econometrics?

- Econometrics: the use of statistical methods and economic theory to study economic problems, using data.
- We study causal relationships between economic variables.
- Economics is a quantitative and predictive science.
- Economists often want to determine whether a change in one variable causes a change in another.
  - For example, does another year of schooling increase earnings?
- Other examples:
  - Does patent protection help foster innovation?
  - Does a minimum wage lower employment?
  - Will universal coverage lower the quality of health care?

Why Study Econometrics?

More specifically, we might want to:
- Study whether the predictions of an economic model hold true in reality.
  - Does demand slope downward? Is the stock market efficient?
- Quantify the effect of an economic or social program on an outcome of interest (poverty, inequality, wages, fertility, educational achievement, innovation).
- Forecast economic variables of interest.
  - Next quarter's inflation, etc.

Why Study Econometrics?

This course provides an introduction to the methods and tools of econometrics.

Four main examples used in the textbook:
1. Does reducing class size improve elementary school education?
2. Is there racial discrimination in the market for home loans?
3. How much do cigarette taxes reduce smoking?
4. Will raising the beer tax reduce traffic fatalities?

We will analyze many more examples in class, sections, and problem sets.

Causation vs. Correlation

- Does a change in X really cause a change in Y? Or do they just co-vary?
- To evaluate policies and test theories, we need to establish causation.
- But in the real world, correlation and causation are often very difficult to separate.
  - Does drinking red wine reduce the risk of a heart attack?
  - Does watching Oprah cause stress?
  - Does smoking cigarettes cause cancer?

Causation vs. Correlation: More Examples

- We observe a positive relationship between crime and the number of police officers.
  - Is it because police officers create crime?
  - Or (more likely!) is it because more police officers are assigned to more troublesome neighborhoods?
- We observe that unemployed people who attend a job training program wait for shorter periods before finding a job.
  - Is it because the program helped them, or is it because those who joined the program are the most skilled/motivated ones, so that they would have waited less anyway?

So how do we establish causation?

Causation vs. Correlation


- Ideally, we would like to have experimental data: for example, two identical plots of land, where the same crop is cultivated using the same techniques, but where different fertilizers are used.
- Then if the outcome of interest (yield per acre, for example) differs between the two plots, we can safely infer that the different fertilizer causes the difference in average yield.
  - Classic example: clinical trials (compliance?).
- BUT, in most cases, all we have is observational data (you can't have the same person with and without college, or the same economy with and without a tax cut...).
- In general, to find answers, we need econometric skills and creativity.

Summary: Learning Goals

- Conduct statistical analysis.
- Understand the role of empirical evidence in evaluating economic problems.
- Understand the role of assumptions in the underlying models.

Probability Review: Introduction

- Let's begin by introducing the notions of probability, randomness, and random variables.
- The use of probability to measure uncertainty and variability began hundreds of years ago with the study of gambling.
- Generally speaking, probability is the chance that something (an event) will happen.
- The probability of an event or outcome is the proportion of the time it occurs in the long run; this is called the frequency interpretation of probability.

Probability Review: Definitions


- Sample space and events: the set of all possible outcomes is called the sample space. An event is a subset of the sample space.
  - We typically use Ω to denote the sample space.
  - Example: coin tossing. Ω = {Heads, Tails}.
  - Example: number of times a computer will crash. Ω = {0, 1, ...}, and event A = {0, 1} (i.e., the computer will crash no more than once).
- Random variable: a numerical summary of a random outcome.
  - Formally: a random variable is a real-valued function, defined on the set of possible outcomes (i.e., the sample space Ω), that assigns a real number to every possible outcome.
  - Example: coin tossing. Ω = {Heads, Tails}. A random variable could be X ∈ {0, 1} such that X = 0 if Heads occurs.

Probability Review: Definitions

There are two major classes of random variables: discrete and continuous.
- A discrete random variable takes on only a discrete set of values.
  - Example: the number of phone calls you will receive today.
- A continuous random variable takes on a continuum of values.
  - Example: the amount of time you will spend on the phone.

Probability Review: Definitions

- Probability distribution: assigns to each event a number between 0 and 1 that quantifies how likely the event is to occur. I.e., for an event A, the number Pr(A) indicates the probability that A will occur.

Probability Review: Discrete Random Variables

- We characterize or describe a discrete random variable X with a probability function (pf).
- A pf lists the probability of each possible discrete outcome.
- The pf of a discrete random variable is defined as the function f such that for every real number x,

  f(x) = Pr(X = x)

  where X represents a random variable and x represents a realization of that random variable.
- The following slide contains a specific example. Note that the same information is displayed in the table and graph.
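As a small sketch, a pf can be coded as a function returning Pr(X = x). The probabilities below are the increments of the crash-example cdf table shown on the later slides (0.8, 0.9, 0.96, 0.99, 1.0); the function name `pf` is just an illustrative choice:

```python
def pf(x):
    """Probability function f(x) = Pr(X = x) for the computer-crash example."""
    probs = {0: 0.80, 1: 0.10, 2: 0.06, 3: 0.03, 4: 0.01}
    return probs.get(x, 0.0)

# A valid pf is non-negative and its values sum to 1.
total = sum(pf(x) for x in range(5))
```

Outcomes outside the support simply get probability 0, which is why `get(x, 0.0)` is used.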

Probability Review: Probability Function

[Figure: example probability function shown as a table and bar graph]

Probability Review: Cumulative Distribution Function (CDF)

- Another way of characterizing the distribution of a random variable is with a cumulative distribution function (cdf).
- The cdf lists the probability that a random variable is less than or equal to a specific value:

  F(x) = Pr(X ≤ x)

  x          0     1     2     3     4
  Pr(X ≤ x)  0.8   0.9   0.96  0.99  1.0

- F is:
  - F(x) ∈ [0, 1] for all x.
  - F is non-decreasing.
  - F is right-continuous.

Probability Review: Cumulative Distribution Function (CDF)

- The cumulative distribution function may also be referred to as the distribution function.

  x          0     1     2     3     4
  Pr(X ≤ x)  0.8   0.9   0.96  0.99  1.0

[Figure: step plot of the cdf from the table above]

Probability Review: Continuous Random Variables

- A random variable Y that can take on any real value within some range is a continuous random variable.
  - Time, temperature, height...
- For continuous random variables, the probability of a particular value occurring is equal to zero:

  Pr(Y = y) = 0

- We typically speak of interval probabilities (i.e., the probability that Y will take on some subset of values):

  Pr(a ≤ Y ≤ b)

- Note that probability zero does not mean impossible.

Probability Review: Probability Density Function (pdf)

- The probabilities associated with a continuous random variable Y are determined by the pdf of Y.
- The pdf of Y, denoted f(y), has the following properties:
  1. f(y) ≥ 0 for all y.
  2. The probability that the uncertain quantity Y will fall in the interval (a, b) is equal to the area under f(y) between a and b:

     P(a < Y < b) = ∫_a^b f(y) dy.

  3. The total area under the entire curve of f(y) is equal to 1:

     ∫_{-∞}^{∞} f(y) dy = 1.
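These properties can be checked numerically. A sketch using the density f(y) = 2y on (0, 1) from a later example in these slides; the midpoint-rule integrator is a hand-rolled helper, not a library routine:

```python
def f(y):
    """Example pdf from the slides: f(y) = 2y on (0, 1), 0 elsewhere."""
    return 2 * y if 0 < y < 1 else 0.0

def integrate(g, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of g over (a, b)."""
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

total = integrate(f, 0, 1)       # property 3: total area under f is 1
p = integrate(f, 0.25, 0.5)      # property 2: P(0.25 < Y < 0.5)
```

For this density the interval probability has a closed form, P(0.25 < Y < 0.5) = 0.5² - 0.25² = 0.1875, so the numerical answer can be checked exactly.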

Probability Review: Probability Density Function (pdf)

[Figure: a pdf curve with the area under f(y) between 0.8 and 1.6 shaded, representing P(0.8 < Y < 1.6)]

Probability Review: CDF of Continuous Random Variables

- The definition of a cdf for a continuous random variable is the same as that for a discrete random variable:

  F(y) = Pr(Y ≤ y)

- With a continuous random variable, the cdf is a continuous function over the entire real line, and we can write down a simple formula:

  F(y) = Pr(Y ≤ y) = ∫_{-∞}^{y} f(t) dt
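The integral definition can be sketched numerically. Using the density f(t) = 2t on (0, 1) from a later example, the approximation should track the exact cdf F(y) = y²:

```python
def f(t):
    """Example pdf: f(t) = 2t on (0, 1), 0 elsewhere."""
    return 2 * t if 0 < t < 1 else 0.0

def F(y, n=100_000):
    """F(y) = Pr(Y <= y), approximated by integrating f up to y.

    f is zero below 0 and above 1, so integration runs over (0, min(y, 1))."""
    if y <= 0:
        return 0.0
    b = min(y, 1.0)
    h = b / n
    return sum(f((i + 0.5) * h) for i in range(n)) * h
```

Starting the integral at 0 rather than at minus infinity is valid only because this particular density vanishes below 0.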

Probability Review: CDF of Continuous Random Variables

- Furthermore, it follows that, at each point at which f(y) is continuous, the pdf can be calculated as

  F′(y) = dF(y)/dy = f(y)

- We can easily see that

  Pr(Y > y) = 1 - F(y)

  and

  Pr(y₁ < Y ≤ y₂) = F(y₂) - F(y₁)

- In practice, the CDF allows us to calculate the probability of any interval. For example, for a continuous random variable CT with cdf F:

  Pr(15 < CT ≤ 20) = Pr(CT ≤ 20) - Pr(CT ≤ 15) = F(20) - F(15) = .58

  Pr(CT > 20) = 1 - F(20) = 1 - .78 = .22

Probability Review: Measures of Central Tendency for Distributions

- Mode: the mode is the value that occurs with the greatest probability.
  - Example: What is the modal age of students in this class?
- Median: the median is the value such that the probability of the random variable being less than or equal to that value is at least 50% and the probability of the random variable being greater than or equal to that value is at least 50%.
  - Example: What is the median age of students in this class?

  Case 0:
  Age  19   20   21
  %    .4   .2   .4

  Case 1:
  Age  19   20   21   22
  %    .4   .2   .2   .2

  Case 2:
  Age  19   20   21   22
  %    .4   .1   .3   .2
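The median definition above can be applied mechanically to the three cases; a sketch (the small tolerance guards against floating-point rounding in the probability sums):

```python
def medians(dist, tol=1e-12):
    """Outcomes m with Pr(X <= m) >= 0.5 and Pr(X >= m) >= 0.5."""
    return sorted(
        m for m in dist
        if sum(p for x, p in dist.items() if x <= m) >= 0.5 - tol
        and sum(p for x, p in dist.items() if x >= m) >= 0.5 - tol
    )

case0 = {19: .4, 20: .2, 21: .4}
case1 = {19: .4, 20: .2, 21: .2, 22: .2}
case2 = {19: .4, 20: .1, 21: .3, 22: .2}
```

In Cases 0 and 1 only age 20 satisfies the definition; in Case 2 both 20 and 21 do, illustrating that the median need not be unique.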

Probability Review: Mean (Expected Value)

- Mean: the mean or expected value of a random variable X is the weighted average of all its possible outcomes, weighted by the probabilities of those outcomes.
- Unlike the mode, it's unique.
- For a discrete random variable X,

  E(X) = x₁ Pr(x₁) + ... + x_k Pr(x_k) = Σ_{i=1}^k xᵢ Pr(xᵢ) = Σ_{i=1}^k xᵢ f(xᵢ)

- Example: expected value of throwing a die:

  E(X) = (1/6)·1 + (1/6)·2 + (1/6)·3 + (1/6)·4 + (1/6)·5 + (1/6)·6 = (1/6)(1 + 2 + 3 + 4 + 5 + 6) = 3.5
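The die calculation can be reproduced exactly with Python's `fractions` module, avoiding any floating-point rounding:

```python
from fractions import Fraction

# E(X) = sum of x * Pr(x) over the six equally likely faces.
E_X = sum(Fraction(1, 6) * x for x in range(1, 7))
print(E_X)   # prints 7/2, i.e. 3.5
```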

Probability Review: Mean (Expected Value)

- However, one drawback of the mean (relative to the median) is its sensitivity to outliers.
  - When Bill Gates walks into a bar...
- For a continuous random variable X with probability density function f(x), the mean of X (assuming it exists) is defined as

  E(X) = ∫ x f(x) dx

- The expected value or mean of X is typically denoted by E(X) or μ_X.

Example
f(x) = 2x for 0 < x < 1 (and 0 otherwise). Then

  E(X) = ∫ x f(x) dx = ∫_0^1 x(2x) dx = ∫_0^1 2x² dx = (2/3)x³ |_0^1 = 2/3

Question: what's F(x)?

  F(x) = ∫_{-∞}^x f(t) dt = ∫_0^x 2t dt = t² |_0^x = x² for 0 ≤ x ≤ 1.

What about the median?
- Is it 1/2? F(1/2) = 1/4 < 1/2. Nope.
- Is it 2/3? F(2/3) = 4/9 < 1/2. Nope.
- The median m solves F(m) = m² = 1/2, so m = 1/√2 ≈ 0.71.

[Figure: plot of the pdf f(x) = 2x on (0, 1)]
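A quick numerical check of this example, using the closed-form cdf F(x) = x² derived above:

```python
import math

def F(x):
    """cdf of the density f(x) = 2x on (0, 1): F(x) = x^2 on [0, 1]."""
    return min(max(x, 0.0), 1.0) ** 2

# Candidate medians from the slide: both fall short of 1/2.
low1 = F(0.5)       # 1/4
low2 = F(2 / 3)     # 4/9

# The median m solves m^2 = 1/2.
m = 1 / math.sqrt(2)
```

`F(m)` equals 1/2 up to rounding, confirming that the median is 1/√2 ≈ 0.707.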

Probability Review: Expected Value of a Function

- We know that if X is a random variable with pf f(xᵢ) (discrete case) or pdf f(x) (continuous case), we can calculate E(X) as either

  E(X) = Σ_{i=1}^k xᵢ f(xᵢ)   or   E(X) = ∫ x f(x) dx

- But what if we want to calculate E(X²) or E(ln(X))? In general, what's the expected value of a function g(·) of X?
- It can be shown that (assuming it exists)

  E(g(X)) = Σ_{i=1}^k g(xᵢ) f(xᵢ)   or   E(g(X)) = ∫ g(x) f(x) dx

- For example, for a continuous distribution,

  E(X²) = ∫ x² f(x) dx
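For a discrete illustration, E(X²) for a fair die follows from the first formula, with g(x) = x² and f(xᵢ) = 1/6 (exact arithmetic via `fractions`):

```python
from fractions import Fraction

# E(g(X)) = sum of g(x_i) * f(x_i), here with g(x) = x^2.
E_X2 = sum(Fraction(1, 6) * x**2 for x in range(1, 7))
print(E_X2)   # prints 91/6, i.e. 15 1/6
```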

Probability Review: Measures of Dispersion for Distributions

- The variance of a random variable measures the spread or dispersion of the variable around its mean:

  Var(X) = E[(X - μ_X)²]

- In the discrete case, this is simply the weighted average of the squared deviations of X from its mean:

  Var(X) = Σ_{i=1}^k [xᵢ - E(X)]² Pr(xᵢ)

- For a continuous random variable X with probability density function f(x), the variance of X is

  Var(X) = ∫ [x - E(X)]² f(x) dx.

Probability Review: Variance & Moments

- Another (equivalent) formula that can be used in either case is:

  Var(X) = E(X²) - (E(X))²

  where the second moment E(X²) is defined as

  E(X²) = (x₁)² Pr(x₁) + (x₂)² Pr(x₂) + ... + (x_k)² Pr(x_k)

  or

  E(X²) = ∫ x² f(x) dx

- Other (higher-order) moments are defined similarly.

Probability Review: Standard Deviation

- The variance of X is denoted Var(X) or σ²_X.
- The standard deviation of X, denoted σ_X, is the square root of Var(X).
- A large standard deviation and variance mean that the probability distribution is quite spread out: a large difference between the outcome and the expected value is anticipated.

Probability Review: Example


  x   Pr(x)   x·Pr(x)   x²·Pr(x)
  1   1/6     1/6       1/6
  2   1/6     1/3       2/3
  3   1/6     1/2       3/2
  4   1/6     2/3       8/3
  5   1/6     5/6       25/6
  6   1/6     1         6

  (k = 6)

- E(X) = Σ_{i=1}^k xᵢ Pr(xᵢ) = 3.5
- E(X²) = Σ_{i=1}^k (xᵢ)² Pr(xᵢ) = 15 1/6
- Var(X) = E(X²) - (E(X))² = 15 1/6 - 3.5² = 2.92
- σ_X = 1.71
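The table's calculation can be reproduced with exact fractions, and both variance formulas agree:

```python
from fractions import Fraction

p = Fraction(1, 6)
faces = range(1, 7)

E_X = sum(p * x for x in faces)           # 7/2 = 3.5
E_X2 = sum(p * x**2 for x in faces)       # 91/6 = 15 1/6

var_shortcut = E_X2 - E_X**2              # E(X^2) - (E(X))^2
var_direct = sum(p * (x - E_X)**2 for x in faces)
```

Both variance expressions equal 35/12 ≈ 2.92, and the standard deviation is √(35/12) ≈ 1.71, matching the slide.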

Probability Review: Example


f(x) = 2x for 0 < x < 1 (and 0 otherwise).
We already know E(X) = 2/3. What's Var(X)?

  Var(X) = ∫_0^1 [x - E(X)]² f(x) dx = ∫_0^1 (x - 2/3)² 2x dx
         = ∫_0^1 (2x³ - (8/3)x² + (8/9)x) dx
         = [(1/2)x⁴ - (8/9)x³ + (4/9)x²] |_0^1
         = 1/2 - 8/9 + 4/9 = 1/18

or (using the other formula)

  Var(X) = E(X²) - (E(X))² = ∫_0^1 x²(2x) dx - (2/3)² = ∫_0^1 2x³ dx - 4/9
         = (1/2)x⁴ |_0^1 - 4/9 = 1/2 - 4/9 = 1/18
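A numerical cross-check of the 1/18 result, using the same hand-rolled midpoint integrator idea as before:

```python
def f(x):
    """Example pdf: f(x) = 2x on (0, 1), 0 elsewhere."""
    return 2 * x if 0 < x < 1 else 0.0

def integrate(g, a=0.0, b=1.0, n=100_000):
    """Midpoint-rule approximation of the integral of g over (a, b)."""
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

mu = integrate(lambda x: x * f(x))                   # E(X) = 2/3
var_def = integrate(lambda x: (x - mu)**2 * f(x))    # definition
var_alt = integrate(lambda x: x**2 * f(x)) - mu**2   # shortcut formula
```

Both approximations come out near 1/18 ≈ 0.0556, matching the hand calculation.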

Probability Review: Expected Value and Variance of a Linear Function

- If Y = a + bX, then E(Y) = a + bE(X).
  - Example: suppose E(X) = 5, Y = 3X - 5, Z = -3X + 15.
    E(Y) = 3·E(X) - 5 = 3·5 - 5 = 10
    E(Z) = -3·E(X) + 15 = -3·5 + 15 = 0
- If Y = a + bX, then Var(Y) = b²Var(X).
  - Example: Var(X) = 5, Y = 3X - 5, Z = -3X + 15.
    Var(Y) = 9·Var(X) = 9·5 = 45
    Var(Z) = 9·Var(X) = 9·5 = 45
- These properties can be easily proven using the definitions of expectation and variance.
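These two properties can be verified directly on, say, the fair-die distribution (any discrete distribution works; exact arithmetic via `fractions`):

```python
from fractions import Fraction

p = Fraction(1, 6)
xs = range(1, 7)

E_X = sum(p * x for x in xs)
Var_X = sum(p * (x - E_X)**2 for x in xs)

a, b = -5, 3                                # Y = a + bX = 3X - 5
E_Y = sum(p * (a + b * x) for x in xs)
Var_Y = sum(p * (a + b * x - E_Y)**2 for x in xs)
```

Here `E_Y == a + b * E_X` and `Var_Y == b**2 * Var_X` hold exactly, as the slide states.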
