
Métodos Econométricos

Interpreting and Comparing Regression Models

September, 2022



Catch Up
So far we have seen the OLS estimator, and we have concluded that it is the Best Linear Unbiased Estimator (BLUE) under a specific set of assumptions.
We have also seen that, intuitively, in its simplest formulation it amounts to finding the best-fitting straight line.

Today we will see that we can relax that, as long as the model remains linear in parameters.
Preliminaries

The symbol ∆ stands for change, so ∆xj stands for a change in the variable xj
∆xj /xj is a relative change
Relative changes are usually expressed in percentages, while a "change" is measured in the same units as the original variable
There is one exception to the above rule: if a variable is itself measured in percentages, then its change is expressed in "percentage points"
Example: if an interest rate increases from 20% to 21%, it increases by 1 percentage point (change) or by 5% (relative change)



Marginal effects

We are often interested in understanding how a small change in an independent variable (say xj ) affects the dependent variable y while holding all other variables constant
We know that if E(y|x) = f(x), then

∆E(y|x) ≈ [∂f(x)/∂xj ] · ∆xj

Thus, we can conclude that when xj changes by one unit, E(y|x) changes by ∂f(x)/∂xj , holding all other variables constant

The above is known as the partial effect of variable xj on E (y |x)


The “all other variables constant” condition is known as the ceteris
paribus condition
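
As a simple illustration (not from the slides): if f(x) = β1 + β2 xj², then ∂f(x)/∂xj = 2β2 xj , so a small change ∆xj changes E(y|x) by approximately 2β2 xj ∆xj ; here the partial effect itself depends on the level of xj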



Marginal effects

Sometimes we want to understand the impact of a (ceteris paribus) relative change in xj in terms of a relative change in E(y|x). This is called an elasticity

∆E(y|x)/E(y|x) ≈ [∂f(x)/∂xj · xj /f(x)] · ∆xj /xj = [∂log(f(x))/∂log(xj )] · ∆xj /xj

(the last equality holds if E(y|x) > 0 and xj > 0)


A semi-elasticity measures the impact of a (ceteris paribus) change in xj in terms of a relative change in E(y|x)

∆E(y|x)/E(y|x) ≈ [∂f(x)/∂xj · 1/f(x)] · ∆xj = [∂log(f(x))/∂xj ] · ∆xj



Interpreting the linear model: Level-Level Model

How do we interpret the coefficients of the model?

E (yi |xi ) = β1 + β2 x2i + β3 x3i + ... + βk xki


Since ∂E(yi |xi )/∂xji = βj , the partial effect of xj equals βj .

In other words: when xj increases by one unit, E(yi |xi ) increases by βj units, ceteris paribus
But note that the elasticity of E(y|x) with respect to xj equals βj · xj /f(x), and the semi-elasticity is βj /f(x). Both depend on specific values of x.
In this case the partial effect of xj has the “simpler” interpretation
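
As a quick numerical illustration (arbitrary values): with βj = 2, xj = 10 and f(x) = 50, the elasticity is 2 × 10/50 = 0.4 and the semi-elasticity is 2/50 = 0.04, while the partial effect is simply 2 regardless of x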



Interpreting the linear model: Beyond the Linear Best
Fitting Line
In this graph, a linear best-fitting line seems broadly inappropriate.

One can resort to polynomial functions of the variable x to improve the fit and better capture the relationship.
Interpreting the linear model: Beyond the Linear Best
Fitting Line
Two models with a different implied function for the relationship:
Straight line: E(tricepsi |agei ) = β1 + β2 agei
3rd-order polynomial: E(tricepsi |agei ) = β1 + β2 agei + β3 agei² + β4 agei³

Both are linear in parameters, so both can be estimated by OLS, but the polynomial seems to fit the data better.
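
A minimal Stata sketch of how the cubic specification could be estimated (the dataset is not shown here, so triceps and age are placeholder variable names); factor-variable notation lets margins account for the polynomial terms when computing the marginal effect:

* cubic polynomial in age, written with factor-variable notation
regress triceps c.age c.age#c.age c.age#c.age#c.age
* average marginal effect of age, taking the squared and cubic terms into account
margins, dydx(age)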
Interpreting the linear model: Polynomial Regressors
A Taylor approximation tells us that any sufficiently smooth function can be well approximated by a polynomial:

f(xi ) ≈ f(a) + ∂f(xi )/∂xi |xi=a · (xi − a) + (1/2!) · ∂²f(xi )/∂xi² |xi=a · (xi − a)² + (1/3!) · ∂³f(xi )/∂xi³ |xi=a · (xi − a)³ + ...

which, for an approximation around 0, becomes

f(xi ) ≈ f(0) + β2 xi + β3 xi² + β4 xi³ + ...

This is linear in parameters.


So we can use polynomials of the variable to approximate the function
that relates two variables, and obtain an approximation of the
marginal effect, even if we don’t know the actual relationship!
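
As a quick illustration (not taken from the slides): expanding log(1 + x) around 0 gives log(1 + x) ≈ x − x²/2 + x³/3 − ..., so a regression that includes x, x² and x³ can mimic a logarithmic-shaped relationship without imposing the log functional form.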
Interpreting the linear model: Polynomial Regressors

But what is the partial effect of x2 if

E(yi |xi ) = β1 + β2 x2i + β3 x2i² + ... + βk xki


Now ∂E(yi |xi )/∂x2i = β2 + 2β3 x2i , and thus the partial effect of x2 depends on the value of x2
And what about the case

E (yi |xi ) = β1 + β2 x2i + β3 x2i x3i + ... + βk xki


Now ∂E(yi |xi )/∂x2i = β2 + β3 x3i , and the partial effect of x2 depends on the value of x3 .
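
A hedged Stata sketch for the interaction case (y, x2 and x3 are placeholder names): writing the interaction with factor-variable notation lets margins evaluate how the partial effect of x2 varies with x3.

* model with x2, x3 and their interaction
regress y c.x2##c.x3
* partial effect of x2 evaluated at two illustrative values of x3
margins, dydx(x2) at(x3=(0 1))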



Interpreting the linear model: Log-Log and Log-Level Model

How do we interpret the coefficients of the model?

E(log(yi )|xi ) = β1 + β2 log(x2i ) + β3 log(x3i ) + ... + βk log(xki )

Here ∂E(log(yi )|xi )/∂log(x2i ) = β2 , and thus in this model the coefficients have a direct interpretation as elasticities
Similarly, in the model

E(log(yi )|xi ) = β1 + β2 x2i + β3 x3i + ... + βk xki

we have ∂E(log(yi )|xi )/∂x2i = β2 and the coefficients can be read directly as semi-elasticities
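
A minimal sketch in Stata, assuming a hypothetical wage example (wage, educ and exper are placeholder variable names):

* log-log: the coefficient on log(educ) is an elasticity
gen lwage = ln(wage)
gen leduc = ln(educ)
regress lwage leduc
* log-level: the coefficient on exper is a semi-elasticity
regress lwage exper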



The case of the US Exponential Growth

Many economic variables display exponential growth.

We can model this exponential growth as:

yt = y0 × e^(β2 t)



The case of the US Exponential Growth
Notice that rewriting the equation for exponential growth in logs makes it easier to estimate:

log(yt ) = β1 + β2 t,   where β1 = log(y0 )

In log form, as the transformation suggests, the relationship becomes broadly linear in t



Regression on time-series

Regressions on time-series data often include time (t) as a regressor (for simplicity we omit other covariates)

E (yt |t) = β1 + β2 t

t assumes consecutive discrete values such as t = 1, 2, 3, 4, 5, ...


Assuming that t stands for years, β2 can be interpreted as the "average annual change" in y

E(log(yt )|t) = β1 + β2 t

If t stood for years, β2 would be an estimate of the "average annual (continuous) growth rate"
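
A sketch of both trend regressions in Stata, assuming annual data on a variable y already sorted in time order (a minimal illustration, not the GDP application itself):

* simple time trend t = 1, 2, 3, ...
gen t = _n
* average annual change in y
regress y t
* average annual (continuous) growth rate of y
gen lny = ln(y)
regress lny t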



The case of the US Exponential Growth
Fitting the model: log(yt ) = β1 + β2 t, with β1 = log(y0 )

Running this regression on US GDP data we obtain:

log(GDPt ) = β1 + 0.036139 × t

But if US GDP has been growing at 3.68% per year over the 1790-2012 period, why is the coefficient 0.036139?
Log-Points Scale
The answer is that the coefficient is expressed in log points.
The log-points scale is the log of the ratio of the variable between consecutive periods:

log(GDPt / GDPt−1 ) = β2 = 0.036139

So we can say that US GDP is growing at 3.6139 log points per year, or 3.68% per year.
The way to transform log points into percentage changes is:

∆GDP / GDPt−1 = e^β2 − 1     (1)

The log-points scale is just another way to express changes in a variable.
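
Applying equation (1) to the estimate above: e^0.036139 − 1 ≈ 0.0368, so 3.6139 log points per year correspond to roughly 3.68% annual growth, which reconciles the two numbers.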
Log-Points Scale

In general:

log(y0 ) = β1 + β2 x1 + β3 x2 + β4 x3
log(y1 ) = β1 + β2 (x1 + 1) + β3 x2 + β4 x3

Then, we have:

log(y1 / y0 ) = β2
So we interpret β2 as follows:
the variable y changes by 100 × β2 log points with a 1-unit change in x1
the variable y changes by 100 × (e^β2 − 1) percent with a 1-unit change in x1
The two are nearly identical for small β2 .



Interpreting the m.e. of discrete variables

If xj is a discrete variable (say a binary variable) then the partial effect


can be computed by comparing E (y |x) at different settings of xj while
holding other variables constant
Suppose that xj is a discrete variable with two possible values of 0 and
1 (dummy variable) and

E (yi |xi ) = β1 + β2 x2i + β3 x3i + ... + βk xki

The marginal effect is now E(yi |xi , xj = 1) − E(yi |xi , xj = 0) = βj
If the dependent variable is in logs, then the relative change exp(βj ) − 1 is the partial effect of the dummy
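
For example (illustrative numbers only): in a log-wage regression, a dummy coefficient of βj = 0.10 implies a relative effect of exp(0.10) − 1 ≈ 0.105, i.e. roughly a 10.5% higher expected wage when the dummy switches from 0 to 1, ceteris paribus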



Stata command for marginal effects

The Stata command margins calculates marginal effects after most


estimation commands
margins allows you to calculate partial effects, elasticities, or semi-elasticities for most models:
margins, dydx(*)   (partial effects, ∂y/∂x)
margins, eydx(*)   (semi-elasticities, ∂log(y)/∂x)
margins, eyex(*)   (elasticities, ∂log(y)/∂log(x))
margins gives you several alternatives to evaluate the “x” value of the
marginal effects (eg: at the mean values of x, for specific values of x,
or as the average marginal effect evaluated for each xi )
margins, dydx(*)
margins, dydx(*) atmeans
margins, dydx(*) at(var=value)
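
A worked sketch using a dataset shipped with Stata (auto.dta; the model is purely illustrative):

sysuse auto, clear
regress price mpg weight
* partial effects (here simply the slope coefficients, since the model is linear in levels)
margins, dydx(*)
* partial effects evaluated at the sample means of the regressors
margins, dydx(*) atmeans
* elasticity of price with respect to mpg, averaged over the observations
margins, eyex(mpg)
* partial effect of mpg evaluated at a specific value
margins, dydx(mpg) at(mpg=20)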



Selecting the set of regressors

What happens when a relevant variable is excluded?


If the omitted variable is uncorrelated with the included variables the
OLS estimators remain unbiased (although with higher variance)
If the omitted variable is correlated with the included variables then the
OLS estimators become biased - this is called the omitted variable bias

Needlessly including irrelevant variables is not as dangerous. The OLS


estimator remains unbiased but its precision is decreased (has higher
variance)



Omitted Variable Bias

Suppose the correct model is:

yi = β1 + β2 x1i + β3 x2i + ϵi

But the econometrician only uses x1 , thus estimating:

yi = β̃1 + β̃2 x1i + ei

Thus, in this simple linear regression:

β̃ˆ2 = Cov(yi , x1i ) / Var(x1i )



Omitted Variable Bias
While the econometrician omitted a variable, the true model included it, so:

β̃ˆ2 = Cov(yi , x1i ) / Var(x1i ) = Cov(β1 + β2 x1i + β3 x2i + ϵi , x1i ) / Var(x1i )

Given that: (1) Cov(x1i , x1i ) = Var(x1i ); (2) Cov(ϵi , x1i ) = 0; (3) Cov(β1 , x1i ) = 0 (β1 is a constant), then:

β̃ˆ2 = β2 + β3 · Cov(x2i , x1i ) / Var(x1i )

So, in this simple example, the bias depends on:
the coefficient of the omitted variable in the true model (β3 )
the covariance between the included and the omitted variable.
In models with more variables the argument is slightly more involved.
Practitioners use this reasoning to infer whether omitting a given variable is likely to bias the results upward or downward.
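
A small simulation sketch in Stata illustrating the formula above (all numbers are arbitrary choices for the illustration):

clear
set seed 12345
set obs 1000
* x2 will be the omitted variable; x1 is positively correlated with it
gen x2 = rnormal()
gen x1 = 0.5*x2 + rnormal()
gen y  = 1 + 2*x1 + 3*x2 + rnormal()
* short regression: the coefficient on x1 is biased upward (beta3 > 0 and Cov(x1, x2) > 0)
regress y x1
* long regression: the coefficient on x1 is close to its true value of 2
regress y x1 x2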
Selecting the set of regressors
Economic theory should drive the specification
It is ok to have non-significant regressors
If there is no guidance from economic theory and we must choose
between non-nested models we can use the following criteria:
Akaike’s Information Criterion (AIC):

AIC = log( (1/N) Σᵢ eᵢ² ) + 2k/N

Schwarz’s Bayesian Information Criterion (BIC):

BIC = log( (1/N) Σᵢ eᵢ² ) + (k/N) log(N)

where eᵢ are the OLS residuals (i = 1, ..., N), k is the number of parameters and N is the sample size

Models with lower AIC or BIC are preferred


BIC is preferred asymptotically but AIC works better in small samples
Stata commands

AIC and BIC are postestimation statistics


AIC and BIC can be obtained with the estat command. Type
estat ic
immediately after running a regression to obtain AIC and BIC
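
A hedged sketch of how two non-nested specifications could be compared (y, x1, ..., x4 are placeholder names; note that estat ic reports likelihood-based versions of AIC and BIC, which should rank OLS models estimated on the same sample in the same way as the formulas above):

* model A
regress y x1 x2
estat ic
* model B, a non-nested alternative
regress y x3 x4
estat ic
* the specification with the lower AIC / BIC is preferred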

