
ECOM30002 Econometrics

Assignment on time series

Q1.a.
Plot(cars):

We can see a slight upward trend from 1994 to 2005 followed by a downward trend to
2017. There was a large dip around 2007-2008 during the Global Financial Crisis which
makes sense given the economic downturn of the time. The existence of a trend is evidence
of dependence.

Consumer confidence index:


There is no obvious trend in the consumer confidence index, which appears to be mean reverting.
However, around the time of the GFC there is a large dip in consumer confidence. The lack of
an obvious trend means the plot alone gives no clear evidence of dependence.

b. Autocorrelation function for cars:

There is strong evidence that this variable is dependent over time, as the autocorrelation at
every lag in the above ACF breaches the dotted 5% significance line. In other words, at the
5% level of significance, car sales are correlated with their own values at each of the earlier
lags shown. The second spike from the left suggests that there is a significant relationship
between car sales in one period and car sales one lag earlier.
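The comparison against the dotted line can also be done numerically. The following is a minimal sketch in base R, assuming a simulated AR(1) series (`x`) as a stand-in for the car sales data; the series, its coefficient of 0.8 and the variable names are illustrative assumptions, not the assignment's data.

```r
set.seed(1)

# Simulated AR(1) series standing in for monthly car sales (assumption).
# January 1994 to March 2017 is 279 monthly observations.
n <- 279
x <- as.numeric(arima.sim(model = list(ar = 0.8), n = n))

# Sample autocorrelations at lags 1 to 12 (lag 0 is dropped, it is always 1)
rho <- acf(x, lag.max = 12, plot = FALSE)$acf[-1]

# Approximate 5% band drawn by acf(): +/- 1.96 / sqrt(n)
band <- qnorm(0.975) / sqrt(n)

# A lag is flagged when its autocorrelation lies outside the band
significant <- abs(rho) > band
print(data.frame(lag = 1:12, acf = round(rho, 3), significant))
```

A lag flagged `TRUE` corresponds to a spike that crosses the dotted line in the plot.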

c.
Above are the AIC (Akaike Information Criterion) values for the AR(p) models of cars, where
p began at 1 and moved through to 12. It is evident that the smallest (most ideal) AIC value
corresponds to a lag length of 4 periods. This lag length strikes the best trade-off between
including additional relevant data in the model and keeping the model simple. Additional
lags were penalised and hence yielded greater AIC values.
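The selection rule described above can be sketched in base R without any extra packages. This is only an illustration on a simulated AR(2) series (an assumption); `embed()` and `lm()` are used here in place of `dynlm()`, and every candidate model is fitted on the same rows so that the AIC values are comparable.

```r
set.seed(2)

# Simulated stationary AR(2) series (assumption, not the assignment data)
y <- as.numeric(arima.sim(model = list(ar = c(0.5, 0.2)), n = 300))

max_lag <- 12

# embed() returns y_t in column 1 and y_{t-1}, ..., y_{t-12} in columns 2..13,
# so every AR(p) below is estimated on the same 288 observations
E <- embed(y, max_lag + 1)

# AIC of AR(p) for p = 1, ..., 12
aic <- sapply(1:max_lag, function(p) {
  AIC(lm(E[, 1] ~ E[, 2:(p + 1), drop = FALSE]))
})

# Lag length with the smallest AIC
best_p <- which.min(aic)
print(best_p)
```

Holding the estimation sample fixed plays the same role as the fixed `start = c(1995,1)` in the R code at the end of this assignment.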

Below is the acf for our model with chosen lag length of 4.

We can see that besides the first spike there are no other spikes breaching the 5%
significance dotted lines providing evidence that there is no significant correlation between
the residuals of our AR(4) model and the further lags. Hence our chosen lag length of 4
seems to be well suited to the data. The estimate for the AR(4) model is below:

d.
Our results suggest that a regression with 1 lag of the consumer confidence index and 4 lags
of the cars variable is the optimal choice for this data. The reason these lag lengths were
selected can be seen below.

Our AIC with each of the 144 combinations for lag lengths of 1-12 for consumer confidence
and lag lengths 1-12 of the sales of cars suggests that using 1 lag for consumer confidence
and 4 lags of car sales is our best model to forecast future sales. This is because it
corresponds to the lowest output value in our AIC table above. It has the best trade-off
between goodness of fit and the complexity of the model.
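Locating the smallest entry of a grid of AIC values can be sketched as follows. The data are simulated (an assumption), the grid is kept at 4 x 4 rather than 12 x 12 to keep the example quick, and the names `Ey`, `Ex` and `best` are illustrative only.

```r
set.seed(4)
n <- 300

# x stands in for consumer confidence, y for car sales (both simulated)
x <- as.numeric(arima.sim(model = list(ar = 0.5), n = n))
e <- rnorm(n)
y <- numeric(n)
for (i in 2:n) y[i] <- 0.5 * y[i - 1] + 0.8 * x[i - 1] + e[i]

max_lag <- 4  # the assignment uses 12

# Column 1 is the current value, columns 2.. are lags 1..max_lag
Ey <- embed(y, max_lag + 1)
Ex <- embed(x, max_lag + 1)

# aic[p, q]: p lags of y (rows) and q lags of x (columns), common sample
aic <- matrix(NA_real_, max_lag, max_lag)
for (p in 1:max_lag) {
  for (q in 1:max_lag) {
    aic[p, q] <- AIC(lm(Ey[, 1] ~ Ey[, 2:(p + 1), drop = FALSE] +
                                  Ex[, 2:(q + 1), drop = FALSE]))
  }
}

# Row and column index of the overall minimum
best <- which(aic == min(aic), arr.ind = TRUE)
print(best)
```

The `row` entry of `best` is the chosen number of lags of the dependent variable and the `col` entry is the chosen number of lags of the second variable.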

Additionally, the graph below shows the AIC output with a set lag length of 4 for car sales
and iterating through 1-12 for consumer confidence lag lengths. We can see that the
minimum AIC occurs at the very first lag.

We can see above that, apart from lag 0 where the dependent variable is fully correlated
with itself at the same time period, there is no residual autocorrelation at the 5% level of
significance, since no other bars exceed the dotted blue lines. Therefore, ARDL(4,1) is
an appropriate model. The model is shown below:

We can see from this regression that the first lag of car purchases is highly significant (at the
0.001 level of significance, since the p-value is much less than 0.001).

Additionally, lag 4 for car purchases and the 1 lag for consumer confidence are both
significant at the 5% level with p-values<0.05.

The coefficients in ARDL(4,1) tell us that, on average and holding all else constant, an additional
car sale in the previous period is associated with an increase of 0.626 cars in the period of
interest, while an additional unit on the consumer confidence index in the previous period is
associated with an increase of approximately 31 cars in the current period.

e. cars on 27th April for the ARDL(4,1) model:

$\widehat{cars}_{2017} = 76.336 \times 1 + 0.623 \times 37032 + 0.1048 \times 37479 + 0.0698 \times 39049 + 0.126 \times 40440 + 30.87 \times 113.2 = 38253.57$

For the AR(4):

$\widehat{cars}_{2017} = 3203 \times 1 + 0.06397 \times 37032 + 0.1112 \times 37479 + 0.0681 \times 39049 + 0.112 \times 40440 = 38237.29$ cars
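Both forecasts are a dot product between the estimated coefficients and the vector (1, most recent observations). A minimal base R sketch of that step for an AR(4), using a simulated series (an assumption) and illustrative names, could look like this:

```r
set.seed(5)

# Simulated series standing in for car sales (assumption)
y <- as.numeric(arima.sim(model = list(ar = c(0.5, 0.2)), n = 300)) + 10

p <- 4
E <- embed(y, p + 1)   # column 1: y_t, columns 2..5: y_{t-1}, ..., y_{t-4}
X <- E[, -1]
fit <- lm(E[, 1] ~ X)  # AR(4) by OLS

# Regressor vector for the period after the sample ends:
# intercept, then the last four observations, most recent first
n <- length(y)
z <- c(1, y[n], y[n - 1], y[n - 2], y[n - 3])

# One-step-ahead forecast
forecast <- sum(coef(fit) * z)
print(forecast)
```

The ordering of `z` has to match the ordering of the coefficients, which is why the most recent observation comes directly after the intercept.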

2) a)

i) Economic theory tells us that the price of a good directly impacts the quantity
demanded of a good. Therefore, price should have a causal effect on quantity
demanded. The demand curve in economics is downward sloping. In other
words, a higher price leads to lower quantity demanded. Hence, the causal
effect will likely be negative.
ii) Price and quantity are determined simultaneously in the market equilibrium.
Therefore, not only does price likely influence quantity, but also, quantity
likely influences price. Therefore, quantity should have a causal effect on
price. This effect is likely to be negative, since an increase in the quantity, all
else equal, should reduce scarcity in the market and therefore decrease price,
and vice versa.
b)
i) The estimated coefficient on $\log(P)$ in equation (1) is $\hat{\gamma}_1 = -1.246$ and is
significant at the 5% level, suggesting that it is a good estimate for the correlation between
cigarette price and quantity demanded. The negative value of the slope coefficient
indicates that price has an inverse effect on quantity demanded, as was
expected. Since there is a log-log relationship, the interpretation of the
coefficient is in elasticity terms. Therefore, the estimated price elasticity of
demand is $\hat{\gamma}_1 = -1.246$. In other words, a 1% increase in price is correlated
with a 1.246% decrease in demand.
ii) The prediction for log(Q) is:
$\widehat{\log(Q)} = 10.496 - 1.246 \log(10)$
$\widehat{\log(Q)} = 7.627$
The units of measurement in this case are log units. In other words, a $10 price
per pack of cigarettes corresponds to, on average, a predicted value of 7.627 for
log(cigarettes demanded per capita per year).
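The arithmetic behind this prediction can be checked in R. The intercept 10.496 and slope -1.246 are the estimates quoted above; the variable names are illustrative, and `log()` is the natural logarithm.

```r
# Estimates quoted in the text for the log-log regression
b0 <- 10.496
b1 <- -1.246
price <- 10

# Predicted log quantity at a price of $10
log_q_hat <- b0 + b1 * log(price)
print(round(log_q_hat, 3))  # 7.627

# Naive conversion back to levels; this ignores retransformation bias
q_hat_naive <- exp(log_q_hat)
```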
c)
i. The coefficient in the causal equation (2) is likely to differ from the coefficient
estimated in equation (1). This occurs because a shock to the quantity of
cigarettes ($U$) is likely correlated with the price of cigarettes ($P$), since price and
quantity simultaneously affect each other. Therefore, there is likely simultaneity
bias.
The causal equation is:
$\log(Q) = \beta_0 + \beta_1 \log(P) + U$

$E[\log(Q) \mid P] = \gamma_0 + \gamma_1 \log(P) = \beta_0 + \beta_1 \log(P) + E[U \mid P]$

using LIE, (1) and (2), where $\gamma_0$ and $\gamma_1$ are the coefficients of equation (1).
Since $U$ and $P$ are correlated (explained above), $E[U \mid P] = \delta_0 + \delta_1 \log(P)$, so
$\gamma_0 + \gamma_1 \log(P) = \beta_0 + \beta_1 \log(P) + \delta_0 + \delta_1 \log(P)$
$\gamma_0 + \gamma_1 \log(P) = (\beta_0 + \delta_0) + (\beta_1 + \delta_1) \log(P)$
Therefore, $\gamma_1 = \beta_1 + \delta_1$.
The correlation between a shock in the quantity of cigarettes and the price of
cigarettes is likely to be negative, since a higher quantity in the market will, all
else equal, reduce scarcity and decrease price, as explained in (a) (ii). Therefore
$\delta_1 < 0$. Since $\delta_1 < 0$, this suggests that $\gamma_1 < \beta_1$ and, with both
coefficients negative, $|\gamma_1| > |\beta_1|$.

ii. This suggests that using the equation (1) estimate $\hat{\gamma}_1 = -1.246$ as the causal
estimate for the impact of cigarette prices on the quantity of cigarettes demanded is
incorrect. The causal impact of price should be lower in magnitude than what is estimated
in equation (1). This is because the estimate in equation (1) contains simultaneity bias
(explained in part i.).
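The direction of this bias can be illustrated with a small simulation. This is only a sketch under assumed values: a true causal elasticity of -1, a demand shock `u` that lowers price, and variable names chosen for illustration. None of these numbers come from the cigarette data.

```r
set.seed(7)
n <- 5000

beta1 <- -1          # assumed true causal elasticity
tax <- rnorm(n)      # shifts price, unrelated to the demand shock
u <- rnorm(n)        # demand shock U
v <- rnorm(n)

# Price reacts negatively to the demand shock (assumption mirroring the text)
log_p <- 1 + 0.5 * tax - 0.5 * u + v
log_q <- 10 + beta1 * log_p + u

# OLS slope of log(Q) on log(P): the analogue of gamma_1
ols_slope <- unname(coef(lm(log_q ~ log_p))[2])

# Sample analogue of delta_1: slope of U on log(P)
delta1 <- cov(log_p, u) / var(log_p)

print(c(ols = ols_slope, causal = beta1, bias = delta1))
```

In this setup the OLS slope equals the causal coefficient plus `delta1` exactly in the sample, and because `delta1` is negative the OLS slope is larger in magnitude than the causal coefficient.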

d)

i. Relevance: $P = \pi_0 + \pi_1 Tax$ where $\pi_1 \neq 0$
Exogeneity: $E[U \mid Tax] = 0$

ii. Relevance: it is likely that tax is relevant to the price of cigarettes. Intuitively,
an increase in taxes for suppliers will be passed on, at least in part, to consumers
through an increase in the price of cigarettes. Therefore, higher taxes are likely
correlated with higher prices, indicating that Tax is a relevant IV.

Exogeneity: It is likely that demand shocks (U) are uncorrelated with taxes. That
is, sales tax can only affect the quantity demanded of cigarettes indirectly,
through the price of cigarettes, and not through other demand shocks. This is plausible.
Tax can only really impact the demand for cigarettes through changing the price of
cigarettes. Moreover, the choice of tax on cigarettes is often driven by political
considerations, not the demand for cigarettes. As a result, it is likely that tax is
uncorrelated with demand shocks, and therefore exogeneity likely holds.
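When both conditions hold, an instrument recovers the causal slope. The sketch below uses simulated data in which `tax` is relevant and exogenous by construction (an assumption), and computes the IV estimate in two equivalent ways: as a ratio of covariances and as a manual two-stage regression. The true elasticity of -1 and all names are illustrative.

```r
set.seed(8)
n <- 5000

tax <- rnorm(n)   # instrument: moves price, independent of u
u <- rnorm(n)     # demand shock
v <- rnorm(n)

log_p <- 1 + 0.5 * tax - 0.5 * u + v   # relevance: coefficient on tax is not 0
log_q <- 10 - 1 * log_p + u            # assumed true elasticity of -1

# IV estimate with a single instrument: cov(Z, Y) / cov(Z, X)
iv_slope <- cov(tax, log_q) / cov(tax, log_p)

# The same estimate from two stages
p_hat <- fitted(lm(log_p ~ tax))                   # first stage
tsls_slope <- unname(coef(lm(log_q ~ p_hat))[2])   # second stage

print(c(iv = iv_slope, tsls = tsls_slope))
```

The second-stage standard errors from this manual approach are not valid; a dedicated routine such as `ivreg()` from the AER package loaded in the R code would be the usual choice for inference.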

e)

i) The IV estimate is $\hat{\beta}_1 = -1.118$. The sign is negative, which again makes sense
due to a downward sloping demand curve in economic theory. Moreover,
$|\hat{\beta}_1| = 1.118 < |\hat{\gamma}_1| = 1.246$, where $\hat{\gamma}_1$ is the OLS estimate
from equation (1). This result makes sense, as explained in part c) i. The
results are also significant at the 5% level. The above indicate that Tax is likely a
valid IV for Price, since they are consistent with the theoretical impact of an IV in
reducing simultaneity bias.

ii) The 95% confidence interval, with 48-2=46 degrees of freedom, is:

$\hat{\beta}_1 - (2.013)\,SE(\hat{\beta}_1) < \beta_1 < \hat{\beta}_1 + (2.013)\,SE(\hat{\beta}_1)$
$-1.118 - (2.013)(0.287) < \beta_1 < -1.118 + (2.013)(0.287)$
$-1.696 < \beta_1 < -0.540$
Therefore, we can say with 95% confidence that the true causal value of $\beta_1$ lies
between -1.696 and -0.540. This indicates that it is highly likely that the price of
cigarettes has a negative influence on the quantity of cigarettes demanded.
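The interval can be reproduced in R from the quoted estimate and standard error. The critical value comes from the t distribution with 48 - 2 = 46 degrees of freedom; the variable names are illustrative.

```r
est <- -1.118   # IV estimate quoted in the text
se <- 0.287     # its standard error
df <- 48 - 2    # degrees of freedom

# Two-sided 5% critical value of the t distribution
crit <- qt(0.975, df)

# Lower and upper bounds of the 95% confidence interval
ci <- est + c(-1, 1) * crit * se

print(round(crit, 3))  # 2.013
print(round(ci, 3))    # -1.696 -0.540
```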

The standard errors used are heteroskedasticity-robust standard errors. These remain
valid when the variance of the errors changes with the regressors. Since robust
standard errors are equally valid if the errors are homoscedastic, the standard errors
used are valid. If homoscedasticity-only standard errors were used and there is
heteroskedasticity in the model, the standard errors would be incorrect, giving
incorrect confidence intervals. They could be either too large, lengthening the
confidence interval, or too small, shortening it. There is no one direction of effect when
incorrectly using homoscedastic standard errors.
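To make the distinction concrete, the following sketch computes heteroskedasticity-robust (HC0, White) standard errors by hand for a simple OLS regression and sets them beside the homoscedasticity-only ones. It uses simulated data whose error variance grows with the regressor (an assumption), and it illustrates the OLS case only, not the 2SLS regression in the tables.

```r
set.seed(9)
n <- 500

x <- runif(n, 1, 5)
y <- 2 + 3 * x + rnorm(n, sd = x)   # error spread increases with x

fit <- lm(y ~ x)
X <- model.matrix(fit)   # n x 2 design matrix (intercept, x)
e <- residuals(fit)

# Sandwich formula: (X'X)^-1  X' diag(e^2) X  (X'X)^-1
bread <- solve(crossprod(X))
meat <- crossprod(X * e)            # each row of X scaled by its residual
vc_hc0 <- bread %*% meat %*% bread

se_robust <- sqrt(diag(vc_hc0))
se_classic <- sqrt(diag(vcov(fit)))  # homoscedasticity-only

print(rbind(robust = se_robust, classic = se_classic))
```

In practice `vcovHC(fit, type = "HC0")` from the sandwich package, which AER depends on, should give the same matrix.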

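iii) $\hat{\beta}_1 = -1.118 > \hat{\gamma}_1 = -1.246$ and $|\hat{\beta}_1| = 1.118 < |\hat{\gamma}_1| = 1.246$,
where $\hat{\beta}_1$ is the IV estimate and $\hat{\gamma}_1$ is the OLS estimate. This is
consistent with the relationship discussed in part (c)(i), as was expected. This
indicates that the use of tax and the second instrument as IVs for price is likely valid,
since it reduces the bias in the coefficient of log(P) caused by simultaneity.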

f)

i) Relevance condition:
Since there are multiple IVs in the 2SLS regression, a Wald (joint) test should be
used to assess the relevance of the IVs and therefore the explanatory power of
the regression as a whole. The results of the Wald test in (C) table 1.2 give a
p-value of 0.000. Therefore, we may reject the null hypothesis at all conventional levels of
significance and conclude that $\pi_1 \neq 0$ and/or $\pi_2 \neq 0$, where $\pi_1$ and $\pi_2$
are the coefficients on the two IVs. As such, we may conclude that, taken together, the IVs
satisfy the relevance condition. Here, the Chi-squared distribution with 2 degrees of
freedom was used since there are 2 IVs being tested. Column C of table 1.1 suggests that
there is a significant coefficient for the effect of the second instrument on log(P) at the
5% level of significance. The second instrument is therefore more likely to be
a relevant IV than tax, whose coefficient is not significant.
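A joint relevance test of this kind can be sketched in base R by comparing the first-stage regression with an intercept-only model. The data are simulated and the instrument names `z1` and `z2` are hypothetical. The chi-squared version below is the homoscedasticity-only Wald statistic (number of restrictions times F), which may differ from the robust Wald test reported in table 1.2.

```r
set.seed(10)
n <- 500

z1 <- rnorm(n)   # hypothetical first instrument
z2 <- rnorm(n)   # hypothetical second instrument
u <- rnorm(n)
log_p <- 1 + 0.4 * z1 + 0.3 * z2 - 0.5 * u + rnorm(n)

# First stage with and without the instruments
restricted <- lm(log_p ~ 1)
first_stage <- lm(log_p ~ z1 + z2)

# Joint F test of H0: both instrument coefficients are zero
joint <- anova(restricted, first_stage)
f_stat <- joint$F[2]
p_f <- joint$`Pr(>F)`[2]

# Wald statistic with 2 restrictions, compared with chi-squared(2)
wald <- 2 * f_stat
p_chisq <- pchisq(wald, df = 2, lower.tail = FALSE)

print(c(F = f_stat, p_F = p_f, Wald = wald, p_chisq = p_chisq))
```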

Exogeneity:
Since we have more IVs than endogenous variables, we can use an overidentification
test. The first step is to conduct a regression of the residuals (U)
onto tax and the second instrument (column D from table 1.1). This gives estimated
coefficients of 0.019 and 0.002, both not significant at the 5% level. The next step
is to conduct a joint Wald test to determine the joint significance of the IVs.
Here, the degrees of freedom = dim(IV matrix) - dim(independent variable
matrix) = 2-1=1. Therefore, 1 degree of freedom should be used. Table 1.2 (D)
gives a p-value of 0.641. Therefore, we cannot reject the null at any conventional
significance level. As such, we are unable to conclude that a correlation exists between the
residuals and tax and/or the second instrument. This does not guarantee exogeneity, but
rather fails to detect endogeneity.
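The two steps can be sketched in base R on simulated data. Note that this sketch uses the Sargan form of the overidentification statistic (sample size times the R-squared of the residual regression) rather than the Wald test reported in table 1.2, so it is a related technique and not a reproduction of that table. The instruments `z1` and `z2` are hypothetical and exogenous by construction.

```r
set.seed(11)
n <- 1000

z1 <- rnorm(n)
z2 <- rnorm(n)
u <- rnorm(n)
log_p <- 1 + 0.4 * z1 + 0.3 * z2 - 0.5 * u + rnorm(n)
log_q <- 10 - 1 * log_p + u

# 2SLS by hand: first stage, then second stage on the fitted price
p_hat <- fitted(lm(log_p ~ z1 + z2))
b <- coef(lm(log_q ~ p_hat))

# 2SLS residuals are built from the actual price, not the fitted one
u_hat <- log_q - b[1] - b[2] * log_p

# Step 1: regress the residuals on all instruments
aux <- lm(u_hat ~ z1 + z2)

# Step 2: Sargan statistic, chi-squared with (#IVs - #endogenous) df
j_stat <- n * summary(aux)$r.squared
df <- 2 - 1
p_val <- pchisq(j_stat, df = df, lower.tail = FALSE)

print(c(J = j_stat, df = df, p = p_val))
```

A large p-value here, as in the text, means only that the test finds no evidence against exogeneity.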

ii) The results indicate that at least one of the IVs is relevant to the price of
cigarettes. Moreover, we fail to reject exogeneity. We are still not able to
conclude exogeneity. However, when combined with the theory about the
exogeneity of tax on demand, it is reasonable in this case to assume that it is
exogenous. Therefore, it is likely an appropriate IV choice. Using an appropriate
IV will lead to a better causal estimate for $\beta_1$. This suggests that column (B) in
table 1.1 provides a better estimate of the impact of price on the quantity of
cigarettes demanded than that given in (A) table 1.1.

g)

The prediction from table 1.1 (A) is $\widehat{\log(Q)} = 7.627$, whilst the prediction using
the 2SLS regression in table 1.1 (B) is $\widehat{\log(Q)} = 7.3087$. Under the assumption that
the IVs used in the causal equation in (B) are good, the 2SLS regression (B) removes the
simultaneity bias present in (A). This leads to a coefficient which gives an estimate for the
causal effect of price on quantity. However, when determining an estimate for the quantity
traded of a good, the impact of price on quantity is only one determining factor.
Consideration also needs to be given to the impact of quantity on price. When there was
simultaneity bias (A), the regression was considering both forces together, which interact
to give the estimate for quantity sold. This is more accurate than the prediction in (B),
which removes the simultaneity bias and therefore the effect of quantity on price, and so
gives an estimate based only on the demand for cigarettes. As a result, it would not be
preferable to replace the prediction from the OLS equation in (A), which contains bias,
with the 2SLS prediction in (B), which has removed bias using IVs.
R Code:

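# Packages: dynlm supplies the L() lag operator used in the model formulas below
library(MASS)
library(AER)
library(texreg)
library(xtable)
library(dynlm)

# Monthly data from January 1994 to March 2017
dt <- read.csv("Cars.csv")
cars <- ts(dt$Cars, frequency = 12, start = c(1994, 1), end = c(2017, 3))
ts.plot(cars, gpars = list(xlab = "year", ylab = "# passenger car sales"))
cc <- ts(dt$CC, frequency = 12, start = c(1994, 1), end = c(2017, 3))
ts.plot(cc, gpars = list(xlab = "year", ylab = "Consumer confidence index"))

# Q1.b: autocorrelation function of car sales
acf(cars)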

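# Q1.c: AIC for AR(p), p = 1..12, estimated on a common sample so the values are comparable
maxlag <- 12
aic <- matrix(nrow = maxlag, ncol = 1)
for (j in 1:maxlag) {
  aic[j, 1] <- AIC(dynlm(cars ~ L(cars, 1:j), start = c(1995, 1), end = c(2017, 3)))
}

# Plot each AIC value, labelled by its lag length
plot(aic, type = "n", ylim = c(min(aic), max(aic)))
text(1:maxlag, aic[, 1], 1:maxlag)

print(which.min(aic))

# Fit the AR model with the AIC-minimising lag length and check its residuals
ar1 <- dynlm(cars ~ L(cars, 1:which.min(aic)))
acf(ar1$residuals)
write.csv(as.data.frame(summary(ar1)$coefficients), file = "ar1_as3.csv")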

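# Q1.d: AIC for every combination of j lags of cars (rows) and i lags of cc (columns)
aic2 <- matrix(nrow = maxlag, ncol = maxlag)
for (j in 1:maxlag) {
  for (i in 1:maxlag) {
    aic2[j, i] <- AIC(dynlm(cars ~ L(cars, 1:j) + L(cc, 1:i), start = c(1995, 1), end = c(2017, 3)))
  }
}

# For each number of cars lags, the number of cc lags with the lowest AIC
mininrow <- matrix(nrow = maxlag, ncol = 1)
for (j in 1:maxlag) {
  mininrow[j] <- which.min(aic2[j, ])
}

# AIC with 4 lags of cars fixed and 1..12 lags of cc
aic3 <- matrix(nrow = maxlag, ncol = 1)
for (j in 1:maxlag) {
  aic3[j, 1] <- AIC(dynlm(cars ~ L(cars, 1:4) + L(cc, 1:j), start = c(1995, 1), end = c(2017, 3)))
}
plot(aic3, type = "n", ylim = c(min(aic3), max(aic3)))
text(1:maxlag, aic3[, 1], 1:maxlag)

# AIC across 1..12 lags of cars with 2 lags of cc (column 2 of aic2)
plot(aic2[, 2], type = "n", ylim = c(min(aic2), max(aic2)))
text(1:maxlag, aic2[, 2], 1:maxlag)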

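# Fit the chosen ARDL(4,1) model and check its residuals
ardl <- dynlm(cars ~ L(cars, 1:4) + L(cc, 1))
acf(ardl$residuals)

# Q1.e: one-step-ahead forecast for April 2017 from the ARDL(4,1) model
beta.hat <- ardl$coefficients
n <- length(cars)
Zn <- c(1, cars[n], cars[n - 1], cars[n - 2], cars[n - 3], cc[n])
cars.2017.4 <- beta.hat %*% Zn

# The same forecast from the AR(4) model
beta2.hat <- ar1$coefficients
Zn2 <- c(1, cars[n], cars[n - 1], cars[n - 2], cars[n - 3])
cars.2017.4_ar1 <- beta2.hat %*% Zn2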
