Introductory Econometrics
ECON2206/ECON3290
Tutorial Program
Sample Answers
Session 1, 2011
Table of Contents
Week 2 Tutorial Exercises
    Problem Set
    STATA Hints
Week 3 Tutorial Exercises
    Review Questions (these may or may not be discussed in tutorial classes)
    Problem Set
Week 4 Tutorial Exercises
    Review Questions (these may or may not be discussed in tutorial classes)
    Problem Set (these will be discussed in tutorial classes)
Week 5 Tutorial Exercises
    Review Questions (these may or may not be discussed in tutorial classes)
    Problem Set (these will be discussed in tutorial classes)
Week 6 Tutorial Exercises
    Review Questions (these may or may not be discussed in tutorial classes)
    Problem Set (these will be discussed in tutorial classes)
Week 7 Tutorial Exercises
    Review Questions (these may or may not be discussed in tutorial classes)
    Problem Set (these will be discussed in tutorial classes)
Week 8 Tutorial Exercises
    Review Questions (these may or may not be discussed in tutorial classes)
    Problem Set (these will be discussed in tutorial classes)
Week 9 Tutorial Exercises
    Review Questions (these may or may not be discussed in tutorial classes)
    Problem Set (these will be discussed in tutorial classes)
Week 10 Tutorial Exercises
    Readings
    Review Questions (these may or may not be discussed in tutorial classes)
    Problem Set (these will be discussed in tutorial classes)
Week 11 Tutorial Exercises
    Review Questions (these may or may not be discussed in tutorial classes)
    Problem Set
Week 12 Tutorial Exercises
    Review Questions (these may or may not be discussed in tutorial classes)
    Problem Set (these will be discussed in tutorial classes)
Week 13 Tutorial Exercises
    Review Questions (these may or may not be discussed in tutorial classes)
    Problem Set (these will be discussed in tutorial classes)
Week 2 Tutorial Exercises
Problem Set
Q1. Wooldridge 1.1
• (i) Ideally, we could randomly assign students to classes of different sizes. That is, each
student is assigned a different class size without regard to any student characteristics such
as ability and family background. For reasons we will see in Chapter 2, we would like
substantial variation in class sizes (subject, of course, to ethical considerations and resource
constraints).
• (ii) A negative correlation means that larger class size is associated with lower performance.
We might find a negative correlation because larger class size actually hurts performance.
However, with observational data, there are other reasons we might find a negative
relationship. For example, children from more affluent families might be more likely to
attend schools with smaller class sizes, and affluent children generally score better on
standardized tests. Another possibility is that, within a school, a principal might assign the
better students to smaller classes. Or, some parents might insist their children are in the
smaller classes, and these same parents tend to be more involved in their children’s
education.
• (iii) Given the potential for confounding factors – some of which are listed in (ii) – finding a
negative correlation would not be strong evidence that smaller class sizes actually lead to
better performance. Some way of controlling for the confounding factors is needed, and this
is the subject of multiple regression analysis.
• (vi) The average of exppp is about $5,194.87. The standard deviation is $1,091.89, which
shows rather wide variation in spending per pupil. [The minimum is $1,206.88 and the
maximum is $11,957.64.]
• (vii) 8.7%. Note that, by Taylor expansion, log(x + h) – log(x) = log(1 + h/x) ≈ h/x, for
positive x and small h/x (in absolute value).
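The Taylor approximation above can be checked numerically; a minimal sketch (the values of x and h below are illustrative, not taken from the data set):

```python
import math

# Check: log(x + h) - log(x) = log(1 + h/x), which is close to h/x for small h/x.
x, h = 5000.0, 100.0            # illustrative values only
exact = math.log(x + h) - math.log(x)
approx = h / x                  # first-order Taylor approximation
```

Here h/x = .02, and the exact log difference agrees with it to roughly three decimal places; the approximation improves as h/x shrinks.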
STATA Hints
• Some STATA do-files for solving the Problem Set are posted on the course website. It is
important to understand the commands in the do-files so that you will be able to write your
own do-files for your assignments and the course project.
Week 3 Tutorial Exercises
Review Questions (these may or may not be discussed in tutorial classes)
• The minimum requirement for OLS to be carried out for the data set {(xi, yi), i=1,…,n} with
the sample size n > 2 is that the sample variance of x is positive. In what circumstances is the
sample variance of x zero?
When all observations on x have the same value, there is no variation in x.
• The OLS estimation of the simple regression model has the following properties:
a) the sum of the residuals is zero;
b) the sample covariance of the residuals and x is zero.
Why? How would you relate them to the “least squares” principle?
These are derived from the first-order conditions for minimising the sum of squared
residuals (SSR).
• Convince yourself that the point (x̄, ȳ), the sample means of x and y, is on the sample
regression function (SRF), which is a straight line.
This can be derived from the first of the first-order conditions for minimising SSR.
• How do you know that SST = SSE + SSR is true?
The dependent variable values can be expressed as the fitted value plus the residual. After
subtracting the sample mean from both sides of the equation, we find (yᵢ − ȳ) = (ŷᵢ − ȳ) + ûᵢ.
The required “SST = SSE + SSR” is obtained by squaring and summing both sides of the
above equation, noting (ŷᵢ − ȳ) = β̂₁(xᵢ − x̄), and using the second of the first-order
conditions of minimising SSR.
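These algebraic properties (zero residual sum, zero sample covariance between residuals and x, the point (x̄, ȳ) on the SRF, and SST = SSE + SSR) can all be verified numerically; a minimal sketch using made-up data:

```python
# Simple OLS on made-up data; checks the algebraic identities discussed above.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 2.5, 3.9, 4.1, 5.5]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n

# OLS slope and intercept from the first-order conditions
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
     sum((xi - xbar) ** 2 for xi in x)
b0 = ybar - b1 * xbar            # this line is why (xbar, ybar) lies on the SRF

yhat = [b0 + b1 * xi for xi in x]
uhat = [yi - yh for yi, yh in zip(y, yhat)]

SST = sum((yi - ybar) ** 2 for yi in y)
SSE = sum((yh - ybar) ** 2 for yh in yhat)
SSR = sum(u ** 2 for u in uhat)
```

Up to floating-point error, sum(uhat) = 0, the sample covariance of uhat with x is 0, b0 + b1·x̄ = ȳ, and SST = SSE + SSR.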
• Which of the following models is (are) nonlinear model(s)?
a) sales = β0 /[1 + exp(-β1 ad_expenditure)] + u;
b) sales = β0 + β1 log(ad_expenditure) + u;
c) sales = β0 + β1 exp(ad_expenditure) + u;
d) sales = exp(β0 + β1 ad_expenditure + u).
The model in a) is nonlinear.
• Can you follow the proofs of Theorems 2.1-2.3?
You should be able to follow the proofs. If not at this point, try again.
Problem Set
Q1. Wooldridge 2.4
• (i) When cigs = 0, predicted birth weight is 119.77 ounces. When cigs = 20, bwght-hat =
109.49. This is about an 8.6% drop.
• (ii) Not necessarily. There are many other factors that can affect birth weight,
particularly overall health of the mother and quality of prenatal care. These could be
correlated with cigarette smoking during pregnancy. Also, something such as caffeine
consumption can affect birth weight, and might also be correlated with cigarette
smoking.
• (iii) If we want a predicted bwght of 125, then cigs = (125 – 119.77)/(–.514) ≈ –10.18, or
about –10 cigarettes! This is nonsense, of course, and it shows what happens when we
are trying to predict something as complicated as birth weight with only a single
explanatory variable. The largest predicted birth weight is necessarily 119.77. Yet
almost 700 of the births in the sample had a birth weight higher than 119.77.
• (iv) 1,176 out of 1,388 women did not smoke while pregnant, or about 84.7%. Because
we are using only cigs to explain birth weight, we have only one predicted birth weight
at cigs = 0. The predicted birth weight is necessarily roughly in the middle of the
observed birth weights at cigs = 0, and so we will under predict high birth weights.
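The arithmetic in parts (i) and (iii) can be replayed from the fitted line, using intercept 119.77 and slope −.514 (the slope implied by the drop from 119.77 to 109.49 at cigs = 20 in part (i)):

```python
# Fitted simple regression of bwght on cigs: bwght_hat = 119.77 - 0.514*cigs
# (intercept as reported; slope implied by part (i)).
b0, b1 = 119.77, -0.514

bwght_at_20 = b0 + b1 * 20           # predicted weight at 20 cigarettes/day
pct_drop = (b0 - bwght_at_20) / b0   # relative drop from the cigs = 0 prediction
cigs_for_125 = (125 - b0) / b1       # "required" cigs for a 125-ounce prediction
```

This reproduces the 109.49 prediction, the roughly 8.6% drop, and the nonsensical −10.18 cigarettes.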
Week 4 Tutorial Exercises
Problem Set (these will be discussed in tutorial classes)
Q1. Wooldridge 3.2
• (i) Yes. Because of budget constraints, it makes sense that, the more siblings there are in a
family, the less education any one child in the family has. To find the increase in the number
of siblings that reduces predicted education by one year, we solve 1 = .094(Δsibs), so Δsibs =
1/.094 ≈ 10.6.
• (ii) Holding sibs and feduc fixed, one more year of mother’s education implies .131 years
more of predicted education. So if a mother has four more years of education, her son is
predicted to have about a half a year (.524) more years of education.
• (iii) Since the number of siblings is the same, but meduc and feduc are both different, the
coefficients on meduc and feduc both need to be accounted for. The predicted difference in
education between B and A is .131(4) + .210(4) = 1.364.
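The three calculations above follow directly from the reported coefficients (.094 on sibs, .131 on meduc, .210 on feduc); a quick arithmetic check:

```python
# Coefficients from the estimated educ equation (sibs, meduc, feduc).
b_sibs, b_meduc, b_feduc = 0.094, 0.131, 0.210

delta_sibs = 1 / b_sibs                       # extra siblings that cut predicted educ by 1 year
effect_meduc4 = 4 * b_meduc                   # effect of 4 more years of mother's education
diff_B_minus_A = 4 * b_meduc + 4 * b_feduc    # B has 4 more years of both meduc and feduc
```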
Week 5 Tutorial Exercises
Problem Set (these will be discussed in tutorial classes)
Q1. Wooldridge 4.1
• (i) and (iii) generally cause the t statistics not to have a t distribution under H0.
Homoskedasticity is one of the CLM assumptions. An important omitted variable violates
Assumption MLR.4 (hence MLR.6) in general. The CLM assumptions contain no mention of
the sample correlations among independent variables, except to rule out the case where the
correlation is one (MLR.3).
Week 6 Tutorial Exercises
which is a strong rejection of H0: from Table G.3c, the 1% critical value with 2 and 90 df is
4.85.
• (iii) We use the R-squared form of the F statistic. We are testing q = 3 restrictions and there
are 88 – 5 = 83 df in the unrestricted model. The F statistic is
F = [(.829 – .820)/3]/[(1 – .829)/83] ≈ 1.46. The 10% critical value (again using 90
denominator df in Table G.3a) is 2.15, so we fail to reject H0 at even the 10% level. In fact,
the p-value is about .23.
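The R-squared form of the F statistic used in part (iii) is F = [(R²_ur − R²_r)/q] / [(1 − R²_ur)/df_ur]; a quick check of the arithmetic:

```python
def f_stat(r2_ur, r2_r, q, df_ur):
    """R-squared form of the F statistic for q exclusion restrictions."""
    return ((r2_ur - r2_r) / q) / ((1 - r2_ur) / df_ur)

# Values from part (iii): unrestricted R^2 = .829, restricted R^2 = .820,
# q = 3 restrictions, 83 degrees of freedom in the unrestricted model.
F = f_stat(r2_ur=0.829, r2_r=0.820, q=3, df_ur=83)
```

This reproduces F ≈ 1.46, well below the 10% critical value of 2.15.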
• (iv) If heteroskedasticity were present, Assumption MLR.5 would be violated, and the F
statistic would not have an F distribution under the null hypothesis. Therefore, comparing
the F statistic against the usual critical values, or obtaining the p-value from the F
distribution, would not be especially meaningful.
• The p-value for testing H0: β1= 0 against the two-sided alternative is about .018, so that we
reject H0 at the 5% level but not at the 1% level.
• (ii) The correlation is about −.84, indicating a strong degree of multicollinearity. Yet each
coefficient is very statistically significant: the t statistic for β̂_log(income) is about 5.1 and that
for β̂_prppov is about 2.86 (two-sided p-value = .004).
• (iii) The OLS regression results when log(hseval) is added are
log(psoda)-hat = −.84 + .098 prpblck − .053 log(income) + .052 prppov + .121 log(hseval)
                 (.29)  (.029)         (.038)            (.134)        (.018)
n = 401, R² = .184
The coefficient on log(hseval) is an elasticity: a one percent increase in housing value,
holding the other variables fixed, increases the predicted price by about .12 percent. The
two-sided p-value is zero to three decimal places.
• (iv) Adding log(hseval) makes log(income) and prppov individually insignificant (at even the
15% significance level against a two-sided alternative for log(income), and prppov does
not have a t statistic even close to one in absolute value). Nevertheless, they are jointly
significant at the 5% level because the outcome of the F2,396 statistic is about 3.52 with p-
value = .030. All of the control variables – log(income), prppov, and log(hseval) – are highly
correlated, so it is not surprising that some are individually insignificant.
• (v) Because the regression in (iii) contains the most controls, log(hseval) is individually
significant, and log(income) and prppov are jointly significant, (iii) seems the most reliable.
It holds fixed three measures of income and affluence. Therefore, a reasonable estimate is
that if prpblck increases by .10, psoda is estimated to increase by 1%, other factors held fixed.
• (ii) With log(wage) as the dependent variable the estimated equation is
log(wage)-hat = .284 + .092 educ + .0041 exper + .022 tenure
                (.104) (.007)      (.0017)       (.003)
n = 526, R² = .316, σ̂ = .441.
The histogram for the residuals from this equation, with the best-fitting normal distribution
overlaid, is given below:
• (iii) The residuals from the log(wage) regression appear to be more normally distributed.
Certainly the histogram in part (ii) fits under its comparable normal density better than in
part (i), and the histogram for the wage residuals is notably skewed to the right. In the wage
regression there are some very large residuals (roughly equal to 15) that lie almost five
estimated standard deviations (σ̂ = 3.085) from the mean of the residuals, which is
identically zero, of course. Residuals far from zero do not appear to be nearly as much of a
problem in the log(wage) regression.
• The skewness and kurtosis of the residuals from the OLS regression may be used to test the
null that u is normally distributed (skewness = 0 and kurtosis = 3 under the null). The null
distribution of the Jarque-Bera test statistic (which takes into account both skewness and
kurtosis) is approximately chi-squared with 2 degrees of freedom. Hence, at the 5% level,
we reject the normality of u whenever the JB statistic is greater than 5.99. In STATA, this test
is carried out by the command “sktest”. With the data in WAGE1.RAW, the JB statistic is
extremely large for the wage model and JB = 10.59 for the log(wage) model. While normality
is rejected for both models, the JB statistic is much smaller for the log(wage) model [also
compare its p-values for the skewness and excess-kurtosis tests against those from the wage
model]. Hence normality appears to be a more reasonable approximation to the distribution
of the error term in the log(wage) model.
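The JB statistic is computed from the sample skewness S and kurtosis K as JB = (n/6)[S² + (K − 3)²/4]; a minimal sketch (the five-point data set is illustrative only, not the WAGE1 residuals):

```python
def jarque_bera(resid):
    """JB = (n/6) * [S^2 + (K - 3)^2 / 4], S and K the sample skewness and kurtosis."""
    n = len(resid)
    mean = sum(resid) / n
    m2 = sum((u - mean) ** 2 for u in resid) / n   # second central moment
    m3 = sum((u - mean) ** 3 for u in resid) / n   # third central moment
    m4 = sum((u - mean) ** 4 for u in resid) / n   # fourth central moment
    S = m3 / m2 ** 1.5
    K = m4 / m2 ** 2
    return (n / 6.0) * (S ** 2 + (K - 3.0) ** 2 / 4.0)

# Symmetric toy data, so the skewness term is exactly zero.
jb = jarque_bera([-2.0, -1.0, 0.0, 1.0, 2.0])
# Reject normality at the 5% level when jb > 5.99 (chi-squared with 2 df).
```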
Week 7 Tutorial Exercises
performance on the science test fixed while studying the effects of staff on the math pass
rate? This would be an example of controlling for too many factors in a regression
equation. The variable scill could be a dependent variable in an identical regression
equation.
The t statistic on the interaction term is about 2.13, which gives a p-value below .02 against
H1: β3 > 0. Therefore, we reject H0: β3 = 0 in favor of H1: β3 > 0 at the 2% level.
• (iv) We rewrite the equation as
log(wage) = β0 + θ1educ + β2exper + β3 educ (exper − 10) + u,
and run the regression log(wage) on educ, exper, and educ(exper – 10). We want the
coefficient on educ. We obtain θ1 ≈ .0761 and se(θ1) ≈ .0066. The 95% CI for θ1 is about .063
to .089.
The predicted price at lotsize = 10,000, sqrft = 2,300, and bdrms = 4 is about $336,714.
• (ii) The regression is pricei on (lotsizei – 10,000), (sqrfti – 2,300), and (bdrmsi – 4). We want
the intercept estimate and the associated 95% CI from this regression. The CI is
approximately 336,706.7 ± 14,665, or about $322,042 to $351,372 when rounded to the
nearest dollar.
• (iii) We must use equation (6.36) to obtain the standard error of ŷ⁰ and then use equation
(6.37) (assuming that price is normally distributed). From the regression in part (ii),
se(ŷ⁰) ≈ 7,374.5 and σ̂ ≈ 59,833. Therefore, se(ê⁰) ≈ [(7,374.5)² + (59,833)²]^(1/2) ≈ 60,285.8.
Using 1.99 as the approximate 97.5th percentile in the t84 distribution gives the 95% CI for
price⁰, at the given values of the explanatory variables, as 336,706.7 ± 1.99(60,285.8) or,
rounded to the nearest dollar, $216,738 to $456,675. This is a fairly wide prediction interval.
But we have not used many factors to explain housing price. If we had more, we could,
presumably, reduce the error standard deviation, and therefore σ̂, to obtain a tighter
prediction interval.
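The prediction-interval arithmetic can be replayed directly: se(ê⁰) combines the sampling error of the prediction with the error standard deviation, and the interval is ŷ⁰ ± t·se(ê⁰):

```python
import math

se_yhat0 = 7374.5     # se of the predicted mean (equation 6.36)
sigma_hat = 59833.0   # estimated error standard deviation
yhat0 = 336706.7      # predicted price at the given regressor values
t_975 = 1.99          # approx. 97.5th percentile of the t(84) distribution

se_e0 = math.sqrt(se_yhat0 ** 2 + sigma_hat ** 2)
lower = yhat0 - t_975 * se_e0
upper = yhat0 + t_975 * se_e0
```

This reproduces se(ê⁰) ≈ 60,285.8 and the interval $216,738 to $456,675; note how se(ê⁰) is dominated by σ̂, which is why adding regressors that shrink σ̂ would tighten the interval.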
Week 8 Tutorial Exercises
strictly holds with a single explanatory variable included in the regression, but we often
ignore the presence of other independent variables and use this table as a rough guide. (Or,
we can use the results of Problem 3.10 for a more precise analysis.) If less able workers are
more likely to receive training, then train and u are negatively correlated. If we ignore the
presence of educ and exper, or at least assume that train and u are negatively correlated
after netting out educ and exper, then we can use Table 3.2: the OLS estimator of β1 (with
ability in the error term) has a downward bias. Because we think β1 ≥ 0, we are less likely to
conclude that the training program was effective. Intuitively, this makes sense: if those
chosen for training had not received training, they would have had lower wages, on average,
than the control group.
• (v) The fitted probabilities range from about .185 to 1.051, so none are negative. There are
two fitted probabilities above 1, which is not a source of concern with 660 observations.
• (vi) Using the standard prediction rule – predict one when ecobuy-hat ≥ .5 and zero
otherwise – gives the fraction correctly predicted for ecobuy = 0 as 102/248 ≈ .411, so about
41.1%. For ecobuy = 1, the fraction correctly predicted is 340/412 ≈ .825, or 82.5%. With the
usual prediction rule, the model does a much better job predicting the decision to buy eco-
labeled apples. (The overall percent correctly predicted is about 67%.)
Week 9 Tutorial Exercises
Notice that β1, which is the slope on inc in the original model, is now a constant in the
transformed equation. This is simply a consequence of the form of the heteroskedasticity
and the functional forms of the explanatory variables in the original equation.
• (iv) The smallest fitted value is about .030 and the largest is about .697. The WLS estimates
of the LPM are
e401k-hat = −.488 + .0126 inc − .000062 inc² + .0255 age − .00030 age² − .0055 male
            (.076)  (.0005)     (.000004)     (.0037)     (.00004)      (.0117)
n = 9,275, R² = .108.
There are no important differences with the OLS estimates. The largest relative change is in
the coefficient on male, but this variable is very insignificant using either estimation method.
Week 10 Tutorial Exercises
Readings
• Read Chapter 9 thoroughly.
• Make sure that you know the meanings of the Key Terms at the chapter end.
Exogenous sample selection is a nonrandom sampling scheme in which the sample is
chosen on the basis of the explanatory variables. It is easily seen that such a sampling scheme
does not violate MLR.1, MLR.3 or MLR.4, and hence does not introduce bias into the OLS
estimation. On the other hand, an endogenous sample selection scheme selects the sample
on the basis of the dependent variable and causes bias in the OLS estimation.
• What are outliers and influential observations?
Outliers are unusual observations that are far away from the centre of other observations. A
small number of outliers can have a large impact on the OLS estimates of parameters. These
are called influential observations if the estimation results when including them are
significantly different from the results when excluding them.
• Consider the simple regression model yi = α + (β + bi)xi + ui for i = 1, 2, …, n, where the “slope
parameter” contains a random variable bi; α and β are constant parameters. Assume that the
usual MLR.1-5 hold for yi = α + βxi + ui; E(bi|xi) = 0 and Var(bi|xi) = ω². If we regress y on x
with an intercept, will the estimator of the “slope parameter” be biased from β? Will the usual
OLS standard errors be valid for statistical inference?
The model can be written as yi = α + βxi + (bixi + ui), where the overall disturbance is (bixi
+ ui). Given the stated assumptions, MLR.1-4 hold for this model and the OLS estimator of the
“slope parameter” is unbiased (from β). However, MLR.5 does not hold since the conditional
variance of the overall disturbance, Var(bixi + ui), depends on xi, and the usual OLS
standard errors will not be valid.
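A small simulation illustrates this answer: the OLS slope estimates centre on β even though the overall disturbance (bᵢxᵢ + uᵢ) is heteroskedastic. All parameter values below are illustrative assumptions:

```python
import random

random.seed(1)

def ols_slope(x, y):
    """Simple-regression OLS slope: sample cov(x, y) / sample var(x)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    return sxy / sxx

alpha, beta = 1.0, 2.0    # constant parameters (illustrative)
omega, sigma = 0.5, 1.0   # sd of b_i and of u_i (illustrative)

slopes = []
for _ in range(2000):     # 2000 simulated samples of size 200
    x = [random.gauss(0.0, 1.0) for _ in range(200)]
    y = [alpha + (beta + random.gauss(0.0, omega)) * xi + random.gauss(0.0, sigma)
         for xi in x]
    slopes.append(ols_slope(x, y))

mean_slope = sum(slopes) / len(slopes)   # should be close to beta = 2
```

The average of the 2000 slope estimates is very close to β = 2 (unbiasedness), while Var(bᵢxᵢ + uᵢ) = ω²xᵢ² + σ² varies with xᵢ, so heteroskedasticity-robust standard errors would be needed for valid inference.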
• (iv) Both math10 and lnchprg are percentages. Therefore, a ten percentage point increase in
lnchprg leads to about a 3.23 percentage point fall in math10, a sizeable effect.
• (v) In column (1) we are explaining very little of the variation in pass rates on the MEAP
math test: less than 3%. In column (2), we are explaining almost 19% (which still leaves
much variation unexplained). Clearly most of the variation in math10 is explained by
variation in lnchprg. This is a common finding in studies of school performance: family
income (or related factors, such as living in poverty) is much more important in explaining
student performance than is spending per student or other school characteristics.
The coefficient on grant is actually positive, but not statistically different from zero.
• (iii) When we add log(scrap87) to the equation, we obtain
log(scrap88)-hat = .021 − .254 grant88 + .831 log(scrap87)
                   (.089)  (.147)         (.044)
n = 54, R² = .873,
where the years are indicated as suffixes to the variables for clarity. The t statistic for H0:
βgrant = 0 is −.254/.147 ≈ −1.73. We use the 5% critical value for 40 df in Table G.2: -1.68.
Because t = −1.73 < −1.68, we reject H0 in favor of H1: βgrant < 0 at the 5% level.
• (iv) The t statistic is (.831 – 1)/.044 ≈ −3.84, which is a strong rejection of H0.
• (v) With the heteroskedasticity-robust standard error, the t statistic for grant88 is
−.254/.142 ≈ −1.79, so the coefficient is even more significantly less than zero when we use
the heteroskedasticity-robust standard error. The t statistic for H0: βlog(scrap87) = 1 is (.831 –
1)/.071 ≈ −2.38, which is notably smaller than before, but it is still pretty significant.
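The t statistics in parts (iii)-(v) all follow the same pattern, t = (β̂ − b)/se(β̂) for H0: β = b; a quick check of the four values:

```python
def t_stat(bhat, b_null, se):
    """t statistic for testing H0: beta = b_null."""
    return (bhat - b_null) / se

t_grant = t_stat(-0.254, 0.0, 0.147)       # H0: beta_grant = 0, usual se
t_lag = t_stat(0.831, 1.0, 0.044)          # H0: beta_log(scrap87) = 1, usual se
t_grant_rob = t_stat(-0.254, 0.0, 0.142)   # robust se, part (v)
t_lag_rob = t_stat(0.831, 1.0, 0.071)      # robust se, part (v)
```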
Week 11 Tutorial Exercises
it is purely induced by the time trends and has little to say about the true relationship
between the two time series.
• Why would you include a time trend in regressions with trending variables?
Including a time trend in regressions allows one to study the relationship among time series
variables that is not induced by the time trends in the time series.
• What is seasonality in a time series? Give an example of time series variable with seasonality.
Seasonality in a time series is the fluctuation that repeats itself every year (or every week, or
every day). An example would be the daily revenue of a pub, which is likely to have
day-of-week seasonality.
• For quarterly data, how would you define seasonal dummy variables for a regression model?
We need to define three dummies. For instance, if we take the first quarter as the base, we
define dummy variables Q2, Q3 and Q4 for the second, third and fourth quarters and include
them in the regression model.
Problem Set
Q1. Wooldridge 10.1
• (i) Disagree. Most time series processes are correlated over time, and many of them are
strongly correlated. This means they cannot be independent across observations, which simply
represent different time periods. Even series that do appear to be roughly uncorrelated –
such as stock returns – do not appear to be independently distributed, as they have dynamic
forms of heteroskedasticity.
• (ii) Agree. This follows immediately from Theorem 10.1. In particular, we do not need the
homoskedasticity and no serial correlation assumptions.
• (iii) Disagree. Trending variables are used all the time as dependent variables in a
regression model. We do need to be careful in interpreting the results because we may
simply find a spurious association between yt and trending explanatory variables. Including
a trend in the regression is a good idea with trending dependent or independent variables.
As discussed in Section 10.5, the usual R-squared can be misleading when the dependent
variable is trending.
• (iv) Agree. With annual data, each time period represents a year and is not associated with
any season.
Q3. Wooldridge 10.7
• (i) pe_{t-1} and pe_{t-2} must be increasing by the same amount as pe_t.
• (ii) The long-run effect, by definition, should be the change in gfr when pe increases
permanently. But a permanent increase means the level of pe increases and stays at the new
level, and this is achieved by increasing pe_{t-2}, pe_{t-1}, and pe_t by the same amount.
• (iii) The estimated LRP of i3 with respect to inf is .343 + .382 = .725, which is somewhat
larger than .606, which we obtain from the static model in (10.15). But the estimates are
fairly close considering the size and significance of the coefficient on inf_{t-1}.
• (iv) The F statistic for significance of inf_{t-1} and def_{t-1} is about 5.22, with p-value ≈ .009. So
they are jointly significant at the 1% level. It seems that both lags belong in the model.
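The long-run propensity used in part (iii) is simply the sum of the coefficients on the current and lagged values of inf; a one-line check:

```python
def long_run_propensity(lag_coeffs):
    """LRP: effect of a permanent one-unit increase = sum of all lag coefficients."""
    return sum(lag_coeffs)

# Coefficients on inf_t and inf_{t-1} as reported in part (iii).
lrp = long_run_propensity([0.343, 0.382])
```

This reproduces the LRP of .725 reported above; the static-model estimate .606 ignores the lagged term.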