EATZAZ AHMED
Econometrics
Y1 = μ + U1
Y2 = μ + U2
Y3 = μ + U3
:
Yi = μ + Ui
:
Yn = μ + Un
The estimation of μ depends on the assumptions of the model. The classical
assumptions are as follows:
1). Ui is a random variable for each i.
This means that U1, U2, U3, …, Un are all random variables.
[Random variable: a random variable is one that can take at least two values with
non-zero probability.]
Each Ui is one draw out of infinitely many possible values.
Examples: time is a fixed variable; a person's age (given the date) is not a random
variable; a person's weight is a random variable.
2). E(Ui) = 0 for each i.
On average the errors are equal to zero. Since Ui = Yi − E(Yi),
E(Ui) = E(Yi) − E(Yi) = 0.
This assumption holds by construction.
3). Var(Ui) = σ² for all i.
All error terms have the same variance. This is known as the homoscedasticity
assumption; if it is violated we have heteroscedasticity.
4). Cov(Ui, Uj) = 0 for all i ≠ j.
In time-series data the errors are often correlated, but usually not in cross-section
data. If Cov(Ui, Uj) ≠ 0 for some i ≠ j, we say that Ui is autocorrelated with Uj:
correlation of one variable with itself at different times (for example, food expenditure).
5). Ui is distributed normally.
Sometimes we also make the assumption that
Ui ~ N   [Ui is distributed normally].
For example, with E(Y) = 800 [mean income], an individual value of Y may be 0 or
1,000,000; the deviation of Y from E(Y) is the random error U.
Estimation of μ:
Let μ̂ be an estimator of μ, and define the residuals êi = Yi − μ̂.
We cannot judge μ̂ by the plain sum of residuals Σêi, because positive and negative
residuals cancel: a sample of residuals can sum to zero while Σ|êi| = 19, say.
We should minimize a weighted sum of errors such that larger errors are assigned
greater weights. Suppose we set the weights proportional to the absolute size of the
errors: wi = |êi|. Then we minimize
Σ wi|êi| = Σ |êi|·|êi| = Σ êi².
The estimator μ̂ which minimizes Σêi² is known as the Ordinary Least Squares (OLS)
estimator.
Y = μ + U
[Basic equation]
Estimation: the residual is e = Y − μ̂.
OLS estimator of μ: minimize Σei² = Σ(Yi − μ̂)².
First-order condition:
d(Σei²)/dμ̂ = −2Σ(Yi − μ̂) = 0
⇒ μ̂ = (1/n)ΣYi = Ȳ.
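The first-order condition can be verified numerically: no candidate value beats the sample mean on the sum of squared errors. A minimal Python sketch (the data values are hypothetical):

```python
# Verify that the sample mean minimizes the sum of squared errors,
# as the OLS first-order condition implies.
Y = [4, 8, 10, 12, 11]  # hypothetical sample

def sse(m, data):
    """Sum of squared errors sum((Yi - m)^2) for a candidate estimate m."""
    return sum((y - m) ** 2 for y in data)

mu_hat = sum(Y) / len(Y)  # OLS estimator: the sample mean

# The mean beats every other candidate on a grid of alternatives.
assert all(sse(mu_hat, Y) <= sse(m / 10, Y) for m in range(0, 201))
print(mu_hat, sse(mu_hat, Y))
```

Any shift away from Ȳ strictly increases Σêi², which is exactly what the first-order condition says.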
2). μ̂ is a random variable (a linear function of the random Yi).
3). μ̂ is unbiased.
Proof:
E(μ̂) = E[μ + (1/n)U1 + (1/n)U2 + (1/n)U3 + … + (1/n)Un]
= μ + (1/n)E(U1) + (1/n)E(U2) + (1/n)E(U3) + … + (1/n)E(Un)
= μ + (1/n)(0) + (1/n)(0) + (1/n)(0) + … + (1/n)(0)
[as we know that E(Ui) = 0]
E(μ̂) = μ.
Quaid-e-Azam University Islamabad
4). μ̂ has minimum variance in the class of linear unbiased estimators.
Proof: (a).
Var(μ̂) = E[μ̂ − E(μ̂)]²
= E[μ + (1/n)U1 + (1/n)U2 + (1/n)U3 + … + (1/n)Un − μ]²
= E[(1/n)(U1 + U2 + U3 + … + Un)]²
= (1/n²) E[U1² + U2² + U3² + … + Un² + Σi≠j UiUj]
= (1/n²)[E(U1²) + E(U2²) + E(U3²) + … + E(Un²) + Σi≠j E(UiUj)]
= (1/n²)[σ² + σ² + … + σ² + Σi≠j (0)]   {since Cov(Ui, Uj) = 0}
= (1/n²)(nσ²)
Var(μ̂) = σ²/n --------------- (a)
(b). Now consider any linear unbiased estimator μ* = b1Y1 + b2Y2 + … + bnYn.
(i). Unbiasedness requires Σbi = 1.
(ii). Var(μ*) = σ²(b1² + b2² + b3² + … + bn²) + Σi≠j bibj(0)
= σ²Σbi².
(iii). Comparison: Var(μ̂) < Var(μ*) unless bi = 1/n for all i.
Consider Var(μ*) and minimize it by choosing the bi:
Min over (b1, …, bn): Var(μ*) = σ²Σbi², subject to Σbi = 1.
Make the Lagrangian
L = σ²(b1² + b2² + b3² + … + bn²) + λ[1 − (b1 + b2 + … + bn)]
First-order conditions:
∂L/∂bi = 2σ²bi − λ = 0  (i = 1, 2, 3, …, n) --------------- (A)
∂L/∂λ = 1 − (b1 + b2 + … + bn) = 0 --------------- (B)
From (A), 2σ²bi = λ, so every bi is equal; substituting into (B) gives nbi = 1, hence
bi = 1/n.
⇒ μ* = b1Y1 + b2Y2 + b3Y3 + … + bnYn
= (1/n)Y1 + (1/n)Y2 + (1/n)Y3 + … + (1/n)Yn
= (1/n)[Y1 + Y2 + Y3 + … + Yn]
= (1/n)ΣYi
μ* = μ̂.
Recap: the OLS estimator is linear, unbiased and has minimum variance in the class of
linear unbiased estimators; that is, μ̂ is the best linear unbiased estimator:
μ̂ is BLUE.
μ̂ is a linear function of Y:
μ̂ = (1/n)Y1 + (1/n)Y2 + (1/n)Y3 + … + (1/n)Yn.
Theorem: If X1 ~ N, X2 ~ N, X3 ~ N, …, Xn ~ N, then any linear combination of X1, X2,
X3, …, Xn,
Z = a1X1 + a2X2 + a3X3 + … + anXn ~ N.
By this theorem, since Ui ~ N and Yi = μ + Ui is a linear function of Ui, we infer that Yi ~ N.
Further, μ̂, being a linear function of Y1 to Yn, is also distributed normally:
Ui ~ N ⇒ Yi ~ N ⇒ μ̂ ~ N.
We can then use the standard tools of statistical inference; we can say big things with a
limited amount of data. There is a counter-argument that the above chain of reasoning is too
long and unnecessary: we could just assume μ̂ ~ N. Linearity is not very important
(indispensable); dropping it gives us more options for estimation. Unbiasedness means
E(μ̂) = μ: if we draw all possible random samples of Y and estimate μ from each sample one
by one, then the mean value of μ̂ will be equal to μ. This property is desirable because we
do not want any systematic error in estimation, but it is not indispensable either.
Consider the following example:
Prob(μ − E < μ̂ < μ + E) = 0.6, with E(μ̂) = μ, so μ̂ is unbiased.
Prob(μ − E < μ* < μ + E) = 0.9, with E(μ*) ≠ μ, so the estimator μ* is biased.
We can see in the figure that a biased estimator can be better than an unbiased one
if it is more concentrated around the true value.
Best/minimum variance means that Var(μ̂) < Var(μ*), where μ̂ is the OLS estimator
and μ* is any other linear unbiased estimator. If we compare with a nonlinear or biased
estimator, the property does not help. The BLUE property is desirable, but insisting on
unbiasedness limits our choices, and so does linearity.
The above model determines E(Y) as a constant:
Y = μ + U,
μ = E(Y).
Now suppose we want to determine E(Y) given some information set (I). This
information is usually in the form of data on variables called explanatory variables, e.g.
gender, height, etc.
Suppose such variables are X1, X2, X3, …, Xm. If the set of information is complete
then we can write
Y = f(X1, X2, X3, …, Xm).
Complete information means:
The list of variables X1, X2, X3, …, Xm is complete.
All data are measured accurately.
The functional form f(·) is exactly known.
The three sources of error:
An incomplete list of X variables.
Measurement error in the data.
Misspecification of the functional form.
These will produce the following type of equation:
Y = β2X2 + β3X3 + β4X4 + … + βkXk + Z   [k < m]
Or
Y = β2X2 + β3X3 + β4X4 + … + βkXk + E(Z) + [Z − E(Z)]
Or
Y = β1 + β2X2 + β3X3 + … + βkXk + U,   [X1 = 1]
where β1 = E(Z) and U = Z − E(Z).
The two-variable model:
Y = α + βX + U
Suppose we have data from a random sample of size n; then we can write
Yi = α + βXi + Ui   (i = 1, 2, 3, …, n).
Estimation:
Suppose α̂ and β̂ are estimators of α and β respectively; then we have the estimated
values of Y given as
Ŷi = α̂ + β̂Xi.
The regression residual:
ei = Yi − Ŷi
= Yi − (α̂ + β̂Xi)
Σei² = Σ(Yi − α̂ − β̂Xi)².
For OLS we minimize Σei² with respect to α̂ and β̂:
∂(Σei²)/∂α̂ = Σ2(Yi − α̂ − β̂Xi)(−1) = 0 ----------- (i)
∂(Σei²)/∂β̂ = Σ2(Yi − α̂ − β̂Xi)(−Xi) = 0 ----------- (ii)
From (i): ΣYi = nα̂ + β̂ΣXi. Dividing both sides by n:
α̂ = Ȳ − β̂X̄. ----------- (iii)
Consider (ii) together with (iii); solving them gives
β̂ = Σxiyi / Σxi²,
where xi = Xi − X̄ and yi = Yi − Ȳ.
Thus we have:
1). β̂ is a linear function of Y.
Proof:
β̂ = Σ(xi/Σxi²)Yi
= a1Y1 + a2Y2 + a3Y3 + … + anYn
= ΣaiYi, where ai = xi/Σxi². ----------------------- (vii)
2). β̂ is a linear function of U.
Proof:
β̂ = ΣaiYi
= Σai(α + βXi + Ui)
= αΣai + βΣaiXi + ΣaiUi. ---------------- (viii)
Now consider Σai:
Σai = Σ(xi/Σxi²)
= (1/Σxi²)Σxi
= (1/Σxi²)(0) ⇒ Σai = 0. ------------------ (ix)
Next consider ΣaiXi:
ΣaiXi = (1/Σxi²)ΣxiXi = Σxi²/Σxi²
⇒ ΣaiXi = 1. ------------------ (x)
Substitute (ix) and (x) into (viii):
β̂ = α(0) + β(1) + ΣaiUi
β̂ = β + ΣaiUi ------------------ (xi)
= β + a1U1 + a2U2 + … + anUn, a linear function of U.
It follows that:
2). β̂ ~ N, since it is a linear function of the normal errors Ui.
3). β̂ is unbiased.
Proof:
β̂ = β + ΣaiUi
E(β̂) = E[β + ΣaiUi]
= β + ΣaiE(Ui)   (since the ai are fixed)
= β + Σai(0)   [where E(Ui) = 0]
E(β̂) = β. --------------------- (xii)
4). Variance of β̂:
Var(β̂) = E[β̂ − E(β̂)]²
= E[β + ΣaiUi − β]²
= E[ΣaiUi]²
= E[Σai²Ui² + Σi≠j aiajUiUj]
= Σai²E(Ui²) + Σi≠j aiajE(UiUj)   (since the x values are fixed)
Consider E(Ui²): since E(Ui) = 0,
Var(Ui) = E[Ui − E(Ui)]² = E(Ui²) = σ².
Also E(UiUj) = Cov(Ui, Uj) = 0 for i ≠ j. Hence
Var(β̂) = σ²Σai² = σ²Σ(xi/Σxi²)² = σ²/Σxi². ---------------------- (xiii)
Minimum variance: consider any other linear unbiased estimator β* = ΣbiYi.
β* = Σbi(α + βXi + Ui)
= αΣbi + βΣbiXi + ΣbiUi.
Unbiasedness requires Σbi = 0 and ΣbiXi = 1, so
β* = α(0) + β(1) + ΣbiUi
β* = β + ΣbiUi. ----------------------- (xvii)
Substituting in (xvii):
Var(β*) = E[β + ΣbiUi − β]²
= E[ΣbiUi]²
= E[Σbi²Ui² + Σi≠j bibjUiUj]
= Σbi²E(Ui²) + Σi≠j bibjE(UiUj)   [E(UiUj) = 0]
Var(β*) = σ²Σbi². -------------------------- (xviii)
We need to prove that Var(β*) ≥ Var(β̂):
Var(β*) = σ²Σbi²
= σ²Σ(bi − ai + ai)²
= σ²[Σ(bi − ai)² + Σai² + 2Σ(bi − ai)ai].
[Recall Σai = 0, ΣaiXi = 1, Σbi = 0, ΣbiXi = 1.]
Consider Σ(bi − ai)ai = Σbiai − Σai²:
Σbiai = Σbi(xi/Σxi²) = Σbixi/Σxi² = 1/Σxi² = Σai²,
using Σbixi = ΣbiXi − X̄Σbi = 1 − 0 = 1. Hence Σ(bi − ai)ai = 0 and
Var(β*) = σ²Σ(bi − ai)² + σ²Σai² ≥ Var(β̂), -------------- (xix)
with equality only when bi = ai for all i.
Practice Equation:
Suppose we want to estimate the equation
Yi = α + β(1/Xi) + Ui.
Again derive the OLS estimator of β, and compare Var(β̂) under homoscedasticity
and under heteroscedasticity of the form Var(Ui) = σi².
Properties of OLS residuals:
1). OLS residuals are orthogonal to the regressors.
[Two vectors (a1, a2) and (b1, b2) are orthogonal if a1·b1 + a2·b2 = 0.]
From the first-order condition with respect to α̂:
Σ(Yi − α̂ − β̂Xi) = 0
⇒ Σ(Yi − Ŷi) = 0
⇒ Σei = 0.
From the first-order condition with respect to β̂:
Σ(Yi − α̂ − β̂Xi)Xi = 0
⇒ Σ(Yi − Ŷi)Xi = 0
⇒ ΣXiei = 0.
It follows that the residuals are also orthogonal to the fitted values:
ΣŶiei = Σ(α̂ + β̂Xi)ei = α̂Σei + β̂ΣXiei = 0.
Decomposition of variation:
Σyi² = Σŷi² + Σei² ---------- (i)
Total variation = Explained variation + Residual variation.
(The cross-product term Σŷiei = 0 by the orthogonality results above.)
Using this result we can define the coefficient of determination:
R² = Σŷi²/Σyi² = 1 − Σei²/Σyi².
In the extreme cases R² = 0 (no fit) and R² = 1 (perfect fit), so
0 ≤ R² ≤ 1.
There is no benchmark for how large R² should be; it depends on the context in which
R² is taken.
Example:
Age of Ali = α + β(age of Ali's dad) + U → R² = 1
Weight of Ali = α + β(weight of Ali's dad) + U → R² < 1
Pakistan consumption function → R² = 0.95
In cross-section data R² = 0.4 can be good, but in time-series data R² = 0.9 is not
remarkable. Because R² is bounded, one might think it is the best measure of fit, but a
problem with R² is that it increases whenever we add more variables to the regression.
Example:
Dependent variable: consumption of household. Data: cross-section.
C = α + βY + U → R² = 0.25
C = α + βY + γN + U → R² = 0.46
C = α + βY + γ1Nc + γ2Nm + γ3Nf + δR + U → R² = 0.56
C = linear function of Y, Nc, Nm, Nf, residence, female education, male education,
wealth, etc. → R² = 0.9899
If we make R² the criterion for choosing the number of variables in the equation, we
will end up with as many variables as the number of sample points and R² = 1.
Also note that as the sample size decreases, R² will in general increase.
R² puts no limit on the number of variables included in the model, yet a model
should remain parsimonious.
Adjusted R²:
Consider the formula for R², and adjust each sum of squares for its degrees of
freedom:
R̄² = 1 − [Σei²/(n − k)] / [Σyi²/(n − 1)]
= 1 − (1 − R²)(n − 1)/(n − k).
Unlike R², R̄² can fall when an irrelevant variable is added.
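The two measures can be sketched numerically. The bivariate data below are hypothetical, and k = 2 counts the two estimated parameters of a simple regression:

```python
# R-squared and adjusted R-squared for a simple two-variable regression.
X = [1, 2, 3, 4, 5]          # hypothetical regressor
Y = [4, 8, 10, 12, 11]       # hypothetical dependent variable
n, k = len(Y), 2             # k = number of estimated parameters

xbar, ybar = sum(X) / n, sum(Y) / n
x = [xi - xbar for xi in X]  # deviations from the means
y = [yi - ybar for yi in Y]

beta = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)
tss = sum(b * b for b in y)                    # total sum of squares
sse = tss - beta * sum(a * b for a, b in zip(x, y))  # residual sum of squares

r2 = 1 - sse / tss                             # R^2
r2_adj = 1 - (sse / (n - k)) / (tss / (n - 1)) # adjusted R^2
print(round(r2, 4), round(r2_adj, 4))
```

Here R̄² < R², and the gap widens as more parameters are estimated from the same n.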
Then
Z = (X − μ)/σ ~ N(0, 1).
4). If V1 ~ χ²(m1) and V2 ~ χ²(m2), and V1 and V2 are independent, then
F = (V1/m1)/(V2/m2) ~ F(m1, m2)
[the Fisher distribution with numerator degrees of freedom m1 and denominator
degrees of freedom m2].
5). Suppose
X ~ N(μ, σ²),   V ~ χ²(m),
and X and V are mutually independent. Standardize the normal and divide by √(V/m):
t = [(X − μ)/σ] / √(V/m) ~ t(m).
Back to Econometrics:
Consider
Y = α + βX + U.
Suppose we want to test the null hypothesis
H0: β = β0   [where β0 is a given value]
against the alternative
H1: β ≠ β0.
As we know,
β̂ ~ N(β, σ²/Σxi²).
Therefore
Z = (β̂ − β)/√(σ²/Σxi²) ~ N(0, 1).
In testing the null hypothesis H0, we use the estimated value β̂, the hypothetical
value β0 in place of β, and Σxi² computed from the actual data. That is:
β̂ from the data,
β = β0 from H0,
Σxi² from the data.
Note that σ² remains unknown; one option is to replace σ² by its unbiased estimator
σ̂² = Σei²/(n − 2).
[The proof is in the book.] Then
t = (β̂ − β0)/√(σ̂²/Σxi²) ~ t(n − 2).
Recall the formulas β̂ = Σxy/Σx² and α̂ = Ȳ − β̂X̄.
Example 1:
Y = α + βX + U, where Y = weight and X = age.

 X      Y      x = X−X̄   y = Y−Ȳ    x²      y²      xy
 1      4        −2         −5        4       25      10
 2      8        −1         −1        1        1       1
 3     10         0          1        0        1       0
 4     12         1          3        1        9       3
 5     11         2          2        4        4       4
ΣX=15  ΣY=45    Σx=0       Σy=0    Σx²=10   Σy²=40  Σxy=18

X̄ = 3, Ȳ = 9.
β̂ = Σxy/Σx² = 18/10 = 1.8
α̂ = Ȳ − β̂X̄ = 9 − 1.8 × 3
= 9 − 5.4
= 3.6
The estimated equation is Ŷ = 3.6 + 1.8X:
3.6 = weight at time of birth;
1.8 = rate of increase in weight per year increase in age.
The fitted values Ŷ = 3.6 + 1.8X are:
when X = 1, Ŷ = 5.4
X = 2, Ŷ = 7.2
X = 3, Ŷ = 9.0
X = 4, Ŷ = 10.8
X = 5, Ŷ = 12.6.
We can compute the t-statistic for H0: β = 0 as t = β̂/SE(β̂), where SE(β̂) = √(σ̂²/Σx²).
Set the level of significance (or probability of Type I error, equal to one minus the level of
confidence) = 0.05.
The calculated t-value falls in the rejection range, so we reject H0. This means the effect
of age on weight is significantly different from zero.
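The whole worked example can be reproduced in a few lines, using the same hypothetical age-weight data:

```python
# OLS fit and t-test for the age-weight example.
X = [1, 2, 3, 4, 5]      # age
Y = [4, 8, 10, 12, 11]   # weight
n = len(Y)

xbar, ybar = sum(X) / n, sum(Y) / n
x = [xi - xbar for xi in X]
y = [yi - ybar for yi in Y]

beta = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)  # slope
alpha = ybar - beta * xbar                                       # intercept

resid = [yi - (alpha + beta * xi) for xi, yi in zip(X, Y)]
sigma2 = sum(e * e for e in resid) / (n - 2)       # unbiased sigma-hat^2
se_beta = (sigma2 / sum(a * a for a in x)) ** 0.5  # SE(beta-hat)
t = beta / se_beta                                 # t-stat for H0: beta = 0

print(round(beta, 2), round(alpha, 2), round(t, 2))
```

With only n − 2 = 3 degrees of freedom, the computed t still exceeds the two-sided 5% critical value, matching the conclusion above.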
Testing of Hypothesis:
Suppose we have estimated the following equation:
Ĉ = 10.0 + 0.8Y,   R² = 0.97, n = 25
      (2.5)   (0.1)
[The values in brackets are standard errors.]
1).
H0: α = 0
H1: α ≠ 0
t = 10.0/2.5 = 4.0.
Degrees of freedom = 25 − 2 = 23.
Level of significance = 0.05.
Critical t-values = ±2.069.
Since 4.0 > 2.069, we reject H0.
2).
H0: β = 0
H1: β ≠ 0
t = 0.8/0.1 = 8.0 > 2.069, so we reject H0.
3).
H0: β = 1
H1: β < 1
t = (0.8 − 1)/0.1 = −2.0; the one-tailed critical value is −1.714, so we reject H0.
Interpretation: The results show that 97% of the variation in consumption expenditure is
explained by our model, which indicates that the overall performance of the equation is
satisfactory. The intercept is positive and significantly different from zero; its magnitude
shows that subsistence or autonomous consumption expenditure is 10 thousand rupees per
capita per year. Further, the marginal propensity to consume (MPC) is significantly different
from zero and less than one; the estimated value shows that the MPC is 0.8, i.e. 80% of each
incremental rupee of income is consumed, while the remaining 20% is saved.
Testing a linear restriction on two or more parameters:
Y = α + βX + U
H0: α + β = 1
H1: α + β ≠ 1
The test statistic is
t = (α̂ + β̂ − 1)/SE(α̂ + β̂),
where SE(α̂ + β̂) = √[Var(α̂) + Var(β̂) + 2Cov(α̂, β̂)].
Actual application:
Variances and covariances are obtained from the coefficient variance-covariance
matrix.
Suppose
α̂ = 1.7, β̂ = −0.2.
Other restrictions are tested the same way, e.g.
H0: α + β = 0 against H1: α + β ≠ 0, or
H0: α − β = 0 against H1: α − β ≠ 0.
Assumptions:
1). Ui is a random variable for each i.
1b). Ui is normally distributed (Ui ~ N) for each i.
2). E(Ui) = 0 for each i.
This assumption holds by construction:
Ui = Zi − E(Zi)
E(Ui) = E(Zi) − E[E(Zi)]
= E(Zi) − E(Zi)
= 0.
3). Var(Ui) = σ² for all i.
4). Cov(Ui, Uj) = 0 for all i ≠ j.
5). The X variables are fixed, or exogenous, or non-random.
6). The correlation between X2 and X3 is not equal to ±1.
[Venn-diagram illustration: if the circles for X2 and X3 coincide, the separate
information content of each is zero; if they do not overlap, each has full information
content; with partial overlap the information is rich.]
Estimation:
Yi = β1 + β2X2i + β3X3i + Ui
Replace the unknown parameters by their estimators and set U = 0:
Ŷi = β̂1 + β̂2X2i + β̂3X3i.
First-order conditions: minimizing Σei² yields, in deviation form,
β̂2 = (Σx2y Σx3² − Σx3y Σx2x3) / (Σx2² Σx3² − (Σx2x3)²)
β̂3 = (Σx3y Σx2² − Σx2y Σx2x3) / (Σx2² Σx3² − (Σx2x3)²)
β̂1 = Ȳ − β̂2X̄2 − β̂3X̄3.
Also note:
Var(β̂2) = σ² / [Σx2²(1 − r23²)],   Var(β̂3) = σ² / [Σx3²(1 − r23²)],
where r23 is the correlation between X2 and X3.
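The first-order conditions yield the deviation-form formulas for β̂2 and β̂3; a sketch on a small, entirely hypothetical data set follows, with the orthogonality of the residuals to both regressors serving as a built-in check:

```python
# Two-regressor OLS in deviation form:
# b2 = (Sx2y*Sx3x3 - Sx3y*Sx2x3) / (Sx2x2*Sx3x3 - Sx2x3^2), symmetrically b3.
X2 = [1, 2, 3, 4, 5]
X3 = [2, 1, 4, 3, 5]
Y  = [3, 4, 8, 9, 12]   # hypothetical data

n = len(Y)
def dev(v):
    """Deviations of a series from its mean."""
    m = sum(v) / n
    return [vi - m for vi in v]

x2, x3, y = dev(X2), dev(X3), dev(Y)
S = lambda a, b: sum(p * q for p, q in zip(a, b))  # cross-product sums

den = S(x2, x2) * S(x3, x3) - S(x2, x3) ** 2
b2 = (S(x2, y) * S(x3, x3) - S(x3, y) * S(x2, x3)) / den
b3 = (S(x3, y) * S(x2, x2) - S(x2, y) * S(x2, x3)) / den
b1 = sum(Y) / n - b2 * sum(X2) / n - b3 * sum(X3) / n

# OLS residuals are orthogonal to an intercept and to both regressors.
resid = [yi - (b1 + b2 * p + b3 * q) for p, q, yi in zip(X2, X3, Y)]
assert abs(sum(resid)) < 1e-9
assert abs(sum(e * p for e, p in zip(resid, X2))) < 1e-9
assert abs(sum(e * q for e, q in zip(resid, X3))) < 1e-9
```

The denominator `den` shrinks toward zero as the correlation between X2 and X3 approaches ±1, which is exactly the multicollinearity problem discussed later.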
3). Var(β̂1), Var(β̂2), Var(β̂3): the OLS estimators have minimum variance in the class of
linear unbiased estimators. It can be shown that an unbiased estimator of the variance of U
is
σ̂² = Σei²/(n − k),
where k is the number of estimated parameters.
4). Testing linear restrictions (the F-test). Suppose in
Y = β1 + β2X2 + β3X3 + β4X4 + U
we want to test H0: β2 = 1, β3 = 0, i.e. r = 2 restrictions.
[We count the number of restrictions, not the number of parameters they involve.]
1). Estimate the unrestricted model; compute the fitted values
Ŷi = β̂1 + β̂2X2i + β̂3X3i + β̂4X4i
and the residuals ei = Yi − Ŷi.
Finally compute the unrestricted residual sum of squares ΣeU².
Suppose ΣeU² = 50.
2). Impose the given restrictions; in our example this yields
Y = β1 + (1)X2 + (0)X3 + β4X4 + U
or
Y − X2 = β1 + β4X4 + U.
Estimate β1 and β4, compute the fitted values
Ŷi = β̂1 + X2i + β̂4X4i
and the residuals ei = Yi − Ŷi, and finally compute the restricted residual sum of
squares ΣeR². Suppose ΣeR² = 60.
3). Compute the F-statistic:
F = [(ΣeR² − ΣeU²)/r] / [ΣeU²/(n − k)].
Note these values: ΣeU² = 50, ΣeR² = 60, r = 2, n = 34 and k = 4. Now plug in:
F = [(60 − 50)/2] / [50/(34 − 4)] = 5/(50/30) = 5 × 30/50 = 3.
4). Conclusion:
We conclude by comparing the calculated F-value with the critical F-value; in our
case the critical F-value at r = 2 and n − k = 30 degrees of freedom is supposed to be 2.87.
In our example the calculated F-value > critical F-value, so we reject H0.
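The arithmetic in step 3 fits in one small function; the residual sums of squares below are the supposed values from the example:

```python
# F-statistic for testing r linear restrictions:
# F = ((SSR_R - SSR_U) / r) / (SSR_U / (n - k)), rewritten as one quotient.
def f_restrictions(ssr_r, ssr_u, r, n, k):
    """How much the restrictions raise the error, per restriction,
    relative to the unrestricted error per degree of freedom."""
    return (ssr_r - ssr_u) * (n - k) / (r * ssr_u)

# Supposed values from the example: SSR_U = 50, SSR_R = 60, r = 2, n = 34, k = 4.
F = f_restrictions(60.0, 50.0, 2, 34, 4)
print(F)  # 3.0
```

A larger gap between the restricted and unrestricted sums of squares pushes F up, which is why a big F leads to rejecting the restrictions.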
Now, in the special case where the restricted model contains only the intercept
(H0: all slope coefficients are zero), ΣeR² = Σyi² and the F-statistic can be written in
terms of R²:
F = [R²/(k − 1)] / [(1 − R²)/(n − k)],
where all values come from the unrestricted model, so we can ignore the subscript U.
The F-statistic indicates how much imposing the restrictions increases the error;
equivalently, it indicates the increase in R² due to the removal of the restrictions.
Note 2: in this special case the F-test is the test of the overall significance of the
regression.
MULTICOLLINEARITY:
It is an econometric problem. There are four questions to address:
What is the problem?
What are the consequences of the problem?
How do we test for the problem?
What is the solution?
Recall the three-variable regression model:
Yi = β1 + β2X2i + β3X3i + Ui.
Also recall the OLS formulas for β̂2 and β̂3 and their variances; both involve the
factor (1 − r23²), where r23 is the correlation between X2 and X3.
Case 1: r23 = ±1.
In this case the denominators of the OLS formulas are zero:
β̂2 = 0/0,   β̂3 = 0/0,
so we cannot estimate β2 and β3 by OLS. In fact β2 and β3 cannot be estimated
at all; their true values are not even properly defined:
β2 = ∂E(Y)/∂X2 and β3 = ∂E(Y)/∂X3
do not exist, because X2 cannot vary while X3 is held fixed.
Case 2: r23 = 0.
In this case
β2 = ∂E(Y)/∂X2 = dE(Y)/dX2,
β3 = ∂E(Y)/∂X3 = dE(Y)/dX3.
Finally note that in this case the multiple regression equation and the partial (simple)
regression equations produce identical results.
Recap: in the case r23 = ±1, the multiple regression equation fails both theoretically
and in application. In the other extreme case r23 = 0, the multiple regression equation is not
needed. So the only practical use of the multiple regression equation is when
r23 ≠ 0 and r23 ≠ ±1.
[Venn-diagram illustration: when r23 = ±1 the circles for X2 and X3 coincide and the
separate information content of each variable is zero; when r23 = 0 the circles do not
overlap and each variable's information content is full; in between, |r23| may be low
or high.]
In this equation for the CPI, DD plus TD come from commercial banks and CC is the
major part of money. The equation can be written in any of the following ways:
CPI = α + β[CC + DD + TD + OD] + U
Or
CPI = α + βM2 + U
[M2 = CC + DD + TD + OD]
Or
CPI = α + β(CC + DD) + γ(TD + OD) + U.
There could be a model-specification problem here; it is a matter of judgment, not a
matter of science.
Consequences of Multicollinearity:
Note that the OLS estimators remain BLUE.
(1) The variances of the OLS estimators become large.
Recall the formulas for the variances:
Var(β̂2) = σ²/[Σx2²(1 − r23²)],   Var(β̂3) = σ²/[Σx3²(1 − r23²)].
If |r23| is high, (1 − r23²) will be low, and therefore Var(β̂2) and Var(β̂3) will be large.
It follows that
the standard errors will also be large. Thus the t-value for H0: β2 = 0,
t = (β̂2 − 0)/SE(β̂2),
will be small. Therefore we may accept H0 when we should not. In other words, we
may wrongly conclude that the X variables do not affect Y.
Example:
CPIt = α + βM2t + γERt + δYt + θCPIt−1 + Ut
If the regressors share a strong common trend there is more multicollinearity, and we
may accept H0: βj = 0: the standard errors will be greater and produce misleading t-values.
Another implication of this consequence is that β̂ varies quite a bit from sample to sample.
For instance, β̂ estimated on annual data from 1970 to 2002 may become β̂ = −3.7 when
the data run from 1970 to 2005. The conclusion is that the β̂'s change erratically with small
changes in the data, specification, etc.; the t-values are small and the β̂'s volatile, so there is
no robustness (stability) in the model and no trust in it.
(2) Recall the formula for Cov(β̂2, β̂3): when |r23| is large, this covariance is large
(with sign opposite to that of r23).
If X2 and X3 are positively and highly correlated with each other, then β̂2 and β̂3 have
a negative and very large correlation: under-estimation of β2 will accompany over-estimation
of β3 and vice versa.
Example: with β2 = 100 and β3 = 250, one sample may give β̂2 = 100 and β̂3 = 300,
with the compensating error appearing in the other coefficient in another sample.
Likewise, if X2 and X3 are negatively and highly correlated, then over- (under-)
estimation of β2 will accompany over- (under-) estimation of β3.
Consequences (1) and (2) imply that the estimated parameters (β̂'s) become
volatile (unreliable, unstable) and too sensitive; their magnitudes are quite likely to be
unrealistic in terms of sign and size (even a significant parameter may carry the wrong sign).
Example: in a CPI equation on Y, ER, IR, an estimated MPC of 1.3 or −1.3; or an
own-price elasticity that should be negative coming out positive. Such estimates are
neither realistic nor reliable.
Testing and Diagnostics of Multicollinearity:
Formal tests of multicollinearity are too complex and not very fruitful. In practice
we rely on certain clues and symptoms (indicators).
(1) Multicollinearity is likely to be present if the data are time-series data observed at
low frequency (for example annual rather than monthly data), unless the data are de-trended
(removing the common trend).
(2) A very popular symptom of multicollinearity is that the overall performance of the
estimated equation is good in terms of a high value of R², but the t-statistics for the
individual regression coefficients under H0: βj = 0 are mostly insignificant.
Example 1:
log CPIt = 1.2 + 0.3 log M2t + 0.7 log Yt − 0.25 log ERt + 0.97 log CPIt−1,  R² = 0.9938
t-values:    (0.85)         (1.37)        (−0.09)          (44.73)
— insignificant, insignificant, insignificant, highly significant.
With a small change in the specification the signs even flip:
log CPIt = 1.2 + 0.3 log M2t + (−)0.7 log Yt + (+)0.25 log ERt + 0.97 log CPIt−1,  R² = 0.99
t-values:    (1.57)         (−0.73)        (1.21)           (17.43)
— insignificant, insignificant, insignificant, highly significant.
Where (in a wheat-demand equation)
log Y → income elasticity,
log Pw → own-price elasticity of wheat,
log Pr → cross-price elasticity with respect to rice.
If the signs come out as expected, the results are fine even when individual t-values
are low.
(3) Parameter estimates are too sensitive to changes in the sample, the definition of
variables, and the specification of the model.
If we change the sample a little to add new data and re-run the regression, the
results may change drastically (a symptom that Var(β̂) is too high). Variables can be
defined in more than one way, e.g. output as GNP, GDP or GNI, and which definition
we use can change the results drastically. The same holds for how we specify the model:
C = α + β1Y + γR + …
log C = α + β2 log Y + γ log R + …
β1 = dC/dY,
β2 = d log C/d log Y = (dC/dY)(Y/C),
therefore
β2 = β1(Y/C).
Pairwise correlations among the explanatory variables, e.g. money supply (Ms), real
GDP and the real exchange rate (real ER), can also be examined:
Case 2: r(P, M) = 0.95; r(M, Y) = 0.60 (OK); r(P, Y) = 0.65;
r(GDP, ER) = 0.55 (OK); r(P, ER) = 0.70.
High pairwise correlations among the regressors signal multicollinearity.
Solutions of Multicollinearity:
(1) Exclude the variable(s) causing multicollinearity.
This solution makes sense only when the variable being dropped is not important in the
overall framework of our analysis.
Example:
Pt = α + βMt + γGDPt + δERt + θ1Pt−1 + θ2Pt−2 + U
If Pt−2 is causing multicollinearity we can exclude it, since it is not very important;
but if Mt is causing multicollinearity we should not exclude it, because without Mt
(money supply) we cannot model inflation.
Note: unfortunately, it is often the important variables that cause multicollinearity.
(2) Transform the data, e.g. by first-differencing. Differencing reduces the chances of
multicollinearity drastically, but it also filters out valuable (level) information, and the
intercept is gone; it is not a good solution. Suppose the multicollinearity is caused by a
common trend: why not control for the trend instead? To control for the trend, we include a
time variable in the model/equation:
t = 0, 1, 2, 3, …
Yt = α + λt + βXt + γZt + Ut
Yt−1 = α + λ(t − 1) + βXt−1 + γZt−1 + Ut−1
Subtracting would give ΔYt = λ + βΔXt + γΔZt + ΔUt.
(3) Use prior restrictions. Consider a production function with
K = capital, L = labor, M = material, E = energy:
log Q = log A + α log K + β log L + γ log E + δ log M + U
Or
log Q = a0 + α log K + β log L + γ log E + δ log M + U.
Firms with a higher capital stock also employ more labor, energy and material, so
these regressors are highly correlated.
Test:
H0: α + β + γ + δ = 1
H1: α + β + γ + δ ≠ 1
Suppose H0 is accepted; then we can write
δ = 1 − α − β − γ.
Now the production function becomes, after substitution, a regression in ratio form
(e.g. log(Q/M) on log(K/M), log(L/M) and log(E/M)), which reduces the collinearity.
FRISCH-WAUGH THEOREM:
Suppose we have
Y = β1 + β2X + β3Z + U.
Then
β2 = ∂E(Y)/∂X,
the effect of X on Y holding Z constant. The effect of Z can also be eliminated as follows:
Regress Y on Z:
Y = a0 + a1Z + V,
and obtain the residuals V̂ (by OLS).
Regress X on Z:
X = b0 + b1Z + W,
and obtain the residuals Ŵ (by OLS).
Regressing V̂ on Ŵ then yields the same estimate of β2 as the multiple regression.
AUTOCORRELATION:
Definition:
Correlation between Xi and Yj for i ≠ j, where X and Y may be the same or different
variables, is called serial correlation.
Example:
Ct = α + βYt + Ut
[4 years of monthly data]
If we exclude variables which capture the inertia, the error term captures the inertia
and becomes autocorrelated.
Consequences of Autocorrelation:
Note: OLS estimators remain linear and unbiased.
1). OLS estimators no longer have minimum variance in the class of linear unbiased
estimators; they are no longer best.
2). The ordinary formula for calculating variances is no longer valid:
Var(β̂) ≠ σ²/Σx², because the omitted term Σi≠j aiaj Cov(Ui, Uj) ≠ 0.
OLS estimators are thus inefficient: their variances are larger than necessary. This is
not a big problem in principle, since we can correct the inference by using the proper
variance formula (though OLS still remains not best). But if we keep using the ordinary
formula, the computed standard errors are wrong and the t-values misleading, and we may
accept hypotheses we should reject.
[Testing and solutions for autocorrelation are postponed until we understand the
various forms of autocorrelation.]
Forms of Autocorrelation:
Consider the model
Yt = β0 + β1Xt + Ut.
1). Autoregressive model [AR(p)]:
Ut = ρ0 + ρ1Ut−1 + … + ρpUt−p + εt,
where ρ1Ut−1 + … + ρpUt−p is the autocorrelated portion and εt the non-autocorrelated
portion: the innovation (news, shock), a white-noise error.
2). Moving-average model [MA(q)]:
Ut = θ0εt + θ1εt−1 + … + θqεt−q   [θ0 = 1],
a moving average of the innovations εt.
3). ARMA(p, q) model:
Ut = ρ0 + ρ1Ut−1 + … + ρpUt−p + θ0εt + θ1εt−1 + … + θqεt−q,
combining the AR(p) and MA(q) parts.
AR(1) Model:
Ut = ρ0 + ρ1Ut−1 + εt
This is the most popular and a simple way to model autocorrelation.
Assumptions:
1). εt is a random variable for all t.
2). E(εt) = 0.
3). Var(εt) = σε² for all t.
4). Cov(εt, εt′) = 0 for all t ≠ t′.
[At two different points in time the innovations are not correlated.]
5). |ρ1| < 1.
Properties of Ut:
Solve for Ut by repeated substitution:
Ut = ρ0 + ρ1Ut−1 + εt
= ρ0 + ρ1[ρ0 + ρ1Ut−2 + εt−1] + εt
= ρ0 + ρ0ρ1 + ρ1²Ut−2 + ρ1εt−1 + εt
= ρ0 + ρ0ρ1 + ρ1²[ρ0 + ρ1Ut−3 + εt−2] + ρ1εt−1 + εt
= ρ0 + ρ0ρ1 + ρ0ρ1² + ρ1³Ut−3 + ρ1²εt−2 + ρ1εt−1 + εt
[We will end up with the following equation]
Ut = ρ0 + ρ0ρ1 + ρ0ρ1² + …
+ εt + ρ1εt−1 + ρ1²εt−2 + ρ1³εt−3 + …
+ ρ1^∞ Ut−∞   [ρ1^∞ = 0]
Or
Ut = ρ0[1 + ρ1 + ρ1² + …] + εt + ρ1εt−1 + ρ1²εt−2 + …
= ρ0/(1 − ρ1) + εt + ρ1εt−1 + ρ1²εt−2 + …,
an MA(∞) process. So the AR(1) model equals an MA(∞) model:
Ut = constant + θ0εt + θ1εt−1 + θ2εt−2 + …, with θi = ρ1^i:
a weighted average of past innovations (shocks), with geometrically declining weights.
Parametric Properties of Ut:
1). E(Ut) = ρ0/(1 − ρ1) + E(εt) + ρ1E(εt−1) + ρ1²E(εt−2) + …
= ρ0/(1 − ρ1) + (0) + ρ1(0) + ρ1²(0) + …
= ρ0/(1 − ρ1).
E(Ut) = 0 requires ρ0/(1 − ρ1) = 0, i.e. ρ0 = 0. So in the AR process for Ut, since
E(εt) = 0, we set ρ0 = 0 to have E(Ut) = 0:
Ut = ρ1Ut−1 + εt.
2). Var(Ut) = Var(εt + ρ1εt−1 + ρ1²εt−2 + …)
= Var(εt) + Var(ρ1εt−1) + Var(ρ1²εt−2) + … + (covariances, all zero)
= σε² + ρ1²σε² + ρ1⁴σε² + …
= σε²[1 + ρ1² + ρ1⁴ + …]
Var(Ut) = σε²/(1 − ρ1²). --------------------- (1a)
This is a constant variance for all t; there is no heteroscedasticity in Ut.
3). Cov(Ut, Ut−i) = ρ1^i σε²/(1 − ρ1²) = ρ1^i Var(Ut).
In particular,
Cov(Ut, Ut−1) = ρ1 Var(Ut),
Cov(Ut, Ut−2) = ρ1² Var(Ut),
and so on.
The autocorrelation function is
ρi = Cov(Ut, Ut−i)/Var(Ut) = ρ1^i,
which is a function of i.
Case ρ1 > 0 (e.g. ρ1 = 0.8): the autocorrelation function is geometrically declining,
approaching zero as the lag length increases.
Case ρ1 < 0 (e.g. ρ1 = −0.5): the autocorrelation function is oscillatory, starting with a
negative value at lag length one and approaching zero.
Example: Price level = α + β(money supply) + Ut.
Another case of AR(p):
Suppose we have quarterly data to estimate the equation
Yt = α + βXt + Ut.
We expect seasonal autocorrelation:
Ut = ρ0 + ρ1Ut−1 + ρ2Ut−2 + ρ3Ut−3 + ρ4Ut−4 + εt.
To simplify matters we assume ρ0 = ρ1 = ρ2 = ρ3 = 0, so that
Ut = ρ4Ut−4 + εt.
It follows that the autocorrelations are non-zero only at lags that are multiples of 4
(e.g. ρ5 = 0, ρ6 = 0):
ρi = ρ4^(i/4) for i = 4, 8, 12, …;  ρi = 0 otherwise.
The autocorrelation function:
for ρ4 > 0 (e.g. ρ4 = 0.5) the spikes at lags 4, 8, 12, … are positive and geometrically
declining;
for ρ4 < 0 (e.g. ρ4 = −0.5) they alternate in sign.
[Correlogram plots for AR(1) with ρ1 > 0, AR(4) with ρ4 < 0, and AR(4) with ρ4 > 0.]
The correlogram is just a symptom; reading it is a kind of art, not a perfect science,
but it is a very useful tool.
MA(1) Model:
Ut = εt + θ1εt−1,
where εt satisfies all the standard properties. We can show that
1). E(Ut) = 0.
2). Var(Ut) = σε²(1 + θ1²).
3). Cov(Ut, Ut−i) = θ1σε² for i = 1, and = 0 for i ≥ 2.
To see this, write
Ut = εt + θ1εt−1
Ut−1 = εt−1 + θ1εt−2
Ut−2 = εt−2 + θ1εt−3.
Ut and Ut−1 share the innovation εt−1, while Ut and Ut−2 share no innovation, hence
no correlation.
The autocorrelation function is therefore
ρ1 = θ1/(1 + θ1²) for i = 1, and ρi = 0 for i ≥ 2.
[Correlogram plots for MA(1) with θ1 > 0 and with θ1 < 0: a single spike at lag one,
positive or negative respectively.]
Testing of Autocorrelation:
1). Durbin-Watson test:
The DW test is based on the DW statistic
d = Σt(et − et−1)²/Σtet² ≈ 2(1 − ρ̂),
so that
H0: ρ = 0 ⇒ d = 2
H1: ρ ≠ 0 ⇒ d ≠ 2.
Now unfortunately the distribution of d is not unique; it depends on the actual data,
and we do not have the time and energy to calculate the exact distribution for every data
set. Durbin and Watson provided the two extreme (bounding) distributions, with critical
values dl and du, as shown in the following graph.
The table of critical values provides dl and du for various values of
n (number of observations) and
k′ (number of parameters minus one).
Example:
CPI = α + βM2 + γGDP + δER + U
Sample 1970-71 to 2004-05:
n = 35
k′ = 3
From the table we have
dl = 1.42
du = 1.71.
Suppose the calculated d = 2.74; we determine the right-tail critical values:
4 − dl = 4 − 1.42 = 2.58
4 − du = 4 − 1.71 = 2.29.
Since the calculated d > 4 − dl (and > 4 − du), we reject H0 and conclude that
(negative) autocorrelation is present. This test has some problems:
Notes on the test:
1). The test statistic has an inconclusive range, so it may not produce a concrete conclusion.
2). The test is specifically designed for the AR(1) process, not for higher-order AR
processes, MA processes or others.
3). Despite the above two limitations the test is powerful in detecting autocorrelation,
especially in its most common form, the AR(1) process.
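The statistic d = Σ(et − et−1)²/Σet² is easy to compute directly from residuals. Both residual series below are made up for illustration, one alternating (negative autocorrelation, d near 4) and one drifting (positive autocorrelation, d near 0):

```python
# Durbin-Watson statistic: d = sum((e_t - e_{t-1})^2) / sum(e_t^2).
# d is near 2 with no autocorrelation, near 0 for positive, near 4 for negative.
def durbin_watson(e):
    num = sum((e[t] - e[t - 1]) ** 2 for t in range(1, len(e)))
    return num / sum(et ** 2 for et in e)

# Hypothetical residuals that alternate in sign (negative autocorrelation).
e_neg = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
# Hypothetical residuals with a slow drift (positive autocorrelation).
e_pos = [1.0, 0.9, 0.7, 0.2, -0.3, -0.8]

print(durbin_watson(e_neg), durbin_watson(e_pos))
```

In practice the computed d is then compared against the tabulated dl and du bounds for the given n and k′.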
Decision table for any test (H0 true or false vs. the decision taken):
H0 true, decision accept → correct decision (confidence).
H0 true, decision reject → Type I error.
H0 false, decision reject → correct decision (power).
H0 false, decision accept → Type II error.
2). Durbin's h test (used when the regression contains a lagged dependent variable):
h ~ N(0, 1).
Critical values are ±1.96 for the 5% level of significance,
±1.645 for the 10% level of significance,
±2.576 for the 1% level of significance.
If h turns out to be an imaginary number, the test cannot be applied.
Estimation under autocorrelation:
The estimation procedure attempts to replace the autocorrelated error Ut by the
non-autocorrelated innovation εt.
Consider
Yt = α + βXt + Ut ----------------- (1)
with Ut = ρUt−1 + εt. ----------------- (2)
[Lag equation (1) by one period, multiply through by ρ, and subtract the new equation from (1).]
Yt = α + βXt + Ut ----------------- (1′)
ρYt−1 = ρα + ρβXt−1 + ρUt−1
Subtract: Yt − ρYt−1 = α(1 − ρ) + β(Xt − ρXt−1) + (Ut − ρUt−1).
Or, since Ut − ρUt−1 = εt by equation (2),
Yt − ρYt−1 = α(1 − ρ) + β(Xt − ρXt−1) + εt. ----------------- (A)
Now using equation (2) we can also write
Ut = Yt − α − βXt
ρUt−1 = ρ(Yt−1 − α − βXt−1)
Subtract:
Ut − ρUt−1 = (Yt − α − βXt) − ρ(Yt−1 − α − βXt−1) = εt
Or
(Yt − α − βXt) = ρ(Yt−1 − α − βXt−1) + εt. ----------------- (B)
Equation (A) or (B) has the error term εt, which satisfies all the classical assumptions.
However, the trouble is that both equations are non-linear in parameters: two unknown
coefficients multiply each other, so we cannot derive formulas for the OLS estimators of
α, β and ρ.
Since we cannot use any unique formula to compute the OLS estimators of α, β and ρ,
we have to apply some numerical algorithm.
We will consider two methods:
(1) the Cochrane-Orcutt two-step iterative method;
(2) a version of direct search.
Cochrane-Orcutt two-step iterative method:
Step 1a:
Start with some initial value of ρ; suppose we set ρ = 0.
Then equation (A) becomes
Yt = α + βXt + εt. -------------------------------- (A′)
Apply OLS to obtain α̂ and β̂, and use them in equation (B):
êt = ρêt−1 + εt, -------------------------- (B′)
where êt = Yt − α̂ − β̂Xt. Apply OLS to compute ρ̂.
Step 1b:
Now use ρ̂ in equation (A):
Yt* = α(1 − ρ̂) + βXt* + εt,
where Yt* = Yt − ρ̂Yt−1 and Xt* = Xt − ρ̂Xt−1. Apply OLS to yield α̂ and β̂.
These are the two-step estimators of α and β. Since ρ = 0 is not true, the initial α̂ and β̂
are poor, hence ρ̂ is poor, hence the two-step estimates can be improved.
Step 2a:
Use the latest α̂ and β̂ to recompute the residuals and re-estimate
êt = ρêt−1 + εt. ------------------- (B″)
Step 2b:
Use the new ρ̂ in equation (A) to compute new α̂ and β̂.
The process is repeated (iterated) until the estimates converge.
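The iteration can be sketched with a plain OLS helper. Everything below is illustrative: the data are generated from known α, β, ρ so the loop has something to recover, and the seed, sample size and tolerance are arbitrary choices:

```python
import random

def ols(x, y):
    """OLS intercept and slope for y = a + b*x."""
    n = len(y)
    xb, yb = sum(x) / n, sum(y) / n
    b = sum((xi - xb) * (yi - yb) for xi, yi in zip(x, y)) / \
        sum((xi - xb) ** 2 for xi in x)
    return yb - b * xb, b

# Generate Y = alpha + beta*X + U with AR(1) errors U_t = rho*U_{t-1} + eps_t.
random.seed(0)
alpha, beta, rho = 2.0, 0.5, 0.7   # assumed true values
u, X, Y = 0.0, [], []
for t in range(500):
    u = rho * u + random.gauss(0, 1)
    X.append(0.1 * t)
    Y.append(alpha + beta * X[-1] + u)

# Step 1a: start with rho = 0, i.e. plain OLS on the original data.
a, b = ols(X, Y)
rho_hat = 0.0
for _ in range(100):
    # Estimate rho from the current residuals (equation B').
    e = [yi - a - b * xi for xi, yi in zip(X, Y)]
    new_rho = sum(e[t] * e[t - 1] for t in range(1, len(e))) / \
        sum(et ** 2 for et in e[:-1])
    # Quasi-difference the data with the current rho (equation A).
    Xs = [X[t] - new_rho * X[t - 1] for t in range(1, len(X))]
    Ys = [Y[t] - new_rho * Y[t - 1] for t in range(1, len(Y))]
    a_star, b = ols(Xs, Ys)
    a = a_star / (1 - new_rho)   # the intercept of (A) is alpha*(1 - rho)
    converged = abs(new_rho - rho_hat) < 1e-8
    rho_hat = new_rho
    if converged:
        break
```

On this kind of well-behaved data the loop settles in a handful of iterations, with ρ̂, α̂ and β̂ near their true values up to sampling error.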
This is the so-called numerical derivative: if the expression in (a) is positive, it means that
at ρ = 0.5 the errors are increasing in ρ, so we should next try ρ less than 0.5.
Repeat the same procedure for α and β.
Once we know the directions in which ρ, α and β should be searched, we can change
the initial values and repeat the entire process.
Example:
Start with α = 2, β = 0.5, ρ = 0.7. Suppose
derivative with respect to α > 0,
derivative with respect to β < 0,
derivative with respect to ρ < 0.
Now we can set α = 0.2, β = 0.8, ρ = 0.9.
Suppose the signs of the derivatives are now
positive for α,
positive for β,
negative for ρ.
Now set α = −0.3, β = 0.7, ρ = 0.95, and so on until the sum of squared errors is
minimized.
HETEROSCEDASTICITY:
Introduction:
If the assumption that Var(Ui) = σ² for all i is violated, we have Var(Ui) = σi², which
can vary from observation to observation; this situation is referred to as heteroscedasticity.
Examples:
(i) Qi = α + βKi + γLi + δAi + Ui
Q = wheat output, K = capital, L = labor, A = acreage.
In our sample we have farms of all sizes; Var(Ui) measures the size of the variation in
output due to random factors. We expect Var(Ui) to increase with the size of the farm.
(ii) Yi = α + βXi + Ui
Y = expenditure on snacks, X = income.
Random fluctuation in snack expenditure is low at low incomes and larger at high
incomes, mostly in cross-section data. So
Var(Ui) = σi², which varies across observation points. One reason can be that when the
value of Xi is larger, there are more chances of large unexpected variations in Yi, that is,
σi² = f(Xi).
Example 1):
Yi = α + βXi + Ui
Yi is food consumption, Xi is income, and the data are at household level. The
households with higher income levels are expected to experience larger fluctuations in food
consumption.
Example 2):
Yi = α + βXi + Ui
Yi is wheat output, Xi is area under the wheat crop, and Ui is the random fluctuation in
wheat output. Larger farms are expected to experience larger fluctuations in output; there
can be favorable and unfavorable effects of weather conditions on wheat output.
Obviously the heteroscedasticity problem is more likely to arise where there are larger
variations in Xi. This is more likely to happen in cross-section data than in time-series data.
Heteroscedasticity is mainly a problem of cross-section data; it may arise in time-series data
if the data are observed at high frequency, like daily or weekly data.
Consequences of heteroscedasticity:
The OLS estimators remain linear and unbiased.
1). OLS estimators no longer have minimum variance in the class of linear unbiased
estimators; they are no longer best.
2). The ordinary formula for calculating variances, Var(β̂) = σ²/Σx², is no longer
valid, so the usual standard errors and t-values are misleading.
The Goldfeld-Quandt test:
Yi = α + βXi + Ui
Steps:
1) Arrange the data in ascending order of Xi.
2) Omit the central 20% of observations (adjusted to get a whole number); this will yield two sub-samples: 40% of observations with small Xi and 40% of observations with large Xi.
3) Estimate a regression equation for each sub-sample, compute the residual sums of squares RSS1 and RSS2, and hence
σ̂1² = RSS1 / (n1 - k),  σ̂2² = RSS2 / (n2 - k)
4) Compute the F-statistic, with n1 = n2 = 0.4n:
F = σ̂1² / σ̂2²  if σ̂1² > σ̂2²
F = σ̂2² / σ̂1²  if σ̂2² > σ̂1²
The F-test is applied at the 5% level of significance and degrees of freedom (df) equal
to n1 - k and n2 - k. Our null and alternative hypotheses are as given below:
H0: σ1² = σ2²  [no heteroscedasticity]
H1: σ1² ≠ σ2²  [heteroscedasticity]
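The steps above can be sketched in Python. This is a minimal illustration on simulated data; the data-generating process and all names are assumptions for the example, not part of the notes.

```python
import numpy as np

def goldfeld_quandt(y, x, drop_frac=0.2):
    """Goldfeld-Quandt: sort by x, drop the central observations, and
    compare the residual variances of the two outer sub-samples."""
    order = np.argsort(x)
    y, x = y[order], x[order]
    n = len(y)
    n_sub = int(n * (1 - drop_frac) / 2)        # ~40% of n in each sub-sample
    k = 2                                       # parameters per regression

    def s2(ys, xs):
        X = np.column_stack([np.ones_like(xs), xs])
        beta, *_ = np.linalg.lstsq(X, ys, rcond=None)
        e = ys - X @ beta
        return (e @ e) / (len(ys) - k)          # sigma_hat^2 = RSS / (n_sub - k)

    s1 = s2(y[:n_sub], x[:n_sub])               # small-x sub-sample
    s2_ = s2(y[-n_sub:], x[-n_sub:])            # large-x sub-sample
    return max(s1, s2_) / min(s1, s2_)          # larger variance on top

# Simulated example where Var(U_i) grows with X_i, so F should be large.
rng = np.random.default_rng(0)
x = rng.uniform(1, 10, 200)
u = rng.normal(0, 0.5 * x)                      # heteroscedastic errors
y = 2.0 + 1.5 * x + u
F = goldfeld_quandt(y, x)
```

F is then compared with the F table at (n1 - k, n2 - k) degrees of freedom; in this simulation it comes out well above 1.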
Notes:
1). The test is very powerful.
2). If there is more than one X variable, the test becomes quite complicated, e.g.
Foodi = α + β Incomei + γ Familyi + Ui
or
Yi = α + βXi + γZi + Ui
3). The test does not indicate the form of heteroscedasticity (whether it arises through
linear, squared or cross-product terms).
White's general test:
W = n R²
[R² is obtained from the auxiliary regression, equation (2); if it is not negligible, it is
significant.]
F ~ F (m-1), (n-m)  [F-version of the test]
W ~ χ²(m)
where m is the number of regressors in the auxiliary regression:
m = 1 + (k-1) + (k-1) + (k-1)(k-2)/2
[intercept; linear terms; squared terms; cross-product terms]
m = 1 + (k-1) + (k-1) + (k-1)(k-2)/2
  = 1 + k - 1 + k - 1 + (k² - 3k + 2)/2
  = 2k - 1 + (k² - 3k + 2)/2
  = k²/2 + k/2
m = k (k+1)/2
Hypotheses:
H0: ai = 0, bi = 0, ci = 0 for all i except the intercept.
H1: At least one parameter in H0 is ≠ 0.
Rejection of H0 indicates the presence of heteroscedasticity.
Notes:
1) If k is large then m will also be large, and this will reduce the power of the test.
For example, if k = 6 then m = 6*7/2 = 21.
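The mechanics can be sketched in Python (simulated data; the degrees-of-freedom convention and all names are assumptions for illustration):

```python
import numpy as np
from itertools import combinations

def white_test(y, X):
    """White's general test: regress squared OLS residuals on levels,
    squares and cross-products of the regressors; statistic = n * R^2."""
    n, k = X.shape                              # first column of X = intercept
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    e2 = (y - X @ beta) ** 2                    # squared OLS residuals
    cols = [X[:, j] for j in range(1, k)]
    Z = [np.ones(n)] + cols                     # intercept + (k-1) levels
    Z += [c * c for c in cols]                  # (k-1) squares
    Z += [a * b for a, b in combinations(cols, 2)]  # (k-1)(k-2)/2 cross terms
    Z = np.column_stack(Z)                      # m = k(k+1)/2 columns in total
    g, *_ = np.linalg.lstsq(Z, e2, rcond=None)
    resid = e2 - Z @ g
    tss = (e2 - e2.mean()) @ (e2 - e2.mean())
    r2 = 1 - (resid @ resid) / tss
    return n * r2, Z.shape[1]                   # compare with the chi-square table

rng = np.random.default_rng(1)
n = 300
x1, x2 = rng.uniform(1, 5, n), rng.uniform(1, 5, n)
X = np.column_stack([np.ones(n), x1, x2])
y = 1 + 2 * x1 - x2 + rng.normal(0, x1)         # error variance rises with x1
stat, m = white_test(y, X)                      # m = 3*4/2 = 6 here
```

With the simulated heteroscedasticity, the statistic lands far above the relevant chi-square critical value.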
3) The test is very general in application; it can detect more than one form of heteroscedasticity.
Solutions to heteroscedasticity:
Informal solution:
In some contexts we can respecify our model to reduce the chances of
heteroscedasticity.
Example 1:
Suppose we suspect that heteroscedasticity relates to K (capital), and we also expect the
output elasticities to sum to one (constant returns to scale); then we can rewrite the
equation in per-unit (intensive) form.
This is a more stable model than the previous one; the respecified equation is less likely to
suffer from heteroscedasticity.
Example 2:
Consider a quadratic expenditure system (QES):
Yi = α + βXi + γXi² + Ui
Yi = food,
Xi = income.
Suppose Var (Ui) = σ²Xi². Dividing the equation through by Xi gives the budget-share form:
Si = Yi/Xi = α (1/Xi) + β + γXi + Vi,  where Vi = Ui/Xi
Var (Vi) = Var ((1/Xi) Ui)
= (1/Xi²) Var (Ui)
= (1/Xi²) σ²Xi²
= σ²  -------- no heteroscedasticity.
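A quick numerical check of this transformation (a sketch: the data-generating process with Var (Ui) = σ²Xi² and all parameter values are assumptions):

```python
import numpy as np

# If Var(U_i) = sigma^2 * X_i^2, dividing Y_i = alpha + beta*X_i + U_i
# through by X_i gives Y_i/X_i = alpha*(1/X_i) + beta + V_i with V_i
# homoscedastic, so OLS on the transformed equation is appropriate.
rng = np.random.default_rng(2)
n = 500
x = rng.uniform(1, 10, n)
u = rng.normal(0, 2.0 * x)                  # error sd proportional to x
y = 3.0 + 1.5 * x + u

Z = np.column_stack([1 / x, np.ones(n)])    # regress y/x on [1/x, 1]
coef, *_ = np.linalg.lstsq(Z, y / x, rcond=None)
alpha_wls, beta_wls = coef                  # recover alpha and beta
```

The transformed regression recovers the parameters of the original equation with homoscedastic errors.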
Consider the model
Y = α + βD + U  --------------------- (i)
where D = 1 for female, = 0 for male.
From equation (i), if we assume as usual that E (U) = 0 and D is fixed, we can infer the
following:
E (Y) = α + βD
[Mean income depends on gender]
E (YM) = α
E (YF) = α + β
E (YF) - E (YM) = (α + β) - α = β
Now define two dummies:
D1 = 1 for female, = 0 for male
D2 = 1 for male, = 0 for female
We can write the model in three different forms (ways):
Y = α0 + α1D1 + U  --------------------- (i)
Y = β0 + β1D2 + V  --------------------- (ii)
Y = γ1D1 + γ2D2 + W  --------------------- (iii)
If we include dummies for all categories of a qualitative variable and also include an
intercept, it will create the dummy-variable trap: this creates perfect collinearity and
estimation breaks down.
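The trap is easy to demonstrate numerically: with an intercept plus a dummy for every category, the regressor matrix loses full column rank (a sketch; the data are simulated):

```python
import numpy as np

rng = np.random.default_rng(3)
male = rng.integers(0, 2, 50).astype(float)     # D1 = 1 if male
female = 1.0 - male                             # D2 = 1 if female
ones = np.ones(50)

X_trap = np.column_stack([ones, male, female])  # intercept + ALL dummies
X_ok = np.column_stack([ones, female])          # intercept + one dummy

rank_trap = int(np.linalg.matrix_rank(X_trap))  # 2, not 3: perfect collinearity
rank_ok = int(np.linalg.matrix_rank(X_ok))      # 2: full column rank
```

Since the intercept column equals male + female exactly, X'X is singular in the first design and OLS cannot be computed.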
E (Y) = α0 + α1D1 = β0 + β1D2 = γ1D1 + γ2D2
E (YM) = α0 = β0 + β1 = γ2  [if D1 = 0, D2 = 1, i.e. male]
E (YF) = α0 + α1 = β0 = γ1  [if D1 = 1, D2 = 0, i.e. female]
E (YF) - E (YM) = α1 = -β1 = γ1 - γ2  [difference]
Equation (iii), which has no intercept, is not a very good specification, but essentially
there is no difference in the results across the three forms (with male as the base
category). The same device applies, e.g., to the relationship of education and literacy with
income.
Education dummies (base category: illiterate):
D2 = 1 if primary, = 0 otherwise.
D3 = 1 if secondary, = 0 otherwise.
D4 = 1 if senior secondary, = 0 otherwise.
D5 = 1 if higher, = 0 otherwise.
The regression model is:
Y = α1 + α2D2 + α3D3 + α4D4 + α5D5 + U
We can see that
E (Y) = α1 + α2D2 + α3D3 + α4D4 + α5D5
E (YI) = α1
E (YP) = α1 + α2
E (YS) = α1 + α3
E (YSS) = α1 + α4
E (YH) = α1 + α5
G =1 if female,
=0 otherwise
Categories and mean incomes (the δ's denote the female differentials entering through G):
G = 0:
Male, primary:  E (YP) = α1
Male, secondary:  E (YS) = α1 + α2
Male, higher:  E (YH) = α1 + α3
G = 1:
Female, primary:  E (YP) = α1 + δ1
Female, secondary:  E (YS) = α1 + α2 + δ1 + δ2
Female, higher:  E (YH) = α1 + α3 + δ1 + δ3
----------------------------------------------------------------------------------------------.
Combining Qualitative and Quantitative Variables:
Suppose income depends upon experience and education. Experience is measured as a
quantitative variable (the years of experience); education has three categories:
1). M Sc. or equivalent
2). M Phil or equivalent
3). PhD or equivalent
Defining dummies (base category: M Sc.):
D2 = 1 if M.Phil, = 0 otherwise
D3 = 1 if PhD, = 0 otherwise
The model can be constructed as follows:
Y = α + βE + U  -------------------- (1)
α = α1 + α2D2 + α3D3  ---------------------- (2a)
β = β1 + β2D2 + β3D3  ---------------------- (2b)
Substitute (2a) and (2b) into (1); then
Y = α1 + α2D2 + α3D3 + [β1 + β2D2 + β3D3] E + U
[Mean income:]
E (Y) = α1 + α2D2 + α3D3 + [β1 + β2D2 + β3D3] E
E (Y | M Sc) = α1 + β1 E
E (Y | M.Phil) = (α1 + α2) + (β1 + β2) E
E (Y | PhD) = (α1 + α3) + (β1 + β3) E
We expect that
α3 > α2 > 0,  β3 > β2 > 0,  α1 > 0,  β1 > 0
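A simulated sketch of estimating this specification by OLS on the expanded regressor set (the parameter values and category sizes are assumptions):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 600
E = rng.uniform(0, 20, n)                       # years of experience
cat = rng.integers(0, 3, n)                     # 0 = MSc, 1 = MPhil, 2 = PhD
D2 = (cat == 1).astype(float)                   # M.Phil dummy
D3 = (cat == 2).astype(float)                   # PhD dummy

# True (illustrative) parameters: intercept and slope both rise with education.
y = 10 + 5 * D2 + 9 * D3 + (1.0 + 0.5 * D2 + 0.8 * D3) * E + rng.normal(0, 1, n)

# Regressors: 1, D2, D3, E, D2*E, D3*E, matching the substituted equation.
X = np.column_stack([np.ones(n), D2, D3, E, D2 * E, D3 * E])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
# b = [alpha1, alpha2, alpha3, beta1, beta2, beta3]
```

The interaction columns D2*E and D3*E are what allow the slope on experience to differ by education category.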
Suppose the assumption that the X variables are exogenous is not true. This situation is
called the case of stochastic / random regressors.
In a typical equation we have
Y = α + βX + U
IS:
R=
Where G is given but R and Y are not: the government can change G according to its
needs, while R and Y are determined endogenously.
Example 6:
Weight = α + β Age + U
Age is given (fixed) and the information is complete; in practice more factors, like age,
are given.
Consequences of the stochastic / random regressors problem:
Consider
Y = α + βX + U
where U satisfies all the standard assumptions, but X is not fixed; it is random.
is biased.
3). It can be shown that if x is random and merely uncorrelated with u (rather than fully
independent of it), then the OLS estimator is biased, but the bias approaches zero as the
sample size increases.
Examples (in each case the regressor is random and may be correlated with U):
1). Weight = α + βF + U
2). Q = α + βP + U
3). Y = α + βR + γG + U
A good example is
Yt = α + βYt-1 + Ut
Since Yt-1 = α + βYt-2 + Ut-1 depends on the random variable Ut-1, Yt-1 is random; but Ut is
uncorrelated with Yt-1 (today's shock does not change yesterday's event).
Because Ut and Yt-1 are uncorrelated, as the sample size tends to infinity (or becomes
reasonably large) the bias becomes negligible.
4). It can be shown that if x is random and x is not independent of U, then the OLS estimator
is biased and the amount of bias does not diminish with an increase in sample size.
5). Consider a special case of example 4:
Yt = α + βYt-1 + Ut  [current CPI depends upon previous CPI]
Ut = ρUt-1 + εt
Ut depends on Ut-1; but Yt-1 also depends on Ut-1, so
Ut and Yt-1 are correlated.
Now the OLS estimator becomes biased. [Recall that the DW statistic also becomes biased;
now we know the reason.] Autocorrelated errors and a lagged dependent variable together
create a more serious problem.
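A small Monte Carlo sketch of case 5 (the parameter values are illustrative assumptions): with β = 0.5 and ρ = 0.6 the OLS slope tends to roughly (β + ρ)/(1 + βρ) ≈ 0.85 rather than 0.5, and the bias does not vanish as the sample grows.

```python
import numpy as np

rng = np.random.default_rng(5)

def mean_ols_slope(beta=0.5, rho=0.6, T=200, reps=200):
    """Average OLS slope in Y_t = beta*Y_{t-1} + U_t with U_t = rho*U_{t-1} + eps_t."""
    est = []
    for _ in range(reps):
        u, y = 0.0, 0.0
        ys = []
        for _ in range(T + 50):                 # 50 burn-in periods
            u = rho * u + rng.normal()          # AR(1) error
            y = beta * y + u                    # lagged dependent variable
            ys.append(y)
        ys = np.array(ys[50:])
        X = np.column_stack([np.ones(T - 1), ys[:-1]])
        b, *_ = np.linalg.lstsq(X, ys[1:], rcond=None)
        est.append(b[1])
    return float(np.mean(est))

mean_b = mean_ols_slope()                       # lands well above the true 0.5
```

The upward-distorted slope illustrates why OLS (and the DW statistic) cannot be trusted in this setting.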
Solution / Estimation Procedure:
Consider the model
Y = α + βX + U,  Cov (X, U) ≠ 0
[X is random and correlated with U, as in example 4.]
Now we define an instrumental variable, say Z, as a variable that satisfies two conditions:
1). Z and X are closely correlated within the given sample.
2). Cov (Z, U) = 0 in the population. [This condition cannot be verified directly from the
sample.]
So X is correlated with U, but Z is not correlated with U.
Example:
Food = α + β Weight + U
X = weight
Z = age
Weight is correlated with U (sickness, for example, affects both food intake and weight),
but age is not.
Example:
C = α + βY + U
Y = C + Z
Z = exogenous  [U affects C, which affects Y]
Z (and, in time-series data, Yt-1) is a good instrument for Y in this example.
Stage 1:
Y = α + βX + U  ------------ (i)
Cov (X, U) ≠ 0, and Z is a valid instrument.
Regress X on Z:
X = a + b Z + V
Apply OLS to obtain â and b̂, and hence the fitted values X̂ = â + b̂ Z.
X̂ contains only those variations in X which are determined by Z.
Stage 2: Rewrite (i) as
Y = α + β (X̂ + X - X̂) + U
Y = α + βX̂ + [U + β (X - X̂)]
[error-in-X term]
or
Y = α + βX̂ + W,  where W = U + β (X - X̂).
Now apply OLS.
These estimators are called 2SLS estimators as well as Instrumental Variables Least
Squares [IVLS] estimators.
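The two stages can be sketched with simulated data (the data-generating process, instrument strength and all names are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 2000
z = rng.normal(size=n)                      # instrument: drives x, unrelated to u
u = rng.normal(size=n)
x = 1.0 * z + 0.8 * u + rng.normal(size=n)  # endogenous: Cov(x, u) != 0
y = 2.0 + 1.5 * x + u                       # true beta = 1.5

def fit(dep, reg):
    A = np.column_stack([np.ones(len(reg)), reg])
    b, *_ = np.linalg.lstsq(A, dep, rcond=None)
    return b

b_ols = fit(y, x)[1]                        # biased upward by the endogeneity

a = fit(x, z)                               # Stage 1: regress X on Z
x_hat = a[0] + a[1] * z                     # fitted values, the "clean" part of X
b_2sls = fit(y, x_hat)[1]                   # Stage 2: regress Y on X_hat
```

Here b_ols drifts well above the true 1.5 while b_2sls recovers a value close to it; note that proper 2SLS standard errors need a correction this sketch omits.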
Qd = α + βP + U  [Demand]
Qs = γ + δP + V  [Supply]
P and Q are both endogenous variables: they are determined simultaneously, each one
entering the determination of the other. This is referred to as the simultaneous-equations
case.
! ~~~~~~~~~~~~~~~~~~~~~~~~~ !
f) Following is a set of simultaneous equations:
Z = α + βY + U
Y = C + I + G + X - M
Answer: False. Z and Y are not simultaneously determined: Y is given by the second
equation without reference to Z, so the two equations do not have to be solved at the same
time and do not form a simultaneous system.
g) The inconclusive range in the DW test is the result of type-2 error.
Answer: False, because a type-2 error is accepting H0 when it is false, whereas the
inconclusive range means that we are unable to give a concrete result.
h) Goldfeld-Quandt test is a powerful method of estimating an equation in the presence of
heteroscedasticity.
Answer: False. Goldfeld-Quandt is a test for detecting heteroscedasticity, not an
estimation method.
Q.2: Critically evaluate the following statements. Give details to justify your answer.
a) Hybrid equations are used in order to remove both autocorrelation and multicollinearity from
an equation.
Answer: We have not covered the hybrid-equations topic.
b) In the presence of multicollinearity, the OLS estimator is linear and unbiased and its variance
is smaller than the variance of any other linear and unbiased estimator.
Answer: True. Multicollinearity (short of perfect collinearity) violates none of the
Gauss-Markov assumptions, so the OLS estimator remains linear, unbiased and best:
its variance is the smallest in the class of linear unbiased estimators.
c) In the presence of autocorrelation OLS estimators of regression parameters are likely to have
large sampling error and, therefore, they are unbiased.
Answer: Partly true. Under autocorrelation the OLS estimators do remain unbiased and are
likely to have large sampling errors, but the word "therefore" is misplaced: unbiasedness
is not a consequence of the large sampling error.
d) The estimators based on Cochrane-Orcutt iterative method are linear and unbiased with
minimum variance.
Q.3: Using a cross section data of 500 household you are to study the effects of income, rural-urban
residence and education level of the household on household savings. The information on
education is classified as no education, school level education and higher education. Formulate an
appropriate regression equation.
Answer: Let S be household savings and Y household income. Define R = 1 if urban, = 0 if
rural; E2 = 1 if school-level education, = 0 otherwise; E3 = 1 if higher education, = 0
otherwise (no education is the base category). Then:
S = α + βY + U  ------------------- (1)
α = α1 + α2 R + α3 E2 + α4 E3  ------------------- (2a)
β = β1 + β2 R + β3 E2 + β4 E3  ------------------- (2b)
W = n R²
[R² is obtained from the auxiliary regression; if it is not negligible, heteroscedasticity is
significant.]
Without cross-product terms the auxiliary regression has
m = 2k - 1 regressors
= 2*9 - 1
= 18 - 1
m = 17  [k = 9]
n = 500
n - m = 500 - 17 = 483.
With cross-product terms, m = k (k+1)/2.
Test the hypotheses:
H0: ai = 0, bi = 0, ci = 0 for all i except the intercept.
H1: At least one parameter in H0 is ≠ 0.
Rejection of H0 indicates the presence of heteroscedasticity; acceptance of H0 indicates
no heteroscedasticity.
Q.5: Consider the following estimated regression equation based on a sample of 26 firms in a
manufacturing industry of Pakistan, where MPL and L denote the marginal product of labor and
the number of labor units respectively. The values in parentheses are the computed t-values.
MPL = 100 + 0.012 (1/L),  R² = 0.4
      (40.0)  (0.03)
[Elasticities of money demand:]
β1 = ∂Log (Mt) / ∂Log (Pt)
β2 = ∂Log (Mt) / ∂Rt
β3 = ∂Log (Mt) / ∂Log (Wt)
β4 = ∂Log (Mt) / ∂Log (Mt-1)
H1: β ≠ 0
ii. The output elasticity of money demand is greater than the price elasticity.
Answer:
H0: βY = βP
H1: βY > βP
Answer:
H0: β1 = β2, β3 = 0, β4 = 0
H1: β1 ≠ β2, β3 ≠ 0, β4 ≠ 0
iii. When a lagged dependent variable appears on the right-hand side, the DW test is not
appropriate for detecting first-order autocorrelation; Durbin's h-test is suitable in this
case.
Q.8: Can White's general test detect all types of autocorrelation in a random variable?
Answer: No. White's general test is a test for heteroscedasticity, not for autocorrelation;
autocorrelation is a separate problem (more likely in time-series data) with its own tests.
Q.11: Interpret the following regression equation as an economist. C, Y and W are per capita
consumption, income and wealth respectively, all in thousand rupees. Numbers in parentheses are
the t-values.
Ct = 1.17 + 0.45Yt + 0.55Wt,  R² = 0.9643,  DW = 0.09
     (2.13)  (6.17)   (1.43)
Answer: Interpretation:
rupees per capita per year. Further, note that the marginal propensity to consume (MPC)
out of income is significantly different from zero and less than one: the estimated MPC of
0.45 means that 45% of each incremental rupee of income is consumed. The coefficient on
wealth suggests that 55% of each incremental rupee of wealth is consumed, although its
t-value (1.43) is not significant.
Q.12: Suppose in the equation Yt = a + bXt + Ut, the stochastic variables Xt and Ut are correlated with
each other.
a) Does this imply that we have problems of autocorrelation and/or multicollinearity and/or
heteroscedasticity?
Answer: No: correlation between a regressor and the error term is none of these; it is the
endogeneity problem.
b) Can in this case the equation be estimated by White's general test or Durbin-Watson test or
Durbin's h-test?
Answer: No. These are tests, not estimation methods; the equation calls for an
instrumental-variables estimator such as 2SLS.
Q.13: Suppose you have estimated two alternative cost functions for wheat using data on 500 farms.
The cost (C) is measured in thousands of rupees while output (Q) is measured in tons. The
regression results are given below. The values in parentheses are standard errors.
C/Q = 4568 + 0.2284 Q + 4.84 Q⁻¹
      (1141)  (0.5521)   (4.40)
Can you test the null hypothesis that the marginal cost is an increasing function of output for each
equation? If yes, apply the test and draw your conclusion. If not, explain why the test cannot be
applied and what additional information, if any, is required to perform the test.
Answer:
Log (C) = -10.48 + 1.12 log (Q)
          (2.56)   (0.16)
Null hypothesis:
H0: β = 0
H1: β ≠ 0
t = 1.12 / 0.16 = 7.0
We reject H0.
H0: β = 1
H1: β < 1
We accept H0  [t = (1.12 - 1) / 0.16 = 0.75].
C/Q = 4568 + 0.2284 Q + 4.84 Q⁻¹
      (1141)  (0.5521)   (4.40)
Null hypotheses:
H0: β2 = 0
H1: β2 ≠ 0
We reject H0.
H0: β3 = 0
H1: β3 ≠ 0
We reject H0.
H0: β2 = 1
H1: β2 < 1
We accept H0.
H0: β3 = 1
H1: β3 > 1
We accept H0.
Conclusion:
Tests on the individual coefficients are reported above, but to judge whether each equation
is satisfactory overall we would also need its R², which is not given for each equation. The
results indicate that marginal cost is not an increasing function of output for either
equation.
Q.14: Suppose you want to study the propositions:
i. Loan recovery rate varies considerably across private commercial banks, publicly owned
commercial banks and development finance institutions.
ii. The loan recovery rate declines with the size of loan.
Formulate an appropriate econometric equation, giving special attention to the construction of
variables and the type of data to be used for estimation.
Answer:
i. Let R be the loan recovery rate and S the size of loan (cross-section data on loans):
R = α + βS + U  ---------------- (1)
Dummies for type of bank (base category: private commercial banks):
D2 = 1 if publicly owned commercial bank, = 0 otherwise.
D3 = 1 if development finance institution, = 0 otherwise.
α = α1 + α2 D2 + α3 D3  ------------------- (2a)
β = β1 + β2 D2 + β3 D3  ------------------- (2b)
ii. Proposition ii implies β < 0 (recovery rate declines with loan size).
DW = 1.82
Y = α + βR + γZ + U
M = λ + μR + νY + θW + V
Model:
Y = α + βX + U  ------------ (i)
Cov (X, U) ≠ 0, and Z is a valid instrument.
Stage 1:
Regress X on Z:
X = a + b Z + V
Apply OLS to obtain â and b̂, and hence the fitted values X̂ = â + b̂ Z.
X̂ is such that it contains only those variations in X which are determined by Z (basically
we are filtering out the problem).
In other words we have
Y = α + β (X̂ + X - X̂) + U
Y = α + βX̂ + [U + β (X - X̂)]
[error-in-X term]
or
Y = α + βX̂ + W,  where W = U + β (X - X̂).
Now apply OLS.
These estimators are called 2SLS estimators as well as Instrumental Variables Least
Squares [IVLS] estimators.
c) Provide interpretation for each parameter in the light of the economic model you have chosen.
Answer:
Q.18:
a) Explain the use of dummy variables in determining the effects of gender (male or female) and
education (matriculation, intermediate, bachelor or higher) on wage rates among clerical
personnel.
Answer:
Gender dummy:
G = 1 if female, = 0 otherwise.
Education dummies (base category: matriculation):
E2 = 1 if intermediate, = 0 otherwise.
E3 = 1 if bachelor or higher, = 0 otherwise.
We can construct the model in this way:
W = α + βG + U  ------------------- (1)
α = α1 + α2 E2 + α3 E3  ------------------- (2a)
β = β1 + β2 E2 + β3 E3  ------------------- (2b)
Mean wages by category:
G = 0:
Male, Matriculation:  α1
Male, Intermediate:  α1 + α2
Male, Bachelor or higher:  α1 + α3
----------------------------------------------------------------------------------------------.
G = 1:
Female, Matriculation:  α1 + β1
Female, Intermediate:  (α1 + α2) + (β1 + β2)
Female, Bachelor or higher:  (α1 + α3) + (β1 + β3)
----------------------------------------------------------------------------------------------.
Interpretation of parameters:
α1 = Mean wage of males with matriculation-level education.
α2 = Wage differential of intermediate over matriculation education (for males).
α3 = Wage differential of bachelor-or-higher over matriculation education (for males).
β1 = Differential effect of being female at matriculation-level education.
β2 = Additional effect of being female at intermediate-level education.
β3 = Additional effect of being female at bachelor-or-higher education.
Q.19: Using the regression equation Yi = β + Ui, provide a precise answer to the following questions,
with or without mathematical proofs.
a) Under what assumption is the OLS estimator of β linear?
Answer: None is required: the OLS estimator β̂ = Ȳ is a linear function of the Yi by
construction.
b) Under what assumption is the OLS estimator of β unbiased?
Answer: Under the assumption E (Ui) = 0 for all i, since then E (β̂) = β.
Q.20: Consider the following demand function for rice where Q is per capita consumption of rice in
kilograms, P is price of rice per kilogram and M is per capita income in rupees. The regression
equation has been estimated on the basis of time series data for 9 years. The values in
parentheses are standard errors.
ln Q = 2.46 - 0.45 ln P + 0.65 ln M,  R² = 0.90,  F = 12.00
       (0.82)  (0.20)      (0.50)
1. The test statistic has an inconclusive range, so it may not produce a concrete
conclusion.
2. The test is specifically designed for an AR(1) process, not for higher-order
autoregressive processes, MA processes or others.
3. DW gives biased results when a lagged dependent variable appears on the right-hand
side.
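For reference, the DW statistic itself is easy to compute from residuals; a sketch on simulated AR(1) residuals (ρ = 0.7 is an assumption for the example):

```python
import numpy as np

rng = np.random.default_rng(8)
T = 400
eps = rng.normal(size=T)
e = np.empty(T)
e[0] = eps[0]
for t in range(1, T):
    e[t] = 0.7 * e[t - 1] + eps[t]          # AR(1) residuals with rho = 0.7

# DW = sum((e_t - e_{t-1})^2) / sum(e_t^2), approximately 2*(1 - rho_hat).
dw = float(np.sum(np.diff(e) ** 2) / np.sum(e ** 2))
```

With ρ = 0.7 the statistic comes out near 2(1 - 0.7) = 0.6, far below the no-autocorrelation value of 2.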
Answer: Apply OLS, estimate the equation, and compute the regression residuals êi.
Q.26: Interpret multicollinearity problem as poor information content in data. Consider any
estimation strategy and explain how it can improve the information content.
Q.27: What econometric problems arise in the estimation of an equation with lagged dependent
variable on the right hand side? Suggest solution(s) to these problems.
Q.28: Specify an econometric equation to determine monthly earning in a cross section of 300
economists in Pakistan. Define all the variables in your model and explain how they can be
measured in practice.
Q.29: Determine identification of the following two equations by hybrid equations method and
explain the steps for estimation of each equation by 2SLS method. Consider the following set
of equations.
a) Y = β1 + β2 R + β3 Z + U
Answer: Here Cov (R, U) ≠ 0, so R is the endogenous regressor.
b) M = λ1 + λ2 Y + λ3 R + V
Answer: Here Cov (Y, V) ≠ 0, so Y is the endogenous regressor.
Stage 1: Regress Y on R:
Y = π1 + π2 R + U
Apply OLS and obtain π̂1, π̂2 and the fitted values Ŷ,
where Ŷ is such that it contains only that variation in Y which is explained (determined) by R.
In other words we have
Y = Ŷ + (Y - Ŷ). Y is endogenous, so there is some trouble; Ŷ "de-endogenizes" Y.
Stage 2: Rewrite the main equation as follows:
M = λ1 + λ2 (Ŷ + Y - Ŷ) + λ3 R + V
M = λ1 + λ2 Ŷ + λ3 R + [V + λ2 (Y - Ŷ)]
[error-in-Y term]
Or, with W = V + λ2 (Y - Ŷ):
M = λ1 + λ2 Ŷ + λ3 R + W
Q.30: Consider an econometric equation involving four or more variables. Suppose you have access
to only annual data for 25 years for Pakistan and no other data are available in or outside
Pakistan. Further suppose that there is severe multicollinearity in data that can not be
eliminated by dropping any variable from the equation. How would you handle this situation?
Provide an elaborate answer.
Q.31: The daily demand for strawberries in Islamabad depends on the price of strawberries only. On
each day a fixed quantity of strawberries (which can change from day to day) is brought to the
market and the price is determined at a level that clears the market. If it were known that the
elasticity of demand is constant, would you be able to obtain an unbiased estimator of the elasticity?
Answer:
Qd = α + βP + U  [Demand function]
Qs = Q̄ (fixed)  [Supply function]
With a constant elasticity of demand, take logs:
Log Qd = α + β Log P + U  [Demand function; β is the elasticity]
Log Qs = Log Q̄ (fixed)  [Supply function]
Here we cannot take P out of the expectation because P is not a fixed variable: since the
price adjusts to clear the market, P and U are correlated with each other, and the estimator
of the elasticity is biased.
Q.32: You want to estimate a Cobb-Douglas production function for the manufacturing sector of
Pakistan with capital, labor and energy as the factor inputs, with only 16 time series
observations available. Multicollinearity problem is likely to arise. In order to tackle
this problem one can use 16 observations on the private sector and other 16 on public
sector to make a pooled sample of 32 observations. What complications are likely to
arise due to pooling and how would you respond to these complications?
!~~~~~~~~~~~~~~~~~~~~~~~~~!
E523 Econometrics
Sir Eatzaz Ahmed
Terminal Paper
2.
Are the following statements true, false or uncertain? Explain your answer.
a) The sample mean of the random error terms, Ū = (1/n) Σ Ui, is equal to zero.
b) In the regression equation Y/X = b + U the OLS estimator of b is equal to Ȳ / X̄.
c) If the variable Y is regressed on X and log (X), it may create multicollinearity due
to strong linear relationship between the variables X and log (X).
d) In the equation Xt = α + βYt + γYt-1 + Ut a major limitation of the DW test is that it
produces biased results due to the presence of Yt-1 on the right-hand side of the
equation.
e) Since a dummy variable can take only two values, it must be fixed (exogenous).
f) Instrumental variables are used to test the presence of endogenous variables in the
equation.
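The near-linearity claimed in statement (c) can be checked numerically; over a narrow range of X, X and log(X) are almost perfectly correlated (the range below is an assumption for the example):

```python
import numpy as np

x = np.linspace(50, 60, 100)                # a narrow range of X
r = float(np.corrcoef(x, np.log(x))[0, 1])  # correlation of X and log(X)
# r is essentially 1 here, so including both regressors invites severe
# multicollinearity; over a wide range of X the correlation is lower.
```
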
3. Consider the regression model:
Yt = α + βXt + Ut
Ut = ρUt-2 + εt,
where εt is white noise.
a) Derive autocorrelation coefficients for the lag lengths 0, 1, 2, 3, 4.
b) Explain the Two-Step Iterative method of estimation.
4.
E523 Econometrics
Sir Eatzaz Ahmed
1st Mid Term
1. Suppose you have estimated two alternative cost functions for wheat using data on 500 farms. The
cost (C) is measured in thousands of rupees while output (Q) is measured in tons. The regression
results are given below. The values in parentheses are standard errors.
C/Q = 4568 + 0.2284 Q + 4.84 Q⁻¹
      (1141)  (0.5521)   (4.40)
Can you test the null hypothesis that the marginal cost is an increasing function of output for each
equation? If yes, apply the test and draw your conclusion. If not, explain why the test cannot be
applied and what additional information, if any, is required to perform the test.
Answer:
Log (C) = -10.48 + 1.12 log (Q)
          (2.56)   (0.16)
Null hypothesis:
H0: β = 0
H1: β ≠ 0
t = 1.12 / 0.16 = 7.0
We reject H0.
H0: β = 1
H1: β < 1
We accept H0.
Null hypotheses:
H0: β2 = 0
H1: β2 ≠ 0
We reject H0.
H0: β3 = 0
H1: β3 ≠ 0
We reject H0.
H0: β2 = 1
H1: β2 < 1
We accept H0.
H0: β3 = 1
H1: β3 > 1
We accept H0.
Conclusion:
Tests on the individual coefficients are reported above, but to judge whether each equation
is satisfactory overall we would also need its R², which is not given for each equation. It is
shown from the results that marginal cost is not an increasing function of output for each
equation.
2. Using the regression equation Yi = α + βXi + Ui, provide a precise answer to the following question,
with or without mathematical proofs.
a) Derive the OLS estimator of α (taking the slope β as given) and show that it is unbiased.
Answer:
Yi = α + βXi + Ui
Estimation:
ei = Yi - α̂ - βXi
Min Σei² = Σ(Yi - α̂ - βXi)²
First-order condition:
-2 Σ(Yi - α̂ - βXi) = 0
=> ΣYi - nα̂ - βΣXi = 0
=> ΣYi - βΣXi = nα̂
α̂ = (1/n)(ΣYi - βΣXi)
= (1/n) Σ(Yi - βXi)
= (1/n) Σ(α + βXi + Ui - βXi)
= (1/n) Σ(α + Ui)
= (1/n)(nα) + (1/n) ΣUi
= α + (1/n) ΣUi
= α + Σ ai Ui,  where ai = 1/n.
α̂ is unbiased, i.e. E (α̂) = α.
Proof:
E (α̂) = E [α + Σ ai Ui]
= α + Σ ai E (Ui)  [since ai is fixed]
= α + Σ ai (0)  [since E (Ui) = 0]
E (α̂) = α
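The unbiasedness result can be verified with a quick Monte Carlo experiment (a sketch; the parameter values are illustrative assumptions, and the slope is treated as known as in the derivation above):

```python
import numpy as np

rng = np.random.default_rng(7)
alpha, beta, n, reps = 5.0, 1.5, 30, 20000
X = rng.uniform(0, 10, n)                   # fixed regressors, reused every rep
U = rng.normal(0, 2, size=(reps, n))        # E(U_i) = 0
Y = alpha + beta * X + U                    # one simulated sample per row

# With beta known, alpha_hat = mean of (Y_i - beta*X_i) = alpha + mean(U_i).
alpha_hats = (Y - beta * X).mean(axis=1)
avg = float(alpha_hats.mean())              # averages out very close to alpha
```

Each individual estimate fluctuates around α, but their average over many samples settles on α, which is exactly what E (α̂) = α asserts.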
! ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ !
E523 Econometrics
Sir Eatzaz Ahmed
2nd Mid Term
1.
a) How would you simply define multicollinearity?
b) What type of procedure do you suggest to diagnose or test multicollinearity?
c) How would you deal with multicollinearity?
2.
d) Derive the autocorrelation coefficient function at lag lengths 0, 1, 2, 3, 4.
e) Consider the following model and solve through the iterative two-step procedure.