
# Page 1 of 11

## Abdul Qadeer Khan (PhD Scholar & Research Associate)

D-Block, 2nd Floor, Room # 218
Office: 051-4486701 Ext: 212
Cell: 0333-6487274

Section 1, 2 & 3
Lecture # 4
July 3-5, 2012

## The assumptions of the Classical Linear Regression Model (CLRM) are as follows:

i) Linearity
ii) Xt is variable, not fixed
iii) Xt is non-stochastic and fixed in repeated samples
iv) The expected value of the disturbance term is zero
v) Homoskedasticity
vi) No autocorrelation
vii) No multicollinearity
viii) Normality of residuals
ix) N ≥ 2

## Now we discuss in detail:

Linearity: OLS (Ordinary Least Squares) is applied only when the relationship in the data is linear. The
linearity assumption requires that the slope be constant, and the slope is constant only when the data are
linear. Linearity is confirmed through a functional specification test.

## One Common Misunderstanding:

Dear Students & Fellows, the term “linear” refers to the fact that the population parameters “a” and “b”
appear linearly in the equation, not to the fact that Xt (the independent or explanatory variable)
appears linearly. Thus, the model Yt = a + bXt^2 + ut is still called a simple linear
regression even though the term “X” appears as a quadratic.
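The point about linearity in the parameters can be illustrated with a minimal Python sketch (not part of the EViews workflow in these notes). The data are hypothetical and generated exactly from Y = 2 + 3X^2, and the helper name `ols_simple` is an assumption made for illustration. Regressing Y on the transformed variable Z = X^2 with ordinary simple-regression formulas recovers a and b.

```python
# Closed-form OLS for a simple regression y = a + b*x.
def ols_simple(x, y):
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx              # slope
    a = mean_y - b * mean_x    # intercept
    return a, b

# Hypothetical data: Y depends on X only through X squared (no error term).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0 + 3.0 * xi ** 2 for xi in x]

# The model is linear in a and b, so OLS applies once X is replaced by Z = X^2.
z = [xi ** 2 for xi in x]
a_hat, b_hat = ols_simple(z, y)
print(a_hat, b_hat)
```

Regressing Y directly on X instead of Z would fit a straight line to curved data, which is the misspecification the linearity assumption warns about.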

## An example of “NON-LINEAR REGRESSION MODEL” is:

Yt = a + Xt^b + ut

Violations of linearity are extremely serious. If you fit a linear model to data which are nonlinearly
related, your predictions will be seriously in error.

Xt is variable, not fixed: Xt must take more than one value. If the explanatory variable were constant
across observations, its effect on Yt could not be estimated.

Xt is non-stochastic (not random; it must be deterministic) and fixed in repeated samples.

## Linear Regression Equation:

Yt = a + bXt + ut

Where Yt is the dependent variable whose behavior the researcher wants to explain. Other names for the
dependent variable are regressand and left-hand-side variable. The investigator then identifies the
variables, denoted by X, that influence the dependent variable. X is generally called the
independent variable. Other names are exogenous variable, explanatory variable, regressor and
right-hand-side variable. The choice of independent variables may come from economic theory, past
experience, other studies or intuitive judgment.

Here a is the intercept (constant) and b is the slope of the straight line, also called the regression
coefficient; it tells us the marginal effect of Xt on Yt.

ut is an unobserved random variable called the error term. Other names are disturbance term and
stochastic term.

The term a + bXt is the deterministic part of the model, and ut is the stochastic part.

The third assumption: Xt is not stochastic; it must be deterministic. The second part of the
assumption says that Xt is fixed in repeated samples. Here “different samples” means that if you collect
the value of the X variable for December 31, 2011, whether from daily, monthly, quarterly or yearly data,
the value will be the same.

Expected disturbance term is zero: The fourth assumption of the CLRM is that the expected value of the
disturbance term is zero, E(ut) = 0. Individual errors may be positive or negative, but on average they
cancel out.
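A small Python sketch can show the sample counterpart of this idea. It is only an analogue: the assumption concerns the unobserved disturbance ut, whereas the code works with fitted residuals, which sum to zero by construction when an intercept is included. The data and the helper name `ols_simple` are hypothetical.

```python
# Closed-form OLS for y = a + b*x.
def ols_simple(x, y):
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx
    return mean_y - b * mean_x, b

# Hypothetical observations scattered around a straight line.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

a_hat, b_hat = ols_simple(x, y)

# Residual = observed value minus the deterministic part a + b*x.
residuals = [yi - (a_hat + b_hat * xi) for xi, yi in zip(x, y)]

print(residuals)       # a mix of positive and negative values
print(sum(residuals))  # zero up to floating-point rounding
```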

Homoskedasticity: The term homoskedasticity means the same spread around the regression line, that is, the
variance of the error term must be constant. If the spread around the regression line changes across
observations, heteroscedasticity exists in the data.
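As an informal illustration of "same spread" versus "changing spread", the Python sketch below compares the variance of two hypothetical error series over the first and second halves of the sample. The series, the helper names and the half-sample split are all assumptions made for illustration; this is not one of the formal tests discussed in these notes.

```python
def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def spread_ratio(errors):
    """Variance of the second half of the sample divided by that of the first half."""
    half = len(errors) // 2
    return variance(errors[half:]) / variance(errors[:half])

x = list(range(1, 11))
signs = [1, -1] * 5  # alternate positive and negative errors

# Constant spread: every error has the same magnitude.
u_homo = [0.5 * s for s in signs]

# Growing spread: the magnitude of the error rises with x.
u_hetero = [0.1 * xi * s for xi, s in zip(x, signs)]

print(spread_ratio(u_homo))    # close to 1: same spread in both halves
print(spread_ratio(u_hetero))  # well above 1: spread grows with x
```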


Effects of Heteroscedasticity:

i) β (unbiased and consistent, but no longer efficient): Whether heteroscedasticity exists or not,
the OLS estimate of beta remains unbiased and consistent. In other words, the slope of the
regression line is not systematically affected. However, OLS no longer has the smallest
variance among linear unbiased estimators.
ii) Standard error (incorrect; may affect hypothesis testing): The standard error applies to the
sample and the standard deviation to the population. Under heteroscedasticity, the usual
standard error may be too large or too small, depending on how the spread of the data varies.
iii) t-statistic (unreliable; may affect hypothesis testing): Because heteroscedasticity distorts
the standard error, it also distorts the t-statistic, which is calculated as:
t = β / S.E.

## More precisely, in the simplest words:

If the standard error increases, the t-statistic decreases (the denominator effect).
Hypothesis testing based on this t-statistic will then be incorrect, so a variable that is
truly significant may appear insignificant.

If the standard error decreases, the t-statistic increases (the denominator effect).
Hypothesis testing based on this t-statistic will then be incorrect, so a variable that is
truly insignificant may appear significant.

iv) F-statistic: Under heteroscedasticity the OLS regression line (also called the CLRM line or
linear line) is no longer the best fit, so decisions based on the F-statistic will be unreliable.
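The denominator effect described under the t-statistic can be seen in a few lines of Python. The coefficient value, the three standard errors and the |t| > 2 rule of thumb for significance at roughly the 5% level are all assumptions chosen for illustration, not values from this lecture.

```python
def t_statistic(beta, std_error):
    """t = beta / S.E., as in the formula above."""
    return beta / std_error

beta = 0.50          # hypothetical slope estimate, held fixed
RULE_OF_THUMB = 2.0  # assumed rough cut-off for significance

results = []
for se in (0.10, 0.20, 0.50):
    t = t_statistic(beta, se)
    results.append((se, t, abs(t) > RULE_OF_THUMB))

for se, t, significant in results:
    print(f"S.E. = {se:.2f}  t = {t:.2f}  significant: {significant}")
```

With the same β, the variable moves from significant to insignificant purely because the standard error grows, which is why a distorted standard error leads to wrong test decisions.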

Autocorrelation: The CLRM assumes that the error terms are not correlated over time. If today's prices
are predicted by their lagged (past) prices, autocorrelation exists. In other words, if a series is
predicted by its own lagged series, autocorrelation exists.

## Multicollinearity: According to this assumption, there is no relationship between the explanatory variables.

Multicollinearity is not a problem originating from the specification of the model or from the
estimation of the specified model; it originates from the nature of the data. It exists when one or
more explanatory (independent) variables affect other explanatory variables. In practice, one can
minimize multicollinearity but cannot eliminate it.
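One common way to gauge how strongly two explanatory variables move together is their correlation coefficient and the variance inflation factor (VIF). The Python sketch below uses hypothetical data in which x3 is almost exactly twice x2. For the two-regressor case the VIF equals 1 / (1 - r^2). The VIF > 10 threshold mentioned in the comment is a widely used rule of thumb, not something stated in these notes.

```python
import math

def correlation(u, v):
    """Pearson correlation coefficient between two series."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    sd_u = math.sqrt(sum((a - mean_u) ** 2 for a in u))
    sd_v = math.sqrt(sum((b - mean_v) ** 2 for b in v))
    return cov / (sd_u * sd_v)

# Hypothetical regressors: x3 is roughly 2 * x2, so they carry nearly the same information.
x2 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x3 = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2]

r = correlation(x2, x3)
vif = 1.0 / (1.0 - r ** 2)  # two-regressor case only

print(round(r, 4), round(vif, 1))  # a VIF above 10 is often read as a warning sign
```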

## N ≥ 2: The regression equation is:

Yt = a + bXt + ut

The equation has two unknown parameters, a and b, so at least two observations are needed to estimate them.

## The issues above will be addressed by answering the following questions.

1. What is heteroscedasticity?
2. How is it detected?
3. What problems are associated with it?
4. How is it removed?
5. What are the different methods for dealing with heteroscedasticity when it is present?

Detection of Heteroscedasticity:

1. Breusch-Pagan Test
2. Harvey Test
3. Glejser Test
4. Autoregressive Conditional Heteroscedasticity (ARCH) Test
5. Park Test
6. White Test
7. Goldfeld-Quandt Test

i. Generate the variables, paste the data and save it.

ii. Regress these variables: go to the Quick menu, choose Estimate Equation and write the equation
with x1 as the dependent variable and x2, x3 and x4 as the independent variables: x1 c x2 x3 x4.
Click OK and the results will be displayed. Do nothing with them; just generate the error term by
writing: genr ut=resid

iii. We apply the first heteroscedasticity test, the Breusch-Pagan test. This test uses the square
of the error term (not its square root). To meet this requirement, we generate: genr
utsq=ut^2

iv. Go to the Quick menu again and choose Estimate Equation. Now utsq becomes the dependent variable,
and the equation is: utsq c x2 x3 x4. From the results we pick the value of R-squared, which in
this case is 0.041560. We use it to compute the calculated value with the formula:
LM = n*R2
= 39*0.041560
= 1.62084 (this is called the calculated value)

v. For the final decision about heteroscedasticity, we need the critical (tabulated) value. We
generate the chi-square value as: genr chi=@qchisq(0.95,3). Here 0.95 is the confidence level and
3 is the degrees of freedom, because we have 3 independent variables, namely x2, x3 and x4. After
generating it, a series named chi will appear. Open it and a series with a single constant value
will appear. This value is called the tabulated value.
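For readers who want to check the tabulated value outside EViews, the Python sketch below reproduces the same quantile using only the standard library. It relies on the closed-form CDF of the chi-square distribution with 3 degrees of freedom and inverts it by bisection. The function names are hypothetical, and the approach is specific to 3 degrees of freedom; it is not a general replacement for @qchisq.

```python
import math

def chi2_cdf_3df(x):
    """CDF of the chi-square distribution with 3 degrees of freedom."""
    if x <= 0:
        return 0.0
    return (math.erf(math.sqrt(x / 2.0))
            - math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0))

def chi2_quantile_3df(p, lo=0.0, hi=100.0, tol=1e-10):
    """Invert the CDF by bisection; the CDF is increasing in x."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if chi2_cdf_3df(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# Same inputs as genr chi=@qchisq(0.95,3): confidence level 0.95, 3 degrees of freedom.
critical_value = chi2_quantile_3df(0.95)
print(critical_value)
```

The result should land very close to the tabulated value of 7.81472 used in the decision step.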

Decision Criteria:

Tabulated Value = 7.81472

o If calculated value > tabulated (critical) value, there is heteroscedasticity (in other words, the
explanatory variables have a significant relationship with the transformed error term).

o If calculated value < tabulated (critical) value, there is homoskedasticity (in other words, the
relationship is insignificant).

On the basis of the above decision criteria, the calculated value (1.62084) is less than the tabulated
value (7.81472), so we conclude that there is homoskedasticity, or an insignificant relationship.
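The arithmetic and the decision rule can be written out as a short Python check. The numbers are the ones reported in this lecture (n = 39, R-squared = 0.041560, tabulated value 7.81472); the variable names are chosen for illustration.

```python
n = 39                  # number of observations
r_squared = 0.041560    # R-squared of the regression of utsq on x2, x3, x4
tabulated = 7.81472     # chi-square critical value at 0.95 with 3 degrees of freedom

# Calculated value: LM = n * R^2
lm = n * r_squared

# Decision rule: a calculated value above the tabulated value signals heteroscedasticity.
if lm > tabulated:
    decision = "heteroscedasticity"
else:
    decision = "homoskedasticity"

print(round(lm, 5), decision)
```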

## Shortcut steps:

i. Generate the variables.
ii. Go to Quick, choose Estimate Equation and enter x1 c x2 x3 x4.
iii. In the equation window, go to View, click Residual Diagnostics and then Heteroscedasticity
Tests.

iv. Select the Breusch-Pagan test and click OK. The results are the same as with the previous
method. Check Prob. Chi-Square(3): if this probability is insignificant, there is
homoskedasticity; if it is significant, heteroscedasticity exists.

Other Tests, i.e. Harvey Test, Glejser Test and White Test:

All steps are the same as in the shortcut method above; just change the test type and click OK. If the
chi-square probability is higher than 5% (0.05), there is homoskedasticity; otherwise there is
heteroscedasticity. This is the easiest way to detect it.

If you want to generate the variables manually behind the scenes, the steps for the Glejser test are
given below:

i. Generate the variables
ii. Regress those variables as x1 c x2 x3 x4
iii. genr ut=resid
iv. For the Glejser test, do not square the error term; use its absolute value instead
v. Create: genr absut=abs(ut)
vi. Estimate: absut c x2 x3 x4
vii. Pick the value of R-squared
viii. Apply the formula: LM = n*R2 (called the calculated value)
ix. Generate chi-square as: genr chi=@qchisq(0.95,3) (called the tabulated or critical value)
x. Compare and take the decision

## For the Harvey-Godfrey test, please follow these steps:

i. Generate the variables
ii. Regress those variables as x1 c x2 x3 x4
iii. genr ut=resid
iv. As in the Breusch-Pagan test, generate the squared error term as: genr utsq=ut^2
v. genr Lutsq=log(utsq)
vi. Go to Quick, Estimate Equation
vii. Write: Lutsq c x2 x3 x4
viii. Pick the value of R-squared
ix. Apply the formula: LM = n*R2 (called the calculated value)
x. Generate chi-square as: genr chi=@qchisq(0.95,3) (called the tabulated or critical value)
xi. Compare and take the decision
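The manual Breusch-Pagan, Glejser and Harvey-Godfrey procedures differ only in how the residual is transformed before the second regression (ut^2, abs(ut) and log(ut^2) respectively). The Python sketch below mirrors those steps with the standard library only. The data are hypothetical, the function names `solve`, `ols` and `lm_test` are assumptions made for illustration, and the code is a rough counterpart of the EViews steps rather than a reproduction of EViews output.

```python
import math

def solve(a, b):
    """Solve the linear system a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def ols(y, regressors):
    """OLS with a constant; returns (coefficients, residuals, R-squared)."""
    rows = [[1.0] + list(r) for r in zip(*regressors)]
    k = len(rows[0])
    xtx = [[sum(row[i] * row[j] for row in rows) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    resid = [yi - sum(b * v for b, v in zip(beta, row)) for row, yi in zip(rows, y)]
    mean_y = sum(y) / len(y)
    sst = sum((yi - mean_y) ** 2 for yi in y)
    return beta, resid, 1.0 - sum(e ** 2 for e in resid) / sst

# Transformation of the residual ut used by each test.
TRANSFORMS = {
    "breusch-pagan": lambda u: u ** 2,       # genr utsq=ut^2
    "glejser": abs,                          # genr absut=abs(ut)
    "harvey": lambda u: math.log(u ** 2),    # genr Lutsq=log(utsq)
}

def lm_test(y, regressors, kind):
    """Return the calculated value LM = n * R^2 of the auxiliary regression."""
    _, ut, _ = ols(y, regressors)                    # step: x1 c x2 x3 x4, genr ut=resid
    aux_y = [TRANSFORMS[kind](u) for u in ut]        # step: transform the residual
    _, _, r_squared = ols(aux_y, regressors)         # step: regress it on c x2 x3 x4
    return len(y) * r_squared

# Hypothetical data with 12 observations and three regressors.
x2 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0]
x3 = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0, 5.0, 3.0, 5.0, 8.0]
x4 = [2.0, 7.0, 1.0, 8.0, 2.0, 8.0, 1.0, 8.0, 2.0, 8.0, 4.0, 5.0]
noise = [0.3, -0.5, 0.2, 0.7, -0.4, 0.1, -0.6, 0.5, -0.2, 0.4, -0.3, -0.2]
x1 = [1.0 + 0.5 * a + 0.3 * b - 0.2 * c + e for a, b, c, e in zip(x2, x3, x4, noise)]

TABULATED = 7.81472  # critical value from the lecture (0.95, 3 degrees of freedom)
for kind in TRANSFORMS:
    lm = lm_test(x1, [x2, x3, x4], kind)
    verdict = "heteroscedasticity" if lm > TABULATED else "homoskedasticity"
    print(kind, round(lm, 4), verdict)
```

Keeping the transformation in a lookup table makes the shared structure of the three tests explicit: only one line changes between them, just as only one genr command changes in the EViews steps.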

“Knowledge is power. Information is power. The secreting or hoarding of knowledge or information may
be an act of tyranny camouflaged as humility.” (Robin Morgan)

Note: Please let me know if you find any mistakes. Comments for improvement will be highly appreciated. Thanks.