

Two Variable Regression: The

Problem of Estimation

Muhammad Arsalan Hashmi

College of Management Sciences
The Method of Ordinary Least Squares (OLS)
 The method of OLS was developed by Carl Friedrich Gauss.
 The estimators obtained from this method have
very desirable statistical properties.
 The OLS method rests on the least squares
principle, which is as follows:
Suppose we have the following SRF to estimate the PRF:
Yi = β^1 + β^2Xi + u^i
where the residual u^i = Yi - Y^i is the estimate of ui.
The Method of OLS (Continued)
 Diagrammatically, the residuals u^i (the estimates of ui) are the
vertical distances between the actual Yi and the fitted values Y^i
on the sample regression line.
The Method of OLS (Continued)
 An intuitively appealing criterion is to
minimise the sum of residuals, i.e. Σu^i.
 However, this criterion has several
limitations. For example, all the residuals get equal
weight despite their size, and large positive and
negative residuals can cancel each other out.
 To avoid these problems, we minimise the
sum of squared residuals, Σu^i2, where
Σu^i2 = Σ(Yi - β^1 - β^2Xi)2
The Method of OLS (Continued)

 To compute the least squares estimators β^1 and
β^2, we begin by partially differentiating the
preceding expression with respect to β^1 and β^2
and setting the results equal to zero. This yields
ΣYi = nβ^1 + β^2ΣXi
ΣXiYi = β^1ΣXi + β^2ΣXi2
 These simultaneous equations are called the
normal equations.
The Method of OLS (Continued)
 Solving the normal equations simultaneously
gives us the estimators of β1 and β2 for the
simple regression model:
β^2 = Σxiyi / Σxi2
β^1 = Ȳ - β^2X̄
where xi = Xi - X̄ and yi = Yi - Ȳ denote deviations
from the sample means.
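As a concrete illustration, the deviation-form estimators can be computed directly; the Python sketch below is not from the text, and all names and data in it are my own:

```python
# Deviation-form OLS estimators for the simple regression model.
def ols_estimates(X, Y):
    n = len(X)
    x_bar = sum(X) / n
    y_bar = sum(Y) / n
    # beta2_hat = sum(x_i * y_i) / sum(x_i^2), with x_i, y_i as deviations from means
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(X, Y))
    sxx = sum((xi - x_bar) ** 2 for xi in X)
    beta2_hat = sxy / sxx
    # beta1_hat = Y_bar - beta2_hat * X_bar
    beta1_hat = y_bar - beta2_hat * x_bar
    return beta1_hat, beta2_hat

# Usage: data lying exactly on the line Y = 2 + 3X recover the coefficients
X = [1, 2, 3, 4, 5]
Y = [5, 8, 11, 14, 17]
b1, b2 = ols_estimates(X, Y)
# b1 = 2.0, b2 = 3.0
```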
Assumptions Underlying the
Method of OLS

 If our objective is to perform statistical inference in
addition to estimation, we must make some
assumptions about how Xi (the explanatory
variable) and ui (the disturbance term) are generated.
 The classical linear regression model is based on
ten assumptions:
1. Linear regression model: The regression model is
linear in the parameters
Yi = β1 + β2Xi + ui
Assumptions Underlying the
Method of OLS (Continued)
2. X values are fixed in repeated sampling: That is,
X is assumed to be non-stochastic.
3. Zero mean of the disturbance term, ui: That is,
the conditional mean of ui is zero.
E(ui|Xi) = 0
4. Homoscedasticity or equal variance of ui: That
is, the conditional variances of ui are identical:
Var(ui|Xi) = E[ui - E(ui|Xi)]2
= E(ui2|Xi)
= σ2
Assumptions Underlying the
Method of OLS (Continued)

 Diagrammatic depiction of homoscedasticity

Assumptions Underlying the
Method of OLS (Continued)
5. No autocorrelation between the disturbances:
Given any two X values, Xi and Xj (where i≠j), the
correlation between ui and uj is zero:
Cov(ui, uj|Xi, Xj) = 0
Assumptions Underlying the
Method of OLS (Continued)
6. Zero covariance between ui and Xi:
Cov(ui, Xi) = E(uiXi) = 0

7. The number of observations n must be greater than the

number of parameters to be estimated.
8. Variability in X values: The X values in a given sample must not
all be the same.
9. The regression model is correctly specified
10. There is no perfect multicollinearity: That is, there is no
perfect linear relationships among the explanatory variables.
Variances & Standard Errors of
Least Square Estimates
 As the least square estimates of parameters
change from sample to sample, we need
some measure of reliability (or precision) of
the estimators β^1 and β^2.
 The precision of an estimate is measured by
its standard error.
 The standard error is merely the standard
deviation of the sampling distribution of the estimator.
Variances & Standard Errors of Least
Square Estimates (Continued)

 For a bivariate regression model, it can be
shown that the variances and standard errors
of β^1 and β^2 are as follows:
Var(β^2) = σ2 / Σxi2 and se(β^2) = σ / √(Σxi2)
Var(β^1) = (ΣXi2 / nΣxi2)·σ2 and se(β^1) = √(ΣXi2 / nΣxi2)·σ

(See Appendix 3A for details)

Variances & Standard Errors of Least
Square Estimates (Continued)

 Moreover, the unbiased estimator of σ2 (the conditional
variance of ui) is
σ^2 = Σu^i2 / (n - 2)
 Its positive square root, σ^, is known as the standard
error of estimate (the standard deviation of the Y values
about the regression line).
NB: The (n-2) term denotes the degrees of freedom (df).
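The estimator σ^2 and the standard-error formulas can be sketched as follows (an illustration with made-up data; function and variable names are my own, not from the text):

```python
import math

# Residual variance and standard errors for the bivariate OLS fit.
def ols_standard_errors(X, Y):
    n = len(X)
    x_bar, y_bar = sum(X) / n, sum(Y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in X)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(X, Y))
    b2 = sxy / sxx
    b1 = y_bar - b2 * x_bar
    # sigma^2_hat = sum of squared residuals / (n - 2) degrees of freedom
    rss = sum((yi - b1 - b2 * xi) ** 2 for xi, yi in zip(X, Y))
    sigma2_hat = rss / (n - 2)
    # Var(b2) = sigma^2 / sum(x_i^2);  Var(b1) = (sum X_i^2 / (n sum x_i^2)) * sigma^2
    se_b2 = math.sqrt(sigma2_hat / sxx)
    se_b1 = math.sqrt(sigma2_hat * sum(xi ** 2 for xi in X) / (n * sxx))
    return sigma2_hat, se_b1, se_b2

# Usage with illustrative (made-up) data
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]
s2, se1, se2 = ols_standard_errors(X, Y)
# s2 = 0.8 (RSS = 2.4 divided by n - 2 = 3 df)
```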

Variances & Standard Errors of Least
Square Estimates (Continued)

Prominent features of the variances (and standard
errors) of the estimators β^1 and β^2:
 Given σ2, the larger the variation in the X values,
the smaller the variance of β^2. Also, given Σxi2,
the larger the variance σ2, the larger the
variance of β^2.
 The variance of β^1 is directly proportional to
σ2 and ΣXi2, but inversely proportional to n and Σxi2.
Variances & Standard Errors of Least
Square Estimates (Continued)

 The dependence of β^1 and β^2 on each other
is measured by their covariance:
Cov(β^1, β^2) = -X̄·Var(β^2)
Gauss-Markov Theorem
 If the assumptions of the CLRM are satisfied, the OLS
estimators are BLUE (Best Linear Unbiased
Estimators). For example, if we have the regression
Yi = β1 + β2Xi + ui
and the assumptions of the CLRM are valid, then the
estimator β^2 is BLUE.
NB: See Appendix 3A for a proof of the Gauss-Markov theorem.
Gauss-Markov Theorem (Continued)
 We have been considering the finite (small)
sample properties of estimators.
 It is important to distinguish between finite-
sample properties and asymptotic properties
of estimators.
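Unbiasedness, a finite-sample property, can be illustrated by simulation; the sketch below (my own, not from the text) assumes a true model Yi = 1 + 2Xi + ui with normal disturbances and X values fixed in repeated sampling, then averages the OLS slope over many samples:

```python
import random

# Monte Carlo check of unbiasedness: the average of beta2_hat across
# repeated samples should be close to the true slope beta2 = 2.0.
random.seed(42)
X = [float(x) for x in range(1, 11)]  # fixed in repeated sampling (Assumption 2)
x_bar = sum(X) / len(X)
sxx = sum((xi - x_bar) ** 2 for xi in X)

estimates = []
for _ in range(5000):
    # assumed true model: Y = 1.0 + 2.0*X + u, with u ~ N(0, 1)
    Y = [1.0 + 2.0 * xi + random.gauss(0.0, 1.0) for xi in X]
    y_bar = sum(Y) / len(Y)
    b2 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(X, Y)) / sxx
    estimates.append(b2)

avg = sum(estimates) / len(estimates)
# avg should be close to the true slope 2.0
```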
Coefficient of Determination: r2
 The coefficient of determination r2 is a
summary measure that tells how well the
sample regression line fits the data.
 We compute r2 as follows. Writing the model in
deviation (from mean) form,
yi = y^i + u^i
 Squaring both sides and summing over the sample gives
Σyi2 = Σy^i2 + Σu^i2
(the cross-product term 2Σy^iu^i vanishes)

Coefficient of Determination: r2
 The former expression can be written as
TSS = ESS + RSS
where TSS = Σyi2 is the total sum of squares,
ESS = Σy^i2 is the explained sum of squares, and
RSS = Σu^i2 is the residual sum of squares.
That is, the total variation in the observed Y
values about their mean can be decomposed
into two parts, one attributable to the
regression line and the other to random forces.
Coefficient of Determination: r2
 As before, TSS = ESS + RSS.
Dividing both sides by TSS gives
1 = ESS/TSS + RSS/TSS
where r2 = ESS/TSS = Σy^i2 / Σyi2
Coefficient of Determination: r2
 Alternatively, r2 = 1 - RSS/TSS = 1 - Σu^i2/Σyi2
 Essentially, r2 measures the proportion or
percentage of the total variation in Y explained by
the regression model.
 Two properties of r2
a. It is a non-negative quantity
b. r2 lies between 0 and 1, i.e. 0≤ r2 ≤1
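The equivalence of the two formulas, r2 = ESS/TSS = 1 - RSS/TSS, can be checked numerically; this Python sketch (my own illustration, with made-up data) fits the bivariate model and computes both:

```python
# Computing r^2 as ESS/TSS and verifying it equals 1 - RSS/TSS.
def r_squared(X, Y):
    n = len(X)
    x_bar, y_bar = sum(X) / n, sum(Y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in X)
    b2 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(X, Y)) / sxx
    b1 = y_bar - b2 * x_bar
    fitted = [b1 + b2 * xi for xi in X]
    tss = sum((yi - y_bar) ** 2 for yi in Y)                 # total sum of squares
    ess = sum((fi - y_bar) ** 2 for fi in fitted)            # explained sum of squares
    rss = sum((yi - fi) ** 2 for yi, fi in zip(Y, fitted))   # residual sum of squares
    assert abs(ess / tss - (1 - rss / tss)) < 1e-9           # the two formulas agree
    return ess / tss

# Usage: for this made-up sample, 60% of the variation in Y is explained
r2 = r_squared([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
# r2 = 0.6
```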
Sample Coefficient of Correlation
 A measure of linear association between two
variables is the coefficient of correlation.
 Do not confuse the coefficient of correlation
with the coefficient of determination.
 One may compute the sample correlation
coefficient, r, in the following ways:
r = ±√r2 (taking the sign of β^2)
or
r = Σxiyi / √(Σxi2 · Σyi2)
Properties of the Sample Coefficient
of Correlation
 It can be positive or negative.
 r lies between -1 and +1, i.e. -1≤ r ≤1
 It is symmetrical, i.e. r(X, Y) = r(Y, X).
 If two variables X and Y are independent, r=0.
However, r=0 does not necessarily imply that the
variables are independent.
 It is a measure of linear association only.
 Although a measure of linear association between
two variables, it does not imply any cause-and-effect
relationship.
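The symmetry and bounds of r can be checked numerically; the sketch below (my own illustration, with made-up data) uses the deviation-form formula for r:

```python
import math

# Sample correlation coefficient r = sum(x_i*y_i) / sqrt(sum x_i^2 * sum y_i^2),
# where x_i and y_i are deviations from the sample means.
def sample_corr(X, Y):
    n = len(X)
    x_bar, y_bar = sum(X) / n, sum(Y) / n
    x = [xi - x_bar for xi in X]
    y = [yi - y_bar for yi in Y]
    sxy = sum(a * b for a, b in zip(x, y))
    return sxy / math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))

X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]
r = sample_corr(X, Y)
# Symmetry: sample_corr(X, Y) == sample_corr(Y, X), and -1 <= r <= 1;
# note also r**2 equals the coefficient of determination r2.
```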
So far
 We have studied the estimation aspect of the
simple regression model in detail.
 The other major component of classical
statistical inference is hypothesis testing.
 Before delving into hypothesis testing, we
need to consider the probability distributions
of β^1, β^2 and other related issues.
 Gujarati, D. (2003), Basic Econometrics, 4th
Edition, McGraw-Hill, Chapter 3.