Documenti di Didattica
Documenti di Professioni
Documenti di Cultura
N12205 (BUSI2053) Introductory Econometrics Lecture 4: The Simple Linear Regression I Page 1
Lecture Outline
An Economic Model
An Econometric Model
The Nature of Regression Analysis
Estimating the Regression Parameters
Assumptions of OLS
Suggested Reading:
• Chapter 2, 3 & 4.1, 4.2: Hill, R.C., Griffiths W.E. and Lim, G.C.
Principles of Econometrics, fourth edition, Wiley, 2012 (pp. 39-110)
• Chapter 1-3: Gujarati, D.N. and Porter D.C. Basic econometrics, 5th
ed., McGraw-Hill, 2009 (pp. 34-117)
• Chapter 1 & 2: Dougherty, Christopher. Introduction to
econometrics. 4th ed. Oxford University Press, 2011 (pp.83-147)
N12205 (BUSI2053) Introductory Econometrics Lecture 4: The Simple Linear Regression I Page 2
Before we move on,
Random variable
Conditional probability
Population parameters (Fixed values)
Sample statistics (Random variables)
Expected value rules
Variance and covariance rules
Estimator and estimate
Properties of an estimator
N12205 (BUSI2053) Introductory Econometrics Lecture 4: The Simple Linear Regression I Page 3
4.1 An Economic Model
As business decision makers we are usually more
interested in studying relationships between variables
– Economic theory tells us that expenditure on economic
goods depends on income
Expendidure f (Income)
– Consequently Expenditure is defined as the ‘‘dependent
variable’’ and Income as the “independent’’ or
‘‘explanatory’’ variable
– The dependent variable is denoted as y and the
independent variable is denoted as x.
y=Expenditure; x=Income
N12205 (BUSI2053) Introductory Econometrics Lecture 4: The Simple Linear Regression I Page 4
4.1 An Economic Model
Suppose that we are interested only in food expenditure of
households with an income of $1,000 per week
Probability distribution of
food expenditure y given
income x = $1000
N12205 (BUSI2053) Introductory Econometrics Lecture 4: The Simple Linear Regression I Page 5
4.1 An Economic Model
In econometrics, we recognise that real-world expenditures
are random variables, and we want to use data to learn
about the relationship
Probability distributions
of food expenditures y
given incomes x = $1000
and x = $2000
N12205 (BUSI2053) Introductory Econometrics Lecture 4: The Simple Linear Regression I Page 6
4.2 An Econometric Model
Expenditure and income of 40 sample households
N12205 (BUSI2053) Introductory Econometrics Lecture 4: The Simple Linear Regression I Page 7
4.2 An Econometric Model
To investigate the relationship between expenditure and
income we must build an economic model and then a
corresponding econometric model
This econometric model is also called a regression model
– The simple regression function is written as
E y | x μ y| x β1 β2 x
N12205 (BUSI2053) Introductory Econometrics Lecture 4: The Simple Linear Regression I Page 8
4.2 An Econometric Model
The econometric model: a linear relationship
E ( y | x) dE ( y | x)
The slope of the regression line β 2 x
dx
N12205 (BUSI2053) Introductory Econometrics Lecture 4: The Simple Linear Regression I Page 10
4.2 An Econometric Model
4.2.1 Introducing the Error Term
Any observation on the dependent variable y can be decomposed
into two parts: a systematic component and a random component
called a “random error term”.
N12205 (BUSI2053) Introductory Econometrics Lecture 4: The Simple Linear Regression I Page 11
4.2 An Econometric Model
4.2.1 Introducing the Error Term
• The random error term is defined as
e y E y | x y β1 β2 x
• Rearranging gives
y β1 β 2 x e
where y is the dependent variable and x is the independent
variable
• The expected value of the error term, given x, is
E (e | x) E ( y | x) β1 β2 x 0
N12205 (BUSI2053) Introductory Econometrics Lecture 4: The Simple Linear Regression I Page 12
4.2 An Econometric Model
4.2.1 Introducing the Error Term
• The random error term is defined as
e y E y | x y β1 β2 x
• Any other economic factors that affect expenditures on food
are “collected” in the error term.
• The error term e is a “storage bin” for unobservable and/or
unimportant factors affecting the dependent variable
• The error term e captures any approximation error that arises
because the linear functional form we have assumed
• The error term captures any elements of random behaviour that
may be present in each individual.
N12205 (BUSI2053) Introductory Econometrics Lecture 4: The Simple Linear Regression I Page 13
4.2 An Econometric Model
4.2.1 Introducing the Error Term
Probability density functions for e and y
N12205 (BUSI2053) Introductory Econometrics Lecture 4: The Simple Linear Regression I Page 14
4.3 The Nature of Regression Analysis
4.3.1 Statistical versus Deterministic Relationships
• Regression analysis concerned with the statistical, not functional or
deterministic, dependence among variables
• In statistical relationships deal with random or stochastic variables,
that is, variables that have probability distributions.
• The dependence of family food expenditure on income, family size,
location, and taste, for example, is statistical in nature in the sense
that the explanatory variables, although certainly important, will not
enable the economist to predict food expenditure exactly because
of errors involved in measuring these variables as well as a host of
other factors (variables) that collectively affect the expenditure but
may be difficult to identify individually.
• Thus, there is bound to be some “intrinsic” or random variability in
the dependent-variable crop yield that cannot be fully explained no
matter how many explanatory variables we consider.
N12205 (BUSI2053) Introductory Econometrics Lecture 4: The Simple Linear Regression I Page 15
4.3 The Nature of Regression Analysis
4.3.2 Regression versus Correlation
• Closely related to but conceptually very much different,
• The primary objective of correlation analysis is to measure the
strength or degree of linear association between two variables.
• Regression analysis tries to estimate or predict the average value of
one variable on the basis of the fixed values of other variables.
• Regression and correlation have some fundamental differences.
• In regression analysis the dependent variable is assumed to be
statistical, random, or stochastic, that is, to have a probability
distribution. The explanatory variables, on the other hand, are assumed
to have fixed values (in repeated sampling),
• In correlation analysis, there is no distinction between the dependent
and explanatory variables and both variables are assumed to be
random.
N12205 (BUSI2053) Introductory Econometrics Lecture 4: The Simple Linear Regression I Page 16
4.3 The Nature of Regression Analysis
4.3.2 Regression versus Causation
• Although regression analysis deals with the dependence of one variable on other
variables, it does not necessarily imply causation.
N12205 (BUSI2053) Introductory Econometrics Lecture 4: The Simple Linear Regression I Page 17
4.4 Estimating the Regression Parameters
4.4.1 Estimates for the Food Expenditure Function
Food Expenditure and Income Data [food (POE 4th ed.) from gretl]
File>>Open data>>Sample file…>>POE 4th ed.>>food
N12205 (BUSI2053) Introductory Econometrics Lecture 4: The Simple Linear Regression I Page 18
4.4 Estimating the Regression Parameters
4.4.1 Estimates for the Food Expenditure Function
Food Expenditure and Income Data [food (POE 4th ed.) from gretl]
N12205 (BUSI2053) Introductory Econometrics Lecture 4: The Simple Linear Regression I Page 19
4.4 Estimating the Regression Parameters
4.4.1 Estimates for the Food Expenditure Function
Model…>>Ordinary least squares…
N12205 (BUSI2053) Introductory Econometrics Lecture 4: The Simple Linear Regression I Page 21
4.4 Estimating the Regression Parameters
4.4.2 Interpreting the Estimates
The value b2 = 10.21 is an estimate of 2,
the amount by which weekly expenditure
on food per household increases when
household weekly income increases by
$100.
Thus, we estimate that if income goes up
by $100, expected weekly expenditure on
food will increase by approximately $10.21
The intercept estimate b1 = 83.42 is an
estimate of the weekly food expenditure
on food for a household with zero income
N12205 (BUSI2053) Introductory Econometrics Lecture 4: The Simple Linear Regression I Page 22
4.4 Estimating the Regression Parameters
4.4.3 Prediction
Suppose that we wanted to predict weekly food
expenditure for a household with a weekly income of
$2000. This prediction is carried out by substituting x = 20
into our estimated equation to obtain:
N12205 (BUSI2053) Introductory Econometrics Lecture 4: The Simple Linear Regression I Page 23
4.4 Estimating the Regression Parameters
4.4.4 Least Square Principles
We want to estimate
population parameters β1
and β2 from
y β1 β 2 x e
N12205 (BUSI2053) Introductory Econometrics Lecture 4: The Simple Linear Regression I Page 24
4.4 Estimating the Regression Parameters
4.4.4 Least Square Principles
Least squares estimates for the
unknown parameters β1 and β2
are obtained my minimizing the
sum of squares function:
b1 y b2 x
N12205 (BUSI2053) Introductory Econometrics Lecture 4: The Simple Linear Regression I Page 25
4.4 Estimating the Regression Parameters
4.4.4 Least Square Principles
1 2 3 4 3*4 5
xi yi xi x yi y xi x 2 yˆ b1 b2 xi ê
2 4 -2 -1 2 4
3 3 -1 -2 2 1
5 4 1 -1 -1 1
6 7 2 2 4 4
5 4 1 -1 -1 1
4 7 0 2 0 0
4 6 0 1 0 0
3 5 -1 0 0 1
x =4 y =5 ∑=6 ∑=12
å(xi - x )(yi - y )
b2 = = b1 y b2 x
å(xi - x ) 2
N12205 (BUSI2053) Introductory Econometrics Lecture 4: The Simple Linear Regression I Page 26
4.5 Econometric Method
Assumptions of the Simple Linear Regression
SR1:The value of y, for each value of x, is is given by the linear
regression: y β1 β 2 x e
SR5: The variable x is not random, and must take at least two
different values
N12205 (BUSI2053) Introductory Econometrics Lecture 4: The Simple Linear Regression I Page 28