
Chapter 3

A brief overview of the classical linear regression model

© Chris Brooks 2013, ‘Introductory Econometrics for Finance’

Regression

• Regression is probably the single most important tool at the econometrician’s disposal. But what is regression analysis?
• It is concerned with describing and evaluating the relationship between a given variable (usually called the dependent variable) and one or more other variables (usually known as the independent variable(s)).

Some Notation

• Denote the dependent variable by $y$ and the independent variable(s) by $x_1, x_2, \ldots, x_k$, where there are $k$ independent variables.
• Some alternative names for the y and x variables:

  y                     x
  dependent variable    independent variables
  regressand            regressors
  effect variable       causal variables
  explained variable    explanatory variables

• Note that there can be many x variables, but we will limit ourselves to the case where there is only one x variable to start with. In our set-up, there is only one y variable.

Regression is different from Correlation

• If we say y and x are correlated, it means that we are treating y and x in a completely symmetrical way.
• In regression, we treat the dependent variable (y) and the independent variable(s) (x’s) very differently. The y variable is assumed to be random or “stochastic” in some way, i.e. to have a probability distribution. The x variables are, however, assumed to have fixed (“non-stochastic”) values in repeated samples.

Simple Regression

• For simplicity, say k = 1. This is the situation where y depends on only one x variable.
• Examples of the kind of relationship that may be of interest include:
– How asset returns vary with their level of market risk
– Measuring the long-term relationship between stock prices and dividends
– Constructing an optimal hedge ratio

Simple Regression: An Example

• Suppose that we have the following data on the excess returns on a fund manager’s portfolio (“fund XXX”) together with the excess returns on a market index:

  Year, t   Excess return on fund XXX   Excess return on market index
            $= r_{XXX,t} - r_{ft}$      $= r_{mt} - r_{ft}$
  1         17.8                        13.7
  2         39.0                        23.2
  3         12.8                        6.9
  4         24.2                        16.8
  5         17.2                        12.3

• We have some intuition that the beta on this fund is positive, and we therefore want to find whether there appears to be a relationship between x and y given the data that we have. The first stage would be to form a scatter plot of the two variables.
Graph (Scatter Diagram)
[Figure: scatter diagram of the excess return on fund XXX (vertical axis) against the excess return on the market portfolio (horizontal axis).]

Finding a Line of Best Fit

• We can use the general equation for a straight line,
$$y = a + bx$$
to get the line that best “fits” the data.
• However, this equation ($y = a + bx$) is completely deterministic.
• Is this realistic? No. So what we do is to add a random disturbance term, $u$, into the equation:
$$y_t = \alpha + \beta x_t + u_t$$
where $t = 1, 2, 3, 4, 5$.

Why do we include a Disturbance term?

• The disturbance term can capture a number of features:
– We always leave out some determinants of $y_t$.
– There may be errors in the measurement of $y_t$ that cannot be modelled.
– There may be random outside influences on $y_t$ which we cannot model.

Determining the Regression Coefficients
• So how do we determine what α and β are?
• Choose α and β so that the (vertical) distances from the data points to the fitted line are minimised (so that the line fits the data as closely as possible).
[Figure: data points in the (x, y) plane with a fitted line.]
Ordinary Least Squares

• The most common method used to fit a line to the data is known as OLS (ordinary least squares).
• What we actually do is take each distance and square it (i.e. take the area of each of the squares in the diagram) and minimise the total sum of the squares (hence least squares).
• Tightening up the notation, let
  $y_t$ denote the actual data point $t$,
  $\hat{y}_t$ denote the fitted value from the regression line,
  $\hat{u}_t$ denote the residual, $y_t - \hat{y}_t$.

Actual and Fitted Value
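[Figure: the actual value $y_t$, the fitted value $\hat{y}_t$ and the residual $\hat{u}_t$ between them, shown at the point $x_t$.]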

How OLS Works

• So minimise $\hat{u}_1^2 + \hat{u}_2^2 + \hat{u}_3^2 + \hat{u}_4^2 + \hat{u}_5^2$, or minimise $\sum_{t=1}^{5} \hat{u}_t^2$. This is known as the residual sum of squares.
• But what was $\hat{u}_t$? It was the difference between the actual point and the line, $y_t - \hat{y}_t$.
• So minimising $\sum (y_t - \hat{y}_t)^2$ is equivalent to minimising $\sum \hat{u}_t^2$ with respect to $\hat{\alpha}$ and $\hat{\beta}$.

Deriving the OLS Estimator

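• But $\hat{y}_t = \hat{\alpha} + \hat{\beta} x_t$, so let
$$L = \sum_{t=1}^{T} (y_t - \hat{y}_t)^2 = \sum_{t=1}^{T} (y_t - \hat{\alpha} - \hat{\beta} x_t)^2 .$$
• We want to minimise $L$ with respect to (w.r.t.) $\hat{\alpha}$ and $\hat{\beta}$, so differentiate $L$ w.r.t. $\hat{\alpha}$ and $\hat{\beta}$ and set the derivatives to zero:
$$\frac{\partial L}{\partial \hat{\alpha}} = -2 \sum_t (y_t - \hat{\alpha} - \hat{\beta} x_t) = 0 \qquad (1)$$
$$\frac{\partial L}{\partial \hat{\beta}} = -2 \sum_t x_t (y_t - \hat{\alpha} - \hat{\beta} x_t) = 0 \qquad (2)$$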

Deriving the OLS Estimator (Cont’d)
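• From (1),
$$\sum_t (y_t - \hat{\alpha} - \hat{\beta} x_t) = 0 \iff \sum_t y_t - T\hat{\alpha} - \hat{\beta} \sum_t x_t = 0$$
• But $\sum_t y_t = T\bar{y}$ and $\sum_t x_t = T\bar{x}$.
• So we can write
$$T\bar{y} - T\hat{\alpha} - T\hat{\beta}\bar{x} = 0 \quad \text{or} \quad \bar{y} - \hat{\alpha} - \hat{\beta}\bar{x} = 0 \qquad (3)$$
• From (2),
$$\sum_t x_t (y_t - \hat{\alpha} - \hat{\beta} x_t) = 0 \qquad (4)$$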

Deriving the OLS Estimator (Cont’d)

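• From (3),
$$\hat{\alpha} = \bar{y} - \hat{\beta}\bar{x} \qquad (5)$$
• Substitute into (4) for $\hat{\alpha}$ from (5),
$$\sum_t x_t (y_t - \bar{y} + \hat{\beta}\bar{x} - \hat{\beta} x_t) = 0$$
$$\sum_t x_t y_t - \bar{y} \sum_t x_t + \hat{\beta}\bar{x} \sum_t x_t - \hat{\beta} \sum_t x_t^2 = 0$$
$$\sum_t x_t y_t - T\bar{x}\bar{y} + \hat{\beta} T \bar{x}^2 - \hat{\beta} \sum_t x_t^2 = 0$$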

Deriving the OLS Estimator (Cont’d)

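• Rearranging for $\hat{\beta}$,
$$\hat{\beta} \left( T\bar{x}^2 - \sum_t x_t^2 \right) = T\bar{x}\bar{y} - \sum_t x_t y_t$$
• So overall we have
$$\hat{\beta} = \frac{\sum_t x_t y_t - T\bar{x}\bar{y}}{\sum_t x_t^2 - T\bar{x}^2} \quad \text{and} \quad \hat{\alpha} = \bar{y} - \hat{\beta}\bar{x}$$
• This method of finding the optimum is known as ordinary least squares.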
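
As a minimal sketch of these two formulae, the following Python applies them to the five observations on fund XXX and the market index from the earlier example. The function name `ols_simple` is an illustrative assumption and does not come from the slides.

```python
def ols_simple(x, y):
    """Return (alpha_hat, beta_hat) for y_t = alpha + beta * x_t + u_t by OLS."""
    T = len(x)
    x_bar = sum(x) / T
    y_bar = sum(y) / T
    sum_xy = sum(xt * yt for xt, yt in zip(x, y))
    sum_xx = sum(xt * xt for xt in x)
    # beta_hat = (sum x_t y_t - T x_bar y_bar) / (sum x_t^2 - T x_bar^2)
    beta_hat = (sum_xy - T * x_bar * y_bar) / (sum_xx - T * x_bar ** 2)
    # alpha_hat = y_bar - beta_hat * x_bar
    alpha_hat = y_bar - beta_hat * x_bar
    return alpha_hat, beta_hat


# Excess returns on the market index (x) and on fund XXX (y), years 1 to 5
x = [13.7, 23.2, 6.9, 16.8, 12.3]
y = [17.8, 39.0, 12.8, 24.2, 17.2]

alpha_hat, beta_hat = ols_simple(x, y)
print(round(alpha_hat, 2), round(beta_hat, 2))  # -1.74 1.64
```

The code uses the "sums" form of the estimator exactly as derived, rather than a library routine, so each term can be checked against the algebra.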

What do We Use α̂ and β̂ For?

• In the CAPM example used above, plugging the 5 observations into the formulae given above would lead to the estimates $\hat{\alpha} = -1.74$ and $\hat{\beta} = 1.64$. We would write the fitted line as:
$$\hat{y}_t = -1.74 + 1.64 x_t$$
• Question: If an analyst tells you that she expects the market to yield a return 20% higher than the risk-free rate next year, what would you expect the return on fund XXX to be?
• Solution: We can say that the expected value of y = ‘−1.74 + 1.64 × value of x’, so plug x = 20 into the equation to get the expected value for y:
$$\hat{y}_t = -1.74 + 1.64 \times 20 = 31.06$$

Accuracy of Intercept Estimate

• Care needs to be exercised when considering the intercept estimate, particularly if there are no or few observations close to the y-axis.
[Figure: scatter of observations in the (x, y) plane, with the origin marked on the x-axis.]
The Population and the Sample

• The population is the total collection of all objects or people to be studied, for example:

  Interested in                        Population of interest
  predicting outcome of an election    the entire electorate

• A sample is a selection of just some items from the population.
• A random sample is a sample in which each individual item in the population is equally likely to be drawn.

The DGP and the PRF

• The population regression function (PRF), also referred to as the data generating process (DGP), is a description of the model that is thought to be generating the actual data and the true relationship between the variables (i.e. the true values of α and β).
• The PRF is $y_t = \alpha + \beta x_t + u_t$.
• The sample regression function (SRF) is $\hat{y}_t = \hat{\alpha} + \hat{\beta} x_t$, and we also know that $\hat{u}_t = y_t - \hat{y}_t$.
• We use the SRF to infer likely values of the PRF.
• We also want to know how “good” our estimates of α and β are.

Linearity

• In order to use OLS, we need a model which is linear in the parameters (α and β). It does not necessarily have to be linear in the variables (y and x).
• Linear in the parameters means that the parameters are not multiplied together, divided, squared or cubed etc.
• Some models can be transformed to linear ones by a suitable substitution or manipulation, e.g. the exponential regression model
$$Y_t = e^{\alpha} X_t^{\beta} e^{u_t} \iff \ln Y_t = \alpha + \beta \ln X_t + u_t$$
• Then let $y_t = \ln Y_t$ and $x_t = \ln X_t$:
$$y_t = \alpha + \beta x_t + u_t$$
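
The log transformation can be sketched in Python as follows. The parameter values and the data are hypothetical, chosen only for illustration, and the disturbances are set to zero so that OLS on the logged variables recovers α and β exactly; `ols_simple` is an assumed helper name.

```python
import math


def ols_simple(x, y):
    """OLS intercept and slope for a single regressor plus a constant."""
    T = len(x)
    x_bar, y_bar = sum(x) / T, sum(y) / T
    sum_xy = sum(xt * yt for xt, yt in zip(x, y))
    sum_xx = sum(xt * xt for xt in x)
    beta_hat = (sum_xy - T * x_bar * y_bar) / (sum_xx - T * x_bar ** 2)
    return y_bar - beta_hat * x_bar, beta_hat


# Hypothetical "true" parameters of Y_t = exp(alpha) * X_t**beta * exp(u_t)
alpha, beta = 0.5, 1.5
X = [1.0, 2.0, 4.0, 8.0, 16.0]
# Noise-free data (u_t = 0 for all t), an assumption made to keep the check exact
Y = [math.exp(alpha) * Xt ** beta for Xt in X]

# Taking logs makes the model linear in the parameters
x = [math.log(Xt) for Xt in X]
y = [math.log(Yt) for Yt in Y]

alpha_hat, beta_hat = ols_simple(x, y)
print(alpha_hat, beta_hat)  # approximately 0.5 and 1.5
```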

Linear and Non-linear Models

• This is known as the exponential regression model. Here, the coefficients can be interpreted as elasticities.
• Similarly, if theory suggests that y and x should be inversely related:
$$y_t = \alpha + \frac{\beta}{x_t} + u_t$$
then the regression can be estimated using OLS by substituting
$$z_t = \frac{1}{x_t}$$
• But some models are intrinsically non-linear, e.g.
$$y_t = \alpha + x_t^{\beta} + u_t$$

Estimator or Estimate?

• Estimators are the formulae used to calculate the coefficients.

• Estimates are the actual numerical values for the coefficients.

The Assumptions Underlying the Classical Linear
Regression Model (CLRM)

• The model which we have used is known as the classical linear regression model.
• We observe data for $x_t$, but since $y_t$ also depends on $u_t$, we must be specific about how the $u_t$ are generated.
• We usually make the following set of assumptions about the $u_t$’s (the unobservable error terms):

  Technical notation                 Interpretation
  (1) $E(u_t) = 0$                   The errors have zero mean
  (2) $\mathrm{var}(u_t) = \sigma^2$       The variance of the errors is constant and finite over all values of $x_t$
  (3) $\mathrm{cov}(u_i, u_j) = 0$         The errors are linearly independent of one another
  (4) $\mathrm{cov}(u_t, x_t) = 0$         There is no relationship between the error and corresponding x variate
The Assumptions Underlying the Classical Linear
Regression Model (CLRM) (Cont’d)

• An alternative assumption to (4), which is slightly stronger, is that the $x_t$’s are non-stochastic or fixed in repeated samples.
• A fifth assumption is required if we want to make inferences about the population parameters (the actual α and β) from the sample parameters ($\hat{\alpha}$ and $\hat{\beta}$).
• Additional assumption:
  (5) $u_t$ is normally distributed

Properties of the OLS Estimator

• If assumptions (1) through (4) hold, then the estimators $\hat{\alpha}$ and $\hat{\beta}$ determined by OLS are known as Best Linear Unbiased Estimators (BLUE). What does the acronym stand for?
• ‘Estimator’ – $\hat{\alpha}$ and $\hat{\beta}$ are estimators of the true values of α and β
• ‘Linear’ – $\hat{\alpha}$ and $\hat{\beta}$ are linear estimators
• ‘Unbiased’ – on average, the actual values of $\hat{\alpha}$ and $\hat{\beta}$ will be equal to their true values
• ‘Best’ – means that the OLS estimator $\hat{\beta}$ has minimum variance among the class of linear unbiased estimators; the Gauss–Markov theorem proves that the OLS estimator is best.
Consistency/Unbiasedness/Efficiency
• Consistent
The least squares estimators $\hat{\alpha}$ and $\hat{\beta}$ are consistent. That is, the estimates will converge to their true values as the sample size increases to infinity. We need the assumptions $E(x_t u_t) = 0$ and $\mathrm{Var}(u_t) = \sigma^2 < \infty$ to prove this. Consistency implies that
$$\lim_{T \to \infty} \Pr\left[ |\hat{\beta} - \beta| > \delta \right] = 0 \quad \forall\, \delta > 0$$
• Unbiased
The least squares estimators $\hat{\alpha}$ and $\hat{\beta}$ are unbiased. That is, $E(\hat{\alpha}) = \alpha$ and $E(\hat{\beta}) = \beta$. Thus on average the estimated values will be equal to the true values. To prove this also requires the assumption that $E(u_t) = 0$. Unbiasedness is a stronger condition than consistency, in the sense that it holds for small as well as large samples.
Consistency/Unbiasedness/Efficiency (Cont’d)

• Efficiency
An estimator $\hat{\beta}$ of parameter β is said to be efficient if it is unbiased and no other unbiased estimator has a smaller variance. If the estimator is efficient, we are minimising the probability that it is a long way off from the true value of β.
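
Unbiasedness can be illustrated with a small Monte Carlo experiment. The sketch below is only an illustration under stated assumptions: the "true" parameter values, the fixed x values, the normal errors and the number of replications are all hypothetical choices, and `slope_estimate` is an assumed helper name.

```python
import random


def slope_estimate(x, y):
    """OLS slope estimate for a regression of y on x and a constant."""
    T = len(x)
    x_bar, y_bar = sum(x) / T, sum(y) / T
    sum_xy = sum(xt * yt for xt, yt in zip(x, y))
    sum_xx = sum(xt * xt for xt in x)
    return (sum_xy - T * x_bar * y_bar) / (sum_xx - T * x_bar ** 2)


random.seed(0)
alpha, beta, sigma = 1.0, 0.5, 1.0        # hypothetical true values
x = [float(t) for t in range(1, 21)]      # x fixed in repeated samples

estimates = []
for _ in range(5000):
    # Draw a new set of errors with E(u_t) = 0 and constant variance
    y = [alpha + beta * xt + random.gauss(0.0, sigma) for xt in x]
    estimates.append(slope_estimate(x, y))

mean_estimate = sum(estimates) / len(estimates)
print(mean_estimate)  # close to the true beta of 0.5
```

Individual estimates differ from 0.5 from sample to sample, but their average across replications should be very close to it.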

Precision and Standard Errors

• Any set of regression estimates of α and β are specific to the sample used in their estimation.
• Recall that the estimators of α and β from the sample ($\hat{\alpha}$ and $\hat{\beta}$) are given by
$$\hat{\beta} = \frac{\sum x_t y_t - T\bar{x}\bar{y}}{\sum x_t^2 - T\bar{x}^2} \quad \text{and} \quad \hat{\alpha} = \bar{y} - \hat{\beta}\bar{x}$$

Precision and Standard Errors (Cont’d)
• What we need is some measure of the reliability or precision of the estimators ($\hat{\alpha}$ and $\hat{\beta}$). The precision of the estimate is given by its standard error. Given assumptions (1)–(4) above, the standard errors can be shown to be given by
$$SE(\hat{\alpha}) = s \sqrt{\frac{\sum x_t^2}{T \sum (x_t - \bar{x})^2}} = s \sqrt{\frac{\sum x_t^2}{T \left( \sum x_t^2 - T\bar{x}^2 \right)}}$$
$$SE(\hat{\beta}) = s \sqrt{\frac{1}{\sum (x_t - \bar{x})^2}} = s \sqrt{\frac{1}{\sum x_t^2 - T\bar{x}^2}}$$
where $s$ is the estimated standard deviation of the residuals.

Estimating the Variance of the Disturbance Term

• The variance of the random variable $u_t$ is given by
$$\mathrm{Var}(u_t) = E\left[ (u_t - E(u_t))^2 \right]$$
which, since $E(u_t) = 0$, reduces to
$$\mathrm{Var}(u_t) = E(u_t^2)$$
• We could estimate this using the average of $u_t^2$:
$$s^2 = \frac{1}{T} \sum u_t^2$$
• Unfortunately this is not workable since $u_t$ is not observable. We can use the sample counterpart to $u_t$, which is $\hat{u}_t$:
$$s^2 = \frac{1}{T} \sum \hat{u}_t^2$$
But this estimator is a biased estimator of $\sigma^2$.

Estimating the Variance of the Disturbance Term
(cont’d)

• An unbiased estimator of $\sigma^2$ is given by $s^2 = \sum \hat{u}_t^2 / (T - 2)$, so the standard error of the regression is
$$s = \sqrt{\frac{\sum \hat{u}_t^2}{T - 2}}$$
where $\sum \hat{u}_t^2$ is the residual sum of squares and $T$ is the sample size.
• Some Comments on the Standard Error Estimators
1. Both $SE(\hat{\alpha})$ and $SE(\hat{\beta})$ depend on $s^2$ (or $s$). The greater the variance $s^2$, the more dispersed the errors are about their mean value and therefore the more dispersed y will be about its mean value.
2. The sum of the squares of x about their mean appears in both formulae. The larger the sum of squares, the smaller the coefficient variances.
Some Comments on the Standard Error Estimators
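Consider what happens if $\sum (x_t - \bar{x})^2$ is small or large:
[Figure: two panels plotting y against x, each marking $\bar{x}$ and $\bar{y}$, one for a small and one for a large sum of squares.]
3. The larger the sample size, T, the smaller will be the coefficient variances. T appears explicitly in $SE(\hat{\alpha})$ and implicitly in $SE(\hat{\beta})$.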
Some Comments on the Standard Error Estimators
(Cont’d)

T appears implicitly since the sum $\sum (x_t - \bar{x})^2$ is from $t = 1$ to $T$.
4. The term $\sum x_t^2$ appears in $SE(\hat{\alpha})$. The reason is that $\sum x_t^2$ measures how far the points are away from the y-axis.

Example: How to Calculate the Parameters and
Standard Errors

• Assume we have the following data calculated from a regression of y on a single variable x and a constant over 22 observations.
• Data:
$$\sum x_t y_t = 830102, \quad T = 22, \quad \bar{x} = 416.5, \quad \bar{y} = 86.65, \quad \sum x_t^2 = 3919654, \quad RSS = 130.6$$
• Calculations:
$$\hat{\beta} = \frac{830102 - (22 \times 416.5 \times 86.65)}{3919654 - 22 \times (416.5)^2} = 0.35$$
$$\hat{\alpha} = 86.65 - 0.35 \times 416.5 = -59.12$$
Example: How to Calculate the Parameters and
Standard Errors (Cont’d)
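• We write $\hat{y}_t = \hat{\alpha} + \hat{\beta} x_t$:
$$\hat{y}_t = -59.12 + 0.35 x_t$$
• SE(regression):
$$s = \sqrt{\frac{\sum \hat{u}_t^2}{T - 2}} = \sqrt{\frac{130.6}{20}} = 2.55$$
$$SE(\hat{\alpha}) = 2.55 \times \sqrt{\frac{3919654}{22 \times (3919654 - 22 \times 416.5^2)}} = 3.35$$
$$SE(\hat{\beta}) = 2.55 \times \sqrt{\frac{1}{3919654 - 22 \times 416.5^2}} = 0.0079$$
• We now write the results, with the standard errors in parentheses beneath the coefficients, as
$$\hat{y}_t = \underset{(3.35)}{-59.12} + \underset{(0.0079)}{0.35}\, x_t$$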
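
The calculations in this example can be reproduced from the summary statistics alone. The sketch below is a minimal illustration with variable names of my own choosing. It carries full precision throughout rather than rounding $\hat{\beta}$ to 0.35 and $s$ to 2.55 first, so the intercept and the standard errors can differ slightly in the last digit from the values on the slide.

```python
import math

# Summary statistics given in the example
T = 22
sum_xy = 830102.0
sum_xx = 3919654.0
x_bar, y_bar = 416.5, 86.65
rss = 130.6

# Sum of squares of x about its mean: sum x_t^2 - T * x_bar^2
s_xx = sum_xx - T * x_bar ** 2

beta_hat = (sum_xy - T * x_bar * y_bar) / s_xx
alpha_hat = y_bar - beta_hat * x_bar

# Standard error of the regression, using T - 2 degrees of freedom
s = math.sqrt(rss / (T - 2))

se_alpha = s * math.sqrt(sum_xx / (T * s_xx))
se_beta = s * math.sqrt(1.0 / s_xx)

print(f"beta_hat = {beta_hat:.4f}, alpha_hat = {alpha_hat:.2f}")
print(f"s = {s:.3f}, SE(alpha_hat) = {se_alpha:.3f}, SE(beta_hat) = {se_beta:.5f}")
```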

An Introduction to Statistical Inference

• We want to make inferences about the likely population values from the regression parameters.
• Example: Suppose we have the following regression results:
$$\hat{y}_t = \underset{(14.38)}{20.3} + \underset{(0.2561)}{0.5091}\, x_t$$
• $\hat{\beta} = 0.5091$ is a single (point) estimate of the unknown population parameter, β. How “reliable” is this estimate?
• The reliability of the point estimate is measured by the coefficient’s standard error.

Hypothesis Testing : Some Concepts

• We can use the information in the sample to make inferences about the population.
• We will always have two hypotheses that go together, the null hypothesis (denoted $H_0$) and the alternative hypothesis (denoted $H_1$).
• The null hypothesis is the statement or the statistical hypothesis that is actually being tested. The alternative hypothesis represents the remaining outcomes of interest.
• For example, suppose given the regression results above, we are interested in the hypothesis that the true value of β is in fact 0.5. We would use the notation
$$H_0: \beta = 0.5$$
$$H_1: \beta \neq 0.5$$
This would be known as a two-sided test.
One-Sided Hypothesis Tests

• Sometimes we may have some prior information that, for example, we would expect β > 0.5 rather than β < 0.5. In this case, we would do a one-sided test:
$$H_0: \beta = 0.5$$
$$H_1: \beta > 0.5$$
or we could have had
$$H_0: \beta = 0.5$$
$$H_1: \beta < 0.5$$
• There are two ways to conduct a hypothesis test: via the test of significance approach or via the confidence interval approach.

The Probability Distribution of the Least Squares
Estimators

• We assume that $u_t \sim N(0, \sigma^2)$.
• The least squares estimators are linear combinations of the random variables, i.e. $\hat{\beta} = \sum w_t y_t$.
• The weighted sum of normal random variables is also normally distributed, so
$$\hat{\alpha} \sim N(\alpha, \mathrm{var}(\hat{\alpha}))$$
$$\hat{\beta} \sim N(\beta, \mathrm{var}(\hat{\beta}))$$
• What if the errors are not normally distributed? Will the parameter estimates still be normally distributed?

The Probability Distribution of the Least Squares
Estimators (Cont’d)

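• Yes, if the other assumptions of the CLRM hold, and the sample size is sufficiently large.
• Standard normal variates can be constructed from $\hat{\alpha}$ and $\hat{\beta}$:
$$\frac{\hat{\alpha} - \alpha}{\sqrt{\mathrm{var}(\hat{\alpha})}} \sim N(0, 1) \quad \text{and} \quad \frac{\hat{\beta} - \beta}{\sqrt{\mathrm{var}(\hat{\beta})}} \sim N(0, 1)$$
• But $\mathrm{var}(\hat{\alpha})$ and $\mathrm{var}(\hat{\beta})$ are unknown, so
$$\frac{\hat{\alpha} - \alpha}{SE(\hat{\alpha})} \sim t_{T-2} \quad \text{and} \quad \frac{\hat{\beta} - \beta}{SE(\hat{\beta})} \sim t_{T-2}$$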

Testing Hypotheses: The Test of Significance
Approach
• Assume the regression equation is given by
$$y_t = \alpha + \beta x_t + u_t \quad \text{for } t = 1, 2, \ldots, T$$
• The steps involved in doing a test of significance are:
1. Estimate $\hat{\alpha}$, $\hat{\beta}$ and $SE(\hat{\alpha})$, $SE(\hat{\beta})$ in the usual way.
2. Calculate the test statistic. This is given by the formula
$$\text{test statistic} = \frac{\hat{\beta} - \beta^*}{SE(\hat{\beta})}$$
where $\beta^*$ is the value of β under the null hypothesis.
3. We need some tabulated distribution with which to compare the estimated test statistics. Test statistics derived in this way can be shown to follow a t-distribution with $T - 2$ degrees of freedom. As the number of degrees of freedom increases, we need to be less cautious in our approach since we can be more sure that our results are robust.
Testing Hypotheses: The Test of Significance
Approach (Cont’d)

4. We need to choose a “significance level”, often denoted α. This is also sometimes called the size of the test and it determines the region where we will reject or not reject the null hypothesis that we are testing. It is conventional to use a significance level of 5%, although 10% and 1% are also commonly used. The intuitive explanation of a 5% level is that we would only expect a result as extreme as this or more extreme 5% of the time as a consequence of chance alone.

Determining the Rejection Region for a Test of
Significance
5. Given a significance level, we can determine a rejection region and non-rejection region. For a 2-sided test:
[Figure: density f(x) with a 95% non-rejection region in the centre and a 2.5% rejection region in each tail.]

The Rejection Region for a 1-Sided Test (Upper
Tail)

[Figure: density f(x) with a 95% non-rejection region and a 5% rejection region in the upper tail.]

The Rejection Region for a 1-Sided Test (Lower
Tail)

[Figure: density f(x) with a 5% rejection region in the lower tail and a 95% non-rejection region.]

The Test of Significance Approach: Drawing
Conclusions

6. Use the t-tables to obtain a critical value or values with which to compare the test statistic.
7. Finally perform the test. If the test statistic lies in the rejection region then reject the null hypothesis ($H_0$), else do not reject $H_0$.

A Note on the t and the Normal Distribution

• You should all be familiar with the normal distribution and its characteristic “bell” shape.
• We can scale a normal variate to have zero mean and unit variance by subtracting its mean and dividing by its standard deviation.
• There is, however, a specific relationship between the t- and the standard normal distribution. Both are symmetrical and centred on zero. The t-distribution has another parameter, its degrees of freedom. We will always know this (for the time being, it is the number of observations minus 2).

What Does the t-Distribution Look Like?

[Figure: densities f(x) of the normal distribution and the t-distribution plotted together.]

Comparing the t and the Normal Distribution

• In the limit, a t-distribution with an infinite number of degrees of freedom is a standard normal, i.e. $t(\infty) = N(0, 1)$.
• Examples from statistical tables:

  Significance level   N(0,1)   t(40)   t(4)
  50%                  0        0       0
  5%                   1.64     1.68    2.13
  2.5%                 1.96     2.02    2.78
  0.5%                 2.57     2.70    4.60

• The reason for using the t-distribution rather than the standard normal is that we had to estimate $\sigma^2$, the variance of the disturbances.
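
The N(0, 1) column of the table can be checked with Python's standard library, as sketched below. The standard library has no quantile function for the t-distribution, so the t(40) and t(4) columns are not reproduced here; they would need statistical tables or a third-party package.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Upper-tail probabilities matching the "significance level" column
tails = (0.5, 0.05, 0.025, 0.005)
critical_values = {tail: z.inv_cdf(1 - tail) for tail in tails}

for tail, value in critical_values.items():
    print(f"{tail:.1%} in the upper tail: {value:.4f}")
```

The computed values carry more decimal places than the table, which shows them to two decimal places.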

The Confidence Interval Approach to Hypothesis
Testing

• An example of its usage: We estimate a parameter, say β, to be 0.93, and a “95% confidence interval” to be (0.77, 1.09). This means that we are 95% confident that the interval contains the true (but unknown) value of β.
• Confidence intervals are almost invariably two-sided, although in theory a one-sided interval can be constructed.

How to Carry out a Hypothesis Test Using
Confidence Intervals

1. Calculate $\hat{\alpha}$, $\hat{\beta}$ and $SE(\hat{\alpha})$, $SE(\hat{\beta})$ as before.
2. Choose a significance level, α (again the convention is 5%). This is equivalent to choosing a (1 − α) × 100% confidence interval, i.e. 5% significance level = 95% confidence interval.
3. Use the t-tables to find the appropriate critical value, which will again have $T - 2$ degrees of freedom.
4. The confidence interval is given by
$$\left( \hat{\beta} - t_{crit} \times SE(\hat{\beta}),\; \hat{\beta} + t_{crit} \times SE(\hat{\beta}) \right)$$
5. Perform the test: If the hypothesised value of β ($\beta^*$) lies outside the confidence interval, then reject the null hypothesis that $\beta = \beta^*$, otherwise do not reject the null.
Confidence Intervals Versus Tests of Significance

• Note that the test of significance and confidence interval approaches always give the same answer.
• Under the test of significance approach, we would not reject $H_0$ that $\beta = \beta^*$ if the test statistic lies within the non-rejection region, i.e. if
$$-t_{crit} \leq \frac{\hat{\beta} - \beta^*}{SE(\hat{\beta})} \leq +t_{crit}$$
• Rearranging, we would not reject if
$$-t_{crit} \times SE(\hat{\beta}) \leq \hat{\beta} - \beta^* \leq +t_{crit} \times SE(\hat{\beta})$$
$$\hat{\beta} - t_{crit} \times SE(\hat{\beta}) \leq \beta^* \leq \hat{\beta} + t_{crit} \times SE(\hat{\beta})$$
• But this is just the rule under the confidence interval approach.

Constructing Tests of Significance and Confidence
Intervals: An Example

• Using the regression results above,
$$\hat{y}_t = \underset{(14.38)}{20.3} + \underset{(0.2561)}{0.5091}\, x_t, \quad T = 22$$
• Using both the test of significance and confidence interval approaches, test the hypothesis that β = 1 against a two-sided alternative.
• The first step is to obtain the critical value. We want $t_{crit} = t_{20;5\%}$.

Determining the Rejection Region
[Figure: density f(x) with a 95% non-rejection region between −2.086 and +2.086 and a 2.5% rejection region in each tail.]

Performing the Test

• The hypotheses are:
$$H_0: \beta = 1$$
$$H_1: \beta \neq 1$$

Test of significance approach:
$$\text{test stat} = \frac{\hat{\beta} - \beta^*}{SE(\hat{\beta})} = \frac{0.5091 - 1}{0.2561} = -1.917$$
Do not reject $H_0$ since the test statistic lies within the non-rejection region.

Confidence interval approach:
Find $t_{crit} = t_{20;5\%} = \pm 2.086$
$$\hat{\beta} \pm t_{crit} \cdot SE(\hat{\beta}) = 0.5091 \pm 2.086 \cdot 0.2561 = (-0.0251, 1.0433)$$
Do not reject $H_0$ since 1 lies within the confidence interval.
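
Both approaches can be sketched side by side in a few lines of Python. The variable names are illustrative assumptions; the critical value is typed in from the t-tables rather than computed.

```python
beta_hat, se_beta = 0.5091, 0.2561
beta_null = 1.0
t_crit = 2.086  # t(20) with 2.5% in each tail, taken from statistical tables

# Test of significance approach
test_stat = (beta_hat - beta_null) / se_beta
reject_by_test = abs(test_stat) > t_crit

# Confidence interval approach
ci_lower = beta_hat - t_crit * se_beta
ci_upper = beta_hat + t_crit * se_beta
reject_by_ci = not (ci_lower <= beta_null <= ci_upper)

print(round(test_stat, 3), reject_by_test)                      # -1.917 False
print(round(ci_lower, 4), round(ci_upper, 4), reject_by_ci)     # -0.0251 1.0433 False
```

The two decisions coincide, as the rearrangement of the non-rejection condition implies they must.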

Testing other Hypotheses

• What if we wanted to test $H_0: \beta = 0$ or $H_0: \beta = 2$?
• Note that we can test these with the confidence interval approach.
• For interest (!), test
$$H_0: \beta = 0 \quad \text{vs.} \quad H_1: \beta \neq 0$$
$$H_0: \beta = 2 \quad \text{vs.} \quad H_1: \beta \neq 2$$
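
One possible way to work through these two tests is sketched below, reusing the estimate, standard error and 5% critical value from the example above (T = 22, so 20 degrees of freedom). The dictionary layout is an illustrative choice.

```python
beta_hat, se_beta = 0.5091, 0.2561
t_crit = 2.086  # t(20), 5% two-sided test, from statistical tables

results = {}
for beta_null in (0.0, 2.0):
    test_stat = (beta_hat - beta_null) / se_beta
    # Reject H0 when the test statistic falls in either tail
    results[beta_null] = (test_stat, abs(test_stat) > t_crit)

for beta_null, (test_stat, reject) in results.items():
    print(f"H0: beta = {beta_null}: test stat = {test_stat:.3f}, reject = {reject}")
```

Equivalently, 0 lies inside the interval (−0.0251, 1.0433) while 2 lies outside it, so the confidence interval approach gives the same two decisions.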

Changing the Size of the Test

• But note that we looked at only a 5% size of test. In marginal cases (e.g. $H_0: \beta = 1$), we may get a completely different answer if we use a different size of test. This is where the test of significance approach is better than a confidence interval.
• For example, say we wanted to use a 10% size of test. Using the test of significance approach,
$$\text{test stat} = \frac{\hat{\beta} - \beta^*}{SE(\hat{\beta})} = \frac{0.5091 - 1}{0.2561} = -1.917$$
as above. The only thing that changes is the critical t-value.
Changing the Size of the Test: The New Rejection
Regions

Changing the Size of the Test: The Conclusion

• $t_{20;10\%} = 1.725$. So now, as the test statistic lies in the rejection region, we would reject $H_0$.
• Caution should therefore be used when placing emphasis on or making decisions in marginal cases (i.e. in cases where we only just reject or not reject).
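
The effect of the size of the test on this marginal case can be sketched as follows. The same test statistic is compared with the two critical values quoted in the example; the dictionary keys are just labels.

```python
test_stat = (0.5091 - 1.0) / 0.2561  # about -1.917, unchanged by the size of test

# Two-sided critical values for t(20), taken from statistical tables
critical_values = {"5%": 2.086, "10%": 1.725}

decisions = {size: abs(test_stat) > crit for size, crit in critical_values.items()}
print(decisions)  # {'5%': False, '10%': True}
```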

Some More Terminology

• If we reject the null hypothesis at the 5% level, we say that the result of the test is statistically significant.
• Note that a statistically significant result may be of no practical significance. E.g. if a shipment of cans of beans is expected to weigh 450g per tin, but the actual mean weight of some tins is 449g, the result may be highly statistically significant but presumably nobody would care about 1g of beans.

The Errors That We Can Make Using Hypothesis
Tests

• We usually reject $H_0$ if the test statistic is statistically significant at a chosen significance level.
• There are two possible errors we could make:
1. Rejecting $H_0$ when it was really true. This is called a type I error.
2. Not rejecting $H_0$ when it was in fact false. This is called a type II error.

                                        Reality
  Result of test                        H0 is true           H0 is false
  Significant (reject H0)               Type I error = α     ✓
  Insignificant (do not reject H0)      ✓                    Type II error = β

The Trade-off Between Type I and Type II Errors

• The probability of a type I error is just α, the significance level or size of test we chose. To see this, recall what we said significance at the 5% level meant: it is only 5% likely that a result as extreme as this or more extreme could have occurred purely by chance.
• Note that there is no chance for a free lunch here! What happens if we reduce the size of the test (e.g. from a 5% test to a 1% test)? We reduce the chances of making a type I error ... but we also reduce the probability that we will reject the null hypothesis at all, so we increase the probability of a type II error:

The Trade-off Between Type I and Type II Errors
(Cont’d)

Reduce size of test (e.g. 5% to 1%) → More strict criterion for rejection → Reject null hypothesis less often, so:
  ↗ Less likely to falsely reject → Lower chance of type I error
  ↘ More likely to incorrectly not reject → Higher chance of type II error

• So there is always a trade-off between type I and type II errors when choosing a significance level. The only way we can reduce the chances of both is to increase the sample size.
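
The trade-off can be illustrated by simulation. The sketch below is a rough illustration under stated assumptions: the sample size (T = 22), the fixed x values, the normal errors, the alternative value of β (0.05), the seed and the number of replications are all hypothetical choices, and the 1% critical value 2.845 for t(20) comes from standard t-tables rather than from these slides.

```python
import math
import random


def t_stat_for_slope(x, y, beta_null):
    """Test statistic (beta_hat - beta_null) / SE(beta_hat) for a simple regression."""
    T = len(x)
    x_bar, y_bar = sum(x) / T, sum(y) / T
    s_xx = sum((xt - x_bar) ** 2 for xt in x)
    beta_hat = sum((xt - x_bar) * (yt - y_bar) for xt, yt in zip(x, y)) / s_xx
    alpha_hat = y_bar - beta_hat * x_bar
    rss = sum((yt - alpha_hat - beta_hat * xt) ** 2 for xt, yt in zip(x, y))
    s = math.sqrt(rss / (T - 2))
    return (beta_hat - beta_null) / (s / math.sqrt(s_xx))


def rejection_rate(true_beta, t_crit, reps=4000, seed=1):
    """Share of simulated samples in which H0: beta = 0 is rejected."""
    rng = random.Random(seed)
    x = [float(t) for t in range(1, 23)]  # T = 22, so 20 degrees of freedom
    rejections = 0
    for _ in range(reps):
        y = [1.0 + true_beta * xt + rng.gauss(0.0, 1.0) for xt in x]
        if abs(t_stat_for_slope(x, y, beta_null=0.0)) > t_crit:
            rejections += 1
    return rejections / reps


CRIT_5, CRIT_1 = 2.086, 2.845  # two-sided t(20) critical values at 5% and 1%

# H0 true (beta = 0): every rejection is a type I error
type1_5 = rejection_rate(0.0, CRIT_5)
type1_1 = rejection_rate(0.0, CRIT_1)

# H0 false (beta = 0.05): every non-rejection is a type II error
type2_5 = 1 - rejection_rate(0.05, CRIT_5)
type2_1 = 1 - rejection_rate(0.05, CRIT_1)

print(f"type I error rate:  5% test {type1_5:.3f}, 1% test {type1_1:.3f}")
print(f"type II error rate: 5% test {type2_5:.3f}, 1% test {type2_1:.3f}")
```

Moving from the 5% to the 1% test lowers the simulated type I error rate and raises the simulated type II error rate, which is the pattern described above.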

A Special Type of Hypothesis Test: The t-ratio

• Recall that the formula for a test of significance approach to hypothesis testing using a t-test was
$$\text{test statistic} = \frac{\hat{\beta}_i - \beta_i^*}{SE(\hat{\beta}_i)}$$
If the test is
$$H_0: \beta_i = 0$$
$$H_1: \beta_i \neq 0$$
i.e. a test that the population coefficient is zero against a two-sided alternative, this is known as a t-ratio test. Since $\beta_i^* = 0$,
$$\text{test stat} = \frac{\hat{\beta}_i}{SE(\hat{\beta}_i)}$$
• The ratio of the coefficient to its SE is known as the t-ratio or t-statistic.

The t-ratio: An Example

• Suppose that we have the following parameter estimates, standard errors and t-ratios for an intercept and slope respectively.

                Intercept   Slope
  Coefficient   1.10        -4.40
  SE            1.35        0.96
  t-ratio       0.81        -4.63

Compare this with a $t_{crit}$ with 15 − 3 = 12 d.f. (2½% in each tail for a 5% test): 2.179 at the 5% level and 3.055 at the 1% level.
• Do we reject $H_0: \beta_1 = 0$? (No)
  Do we reject $H_0: \beta_2 = 0$? (Yes)
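
The decisions in this example can be sketched in Python by dividing each tabulated coefficient by its tabulated standard error. Computed from the entries as printed, the slope ratio comes out at about −4.58 rather than the tabulated −4.63; the conclusion of the test is unaffected. The dictionary names are illustrative.

```python
coefficients = {"intercept": 1.10, "slope": -4.40}
standard_errors = {"intercept": 1.35, "slope": 0.96}
t_crit_5pct = 2.179  # t(12) with 2.5% in each tail, as quoted in the example

t_ratios = {name: coefficients[name] / standard_errors[name] for name in coefficients}
decisions = {name: abs(t) > t_crit_5pct for name, t in t_ratios.items()}

for name in coefficients:
    print(f"{name}: t-ratio = {t_ratios[name]:.2f}, reject H0 = {decisions[name]}")
```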

What Does the t-ratio tell us?

• If we reject $H_0$, we say that the result is significant. If the coefficient is not “significant” (e.g. the intercept coefficient in the last regression above), then it means that the variable is not helping to explain variations in y. Variables that are not significant are usually removed from the regression model.
• In practice there are good statistical reasons for always having a constant even if it is not significant. Look at what happens if no intercept is included:
[Figure: plot of $y_t$ against $x_t$ for a regression with no intercept.]

An Example of the Use of a Simple t-test to Test a Theory in Finance

• Testing for the presence and significance of abnormal returns (“Jensen’s alpha”, Jensen, 1968).
• The data: annual returns on the portfolios of 115 mutual funds from 1945-1964.
• The model: $R_{jt} - R_{ft} = \alpha_j + \beta_j (R_{mt} - R_{ft}) + u_{jt}$ for $j = 1, \ldots, 115$
• We are interested in the significance of $\alpha_j$.
• The null hypothesis is $H_0: \alpha_j = 0$.
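
The t-ratio on alpha for a single fund can be sketched as below. Jensen's data are not reproduced in these slides, so the five fund XXX observations from the earlier example are reused purely as an illustration; the function name `jensen_alpha` is an assumption, and with only five observations the result carries little weight.

```python
import math


def jensen_alpha(excess_fund, excess_market):
    """Return (alpha_hat, SE(alpha_hat), t-ratio) from a regression of fund
    excess returns on market excess returns and a constant."""
    T = len(excess_market)
    x_bar = sum(excess_market) / T
    y_bar = sum(excess_fund) / T
    sum_xx = sum(x * x for x in excess_market)
    s_xx = sum_xx - T * x_bar ** 2
    s_xy = sum(x * y for x, y in zip(excess_market, excess_fund)) - T * x_bar * y_bar
    beta_hat = s_xy / s_xx
    alpha_hat = y_bar - beta_hat * x_bar
    rss = sum((y - alpha_hat - beta_hat * x) ** 2
              for x, y in zip(excess_market, excess_fund))
    s = math.sqrt(rss / (T - 2))
    se_alpha = s * math.sqrt(sum_xx / (T * s_xx))
    return alpha_hat, se_alpha, alpha_hat / se_alpha


# Illustration only: fund XXX data from the earlier example, not Jensen's sample
market = [13.7, 23.2, 6.9, 16.8, 12.3]
fund = [17.8, 39.0, 12.8, 24.2, 17.2]

alpha_hat, se_alpha, t_ratio = jensen_alpha(fund, market)
print(f"alpha_hat = {alpha_hat:.2f}, SE = {se_alpha:.2f}, t-ratio = {t_ratio:.2f}")
```

The t-ratio would then be compared with a critical value from the t-distribution with T − 2 degrees of freedom to decide whether to reject $H_0: \alpha_j = 0$.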

Frequency Distribution of t-ratios of Mutual Fund Alphas (gross of transactions costs)

[Figure: histogram of frequency against t-ratio, with the t-ratio axis running from −5 to 3. Source: Jensen (1968). Reprinted with the permission of Blackwell publishers.]

Frequency Distribution of t-ratios of Mutual Fund Alphas (net of transactions costs)

[Figure: histogram of frequency against t-ratio, with the t-ratio axis running from −5 to 3. Source: Jensen (1968). Reprinted with the permission of Blackwell publishers.]

Can UK Unit Trust Managers “Beat the Market”?

• We now perform a variant on Jensen’s test in the context of the UK market, considering monthly returns on 76 equity unit trusts. The data cover the period January 1979 – May 2000 (257 observations for each fund). Some summary statistics for the funds are:

                           Mean    Min     Max     Median
  Average monthly return   1.0%    0.6%    1.4%    1.0%
  Std dev of returns       5.1%    4.3%    6.9%    5.0%

• Jensen regression results for UK unit trust returns, January 1979-May 2000:
$$R_{jt} - R_{ft} = \alpha_j + \beta_j (R_{mt} - R_{ft}) + \epsilon_{jt}$$

Can UK Unit Trust Managers “Beat the Market”?: Results

Estimates of    Mean     Minimum   Maximum   Median
α               -0.02%   -0.54%    0.33%     -0.03%
β               0.91     0.56      1.09      0.91
t-ratio on α    -0.07    -2.44     3.11      -0.25

• In fact, gross of transactions costs, 9 funds of the sample of
76 were able to significantly out-perform the market by
providing a significant positive alpha, while 7 funds yielded
significant negative alphas.

The Overreaction Hypothesis and the UK Stock Market

• Motivation
Two studies by DeBondt and Thaler (1985, 1987) showed
that stocks which experience a poor performance over a 3 to
5 year period tend to outperform stocks which had previously
performed relatively well.

• How Can This be Explained?
Two suggestions:
1. A manifestation of the size effect
DeBondt & Thaler did not believe this a sufficient explanation,
but Zarowin (1990) found that allowing for firm size did
reduce the subsequent return on the losers.

The Overreaction Hypothesis and the UK Stock Market (Cont’d)
2. Reversals reflect changes in equilibrium required returns
Ball & Kothari (1989) find the CAPM beta of losers to be
considerably higher than that of winners.

• Another interesting anomaly: the January effect.
– Another possible reason for the superior subsequent
performance of losers.
– Zarowin (1990) finds that 80% of the extra return available
from holding the losers accrues to investors in January.
• Example study: Clare and Thomas (1995)

Data:
Monthly UK stock returns from January 1955 to 1990 on all
firms traded on the London Stock Exchange.

Methodology

• Calculate the monthly excess return of the stock over the
market over a 12, 24 or 36 month period for each stock i:
$U_{it} = R_{it} - R_{mt}$, n = 12, 24 or 36 months

• Calculate the average monthly return for the stock i over the
first 12, 24, or 36 month period:
$\bar{R}_i = \frac{1}{n} \sum_{t=1}^{n} U_{it}$
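
The two steps above can be sketched in a few lines. The function name `average_excess_return` and the three-month return series are hypothetical, chosen only to keep the arithmetic easy to follow (in the study n would be 12, 24 or 36).

```python
def average_excess_return(stock_returns, market_returns):
    """Average of U_it = R_it - R_mt over the formation period."""
    if len(stock_returns) != len(market_returns):
        raise ValueError("return series must have the same length")
    excess = [r_i - r_m for r_i, r_m in zip(stock_returns, market_returns)]
    return sum(excess) / len(excess)


# Made-up monthly returns for one stock and for the market (n = 3 here).
stock = [0.02, -0.01, 0.03]
market = [0.01, 0.00, 0.01]

avg = average_excess_return(stock, market)
print(f"average monthly excess return: {avg:.4f}")
```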

Portfolio Formation

• Then rank the stocks from highest average return to lowest
and form 5 portfolios:
Portfolio 1: Best performing 20% of firms
Portfolio 2: Next 20%
Portfolio 3: Next 20%
Portfolio 4: Next 20%
Portfolio 5: Worst performing 20% of firms.
• Use the same sample length n to monitor the performance of
each portfolio.
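
A minimal sketch of the ranking and quintile split is shown below. The function name `form_quintiles`, the stock labels and the returns are hypothetical; how the original study handled ties or sample sizes not divisible by five is not stated on the slides, so the integer-boundary rule used here is an assumption.

```python
def form_quintiles(avg_returns):
    """Split stocks into 5 portfolios by average excess return.

    avg_returns maps a stock label to its average excess return.
    Index 0 of the result is Portfolio 1 (best 20%), index 4 is
    Portfolio 5 (worst 20%).
    """
    ranked = sorted(avg_returns, key=avg_returns.get, reverse=True)
    n = len(ranked)
    # Integer boundaries at 20%, 40%, ... of the ranked list (assumed rule).
    return [ranked[k * n // 5:(k + 1) * n // 5] for k in range(5)]


# Ten made-up stocks with distinct average excess returns.
avg_returns = {
    "A": 0.05, "B": 0.04, "C": 0.03, "D": 0.02, "E": 0.01,
    "F": 0.00, "G": -0.01, "H": -0.02, "I": -0.03, "J": -0.04,
}

portfolios = form_quintiles(avg_returns)
for number, members in enumerate(portfolios, start=1):
    print(f"Portfolio {number}: {members}")
```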

Portfolio Formation and Portfolio Tracking Periods

• How many samples of length n have we got?
n = 1, 2, or 3 years.

• If n = 1 year:
Estimate for year 1
Monitor portfolios for year 2
Estimate for year 3
...
Monitor portfolios for year 36

• So if n = 1, we have 18 INDEPENDENT (non-overlapping)
observation/tracking periods.
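
The count of 18 follows from simple arithmetic: each cycle uses n years for estimation plus n years for monitoring, and the schedule above runs over 36 years. A small sketch, with the hypothetical helper `independent_periods`, applies the same calculation to all three horizons:

```python
TOTAL_YEARS = 36  # the schedule on the slide runs from year 1 to year 36


def independent_periods(n_years, total_years=TOTAL_YEARS):
    """Number of non-overlapping formation-plus-tracking cycles."""
    # Each cycle consumes n years of formation and n years of tracking.
    return total_years // (2 * n_years)


counts = {n: independent_periods(n) for n in (1, 2, 3)}
print(counts)
```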

Constructing Winner and Loser Returns

• Similarly, n = 2 gives 9 independent periods and n = 3 gives 6
independent periods.
• Calculate monthly portfolio returns assuming an equal
weighting of stocks in each portfolio.
• Denote the mean return for each month over the 18, 9 or 6
periods for the winner and loser portfolios as $\bar{R}_p^W$
and $\bar{R}_p^L$ respectively.
• Define the difference between these as $\bar{R}_{Dt} = \bar{R}_p^L - \bar{R}_p^W$.

• Then perform the regression
$\bar{R}_{Dt} = \alpha_1 + \eta_t$ (Test 1)

• Look at the significance of $\alpha_1$.


Allowing for Differences in the Riskiness of the Winner and Loser Portfolios

• Problem: Significant and positive $\alpha_1$ could be due to higher
return being required on loser stocks due to loser stocks being
more risky.

• Solution: Allow for risk differences by regressing against the
market risk premium:
$\bar{R}_{Dt} = \alpha_2 + \beta (R_{mt} - R_{ft}) + \eta_t$ (Test 2)
where
$R_{mt}$ is the return on the FTA All-share
$R_{ft}$ is the return on a UK government 3 month t-bill.

Is there an Overreaction Effect in the UK Stock Market? Results
Panel A: All Months
                                       n = 12      n = 24      n = 36
Return on loser                        0.0033      0.0011      0.0129
Return on winner                       0.0036      −0.0003     0.0115
Implied annualised return difference   −0.37%      1.68%       1.56%
Coefficient for (3.37): α̂1             −0.00031    0.0014∗∗    0.0013
                                       (0.29)      (2.01)      (1.55)
Coefficients for (3.38): α̂2            −0.00034    0.00147∗∗   0.0013∗
                                       (−0.30)     (2.01)      (1.41)
Coefficients for (3.38): β̂             −0.022      0.010       −0.0025
                                       (−0.25)     (0.21)      (−0.06)

Panel B: all months except January
Coefficient for (3.37): α̂1             −0.0007     0.0012∗     0.0009
                                       (−0.72)     (1.63)      (1.05)

Notes: t-ratios in parentheses; ∗ and ∗∗ denote significance at the 10% and 5% levels, respectively.
Source: Clare and Thomas (1995). Reprinted with the permission of Blackwell Publishers.
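
The “implied annualised return difference” row appears to equal twelve times the monthly α̂1 estimate in Panel A. This is an inference from the reported numbers rather than something stated in the table, and the short check below is only a sketch of that arithmetic.

```python
# Monthly alpha_1 estimates from Panel A, keyed by formation period n (months).
alpha1_monthly = {12: -0.00031, 24: 0.0014, 36: 0.0013}

# Assumed rule: annualise by multiplying the monthly figure by 12,
# then express it as a percentage rounded to two decimal places.
annualised_pct = {n: round(12 * a * 100, 2) for n, a in alpha1_monthly.items()}
print(annualised_pct)
```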

Testing for Seasonal Effects in Overreactions

• Is there evidence that losers out-perform winners more at one
time of the year than another?

• To test this, calculate the difference between the winner &
loser portfolios as previously, $\bar{R}_{Dt}$, and regress this on 12
month-of-the-year dummies:
$\bar{R}_{Dt} = \sum_{i=1}^{12} \delta_i M_i + v_t$

• Significant out-performance of losers over winners in
– June (for the 24-month horizon), and
– January, April and October (for the 36-month horizon).
• Winners appear to stay significantly as winners in
– March (for the 12-month horizon).
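
Since the twelve monthly dummies are mutually exclusive and the regression has no separate intercept, the OLS estimate of each $\delta_i$ is simply the average of $\bar{R}_{Dt}$ over the observations falling in month i. A minimal sketch under that observation, with a hypothetical function name and made-up data covering only two months:

```python
from collections import defaultdict


def monthly_dummy_coefficients(months, r_d):
    """OLS coefficients on month-of-the-year dummies (no intercept).

    months holds the calendar month (1-12) of each observation and r_d the
    corresponding return difference. Each coefficient is the mean of r_d
    within that month.
    """
    groups = defaultdict(list)
    for month, value in zip(months, r_d):
        groups[month].append(value)
    return {month: sum(values) / len(values)
            for month, values in sorted(groups.items())}


# Made-up observations: two Januaries (1) and two Februaries (2).
months = [1, 2, 1, 2]
r_d = [0.02, 0.00, 0.04, -0.01]

delta_hat = monthly_dummy_coefficients(months, r_d)
print(delta_hat)
```

Testing whether a given $\delta_i$ is significant would then use the usual t-ratio for that coefficient.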

Conclusions

• Evidence of overreactions in stock returns.

• Losers tend to be small so we can attribute most of the
overreaction in the UK to the size effect.

Comments

• Small samples

• No diagnostic checks of model adequacy

The Exact Significance Level or p-value

• This is equivalent to choosing an infinite number of critical
t-values from tables. It gives us the marginal significance level
where we would be indifferent between rejecting and not
rejecting the null hypothesis.
• If the test statistic is large in absolute value, the p-value will
be small, and vice versa. The p-value gives the plausibility of
the null hypothesis.
e.g. a test statistic is distributed as a $t_{62}$ and takes the value 1.47.
The p-value = 0.12.
• Do we reject at the 5% level? No
• Do we reject at the 10% level? No
• Do we reject at the 20% level? Yes
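
The decision rule behind these three answers is to reject the null whenever the p-value is below the chosen significance level. A small sketch, taking the p-value of 0.12 quoted above as given:

```python
p_value = 0.12  # value quoted in the example above

# Reject H0 at a given significance level when the p-value is smaller than it.
decisions = {level: p_value < level for level in (0.05, 0.10, 0.20)}

for level, reject in decisions.items():
    print(f"Reject at the {level:.0%} level? {'Yes' if reject else 'No'}")
```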

