
Testing for Assumptions (listed in 6.2) of the Disturbance of the Population Regression

Assumption                           Plot (look for key patterns)     Statistic (summarize key patterns)
a. Linearity: E(e | Xs) = 0          Residual vs. Predicted           NA
b. Constant Variance:                Residual vs. Predicted           Breusch-Pagan
   Var(e | Xs) = constant
c. Normality                         Histogram;                       Skewness; Kurtosis;
                                     Normal plot (Q-Q or P-P)         Jarque-Bera
                                     (See Appendix I)                 (See Appendix II)
d. Independence                      Line plot of the residual        Durbin-Watson
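For concreteness, the sketch below shows one way these diagnostics can be computed in Python with the statsmodels package. The regression data are synthetic placeholders and the function names belong to statsmodels, not to this handout; treat it as a minimal illustration rather than a prescribed procedure.

import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan
from statsmodels.stats.stattools import durbin_watson, jarque_bera

# Synthetic placeholder data: y regressed on two explanatory variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(61, 2))
y = 1.0 + X @ np.array([0.5, -0.3]) + rng.normal(scale=0.05, size=61)

exog = sm.add_constant(X)              # add the intercept column
model = sm.OLS(y, exog).fit()
resid = model.resid

# a. Linearity / b. Constant variance: inspect the residual vs. predicted plot,
#    e.g. plt.scatter(model.fittedvalues, resid) with matplotlib.

# b. Constant variance: Breusch-Pagan test on the residuals
bp_lm, bp_pvalue, _, _ = het_breuschpagan(resid, exog)

# c. Normality: skewness, kurtosis and the Jarque-Bera statistic
jb_stat, jb_pvalue, skew, kurt = jarque_bera(resid)

# d. Independence: Durbin-Watson statistic (values near 2 suggest no autocorrelation)
dw = durbin_watson(resid)

print(f"Breusch-Pagan LM = {bp_lm:.3f} (p = {bp_pvalue:.3f})")
print(f"Jarque-Bera = {jb_stat:.3f} (p = {jb_pvalue:.3f}), skewness = {skew:.3f}, kurtosis = {kurt:.3f}")
print(f"Durbin-Watson = {dw:.3f}")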

Appendix I: Testing for Normality By Using a Q-Q Plot


A natural question in applying a normal distribution is: how can we test whether the data actually come from a normal distribution? A simple method is to construct a histogram and compare its shape with the normal distribution whose mean and standard deviation equal the sample mean and the sample standard deviation of the data, respectively. Fortunately, several statistical packages can conveniently draw the histogram with the normal curve superimposed. As an example, the figure below shows such a histogram with a normal curve for the 61 most recent observations of the monthly rate of return of Exxon stock.
[Figure: histogram of the 61 monthly Exxon returns with the normal curve superimposed; Mean = .014, Std. Dev = .05, N = 61, bins running from -.125 to .175.]
The histogram uses only 61 observations, whereas the superimposed normal curve depicts what the histogram would look like with infinitely many observations. Therefore, sampling errors show up as gaps between the two curves. The graph shows that the distribution of the rate of return of Exxon does not appear markedly different from a normal distribution, at least in the middle part.
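The sketch below is one way to draw such a histogram with a superimposed normal curve in Python, assuming the numpy, matplotlib and scipy packages; the return series is a random placeholder standing in for the Exxon data, which are not reproduced here.

import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm

# Placeholder for the 61 monthly Exxon returns (mean .014, sd .05 as in the figure).
rng = np.random.default_rng(1)
returns = rng.normal(loc=0.014, scale=0.05, size=61)

mean, sd = returns.mean(), returns.std()       # sample mean and sd (1/n divisor)

plt.hist(returns, bins=12, density=True, edgecolor="black")   # the histogram

# Normal curve with the same mean and standard deviation superimposed
grid = np.linspace(returns.min(), returns.max(), 200)
plt.plot(grid, norm.pdf(grid, loc=mean, scale=sd))
plt.xlabel("Monthly rate of return")
plt.show()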
The procedure described above is easy to understand, but it is not effective for revealing a subtle yet systematic departure of the histogram from normality. A better graphical check of normality is a normal probability plot. The plot can easily be developed using Excel, and we describe the process below.
The first step is to sort the data from the lowest to the highest. Let n be the number of observations. Then the lowest observation, denoted x(1), is the (1/n)th quantile of the data. A quantile times 100 is the percentile, so x(1) is also the (1/n) x 100 th percentile of the data. With this convention, however, the largest observation becomes the 100th percentile of the data, which presents a problem because the 100th percentile of a normal distribution is infinity, a value that can never be observed. A suggested remedy is to define the i-th smallest observation, x(i), as the (i/(n+1))th quantile, or the (i/(n+1)) x 100 th percentile, of the data. In the Excel worksheet on the next page both choices are computed for comparison. The next step is to determine, for each observation, the corresponding quantile of the normal distribution that has the same mean and standard deviation as the data. The following Excel function is a convenient way to determine this normal (i/(n+1))th quantile, denoted x*(i):

x*(i) = NORMINV(i/(n+1), sample mean, sample standard deviation)

This value is the quantile expected if the data come from a normal distribution: x(i) should be close to x*(i) if the distribution is indeed normal. The quantiles of the normal distribution -- with the mean and the standard deviation equaling the sample mean and the sample standard deviation, respectively -- are computed in the worksheet column with the heading Expected.
A normal probability plot is a scatterplot of the data vs. the expected quantiles; the plot is shown below. If the data indeed come from a normal distribution, then the points should deviate only in a random fashion from the reference line. Note that the 45 degree line serves as a convenient reference line for detecting a systematic departure from normality.
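Below is a minimal Python sketch of the same construction, assuming numpy, scipy and matplotlib; scipy's norm.ppf plays the role of Excel's NORMINV, and the data array is a placeholder for whatever series is being checked.

import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm

# Placeholder data: the observations whose normality is to be checked.
rng = np.random.default_rng(2)
data = rng.normal(loc=0.014, scale=0.05, size=61)

n = len(data)
x_sorted = np.sort(data)                          # x(1) <= x(2) <= ... <= x(n)
quantile_levels = np.arange(1, n + 1) / (n + 1)   # i/(n+1) for i = 1..n

# Expected quantiles of the normal distribution with the same mean and sd as the data
expected = norm.ppf(quantile_levels, loc=data.mean(), scale=data.std())

plt.scatter(expected, x_sorted)                   # data vs. expected quantiles
lims = [expected.min(), expected.max()]
plt.plot(lims, lims)                              # 45-degree reference line
plt.xlabel("Expected normal quantile")
plt.ylabel("Observed quantile")
plt.show()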
[Figure: normal probability plot of the Exxon returns -- observed values vs. expected normal quantiles -- with a 45 degree reference line; both axes run from roughly -0.15 to 0.20.]

We show below two normal probability plots, one for a sample of 60 independent observations from a uniform distribution over the interval (0, 1) and the other for a sample of 60 independent observations from an exponential distribution with mean 0.5. Probabilities at both tails of a uniform distribution are significantly larger than those of a normal distribution, while an exponential distribution is skewed. These departures from normality are quite common, and the two plots below show how each departure can be detected with a normal probability plot.
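A quick way to reproduce such comparison plots, assuming numpy, scipy and matplotlib, is scipy's probplot function; note that probplot plots the ordered data against standard-normal quantiles with a fitted reference line, which differs slightly from the NORMINV recipe above but reveals the same patterns.

import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

rng = np.random.default_rng(3)
uniform_sample = rng.uniform(0.0, 1.0, size=60)       # uniform over (0, 1)
expo_sample = rng.exponential(scale=0.5, size=60)      # exponential with mean 0.5

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
stats.probplot(uniform_sample, dist="norm", plot=ax1)  # expect an elongated S shape
ax1.set_title("Uniform data")
stats.probplot(expo_sample, dist="norm", plot=ax2)     # expect a convex pattern
ax2.set_title("Exponential data")
plt.show()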
Normal Probability Plot of Data From a Uniform Distribution

[Figure: normal probability plot of the 60 uniform observations -- the uniform data (0.0 to 1.0) plotted against the expected normal quantiles (expunif).]
This normal probability plot of observations from a uniform distribution has an elongated S shape.

Normal Probability Plot of Data From an Exponential Distribution


This normal probability plot of observations from an exponential distribution is convex.
[Figure: normal probability plot of the 60 exponential observations -- the exponential data (0 to about 2) plotted against the expected normal quantiles (expexpo).]

Appendix II: Testing for Normality By Using a Jarque-Bera Statistic


A normal probability plot test can be inconclusive when the plot pattern is not clear. In such a case it is useful to compute a few numbers that measure non-normality. The asymmetry of the distribution is measured by the skewness \alpha_3, the third moment of the standardized variable:

\alpha_3 = E\left[ \left( \frac{x - \mu}{\sigma} \right)^{3} \right]

The sample skewness is evaluated as follows:

\hat{\alpha}_3 = \frac{1}{n} \sum_{i=1}^{n} \left( \frac{x_i - \bar{x}}{\hat{\sigma}} \right)^{3}

where

\hat{\sigma} = \sqrt{ \frac{1}{n} \sum_{i=1}^{n} \left( x_i - \bar{x} \right)^{2} }

The skewness \alpha_3 is 0 for a symmetric population, as can be seen from the formula. Therefore, if the sample skewness \hat{\alpha}_3 is significantly different from 0, then one can infer that the population distribution is unlikely to be symmetric and hence not normal.
Another number that can be used to check the normality of the distribution is the fourth moment of the standardized variable, called the kurtosis \alpha_4:

\alpha_4 = E\left[ \left( \frac{x - \mu}{\sigma} \right)^{4} \right]

The sample kurtosis is computed as follows:

\hat{\alpha}_4 = \frac{1}{n} \sum_{i=1}^{n} \left( \frac{x_i - \bar{x}}{\hat{\sigma}} \right)^{4}

The kurtosis measures the amount of probability in the tails of the distribution and equals 3 for a normal population distribution. Therefore, if the sample kurtosis \hat{\alpha}_4 is significantly different from 3, then one can infer that the population distribution is unlikely to be normal.
The Jarque-Bera statistic combines the two measures \hat{\alpha}_3 and \hat{\alpha}_4 as follows:

JB = \frac{n}{6} \left[ \hat{\alpha}_3^{\,2} + \frac{1}{4} \left( \hat{\alpha}_4 - 3 \right)^{2} \right]

For a large number of observations, a JB value higher than 6 suggests that the population distribution is unlikely to be normal.
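The following Python sketch computes these quantities exactly as defined above (all moments use the 1/n divisor, matching Excel's AVERAGE and STDEVP); the function name and the placeholder data are illustrative only.

import numpy as np

def jarque_bera_from_returns(x):
    """Skewness, kurtosis and JB using population (divide-by-n) moments."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    z = (x - x.mean()) / x.std()     # np.std defaults to the 1/n divisor (STDEVP)
    skew = np.mean(z ** 3)           # sample skewness, alpha_3 hat
    kurt = np.mean(z ** 4)           # sample kurtosis, alpha_4 hat
    jb = (n / 6.0) * (skew ** 2 + 0.25 * (kurt - 3.0) ** 2)
    return skew, kurt, jb

# Example with placeholder data; applied to the Weyerhaeuser returns in the
# worksheet below, this reproduces skewness -0.03913, kurtosis 3.75464, JB 1.43903.
rng = np.random.default_rng(4)
skew, kurt, jb = jarque_bera_from_returns(rng.normal(size=60))
print(skew, kurt, jb)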
The worksheet below illustrates the computation of these statistics for the rate of return of Weyerhaeuser stock over the most recent 60 months.

Computing the Skewness, the Kurtosis and the Jarque-Bera Statistic for the 60 Most Recent Monthly Returns of Weyerhaeuser Stock
Month    Weyerhaeuser    z-score    (z-score)^3    (z-score)^4
1         0.27020         2.8469     23.0741        65.6902
2         0.09449         0.9295      0.8031         0.7464
3         0.08873         0.8667      0.6509         0.5641
4        -0.02731        -0.3996     -0.0638         0.0255
5        -0.10706        -1.2699     -2.0478         2.6004
6         0.02551         0.1768      0.0055         0.0010
7         0.02139         0.1318      0.0023         0.0003
8         0.08088         0.7810      0.4764         0.3721
9        -0.05442        -0.6955     -0.3364         0.2339
10       -0.27098        -3.0586    -28.6127        87.5141
11       -0.06645        -0.8267     -0.5649         0.4670
12        0.10320         1.0246      1.0756         1.1021
13       -0.00968        -0.2073     -0.0089         0.0018
14        0.13487         1.3701      2.5722         3.5242
15       -0.10145        -1.2086     -1.7656         2.1339
16       -0.02065        -0.3269     -0.0349         0.0114
17       -0.02000        -0.3198     -0.0327         0.0105
18        0.11735         1.1789      1.6386         1.9318
19       -0.08037        -0.9787     -0.9373         0.9173
20       -0.02010        -0.3209     -0.0331         0.0106
21       -0.00513        -0.1575     -0.0039         0.0006
22        0.01753         0.0898      0.0007         0.0001
23        0.02051         0.1223      0.0018         0.0002
24        0.01005         0.0081      0.0000         0.0000
25        0.08657         0.8431      0.5992         0.5052
26       -0.04167        -0.5563     -0.1721         0.0957
27        0.02899         0.2147      0.0099         0.0021
28        0.10986         1.0973      1.3211         1.4496
29        0.01709         0.0850      0.0006         0.0001
30       -0.07143        -0.8810     -0.6839         0.6025
31        0.16018         1.6464      4.4625         7.3469
32       -0.01181        -0.2305     -0.0122         0.0028
33       -0.05976        -0.7537     -0.4282         0.3227
34       -0.06610        -0.8229     -0.5572         0.4586
35        0.00459        -0.0515     -0.0001         0.0000
36        0.00913        -0.0019      0.0000         0.0000
37       -0.10679        -1.2669     -2.0333         2.5760
38        0.00000        -0.1016     -0.0010         0.0001
39        0.06154         0.5699      0.1851         0.1055
40       -0.04155        -0.5550     -0.1709         0.0949
41        0.11735         1.1789      1.6386         1.9318
42       -0.06849        -0.8490     -0.6120         0.5196
43       -0.04216        -0.5617     -0.1772         0.0995
44       -0.15026        -1.7413     -5.2795         9.1929
45       -0.02439        -0.3677     -0.0497         0.0183
46       -0.07875        -0.9609     -0.8873         0.8526
47        0.07586         0.7262      0.3830         0.2782
48        0.12179         1.2275      1.8495         2.2702
49        0.06514         0.6092      0.2261         0.1378
50        0.00543        -0.0423     -0.0001         0.0000
51        0.04324         0.3703      0.0508         0.0188
52        0.11606         1.1649      1.5806         1.8412
53        0.14085         1.4354      2.9572         4.2447
54       -0.11934        -1.4039     -2.7669         3.8843
55        0.05794         0.5307      0.1494         0.0793
56       -0.01339        -0.2477     -0.0152         0.0038
57        0.00452        -0.0522     -0.0001         0.0000
58        0.00631        -0.0328      0.0000         0.0000
59       -0.14932        -1.7310     -5.1869         8.9787
60        0.17021         1.7558      5.4131         9.5046
AVERAGE   0.00931         0.00000    -0.03913 (skewness)   3.75464 (kurtosis)
STDEVP    0.09164
JB =      1.43903
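Plugging the worksheet summary values (n = 60, sample skewness -0.03913, sample kurtosis 3.75464) into the Jarque-Bera formula reproduces the JB value in the last row:

JB = \frac{60}{6}\left[ (-0.03913)^{2} + \tfrac{1}{4}\left( 3.75464 - 3 \right)^{2} \right] \approx 10\,(0.0015 + 0.1424) \approx 1.44

Since 1.44 is well below 6, normality of the Weyerhaeuser returns is not rejected.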

