For any probability density function f(x), the likelihood that a single observation x_i comes from that density is f(x_i). For a sample of n observations, the likelihood that the sample comes from density f(x) is

L = \prod_{i=1}^{n} f(x_i)    (4.1)
(Turner, 1960). The maximum likelihood estimates (MLE's) are then used in the likelihood equation to obtain the maximum likelihood. However, it is not always possible to obtain MLE's in closed form for a specific distribution.
The maximum likelihood, conditional on the observations, is found for both the alternative (f_1(x)) and null (f_0(x)) densities. The ratio of these likelihoods, L_1/L_0, is the LR test statistic. If the test statistic is greater than the appropriately chosen critical value, then the null hypothesis is rejected.
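As an illustrative sketch of this procedure (not taken from the source; the function names are ours), the LR statistic for a normal null against a double exponential (Laplace) alternative can be computed directly, since both densities have closed-form MLE's:

```python
import math
import statistics

def max_loglik_normal(x):
    """Maximized normal log-likelihood: MLE's are the sample mean
    and the n-divisor standard deviation."""
    n = len(x)
    mu = sum(x) / n
    sigma = math.sqrt(sum((v - mu) ** 2 for v in x) / n)
    return sum(-math.log(sigma * math.sqrt(2 * math.pi))
               - (v - mu) ** 2 / (2 * sigma ** 2) for v in x)

def max_loglik_laplace(x):
    """Maximized Laplace log-likelihood: MLE's are the median and the
    mean absolute deviation about the median."""
    n = len(x)
    med = statistics.median(x)
    b = sum(abs(v - med) for v in x) / n
    return sum(-math.log(2 * b) - abs(v - med) / b for v in x)

def lr_statistic(x):
    """LR statistic L1/L0 for the Laplace alternative vs the normal null."""
    return math.exp(max_loglik_laplace(x) - max_loglik_normal(x))
```

Values of L_1/L_0 well above 1 favor the alternative; in practice the statistic is compared to tabled critical values rather than to a fixed cutoff.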
For the normal null density the likelihood is

L_0 = (\pi \beta_2)^{-n/2} \exp\left[-\sum_{i=1}^{n}(x_i - \mu)^2/\beta_2\right]

where \beta_2 = 2\sigma^2. Taking the derivatives with respect to \mu and \sigma and setting the results to 0 give the MLE's

\hat{\mu} = \bar{x}

and

\hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2.

Substituting the MLE's into the likelihoods, the ratio of maximized likelihoods reduces to a monotone function of the sample range divided by \hat{\sigma}, so that

u = (x_{(n)} - x_{(1)})/\hat{\sigma}
is the LR test statistic. This test also seems to be a good test against
other short-tailed symmetric alternatives; for these types of alternatives a
lower-tailed test should be used. A two-tailed range test can be used as a
test for non-normal symmetric distributions. Critical values for this test
are given in Table BIO.
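As a sketch (our function name; the divisor is taken here as the sample standard deviation, which may differ from the scaling assumed by particular tables), the range statistic can be computed as:

```python
import statistics

def range_test(x):
    """Range test statistic u = (x_(n) - x_(1)) / s.

    Small values suggest short-tailed alternatives (lower-tailed test);
    a two-tailed version tests against non-normal symmetric alternatives.
    """
    s = statistics.stdev(x)  # sample standard deviation (n - 1 divisor)
    return (max(x) - min(x)) / s
```

The computed value is compared to tabled critical values such as those in Table B10.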
For the double exponential alternative (or other long-tailed symmetric alternative), a single-tailed test using lower percentage points should be used;
d can also be used in a two-tailed fashion. This LR test is asymptotically
equivalent to Geary's a(l) test (Section 3.3.1), and is in fact calculated
identically except for the use of the median rather than the mean in the
numerator. This test has had only limited use, but has been investigated
for its power as a test for normality (Uthoff, 1973; Hogg, 1972; Smith, 1975;
Thode, Smith and Finch, 1983). Critical values for selected sample sizes
are given in Table B9.
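A sketch of the computation of d (assuming the definition d = γ(x_M)/(nσ̂) with the n-divisor MLE standard deviation in the denominator; the function name is ours):

```python
import math
import statistics

def lr_double_exponential(x):
    """LR statistic d = sum(|x_i - median|) / (n * sigma_hat).

    sigma_hat is the maximum likelihood (n-divisor) standard deviation;
    small values of d indicate long-tailed symmetric alternatives.
    """
    n = len(x)
    med = statistics.median(x)
    mu = sum(x) / n
    sigma = math.sqrt(sum((v - mu) ** 2 for v in x) / n)
    return sum(abs(v - med) for v in x) / (n * sigma)
```

Under normality d is close to 0.798, the expected ratio of mean deviation to standard deviation.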
Example 4-2. Forty subjects underwent a standard physical fitness test and pulse rate was measured at termination of the test
(Langley, 1971, p. 63). A Q-Q plot of these data (Data Set 6) is
presented in Figure 4-1, which indicates that these data are normal except possibly for an outlying observation. For these data
\hat{\sigma}^2 = 336.9 and the median is 113. This results in a test statistic value of

d = 0.78
Figure 4-1 Q-Q plot of pulse rates measured after undergoing physical fitness test (n = 40).
which is not significant when compared to the lower 5% level critical value.
A^*_f = \int_0^{\infty} \int_{-\infty}^{\infty} \sigma^{n-2} \prod_{i=1}^{n} f(\sigma x_i + \mu)\, d\mu\, d\sigma
where the x_i are independent. Then the most powerful test of the density f_0 against f_1 which is invariant under changes in location and scale is given by
A^*_1 / A^*_0 > C
which rejects the null hypothesis for the appropriate constant C (Hajek
and Sidak, 1967).
4.2.1 Normal Maximal Invariant
For the normal distribution the location and scale maximal invariant is
A^*_N = (1/2)\, n^{-1/2}\, \pi^{-n/2+1/2}\, \Gamma(n/2 - 1/2) \left[\sum (x_i - \bar{x})^2\right]^{-n/2+1/2}
(Uthoff, 1970). As discussed previously concerning LR tests (Section 4.1.1),
this is a function of 1/s, and the maximal invariants for the alternatives
described below are also in the form of the inverse of some function of
the observations; therefore, the common form of the test statistic, g(x)/s,
where g(x) is the function of observations for the alternative, should be
compared to the lower tail of the distribution of the test statistic.
4.2.2 Uniform Alternative
For the uniform distribution the location and scale maximal invariant is proportional to [x_{(n)} - x_{(1)}]^{-(n-1)} (Uthoff, 1970). As with the normal maximal invariant, the -(n-1)st root of A^*_U is proportional to a basic function of the observations, in this case the range, so that the MPLSI test statistic for a uniform alternative is essentially the range test statistic

T_{U,N} = u = (x_{(n)} - x_{(1)})/s
(David, Hartley and Pearson, 1954). For this alternative, the MPLSI and
LR tests (Section 4.1.2) are identical.
4.2.3 Double Exponential Alternative
Uthoff (1973) showed that the maximal invariant for the double exponential
alternative is
A^*_D = [\gamma(x_M) B_n]^{-n+1}

where x_M is the sample median,

\gamma(x_M) = \sum_{i=1}^{n} |x_i - x_M|,

and B_n is a constant depending only on n, defined separately for n odd and n even in terms of n_1 = [(n+1)/2] and n_2 = [n/2] + 1 (Uthoff, 1973). The -(n-1)st root of the MPLSI test statistic for this alternative is therefore proportional to

T_{D,N} = \gamma(x_M) B_n / s
and is asymptotically equivalent to Geary's test, a(1) = \gamma(\bar{x})/(n\hat{\sigma}), and the LR test for this alternative, d = \gamma(x_M)/(n\hat{\sigma}). A lower-tailed test is used to test for long-tailed symmetric alternatives.
4.2.4 Exponential Alternative
For the exponential alternative, the maximal invariant is given by

A^*_E = \Gamma(n-1)\, n^{-1} \left[\sum_{i=1}^{n} (x_i - x_{(1)})\right]^{-n+1}

(Uthoff, 1970), so that the -(n-1)st root is proportional to \sum(x_i - x_{(1)}) = n(\bar{x} - x_{(1)}) and the MPLSI test statistic is essentially Grubbs' (1969) statistic

T_{E,N} = (\bar{x} - x_{(1)})/s.

4.2.5 Cauchy Alternative

The maximal invariant A^*_C for the Cauchy alternative does not reduce to a simple closed form; it involves sums over terms in \log|x_j - x_k| (Franck, 1981).
The appropriate A^*_C divided by A^*_N gives the MPLSI for a Cauchy alternative to a normal, using a lower-tailed test for long-tailed symmetric alternatives. An appendix in Franck (1981) contains a FORTRAN program for calculating this test.
4.2.6 Combinations of MPLSI Tests
Using the logic that the combination of a good test for short tails and
a good test for long tails would result in a good test for an unspecified
symmetric alternative, Spiegelhalter (1977) used a combination of MPLSI
tests to obtain a test that would be useful under more general conditions.
He defined a test statistic against symmetric alternatives as a combination of the MPLSI statistics for the uniform and double exponential alternatives, where u is the range test statistic, a(1) is Geary's test statistic, and C_n is a normalizing constant depending only on n. Spiegelhalter (1980) extended this approach to an omnibus statistic, S_s, by combining the maximal invariants A^*_A, where A = (N, D, U, R, E).
Test critical values for T_s, T'_s and S_s are contained in Table B11. T_s and T'_s are upper-tailed tests for detecting symmetric alternatives; S_s is a lower-tailed omnibus test.
Example 4-4- For the pulse rate data (Data Set 6),
u = 4.90,
a(1) = 0.79
which does not exceed the 5% critical value for a sample of size
40.
The kernels of the U-statistics investigated by Locke and Spurrier have the general form

\phi(x_1, \ldots, x_k) = \sum_{i=1}^{k-1} c_i (x_{(i+1)} - x_{(i)})^p    (4.2)

k < n, where the x_{(i)} are the order statistics of x_1, \ldots, x_k, the c_i are real constants, and p is greater than zero.
Asymmetric Alternatives
When the alternative of interest is asymmetric, the constants in (4.2) are such that c_i = -c_{k-i}. Locke and Spurrier (1976) considered the U-statistics with kernel of order k = 3, c_1 = -1, c_2 = 1 and p = 1 or 2, and the respective test statistics T_{pn} = U_{pn}/s^p. The U_{pn} are calculated as

U_{pn} = \binom{n}{3}^{-1} \sum\sum\sum \phi(x_i, x_j, x_k),

the sum being taken over all 1 \le i < j < k \le n.
As U-statistics, T_{1n} and T_{2n} are asymptotically normal and are symmetric about 0 for symmetric distributions. The asymptotic approximation is satisfactory when n > 20 for T_{1n} and n > 50 for T_{2n}. For T_{1n}, the variances for samples of size n under the normal approximation are
Var(T_{13}) = a_1,

Var(T_{14}) = (a_1 + 3a_2)/4,

and

Var(T_{1n}) = 3!(n-3)!\{a_1 + 3(n-3)a_2 + 1.5(n-3)(n-4)a_3\}/n!

for n > 5. The constants a_1, a_2 and a_3 are approximately 1.03803994, 0.23238211 and 0.05938718, respectively. The variances for T_{2n} are calculated identically, with values of 7.03804, 1.77417 and 0.49608 for a_1, a_2 and a_3, respectively.
Similar to \sqrt{b_1}, positive values of T_{pn} indicate skewness to the right while negative values indicate skewness to the left. Two-tailed tests can be used when the alternative is skewed in an unknown direction.

Locke and Spurrier found that T_{1n} and T_{2n} had good power properties compared to \sqrt{b_1}, but there was no advantage to using T_{2n} and higher values of p.
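As a sketch of the p = 1 case (assuming the kernel x_{(1)} - 2x_{(2)} + x_{(3)} and sample-standard-deviation scaling; the function names are ours), T_{1n} and its null variance can be computed as:

```python
import math
from itertools import combinations
from statistics import stdev

# Constants for p = 1 (Locke and Spurrier, 1976)
A1, A2, A3 = 1.03803994, 0.23238211, 0.05938718

def phi(x1, x2, x3):
    """Kernel x(1) - 2x(2) + x(3) on the ordered triple (k = 3, p = 1)."""
    a, b, c = sorted((x1, x2, x3))
    return a - 2 * b + c

def t1n(x):
    """T_1n = U_1n / s, the U-statistic over all triples divided by s."""
    n = len(x)
    u = sum(phi(*trip) for trip in combinations(x, 3)) / math.comb(n, 3)
    return u / stdev(x)

def var_t1n(n):
    """Null variance of T_1n for n > 5 under the normal approximation."""
    return (math.factorial(3) * math.factorial(n - 3) / math.factorial(n)
            * (A1 + 3 * (n - 3) * A2 + 1.5 * (n - 3) * (n - 4) * A3))
```

Positive values of the statistic indicate right skewness; T_1n/\sqrt{Var(T_{1n})} is referred to the standard normal for large n.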
Example 4-5. In a study of competing risks, Hoel (1972) presented
data on the effects of environment on mice subject to radiation.
Figure 4-2 Detrended Q-Q plot of number of days until occurrence of cancer in mice subjected to radiation (n = 38).
The competing risks were from the various types of cancer that
developed. In this example, we examine the number of days until
cancer occurrence for 38 mice in a normal environment which
develop reticulum sarcoma (Data Set 7). The detrended Q-Q plot
for these data is shown in Figure 4-2 which shows that the data
are highly skewed in nature.
Components of the test statistic T_{1n} are

U_{1,38} = -34.29

with the sample variance of the data s^2 = 10480.68, so that

T_{1,38} = -0.335.

Standardizing with \mu = 0 and Var(T_{1,38}) = 0.0156 results in a standard normal value of z = -2.68, indicating that the data are skewed to the left significantly more than would be expected from a normal sample.
Symmetric Alternatives
For symmetric departures from normality, the constants in (4.2) are such that c_i = c_{k-i}. Locke and Spurrier (1977; 1981) reported on two tests. One, T^* = U^*/s, is based on a kernel of order k = 4,

U^* = \binom{n}{4}^{-1} \sum_{i=1}^{} \sum_{j=i+1}^{} \sum_{k=j+1}^{} \sum_{l=k+1}^{n} \phi(x_i, x_j, x_k, x_l),

which simplifies to a linear combination of the order statistics with weights

w_i = (i - 1)(n - i)(2i - n - 1).
The other test investigated, T_D, was defined by the kernel (4.2) with k = 2, c_1 = 1 and p = 1. They determined that T_D was a constant times D'Agostino's D (Section 4.3.2).
Although T* is asymptotically normal, approximation of critical values for n > 10 is better using beta distributions (Locke and Spurrier, 1981).
The lower tails of the distribution are used to test for heavy-tailed alternatives and upper tails are used to test for light-tailed alternatives. Critical
values for selected sample sizes are given in Table B12.
Example 4-6. For the margarine sodium level data (Data Set 5),
U* = 6.18
which results in a value of
T* = 0.675.
This value is just below the upper tail 5% level of 0.68 of the test
statistic null distribution for a sample of size 28. Therefore, based
on this test, we would not reject the null hypothesis of normality.
4.3.2 D'Agostino's D
D'Agostino (1971) proposed the D statistic as an extension of the Wilk-Shapiro test for moderate and large samples, since at the time the W test was only identified (via the required coefficients) for samples up to size 50.
T is related to \sigma^* by

\sigma^* = 2\sqrt{\pi}\, T/(n^2 - n).

Further, \sigma^* is equal to \sqrt{\pi}\, g/2 where

g = [n(n-1)]^{-1} \sum_i \sum_j |x_i - x_j|,

which is known as Gini's mean difference; like the tests by Locke and Spurrier, this is also a U-statistic (David, 1968), and in fact belongs to the family of U-statistics (4.2) derived by Locke and Spurrier (1977) for symmetric alternatives with kernel \phi(x_1, x_2) = |x_1 - x_2|, i.e., k = 2, c_1 = 1 and p = 1.
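The equivalence between the pairwise form of g and the order-statistic form via T can be checked numerically (a sketch with our own function names; the relation g = 4T/(n^2 - n) follows from the two expressions for \sigma^*):

```python
from itertools import combinations

def gini_pairwise(x):
    """Gini's mean difference: average absolute difference over all pairs."""
    n = len(x)
    return sum(abs(a - b) for a, b in combinations(x, 2)) * 2 / (n * (n - 1))

def gini_from_t(x):
    """Same quantity via T = sum_i (i - (n+1)/2) x_(i): g = 4T/(n^2 - n)."""
    n = len(x)
    xs = sorted(x)
    t = sum((i - (n + 1) / 2) * v for i, v in enumerate(xs, start=1))
    return 4 * t / (n * n - n)
```

The order-statistic form avoids the O(n^2) pairwise sum.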
D'Agostino's test statistic is given by

D = T / (n^2 \sqrt{m_2}),

where T = \sum_{i=1}^{n} [i - (n+1)/2]\, x_{(i)} and m_2 is the second sample moment; for comparison with tabled critical values D is standardized as Y = \sqrt{n}(D - 0.28209479)/0.02998598.

Figure 4-3 Q-Q plot of average daily wind speeds for July 1984 (n = 31).
D'Agostino (1972) later extended the D test to small samples and calculated small sample percentiles for Y using Pearson curves. These percentiles are given in Table B13. Although D is asymptotically normal, even
for samples of size 1000 the percentiles of Y have not converged to those
of a standard normal distribution, nor are they symmetric, indicating slow
convergence to the asymptotic distribution.
Although D was originally derived as an omnibus test, Locke and
Spurrier (1977) claimed that the kernel indicated that it might be weak
against alternatives with considerable asymmetry. This was confirmed in
many simulation studies which included D. D seems to be most powerful
as a directional test for detecting heavy-tailed alternatives. For skewed
alternatives or as an omnibus test, a two-tailed test should be used; as
a directional test for symmetric non-normal alternatives, the lower tail should be used for heavy-tailed alternatives and the upper tail for light-tailed alternatives.
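A sketch of the computation (our function name; the constants 0.28209479 = (2\sqrt{\pi})^{-1} and 0.02998598 in the standardized form Y are the ones commonly used with D, and should be checked against the tables before use):

```python
import math

def dagostino_d(x):
    """D'Agostino's D = T / (n^2 sqrt(m2)) and its standardized form Y."""
    n = len(x)
    xs = sorted(x)
    # T: linear combination of order statistics
    t = sum((i - (n + 1) / 2) * v for i, v in enumerate(xs, start=1))
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n  # second sample moment
    d = t / (n ** 2 * math.sqrt(m2))
    y = math.sqrt(n) * (d - 0.28209479) / 0.02998598
    return d, y
```

Y is referred to the Pearson-curve percentiles of Table B13 rather than the standard normal for small and moderate n, given the slow convergence noted below.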
Example 4-7. Daily average wind speeds at Long Island MacArthur airport for the month of July 1984 were obtained from NOAA
(Data Set 8). These data are plotted in Figure 4-3 and appear
normal with a possible upper tail outlier.
For these data, the test statistic components are
T = 694.80,
D = 0.273,
Y = -1.646
Table 4-1 Variance estimates of Oja's T_1 and T_2 test statistics under normality.

n    Var(T_1)   Var(T_2)  |  n    Var(T_1)   Var(T_2)
5    0.01512    0.01549   |  10   0.00358    0.00196
6    0.01045    0.00800   |  15   0.00218    0.00098
7    0.00740    0.00473   |  20   0.00144    0.00058
8    0.00575    0.00350   |  30   0.00089    0.00032
9    0.00457    0.00248   |  oo   0.0214/n   0.0026/n
The statistic T_2 was similarly proposed as a U-statistic of order 4, with the sum taken over 1 \le i < j < k < l \le n.
The joint limiting distribution of T_1 and T_2 is bivariate normal and, under any symmetric null distribution with density function g, T_1 and T_2 are uncorrelated. Hence, the test statistic

\frac{(T_1 - \mu_g(T_1))^2}{\mathrm{Var}_g(T_1)} + \frac{(T_2 - \mu_g(T_2))^2}{\mathrm{Var}_g(T_2)}
Oja (1983) also defined modified statistics T'_1 and T'_2 as weighted sums of differences of the order statistics,

\sum a_{ij} (x_{(j)} - x_{(i)})    (4.3)

for appropriate constants a_{ij}.
As with T_1 and T_2, Oja claimed an omnibus test can be constructed using T'_1 and T'_2. Based on simulation results, E(T'_1) = 0 and E(T'_2) = 0.4523. Variances are given by

V(T'_1) = \binom{n}{2}^{-1}(0.11217n^2 - 0.44196n + 3.21425),

with a similar expression for V(T'_2).
The entropy of a density f is

H(f) = -\int_{-\infty}^{\infty} f(x)\log(f(x))\, dx.

Among all densities with standard deviation \sigma, H(f) is maximized by the normal density, for which H(f) = \log(\sigma\sqrt{2\pi e}),

so that \exp[H(f)]/\sigma \le \sqrt{2\pi e} for all f(x), equality being attained under normality. Therefore, an omnibus test for a sample of size n is defined by rejecting the null hypothesis if

K_{mn} < K^*
K_{2,15} = 2.14,

K_{3,15} = 1.96,

both of which are below the 5% critical value for the respective test. Although we have identified the sample as being non-normal, the value of the test does not give an indication of the type of non-normality.
Similar to the Wilk-Shapiro W and D'Agostino's D, this test is also the ratio of two estimates of the standard deviation, since

K_{mn} = \exp[H_{mn}]/\hat{\sigma},

the term on the right being the ratio of the entropy estimate of the standard deviation to the maximum likelihood estimate.

Figure 4-4 Q-Q plot of heights of cross-fertilized Zea mays plants (n = 15).
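A sketch of the entropy statistic (assuming Vasicek's m-spacing estimate H_mn with the usual boundary convention, and the MLE standard deviation; the function name is ours):

```python
import math

def vasicek_kmn(x, m):
    """Entropy statistic K_mn = exp(H_mn) / sigma_hat (after Vasicek, 1976).

    H_mn estimates entropy from m-spacings of the order statistics;
    sigma_hat is the n-divisor standard deviation. Requires distinct
    observations so that every m-spacing is positive.
    """
    n = len(x)
    xs = sorted(x)
    # Boundary convention: x_(j) = x_(1) for j < 1 and x_(j) = x_(n) for j > n
    def ordstat(j):
        return xs[min(max(j, 1), n) - 1]
    h_mn = sum(math.log(n / (2 * m) * (ordstat(i + m) - ordstat(i - m)))
               for i in range(1, n + 1)) / n
    mean = sum(x) / n
    sigma = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    return math.exp(h_mn) / sigma
```

Since exp[H(f)]/\sigma is maximized under normality, small values of K_mn lead to rejection.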
to attain a more nearly normal marginal distribution for the variance. Although the n pairs (x_i, y_i) are not independent, Lin and Mudholkar used r to estimate the extent of the dependence. Equivalently, the product moment correlation of (x_i, y_i) can be used. High values of |r| indicate non-normality.
Asymptotically r is N(0, 3/n) under the null hypothesis; however, for moderate samples they suggest using Fisher's transformation

z = \frac{1}{2}\log\left(\frac{1+r}{1-r}\right).

Figure 4-5 Plot of means and transformed variances, y_i, used for calculation of the Lin and Mudholkar (1980) test for the number of days until cancer occurrence data for mice (Data Set 7).
Under the null hypothesis, z is also asymptotically N(0, 3/n); for small samples, z is nearly normal with mean 0 and variance

\sigma_n^2 = 3/n - 7.324/n^2 + 53.005/n^3.

An approximation of the percentile points of the distribution of z under normality was obtained using a Cornish-Fisher expansion. The upper \alpha percentile of z is approximately

z_\alpha = \sigma_n[u_\alpha + \gamma_{2n}(u_\alpha^3 - 3u_\alpha)/24]    (4.4)
where \bar{x}_i is the mean of the ith quartile of the ordered data. Upper tail critical values for Z were obtained through simulation only for sample sizes of n = 20, 50 and 100; 0.05 values are 6.268, 4.999 and 4.720, respectively,
where the test for heavy-tailed alternatives is an upper-tailed test. For
short-tailed alternatives, a lower-tailed test should be used, although these
critical values were not given.
The Box-Cox power transformation (Box and Cox, 1964) is defined by

x_i^{(\lambda)} = (x_i^{\lambda} - 1)/\lambda,   \lambda \ne 0

x_i^{(\lambda)} = \ln(x_i),   \lambda = 0

under the constraint that all x_i are positive. The best transformation to
normality is obtained by determining the maximum likelihood estimate of
\lambda, where the log-likelihood is

L = -\frac{n}{2}\ln(\hat{\sigma}_\lambda^2) + (\lambda - 1)\sum_{i=1}^{n}\ln(x_i)    (4.5)

and \hat{\sigma}_\lambda^2 is the maximum likelihood estimate of the variance under the transformation. In the context of testing
for normality, if the likelihood ratio test for \lambda = 1 does not reject or, equivalently, a confidence interval for \lambda contains 1, then the sample cannot be rejected as being from a normal distribution.
The precise maximum likelihood estimate of \lambda can be obtained by iterative methods using (4.5) and a computer optimization routine; however, in practice, (4.5) is usually evaluated using a grid of \lambda values (for example, Draper and Smith, 1981, suggest an initial grid within the range of (-2, 2); this range can be widened subsequently as necessary). A value of \lambda is then chosen which maximizes (4.5).

Figure 4-6 Likelihood for Box-Cox transformation of TSP data (Data Set

For observations that are not necessarily positive, Box and Cox (1964) also proposed a shifted version of the transformation,

x_i^{(\lambda)} = ((x_i + \lambda_2)^{\lambda_1} - 1)/\lambda_1,   \lambda_1 \ne 0

x_i^{(\lambda)} = \ln(x_i + \lambda_2),   \lambda_1 = 0.
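A sketch of the grid evaluation of (4.5) (the function names and default grid are ours):

```python
import math

def boxcox_loglik(x, lam):
    """Profile log-likelihood (4.5) for the Box-Cox parameter lambda."""
    n = len(x)
    if abs(lam) < 1e-12:
        y = [math.log(v) for v in x]
    else:
        y = [(v ** lam - 1) / lam for v in x]
    m = sum(y) / n
    var = sum((v - m) ** 2 for v in y) / n  # MLE (n-divisor) variance
    return -n / 2 * math.log(var) + (lam - 1) * sum(math.log(v) for v in x)

def boxcox_grid(x, lo=-2.0, hi=2.0, step=0.1):
    """Evaluate (4.5) on a grid and return the maximizing lambda."""
    grid = [lo + i * step for i in range(int(round((hi - lo) / step)) + 1)]
    return max(grid, key=lambda lam: boxcox_loglik(x, lam))
```

For data whose logarithms are symmetric, the maximizing \lambda is near 0, corresponding to the log transformation.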
Both T_1 and T_2 are U-statistics and therefore asymptotically normal.
Groeneveld and Meeden's B_1, B_2(\alpha) and B_3
Groeneveld and Meeden (1984) suggested several statistics as measures
of skewness. One general measure for a specified distribution F is
B_2(\alpha) = [F^{-1}(1-\alpha) + F^{-1}(\alpha) - 2\nu_x]/[F^{-1}(1-\alpha) - F^{-1}(\alpha)]    (4.6)

where \nu_x is the median of the distribution and F^{-1}(\alpha) is the \alphath quantile.
For a sample of observations, this suggests the substitution of sample order
statistics into (4.6) as a measure of skewness in a sample. The Bowley
coefficient of skewness (Bowley, 1920) is a measure of this form,

B_2(0.25) = [Q_3 + Q_1 - 2Q_2]/[Q_3 - Q_1],

where the Q_i are the ith quartiles of the random variable X. A third measure of skewness is

B_3 = (\mu_x - \nu_x)/E|X - \nu_x|,

where \mu_x is the sample mean. B_1 can also be estimated using the sample observations. B_3 is closely related to the Pearson measure of skewness

q = (\mu_x - \nu_x)/\sigma,

where \nu_x is the mode. Both B_3 and q are equal to 0 for any symmetric distribution, and are positive for right skewed and negative for left skewed densities.
Filliben's p
As a measure of the skewness of the densities he was using as alternatives in a power study of the probability plot correlation test, Filliben (1975) used, in addition to other measures,

p = [F^{-1}(0.975) - F^{-1}(0.5)]/[F^{-1}(0.975) - F^{-1}(0.025)].

Note that this is the distributional equivalent to Uthoff's T_2 statistic (Section 4.6.1) with \alpha = 0.025, except the numerator measures the upper tail of the distribution. For symmetric distributions p = 0.5, while for asymmetric distributions skewed to the right 0.5 < p < 1. In general, p can range within the interval [0, 1].
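As a quick sketch (our function name), p can be evaluated for any distribution through its quantile function; for the normal it equals 0.5 by symmetry (NormalDist is from the Python standard library):

```python
from statistics import NormalDist

def filliben_p(inv_cdf):
    """Filliben-type p from a quantile function: the share of the
    central 95% spread that lies above the median."""
    q = inv_cdf
    return (q(0.975) - q(0.5)) / (q(0.975) - q(0.025))
```

For a right-skewed distribution such as the lognormal, p exceeds 0.5.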
where n_1 and n_2 are about \beta n and (1 - \beta)n, respectively, and \beta is about 0.25; the h_n are defined in Section 4.6.1. Under these conditions the expected value of S_1 is 0.290 for the normal distribution.
then T_2 is defined as

T_2 = (1 - i/n)0.57854
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
References
Bowley, A.L. (1920). Elements of Statistics, 4th ed. Charles Scribner's
Sons, New York.
Box, G.E.P., and Cox, D.R. (1964). An analysis of transformations. Journal of the Royal Statistical Society B 26, 211-252.
D'Agostino, R.B. (1970). Linear estimation of the normal distribution
standard deviation. American Statistician 24, 14-15.
D'Agostino, R.B. (1971). An omnibus test of normality for moderate and
large size samples. Biometrika 58, 341-348.
D'Agostino, R.B. (1972). Small sample probability points for the D test of
normality. Biometrika 59, 219-221.
David, H.A. (1968). Gini's mean difference rediscovered. Biometrika 55,
573-575.
David, H.A., Hartley, H.O., and Pearson, E.S. (1954). The distribution
of the ratio, in a single normal sample, of the range to the standard
deviation. Biometrika 41, 482-493.
Davis, C.E., and Quade, D. (1978). U-statistics for skewness or symmetry.
Communications in Statistics - Theory and Methods A7, 413-418.
Downton, F. (1966). Linear estimates with polynomial coefficients. Biometrika 53, 129-141.
Draper, N.R., and Smith, H. (1981). Applied Regression Analysis, 2nd
ed. John Wiley and Sons, New York.
Filliben, J.J. (1975). The probability plot correlation coefficient test for
normality. Technometrics 17, 111-117.
Franck, W.E. (1981). The most powerful invariant test of normal versus
Cauchy with applications to stable alternatives. Journal of the American
Statistical Association 76, 1002-1005.
Groeneveld, R.A., and Meeden, G. (1984). Measuring skewness and kurtosis. The Statistician 33, 391-399.
Grubbs, F.E. (1969). Procedures for detecting outlying observations in a
sample. Technometrics 11, 1-21.
Hajek, J., and Sidak, Z.S. (1967). Theory of Rank Tests. Academic
Press, New York.
Hoeffding, W. (1948). A class of statistics with asymptotically normal
distributions. Annals of Mathematical Statistics 19, 293-325.
Hoel, D.G. (1972). A representation of mortality data by competing risks.
Biometrics 28, 475-488.
Hogg, R.V. (1967). Some observations on robust estimation. Journal of
the American Statistical Association 62, 1179-1186.
Hogg, R.V. (1972). More light on the kurtosis and related statistics. Journal of the American Statistical Association 67, 422-424.
Langley, R. (1971). Practical Statistics. Dover Publications, New York.
Lin, C.-C., and Mudholkar, G.S. (1980). A simple test for normality against
asymmetric alternatives. Biometrika 67, 455-461.
Locke, C., and Spurrier, J.D. (1976). The use of U-statistics for testing normality against non-symmetric alternatives. Biometrika 63, 143-147.
Locke, C., and Spurrier, J.D. (1977). The use of U-statistics for testing
normality against alternatives with both tails heavy or both tails light.
Biometrika 64, 638-640.
Locke, C., and Spurrier, J.D. (1981). On the distribution of a new measure of heaviness of tails. Communications in Statistics - Theory and
Methods A10, 1967-1980.
Oja, H. (1981). Two location and scale free goodness of fit tests. Biometrika
68, 637-640.
Oja, H. (1983). New tests for normality. Biometrika 70, 297-299.
Shapiro, S.S., and Wilk, M.B. (1965). An analysis of variance test for
normality (complete samples). Biometrika 52, 591-611.
Smith, V.K. (1975). A simulation analysis of the power of several tests for
detecting heavy-tailed distributions. Journal of the American Statistical
Association 70, 662-665.
Spiegelhalter, D.J. (1977). A test for normality against symmetric alternatives. Biometrika 64, 415-418.
Spiegelhalter, D.J. (1980). An omnibus test for normality for small samples. Biometrika 67, 493-496.
Thode, H.C., Jr., Smith, L.A., and Finch, S.J. (1983). Power of tests of
normality for detecting scale contaminated normal samples. Communications in Statistics - Simulation and Computation 12, 675-695.
Turner, M.E. (1960). On heuristic estimation methods. Biometrics 16,
299-301.
Uthoff, V.A. (1968). Some scale and origin invariant tests for distributional
assumptions. Ph.D. thesis, University Microfilms, Ann Arbor, MI.
Uthoff, V.A. (1970). An optimum test property of two well-known statistics. Journal of the American Statistical Association 65, 1597-1600.
Uthoff, V.A. (1973). The most powerful scale and location invariant test
of the normal versus the double exponential. Annals of Statistics 1,
170-174.
Vasicek, O. (1976). A test for normality based on sample entropy. Journal
of the Royal Statistical Society B 38, 54-59.