
Approximating the Normal Tail Alan G. Hawkes The Statistician, Vol. 31, No. 3. (Sep., 1982), pp. 231-236.

Stable URL: http://links.jstor.org/sici?sici=0039-0526%28198209%2931%3A3%3C231%3AATNT%3E2.0.CO%3B2-E The Statistician is currently published by Royal Statistical Society.


The Statistician, Vol. 31, No. 3 © 1982 Institute of Statisticians

0039-0526/82/00760231 $02.00

Approximating the Normal Tail


ALAN G. HAWKES Department of Statistics, University College of Swansea, Swansea SA2 8PP

A number of simple approximate formulae for the upper tail probability of the normal distribution are compared. Some new more accurate, but only slightly more complicated, formulae are introduced. All are suitable for use with a pocket calculator.
1. Introduction

It is very useful to have simple approximations to the cumulative normal distribution. Almost every statistician carries a pocket calculator, but how many of us carry a book of tables everywhere we go? When working with a desk-top microcomputer it would be inefficient to have to stop to look up tables frequently. One could store a look-up table, but this would use up a lot of storage and the accuracy of interpolation in the table may not be good. The approximation does not have to be all that simple for the micro or programmable calculator, but clearly it should be very simple for manual operation of a calculator, and simplicity would also help you to remember the formula. All of the formulae used in this paper were calculated very easily on a Casio 502-P programmable pocket calculator.

A number of approximations have appeared in the literature. We compare the merits of some of them and introduce some new and very accurate formulae. The problem, formally stated, is that if $Z$ has the unit normal distribution we require approximations to the upper tail probability
$$Q(z) = P(Z > z) = \int_z^\infty \phi(x)\,dx$$

where

$$\phi(x) = (2\pi)^{-1/2} \exp(-x^2/2).$$
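For checking any of the approximations discussed in this paper, the exact tail probability can be evaluated numerically. The following is a minimal Python sketch, assuming the standard identity $Q(z) = \tfrac{1}{2}\,\mathrm{erfc}(z/\sqrt{2})$; the function names `phi` and `normal_tail` are illustrative and do not come from the paper.

```python
import math


def phi(x):
    """Unit normal density, (2*pi)^(-1/2) * exp(-x^2 / 2)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def normal_tail(z):
    """Upper tail probability Q(z) = P(Z > z), via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


if __name__ == "__main__":
    for z in (0.0, 1.0, 2.0, 3.0):
        print(z, normal_tail(z))
```

Using `erfc` rather than `1 - erf` avoids loss of significance for large $z$, where the tail probability is very small.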

We deal only with the case $z > 0$. The modifications needed for $z < 0$ will be obvious.

2. Some Alternative Approximations

2.1. Page's Formulae

Page (1977) offers three approximations of the form

$$Q_P(z) = 1/\{1 + \exp(2y)\}$$

where
$$y = a_1 z (1 + a_2 z^2).$$
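A minimal Python sketch of an approximation of this form follows. The default constants are the values $a_1 = 0.7988$ and $a_2 = 0.04417$ quoted for version (iii) in this section; the function name `page_tail` is an illustrative assumption.

```python
import math


def page_tail(z, a1=0.7988, a2=0.04417):
    """Page-type approximation Q_P(z) = 1 / (1 + exp(2y)), y = a1*z*(1 + a2*z^2).

    The defaults are the version (iii) constants; pass other values of
    a1 and a2 to try a different choice.
    """
    y = a1 * z * (1.0 + a2 * z * z)
    return 1.0 / (1.0 + math.exp(2.0 * y))


if __name__ == "__main__":
    for z in (0.5, 1.0, 2.0, 3.0):
        exact = 0.5 * math.erfc(z / math.sqrt(2.0))  # reference value
        print(z, page_tail(z), exact)
```

Setting `a2=0` and `a1=math.sqrt(2 / math.pi)` gives the simplest choice of constants, version (i).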

There are three possibilities suggested for the choice of constants $a_1$, $a_2$. These are, in increasing order of complexity,
(i) $a_1 = \sqrt{2/\pi}$, $a_2 = 0$

(iii) $a_1 = 0.7988$, $a_2 = 0.04417$

The first of these, given by Tocher (1963), is not very accurate and is not considered further. There is little to choose between the other two. Alternative (ii) is slightly simpler in requiring only one numerical constant, but both are extremely easy to calculate. Alternative (iii) is very slightly more accurate overall but (ii) is better in several places. We tabulate version (iii) in the third column of Table 1. The exact values given in column 2 are taken from Pearson and Hartley's (1954) Biometrika Tables. Version (iii) has a maximum absolute error of 0.00014. Its percentage error is less than 1 per cent up to $z = 2.3$. Thereafter it rises steadily to 9 per cent at $z = 3$ and 22 per cent at $z = 3.5$. Either of these versions is good, being simple and accurate, but I would not recommend them above $z = 2.5$.

2.2. Hamaker's Formula

A number of approximations are essentially modifications of a result given by Pólya (1946):

$$Q(z) \approx \tfrac{1}{2}\left[1 - \{1 - \exp(-2z^2/\pi)\}^{1/2}\right] \tag{3}$$

The version due to Hamaker (1978) is

$$Q_{HAM}(z) = \tfrac{1}{2}\left[1 - \{1 - \exp(-t^2)\}^{1/2}\right] \tag{4}$$

where

$$t = 0.806z(1 - 0.018z) \tag{5}$$

Note that 0.806 is close to $(2/\pi)^{1/2} = 0.7979$, so it represents quite a small change from equation (3). This function is tabulated in column 4 of Table 1. It is just as simple to calculate as Page's version (iii). It is not quite as accurate as $Q_P$ for $z < 2$, having a maximum absolute error of 0.00061, but is considerably better for larger values of $z$. The relative error remains below 1 per cent up to $z = 3.5$, reaches 2.3 per cent at $z = 4$, and thereafter grows steadily to 10 per cent at $z = 4.5$ and keeps on growing. If you want one simple formula that is reasonably accurate over quite a wide range, not above $z = 4$, then this can be recommended.
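Equations (3) to (5) can be sketched in Python as below, so that the effect of Hamaker's modified argument $t$ can be compared with Pólya's original. This is only an illustration under the formulae as stated above; the names `polya_tail` and `hamaker_tail` are assumptions.

```python
import math


def polya_tail(z):
    """Polya's result, equation (3)."""
    return 0.5 * (1.0 - math.sqrt(1.0 - math.exp(-2.0 * z * z / math.pi)))


def hamaker_tail(z):
    """Hamaker's formula, equations (4) and (5)."""
    t = 0.806 * z * (1.0 - 0.018 * z)
    return 0.5 * (1.0 - math.sqrt(1.0 - math.exp(-t * t)))


if __name__ == "__main__":
    for z in (0.5, 1.0, 2.0, 3.0, 4.0):
        exact = 0.5 * math.erfc(z / math.sqrt(2.0))  # reference value
        print(z, polya_tail(z), hamaker_tail(z), exact)
```

The two functions differ only in the quantity that is squared inside the exponential, which is what makes Hamaker's version a small change from equation (3).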

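Table 1. Exact and various approximate tail probabilities of the normal distribution. Columns: $z$; Exact; Page, $Q_P$; Hamaker, $Q_{HAM}$; Lew, $Q_{L1}$ and $Q_{L2}$; new formulae, $Q_{H1}$ and $Q_{H2}$.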

2.3. Lew's Formulae

It is difficult for one simple approximation to be satisfactory for all values of $z$. Lew (1981) overcomes this problem by suggesting two formulae, one for small $z$ and one for large $z$. He was also concerned to produce ultrasimple, easy to remember, formulae. To this end he suggests

$$Q_{L1}(z) = \tfrac{1}{2} - (2\pi)^{-1/2}(z - z^3/7) \tag{6}$$

and

$$Q_{L2}(z) = (1 + z)\phi(z)/(1 + z + z^2) \tag{7}$$

Lew recommends the use of (6) for $z \le 1$ and (7) for $z > 1$, although the former actually remains superior up to $z = 1.14$. They are tabulated in columns 5 and 6 of Table 1. $Q_{L1}(z)$, which is again a modification of (3), has a maximum absolute error of 0.00183 and relative error below 0.9 per cent in the range $0 < z < 1.1$. For larger $z$ the error rises rapidly.

In the range $z \ge 1.1$, $Q_{L2}(z)$ has a maximum error of 0.00254. The relative error is about 2 per cent over the range $1.1 \le z < 2$ and falls to about 1 per cent or less for all $z > 3$. It is very accurate for very large $z$, as shown in Table 2, with relative error less than 0.6 per cent for all $z > 4$.
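Lew's pair of formulae, with the switch at $z = 1$ that he recommends, can be sketched in Python as follows. The function names are illustrative assumptions; the formulae are equations (6) and (7) as written above.

```python
import math


def phi(x):
    """Unit normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def lew_small(z):
    """Equation (6), intended for small z."""
    return 0.5 - (z - z ** 3 / 7.0) / math.sqrt(2.0 * math.pi)


def lew_large(z):
    """Equation (7), intended for large z."""
    return (1.0 + z) * phi(z) / (1.0 + z + z * z)


def lew_tail(z):
    """Lew's recommended rule: (6) for z <= 1, (7) for z > 1."""
    return lew_small(z) if z <= 1.0 else lew_large(z)


if __name__ == "__main__":
    for z in (0.5, 1.0, 1.5, 3.0, 6.0):
        exact = 0.5 * math.erfc(z / math.sqrt(2.0))  # reference value
        print(z, lew_tail(z), exact)
```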

Table 2. Normal tail probabilities for large $z$. Columns: $z$; $Q$; $Q_{L2}$; $Q_{H2}$; negative exponent.

Note: The final column gives the negative power of 10 by which the figures under $Q$, $Q_{L2}$ or $Q_{H2}$ must be multiplied to give the probability. Thus, for example, $Q(9) = 1129 \times 10^{-22}$.

$Q_{L1}$ is less accurate than $Q_P$ and, in view also of the relatively poor performance of $Q_{L2}$ for $z < 2$, I suggest a better approach would be to use $Q_P$ for $z \le 2.4$ and $Q_{L2}$ for $z > 2.4$. This dual procedure would have a maximum absolute error of 0.00014 and maximum relative error 1.4 per cent, compared with values of 0.00265 and 2 per cent respectively for the procedure suggested by Lew.
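The dual procedure just described, Page's version (iii) up to $z = 2.4$ and Lew's second formula beyond it, might be coded as in the sketch below. The name `dual_tail` and the keyword `switch` are assumptions made for illustration.

```python
import math


def page_tail(z, a1=0.7988, a2=0.04417):
    """Page's version (iii): 1 / (1 + exp(2y)), y = a1*z*(1 + a2*z^2)."""
    y = a1 * z * (1.0 + a2 * z * z)
    return 1.0 / (1.0 + math.exp(2.0 * y))


def lew_large(z):
    """Lew's second formula, equation (7)."""
    density = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return (1.0 + z) * density / (1.0 + z + z * z)


def dual_tail(z, switch=2.4):
    """Use Page's formula for z <= switch and Lew's equation (7) above it."""
    return page_tail(z) if z <= switch else lew_large(z)


if __name__ == "__main__":
    for z in (1.0, 2.4, 2.5, 5.0):
        exact = 0.5 * math.erfc(z / math.sqrt(2.0))  # reference value
        print(z, dual_tail(z), exact)
```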

3. Some New Formulae

The procedures discussed above will be adequate for many purposes, but it is possible to achieve much greater accuracy with formulae only marginally more complicated and well within the scope of a programmable pocket calculator. They are certainly to be preferred for use on a microcomputer.
For small $z$ I propose

$$Q_{H1}(z) = \tfrac{1}{2}\left[1 - \{1 - \exp(-2t^2/\pi)\}^{1/2}\right] \tag{8}$$

where

$$t = z - (7.5166 \times 10^{-3})z^3 + (3.1737 \times 10^{-4})z^5 - (2.9657 \times 10^{-6})z^7 \tag{9}$$
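Equations (8) and (9) translate directly into a few lines of code. The following Python sketch is an illustration only; the function name `hawkes_small_tail` is an assumption, and the polynomial is written in nested form simply to reduce the number of multiplications.

```python
import math


def hawkes_small_tail(z):
    """Small-z formula, equations (8) and (9)."""
    z2 = z * z
    # t = z - 7.5166e-3*z^3 + 3.1737e-4*z^5 - 2.9657e-6*z^7, in nested form
    t = z * (1.0 + z2 * (-7.5166e-3 + z2 * (3.1737e-4 - 2.9657e-6 * z2)))
    return 0.5 * (1.0 - math.sqrt(1.0 - math.exp(-2.0 * t * t / math.pi)))


if __name__ == "__main__":
    for z in (0.5, 1.0, 2.0, 3.0, 4.0):
        exact = 0.5 * math.erfc(z / math.sqrt(2.0))  # reference value
        print(z, hawkes_small_tail(z), exact)
```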

Formula (8) may be thought of as a generalization of Hamaker's formula and is closely related to Bailey's (1981) solution to the inverse problem of computing quantiles. It is tabulated in column 7 of Table 1. It has a maximum error of 0.000017 for $z < 4$. The relative error is less than 0.1 per cent for $z < 2.2$, growing to 1.3 per cent at $z = 3$ and 18 per cent at $z = 4$.

For large $z$ I propose a minor modification of Lew's second formula, given by equation (7), namely

This is tabulated in column 8 of Table 1 and in Table 2. The aim was to improve the relatively poor performance of $Q_{L2}$ for $z < 3$ while being asymptotically equivalent to it for larger $z$. In fact we see from Table 2 that it does even better than the already outstanding performance of $Q_{L2}$ for $z > 4$, having a relative error of at most 0.1 per cent right up to $z = 20$. For larger values of $z$ we go beyond the range of accuracy of my calculator. In the range $2 < z < 4$, $Q_{H2}$ differs from $Q$ by at most 1 in the fourth significant figure, giving a relative error below 0.1 per cent in this range also.

A dual procedure using $Q_{H1}$ for $0 \le z \le 2.2$ and $Q_{H2}$ for $z > 2.2$ thus has a maximum absolute error of 0.00001 and relative error less than 0.1 per cent for the whole range from 0 to 20. This degree of accuracy is surely sufficient for almost any application. If you are prepared to sacrifice a little accuracy for a slightly simpler calculation, the dual procedure suggested at the end of section 2 may be used. If you want it simpler still, then Hamaker gives one very simple formula which has reasonable accuracy over quite a wide range.

REFERENCES

BAILEY, B. J. R. (1981). Alternatives to Hastings' approximation to the inverse of the normal cumulative distribution function. Applied Statistics, 30, 275-6.
HAMAKER, H. C. (1978). Approximating the cumulative normal distribution and its inverse. Applied Statistics, 27, 76-7.
LEW, R. A. (1981). An approximation to the cumulative normal distribution with simple coefficients. Applied Statistics, 30, 299-301.
PAGE, E. (1977). Approximations to the cumulative normal function and its inverse for use on a pocket calculator. Applied Statistics, 26, 75-6.
PEARSON, E. S. and HARTLEY, H. O. (1954). Biometrika Tables for Statisticians. Cambridge University Press.
POLYA, G. (1946). Remarks on computing the probability integral in one and two dimensions. Proceedings of the 1st Berkeley Symposium on Math. Statist. Prob., pp. 63-78.
TOCHER, K. D. (1963). The Art of Simulation. English Universities Press, London.