
Journal of Statistical Planning and Inference 137 (2007) 1900 – 1913

www.elsevier.com/locate/jspi

The Pitman estimator of the Cauchy location parameter


Gabriela V. Cohen Freue
Department of Statistics, University of British Columbia, Canada. E-mail: gcohen@stat.ubc.ca

Received 16 September 2003; accepted 19 May 2006


Available online 16 June 2006

Abstract
This paper examines the estimation of the Cauchy location parameter when the scale parameter is known. Using the squared
error loss function, a closed form of the minimum risk equivariant (MRE) estimator is derived. While several properties of the
estimator are discussed, this article focuses particularly on the efficiency of the MRE (or the Pitman) estimator in finite samples. A
simulation study indicates that the gain in efficiency when using the MRE estimator rather than the MLE is particularly important
in small samples. We also compare the performance of the Pitman estimator with that of other equivariant estimators, including
M- and L-estimators and some approximations to the Pitman estimator. In addition, we study the unconditional and conditional
inference of the Cauchy location parameter based on both the MLE and the Pitman estimator. We further assess the small sample
robustness properties of the estimators included in the simulation by contaminating the Cauchy distribution at three different levels
of contamination.
© 2006 Elsevier B.V. All rights reserved.

MSC: 62F10; 62F03

Keywords: Pitman estimator; MRE estimator; Cauchy; Location parameter

1. Introduction

This article focuses on the estimation of the location parameter of the Cauchy distribution when the scale pa-
rameter is known. Given the location invariance of this distribution, it is natural to restrict attention to the class of
location equivariant estimators. We derive the closed form of the minimum risk equivariant (MRE) estimator under
the squared error loss function and we discuss some of its properties for the Cauchy family. Finding a MRE estima-
tor can be especially relevant for small samples, where other estimators may perform poorly (e.g., Barnett, 1966a,b;
Barndorff-Nielsen and Cox, 1994; Ventura, 1998; Giummolè and Ventura, 2002). In addition, the variance of this esti-
mator serves as an attainable lower bound for the variance of other unbiased estimators. We also perform a simulation
study to compare it with other estimators suggested in the literature and we address the inference problem based on this
estimator.
The most natural estimator in the class of location equivariant estimators, the sample mean, is not useful for the
Cauchy location family because its distribution is again Cauchy regardless of the sample size, and thus it has no finite
moments. The simplest alternative, the sample median, is unbiased and consistent but inefficient (Rothenberg
et al., 1964). To improve on the median, previous research has developed more efficient estimators based on different
sets of order statistics, but the resulting gains in efficiency are limited (e.g., Rothenberg et al., 1964; Barnett, 1966a;
Bloch, 1966; Chan, 1970; Balmer et al., 1974). Another estimator that has been extensively studied for the Cauchy
family is the maximum likelihood estimator, MLE (e.g., Barnett, 1966b; Reeds, 1985; Bai and Fu, 1987). However,
while this estimator is consistent and asymptotically efficient, it is difficult to compute and achieves low efficiency in
small samples (Reeds, 1985; Barnett, 1966a,b; Small et al., 2000). In addition, none of these estimators has minimum
risk in the class of location equivariant estimators. We add to previous work by studying the MRE estimator using
the squared error loss function, the case where (under certain regularity conditions) the MRE estimator reduces to the
Pitman estimator (Pitman, 1939; Lehmann, 1998).
There is an extensive literature on the Pitman estimator. Girshick and Savage (1951) showed that under general
regularity conditions it is minimax within the class of all (not necessarily equivariant) estimators. Karlin (1958) and
Stein (1959) proved its admissibility under different regularity conditions. In addition, Port and Stone (1974) and Stone
(1974) proved its efficiency and asymptotic normality, and Johns (1979) studied its robustness properties. Bondesson
(1975) examined the relation between the Pitman estimator and the uniform minimum variance estimator in location
families. The MRE estimator has also been examined under other loss functions (e.g., Brown, 1966; Parsian et al.,
1993) and extended to families with multivariate unknown parameters (e.g., Jensen and Foutz, 1991; Prabakaran and
Chandrasekar, 1994). Lehmann (1998) provides a comprehensive review on MRE and Pitman estimators. Here we
check that the Pitman estimator is minimax and, for sample sizes greater than 7, it is also admissible for the Cauchy
location family. While it is known that no uniform minimum variance unbiased estimator exists for this family (Kendall
and Stuart, 1977), we add to previous literature by proving that a uniform minimum variance estimator does not exist
either. Moreover, we show that, for the Cauchy family, the Pitman estimator has a bounded IF, i.e., it is B-robust.
As the Pitman estimator requires the solution of two integrals, it is in general intractable. Thus, many approximations
of this estimator have been developed (e.g., Easton, 1991; Ventura, 1998) or numerical methods can be used to
approximate these integrals. For the case of the Cauchy location family, we improve on approximate estimates by
providing a closed form of the Pitman estimator. In addition, we use this form in a simulation study where we compare
the Pitman estimator with other estimators suggested in the literature in terms of mean and median bias as well as mean-
squared error. In particular, we assess the magnitude of the gain in efficiency when using the Pitman estimator rather
than alternative estimators. Our results suggest that the gain in relative efficiency when using the Pitman estimator
instead of the MLE can be as high as 19% in small samples. We also investigate the robustness properties of the
estimators in finite sample sizes under contaminations of the Cauchy model.
While most of the literature has focused on the point estimation of the Cauchy location parameter, Haas et al.
(1970) and Lawless (1972) addressed the inference problem. In particular, Haas et al. (1970) provided a table with
the distribution of a pivotal quantity based on the MLE and used it to construct confidence intervals for the location
parameter. Lawless (1972), instead, considered the distribution of this pivotal quantity conditional on the observed
values of an ancillary statistic and compared the unconditional and conditional confidence levels of the intervals. While
the same conditional intervals would be obtained using any equivariant estimator, the unconditional inference depends
on the estimator used. Thus, we investigate the use of the Pitman estimator to make inference about the unknown
location parameter. In addition, we use our simulation to compare the unconditional and the conditional confidence
levels based on the Pitman estimator.
Our focus in this paper is on the case where only the location parameter needs to be estimated, a situation where
other available estimators typically suffer from numerical problems or achieve low efficiency. However, we can extend
the results derived in this paper to the case where both parameters of the Cauchy distribution are unknown. Since the
likelihood function is unimodal in that case, the problem is substantially different from the one examined here and is
being considered in a separate paper by the author. Related studies include Chan (1970), Koutrouvelis (1982) and
Tiku and Suresh (1992). The rest of the paper is organized as follows. In Section 2 we derive the closed form of the
Pitman estimator. In Section 3 we discuss some of its properties. Section 4 focuses on the inference problem of the
Cauchy location parameter based on the Pitman estimator. In Section 5 we discuss the results of our simulation where
we examine both the point estimation and the inference problems. Section 6 concludes.

2. A closed form of the MRE estimator

In this section we derive a closed form of the MRE estimator under the squared error loss function for the location
parameter of the Cauchy distribution when the scale parameter is known. Using Theorem 1.20 and Problem 1.21 in
Lehmann (1998), it is easy to show that this estimator exists, is unique, and is given by the Pitman estimator
\[
\theta^{*}(\mathbf{X}) = \frac{\int_{-\infty}^{+\infty} u \prod_{k=1}^{n} \dfrac{1}{\pi}\,\dfrac{\sigma}{(X_k - u)^2 + \sigma^2}\, du}{\int_{-\infty}^{+\infty} \prod_{k=1}^{n} \dfrac{1}{\pi}\,\dfrac{\sigma}{(X_k - u)^2 + \sigma^2}\, du}, \tag{1}
\]

where X = (X_1, X_2, ..., X_n) is a sample of independent and identically distributed random variables having a joint
density
\[
f_{\theta}(\mathbf{x}) = \prod_{k=1}^{n} \frac{1}{\pi}\,\frac{\sigma}{(x_k - \theta)^2 + \sigma^2}. \tag{2}
\]

We can obtain an exact closed formula for the Pitman estimator θ*(X) defined in (1) using the Residue Theorem to
solve integrals of the form
\[
I_r(\mathbf{x}) = \int_{-\infty}^{+\infty} u^{r} \prod_{k=1}^{n} \frac{1}{\pi}\,\frac{\sigma}{(x_k - u)^2 + \sigma^2}\, du, \qquad 0 \le r < 2n-1. \tag{3}
\]

A similar method was used by Spiegelhalter (1985), Hanson and Wolf (1996) and Howlander and Weiss (1988) in a
Bayesian context.
From now on, i denotes the imaginary number √(−1). It is easy to verify that
\[
I_r(\mathbf{x}) = \lim_{R\to\infty} \int_{C_R} g_r(z)\, dz, \qquad 0 \le r < 2n-1,
\]
where
\[
g_r(z) = z^{r} \prod_{k=1}^{n} \frac{1}{\pi}\,\frac{\sigma}{(x_k - z)^2 + \sigma^2} \quad\text{and}\quad C_R:\ z = R\,e^{it},\ 0 \le t \le \pi,\ R \in \mathbb{R}.
\]

Note that the curve C_R defines, in polar coordinates, the contour given by the interval [−R, R] followed by the upper
half of a circumference centered at 0 with radius R. Then, using the Residue Theorem we obtain
\[
I_r(\mathbf{x}) = 2\pi i \sum_{k=1}^{n} \operatorname{Res}(g_r, z_k^{+}) = \left(\frac{\sigma}{\pi}\right)^{n-1} \sum_{k=1}^{n} (z_k^{+})^{r} \prod_{j \neq k} \frac{1}{(z_k^{+} - z_j^{+})(z_k^{+} - z_j^{-})}, \qquad 0 \le r < 2n-1,
\]

where Res(g, z) is the residue of a function g at a point z, z_s^± = x_s ± iσ, for s = 1, ..., n, are the poles of g_r, and the
z_s^+ are those with positive imaginary part and thus lying inside C_R. Rearranging terms we can write the previous
equation as
\[
I_r(\mathbf{x}) = \left(\frac{\sigma}{\pi}\right)^{n-1} \sum_{k=1}^{n} (z_k^{+})^{r} \lambda_k, \quad\text{where}\quad \lambda_k = \prod_{j \neq k} \frac{1}{(x_k - x_j)^2 + 4\sigma^2} \left(1 - \frac{2\sigma}{x_k - x_j}\, i\right). \tag{4}
\]
Note that the integrals defined in (3) are real for all 0 ≤ r < 2n − 1. In particular, I_0(x) = (σ/π)^{n−1} Σ_{k=1}^{n} λ_k is real,
which implies that
\[
\sum_{k=1}^{n} \operatorname{Re}\{i \lambda_k\} = 0, \tag{5}
\]

where Re{z} denotes the real part of a complex number z. Thus, the integrals in (4) for r = 0 and 1 reduce to
\[
I_0(\mathbf{x}) = \left(\frac{\sigma}{\pi}\right)^{n-1} \sum_{k=1}^{n} \operatorname{Re}\{\lambda_k\} \quad\text{and}\quad I_1(\mathbf{x}) = \left(\frac{\sigma}{\pi}\right)^{n-1} \sum_{k=1}^{n} x_k \operatorname{Re}\{\lambda_k\} \tag{6}
\]
and the Pitman estimate for the Cauchy location parameter has the form of a weighted average of the observations:
\[
\theta^{*}(\mathbf{x}) = \frac{I_1(\mathbf{x})}{I_0(\mathbf{x})} = \sum_{k=1}^{n} x_k\, \frac{\operatorname{Re}\{\lambda_k\}}{\sum_{j=1}^{n} \operatorname{Re}\{\lambda_j\}}, \qquad n > 1. \tag{7}
\]
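As an illustration (not part of the original paper), the weighted-average form (7) can be implemented in a few lines of R, the software used later in Section 5. The following is a minimal sketch under stated assumptions: the function name pitman_cauchy is ours, the observations are assumed to be distinct, n > 1, and the scale σ is taken as known.

    # Minimal sketch of the closed form (7): theta*(x) = sum_k x_k Re{lambda_k} / sum_k Re{lambda_k},
    # with lambda_k = prod_{j != k} [1 - 2*sigma*i/(x_k - x_j)] / [(x_k - x_j)^2 + 4*sigma^2].
    pitman_cauchy <- function(x, sigma = 1) {
      n <- length(x)
      if (n < 2) stop("the closed form (7) requires n > 1")
      w <- numeric(n)
      for (k in 1:n) {
        d <- x[k] - x[-k]                                  # differences x_k - x_j, j != k
        lambda_k <- prod((1 - 2i * sigma / d) / (d^2 + 4 * sigma^2))
        w[k] <- Re(lambda_k)                               # weight Re{lambda_k}
      }
      sum(x * w) / sum(w)                                  # weighted average of the observations
    }

    # Small illustration: compare with the sample median on a Cauchy sample of size 5
    set.seed(1)
    x <- rcauchy(5, location = 0, scale = 1)
    c(pitman = pitman_cauchy(x), median = median(x))

For n = 2 the two weights are equal and the sketch returns the midpoint of the observations, as expected by symmetry.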

3. Properties

In this section we review and summarize the main properties of the Pitman estimator of the location parameter that
hold for the Cauchy family when the scale parameter is known. We assume, without loss of generality, that the known
scale parameter σ is equal to 1.
In addition to having the minimum risk in the class of equivariant estimators, it is easy to verify that the Pitman
estimator minimizes the maximum risk among all other estimators (not necessarily equivariant) of the Cauchy location
parameter. That is, the Pitman estimator is minimax for the Cauchy location family. Furthermore, according to Stein
(1959), the Pitman estimator is admissible if there exists an equivariant estimator with finite third moment. Thus, for
n ≥ 7, the Pitman estimator of the Cauchy location parameter is admissible, as the median is an equivariant estimator
with finite third moment.
Another important property is that the Pitman estimator of the Cauchy location parameter is unbiased. This easily
follows from the uniqueness of the MRE estimator under the squared error loss function (Lehmann, 1998). Thus, under
this loss function, the risk of the Pitman estimator reduces to its variance which serves as an attainable lower bound for
the variance of other unbiased equivariant estimators. Note that most equivariant estimators suggested for this family
belong to this class, including the M-estimators (e.g., Barnett, 1966b; Goodall, 1983), most L-estimators (e.g., Barnett,
1966a; Rothenberg et al., 1964; Bloch, 1966) and the asymptotically best linear estimators based on a given spacing
(e.g., Balmer et al., 1974; Chan, 1970). In particular, this implies that, for finite sample sizes, the Pitman estimator
is more efficient than the MLE, which has always been considered as a gold standard for this family (e.g., Barnett,
1966a; Rothenberg et al., 1964; Bloch, 1966; Balmer et al., 1974; Chan, 1970). The following proposition states another
important consequence of the unbiasedness of the Pitman estimator.

Proposition 3.1. For the Cauchy location family a uniformly minimum variance (UMV) estimator does not exist.

Proof. For a general location parameter family, Bondesson (1975) proved that if a UMV estimator of the location
parameter exists, then the Pitman estimator is also a UMV estimator. However, if the Pitman estimator were a uniformly
minimum variance estimator for the Cauchy location family, then, being unbiased, it would in particular be a uniformly
minimum variance unbiased estimator (UMVUE). This is a contradiction, as there is no UMVUE in this location family
(Kendall and Stuart, 1977). Thus, a UMV estimator cannot exist in this case. □

Finally, in the following proposition we study the robustness of the Pitman estimator for the Cauchy location family
in terms of its influence function (IF). Heuristically, the IF measures the asymptotic bias of an estimator caused by an
infinitesimal contamination of the nominal distribution. If this function is bounded, the estimator is called B-robust. For
example, it is easy to see that the MLE of the Cauchy location parameter is B-robust and equivalent to Hampel's optimal
B-robust estimator (Hampel et al., 1986, pp. 104, 120).

Proposition 3.2. For the Cauchy location family, the Pitman estimator is B-robust.

Proof. It is easy to show that the Pitman estimator in (1) can be expressed as
\[
\theta^{*}(\mathbf{X}) = \frac{\int u \exp\{n\, h(u, F_n)\}\, du}{\int \exp\{n\, h(u, F_n)\}\, du}, \tag{8}
\]

where h(t, G) = ∫ log f(x − t) dG(x), f is the density function and F_n is the empirical cumulative distribution of the
Cauchy location family. Under certain regularity conditions, Johns (1979) approximated the integrals in (8) using a
Laplace approximation and showed that the functional T of the Pitman estimator is implicitly defined by
\[
\int \frac{f'(x - T(F))}{f(x - T(F))}\, dF(x) = 0, \tag{9}
\]

where F is the Cauchy cumulative distribution. Thus, to show that the Pitman estimator of the Cauchy location parameter
is B-robust, it suffices to see that the regularity conditions required to apply the Laplace formula are satisfied for this
family. More specifically, we need to show that h(t, F) has a unique maximum at t_F; h(t, F) tends to −∞ as t → ±∞;
its derivative h'(t) exists in some neighborhood of t_F; and h''(t_F) exists and is negative (Johns, 1979; De Bruijn,
1961). In particular, using standard results from real analysis, one can show that h satisfies these conditions for the
Cauchy location family. Thus, the Pitman location estimator is B-robust, as its IF depends on the function f'(x), which
is bounded. □

It is interesting to note that since B-robustness is an asymptotic property, a B-robust estimator may still be too
sensitive to slight contaminations of the model for small sample sizes. Thus, in Section 5.1 we study the finite sample
robustness properties of the estimators.
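As a finite-sample illustration of this point (our own sketch, not from the paper), one can plot the empirical sensitivity curve of the Pitman estimate, i.e., the standardized change in the estimate when one extra observation is placed at z; its boundedness in z is the finite-sample counterpart of a bounded IF. The snippet assumes the pitman_cauchy() sketch given in Section 2.

    # Empirical sensitivity curve SC(z) = (n + 1) * (theta*(x_1,...,x_n, z) - theta*(x_1,...,x_n))
    # for one fixed Cauchy sample; the curve stays bounded as z moves far into the tails.
    set.seed(2)
    x0 <- rcauchy(9)
    base <- pitman_cauchy(x0)
    z <- seq(-50, 50, by = 0.5)
    sc <- sapply(z, function(zz) (length(x0) + 1) * (pitman_cauchy(c(x0, zz)) - base))
    range(sc)   # remains bounded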

4. Inference based on the Pitman estimator

In this section we address the inference problem of the Cauchy location parameter using the Pitman estimator. More
specifically, we obtain percentage points of a pivotal quantity that enable us to construct CI based on this estimator.
Haas et al. (1970) previously derived these points for a pivotal quantity based on the MLE. For the sake of comparison,
we use the same simulated samples to derive the percentage points of both pivotal quantities and compare the CI based
on the Pitman estimator and the MLE.
Let Z_P = √n(θ* − θ) and Z_MLE = √n(θ̂ − θ), where n is the sample size and θ* and θ̂ are the Pitman estimator and
the MLE of the Cauchy location parameter θ, respectively. Given the equivariance of these estimators, the distributions
of Z_P and Z_MLE depend only on n; thus, these are pivotal quantities. Then, their percentage points can be used to
construct CI for θ. In particular, the 100(1 − α)% CI based on the Pitman estimator is given by (θ* ± z_(1−α/2)/√n),
where z_(1−α/2) is such that p = P(Z ≤ z_(1−α/2)) = 1 − α/2 for Z = Z_P. Similarly, we obtain CI based on the MLE by
replacing θ* with θ̂ and Z_P with Z_MLE. To obtain the percentage points shown in Table 1, we generate 80,000 Cauchy
samples using the rcauchy function in R (the statistical software we use in this study) with θ = 0, σ = 1 and n in
5:5:40. For each sample, we compute the MLE and the Pitman estimate (see Section 5) to construct the pivotal
quantities Z_P and Z_MLE and derive the points z_(1−α/2) for several values of α.
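A hedged sketch of the Monte Carlo scheme just described is given below. It reuses the pitman_cauchy() sketch from Section 2; the function names are ours, the 80,000 replications mirror the text, and the critical value 4.03 in the usage example is the tabulated point for n = 5 at p = 0.975 in Table 1.

    # Approximate percentage points of Z_P = sqrt(n) * (theta* - theta) under theta = 0, sigma = 1
    percentage_points <- function(n, nrep = 80000, probs = c(0.85, 0.90, 0.95, 0.975, 0.995)) {
      zp <- replicate(nrep, sqrt(n) * pitman_cauchy(rcauchy(n, location = 0, scale = 1)))
      quantile(zp, probs = probs)
    }

    # 95% CI for theta from an observed sample x of size 5, using z_(0.975) = 4.03 from Table 1
    ci_pitman <- function(x, z = 4.03) {
      est <- pitman_cauchy(x)
      est + c(lower = -1, upper = 1) * z / sqrt(length(x))
    }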
Table 1 shows that the CI based on the Pitman estimator are in general narrower than those based on the MLE.
However, the cumulative distribution of ZP does not dominate stochastically (at first order) that of ZMLE , in particular
for the case of n = 5. Thus, although the Pitman estimator is more efficient than the MLE, the expected widths of some

Table 1
Cumulative distribution of pivotal quantities ZP and ZMLE based on the Pitman estimator and the MLE, respectively, for different sample sizes n

n\p Pitman MLE

0.85 0.90 0.95 0.975 0.995 0.85 0.90 0.95 0.975 0.995

5 1.64 2.14 3.06 4.03 7.52 1.60 2.09 3.09 4.27 8.52
10 1.54 1.96 2.64 3.33 5.20 1.54 1.96 2.66 3.42 5.59
15 1.51 1.90 2.50 3.09 4.55 1.51 1.89 2.52 3.12 4.67
20 1.49 1.86 2.47 3.00 4.15 1.49 1.86 2.47 3.01 4.24
25 1.49 1.86 2.43 2.97 4.12 1.49 1.86 2.44 2.98 4.16
30 1.48 1.85 2.44 2.95 3.99 1.48 1.85 2.44 2.95 4.01
35 1.48 1.84 2.40 2.90 3.98 1.48 1.84 2.40 2.90 3.99
40 1.49 1.84 2.39 2.88 3.90 1.49 1.83 2.39 2.89 3.90
CI based on the Pitman estimator can be larger than those based on the MLE. As the sample size increases, both the
estimates and the widths of all CI become closer, and for n > 30 we obtain almost the same CI with both estimators.

5. Simulation results

We perform a simulation study to compare various estimators of the location parameter in terms of their bias, MSE, and
median bias. In addition, having the exact value of the Pitman estimate enables us to evaluate some approximations of this
estimator. We also investigate the robustness properties of the estimators in finite sample sizes by slightly contaminating
the Cauchy model. In Section 5.1 we summarize these results. In Section 5.2 we compare the unconditional and
conditional confidence levels of the (unconditional) confidence intervals based on the MLE and the Pitman estimator
constructed using Table 1.
We generate 2 × 10^4 samples of sizes 5 ≤ n ≤ 40 using the built-in R function rcauchy with the location parameter θ =
0 and the scale parameter σ = 1. For each random sample we compute the following 10 estimators of the Cauchy location
parameter. We program seven of these estimators that are not directly available in the software (the corresponding codes
are available from the author).

• Pitman: the Pitman estimator obtained using the closed form given in (7). Our algorithm involves simple matrix
operations and no loops, making it computationally inexpensive.
• MLE: the MLE is computationally expensive as the likelihood equation has multiple roots with no explicit solution
(Barnett, 1966b; Reeds, 1985; Small et al., 2000). Barnett (1966b) recommended scanning the whole log-likelihood
function to locate the global maximum. As this method is unfeasible to apply in a large simulation study, we design
an algorithm that is less computationally expensive and as accurate as the one suggested by Barnett. We use the
built-in R function optimize to locate a maximum, θ̂, which in some cases is only a local one (see Figs. 1(a) and
(b)). Assuming that there is no more than one local maximum in any interval of length smaller than 0.25 (Barnett,
1966b), we repeat the procedure in the intervals (x_(1) − 1, θ̂ − 0.25) and (θ̂ + 0.25, x_(n) + 1), where x_(1) and x_(n)
are the minimum and maximum observed values in the sample, respectively. We then compare the values of the
log-likelihood function at these three points and take the one with the largest value as the global maximum (a rough
sketch of this search is given after this list).
• r ∗ : the r ∗ -estimator is the solution of the modified directed likelihood r ∗ introduced by Barndorff-Nielsen (1986)
(see also Barndorff-Nielsen and Cox, 1994; Pace and Salvan, 1999; Giummolè and Ventura, 2002). As the r ∗
equation becomes numerically very unstable near the MLE, Giummolè and Ventura (2002) recommended using a
smoothing method to approximate it before solving it. We wrote a code in R to find the r ∗ -estimator following these
recommendations.
• PitApp: Ventura (1998) used a Laplace approximation to derive an approximation of the Pitman estimator that avoids
numerical integration. The resulting estimator is a simple function of the MLE.
• PitQ: the Pitman estimator can be approximated using numerical integration. Thus, we use the built-in R function
integrate to compute adaptive quadrature approximations of both the numerator and the denominator of (1).
• Hub: Huber's M-estimator (Huber, 1964) is a B-robust estimator which represents a compromise between the
sample mean and the sample median. We compute it using the R function hubers with the tuning constant
b = 0.4 so that it achieves almost the highest asymptotic efficiency, 88%, under the Cauchy model.
• Tuk: Tukey's biweight redescending estimator (Beaton and Tukey, 1974) is another B-robust estimator which
rejects extreme outliers entirely. Being a redescending estimator, it presents computational difficulties similar to those of the
MLE. Thus, we compute it using an algorithm similar to the one used to compute the MLE (see Fig. 1(d)). We set
its tuning constant c = 3.5 to achieve an asymptotic efficiency of 90% under the Cauchy model.
• Median: the sample median is the simplest L-estimator included in this study and the simplest alternative to the
sample mean.
• RFT: this L-estimator is the trimmed mean based on the middle 24% of the observations proposed by Rothenberg
et al. (1964) to improve on the efficiency of the sample median. We use the built-in R function mean with the argument
trim to compute it.
• MNBLE: this L-estimator is the modified nearly best linear estimator analyzed by Barnett (1966a) for sample sizes
5 ≤ n ≤ 20. We calculate the coefficients of this estimator for sample sizes 21 ≤ n ≤ 40 and write a code to compute
the estimator.
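The following is a rough sketch of the three-interval search described in the MLE item above. It is our own reconstruction, not the author's code; the names loglik_cauchy and mle_cauchy are ours, and the same assumptions as in the text are made (at most one additional local maximum within 0.25 of the first one found, and the global maximum lying in (x_(1) − 1, x_(n) + 1)).

    # Log-likelihood of the Cauchy location model with known scale sigma = 1
    loglik_cauchy <- function(theta, x) sum(dcauchy(x, location = theta, scale = 1, log = TRUE))

    mle_cauchy <- function(x) {
      lo <- min(x) - 1
      hi <- max(x) + 1
      # first pass over (x_(1) - 1, x_(n) + 1); optimize() may return a local maximum
      m1 <- optimize(loglik_cauchy, interval = c(lo, hi), x = x, maximum = TRUE)$maximum
      cand <- m1
      if (m1 - 0.25 > lo)  # repeat the search to the left of the first maximum found
        cand <- c(cand, optimize(loglik_cauchy, c(lo, m1 - 0.25), x = x, maximum = TRUE)$maximum)
      if (m1 + 0.25 < hi)  # and to the right
        cand <- c(cand, optimize(loglik_cauchy, c(m1 + 0.25, hi), x = x, maximum = TRUE)$maximum)
      cand[which.max(sapply(cand, loglik_cauchy, x = x))]  # keep the candidate with the largest log-likelihood
    }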
Fig. 1. (a) and (b) Log-likelihood function for a sample of size 5: the maximum located using our algorithm (MLE) and those found using the R
built-in functions nlm (nlm) and optimize (opt), respectively. (c) Log-likelihood function for a sample of size 40 with the global and a local
maximum identified. (d) Objective function corresponding to Tukey's biweight estimator: estimates found with our algorithm (Tuk) and the built-in
functions nlm (nlm) and optimize (opt).

5.1. Results

In this section we summarize the results of our simulation. We first compare the estimators in terms of their biases
and note that, for all sample sizes, the estimators included in the simulation are both mean and median unbiased (the
results are available from the author). Thus, their mean-squared errors reduce to the variances which are summarized
in Table 2. This table shows that the variance of the Pitman estimator is the lowest in the group for all sample sizes.
This is not surprising since this estimator has the minimum variance among all unbiased equivariant estimators (see
Section 3). In addition, the variances of all the estimators decrease as the sample size increases. In particular, we
note that, for all sample sizes, the variance of PitQ is almost identical to that of the Pitman estimator suggesting that
quadrature methods can be a good alternative to compute the integrals in (1). However, having the exact value of the
Pitman estimator enables us to examine the distribution of the absolute error between this approximation and the true
estimate. Table 3 shows that the quadrature method does not always perform satisfactorily, especially for small sample
sizes.
To help the visualization, we use two separate graphs, Figs. 2 and 3, to illustrate the “small-sample efficiency’’ of the
estimators, defined as the ratio of the Cramér–Rao lower bound to the variance of the estimator (Barnett, 1966a). For the
Cauchy location family, this quantity equals 2/(n Var(θ̃)), where Var(θ̃) is the variance of an estimator θ̃. Thus, we can
use it to explore the relation between the variance of the estimators and the sample size. In addition, the Cramér–Rao
lower bound is not attainable for the Cauchy location family as a uniform minimum variance unbiased estimator does
not exist (Kendall and Stuart, 1977). Thus, we use the small-sample efficiency to see how optimistic this bound is
Table 2
Variances of some equivariant estimators of the Cauchy location parameter for different sample sizes n

n Pitman MLE r∗ PitApp PitQ Hub Tuk Median RFT MNBLE

5 0.93614 1.11069 1.18959 1.58161 0.92354 1.18662 1.41508 1.20382 1.20382 1.20382
6 0.60568 0.70716 0.77923 1.19658 0.60593 0.83616 1.08267 0.83836 0.83836 0.83836
7 0.47601 0.53649 0.59208 0.79489 0.47677 0.60885 0.66351 0.62746 0.62746 0.62746
8 0.37601 0.41019 0.45523 0.53314 0.37589 0.47084 0.53532 0.47385 0.47385 0.46970
9 0.30816 0.32975 0.35951 1.04708 0.31000 0.37855 0.39002 0.39569 0.39569 0.37445
10 0.27274 0.28597 0.33499 0.37008 0.27269 0.33533 0.32734 0.33917 0.36998 0.32870
11 0.24158 0.25903 0.28113 0.44115 0.24037 0.29551 0.29644 0.31110 0.30065 0.28647
12 0.21634 0.22426 0.23667 0.33975 0.21612 0.25759 0.25471 0.26175 0.27485 0.24999
13 0.19273 0.19685 0.20636 0.22411 0.19270 0.23247 0.22209 0.24662 0.23416 0.22245
14 0.17519 0.18028 0.19023 0.21336 0.17520 0.21141 0.20037 0.21643 0.21568 0.20098
15 0.16099 0.16472 0.17059 0.18382 0.16100 0.19013 0.18512 0.20310 0.19140 0.18054
20 0.11475 0.11586 0.12683 0.13098 0.11476 0.13486 0.12827 0.13958 0.13777 0.12593
25 0.08852 0.08924 0.09017 0.08862 0.08852 0.10235 0.09779 0.11054 0.10351 0.09567
30 0.07239 0.07269 0.07321 0.07246 0.07239 0.08285 0.08006 0.08703 0.08352 0.07658
35 0.06204 0.06217 0.06245 0.06206 0.06204 0.07151 0.06884 0.07673 0.07221 0.06573
40 0.05261 0.05270 0.05290 0.05259 0.05260 0.06036 0.05860 0.06398 0.06069 0.05506

Table 3
Quantiles of the absolute value of the error of PitQ which approximates the Pitman estimator using numerical integration

n Min 0.5 0.75 0.9 0.95 0.99 Max

5 5.55E − 17 1.21E − 06 7.91E − 05 0.00211 0.01085 0.15527 11.32015
10 2.89E − 15 1.03E − 06 1.14E − 05 9.70E − 05 0.00034 0.00336 0.31627
15 2.39E − 13 1.99E − 07 2.24E − 06 2.13E − 05 7.49E − 05 0.00054 0.03516
20 4.36E − 15 9.21E − 08 1.04E − 06 8.16E − 06 3.25E − 05 0.00032 0.01292
25 1.07E − 16 5.47E − 08 8.30E − 07 5.80E − 06 2.09E − 05 0.00024 0.00282

for the Cauchy location family. We do not include PitQ in these graphs as it has almost the same efficiency as that of
Pitman.
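As an aside (our own sketch, not from the paper), the small-sample efficiency 2/(n Var(θ̃)) plotted in Figs. 2 and 3 can be approximated by Monte Carlo for any of the estimators; the snippet below does so for the Pitman estimator, reusing the pitman_cauchy() sketch from Section 2.

    # Monte Carlo small-sample efficiency 2 / (n * Var) of the Pitman estimator
    small_sample_eff <- function(n, nrep = 20000) {
      est <- replicate(nrep, pitman_cauchy(rcauchy(n, location = 0, scale = 1)))
      2 / (n * var(est))
    }
    # e.g. small_sample_eff(5) should be close to 2 / (5 * 0.936), about 0.43, from Table 2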
These graphs show that, for small-sample sizes, the Cramér–Rao bound is extremely low compared to the variance
of the Pitman estimator which is the minimum attainable in the class of unbiased equivariant estimators. As the sample
size increases, the variance of the Pitman estimator decreases towards this bound and both get very close for n ≥ 25
where the “small-sample efficiency’’ of the Pitman estimator is almost always above 0.9. Fig. 2 includes the Pitman
estimator, the MLE, the PitApp and the r ∗ -estimator. It is interesting to note that for sample sizes greater than 22, all
the estimators have almost the same efficiency. In addition, Fig. 2 shows that the efficiency of the MLE is not high for
small-sample sizes (see also Barnett, 1966a,b). Fig. 4 explicitly shows how much we gain by using the Pitman estimator
rather than the MLE. We can see that for small sample sizes the gain in relative efficiency can be relatively large, and
as high as 19% for n = 5.
As the PitApp and the r ∗ have been proposed to improve the small-sample performance of the MLE (see previous
references), we include these estimators in the same plot and examine their efficiencies for the Cauchy location family.
However, we can see that for this family, the efficiency of the MLE is improved by neither the r ∗ nor the PitApp
estimator. The main disadvantage of both the PitApp and the r ∗ is that, for the Cauchy location distribution, the
log-likelihood function is multimodal (Reeds, 1985; Small et al., 2000; and Figs. 1(a)–(c)). For the case of the PitApp,
this implies that the main regularity condition to apply a Laplace approximation is not satisfied. Thus, the resulting
approximation of the Pitman estimator is not very accurate for this family, in particular for small-sample sizes. Reeds
(1985) proved that the number of local maxima (different from the global one) remains positive even for large values
of n (see Fig. 1(c)). However, as the sample size increases, the log-likelihood function shows a sharp peak at the global
maximum and this point becomes more distinct. Thus, the approximation becomes more accurate and the efficiency of
PitApp increases fast towards that of the Pitman estimator as the sample size increases. In addition, the log-likelihood

Fig. 2. Small sample efficiencies (CRLB/Var) of the Pitman estimator, the MLE and the estimators based on the MLE (r ∗ and PitApp).

function may have a local maximum (different from the global one) very close to the global maximum (see Fig. 1(a)).
Thus, the computation of r ∗ is too sensitive to the grid chosen to approximate the equation and the resulting estimator
becomes very unstable for this family. In addition, both methods require the computation of the MLE and thus inherit
its numerical problems.
Fig. 3 includes all B-robust estimators: the Pitman estimator, the MLE, Huber’s and Tukey’s M-estimators and the
L-estimators (Median, RFT and MNBLE). Not surprisingly, the Pitman estimator is the most efficient in this group of
estimators for all sample sizes. In addition, recall that the MLE minimizes the asymptotic variance among all estimators
sharing the same bound for the IF, i.e., it is equivalent to Hampel's optimal B-robust estimator. Fig. 3 shows that even
for finite sample sizes, the variance of the MLE remains below that of the other B-robust estimators. Huber’s estimator
(Hub) performs poorly compared with other M-estimators (Tuk and MLE). Moreover, in the class of L-estimators, the
RFT and the MNBLE have been proposed as simpler alternatives to the MLE that improve on the efficiency of the sample
median (e.g., Barnett, 1966a). It is interesting to note that while, in general, Hub and Tuk perform better than the RFT
and the Median, all these estimators are less efficient than the MNBLE. Fig. 3 shows that while for sample sizes greater
than 20 the relative efficiency of Hub, Tuk, RFT and Median remains almost constant about its asymptotic value, that
of the MNBLE increases with the sample size.
Finally, we study the robustness properties of the estimators in finite sample sizes. Although we showed that some
of the estimators are B-robust, this concept is asymptotic and the estimators may still be too sensitive to slight contam-
inations in the model when the sample size is small. Thus, we contaminate the Cauchy distribution at three different
levels of contamination and three different sample sizes. To have the same interpretation of the unknown parameter,
we use a symmetric distribution with larger tails as the contaminating distribution. In particular, we choose a Cauchy
distribution with location parameter θ = 0 and scale parameter σ = 10 as the contaminating distribution.
We generate 20,000 samples of size n = 5 (low), n = 20 (moderate) and n = 40 (large) from the following distribution:
\[
F_{\varepsilon}(x) = (1 - \varepsilon)\, C_1(x) + \varepsilon\, C_{10}(x), \tag{10}
\]



Fig. 3. Small sample efficiencies (CRLB/Var) of the Pitman estimator, the MLE, Huber's estimator, Tukey's estimator and the L-estimators (Median, RFT and MNBLE).

Fig. 4. Gain in efficiency (%) of the Pitman estimator relative to the MLE.



Table 4
Ratio between variances from contaminated and original samples

n Pitman MLE r∗ PitApp PitQ Hub Tuk Median RFT MNBLE

ε = 0.1
5 2.013 2.175 2.118 2.356 1.723 2.196 2.014 2.178 2.178 2.178
20 1.211 1.210 1.129 1.084 1.211 1.326 1.219 1.309 1.373 1.284
40 1.190 1.173 1.173 1.174 1.174 1.276 1.169 1.259 1.285 1.226

ε = 0.2
5 4.219 5.799 5.525 5.279 3.485 4.422 3.798 4.371 4.371 4.371
20 1.478 1.499 1.421 1.587 1.478 1.810 1.565 1.774 1.979 1.897
40 1.393 1.387 1.387 1.392 1.393 1.663 1.384 1.614 1.718 1.642

ε = 0.4
5 12.831 14.230 13.423 11.911 9.869 15.721 14.181 15.510 15.510 15.510
20 2.920 3.049 2.888 4.796 2.936 4.076 4.364 3.953 5.143 4.334
40 2.204 2.200 2.233 2.244 2.205 3.193 2.241 3.052 3.619 3.592

where C_1(x) and C_{10}(x) are the Cauchy distributions with location parameter θ = 0 and scale parameter σ equal to 1
and 10, respectively. We explore the effect of the contamination on the estimators when the level of contamination is
low (ε = 0.1), moderate (ε = 0.2) and large (ε = 0.4). Table 4 summarizes the ratio between the variances from the
contaminated and the original samples for different sample sizes.
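For completeness, a minimal sketch (ours, not the author's code) of sampling from the mixture (10) is:

    # Draw a sample of size n from F_eps = (1 - eps) * Cauchy(0, 1) + eps * Cauchy(0, 10)
    rcauchy_contaminated <- function(n, eps) {
      rcauchy(n, location = 0, scale = ifelse(runif(n) < eps, 10, 1))
    }
    # e.g. variance ratio of the Pitman estimator for n = 20, eps = 0.1 (compare with Table 4):
    # var(replicate(20000, pitman_cauchy(rcauchy_contaminated(20, 0.1)))) /
    #   var(replicate(20000, pitman_cauchy(rcauchy(20))))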
The biases of all the estimators are almost unaffected by the contamination; thus, the MSEs are approximately equal
to the variances in all cases (the results are available from the author). Table 4 shows that the variances of the
estimators increase with the level of contamination and that this sensitivity decreases with the sample size. We can see that
when n = 40, the MLE and the Pitman estimator have the smallest variances among the B-robust estimators (Hub,
Tuk, Median, RFT and MNBLE). Although the Pitman estimator has the smallest variance among all estimators even
in the contaminated samples, the change in its estimated variance is not the smallest. For example, Table 4 shows that
the ratio between the variances from the 10% contaminated and the original samples of size 20 is 1.211 for the Pitman
estimator and 1.129 for the r ∗ -estimator. In general, while Tuk is not the most efficient among the B-robust estimators
under the Cauchy model, it seems to be more resistant (in terms of its variance) than Hub and the L-estimators.

5.2. Inference on the location parameter

In this section we use the results of the simulation to compare the unconditional and conditional confidence levels of
the (unconditional) CI based on the Pitman estimator and the MLE. Although the case based on the MLE was previously
examined by Lawless (1972), for the sake of comparison we compute the conditional levels of the CI based on both
estimators using the same simulated samples.
Let Z = θ̃ − θ, where θ̃ is any equivariant estimator based on x = (x_1, x_2, ..., x_n) from the joint density in (2). Given
the equivariance of θ̃, it is easy to show that Z is a pivotal quantity, a_i = x_i − θ̃, i = 1, ..., n − 1, is a set of (n − 1)
independent ancillary statistics, and a_n = x_n − θ̃ can be written as a function of a_1, ..., a_{n−1}. Thus, the conditional
confidence level, given a = (a_1, ..., a_n), of a confidence interval with unconditional level 1 − α is given by
\[
P(-l \le Z \le l \mid a_1, \ldots, a_{n-1}) = \frac{C}{\pi^{n}} \int_{\tilde\theta - l}^{\tilde\theta + l} \prod_{i=1}^{n} \left[1 + (x_i - \theta)^2\right]^{-1} d\theta, \tag{11}
\]
where C = C(a) is a normalizing constant given by
\[
C^{-1} = \frac{1}{\pi^{n}} \int_{-\infty}^{+\infty} \prod_{i=1}^{n} \left[1 + (t + a_i)^2\right]^{-1} dt \tag{12}
\]
and l = z_(1−α/2)/√n, where the z_(1−α/2) are the critical values given in Table 1 (see Lawless, 1972; Pace and Salvan, 1997;
Lawless, 2003 for further details).

Table 5
Conditional confidence levels of the (unconditional) 95% confidence intervals based on the Pitman estimator, for different sample sizes n

n < 0.8 0.8–0.85 0.85–0.9 0.9–0.92 0.92–0.93 0.93–0.94 0.94–0.96 0.96–0.97 0.97–0.98 > 0.98 Mean

5 0.0463 0.0291 0.0563 0.0403 0.0241 0.0277 0.0834 0.0687 0.1109 0.5135 0.9505
10 0.0273 0.0249 0.0615 0.0464 0.0317 0.0446 0.1300 0.1006 0.1441 0.3891 0.9531
15 0.0206 0.0250 0.0693 0.0562 0.0426 0.0544 0.1747 0.1304 0.1645 0.2625 0.9495
20 0.0134 0.0209 0.0715 0.0666 0.0465 0.0625 0.1963 0.1433 0.1767 0.2025 0.9491
25 0.0081 0.0177 0.0614 0.0614 0.0523 0.0682 0.2150 0.1548 0.1839 0.1775 0.9509
30 0.0049 0.0124 0.0574 0.0651 0.0507 0.0692 0.2349 0.1723 0.1891 0.1444 0.9514
35 0.0038 0.0104 0.0607 0.0682 0.0569 0.0803 0.2649 0.1742 0.1754 0.1053 0.9495
40 0.0031 0.0085 0.0509 0.0703 0.0543 0.0847 0.2819 0.1925 0.1705 0.0834 0.9500

Table 6
Conditional confidence levels of the (unconditional) 95% confidence intervals based on the MLE, for different sample sizes n

n < 0.8 0.8–0.85 0.85–0.9 0.9–0.92 0.92–0.93 0.93–0.94 0.94–0.96 0.96–0.97 0.97–0.98 > 0.98 Mean

5 0.0538 0.0309 0.0609 0.0382 0.0218 0.0264 0.0740 0.0585 0.0927 0.5430 0.9510
10 0.0302 0.0249 0.0597 0.0437 0.0308 0.0404 0.1205 0.0953 0.1350 0.4197 0.9540
15 0.0218 0.0254 0.0690 0.0561 0.0420 0.0521 0.1708 0.1281 0.1627 0.2722 0.9497
20 0.0143 0.0221 0.0723 0.0678 0.0447 0.0641 0.1945 0.1422 0.1752 0.2030 0.9487
25 0.0086 0.0179 0.0633 0.0621 0.0517 0.0683 0.2155 0.1524 0.1831 0.1774 0.9506
30 0.0052 0.0125 0.0585 0.0663 0.0503 0.0695 0.2347 0.1720 0.1876 0.1436 0.9511
35 0.0040 0.0107 0.0621 0.0699 0.0564 0.0810 0.2651 0.1735 0.1738 0.1038 0.9492
40 0.0030 0.0085 0.0507 0.0683 0.0546 0.0838 0.2801 0.1911 0.1731 0.0872 0.9503

As discussed previously in this section, for each generated sample of size n in 5:5:40 we compute the Pitman estimator
and the MLE. In addition, using the critical values shown in Table 1, we compute the unconditional 95% confidence
interval based on both estimators. For example, the Pitman estimate and the MLE corresponding to the first simulated
sample of size 5 are 0.266 and 0.799, respectively. Then, the 95% confidence intervals based on the Pitman estimator and
the MLE are given by (0.266 ± 4.029/√5) and (0.799 ± 4.266/√5), respectively. Evaluating (11) and (12), we compute the
conditional confidence levels of the intervals based on the Pitman estimator and the MLE, which are summarized in
Tables 5 and 6, respectively. To reduce numerical problems, we use a procedure based on the Residue Theorem similar
to the one described in Section 2 to compute the normalizing constant in (12). Finally, we use numerical integration to
solve the integral in (11).
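A hedged numerical sketch of (11) and (12) is given below; unlike the paper, it evaluates both integrals with integrate() rather than using residues for (12), and the function name conditional_level is ours.

    # Conditional confidence level (11) for a sample x, an equivariant estimate theta_tilde
    # and half-width l = z_(1-alpha/2)/sqrt(n), with sigma = 1; the pi^n factors in (11)-(12) cancel.
    conditional_level <- function(x, theta_tilde, l) {
      joint <- function(theta) sapply(theta, function(t) prod(1 / (1 + (x - t)^2)))
      denom <- integrate(joint, lower = -Inf, upper = Inf)$value               # proportional to 1/C
      num   <- integrate(joint, lower = theta_tilde - l, upper = theta_tilde + l)$value
      num / denom
    }
    # e.g., for a sample x of size 5 with Pitman estimate 0.266 (as in the text):
    # conditional_level(x, theta_tilde = 0.266, l = 4.029 / sqrt(5))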
From Tables 5 and 6, we can see that for small sample sizes the distribution of the conditional confidence levels
of the 95% CI is not highly concentrated around the average value. This dispersion is even more pronounced for the
conditional levels of the intervals based on the MLE than for those based on the Pitman estimator, and in both cases
it decreases as the sample size increases. In addition, the average values are close to the unconditional level and the
agreement between conditional and unconditional levels increases with the sample size.

6. Conclusions

This article examines the estimation of the location parameter of the Cauchy distribution from a different perspective.
Even though almost all the estimators suggested in the literature are equivariant, none of them is the MRE estimator. This
article adds to previous work by presenting a closed form of the Pitman estimator of the Cauchy location parameter,
which minimizes the risk in the class of all equivariant estimators under the squared error loss. We discuss some
important properties of this estimator that hold for the Cauchy family. In addition, we address the inference problem
of the location parameter based on the Pitman estimator and compare it with that based on the MLE. We note that in
general the confidence intervals based on the Pitman estimator have a similar precision to those based on the MLE.
However, given an ancillary statistic, the corresponding conditional confidence levels are more dispersed when using
the MLE.

While the MLE has been considered a gold standard among the location equivariant estimators for the Cauchy
location family, it is not optimal in this class. Although the MLE has well-known asymptotic properties, in any practical
situation we may be concerned with estimation from small samples. The simulation study shows that, in small samples,
the gain in relative efficiency when using the Pitman estimator rather than the MLE can reach 19%. In addition, for this
location family, the efficiency of the MLE is not improved by any other estimator included in the simulation except
for the Pitman estimator. Thus, for small samples, the Pitman estimator represents the best alternative to the MLE. To
study the small sample robustness of the location estimators examined in our simulation, we contaminate the nominal
model using a Cauchy distribution with heavier tails at three different levels of contamination. The variance of
the Pitman estimator remains below that of other estimators even for the contaminated samples.
Thus, from a theoretical and a practical point of view, when using the squared error loss function, the Pitman estimator
is better than any other estimator suggested in the literature. In addition, the computational intractability of the MLE
compared with the ease of applicability of the Pitman estimator makes the latter a very attractive estimator of the
location parameter of the Cauchy distribution.

Acknowledgments

The author thanks Subir Ghosh (the Editor), the Associate Editor, and two anonymous referees for several comments
and suggestions that substantially improved the paper. The author is also grateful to Grace Yang and to Hernan Ortiz
Molina for their helpful comments and continuous encouragement. Special thanks are extended to Ruben Zamar,
Abraham Kagan and Marcelo Ruiz for many useful discussions. The author alone is responsible for any errors or
omissions.

References

Bai, Z.D., Fu, J.C., 1987. On the maximum-likelihood estimator for the location parameter of a Cauchy distribution. Canad. J. Statist. 15, 137–146.
Balmer, D.W., Boulton, M., Sack, R.A., 1974. Optimal solutions in parameter estimation problems for the Cauchy distribution. J. Amer. Statist.
Assoc. 69, 238–242.
Barndorff-Nielsen, O.E., 1986. Inference on full or partial parameters based on the standardized signed log likelihood ratio. Biometrika 73,
307–322.
Barndorff-Nielsen, O.E., Cox, D.R., 1994. Inference and Asymptotics. Monographs on Statistics and Applied Probability, vol. 52. Chapman & Hall,
London.
Barnett, V.D., 1966a. Order statistics estimators of the location of the Cauchy distribution. J. Amer. Statist. Assoc. 61, 1205–1218.
Barnett, V.D., 1966b. Evaluation of the maximum-likelihood estimator where the likelihood equation has multiple roots. Biometrika 53, 151–165.
Beaton, A.E., Tukey, J.W., 1974. The fitting of power series, meaning polynomials, illustrated on band-spectroscopic data. Technometrics 16,
147–185.
Bloch, D., 1966. A note on the estimation of the location parameter of the Cauchy distribution. J. Amer. Statist. Assoc. 61, 852–855.
Bondesson, L., 1975. Uniformly minimum variance estimation in location parameter families. Ann. Statist. 3, 637–660.
Brown, L.D., 1966. On the admissibility of invariant estimators of one or more location parameters. Ann. Math. Statist. 37, 1087–1136.
Chan, L.K., 1970. Linear estimation of the location and scale parameters of the Cauchy distribution based on sample quantiles. J. Amer. Statist.
Assoc. 65, 851–859.
De Bruijn, N.G., 1961. Asymptotic Methods in Analysis. North-Holland, Amsterdam.
Easton, G.S., 1991. Compromise maximum likelihood estimators for location. J. Amer. Statist. Assoc. 86, 1051–1064.
Girshick, M.A., Savage, L.J., 1951. Bayes and minimax estimates for quadratic loss functions. Proceedings of the Second Berkeley Symposium on
Mathematical Statistics and Probability. University of California Press, Berkeley and Los Angeles, 1950, pp. 53–73.
Giummolè, F., Ventura, L., 2002. Practical point estimation from higher-order pivots. J. Statist. Comput. Simul. 72, 419–430.
Goodall, C., 1983. M-estimators of location: an outline of the theory. In: Hoaglin, D.C., Mosteller, F., Tukey, J.W. (Eds.), Understanding Robust and
Exploratory Data Analysis. Wiley, New York, pp. 339–403.
Haas, G., Bain, L.J., Antle, C.E., 1970. Inference for the Cauchy distribution based on maximum likelihood estimators. Biometrika 57, 403–408.
Hampel, F.R., Ronchetti, E.M., Rousseeuw, P.J., Stahel, W.A., 1986. Robust Statistics. The Approach Based on Influence Functions. Wiley,
New York.
Hanson, K.M., Wolf, D.R., 1996. Estimators for the Cauchy distribution. In: Heidbreder, G.R. (Ed.), Maximum Entropy and Bayesian Methods,
pp. 255–263.
Howlander, H.A., Weiss, G., 1988. On Bayesian estimation of the Cauchy parameters. Sankhyā: Indian J. Statist. 50, 350–361.
Huber, P.J., 1964. Robust estimation of a location parameter. Ann. Math. Statist. 35, 73–101.
Jensen, D.R., Foutz, R.V., 1991. Pitman estimation on Rk . J. Statist. Plann. Inference 28, 233–240.
Johns, M.V., 1979. Robust Pitman-like estimators. In: Launer, R.L., Wilkinson, G.N. (Eds.), Robustness in Statistics. Academic Press, New York,
pp. 49–60.
Karlin, S., 1958. Admissibility for estimation with quadratic loss. Ann. Math. Statist. 29, 406–436.

Kendall, M.G., Stuart, A., 1977. The Advanced Theory of Statistics, vol. 2. Griffin, London.
Koutrouvelis, I.A., 1982. Estimation of location and scale in Cauchy distributions using the empirical characteristic function. Biometrika 69,
205–213.
Lawless, J.F., 1972. Conditional confidence interval procedures for the location and scale parameters of the Cauchy and logistic distributions.
Biometrika 59, 377–386.
Lawless, J.F., 2003. Statistical Models and Methods for Lifetime Data, second ed. Wiley-Interscience, Hoboken, NJ.
Lehmann, E.L., 1998. Theory of Point Estimation. Springer, New York.
Pace, L., Salvan, A., 1997. Principles of Statistical Inference. From a Neo-Fisherian Perspective. World Scientific Publishing Co., Inc.,
River Edge, NJ.
Pace, L., Salvan, A., 1999. Point estimation based on confidence intervals: exponential families. J. Statist. Comput. Simul. 64, 1–21.
Parsian, A., Sanjari Farsipour, F., Nematollahi, N., 1993. On the minimaxity of Pitman type estimator under a LINEX loss function. Comm. Statist.
Theory Methods 22, 97–113.
Pitman, E.J.G., 1939. Tests of hypotheses concerning location and scale parameters. Biometrika 31, 200–215.
Port, S.C., Stone, C.J., 1974. Fisher information and the Pitman estimator of a location parameter. Ann. Statist. 2, 225–247.
Prabakaran, T.E., Chandrasekar, B., 1994. Simultaneous equivariant estimation for location-scale models. J. Statist. Plann. Inference 40, 51–59.
Reeds, J.A., 1985. Asymptotic number of roots of Cauchy location likelihood equations. Ann. Statist. 13, 775–784.
Rothenberg, T.J., Fisher, F.M., Tilanus, C.B., 1964. A note on estimation from a Cauchy sample. J. Amer. Statist. Assoc. 59, 460–463.
Small, C.G., Wang, J., Yang, Z., 2000. Eliminating multiple root problems in estimation. Statist. Sci. 15, 313–341.
Spiegelhalter, D.J., 1985. Exact Bayesian inference on the parameters of a Cauchy distribution with vague prior information. Bayesian Statist.
2, 743–750.
Stein, C., 1959. The admissibility of Pitman’s estimator of a single location parameter. Ann. Math. Statist. 30, 970–979.
Stone, C.J., 1974. Asymptotic properties of estimators of a location parameter. Ann. Statist. 2, 1127–1137.
Tiku, M.L., Suresh, R.P., 1992. A new method of estimation for location and scale parameters. J. Statist. Plann. Inference 30, 281–292.
Ventura, L., 1998. Higher-order approximations for Pitman estimators and for optimal compromise estimators. Canad. J. Statist. 26, 49–55.
