Wiley-Blackwell and Royal Statistical Society are collaborating with JSTOR to digitize, preserve and extend
access to Journal of the Royal Statistical Society. Series B (Methodological).
J. R. Statist. Soc. B (1992) 54, No. 1, pp. 273-284
SUMMARY
There is considerable interest in the fitting of models jointly to the mean and dispersion of a response. For the mean parameter, the Wedderburn estimating equations are widely accepted. However, there is some controversy about estimating the dispersion parameters. Finite sampling properties of several dispersion estimators are investigated for three models by simulation. We compare the maximum extended quasi-likelihood estimator, the maximum pseudolikelihood estimator and the maximum likelihood estimator, if it exists. Of these estimators, the maximum extended quasi-likelihood estimator is usually superior in minimizing the mean-squared error.
Keywords: DISPERSION PARAMETER; EXPONENTIAL FAMILY; EXTENDED QUASI-LIKELIHOOD;
MEAN-SQUARE ERROR; PSEUDOLIKELIHOOD; SIMULATION; VARIANCE FUNCTION
1. INTRODUCTION
There is considerable interest in the fitting of models jointly to the mean and dispersion of a response (Nelder and Pregibon, 1987; Davidian and Carroll, 1987, 1988; Breslow, 1990), and several approaches have been developed. If it is possible to specify a likelihood then maximum likelihood (ML) can be used; see Aitkin (1987) for a discussion of the normal case when both $\mu$ and $\sigma^2$ are modelled as functions of covariates. For non-normal errors a common extension is to the class of generalized linear models (GLMs) (McCullagh and Nelder, 1989); however, the GLMs with Poisson and binomial errors have the variance as a fixed function of the mean. To allow the dispersion to vary independently it is usually necessary to move to models for which there is no exact likelihood, but for which a quasi-likelihood (QL) can be defined, based on the first two moments only of the distribution (McCullagh and Nelder (1989), chapter 9). For some models, there may exist a true likelihood with a distribution not belonging to the GLM family, and also an alternative QL formulation. The latter is usually easier to fit, and hence one question to arise concerns the properties of the maximum quasi-likelihood (MQL) estimators relative to the ML estimators for such models.
Wedderburn (1974) gave estimating equations for the parameters in the model for the mean, and these are widely accepted. For GLMs with a full likelihood they are the ML equations; otherwise they are MQL equations. Working from the viewpoint of optimum estimating equations Godambe and Thompson (1989) gave another form for these equations; however, as Nelder (1989) pointed out, for distributions whose
and
$$\mathrm{var}(y) = \phi\, b''(\theta).$$
In a GLM we assume $\eta = g(\mu) = \sum x_j \beta_j$, where the $x_j$ are covariates and $g(\cdot)$ is a monotonic function known as the link function. The variance is the product of two terms, the dispersion parameter $\phi$ and the variance function $b''(\theta)$, which we write in the form $V(\mu) = \partial\mu/\partial\theta$. In the standard form of GLM we model $\mu$ as a function of unknown parameters, $\beta$, assuming $\phi$ fixed and with $V(\cdot)$ containing no unknown parameters. Of the distributions with likelihoods of the form (2.1), the Poisson and binomial have $\phi = 1$, i.e. fixed a priori, while the normal, gamma and inverse Gaussian distributions have $\phi$ variable and usually unknown. The negative binomial distribution, whose variance function can be written in the form $V(\mu) = \mu + \mu^2/k$, gives an example of a variance function containing an unknown parameter.
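For $\phi$ fixed the ML equations for the $\beta$ are independent of $\phi$ and are given by
$$\sum W (y - \mu) \frac{d\eta}{d\mu}\, x_j = 0 \qquad (2.2)$$
where $W$ is the weight function defined by $W^{-1} = V(\mu)(d\eta/d\mu)^2$. The deviance component $d$ is
$$d = -2 \int_y^{\mu} \frac{y - u}{V(u)}\, du.$$
The deviance $D$ is the sum of $d$ over the observations.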
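The deviance component can be checked numerically. The sketch below is illustrative only: it assumes the integral form $d = -2\int_y^{\mu} (y-u)/V(u)\,du$ and compares a Simpson-rule evaluation with the familiar closed forms for $V(u) = u$ (Poisson) and $V(u) = u^2$ (gamma). The function names are hypothetical and not taken from the paper.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def deviance_component(y, mu, V):
    """d = -2 * integral from y to mu of (y - u) / V(u) du, for y > 0."""
    return -2.0 * simpson(lambda u: (y - u) / V(u), y, mu)

def poisson_deviance(y, mu):
    """Closed form of d when V(u) = u."""
    return 2.0 * (y * math.log(y / mu) - (y - mu))

def gamma_deviance(y, mu):
    """Closed form of d when V(u) = u**2."""
    return 2.0 * (-math.log(y / mu) + (y - mu) / mu)

if __name__ == "__main__":
    print(deviance_component(3.0, 5.0, lambda u: u), poisson_deviance(3.0, 5.0))
    print(deviance_component(3.0, 5.0, lambda u: u * u), gamma_deviance(3.0, 5.0))
```

Each printed pair should agree to many decimal places, whether $y$ lies below or above $\mu$.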
For all GLMs, we have the relation
$$\partial l/\partial\mu = (y - \mu)/\{\phi V(\mu)\} \qquad (2.3)$$
so that this first derivative of the likelihood depends only on the first two moments of $y$. This led Wedderburn (1974) to define a QL (more strictly a quasi-log-likelihood) $q$ by the relation
$$\partial q/\partial\mu = (y - \mu)/\{\phi V(\mu)\}. \qquad (2.4)$$
The use of $q$ as a criterion for fitting allows the class of GLMs to be extended to models defined only by the properties of the first two moments. The QL $q$ will be a true likelihood if there is a distribution of the GLM type having $\mathrm{var}(y) = \phi V(\mu)$. QLs allow two kinds of extension to GLMs. In the first, GLMs with fixed $\phi = 1$ can be extended to allow a variable $\phi$; for example log-linear models with Poisson errors for which $\mathrm{var}(y) = \mu$ can be enlarged to allow overdispersion with $\mathrm{var}(y) = \phi\mu$. In the second extension, $V(\mu)$ takes a form which does not correspond to that of a standard GLM, for example $V(\mu) = \mu^{\alpha}$, for variable $\alpha$.
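As an illustration of both extensions, the sketch below writes down a quasi-log-likelihood for the overdispersed Poisson case and for a power variance function, and checks by finite differences that each satisfies $\partial q/\partial\mu = (y-\mu)/\{\phi V(\mu)\}$. The closed forms omit terms free of $\mu$. The power form assumes $\alpha \neq 1, 2$. All function names are hypothetical.

```python
import math

def quasi_score(y, mu, phi, V):
    """Right-hand side of the defining relation: (y - mu) / (phi * V(mu))."""
    return (y - mu) / (phi * V(mu))

def poisson_ql(y, mu, phi):
    """q for V(mu) = mu (overdispersed Poisson), up to terms free of mu."""
    return (y * math.log(mu) - mu) / phi

def power_ql(y, mu, phi, alpha):
    """q for V(mu) = mu**alpha with alpha not equal to 1 or 2,
    up to terms free of mu."""
    return (y * mu ** (1.0 - alpha) / (1.0 - alpha)
            - mu ** (2.0 - alpha) / (2.0 - alpha)) / phi

def numerical_derivative(f, x, h=1e-6):
    """Central finite-difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

if __name__ == "__main__":
    y, mu, phi, alpha = 2.5, 4.0, 1.3, 1.5
    print(numerical_derivative(lambda m: poisson_ql(y, m, phi), mu),
          quasi_score(y, mu, phi, lambda m: m))
    print(numerical_derivative(lambda m: power_ql(y, m, phi, alpha), mu),
          quasi_score(y, mu, phi, lambda m: m ** alpha))
```

In both cases $\phi$ enters only as a divisor of $q$, so for fixed $\phi$ it does not affect where $q$ is maximized over the mean parameters.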
2.2. Quasi-distributions
A quasi-(log)-likelihood $q$ can be made into a distribution by normalizing it. Such a distribution we call a quasi-distribution. If $q$ has parameter vector $\mu$, then the quasi-distribution has frequency function
276 NELDER AND LEE [No. 1,
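$$f_q = \exp(q)/\omega,$$
where $\omega = \int \exp(q)\, dy$, and likelihood
$$l_q = q - \log\omega.$$
The ML equations for the quasi-distribution, $\partial l_q/\partial\mu = 0$, and the MQL equations, $\partial q/\partial\mu = 0$, will differ by a term
$$\frac{\partial(\log\omega)}{\partial\mu} = \frac{1}{\omega}\frac{\partial\omega}{\partial\mu} = \frac{1}{\omega}\int \frac{\partial q}{\partial\mu}\exp(q)\, dy = \int \frac{y - \mu}{\phi V(\mu)}\, f_q\, dy = \frac{\mu^* - \mu}{\phi V(\mu)}$$
where $\mu^*$ is the quasi-mean $\int y f_q\, dy$. If $\mu^* - \mu$ is small compared with $y - \mu$ we can expect the MQL estimates to approximate closely the ML estimates from the quasi-distribution.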
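A small numerical sketch may make the normalizing constant and the quasi-mean concrete. It rests on assumptions not stated in the text: $q$ is taken as $\int_y^{\mu} (y-u)/\{\phi V(u)\}\,du$ (lower limit $y$), the integrals over $y$ are truncated to a finite range, and the function names are hypothetical. For $V = 1$ the quasi-distribution is normal and $\mu^* = \mu$. For $V(\mu) = \mu^2$ with this choice of $q$ the quasi-mean works out to $\mu(1+\phi)$.

```python
import math

def midpoint(f, a, b, n=20000):
    """Midpoint rule; it never evaluates f at the end points a and b."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def quasi_distribution_summary(q, lo, hi):
    """Return (omega, mu_star) for f_q = exp(q) / omega on [lo, hi]."""
    weight = lambda y: math.exp(q(y))
    omega = midpoint(weight, lo, hi)
    mu_star = midpoint(lambda y: y * weight(y), lo, hi) / omega
    return omega, mu_star

def q_normal(mu, phi):
    """q(y) for V = 1: -(y - mu)^2 / (2 phi)."""
    return lambda y: -(y - mu) ** 2 / (2.0 * phi)

def q_gamma(mu, phi):
    """q(y) for V(mu) = mu^2, with the integral taken from y to mu."""
    return lambda y: (-y / mu - math.log(mu) + 1.0 + math.log(y)) / phi

mu, phi = 2.0, 0.5
omega_n, mu_star_n = quasi_distribution_summary(q_normal(mu, phi), mu - 10.0, mu + 10.0)
omega_g, mu_star_g = quasi_distribution_summary(q_gamma(mu, phi), 0.0, 80.0)
# Difference between the ML and MQL equations: (mu* - mu) / (phi V(mu)).
gap_term = (mu_star_g - mu) / (phi * mu ** 2)
```

Under these assumptions the difference term vanishes in the normal case and is $1/\mu$ in the gamma-type case. The size of $\mu^* - \mu$ depends on how the part of $q$ that is free of $\mu$ is chosen.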
3. SIMULATIONS
3.1. Example 1: NBα-distribution
The standard negative binomial model assumes a gamma distribution for $u$, the mean of the Poisson distribution. If we write this gamma distribution in the form
$$\frac{1}{\Gamma(\nu)}\left(\frac{u}{\alpha}\right)^{\nu}\exp\left(-\frac{u}{\alpha}\right)\frac{du}{u} \qquad (3.1)$$
then, for the mixture distribution,
$$E(y) = \mu = \alpha\nu$$
and
$$\mathrm{var}(y) = \alpha\nu + \alpha^2\nu.$$
The standard negative binomial model for GLMs arises from assuming that $\nu$ is fixed as $\mu$ varies. This gives
$$\mathrm{var}(y) = \mu + \mu^2/\nu. \qquad (3.2)$$
If, however, we assume that $\mu$ varies with $\nu$ and $\alpha$ remains constant, we obtain a distribution that we call the NBα-distribution; for this
$$\mathrm{var}(y) = \mu(1 + \alpha) \qquad (3.3)$$
so that we obtain a model with an error distribution resembling an overdispersed Poisson distribution, with dispersion parameter $\phi = 1 + \alpha$. This distribution is not a standard GLM-type exponential family, so that the ML and MQL equations are different. The NBα-distribution has found applications in social science; see, for example, Hausman et al. (1984) and King (1988).
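The mean-variance relation of the NBα-distribution can be checked by simulating the gamma-Poisson mixture directly. The sketch below is an illustration, not the authors' simulation code. It assumes a gamma distribution with shape $\nu = \mu/\alpha$ and scale $\alpha$ for the Poisson mean, uses a simple multiplication-method Poisson sampler suitable only for modest means, and the function names are hypothetical.

```python
import math
import random

def poisson_draw(lam, rng):
    """Draw from Poisson(lam) by the multiplication method (modest lam only)."""
    limit = math.exp(-lam)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k

def simulate_nb_alpha(mu, alpha, n, seed=0):
    """Simulate n NB-alpha responses: u ~ gamma(shape mu/alpha, scale alpha),
    then y | u ~ Poisson(u)."""
    rng = random.Random(seed)
    nu = mu / alpha
    return [poisson_draw(rng.gammavariate(nu, alpha), rng) for _ in range(n)]

def mean_and_variance(xs):
    """Sample mean and unbiased sample variance."""
    m = sum(xs) / len(xs)
    v = sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
    return m, v

if __name__ == "__main__":
    ys = simulate_nb_alpha(mu=4.0, alpha=0.5, n=100000, seed=1)
    m, v = mean_and_variance(ys)
    print(m, v, v / m)  # v / m estimates the dispersion 1 + alpha
```

With $\mu = 4$ and $\alpha = 0.5$ the sample mean should be near 4 and the sample variance near $\mu(1+\alpha) = 6$, up to simulation error.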
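Let dlg, ddg and dtg be differences of log-gamma, digamma and trigamma functions respectively, defined by
$$\mathrm{dlg}(y, \mu/\alpha) = \log\Gamma(y + \mu/\alpha) - \log\Gamma(\mu/\alpha),$$
$$\mathrm{ddg}(y, \mu/\alpha) = \begin{cases} 0, & \text{if } y = 0, \\ \dfrac{1}{y - 1 + \mu/\alpha} + \dfrac{1}{y - 2 + \mu/\alpha} + \cdots + \dfrac{1}{\mu/\alpha}, & \text{if } y = 1, 2, \ldots, \end{cases}$$
and
$$\mathrm{dtg}(y, \mu/\alpha) = \begin{cases} 0, & \text{if } y = 0, \\ -\left\{\dfrac{1}{(y - 1 + \mu/\alpha)^2} + \cdots + \dfrac{1}{(\mu/\alpha)^2}\right\}, & \text{if } y = 1, 2, \ldots. \end{cases}$$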
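The three difference functions are straightforward to compute. The sketch below assumes the reading that, for integer $y$, ddg and dtg are the finite sums $\sum_{k=0}^{y-1} 1/(\nu+k)$ and $-\sum_{k=0}^{y-1} 1/(\nu+k)^2$ with $\nu = \mu/\alpha$, which is what differencing the digamma and trigamma functions gives. The Python names mirror the paper's notation, but the code itself is only illustrative.

```python
import math

def dlg(y, nu):
    """Difference of log-gamma functions: log Gamma(y + nu) - log Gamma(nu)."""
    return math.lgamma(y + nu) - math.lgamma(nu)

def ddg(y, nu):
    """Difference of digamma functions for integer y >= 0:
    sum of 1 / (nu + k) for k = 0, ..., y - 1 (zero when y = 0)."""
    return sum(1.0 / (nu + k) for k in range(y))

def dtg(y, nu):
    """Difference of trigamma functions for integer y >= 0:
    minus the sum of 1 / (nu + k)^2 for k = 0, ..., y - 1."""
    return -sum(1.0 / (nu + k) ** 2 for k in range(y))

def numerical_derivative(f, x, h=1e-5):
    """Central finite-difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

if __name__ == "__main__":
    y, nu = 3, 2.0
    # ddg is the derivative of dlg with respect to nu, and dtg that of ddg.
    print(ddg(y, nu), numerical_derivative(lambda v: dlg(y, v), nu))
    print(dtg(y, nu), numerical_derivative(lambda v: ddg(y, v), nu))
```

Using the finite sums avoids any need for digamma or trigamma routines, which the Python standard library does not provide.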
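The log-likelihood of the NBα-distribution can be written
$$l(y; \mu, \alpha) = \mathrm{dlg}(y, \mu/\alpha) - \{-y\log\alpha + (y + \mu/\alpha)\log(1 + \alpha) + \log y!\}.$$
The ML equations are
$$\frac{\partial l}{\partial\beta_r} = \sum \frac{1}{\alpha}\{\mathrm{ddg}(y, \mu/\alpha) - \log(1 + \alpha)\}\frac{d\mu}{d\eta}\, x_r = 0$$
and
$$\frac{\partial l}{\partial\alpha} = \sum\left[-\frac{\mu}{\alpha^2}\{\mathrm{ddg}(y, \mu/\alpha) - \log(1 + \alpha)\} + \frac{y - \mu}{\alpha(1 + \alpha)}\right] = 0. \qquad (3.5)$$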
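The log-likelihood can be sanity-checked numerically. The sketch below assumes the gamma-Poisson form $l = \mathrm{dlg}(y, \mu/\alpha) + y\log\alpha - (y + \mu/\alpha)\log(1+\alpha) - \log y!$, which follows from mixing a Poisson mean over a gamma distribution with shape $\mu/\alpha$ and scale $\alpha$. It verifies that the implied probabilities sum to one, with mean $\mu$ and variance $\mu(1+\alpha)$, and that a finite-difference derivative in $\alpha$ matches the score derived from that form. The function names are hypothetical.

```python
import math

def ddg(y, nu):
    """Sum of 1 / (nu + k) for k = 0, ..., y - 1 (digamma difference)."""
    return sum(1.0 / (nu + k) for k in range(y))

def nb_alpha_loglik(y, mu, alpha):
    """Log-likelihood contribution of one integer observation y."""
    nu = mu / alpha
    return (math.lgamma(y + nu) - math.lgamma(nu)
            + y * math.log(alpha)
            - (y + nu) * math.log1p(alpha)
            - math.lgamma(y + 1))

def score_alpha(y, mu, alpha):
    """Derivative of the log-likelihood contribution with respect to alpha."""
    nu = mu / alpha
    return (-(mu / alpha ** 2) * (ddg(y, nu) - math.log1p(alpha))
            + (y - mu) / (alpha * (1.0 + alpha)))

def moments(mu, alpha, upper=400):
    """Total probability, mean and variance from summing the pmf to `upper`."""
    probs = [math.exp(nb_alpha_loglik(y, mu, alpha)) for y in range(upper)]
    total = sum(probs)
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    return total, mean, var

if __name__ == "__main__":
    print(moments(4.0, 0.5))
    h = 1e-6
    fd = (nb_alpha_loglik(3, 4.0, 0.5 + h) - nb_alpha_loglik(3, 4.0, 0.5 - h)) / (2 * h)
    print(fd, score_alpha(3, 4.0, 0.5))
```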
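$$H_{rs} = -\sum x_r x_s \frac{1}{\alpha}\{\mathrm{ddg}(y, \mu/\alpha) - \log(1 + \alpha)\}\frac{d^2\mu}{d\eta^2} - \sum x_r x_s \frac{1}{\alpha^2}\left(\frac{d\mu}{d\eta}\right)^2 \mathrm{dtg}(y, \mu/\alpha).$$
However, $H$ may have a non-positive eigenvalue; if so, we replace $H$ by
$$-\sum x_r x_s \frac{1}{\alpha^2}\left(\frac{d\mu}{d\eta}\right)^2 \mathrm{dtg}(y, \mu/\alpha).$$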
$$\hat\alpha_{\mathrm{eql}} = D/(n - p) - 1$$
and
$$\hat\alpha_{p} = X^2/(n - p) - 1.$$
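The two moment-type estimators can be sketched in a few lines. The sketch assumes a reading of the two displayed formulas as the deviance-based $D/(n-p) - 1$ and the Pearson-based $X^2/(n-p) - 1$, with $D$ and $X^2$ computed for the Poisson variance function $V(\mu) = \mu$, fitted means already available, and $p$ the number of fitted mean parameters. The function names and the toy data are hypothetical.

```python
import math

def poisson_deviance(ys, mus):
    """Deviance D for V(mu) = mu, with y * log(y / mu) taken as 0 at y = 0."""
    total = 0.0
    for y, mu in zip(ys, mus):
        term = y * math.log(y / mu) if y > 0 else 0.0
        total += 2.0 * (term - (y - mu))
    return total

def pearson_chi2(ys, mus):
    """Pearson statistic X^2 = sum of (y - mu)^2 / mu."""
    return sum((y - mu) ** 2 / mu for y, mu in zip(ys, mus))

def alpha_from_deviance(ys, mus, p):
    """Deviance-based estimate: D / (n - p) - 1."""
    return poisson_deviance(ys, mus) / (len(ys) - p) - 1.0

def alpha_from_pearson(ys, mus, p):
    """Pearson-based estimate: X^2 / (n - p) - 1."""
    return pearson_chi2(ys, mus) / (len(ys) - p) - 1.0

if __name__ == "__main__":
    ys = [0, 2, 5, 9, 1, 7]          # toy counts, not data from the paper
    mus = [sum(ys) / len(ys)] * 6    # intercept-only fit, so p = 1
    print(alpha_from_deviance(ys, mus, p=1), alpha_from_pearson(ys, mus, p=1))
```

Both estimators subtract 1 because the dispersion of the NBα model is $\phi = 1 + \alpha$. Either can be negative in a sample that happens to be underdispersed.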
and
Dp= E ++ PElog(4t )I
TABLE 2
Sampling statistics for the MEQL and MPL estimators of $\alpha$
TABLE 3
Sampling statistics for $\alpha$ from model (e) of Dean et al. (1989) with 100 runs
4. DISCUSSION
Our studies of the finite sample properties of estimates of dispersion parameters show the marked limitations of asymptotic arguments. Two estimators, one of which is biased in the limit and the other unbiased, may have very similar biases in finite samples; similarly asymptotic relative efficiencies may be quite misleading when extrapolated to finite samples. Taguchi-type experiments, used in quality improvement, are examples of data sets where dispersion is to be modelled with fairly small amounts of data.
As Pierce and Schafer (1986) have pointed out, deviance residuals generally are very nearly the same as those based on the best possible normalizing transformation. Their arguments, in relation to the saddlepoint approximation, are based on asymptotics relative to $\mu$. They also present Monte Carlo results showing that this normalization argument seems to hold even for small values of $\mu$. However, bias correction is necessary when $\mu$ is small. This depends on the distribution and hence requires estimation of higher order cumulants. Hence, we may expect that the MEQL estimator would have smaller variance than other estimators, but would have non-trivial bias when $\mu$ is small. However, the PL estimator is based on the second moment of Pearson $\chi^2$-residuals and hence would be consistent even for small $\mu$ with increasing sample size under appropriate regularity conditions. However, the Pearson residuals may, in finite samples, differ considerably from those based on the best normalizing transformation. This suggests that the MPL estimator has smaller bias when $\mu$ is small but the sample size is large. However, its variance is always larger than that of the MEQL estimator. Our simulation study shows that variance inflation dominates and hence the MSE of the MPL estimator is larger than that of the MEQL estimator.
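The trade-off described here rests on the decomposition MSE = variance + bias². A minimal sketch of how the sampling statistics of an estimator could be summarized over simulation runs is given below. The function name and the toy replicate values are hypothetical and are not results from the paper.

```python
def sampling_summary(estimates, truth):
    """Bias, variance and mean-squared error of replicate estimates
    around a known true parameter value."""
    n = len(estimates)
    mean = sum(estimates) / n
    bias = mean - truth
    variance = sum((e - mean) ** 2 for e in estimates) / n
    mse = sum((e - truth) ** 2 for e in estimates) / n
    return {"bias": bias, "variance": variance, "mse": mse}

if __name__ == "__main__":
    # Toy replicate estimates of a parameter whose true value is 0.5.
    summary = sampling_summary([0.4, 0.6, 0.5, 0.9], truth=0.5)
    print(summary)
    print(summary["variance"] + summary["bias"] ** 2)  # equals the MSE
```

With the divisor $n$ used for both the variance and the MSE, the identity holds exactly, so an estimator with larger bias can still have the smaller MSE when its variance is sufficiently small.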
The efficiency of the ML estimator is an asymptotic property. But under the normality assumption the efficiency of the ML estimator holds for finite samples. Hence, if there is a normalizing transformation, the best estimation equation would be obtained. So we may expect that in finite samples the MEQL estimator would have the smallest variance, and our simulation study shows this. Our study seems to indicate that the approximate normality of the deviance residual holds quite generally, and this would explain why the MEQL estimator quite often has smaller MSE than that of the ML estimator.
Although the scope of the simulations reported is necessarily limited, we can conclude that of the ML, MPL and MEQL estimators the MEQL estimator in these examples is never appreciably inferior to the MPL estimator and is often much better (in terms of MSE). The relations between the ML and MEQL estimators are more complex, but the MEQL estimator has a smaller MSE over a wide range of conditions. Support for the MQL estimator in estimating the parameter in the variance function of the negative binomial distribution comes from Piegorsch (1990).
ACKNOWLEDGEMENTS
The research of the second author was supported by the Ministry of Education, Korean Government. We thank Dr N. Breslow for helpful comments on our initial draft.
REFERENCES