
A Catalog of Noninformative Priors

Ruoyong Yang, Parexel International, 60 Revere Drive, Suite 200, Northbrook, IL 60062, ruoyong.yang@parexel.com. James O. Berger, ISDS, Duke University, Durham, NC 27708-0251, berger@stat.duke.edu

December, 1998

Abstract

[PRELIMINARY DRAFT: This draft of the catalog is incomplete, with much remaining to be filled in and/or added. We are circulating this draft in the hopes that readers will know of relevant information that should be added. Please send such information to the authors above.]

A variety of methods of deriving noninformative priors have been developed and applied to a wide variety of statistical models. In this paper we provide a catalog of many of the resulting priors, and list known properties of the priors. Emphasis is given to reference priors and the Jeffreys prior, although other approaches are also considered.

Key words and phrases: Jeffreys prior, reference prior, maximal data information prior.

Contents

1 Introduction
  1.1 Motivation
  1.2 Approaches to Development of Noninformative Priors
2 Organization and Notation
3 AR(1)
4 Behrens-Fisher Problem
5 Beta
6 Binomial
7 Bivariate Binomial
8 Box-Cox Power Transformed Linear Model
9 Cauchy
10 Dirichlet
11 Exponential Regression Model
12 F Distribution
13 Gamma
14 Generalized Linear Model
15 Inverse Gamma
16 Inverse Normal or Gaussian
17 Linear Calibration
18 Location-Scale Parameter Models
19 Logit Model
20 Mixed Model
21 Mixture Model
22 Multinomial
23 Negative Binomial
24 Neyman and Scott Example
25 Nonlinear Regression Model
26 Normal
27 Pareto
28 Poisson
29 Product of Normal Means
30 Random Effects Models
31 Ratio of Exponential Means
32 Ratio of Normal Means
33 Ratio of Poisson Means
34 Sequential Analysis
35 Stress-Strength System
36 Sum of Squares of Normal Means
37 T Distribution
38 Uniform
39 Weibull

This research was supported by NSF grants DMS-8923071 and DMS-9303556 at Purdue University.

1 Introduction
1.1 Motivation
The literature on noninformative priors has grown enormously over recent years. There have been several excellent books or review articles concerned with discussing or comparing different approaches to developing noninformative priors (e.g., Kass and Wasserman, 1993), but there has been no systematic effort to catalog the noninformative priors that have been developed. Since use of noninformative priors is becoming routine in Bayesian practice, preparation of such a catalog seemed in order.

Although general discussion is not the purpose of this catalog, it is useful to review the numerous reasons that noninformative priors are important to Bayesian analysis:

(i) Frequently, elicitation of subjective prior distributions is impossible, because of time or cost limitations, or resistance or lack of training of clients. Automatic or default prior distributions are then needed.

(ii) The statistical analysis is often required to appear objective. Of course, true objectivity is virtually never attainable, and the prior distribution is usually the least of the problems in terms of objectivity, but use of a subjectively elicited prior significantly reduces the appearance of objectivity. Noninformative priors not only preserve this appearance, but can be argued to result in analyses that are more objective than most classical analyses.

(iii) Subjective elicitation can easily result in poor prior distributions, because of systematic elicitation bias and the fact that elicitation typically yields only a few features of the prior, with the rest of the prior (e.g., its functional form) being chosen in a convenient, but possibly inappropriate, way. It is thus good practice to compare answers from a subjective analysis with answers from a noninformative prior analysis. If there are substantial differences, it is important to check that the differences are due to features of the prior that are trusted, and not due to either unelicited "convenience" features of the prior or suspect elicitations.

(iv) In high dimensional problems, the best one can typically hope for is to develop subjective priors for the "important" parameters, with the unimportant or "nuisance" parameters being given noninformative priors.

(v) Good noninformative priors can be somewhat magical in multiparameter problems. As an example, the Jeffreys prior seems to almost always yield a proper posterior distribution. This is "magical," in that the common constant (or uniform) prior will much more frequently fail to yield a proper posterior. Even better, the reference prior approach has repeatedly yielded multiparameter priors that overcome limitations of the Jeffreys prior, and yield surprisingly good performance from almost any perspective. The point here is that, in multiparameter problems, inappropriate aspects of priors (even proper ones) can accumulate across dimensions in very detrimental ways; reference priors seem to "magically" avoid such inappropriate accumulation.

(vi) Bayesian analysis with noninformative priors is being increasingly recognized as a method for classical statisticians to obtain good classical procedures. For instance, the frequentist-matching approach to developing noninformative priors is based on ensuring that one has Bayesian credible sets with good frequentist properties, and it turns out that this is probably the best way to find good frequentist confidence sets.

1.2 Approaches to Development of Noninformative Priors


We do not attempt a thorough discussion of the various approaches; see, e.g., Kass and Wasserman (1993) for such discussion. We primarily will just define the various approaches and give relevant references.

The Uniform Prior: By this, we just mean the constant density, with the constant typically being chosen to be 1 (unless the constant can be chosen to yield a proper density). This choice was, of course, popularized by Laplace (1812).

The Jeffreys Prior: This is defined as pi(theta) = [det(I(theta))]^{1/2}, where I(theta) is the Fisher information matrix. This was proposed in Jeffreys (1961), as a solution to the problem that the uniform prior does not yield an analysis invariant to choice of parameterization. Note that, in specific situations, Jeffreys often recommended noninformative priors that differed from the formal Jeffreys prior.

The Reference Prior: This approach was developed in Bernardo (1979), and modified for multiparameter problems in Berger and Bernardo (1992c). The approach cannot be simply described, but it can be roughly thought of as trying to modify the Jeffreys prior by reducing the dependence among parameters that is frequently induced by the Jeffreys prior; there are many well-known examples in which the Jeffreys prior yields poor performance (even inconsistency) because of this dependence.

The Maximal Data Information Prior (MDIP): This approach was developed in Zellner (1971), based on an information argument. It is given by pi(theta) = exp{ Int p(x|theta) log p(x|theta) dx }, where p(x|theta) is the data density function.
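The Jeffreys recipe above is easy to check numerically. The following sketch (not part of the original catalog; the Bernoulli example is chosen purely for illustration) computes the Fisher information as the expected squared score and compares the resulting prior with the known closed form p^{-1/2}(1-p)^{-1/2}.

```python
import math

# Jeffreys prior via pi(theta) = sqrt(det I(theta)), illustrated for a single
# Bernoulli(p) observation, for which I(p) = 1/(p(1-p)).

def fisher_info_bernoulli(p):
    # I(p) = E[(d/dp log f(X|p))^2]; the score is X/p - (1-X)/(1-p).
    return sum(f * (x / p - (1 - x) / (1 - p)) ** 2
               for x, f in [(0, 1 - p), (1, p)])

def jeffreys_bernoulli(p):
    return math.sqrt(fisher_info_bernoulli(p))

# Matches the closed form p^{-1/2}(1-p)^{-1/2}:
p = 0.3
closed_form = p ** -0.5 * (1 - p) ** -0.5
assert abs(jeffreys_bernoulli(p) - closed_form) < 1e-12
```

For multiparameter models the same computation uses the determinant of the full information matrix rather than a scalar.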

2 Organization and Notation


The catalog is organized around statistical models, with the models listed in alphabetical order. Each model-entry is kept as self-contained as possible. Listed for each are (i) the model density, (ii) various noninformative priors, and (iii) certain of the resulting posteriors and their properties. Category (iii) information is often very limited. Notation is standard. This includes pi(theta|D), |A| = determinant of A, and pi(theta) being a density with respect to d theta. Important Notation: Noninformative priors that are proper (i.e., integrate to 1) are given in bold type. Others are improper. (The distinction is important for testing problems, where proper distributions are typically needed; for estimation and prediction, improper noninformative priors are typically fine.)

3 AR(1)
The AR(1) model, in which the data X = (X_1, ..., X_T) follow the model

    X_t = rho X_{t-1} + epsilon_t,

where the epsilon_t are i.i.d. N(0, sigma^2). The expressions below are for sigma^2 known. If sigma^2 is unknown, multiply by sigma^{-1} d sigma (or sigma^{-2} d sigma^2).

    Prior        pi(rho)                                              (Marginal) Posterior
    Uniform      1
    Jeffreys     [ T/(1-rho^2) - (1-rho^{2T})/(1-rho^2)^2
                   + (E[X_0^2]/sigma^2)(1-rho^{2T})/(1-rho^2) ]^{1/2}
    Reference1   exp{ (1/2) E[ log( Sum_{t=1}^T X_{t-1}^2 ) ] }       all are proper
    Reference2   1/[2 pi (1-rho^2)^{1/2}]            if |rho| < 1
                 1/[2 pi |rho| (rho^2-1)^{1/2}]      if |rho| > 1

1. Nonasymptotic reference prior.
2. Symmetrized reference prior, recommended for typical use.
See Berger and Yang (1994) for comparison of the noninformative priors.
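The symmetrized reference prior above is proper, which can be verified numerically. A minimal sketch (not from the original catalog; the midpoint rule and substitution rho = 1/u for the explosive region are implementation choices):

```python
import math

# Check that the symmetrized reference prior for the AR(1) coefficient rho
# integrates to 1:
#   pi(rho) = 1/(2*pi*sqrt(1-rho^2))          for |rho| < 1
#   pi(rho) = 1/(2*pi*|rho|*sqrt(rho^2-1))    for |rho| > 1

def inner(rho):          # stationary region
    return 1.0 / (2 * math.pi * math.sqrt(1 - rho ** 2))

def outer(rho):          # explosive region
    return 1.0 / (2 * math.pi * abs(rho) * math.sqrt(rho ** 2 - 1))

def integrate(f, a, b, n=200000):   # midpoint rule
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Map the explosive region onto (0,1) via rho = 1/u, drho = -du/u^2, and use
# symmetry to cover rho < -1 as well.
mass = (integrate(inner, -1.0, 1.0)
        + 2 * integrate(lambda u: outer(1.0 / u) / u ** 2, 0.0, 1.0))
assert abs(mass - 1.0) < 1e-2
```

Each half of the support carries mass 1/2, so the prior splits its opinion evenly between the stationary and explosive regions.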

4 Behrens-Fisher Problem
Let x_1, ..., x_n be i.i.d. observations from N(theta_1, sigma_1^2) and y_1, ..., y_n be i.i.d. observations from N(theta_2, sigma_2^2); the parameters of interest are theta = theta_1 - theta_2 and eta = theta_1 + theta_2. Liseo (1993) computed the Jeffreys prior and reference prior for this problem as

    Prior        pi(theta_1, theta_2, sigma_1, sigma_2)    (Marginal) Posterior
    Uniform      1
    Jeffreys     1/(sigma_1 sigma_2)^3                     proper
    Reference1   1/(sigma_1 sigma_2)^2

1. Independent of the group ordering of the parameters.

5 Beta
The Be(alpha, beta), alpha > 0, beta > 0, density is

    f(x|alpha, beta) = [Gamma(alpha+beta)/(Gamma(alpha)Gamma(beta))] x^{alpha-1} (1-x)^{beta-1} I_{[0,1]}(x).

The Fisher information matrix is

    I(alpha, beta) = ( PG(1,alpha) - PG(1,alpha+beta)    -PG(1,alpha+beta)
                       -PG(1,alpha+beta)                 PG(1,beta) - PG(1,alpha+beta) ),

where PG(1,x) = Sum_{i=0}^infinity (x+i)^{-2} is the PolyGamma function. The Jeffreys prior is thus the square root of the determinant of the Fisher information matrix.
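The PolyGamma series truncates well in practice. A minimal sketch (not part of the original catalog; the truncation length is an arbitrary choice) evaluating the Jeffreys prior for the Beta parameters:

```python
import math

# The trigamma function PG(1, x) = sum_{i>=0} (x+i)^{-2}, truncated, and the
# Jeffreys prior for Be(a, b) as the square root of the determinant of the
# Fisher information matrix given above.

def trigamma(x, terms=200000):
    return sum((x + i) ** -2 for i in range(terms))

def jeffreys_beta(a, b):
    iaa = trigamma(a) - trigamma(a + b)   # I_11
    ibb = trigamma(b) - trigamma(a + b)   # I_22
    iab = -trigamma(a + b)                # off-diagonal entries
    det = iaa * ibb - iab ** 2
    return math.sqrt(det)

# Sanity check: PG(1, 1) = pi^2/6.
assert abs(trigamma(1) - math.pi ** 2 / 6) < 1e-4
print(jeffreys_beta(2.0, 3.0))
```

In production one would use a library trigamma routine rather than the truncated series.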

6 Binomial
The B(n, p), 0 <= p <= 1, density is

    f(x|n, p) = C(n, x) p^x (1-p)^{n-x}.

Case 1: Priors for p, given n

    Prior                 pi(p|n)                      (Marginal) Posterior
    Uniform               1                            Be(p|x+1, n-x+1)
    Jeffreys, Reference   p^{-1/2}(1-p)^{-1/2}         Be(p|x+1/2, n-x+1/2)
    MDIP                  1.6186 p^p (1-p)^{(1-p)}     proper
    Novick and Hall's1    p^{-1}(1-p)^{-1}             Be(p|x, n-x)

1. See Novick and Hall (1965). Note this prior is uniform in eta = log p/(1-p).

Case 2: Priors for n

    Prior        pi(n)                                            (Marginal) Posterior
    Uniform      1
    Jeffreys     n^{-1/2}
    Reference1   1/n
    Universal2   1/[2.865 n log(n) log log(n) ... log log ... log(n)]

1. Discussed by Alba and Mendoza (1995).
2. The product in the denominator is up to the last term for which log log ... log(n) > 1. See Rissanen (1983).
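The Case 1 posteriors are conjugate Beta updates, so point estimates are immediate. A small sketch (an illustration, not part of the catalog):

```python
# Posterior for p under the Jeffreys prior p^{-1/2}(1-p)^{-1/2}: the posterior
# is Be(p | x + 1/2, n - x + 1/2), with mean (x + 1/2)/(n + 1).

def jeffreys_posterior_mean(x, n):
    a, b = x + 0.5, n - x + 0.5
    return a / (a + b)

# 7 successes in 10 trials:
assert abs(jeffreys_posterior_mean(7, 10) - 7.5 / 11) < 1e-12
```

Note how the Jeffreys update shrinks the raw proportion x/n toward 1/2, but less strongly than the uniform prior's (x+1)/(n+2).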

7 Bivariate Binomial

The density is

    f(r, s|p, q, m) = C(m, r) p^r (1-p)^{m-r} C(r, s) q^s (1-q)^{r-s}

for r = 1, ..., m and s = 1, ..., r. Polson and Wasserman (1990) compute the Fisher information matrix of this distribution as I(p, q) = m diag( {p(1-p)}^{-1}, p{q(1-q)}^{-1} ).

    Prior                    pi(p, q|m)                                                    (Marginal) Posterior
    Uniform                  1
    Jeffreys                 [1/(2 pi)] (1-p)^{-1/2} q^{-1/2}(1-q)^{-1/2}                  proper
    Reference1               [1/pi^2] p^{-1/2}(1-p)^{-1/2} q^{-1/2}(1-q)^{-1/2}
    Reference2               [1/(2 pi)] (1-p)^{-1/2} q^{-1/2}(1-q)^{-1/2}(1-pq)^{-1/2}
    Crowder and Sweeting's3  p^{-1}(1-p)^{-1} q^{-1}(1-q)^{-1}

1. Parameter of interest is p or q.
2. Parameter of interest is psi = pq or phi = p(1-q)/(1-pq).
3. See Crowder and Sweeting (1989).

8 Box-Cox Power Transformed Linear Model


Given observations {y_1, ..., y_n}, the model is

    z_i^{(lambda)} = (y_i^lambda - 1)/lambda   if lambda != 0
                   = ln y_i                     if lambda = 0,

with

    z_i^{(lambda)} = alpha + x_i^t beta + epsilon_i,

where beta is a (k x 1) vector of regression coefficients, x_i is a vector of covariates, and epsilon_i ~ N(0, sigma^2) truncated at -(alpha + x_i^t beta)/sigma. The Jeffreys prior was obtained by Pericchi (1981),

    pi(alpha, beta, sigma, lambda) proportional to p(lambda)/sigma^{k+1},

where p(lambda) is some unspecified prior for lambda.

Box and Cox (1964) proposed the prior

    pi(alpha, beta, sigma, lambda) proportional to g^{-(k+1)(lambda-1)} sigma^{-1},

where g is the geometric mean of the y's. Based on the so-called data-translated parameterization,

    z_i^{(lambda)} = (mu^lambda - 1)/lambda + x_i^t beta + epsilon_i   if lambda != 0
                   = ln mu + x_i^t beta + epsilon_i                     if lambda = 0,

corresponding to alpha = (mu^lambda - 1)/lambda or ln mu, Wixley (1993) proposed the following two priors. The first is

    pi(alpha, beta, sigma, lambda) proportional to g^{-(k+1)(lambda-1)} sigma^{-1} p(lambda),

where g is the geometric mean of the y's and p(lambda) is some unspecified prior for lambda. This prior differs from the prior of Box and Cox (1964) only by the factor p(lambda). It is also the prior used in Box and Tiao (1992). The second is

    pi(alpha, beta, sigma, lambda) proportional to [1 + lambda alpha]^{-(k+1)(1-1/lambda)} sigma^{-1} p(lambda),

where p(lambda) is some unspecified prior for lambda. This prior will give a resultant posterior distribution very close to that of Box and Cox (1964).

9 Cauchy
The C(alpha, beta), -infinity < alpha < infinity, beta > 0, density is

    f(x|alpha, beta) = beta/(pi [beta^2 + (x - alpha)^2]).

This is a location-scale parameter problem; see that section for priors. Posterior analysis can be found in Spiegelhalter (1985) and Howlader and Weiss (1988).

10 Dirichlet

The D(alpha) density, where Sum_{i=1}^k x_i = 1, 0 <= x_i <= 1, and alpha = (alpha_1, ..., alpha_k)^t, alpha_i > 0 for all i, is given by

    f(x|alpha) = [Gamma(alpha_0)/Prod_{i=1}^k Gamma(alpha_i)] Prod_{i=1}^k x_i^{alpha_i - 1},

where alpha_0 = Sum_{i=1}^k alpha_i. The Fisher information matrix is

    I(alpha_1, ..., alpha_k) = ( PG(1,alpha_1) - PG(1,alpha_0)    ...    -PG(1,alpha_0)
                                 ...                              ...    ...
                                 -PG(1,alpha_0)                   ...    PG(1,alpha_k) - PG(1,alpha_0) ),

where PG(1,x) = Sum_{i=0}^infinity (x+i)^{-2} is the PolyGamma function. The Jeffreys prior is |I(alpha_1, ..., alpha_k)|^{1/2}.

11 Exponential Regression Model


See Ye and Berger (1991). The model is

    Y_ij ~ N(alpha + beta gamma^{x + x_i a}, sigma^2),

where alpha, beta in R, 0 < gamma < 1, x >= 0, a > 0, with x and a known constants, 0 <= i <= k-1, 1 <= j <= m; the x_i's are known nonnegative regressors with x_i != x_j for i != j, and the variance sigma^2 > 0 is an unknown constant. It is assumed that x_i < x_j for i < j.

    Prior        pi(alpha, beta, gamma, sigma)                 (Marginal) Posterior
    Uniform      1                                             improper
    adhoc        1/sigma                                       proper; 2-dimensional numerical integration
    Jeffreys     |beta| gamma^{2x-1} p_1(gamma, a)^{1/2}/sigma^3
    Reference1   gamma^{x-1} p(gamma, a)^{1/2}/sigma           proper; 1-dimensional numerical integration
    Reference2   gamma^{x-1} p(gamma, a)^{1/2}/sigma^2
    Reference3   gamma^{x-1} p(gamma, a)^{1/2}/sigma^3

where p(gamma, a) = p_1(gamma, a)/p_2(gamma, a), and

    p_1(w) = ( Sum_{i=0}^{k-1} w^{2x_i} - (1/k)(Sum_{i=0}^{k-1} w^{x_i})^2 )
             ( Sum_{i=0}^{k-1} x_i^2 w^{2x_i} - (1/k)(Sum_{i=0}^{k-1} x_i w^{x_i})^2 )
             - ( Sum_{i=0}^{k-1} x_i w^{2x_i} - (1/k) Sum_{i=0}^{k-1} w^{x_i} Sum_{i=0}^{k-1} x_i w^{x_i} )^2,

    p_2(w) = Sum_{i=0}^{k-1} w^{2x_i} - (1/k)(Sum_{i=0}^{k-1} w^{x_i})^2.

1. Group ordering {gamma, alpha, beta, sigma} or {gamma, (alpha, beta, sigma)}, or with all permutations of alpha, beta, sigma; this is recommended for typical use, and appears to be approximately frequentist matching.
2. Group ordering {gamma, (alpha, beta), sigma}, and with all permutations of alpha, beta.
3. Group ordering {gamma, alpha, (beta, sigma)}, and with all permutations of beta, sigma.

The marginal posterior of gamma for the prior sigma^{-s} gamma^{x-1} p(gamma, a)^{1/2} is given by

    pi(gamma|y) proportional to p(gamma, a)^{1/2} gamma^{1+a x_0} (1 - gamma^a) h(gamma, s, y)/gamma^{2x_0},

where

    h(gamma, s, y) = [ s_yy^2 - m d_k^2(gamma, y)/p_2(gamma) ]^{-(km+s-3)/2} { p_2(gamma) gamma^2/(1 - gamma^a)^2 }^{1/2},

with

    d_k(gamma, y) = Sum_{i=0}^{k-1} (ybar_i. - ybar) gamma^{x_i}.

12 F Distribution
The F(alpha, beta), alpha > 0, beta > 0, density is

    f(x|alpha, beta) = [Gamma((alpha+beta)/2)/(Gamma(alpha/2)Gamma(beta/2))] alpha^{alpha/2} beta^{beta/2} x^{alpha/2-1}/(beta + alpha x)^{(alpha+beta)/2} I_{(0,infinity)}(x).

13 Gamma
The G(alpha, beta), alpha > 0, beta > 0, density is

    f(x|alpha, beta) = [1/(Gamma(alpha) beta^alpha)] x^{alpha-1} e^{-x/beta} I_{(0,infinity)}(x).

If alpha is known, beta is a scale parameter. The Jeffreys, reference, and MDIP priors are all 1/beta, and the posterior is therefore IG(n alpha, 1/Sum_{i=1}^n x_i).

The Fisher information matrix is

    I(alpha, beta) = ( PG(1,alpha)    1/beta
                       1/beta         alpha/beta^2 ),

where PG(1,x) = Sum_{i=0}^infinity (x+i)^{-2} is the PolyGamma function.

    Prior        pi(alpha, beta)                       (Marginal) Posterior
    Uniform      1
    Jeffreys     [alpha PG(1,alpha) - 1]^{1/2}/beta
    Reference1   [PG(1,alpha) - 1/alpha]^{1/2}/beta    proper3
    Reference2   [PG(1,alpha)]^{1/2}/beta

1. Group ordering {alpha, beta}.
2. Group ordering {beta, alpha}.
3. See Liseo (1993) and Sun and Ye (1994b) for marginal posterior.
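For the known-alpha case, the inverse-gamma posterior gives a closed-form estimate of the scale. A minimal sketch (an illustration, not part of the catalog):

```python
# With alpha known, beta is a scale parameter; the Jeffreys/reference/MDIP
# prior 1/beta gives beta | D ~ IG(n*alpha, 1/sum(x_i)). Under the IG(a, b)
# convention used in this catalog (density ~ x^{-(a+1)} exp(-1/(x b))), the
# posterior mean is sum(x_i)/(n*alpha - 1), valid for n*alpha > 1.

def gamma_scale_posterior_mean(data, alpha):
    n, s = len(data), sum(data)
    return s / (n * alpha - 1.0)

assert abs(gamma_scale_posterior_mean([2.0, 3.0, 7.0], 2.0) - 12.0 / 5.0) < 1e-12
```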

14 Generalized Linear Model


Let y_1, ..., y_n be independent observations, with the exponential density

    f(y_i|theta_i, phi) = exp{ a_i^{-1}(phi)(y_i theta_i - b(theta_i)) + c(y_i, phi) },

where the a_i(.), b(.) and c(.) are known functions, and a_i(phi) is of the form a_i(phi) = phi/w_i, where the w_i's are known weights. The theta_i's are related to regression coefficients, beta = (beta_1, ..., beta_p)^t, by the link function

    theta_i = g(eta_i),   i = 1, ..., n,

where eta_i = x_i^t beta, x_i^t = (x_i1, ..., x_ip) is a 1 x p vector denoting the ith row of the n x p matrix of covariates X, and g is a monotonic differentiable function. This model includes a large class of regression models, such as normal linear regression, logistic and probit regression, Poisson regression, gamma regression, and some proportional hazards models.

Ibrahim and Laud (1991) studied this model, using the Jeffreys prior. They focus on the case where phi is known. The Fisher information matrix they obtained for beta is given by I(beta) = phi^{-1}(X^t W V(beta) Delta^2(beta) X), where W is an n x n diagonal matrix with ith diagonal element w_i, and V(beta) and Delta(beta) are n x n diagonal matrices with ith diagonal elements v_i = v(x_i^t beta) = d^2 b(theta_i)/d theta_i^2 and delta_i = delta(x_i^t beta) = d theta_i/d eta_i, respectively. The Jeffreys prior is thus given by

    pi(beta) proportional to |X^t W V(beta) Delta^2(beta) X|^{1/2}.

Ibrahim and Laud (1991) show that the Jeffreys prior can lead to an improper posterior for this model, but that the posterior is proper for most GLMs. A sufficient condition for the posterior to be proper is that the integral

    Int_S exp{ phi^{-1} w (yr - b(r)) } (d^2 b(r)/dr^2)^{1/2} dr

be finite, the likelihood function of beta be bounded above, and X be of full rank. Here S denotes the range of theta. In addition, Ibrahim and Laud (1991) give a sufficient and necessary condition for the Jeffreys prior to be proper, namely

    Int_S (d^2 b(r)/dr^2)^{1/2} dr < infinity.

When phi is unknown, the Fisher information matrix is

    I(beta, phi) = ( I_1(beta, phi)    0
                     0                 I_2(beta, phi) ),

where I_1(beta, phi) = phi^{-1}(X^t W V(beta) Delta^2(beta) X) as above, and

    I_2(beta, phi) = - Sum_{i=1}^n { 2 w_i phi^{-3}(bdot(theta_i) theta_i - b(theta_i)) + E(cddot(y_i, phi)) };

here bdot(theta_i) = d b(theta_i)/d theta_i, and cddot = d^2 c(y_i, phi)/d phi^2. The Jeffreys prior is then the square root of the determinant of I(beta, phi).

15 Inverse Gamma
The IG(alpha, beta), alpha > 0, beta > 0, density is

    f(x|alpha, beta) = [1/(Gamma(alpha) beta^alpha)] x^{-(alpha+1)} e^{-1/(x beta)} I_{(0,infinity)}(x).

If alpha is known, 1/beta is a scale parameter. The Jeffreys, reference, and MDIP prior is 1/beta, and the posterior is IG(n alpha, 1/Sum_{i=1}^n (1/x_i)).

If alpha is unknown, the Fisher information matrix is the same as that of the Gamma distribution. Thus the reference prior and the Jeffreys prior are the same as for the Gamma distribution.

16 Inverse Normal or Gaussian


The IN(mu, lambda), mu > 0, lambda > 0, density is

    f(x|mu, lambda) = (lambda/(2 pi))^{1/2} x^{-3/2} exp{ -(lambda/(2x))((x - mu)/mu)^2 } I_{(0,infinity)}(x).

The Fisher information matrix is given by I(mu, lambda) = diag( lambda/mu^3, 1/(2 lambda^2) ).

    Prior        pi(mu, lambda)           (Marginal) Posterior
    Uniform      1
    Jeffreys     mu^{-3/2} lambda^{-1/2}
    Scale        1/lambda
    Reference1   mu^{-3/2} lambda^{-1}    proper2

1. See Liseo (1993).
2. See Sun and Ye (1994b) for marginal posteriors.

Define nu = n - 1, psi = (u xbar/nu)^{-1/2}, q = (u/(nu xbar))^{1/2}, and xbar = Sum x_i/n. Banerjee and Bhattacharyya (1979) show that the marginal posterior of 1/mu has a left-truncated t-distribution with nu degrees of freedom, location parameter 1/xbar, and scale parameter q, the point of truncation being zero; i.e.,

    pi(1/mu | D) proportional to { 1 + (1/(nu q^2))(1/mu - 1/xbar)^2 }^{-(nu+1)/2},

where u = xbar_r - 1/xbar, and xbar_r = (1/n) Sum (1/x_i). The marginal posterior of lambda is the modified gamma distribution

    pi(lambda|D) proportional to Phi((n lambda/xbar)^{1/2}) exp(-n u lambda/2) lambda^{nu/2 - 1}/H(psi),

where Phi(.) is the standard normal cdf, and H(.) denotes the cdf of Student's t-distribution with nu degrees of freedom. The posterior mean and variance are also available; see Banerjee and Bhattacharyya (1979) for more detail.

Another parametrization of the inverse Gaussian distribution, the IN(mu, sigma^2), mu > 0, sigma^2 > 0, has density

    f(x|mu, sigma^2) = (2 pi sigma^2)^{-1/2} x^{-3/2} exp{ -(x - mu)^2/(2 sigma^2 mu^2 x) } I_{(0,infinity)}(x).

The Fisher information matrix is given by I(mu, sigma^2) = diag( mu^{-3} sigma^{-2}, (2 sigma^4)^{-1} ).

    Prior        pi(mu, sigma^2)        (Marginal) Posterior
    Uniform      1
    Jeffreys     mu^{-3/2} sigma^{-3}
    Reference1   mu^{-3/2} sigma^{-2}

1. See Datta and Ghosh (1993).

17 Linear Calibration
Consider the model

    y_i = alpha + beta x_i + epsilon_1i,   i = 1, ..., n,
    y_0j = alpha + beta x_0 + epsilon_2j,   j = 1, ..., k,

where alpha, beta, y_i, and y_0j are (p x 1) vectors, the x_i's are known values of the precise measurements, and the epsilon_1i and epsilon_2j are i.i.d. N_p(0, sigma^2 I_p). The object is to predict x_0.

Denote xbar = Sum_{i=1}^n x_i/n, ybar = Sum_{i=1}^n y_i/n, ybar_0 = Sum_{j=1}^k y_0j/k, c_x = Sum_{i=1}^n (x_i - xbar)^2, betahat = Sum_{i=1}^n (x_i - xbar)(y_i - ybar)/c_x, and let s_1 and s_2 be the two sums of residual squared errors based on the calibration and prediction experiments, respectively. The statistics betahat, ybar, ybar_0, and s = s_1 + s_2 are minimal sufficient and mutually independent. Kubokawa and Robert (1994) then consider the reduced model

    y ~ N_p(beta, sigma^2 I_p),
    z ~ N_p(x_0 beta, sigma^2 I_p),
    s ~ sigma^2 chi^2_q,

where y = c_x^{1/2} betahat, z = (ybar_0 - ybar)(n^{-1} + k^{-1})^{-1/2}, q = (n + k - 3)p, and, with slight abuse of notation, beta = c_x^{1/2} beta and x_0 = (x_0 - xbar) c_x^{-1/2} (n^{-1} + k^{-1})^{-1/2}.

The Fisher information matrix for the reduced model is given by

    I(x_0, beta, sigma^2) = ( ||beta||^2/sigma^2    x_0 beta^t/sigma^2           0
                              x_0 beta/sigma^2      [(1 + x_0^2)/sigma^2] I_p    0
                              0                     0                            (q + 2p)/(2 sigma^4) ).

    Prior        pi(x_0, beta, sigma^2)                                   (Marginal) Posterior
    Uniform      1
    Jeffreys     (1 + x_0^2)^{(p-1)/2} ||beta|| (sigma^2)^{-(p+3)/2}
    Reference1   (sigma^2)^{-(p+2)/2}/(1 + x_0^2)^{1/2}                   proper

1. With respect to the group ordering {x_0, (beta, sigma^2)}. See Kubokawa and Robert (1994).

18 Location-Scale Parameter Models


Location Parameter Models: The LP(theta), theta in R^p, density is

    f(y|theta) = g(y - X theta),

where X is a (n x p) constant matrix and g is an n-dimensional density function.

    Prior        pi(theta)                               (Marginal) Posterior
    Uniform1     1                                       proper if rank(X^t X) = p
    Reference2   [theta^t (X^t X) theta]^{-(p-2)/2}      proper if rank(X^t X) = p > 2

1. Also Jeffreys and the usual reference prior.
2. Baranchik (1964) "shrinkage" prior; also, the reference prior for (||theta||, O), where theta = O(||theta||, 0, ..., 0)^t. This also arises from admissibility considerations; see Berger and Strawderman (1993).

Location-Scale Parameter Models: The LSP(theta, sigma), theta in R^p, sigma > 0, density is

    f(y|theta, sigma) = (1/sigma^n) g((y - X theta)/sigma),

where X is a (n x p) constant matrix and g is an n-dimensional density function.

    Prior        pi(theta, sigma)                                          (Marginal) Posterior
    Uniform      1/sigma
    Jeffreys     1/sigma^{p+1}
    Reference1   1/sigma
    Reference2   [theta^t (X^t X) theta]^{-(p-2)/2}/sigma
    Reference3   pi(eta, sigma) proportional to sigma^{-1}(c_1 eta^2 + c_2)^{-1/2}

1. Reference with respect to the group ordering {theta, sigma}; also the prior actually recommended by Jeffreys (1961).
2. Baranchik (1964) "shrinkage" prior; also, the reference prior with respect to the parameters {||theta||, O, sigma} (see Location Parameter Models).
3. eta = theta/sigma is the parameter of interest, selecting rectangular compacts on (eta, sigma); c_1 and c_2 are constants. See Datta and Ghosh (1993).

Scale Parameter Models:

The SP(sigma_1, ..., sigma_p), sigma_i > 0 for all i, density is

    f(y|sigma_1, ..., sigma_p) = ( Prod_{i=1}^p sigma_i^{-n_i} ) g( y_1/sigma_1, y_2/sigma_2, ..., y_p/sigma_p ),

where the y_i are (n_i x 1) vectors, i = 1, ..., p, and g is an (n_1 + ... + n_p)-dimensional density function.

    Prior                        pi(sigma_1, ..., sigma_p)     (Marginal) Posterior
    Uniform                      1
    Jeffreys, Reference, MDIP    Prod_{i=1}^p sigma_i^{-1}

19 Logit Model
Poirier (1992) analyzes the Jeffreys prior for the conditional logit model as follows. An experiment consists of N trials. On trial n exactly one of J_n discrete alternatives is observed. Let y_nj be a binary variable which equals unity iff alternative j is observed on trial n; otherwise y_nj equals zero. Let z_n = [z_n1^t, ..., z_nJ_n^t]^t, with each z_nj being a K x 1 vector, and let beta be a K x 1 vector of unknown parameters. The probability of alternative j on trial n is specified to be

    p_nj(beta) = Prob(y_nj = 1|z_n) = exp(z_nj^t beta)/Sum_{i=1}^{J_n} exp(z_ni^t beta).

When J_n = J and z_nj = e_j x_n, where e_j is a (J-1) x 1 vector with all elements equal to zero except the jth which equals unity and x_n is an M x 1 vector of observable characteristics of trial n, this is the multinomial logit model. Poirier (1992) gives the Jeffreys prior for this model as

    pi(beta) proportional to | Sum_{n=1}^N Sum_{j=1}^{J_n} p_nj(beta) [z_nj - ztilde_n(beta)][z_nj - ztilde_n(beta)]^t |^{1/2},

where ztilde_n(beta) = Sum_{i=1}^{J_n} p_ni(beta) z_ni. The following special cases are also given therein.

Multinomial Logit Without Covariates:

Suppose K = J - 1 and z_nj = e_j; then the Jeffreys prior reduces to the proper density

    pi(beta) = [Gamma(J/2)/pi^{J/2}] Prod_{j=1}^J [p_.j(beta)]^{1/2},

where p_.j = p_nj for any n. The density for p_. = (p_.1, ..., p_.J-1) is the Dirichlet density

    pi(p_.) = [Gamma(J/2)/pi^{J/2}] Prod_{j=1}^J p_.j^{-1/2}.

In the binomial case,

    pi(beta) = pi^{-1} { p_.1(beta)[1 - p_.1(beta)] }^{1/2} = exp(beta/2)/(pi [1 + exp(beta)])

and

    pi(p_.) = pi^{-1} [p_.1(1 - p_.1)]^{-1/2}.

The other competing noninformative priors are: the uniform prior on beta, the uniform prior on p_.1, and the MDIP prior for p_.1. See Geisser (1984) for discussion.

Logistic Regression: Suppose the y_i are independent Bernoulli(theta_i), with

    theta_i = e^{x_i^t beta}/(1 + e^{x_i^t beta}),   i = 1, ..., n.

Then the likelihood function is

    f(y|beta) = e^{ Sum_{i=1}^n y_i x_i^t beta } / Prod_{i=1}^n (1 + e^{x_i^t beta}),

and the Fisher information matrix is

    I(beta) = (x_1, ..., x_n) diag( theta_1(1 - theta_1), ..., theta_n(1 - theta_n) ) (x_1, ..., x_n)^t.

The Jeffreys prior is the square root of the determinant of the above matrix. For more special cases of the logit model, see Poirier (1992).
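The logistic-regression Fisher information can be evaluated directly. A minimal sketch for a two-coefficient model, intercept plus one covariate (the design points xs are made up for illustration; not part of the catalog):

```python
import math

# Jeffreys prior for logistic regression: pi(beta) is proportional to
# sqrt(det(X^t diag(theta_i(1-theta_i)) X)), accumulated here as a 2x2 matrix.

def jeffreys_logistic(beta, xs):
    i11 = i12 = i22 = 0.0
    for x in xs:
        eta = beta[0] + beta[1] * x
        theta = 1.0 / (1.0 + math.exp(-eta))
        w = theta * (1.0 - theta)       # Bernoulli variance weight
        i11 += w                        # entry for the intercept
        i12 += w * x
        i22 += w * x * x
    det = i11 * i22 - i12 ** 2
    return math.sqrt(det)

xs = [-1.0, -0.5, 0.0, 0.5, 1.0]
# At beta = (0, 0), theta = 0.5 everywhere, so w = 0.25 for every point and
# the determinant can be checked by hand: det = 1.25 * 0.625 - 0 = 0.78125.
assert abs(jeffreys_logistic([0.0, 0.0], xs) - math.sqrt(0.78125)) < 1e-12
```

The prior decays as ||beta|| grows, since extreme coefficients push the theta_i toward 0 or 1 and the weights w toward zero.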

20 Mixed Model
Consider the mixed model

    y_ijk = B_ijk beta_b + W_ijk beta_w + T_i + C_j + epsilon_ijk,
    i = 1, ..., I,  j = 1, ..., J,  k = 1, ..., t_ij,

assuming epsilon_ijk ~ N(0, sigma^2) and, independently, C_j ~ N(0, sigma_c^2), with beta_b, beta_w, T_i, sigma^2, and sigma_c^2 the parameters of interest. Box and Tiao (1992) used the following prior in the balanced mixed model case, with t_ij = n for all ij,

    pi(sigma_c^2, sigma^2) proportional to 1/[sigma^2 (n sigma_c^2 + sigma^2)].

Chaloner (1987) used three priors for this model: (sigma^2 + sigma_c^2)^{-1}, sigma^{-2}(sigma^2 + sigma_c^2)^{-1}, and sigma^{-2}(sigma^2 + sigma_c^2)^{-3/2}.

Yang and Pyne (1996) derived the Jeffreys prior,

    pi(sigma_c^2, sigma^2) proportional to
        [ Det( Sum_{j=1}^J t_j^2/(2(t_j sigma_c^2 + sigma^2)^2)    Sum_{j=1}^J t_j/(2(t_j sigma_c^2 + sigma^2)^2)
               Sum_{j=1}^J t_j/(2(t_j sigma_c^2 + sigma^2)^2)      Sum_{j=1}^J [(t_j - 1)/(2 sigma^4) + 1/(2(t_j sigma_c^2 + sigma^2)^2)] ) ]^{1/2},

where t_j = Sum_{i=1}^I t_ij, and the reference prior,

    pi(sigma_c^2, sigma^2) proportional to (1/sigma^2) [ Sum_{j=1}^J t_j^2/(t_j sigma_c^2 + sigma^2)^2 ]^{1/2}.

21 Mixture Model
For arbitrary density functions p1 (x) and p2 (x), consider the model

    p(x|theta) = theta p_1(x) + (1 - theta) p_2(x).

Bernardo and Giron (1988) discussed the reference prior for this model. They found that the reference prior is always proper.

22 Multinomial

The M(n, p) density, where Sum_{i=1}^{k+1} x_i = n and each x_i is an integer between 0 and n, p = (p_1, ..., p_{k+1})^t, with Sum_{i=1}^{k+1} p_i = 1 and 0 <= p_i <= 1 for all i, is given by

    f(x|p) = [n!/Prod_{i=1}^{k+1}(x_i!)] Prod_{i=1}^{k+1} p_i^{x_i}.

    Prior               pi(p)                                                                    (Marginal) Posterior
    Uniform             1
    Jeffreys            (Prod_{i=1}^k p_i^{-1/2})(1 - tau_k)^{-1/2}
    Reference1          C_k^{-1}(Prod_{i=1}^k p_i^{-1/2})(1 - tau_k)^{-1/2}
    Reference2          pi^{-k} Prod_{i=1}^k [p_i^{-1/2}(1 - tau_i)^{-1/2}]                      all are proper
    Reference3          (Prod_{i=1}^m C_{n_i}^{-1})(Prod_{i=1}^k p_i^{-1/2})
                          (Prod_{i=1}^{m-1}(1 - tau_{N_i})^{-n_{i+1}/2})(1 - tau_{N_m})^{-1/2}
    Reference4          phi^{-1/2}(1 + phi)^{-1/2} Prod_{i=2}^k p_i^{-1/2}(1 - p_2(1 + phi) - Sum_{j=3}^i p_j)^{-1/2}
    MDIP                p_1^{p_1} p_2^{p_2} ... p_k^{p_k}(1 - Sum_{i=1}^k p_i)^{1 - Sum_{i=1}^k p_i}
    Novick and Hall's5  Prod_{i=1}^k p_i^{-1}

Here tau_j = Sum_{i=1}^j p_i, C_{2l-1} = pi^l/(l-1)!, and C_{2l} = (2 pi)^l/[(2l-1)(2l-3)...(1)], for all positive integers l. See Berger and Bernardo (1992b), Zellner (1993), and Bernardo and Ramon (1996).

1. One-group reference prior.
2. k-group reference prior.
3. m-group reference prior. N_i and n_i are as defined in Berger and Bernardo (1992c). The posterior for the m-group reference prior, which includes the posteriors for Reference1 and Reference2 as special cases, is

    pi(p|D) proportional to (Prod_{i=1}^k p_i^{x_i - 1/2})(Prod_{i=1}^{m-1}(1 - tau_{N_i})^{-n_{i+1}/2})(1 - tau_{N_m})^{n - r - 1/2}.

4. k-group reference prior when phi = p_1/p_2 is the parameter of interest; the reference posterior for phi is

    pi(phi|D) proportional to phi^{x_1 - 1/2}/(1 + phi)^{x_1 + x_2 + 1}.

5. See Novick and Hall (1965). Sono (1983) also derived a noninformative prior for this model, using the assumption of prior independence of transformed parameters and an approximate data-translated likelihood function.

23 Negative Binomial
The NB(alpha, p), alpha > 0, 0 < p <= 1, density is

    f(x|alpha, p) = [Gamma(alpha + x)/(Gamma(alpha) x!)] p^alpha (1-p)^x.

alpha is given:

    Prior                 pi(p)                    (Marginal) Posterior
    Uniform               1                        Be(p|alpha + 1, x + 1)
    Jeffreys, Reference   1/[p (1-p)^{1/2}]        Be(p|alpha, x + 1/2)

24 Neyman and Scott Example


This model consists of 2n independent observations,

    X_ij ~ N(mu_i, sigma^2),   i = 1, ..., n,  j = 1, 2.

    Prior        pi(mu_1, ..., mu_n, sigma)    (Marginal) Posterior
    Uniform      1
    Jeffreys     sigma^{-(n+1)}
    Reference1   sigma^{-1}                    proper
    Reference2   sigma^{-n}

1. Group ordering {sigma, (mu_1, ..., mu_n)} or {sigma, mu_1, ..., mu_n}. Yields a sensible posterior. The posterior mean of sigma^2 is S^2/(n-2), with S^2 = Sum_{i=1}^n Sum_{j=1}^2 (X_ij - Xbar_i)^2, Xbar_i = (X_i1 + X_i2)/2. See Berger and Bernardo (1992c) for discussion.
2. Group ordering {mu_1, (mu_2, ..., mu_n, sigma)}. Strong inconsistency occurs for this prior, along with the Jeffreys prior. The posterior mean of sigma^2 for the Jeffreys prior is S^2/(2n-2).
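The inconsistency is easy to see by simulation. A minimal sketch (not part of the original catalog; the means are drawn from an arbitrary distribution just to populate the example):

```python
import random

# Neyman-Scott: the Jeffreys posterior mean of sigma^2 is S^2/(2n-2), which
# converges to sigma^2/2 rather than sigma^2 as n grows; the reference
# prior's S^2/(n-2) is consistent.

random.seed(0)
n, sigma = 20000, 1.0
s2 = 0.0
for _ in range(n):
    mu_i = random.gauss(0.0, 10.0)       # nuisance means, arbitrary
    x1 = random.gauss(mu_i, sigma)
    x2 = random.gauss(mu_i, sigma)
    xbar = (x1 + x2) / 2
    s2 += (x1 - xbar) ** 2 + (x2 - xbar) ** 2

jeffreys_est = s2 / (2 * n - 2)     # tends to sigma^2 / 2
reference_est = s2 / (n - 2)        # tends to sigma^2

assert abs(reference_est - 1.0) < 0.05
assert abs(jeffreys_est - 0.5) < 0.05
```

The nuisance means never average out of the Jeffreys analysis, which is exactly the accumulation-across-dimensions problem discussed in the Introduction.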

25 Nonlinear Regression Model


Eaves (1983) considered the following nonlinear regression model,

    y = g(theta) + e,

where the n x 1 vector-valued design function g of a d-dimensional vector parameter theta is no longer assumed linear, as it is in linear regression, although it remains 1-1 and smooth. The noninformative prior proposed therein is

    pi(theta, sigma) proportional to |I(theta)|^{1/2}/sigma,

where

    I(theta) = (1/2) E( D ||y - g(theta)||^2 | theta ) = [dg(theta)]^t [dg(theta)]

is the regression information matrix. Here D is the d x d matrix-valued second-order partial operator, and dg(theta) is the n x d Jacobian of g. Note this prior is also the reference prior provided that theta and sigma are in separate groups.
26 Normal
Univariate Normal:


The N(mu, sigma^2), -infinity < mu < infinity, sigma > 0, density is

    f(x|mu, sigma^2) = (2 pi sigma^2)^{-1/2} e^{-(x - mu)^2/(2 sigma^2)}.

sigma^2 Known:

    Prior    pi(mu)    (Marginal) Posterior
    All      1         pi(mu|D) = N(xbar, sigma^2/n)

Here xbar = (1/n) Sum_{i=1}^n x_i.

mu Known:

    Prior                        pi(sigma^2)    (Marginal) Posterior
    Uniform                      1              pi(sigma^2|D) = IG((n-2)/2, 2/S^2)
    Jeffreys, Reference, MDIP    1/sigma^2      pi(sigma^2|D) = IG(n/2, 2/S^2)

Here S^2 = Sum_{i=1}^n (x_i - mu)^2.

mu and sigma^2 Both Unknown:

    Prior              pi(mu, sigma^2)                                        (Marginal) Posterior
    Uniform            1                                                      pi(mu|D) = T(n-3, xbar, S^2/[n(n-3)]); pi(sigma^2|D) = IG((n-3)/2, 2/S^2)
    Jeffreys           1/sigma^4                                              pi(mu|D) = T(n+1, xbar, S^2/[n(n+1)]); pi(sigma^2|D) = IG((n+1)/2, 2/S^2)
    Reference1         pi(lambda, sigma) proportional to (2 + lambda^2)^{-1/2} sigma^{-1}
    Reference2, MDIP   1/sigma^2                                              pi(mu|D) = T(n-1, xbar, S^2/[n(n-1)]); pi(sigma^2|D) = IG((n-1)/2, 2/S^2)

Here xbar = (1/n) Sum_{i=1}^n x_i, and S^2 = Sum_{i=1}^n (x_i - xbar)^2.

1. lambda = mu/sigma is the parameter of interest, parameter ordering {lambda, sigma}. See Bernardo and Smith (1994).
2. If mu and sigma are in separate groups.
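The both-unknown posteriors under the reference/MDIP prior 1/sigma^2 are closed form, so the update reduces to computing a few summary statistics. A minimal sketch (the data values are made up for illustration; not part of the catalog):

```python
import math

# Posterior for (mu, sigma^2) under the prior 1/sigma^2:
#   mu | D ~ T(n-1, xbar, S^2/(n(n-1))),  sigma^2 | D ~ IG((n-1)/2, 2/S^2).

def normal_reference_posterior(data):
    n = len(data)
    xbar = sum(data) / n
    s2 = sum((x - xbar) ** 2 for x in data)
    return {"df": n - 1, "loc": xbar, "scale2": s2 / (n * (n - 1)),
            "ig_shape": (n - 1) / 2, "ig_scale": 2 / s2}

post = normal_reference_posterior([4.1, 5.0, 3.8, 4.6, 4.4])
assert post["df"] == 4
assert abs(post["loc"] - 4.38) < 1e-9
print("posterior sd of mu (t scale):", math.sqrt(post["scale2"]))
```

These are exactly the classical t and scaled chi-square results, which is the frequentist-matching property mentioned in the Introduction.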

p-Variate Normal:

The N_p(mu, Sigma) density, where mu = (mu_1, ..., mu_p) in R^p and Sigma is a positive definite matrix, is given by

    f(x|mu, Sigma) = (2 pi)^{-p/2} (det Sigma)^{-1/2} e^{-(x - mu)^t Sigma^{-1} (x - mu)/2}.

Sigma Known:

    Prior                           pi(mu)                                   (Marginal) Posterior
    Uniform, Jeffreys, Reference    1                                        pi(mu|D) = N_p(xbar, Sigma/n)
    Shrinkage1                      (mu^t Sigma^{-1} mu)^{-(p-2)/2}

Here xbar = (1/n) Sum_{i=1}^n x_i.
1. See Baranchik (1964) and Berger and Strawderman (1993).

mu Known:

    Prior        pi(Sigma)                                                             (Marginal) Posterior
    Uniform      1                                                                     pi(Sigma^{-1}|D) = W_p(n-p-1, S^{-1}/n)
    Jeffreys     1/|Sigma|^{(p+1)/2}                                                   pi(Sigma^{-1}|D) = W_p(n, S^{-1}/n)
    Reference1   1/[|Sigma| Prod_{i<j}(d_i - d_j)]                                     proper
    Reference2   1/[|Sigma| (log d_1 - log d_p)^{(p-2)} Prod_{i<j}(d_i - d_j)]
    Reference3   1/[R(1 - R^2)]                                                        pi(R|D) proportional to (R^2)^{-1/2}(1 - R^2)^{n/2 - 1}
                                                                                         2F1(n/2, n/2; (p-1)/2; (R Rhat)^2) / 3F2(n/2, n/2, 1; (p-1)/2, (n+1)/2; Rhat^2)
    MDIP         1/|Sigma|                                                             pi(Sigma^{-1}|D) = W_p(n-p, S^{-1}/n)

Here S = (1/n) Sum_{i=1}^n (x_i - mu)(x_i - mu)^t, d_1 < d_2 < ... < d_p are the eigenvalues of Sigma, and R and Rhat are the population and sample multiple correlation coefficients, respectively. If we write

    Sigma = ( sigma_11     sigma_(1)^t          Sigmahat = S = ( sigmahat_11     sigmahat_(1)^t
              sigma_(1)    Sigma_22 ),                          sigmahat_(1)    Sigmahat_22 ),

then R^2 = sigma_(1)^t Sigma_22^{-1} sigma_(1)/sigma_11 and Rhat^2 = sigmahat_(1)^t Sigmahat_22^{-1} sigmahat_(1)/sigmahat_11.

1. Group ordering lists the ordered eigenvalues of Sigma first, and is recommended for typical use. See Yang and Berger (1994).
2. Group ordering lists the eigenvalues of Sigma first, with {d_1, d_p} preceding the other ordered eigenvalues. See Yang and Berger (1994).
3. The population multiple correlation coefficient is the parameter of interest, and the sample multiple correlation coefficient is used as data. Prior and posterior are with respect to dR. 2F1 and 3F2 are hypergeometric functions. See Tiwari, Chib and Jammalamadaka (1989). Also see Muirhead (1982).

mu and Sigma Both Unknown:

    Prior        pi(mu, Sigma)                                                         (Marginal) Posterior
    Uniform      1
    Jeffreys     1/|Sigma|^{(p+2)/2}
    Reference1   1/[|Sigma| Prod_{i<j}(d_i - d_j)]
    Reference2   1/[|Sigma| (log d_1 - log d_p)^{(p-2)} Prod_{i<j}(d_i - d_j)]
    MDIP         1/|Sigma|

Here S = (1/n) Sum_{i=1}^n (x_i - xbar)(x_i - xbar)^t, and d_1 < d_2 < ... < d_p are the eigenvalues of Sigma.
1. Group ordering lists the ordered eigenvalues of Sigma first; mu and Sigma are in separate groups. It is recommended for typical use. See Yang and Berger (1994).
2. Group ordering lists the eigenvalues of Sigma first, with {d_1, d_p} preceding the other ordered eigenvalues; mu and Sigma are in separate groups. See Yang and Berger (1994).

Bivariate Normal:

The N_2(mu, Sigma) density, where mu = (mu_1, mu_2) in R^2 and Sigma = (sigma_ij) is a 2 x 2 positive definite matrix, is given by

    f(x|mu, Sigma) = (2 pi)^{-1} (det Sigma)^{-1/2} e^{-(x - mu)^t Sigma^{-1} (x - mu)/2}.

mu and Sigma Both Unknown:

    Prior        pi(mu, Sigma)                                                    (Marginal) Posterior
    Reference1   (1 - rho^2)^{-1} (sigma_11 sigma_22)^{-1/2}                      pi(rho|D) proportional to (1 - rho^2)^{(n-3)/2}/(1 - rho r)^{n - 3/2}
                                                                                    F(1/2, 1/2; n - 1/2; (1 + rho r)/2)
    Reference2   sigma_11^{1/2}/(sigma_11 sigma_22 - sigma_12^2)
    Reference3   (sigma_11 sigma_22 + sigma_12^2)^{1/2}/(sigma_11 sigma_22 - sigma_12^2)
    MDIP4        1/[sigma_11 sigma_22 (1 - rho^2)]^{1/2}

1. The correlation coefficient, rho, is the parameter of interest. Parameters ordered as {rho, (mu_1, mu_2, sigma_11, sigma_22)}; r is the sample correlation coefficient. Prior and posterior are with respect to d rho d mu_1 d mu_2 d sigma_11 d sigma_22. F is the hypergeometric function. See Bayarri (1981).
2. sigma_11 is the parameter of interest. Parameters ordered as {sigma_11, (sigma_12, sigma_22, mu_1, mu_2)}. The limiting sequence of compact sets is { sigma_12^2/sigma_22 in (sigma_11 l^{-1}, sigma_11(1 - l^{-1})), sigma_22 in (l^{-1}, l) }.
3. sigma_12 is the parameter of interest. Parameters ordered as {sigma_12, (sigma_11, sigma_22, mu_1, mu_2)}. The limiting sequence of compact sets is { sigma_11 sigma_22 in (sigma_12^2(1 + 1/l), sigma_12^2 l), sigma_11 in (l^{-1}, l) }.
4. Prior and posterior are with respect to d rho d mu_1 d mu_2 d sigma_11 d sigma_22.

27 Pareto
The Pa(x_0, alpha) density, where 0 < x_0 < infinity, alpha > 0, is given by

    f(x|x_0, alpha) = (alpha/x_0)(x_0/x)^{alpha+1} I_{(x_0,infinity)}(x).

If x_0 is known, this is a scale density, and the Jeffreys prior and reference prior is 1/alpha.
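With x_0 known the posterior for alpha is also closed form. A quick illustration (not part of the catalog; the data values are made up):

```python
import math

# With x0 known, the likelihood in alpha is alpha^n exp(-alpha*T) with
# T = sum(log(x_i/x0)), so the Jeffreys/reference prior 1/alpha gives
# alpha | D ~ G(n, 1/T), i.e. Gamma with shape n and scale 1/T.

def pareto_reference_posterior(data, x0):
    n = len(data)
    t = sum(math.log(x / x0) for x in data)
    return n, 1.0 / t          # Gamma shape, scale

shape, scale = pareto_reference_posterior([1.5, 2.0, 3.5], 1.0)
assert shape == 3
print("posterior mean of alpha:", shape * scale)
```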

28 Poisson
The P(λ), λ > 0, density is

f(x|λ) = e^(−λ) λ^x / x!.

Prior        π(λ)         (Marginal) Posterior
Uniform      1            G(∑_{i=1}^n x_i + 1, 1/n)
Jeffreys     λ^(−1/2)     G(∑_{i=1}^n x_i + 1/2, 1/n)
Reference    λ^(−1/2)     G(∑_{i=1}^n x_i + 1/2, 1/n)

Poisson Process:

For a Poisson process X_1, X_2, … with unknown parameter λ, Jeffreys (1961), Novick and Hall (1965), and Villegas (1977) proposed the ignorance prior π(λ) = λ^(−1), also called a logarithmic uniform prior because it implies a uniform distribution for log λ.
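The Gamma posteriors in the table are in the (shape, scale) convention; under the Jeffreys/reference prior λ^(−1/2) the posterior is G(∑ x_i + 1/2, 1/n). A sketch (ours, with made-up counts):

```python
from scipy import stats

def poisson_jeffreys_posterior(counts):
    """Posterior of the Poisson mean under the Jeffreys prior lambda^(-1/2):
    Gamma with shape sum(counts) + 1/2 and scale 1/n."""
    return stats.gamma(a=sum(counts) + 0.5, scale=1.0 / len(counts))

post = poisson_jeffreys_posterior([3, 0, 2, 4, 1])
print(post.mean(), post.interval(0.95))   # mean (10 + 0.5)/5 = 2.1
```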

29 Product of Normal Means


Consider the N_p(θ, Σ) density, where θ = (θ_1, …, θ_p), and the parameter of interest is ∏_{i=1}^p θ_i.

p = 2, Σ = I_2:

Prior        π(θ_1, θ_2)            (Marginal) Posterior
Uniform      1                      proper
Reference1   (θ_1² + θ_2²)^(1/2)

1. See Berger and Bernardo (1989).
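For p = 2 the posterior under the Berger-Bernardo reference prior is proper but of no standard form. A brute-force grid sketch (ours; it assumes a single observation x_i ~ N(θ_i, 1) per coordinate) for the posterior mean of the product:

```python
import numpy as np

def product_posterior_mean(x1, x2, half_width=8.0, m=401):
    """Posterior mean of theta1*theta2 under the reference prior
    (theta1^2 + theta2^2)^(1/2), with one N(theta_i, 1) observation per
    coordinate, via a simple two-dimensional grid approximation."""
    t1 = np.linspace(x1 - half_width, x1 + half_width, m)
    t2 = np.linspace(x2 - half_width, x2 + half_width, m)
    T1, T2 = np.meshgrid(t1, t2)
    post = np.sqrt(T1 ** 2 + T2 ** 2) * np.exp(
        -0.5 * ((x1 - T1) ** 2 + (x2 - T2) ** 2))
    return float((T1 * T2 * post).sum() / post.sum())

print(product_posterior_mean(3.0, 4.0))   # compare with x1*x2 = 12 under a flat prior
```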

p = n, Σ = I_n, n > 2:

Prior        π(θ_1, …, θ_n)                            (Marginal) Posterior
Uniform      1                                         proper
Reference1   ∏_{i=1}^n θ_i √(∑_{i=1}^n θ_i^(−2))

1. See Sun and Ye (1995).
p = n, Σ = diag(σ_1², …, σ_n²), and σ_i² > 0 for i = 1, …, n:

Prior        π(θ_1, …, θ_n, σ_1², …, σ_n²)                                             (Marginal) Posterior
Uniform      1
Jeffreys1    ∏_{i=1}^n σ_i^(−2)
Reference2   g_1(σ_1², …, σ_n²) √(∑_{i=1}^n n_i θ_i²/σ_i²) ∏_{i=1}^n (θ_i σ_i²)^(−1)    proper
Reference3   √(∑_{i=1}^n n_i θ_i²/σ_i²) ∏_{i=1}^n (θ_i σ_i²)^(−1)                       proper

1. Proposed by Jeffreys (1961).
2. See Sun and Ye (1994a), where g_1(·) is any positive real function and n_i is the number of observations from the ith population.
3. Also see Sun and Ye (1994a), where n_i is the number of observations from the ith population. This prior is also Tibshirani's matching prior.

30 Random Effects Models

One-Way Model (balanced): See Berger and Bernardo (1992a).

X_ij = μ + α_i + ε_ij,   i = 1, …, p and j = 1, …, n,

where the α_i are i.i.d. N(0, σ_α²) and, independently, the ε_ij are i.i.d. N(0, σ_ε²). The parameters (μ, σ_α², σ_ε²) are unknown. The reference priors for this model are,

Ordered Grouping                  Reference Prior                                           Posterior
{(μ, σ_α², σ_ε²)}                 σ_ε^(−2)(nσ_α² + σ_ε²)^(−3/2)                             proper
{(μ, σ_ε²), σ_α²}                 σ_ε^(−5/2)(nσ_α² + σ_ε²)^(−1)                             proper
{(μ, σ_α²), σ_ε²}                 σ_ε^(−3C_n/2 − 2) ψ(σ_α²/σ_ε²)                            proper
{σ_ε², (μ, σ_α²)}                 σ_ε^(−1)(nσ_α² + σ_ε²)^(−3/2)                             proper
{σ_α², (μ, σ_ε²)}                 σ_α^(−1) σ_ε^(−2)(nσ_α² + σ_ε²)^(−1/2) ψ(σ_α²/σ_ε²)      proper
{μ, (σ_ε², σ_α²)}                 σ_ε^(−2)(nσ_α² + σ_ε²)^(−1)                               proper
{(σ_ε², σ_α²), μ}                 σ_ε^(−C_n − 2) ψ(σ_α²/σ_ε²)                               proper

Here C_n = 1 − √(n − 1)(√n + √(n − 1))^(−3), and ψ(σ_α²/σ_ε²) = [(n − 1) + (1 + nσ_α²/σ_ε²)^(−2)]^(1/2).

The posterior computation involves only one-dimensional numerical integration. For details see Berger and Bernardo (1992a).

Suppose τ = nσ_α²/σ_ε² is the parameter of interest (see Ye, 1994).

Prior        π(μ, τ, σ_ε²)               (Marginal) Posterior
Uniform      1                           proper4
Jeffreys     σ_ε^(−3)(1 + τ)^(−3/2)
Reference1   σ_ε^(−2)(1 + τ)^(−3/2)
Reference2   σ_ε^(−2)(1 + τ)^(−1)
Reference3   σ_ε^(−3)(1 + τ)^(−1)

1. Group ordering {(τ, μ), σ_ε²}.
2. Group ordering {τ, μ, σ_ε²} and {τ, (μ, σ_ε²)}; recommended for typical use.
3. Group ordering {μ, (τ, σ_ε²)}.
4. The marginal posterior for τ, corresponding to the prior σ_ε^(−a)(1 + τ)^(−b), is given by

π(τ|D) = W̃^(q̃−1) (1 + W̃)^(−(p̃+q̃)) / [B(W/(1 + W)) (1 + τ)],   W̃ = W/(1 + τ),

where B(x) = ∫_0^x t^(q̃−2)(1 − t)^(p̃) dt is the incomplete Beta function, p̃ = (p + 2b − 3)/2, q̃ = [p(n − 1) + a − 2b]/2, W = S_2/S_1, S_1 = ∑_{i=1}^p ∑_{j=1}^n (Y_ij − Ȳ_i)², Ȳ_i = (1/n) ∑_{j=1}^n Y_ij, S_2 = n ∑_{i=1}^p (Ȳ_i − Ȳ)², and Ȳ = (1/(pn)) ∑_{i=1}^p ∑_{j=1}^n Y_ij.
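Reading note 4 as above (the reconstruction of the scrambled formula is ours), the marginal posterior of τ depends on the data only through W, and it integrates to one over τ ∈ (0, ∞) for any W > 0. A numerical sketch, with made-up values of W, p, n, a, b:

```python
import numpy as np
from scipy.integrate import quad

def tau_marginal(tau, W, p, n, a, b):
    """Marginal posterior of tau = n*sigma_alpha^2/sigma_eps^2 under the
    prior sigma_eps^(-a) (1 + tau)^(-b), following note 4 (our reading of
    the formula)."""
    pt = (p + 2 * b - 3) / 2.0                    # p-tilde
    qt = (p * (n - 1) + a - 2 * b) / 2.0          # q-tilde
    Wt = W / (1.0 + tau)
    # incomplete Beta normalizer B(W/(1+W))
    B, _ = quad(lambda t: t ** (qt - 2) * (1 - t) ** pt, 0, W / (1 + W))
    return Wt ** (qt - 1) * (1 + Wt) ** (-(pt + qt)) / (B * (1 + tau))

total, _ = quad(tau_marginal, 0, np.inf, args=(2.5, 5, 4, 0, 0))
print(total)   # integrates to 1
```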

One-Way Model (unbalanced): See Ye (1990).

X_ij = μ + α_i + ε_ij,   i = 1, …, k and j = 1, …, n_i,

where the α_i are i.i.d. N(0, σ_α²) and, independently, the ε_ij are i.i.d. N(0, σ_ε²). The parameters (μ, σ_α², σ_ε²) are unknown. The reference priors for this model are,

Prior        π(μ, σ_α², σ_ε²)                                                                  (Marginal) Posterior
Uniform      1                                                                                 proper
Jeffreys     σ_ε^(−5) [s_{1,1}(σ_α²/σ_ε²)(n s_{2,2}(σ_α²/σ_ε²) − s_{1,1}(σ_α²/σ_ε²)²)]^(1/2)
Reference1   σ_ε^(−2 − 2C_n) ψ(σ_α²/σ_ε²)
Reference2   σ_ε^(−4) s_{2,2}(σ_α²/σ_ε²)^(1/2)

Here the limiting sequence of compact sets for the reference prior is chosen to be Ω_l = [a_l, b_l] × [c_l, d_l] × [e_l, f_l] for (μ, σ_α², σ_ε²), and

s_{p,q}(x) = ∑_{i=1}^k n_i^p/(1 + n_i x)^q,   for p, q = 0, 1, 2, …,

ψ(x) = [n − k + ∑_{i=1}^k 1/(1 + n_i x)²]^(1/2),

C_n = lim_{l→∞} (log d_l)/(log c_l^(−1)).

1. Group ordering {μ, σ_α², σ_ε²}, {σ_α², μ, σ_ε²}, {σ_α², σ_ε², μ}, {μ, σ_ε², σ_α²}, and {σ_ε², σ_α², μ}.
2. Group ordering {μ, (σ_α², σ_ε²)} and {(σ_α², σ_ε²), μ}.

Suppose v = σ_α²/σ_ε² is the parameter of interest.

Prior        π(μ, v, σ_ε²)                                             (Marginal) Posterior
Uniform      1                                                         proper
Jeffreys     σ_ε^(−3) [s_{1,1}(v)(n s_{2,2}(v) − s_{1,1}(v)²)]^(1/2)
Reference1   σ_ε^(−2) [n s_{2,2}(v) − s_{1,1}(v)²]^(1/2)
Reference2   σ_ε^(−2) s_{2,2}(v)^(1/2)

1. Group ordering {μ, v, σ_ε²}, {v, μ, σ_ε²}, and {v, σ_ε², μ}.
2. Group ordering {μ, σ_ε², v}, {σ_ε², μ, v}, {σ_ε², v, μ}, {μ, (σ_ε², v)}, and {(σ_ε², v), μ}.

Random Coefficient Regression Model:

y_i = X_i β_i + ε_i,

where y_i is a (t_i × 1) vector of observations, X_i is a (t_i × p) constant design matrix, β_i is a (p × 1) vector of random coefficients for the ith experimental subject, and ε_i is a vector of errors, for i = 1, …, n. Furthermore, we assume that (β_i, ε_i, i = 1, …, n) are independent, with β_i ~ MVN(β̄, Λ) and ε_i ~ MVN(0, σ² I_i), where β̄ is the (p × 1) mean of the β_i and I_i is the (t_i × t_i) identity matrix.

Defining Δ = Λ/σ², the Jeffreys prior is

π_J(β̄, Δ, σ²) ∝ |∑_{i=1}^n B_i|^(1/2) |G[∑_{i=1}^n (B_i ⊗ B_i) − vec(∑_{i=1}^n B_i)(vec(∑_{i=1}^n B_i))^t / ∑_{i=1}^n t_i] G^t|^(1/2) / (σ²)^((p+2)/2).

The reference prior with respect to the group ordering {β̄, (Δ, σ²)} or {(Δ, σ²), β̄} is

π_R(β̄, Δ, σ²) ∝ |G ∑_{i=1}^n (B_i ⊗ B_i) G^t|^(1/2) / σ².

The reference prior with respect to the group ordering {(β̄, σ²), Δ} or {Δ, (β̄, σ²)} is

π_R(β̄, Δ, σ²) ∝ |G[∑_{i=1}^n (B_i ⊗ B_i) − vec(∑_{i=1}^n B_i)(vec(∑_{i=1}^n B_i))^t / ∑_{i=1}^n t_i] G^t|^(1/2) / σ²,

where B_i is defined as X_i^t (X_i Δ X_i^t + I_i)^(−1) X_i, and G denotes the (p(p + 1)/2) × p² constant matrix ∂(vec V)/∂(vecp V), where V is a p × p symmetric matrix. The posteriors corresponding to the priors above are proper, provided we have at least 2p + 1 full rank design matrices. Computation is discussed in Yang and Chen (1995).

Random Effects Model: X = {x_ij, j = 1, …, n_i, i = 1, …, k} arises from k populations π_1, …, π_k, where x_ij | θ_i ~ N_p(θ_i, Σ) and θ_i ~ N_p(μ, T). Fatti (1982) considers the usual diffuse prior distribution for Σ and T, and Box and Tiao's noninformative prior distribution. Specifically, Fatti (1982) considers the following type of diffuse joint prior density

π(μ, Σ, T) ∝ |Σ|^(−v_1/2) |T|^(−v_2/2).

This prior has been used by Geisser (1964) and Geisser and Cornfield (1963), for v_1 = v_2 = p + 1. Fatti (1982) also studies the predictive density of a new observation x, under the hypothesis x ∈ π_r. Fatti found that v_2 must be less than 2 for the predictive density to exist. So, T cannot have the usual diffuse prior distribution with v_2 = p + 1. If we assign the values v_1 = p + 1 and v_2 = 1,

f(x|X, x ∈ π_r) ∝ |A_3|^(−(N−p−1)/2) 2F1(p/2, (N − p − 1)/2; (k − 1)/2; A_3^(−1) A_1),   for k > p,

where A_1 = n ∑_{i=1}^k (x̄_i· − x̄_··)(x̄_i· − x̄_··)^t, A_3 = ∑_{i=1}^k ∑_{j=1}^n (x_ij − x̄_··)(x_ij − x̄_··)^t, 2F1(a_1, a_2; b_1; ω) is the hypergeometric function, and

ñ_i = n_i for i ≠ r,   ñ_i = n_r + 1 for i = r,

where we assume n_i = n for all i, x̄_i· = (1/n) ∑_{j=1}^n x_ij, x̄_·· = (1/k) ∑_{i=1}^k x̄_i·, and N = ∑_{i=1}^k n_i.

Box and Tiao (1992) propose the following noninformative joint prior,

π(μ, Σ, T) ∝ |Σ|^(−(p+1)/2) |Σ + nT|^(−(p+1)/2).

This prior distribution gives the predictive density as,

f(x|X, x ∈ π_r) ∝ |A_3|^(−(N−1)/2) 2F1((p + 1)/2, (N − 1)/2; (k + p)/2; A_3^(−1) A_1),   for N > p and k > p.

31 Ratio of Exponential Means


Let X_i ~ind Exponential(θ_i), i = 1, 2. The parameter of interest is λ_1 = θ_1/θ_2. With nuisance parameter λ_2 = θ_1 θ_2, Datta and Ghosh (1993) obtain the reference prior π(λ_1, λ_2) = (λ_1 λ_2)^(−1).
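Since π(λ_1, λ_2) = (λ_1 λ_2)^(−1) corresponds to π(θ_1, θ_2) ∝ (θ_1 θ_2)^(−1), each 1/θ_i has an independent gamma posterior, and the posterior of the ratio is easy to simulate. A sketch (our code; the data are made up, and θ_i denotes the exponential mean):

```python
import numpy as np

rng = np.random.default_rng(0)

def ratio_posterior_draws(x, y, size=100_000):
    """Draws from the posterior of lambda1 = theta1/theta2 under the
    reference prior: with pi(theta_i) ~ 1/theta_i, each precision
    1/theta_i has an independent Gamma(n_i, rate = sum of the sample)
    posterior."""
    inv_t1 = rng.gamma(len(x), 1.0 / np.sum(x), size)   # draws of 1/theta1
    inv_t2 = rng.gamma(len(y), 1.0 / np.sum(y), size)   # draws of 1/theta2
    return inv_t2 / inv_t1                              # theta1/theta2

x = [1.2, 0.7, 2.5, 1.9]   # sample with mean theta1
y = [0.4, 0.9, 0.6, 1.1]   # sample with mean theta2
draws = ratio_posterior_draws(x, y)
print(np.median(draws))    # centered near sum(x)/sum(y) = 2.1
```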

32 Ratio of Normal Means


Suppose X = {X_1, …, X_n} and Y = {Y_1, …, Y_m} are available from two independent normal populations with unknown means μ_1 and μ_2 and unknown common variance σ². The problem is to make inferences about the value of φ = μ_1/μ_2, the ratio of the means. Writing μ_1 = φλ and μ_2 = λ, the Fisher information matrix for one observation from each population is (see Bernardo, 1977)

I(φ, λ, σ) = (1/σ²) ( λ²     λφ       0
                      λφ     1 + φ²   0
                      0      0        4 ).

It follows that the reference prior is

π(φ, λ, σ) ∝ (1 + φ²)^(−1/2) σ^(−1),

or, in terms of the original parameterization (μ_1, μ_2, σ),

π(μ_1, μ_2, σ) ∝ (μ_1² + μ_2²)^(−1/2) σ^(−1).

The reference posterior is

π(φ|D) ∝ (1 + φ²)^(−1/2) (m + φ²n)^(−1/2) {S² + nm(x̄ − φȳ)²/(m + φ²n)}^(−(n+m−1)/2),

where x̄ = ∑_{i=1}^n x_i/n, ȳ = ∑_{i=1}^m y_i/m, and S² = ∑_{i=1}^n (x_i − x̄)² + ∑_{i=1}^m (y_i − ȳ)².

Note that for this problem the usual noninformative prior 1/σ entails the Fieller-Creasy problem: Kappenman, Geisser and Antle (1970) showed that 1/σ can lead to a confidence interval for φ consisting of the whole real line.
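The reference posterior of φ is one-dimensional and can be normalized numerically. A sketch (ours; the summary statistics are made up):

```python
import numpy as np
from scipy.integrate import quad

def phi_post_unnorm(phi, xbar, ybar, S2, n, m):
    """Unnormalized reference posterior of phi = mu1/mu2 (Bernardo, 1977)."""
    return ((1 + phi ** 2) ** -0.5 * (m + n * phi ** 2) ** -0.5
            * (S2 + n * m * (xbar - phi * ybar) ** 2 / (m + n * phi ** 2))
            ** (-(n + m - 1) / 2.0))

xbar, ybar, S2, n, m = 4.1, 2.0, 30.0, 15, 12
# the posterior tails decay like phi^(-2); beyond +/-20 they are negligible here
norm, _ = quad(phi_post_unnorm, -20, 20, args=(xbar, ybar, S2, n, m))
prob, _ = quad(phi_post_unnorm, 1.0, 3.0, args=(xbar, ybar, S2, n, m))
print(prob / norm)   # posterior probability that 1 < phi < 3
```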

33 Ratio of Poisson Means


Let X and Y be independent Poisson random variables with means λ and μ. The parameter of interest is φ = λ/μ. Liseo (1993) computed the Jeffreys prior and reference prior for this problem as

Prior        π(φ, μ)                (Marginal) Posterior
Uniform      1
Jeffreys     φ^(−1/2)               proper
Reference    1/√(φ(1 + φ)μ)         proper

Here the Jeffreys prior and reference prior give the same marginal posterior for φ,

π(φ|D) ∝ φ^(x−1/2)/(1 + φ)^(x+y+1).
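The marginal posterior above is a scaled beta-prime density; its normalizing constant is the Beta function B(x + 1/2, y + 1/2), which gives a quick numerical check (our sketch, with made-up counts):

```python
from scipy.integrate import quad
from scipy.special import beta

x, y = 7, 3   # observed Poisson counts

def phi_marginal_unnorm(phi):
    """Unnormalized marginal posterior of phi = lambda/mu (Liseo, 1993)."""
    return phi ** (x - 0.5) * (1 + phi) ** (-(x + y + 1))

norm, _ = quad(phi_marginal_unnorm, 0, float("inf"))
print(norm, beta(x + 0.5, y + 0.5))   # the two values agree
```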

34 Sequential Analysis
Suppose X_1, X_2, … is an i.i.d. sample with common density function f(x_i|θ) which satisfies the usual regularity conditions. Here X_i and θ are k × 1 and p × 1 vectors, respectively. Let N be the stopping time. From Ye (1993), if 0 < E_θ(N) < ∞, the Jeffreys prior is

π_J(θ) ∝ (E_θ[N])^(p/2) (det I(θ))^(1/2),

where I(θ) is the Fisher information matrix of X_1. Suppose θ = (θ_(1), θ_(2), …, θ_(m)) is an m-ordered group. Furthermore, assume that E_θ(N) = E_{θ_(1)}(N) depends only on θ_(1), and 0 < E_θ(N) < ∞. Then the reference prior is (see Ye, 1993)

π_R(θ_(1), …, θ_(m)) ∝ (E_{θ_(1)}[N])^(p_1/2) π_R*(θ_(1), …, θ_(m)),

where p_1 is the dimension of θ_(1) and π_R*(θ_(1), …, θ_(m)) is the reference prior of θ for X_1, using the same group order and compact subsets.

) is the reference prior of for X1 , using

35 Stress-Strength System
Suppose X_1, …, X_m are i.i.d. Weibull(β, θ_1) random variables and, independently, Y_1, …, Y_n are i.i.d. Weibull(β, θ_2) random variables. The parameter of interest is

ω_1 = P(X_1 < Y_1) = θ_2/(θ_1 + θ_2).

When β = 1, this is the simple stress-strength system under the exponential distribution, with parameter θ as scale parameter. Thompson and Basu (1993) computed the reference prior and showed that the reference prior for (θ_1, θ_2), when ω_1 is the parameter of interest and ω_2 = θ_1 + θ_2 is the nuisance parameter, coincides with the Jeffreys prior. With the reparameterization of ω_1 and ω_2 = θ_1 θ_2/(θ_1 + θ_2), Basu and Sun (1994) computed the following reference priors and gave a necessary and sufficient condition for the existence of a proper posterior,

Prior        π(ω_1, ω_2)                          (Marginal) Posterior
Uniform      1
Jeffreys1    [ω_1(1 − ω_1)ω_2]^(−1)
Reference2   g_1(ω_1, ω_2)/[ω_1(1 − ω_1)ω_2]
Reference3   g_2(ω_1, ω_2)/[ω_1(1 − ω_1)ω_2]
Reference4   g_3(ω_1, ω_2)/[ω_1(1 − ω_1)ω_2]

where

g_1(ω_1, ω_2) = √(1/ν + a(1 − a){log[(1 − ω_1)/ω_1]}²),
g_2(ω_1, ω_2) = √(a(1 − ω_1)² + (1 − a)ω_1²),
g_3(ω_1, ω_2) = g_1(ω_1, ω_2)/√(ν + a{ξ + log[(1 − ω_1)ω_2]}² + (1 − a){ξ + log(ω_1 ω_2)}²),

and

ξ = 1 + ∫_0^∞ (log z) e^(−z) dz,
ν = ∫_0^∞ (log z)² e^(−z) dz − {∫_0^∞ (log z) e^(−z) dz}²,
a = m/(m + n).

1. Also the reference prior for the group orderings {(ω_1, ω_2, β)}, {β, ω_1, ω_2}, and {β, (ω_1, ω_2)}.
2. Group orderings {ω_1, (ω_2, β)} and {ω_1, β, ω_2}.
3. Group ordering {(ω_2, β), ω_1}.
4. Group ordering {ω_1, ω_2, β}.

36 Sum of Squares of Normal Means


Pp
Consider the Np ( 2 i=1 i .

Ip) density, where = (

:::

p ),

and the parameter of interest is

37

Prior Uniform Reference1

:::
1

jj jj;(p;1)

(Marginal) Posterior proper

1. See Datta and Ghosh (1993). This is also the prior obtained by Stein (1985) and Tibshirani (1989). It can be viewed as a hierachical prior with (i) 1 : : : p j i:i:d: N (0 ;1 ), (ii) has the improper gamma density function f ( ) /;3=2 .

37 T Distribution
The T(ν, μ, σ²) density, where ν > 0, −∞ < μ < ∞, and σ² > 0, is given by

f(x|ν, μ, σ²) = [Γ((ν + 1)/2) / (Γ(ν/2) σ (νπ)^(1/2))] [1 + (x − μ)²/(νσ²)]^(−(ν+1)/2).

This is a location-scale parameter problem; see that section for priors.

38 Uniform
The U(θ − a, θ + a), −∞ < θ < ∞, density is

f(x|θ) = (1/(2a)) I_{x ∈ (θ−a, θ+a)}.

The reference prior (Bernardo and Smith, 1994) is the same as the uniform prior, and the reference posterior is the uniform distribution U(max(x_1, …, x_n) − a, min(x_1, …, x_n) + a).

The U(0, θ), 0 < θ < ∞, density is

f(x|θ) = (1/θ) I_{x ∈ (0, θ)}.

The reference prior (Bernardo and Smith, 1994) is π(θ) ∝ θ^(−1), and the reference posterior is the Pareto distribution Pa(max(x_1, …, x_n), n).
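For the U(0, θ) case the Pareto reference posterior has simple closed-form summaries. A sketch (our code, with made-up data):

```python
import numpy as np

def u0theta_reference_posterior_mean(x):
    """Posterior mean of theta for U(0, theta) data under pi(theta) ~ 1/theta.
    The posterior is Pareto(m, n) with m = max(x): density n*m^n/theta^(n+1)
    for theta > m, and mean n*m/(n - 1) when n > 1."""
    x = np.asarray(x, dtype=float)
    n, m = len(x), x.max()
    return n * m / (n - 1)

print(u0theta_reference_posterior_mean([0.9, 2.4, 1.7, 0.3]))   # 4*2.4/3 = 3.2
```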


39 Weibull
The Weibull density, with shape parameter c > 0 and quasi-scale parameter θ > 0, is

f(x|θ, c) = cθ x^(c−1) exp(−θx^c),   x > 0.

Then w = log x has the extreme value density

f(w|θ, c) = cθ exp(cw) exp(−θ exp(cw)),   −∞ < w < ∞.

As indicated in Evans and Nigm (1980), setting σ = c^(−1) and θ = exp(−μ/σ), the density of w becomes

f(w|μ, σ) = σ^(−1) exp{(w − μ)/σ} exp[−exp{(w − μ)/σ}],

in which μ and σ are seen to be location-scale parameters. Therefore, π(μ, σ) = 1/σ. This prior, transformed back to the original parameterization, is π(θ, c) = 1/(θc²). Analysis using the usual noninformative prior is discussed by Evans and Nigm (1980). Sun and Berger (1994) computed the following reference priors:

Prior        π(θ, c)                                                          (Marginal) Posterior
Reference1   1/(cθ)                                                           proper3
Reference2   [c(1 + η − γ² − 2(1 − γ) log θ + (log θ)²)^(1/2)]^(−1)

Here γ is Euler's constant, i.e., γ = −∫_0^∞ (log z) e^(−z) dz, and η = ∫_0^∞ (log z)² e^(−z) dz.

1. Group ordering {(θ, c)} and {c, θ} (also the Jeffreys prior).
2. Group ordering {θ, c}.
3. Provided n > 1 and not all observations are equal.

This density can also be written in the form

f(x|α, β) = βα^(−1) (x/α)^(β−1) exp{−(x/α)^β}.

Under this form, Sun and Berger (1994) computed the reference priors as

Prior        π(α, β)        (Marginal) Posterior
Reference1   (αβ)^(−1)      proper3
Reference2   (αβ)^(−1)

1. Group ordering {(α, β)} (also the Jeffreys prior).
2. Group ordering {α, β} and {β, α}.
3. Provided n > 1 and not all observations are equal.

References
Alba, E.D. and Mendoza, M. (1995). A discrete model for Bayesian forecasting with stable seasonal patterns. Advances in Econometrics 11.

Banerjee, A.K. and Bhattacharyya, G.K. (1979). Bayesian results for the inverse Gaussian distribution with an application. Technometrics 21, No 2, 247-251.

Baranchik, A. (1964). Multiple regression and estimation of the mean of a multivariate normal distribution. Technical Report No 51, Department of Statistics, Stanford University.

Basu, A.P. and Sun, D. (1994). Bayesian analysis for a stress-strength system via reference prior approach. Submitted.

Bayarri, M.J. (1981). Inferencia Bayesiana sobre el coeficiente de correlación de una población normal bivariante. Trab. Estadist. 32, 18-31.

Berger, J.O. (1985). Statistical Decision Theory and Bayesian Analysis. New York: Springer-Verlag.

Berger, J.O. and Bernardo, J.M. (1989). Estimating a product of means: Bayesian analysis with reference priors. J. Amer. Statist. Assoc. 84, 200-207.

Berger, J.O. and Bernardo, J.M. (1992a). Reference priors in a variance components problem. In Bayesian Analysis in Statistics and Econometrics, P. Goel and N.S. Iyengar (eds.). New York: Springer-Verlag.

Berger, J.O. and Bernardo, J.M. (1992b). Ordered group reference priors with application to a multinomial problem. Biometrika 79, 25-37.

Berger, J.O. and Bernardo, J.M. (1992c). On the development of the reference prior method. In Bayesian Statistics 4, J.M. Bernardo, J.O. Berger, D.V. Lindley, and A.F.M. Smith (eds.). London: Oxford University Press.

Berger, J.O. and Strawderman, W.E. (1993). Choice of hierarchical priors: admissibility in estimation of normal means. Technical Report 93-34C, Department of Statistics, Purdue University.

Berger, J.O. and Yang, R. (1994). Noninformative priors and Bayesian testing for the AR(1) model. Econometric Theory, 10, No 3-4, 461-482.

Bernardo, J.M. (1977). Inferences about the ratio of normal means: a Bayesian approach to the Fieller-Creasy problem. In Recent Developments in Statistics, J.R. Barra et al. (eds.). Amsterdam: North-Holland.

Bernardo, J.M. (1979). Reference posterior distributions for Bayesian inference. J. Roy. Statist. Soc. Ser B 41, 113-147 (with discussion).

Bernardo, J.M. and Giron, F.J. (1988). A Bayesian analysis of simple mixture problems. In Bayesian Statistics 3, J.M. Bernardo, M.H. DeGroot, D.V. Lindley, and A.F.M. Smith (eds.). London: Oxford University Press.

Bernardo, J.M. and Ramon, J.M. (1996). An elementary introduction to reference analysis. Tech. Rep. BR96, Universitat de Valencia, Spain.

Bernardo, J.M. and Smith, A.F.M. (1994). Bayesian Theory. New York: John Wiley & Sons.

Box, G.E.P. and Cox, D.R. (1964). An analysis of transformations. J. Roy. Statist. Soc. Ser B 26, 211-252 (with discussion).

Box, G.E.P. and Tiao, G.C. (1992). Bayesian Inference in Statistical Analysis. New York: John Wiley & Sons.

Chaloner, K. (1987). A Bayesian approach to the estimation of variance components for the unbalanced one-way random model. Technometrics, 29, 323-337.

Crowder, M. and Sweeting, T. (1989). Bayesian inference for a bivariate binomial. Biometrika, 76, 599-604.

Datta, G.S. and Ghosh, M. (1993). Some remarks on noninformative priors. Technical Report 93-16, Department of Statistics, University of Georgia.

Eaves, D.M. (1983). On Bayesian nonlinear regression with an enzyme example. Biometrika, 70, 2, 373-379.

Evans, I.G. and Nigm, A.M. (1980). Bayesian prediction for two-parameter Weibull lifetime models. Commun. Statist. - Theor. Meth., A9(6), 649-658.

Fatti, L.P. (1982). Predictive discrimination under the random effect model. South African Statist. J., 16, 55-77.

Geisser, S. (1984). On prior distributions for binary trials (with discussion). Amer. Statist., 38, 244-251.

Geisser, S. (1964). Posterior odds for multivariate normal classifications. J. Roy. Statist. Soc. Ser B 26, 69-76.

Geisser, S. and Cornfield, J. (1963). Posterior distributions for multivariate normal parameters. J. Roy. Statist. Soc. Ser B 25, 368-376.

Howlader, H.A. and Weiss, G. (1988). Bayesian reliability estimation of a two-parameter Cauchy distribution. Biom. J., 30, 3, 329-337.

Ibrahim, J.G. and Laud, P.W. (1991). On Bayesian analysis of generalized linear models using Jeffreys prior. J. Amer. Statist. Assoc. 86, 981-986.

Jeffreys, H. (1961). Theory of Probability. London: Oxford University Press.

Joshi, S. and Shah, M. (1991). Estimating the mean of an inverse Gaussian distribution with known coefficient of variation. Commun. Statist. - Theor. Meth., 20(9), 2907-2912.

Kappenman, R.F., Geisser, S. and Antle, C.F. (1970). Bayesian and fiducial solutions to the Fieller-Creasy problem. Sankhya B, 32, 331-340.

Kass, R.E. and Wasserman, L. (1993). Formal rules for selecting prior distributions: a review and annotated bibliography.

Kubokawa, T. and Robert, C.P. (1994). New perspectives on linear calibration. J. Multivariate Anal., 51, No 1, 178-200.

Liseo, B. (1993). Elimination of nuisance parameters with reference noninformative priors. Biometrika, 80, 295-304.

Muirhead, R.J. (1982). Aspects of Multivariate Statistical Theory. New York: Wiley.

Novick, W.R. and Hall, W.J. (1965). A Bayesian indifference procedure. J. Amer. Statist. Assoc., 60, 1104-1117.

Pericchi, L.R. (1981). A Bayesian approach to transformation to normality. Biometrika, 68, 35-43.

Poirier, D.J. (1992). Jeffreys prior for logit models. To appear in Econometrics.

Polson, N. and Wasserman, L. (1990). Prior distribution for the bivariate binomial. Biometrika, 77, 4, 901-904.

Rissanen, J. (1983). A universal prior for integers and estimation by minimum description length. Ann. Statist., 11, No 2, 416-431.

Sono, S. (1983). On a noninformative prior distribution for Bayesian inference of multinomial distribution's parameters. Ann. Inst. Statist. Math., 35, Part A, 167-174.

Spiegelhalter, D.J. (1985). Exact Bayesian inference on the parameters of a Cauchy distribution with vague prior information. In Bayesian Statistics 2, J.M. Bernardo, M.H. DeGroot, D.V. Lindley, and A.F.M. Smith (eds.). North-Holland: Elsevier Science Publishers B.V.

Stein, C. (1985). On the coverage probability of confidence sets based on a prior distribution. In Sequential Methods in Statistics, Banach Center Publications, 16. PWN-Polish Scientific Publishers, Warsaw.

Sun, D.C. and Berger, J.O. (1994). Bayesian sequential reliability for Weibull and related distributions. Ann. Inst. Statist. Math., 46, No 2, 221-249.

Sun, D.C. and Ye, K.Y. (1994a). Inference on a product of normal means with unknown variances. Submitted to Biometrika.

Sun, D.C. and Ye, K.Y. (1994b). Frequentist validity of posterior quantiles for a two-parameter exponential family. Submitted to Biometrika.

Sun, D.C. and Ye, K.Y. (1995). Reference prior Bayesian analysis for normal mean products. J. Amer. Statist. Assoc., 90.

Tibshirani, R. (1989). Noninformative priors for one parameter of many. Biometrika, 76, 604-608.

Tiwari, R.C., Chib, S. and Jammalamadaka, S.R. (1989). Bayes estimation of the multiple correlation coefficient. Commun. Statist. - Theor. Meth., 18(4), 1401-1413.

Villegas, C. (1977). On the representation of ignorance. J. Amer. Statist. Assoc., 72, 651-654.

Wixley, R.A.J. (1993). Data translated likelihood and choice of prior for power transformation to normality.

Yang, R. and Berger, J.O. (1994). Estimation of a covariance matrix using the reference prior. Ann. Statist., 22, No 3, 1195-1211.

Yang, R. and Chen, M.H. (1995). Bayesian analysis for random coefficient regression models using noninformative priors. J. Multivariate Anal., 55, No 2, 283-311.

Yang, R. and Pyne, D. (1996). Bayesian analysis with mixed model in unbalanced case. Draft.

Ye, K.Y. (1990). Noninformative priors in Bayesian analysis. Ph.D. Dissertation, Purdue University.

Ye, K.Y. (1994). Bayesian reference prior analysis on the ratio of variances for the balanced one-way random effect model. J. Statist. Plann. Inference, 41, No 3, 267-280.

Ye, K.Y. (1993). Reference priors when the stopping rule depends on the parameter of interest. J. Amer. Statist. Assoc., 88, 360-363.

Ye, K.Y. and Berger, J.O. (1991). Noninformative priors for inference in exponential regression models. Biometrika 78, 645-656.

Zellner, A. (1971). An Introduction to Bayesian Inference in Econometrics. New York: John Wiley & Sons.

Zellner, A. (1984). Maximal data information prior distributions. In Zellner, A., Basic Issues in Econometrics. U. of Chicago Press.

Zellner, A. (1993). Models, prior information and Bayesian analysis.

44
