
Tina Memo No. 2003-003
Internal Report

Products and Convolutions of Gaussian Probability Density Functions

P.A. Bromiley

Last updated 22 / 6 / 2018

Division of Informatics, Imaging and Data Sciences,
School of Health Sciences, University of Manchester,
Stopford Building, Oxford Road,
Manchester, M13 9PT.
paul.bromiley@manchester.ac.uk

Abstract

It is well known that the product and the convolution of Gaussian probability density functions (PDFs)
are also Gaussian functions. This document provides proofs of this for several cases: the product of
two univariate Gaussian PDFs, the product of an arbitrary number of univariate Gaussian PDFs, the
product of an arbitrary number of multivariate Gaussian PDFs, and the convolution of two univariate
Gaussian PDFs. These results are useful in calculating the effects of smoothing applied as an
intermediate step in various algorithms.

1 The Product of Two Univariate Gaussian PDFs


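Let $f(x)$ and $g(x)$ be Gaussian PDFs with arbitrary means $\mu_f$ and $\mu_g$ and standard deviations $\sigma_f$ and $\sigma_g$:

$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma_f}\, e^{-\frac{(x-\mu_f)^2}{2\sigma_f^2}} \quad\text{and}\quad g(x) = \frac{1}{\sqrt{2\pi}\,\sigma_g}\, e^{-\frac{(x-\mu_g)^2}{2\sigma_g^2}}$$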

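Their product is

$$f(x)g(x) = \frac{1}{2\pi\sigma_f\sigma_g} \exp\left[-\left(\frac{(x-\mu_f)^2}{2\sigma_f^2} + \frac{(x-\mu_g)^2}{2\sigma_g^2}\right)\right] \tag{1}$$

Examine the term in the exponent

$$\beta = \frac{(x-\mu_f)^2}{2\sigma_f^2} + \frac{(x-\mu_g)^2}{2\sigma_g^2}$$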
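Expanding the two quadratics and collecting terms in powers of $x$ gives

$$\beta = \frac{(\sigma_f^2+\sigma_g^2)x^2 - 2(\mu_f\sigma_g^2 + \mu_g\sigma_f^2)x + \mu_f^2\sigma_g^2 + \mu_g^2\sigma_f^2}{2\sigma_f^2\sigma_g^2}$$

Dividing through by the coefficient of $x^2$ gives

$$\beta = \frac{x^2 - 2\,\dfrac{\mu_f\sigma_g^2 + \mu_g\sigma_f^2}{\sigma_f^2+\sigma_g^2}\,x + \dfrac{\mu_f^2\sigma_g^2 + \mu_g^2\sigma_f^2}{\sigma_f^2+\sigma_g^2}}{2\,\dfrac{\sigma_f^2\sigma_g^2}{\sigma_f^2+\sigma_g^2}} \tag{2}$$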

This is again a quadratic in $x$, and so Eq. 1 is a Gaussian function. Compare the terms in Eq. 2 to the usual
Gaussian form

$$P(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}} = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{x^2 - 2\mu x + \mu^2}{2\sigma^2}}$$

Since a term $\epsilon$ that is independent of $x$ can be added to complete the square in $\beta$, this is sufficient to complete the
proof in cases where the normalisation can be ignored. The product of two Gaussian PDFs is proportional to a
Gaussian PDF with a mean that is minus half the coefficient of $x$ in the numerator of Eq. 2 and a standard deviation
that is the square root of half of the denominator, i.e.

$$\sigma_{fg} = \sqrt{\frac{\sigma_f^2\sigma_g^2}{\sigma_f^2+\sigma_g^2}} \quad\text{and}\quad \mu_{fg} = \frac{\mu_f\sigma_g^2 + \mu_g\sigma_f^2}{\sigma_f^2+\sigma_g^2} \tag{3}$$

i.e. the variance $\sigma_{fg}^2$ is half the harmonic mean of the individual variances $\sigma_f^2$ and $\sigma_g^2$, and the mean $\mu_{fg}$ is the
sum of the individual means $\mu_f$ and $\mu_g$, each weighted by the variance of the other distribution and normalised by
the sum of the variances (equivalently, each mean is weighted by its own inverse variance). In general, the product
is not itself a PDF as, due to the presence of the scaling factor, it will not have the correct normalisation.
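The product $f(x)g(x)$ can now be written in the usual Gaussian form directly, with an unknown scaling constant
(this may be sufficient in cases where renormalisation can be applied). Alternatively, proceeding from Eq. 2,
suppose that $\epsilon$ is the term required to complete the square in $\beta$, i.e.

$$\epsilon = \frac{\left(\dfrac{\mu_f\sigma_g^2 + \mu_g\sigma_f^2}{\sigma_f^2+\sigma_g^2}\right)^2 - \left(\dfrac{\mu_f\sigma_g^2 + \mu_g\sigma_f^2}{\sigma_f^2+\sigma_g^2}\right)^2}{\dfrac{2\sigma_f^2\sigma_g^2}{\sigma_f^2+\sigma_g^2}} = 0$$

Adding this term to $\beta$ gives

$$\beta = \frac{x^2 - 2x\,\dfrac{\mu_f\sigma_g^2 + \mu_g\sigma_f^2}{\sigma_f^2+\sigma_g^2} + \left(\dfrac{\mu_f\sigma_g^2 + \mu_g\sigma_f^2}{\sigma_f^2+\sigma_g^2}\right)^2}{\dfrac{2\sigma_f^2\sigma_g^2}{\sigma_f^2+\sigma_g^2}} + \frac{\dfrac{\mu_f^2\sigma_g^2 + \mu_g^2\sigma_f^2}{\sigma_f^2+\sigma_g^2} - \left(\dfrac{\mu_f\sigma_g^2 + \mu_g\sigma_f^2}{\sigma_f^2+\sigma_g^2}\right)^2}{\dfrac{2\sigma_f^2\sigma_g^2}{\sigma_f^2+\sigma_g^2}}$$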

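After some manipulation, this reduces to

$$\beta = \frac{\left(x - \dfrac{\mu_f\sigma_g^2 + \mu_g\sigma_f^2}{\sigma_f^2+\sigma_g^2}\right)^2}{2\,\dfrac{\sigma_f^2\sigma_g^2}{\sigma_f^2+\sigma_g^2}} + \frac{(\mu_f-\mu_g)^2}{2(\sigma_f^2+\sigma_g^2)} = \frac{(x-\mu_{fg})^2}{2\sigma_{fg}^2} + \frac{(\mu_f-\mu_g)^2}{2(\sigma_f^2+\sigma_g^2)}$$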
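The manipulation behind the second term can be spelled out as follows. This is only a sketch of one route through the algebra; the shorthand $\Sigma = \sigma_f^2 + \sigma_g^2$ is introduced here for brevity and is not part of the memo's notation.

```latex
% Shorthand (assumption, for brevity only): \Sigma = \sigma_f^2 + \sigma_g^2
\begin{align*}
\frac{\mu_f^2\sigma_g^2 + \mu_g^2\sigma_f^2}{\Sigma}
  - \left(\frac{\mu_f\sigma_g^2 + \mu_g\sigma_f^2}{\Sigma}\right)^2
&= \frac{(\mu_f^2\sigma_g^2 + \mu_g^2\sigma_f^2)(\sigma_f^2+\sigma_g^2)
        - (\mu_f\sigma_g^2 + \mu_g\sigma_f^2)^2}{\Sigma^2} \\
&= \frac{\sigma_f^2\sigma_g^2\left(\mu_f^2 - 2\mu_f\mu_g + \mu_g^2\right)}{\Sigma^2}
 = \frac{\sigma_f^2\sigma_g^2(\mu_f-\mu_g)^2}{\Sigma^2}
\end{align*}
% Dividing by the denominator 2\sigma_f^2\sigma_g^2/\Sigma gives the x-independent term of \beta:
\[
\frac{\sigma_f^2\sigma_g^2(\mu_f-\mu_g)^2}{\Sigma^2}\cdot\frac{\Sigma}{2\sigma_f^2\sigma_g^2}
 = \frac{(\mu_f-\mu_g)^2}{2(\sigma_f^2+\sigma_g^2)}
\]
```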

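Substituting back into Eq. 1 gives

$$f(x)g(x) = \frac{1}{2\pi\sigma_f\sigma_g} \exp\left[-\frac{(x-\mu_{fg})^2}{2\sigma_{fg}^2}\right] \exp\left[-\frac{(\mu_f-\mu_g)^2}{2(\sigma_f^2+\sigma_g^2)}\right]$$

Multiplying by $\sigma_{fg}/\sigma_{fg}$ and rearranging gives

$$f(x)g(x) = \frac{1}{\sqrt{2\pi}\,\sigma_{fg}} \exp\left[-\frac{(x-\mu_{fg})^2}{2\sigma_{fg}^2}\right] \frac{1}{\sqrt{2\pi(\sigma_f^2+\sigma_g^2)}} \exp\left[-\frac{(\mu_f-\mu_g)^2}{2(\sigma_f^2+\sigma_g^2)}\right]$$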

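Therefore, the product of two Gaussian PDFs $f(x)$ and $g(x)$ is a scaled Gaussian PDF

$$f(x)g(x) = \frac{S_{fg}}{\sqrt{2\pi}\,\sigma_{fg}} \exp\left[-\frac{(x-\mu_{fg})^2}{2\sigma_{fg}^2}\right]$$

where

$$\sigma_{fg} = \sqrt{\frac{\sigma_f^2\sigma_g^2}{\sigma_f^2+\sigma_g^2}} \quad\text{and}\quad \mu_{fg} = \frac{\mu_f\sigma_g^2 + \mu_g\sigma_f^2}{\sigma_f^2+\sigma_g^2} \tag{4}$$

and the scaling factor $S_{fg}$ is itself a Gaussian PDF on both $\mu_f$ and $\mu_g$ with standard deviation $\sqrt{\sigma_f^2+\sigma_g^2}$

$$S_{fg} = \frac{1}{\sqrt{2\pi(\sigma_f^2+\sigma_g^2)}} \exp\left[-\frac{(\mu_f-\mu_g)^2}{2(\sigma_f^2+\sigma_g^2)}\right]$$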
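The result can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names `gaussian_pdf` and `product_params` and the test values are illustrative assumptions, not part of the memo.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Univariate Gaussian PDF N(mu, sigma) evaluated at x."""
    return math.exp(-(x - mu) ** 2 / (2.0 * sigma ** 2)) / (math.sqrt(2.0 * math.pi) * sigma)

def product_params(mu_f, sigma_f, mu_g, sigma_g):
    """Return (mu_fg, sigma_fg, S_fg) for the product of two Gaussian PDFs (Eq. 4)."""
    var_sum = sigma_f ** 2 + sigma_g ** 2
    sigma_fg = math.sqrt(sigma_f ** 2 * sigma_g ** 2 / var_sum)
    mu_fg = (mu_f * sigma_g ** 2 + mu_g * sigma_f ** 2) / var_sum
    # The scaling factor is a Gaussian PDF in mu_f with mean mu_g and
    # standard deviation sqrt(sigma_f^2 + sigma_g^2).
    s_fg = gaussian_pdf(mu_f, mu_g, math.sqrt(var_sum))
    return mu_fg, sigma_fg, s_fg

# Illustrative parameters (arbitrary choices).
mu_fg, sigma_fg, s_fg = product_params(1.0, 2.0, 4.0, 1.0)

# Compare f(x)g(x) with S_fg * N(mu_fg, sigma_fg) at a few points.
max_err = max(
    abs(gaussian_pdf(x, 1.0, 2.0) * gaussian_pdf(x, 4.0, 1.0)
        - s_fg * gaussian_pdf(x, mu_fg, sigma_fg))
    for x in [-3.0, 0.0, 1.0, 2.5, 3.4, 4.0, 7.0]
)
print(mu_fg, sigma_fg, s_fg, max_err)
```

The combined mean lies closer to the mean of the narrower factor, as expected from the inverse-variance weighting.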

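These can be written more conveniently as

$$\frac{1}{\sigma_{fg}^2} = \frac{1}{\sigma_f^2} + \frac{1}{\sigma_g^2}, \quad \mu_{fg} = \left(\frac{\mu_f}{\sigma_f^2} + \frac{\mu_g}{\sigma_g^2}\right)\sigma_{fg}^2 \quad\text{and}\quad S_{fg} = \frac{1}{\sqrt{2\pi\dfrac{\sigma_f^2\sigma_g^2}{\sigma_{fg}^2}}} \exp\left[-\frac{1}{2}\frac{(\mu_f-\mu_g)^2}{\sigma_f^2\sigma_g^2}\,\sigma_{fg}^2\right] \tag{5}$$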

It is much easier to generate a proof by induction for the scaling factor of products of larger numbers of Gaussians
if it is written in the form of a sum of terms, each of which involves a single subscript, i.e. the parameters of a
single Gaussian PDF. Appendix A provides the necessary proof, giving

$$S_{fg} = \frac{1}{\sqrt{2\pi\dfrac{\sigma_f^2\sigma_g^2}{\sigma_{fg}^2}}} \exp\left[-\frac{1}{2}\left(\frac{\mu_f^2}{\sigma_f^2} + \frac{\mu_g^2}{\sigma_g^2} - \frac{\mu_{fg}^2}{\sigma_{fg}^2}\right)\right] \tag{6}$$
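As a quick sanity check, the two forms of the scaling factor (Eq. 5 and Eq. 6) can be evaluated side by side. This is a sketch under assumed names (`scale_eq5`, `scale_eq6`) and arbitrary test values.

```python
import math

def scale_eq5(mu_f, sigma_f, mu_g, sigma_g):
    """Scaling factor in the pairwise-difference form of Eq. 5."""
    var_fg = 1.0 / (1.0 / sigma_f ** 2 + 1.0 / sigma_g ** 2)
    norm = math.sqrt(2.0 * math.pi * sigma_f ** 2 * sigma_g ** 2 / var_fg)
    expo = (mu_f - mu_g) ** 2 / (sigma_f ** 2 * sigma_g ** 2) * var_fg
    return math.exp(-0.5 * expo) / norm

def scale_eq6(mu_f, sigma_f, mu_g, sigma_g):
    """Scaling factor in the single-subscript form of Eq. 6."""
    var_fg = 1.0 / (1.0 / sigma_f ** 2 + 1.0 / sigma_g ** 2)
    mu_fg = (mu_f / sigma_f ** 2 + mu_g / sigma_g ** 2) * var_fg
    norm = math.sqrt(2.0 * math.pi * sigma_f ** 2 * sigma_g ** 2 / var_fg)
    expo = mu_f ** 2 / sigma_f ** 2 + mu_g ** 2 / sigma_g ** 2 - mu_fg ** 2 / var_fg
    return math.exp(-0.5 * expo) / norm

s5 = scale_eq5(1.0, 2.0, 4.0, 1.0)
s6 = scale_eq6(1.0, 2.0, 4.0, 1.0)
print(s5, s6)
```

The form in Eq. 6 subtracts quantities of similar size, so for means that are large compared with the standard deviations it is more exposed to floating-point cancellation than Eq. 5.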

2 The Product of n Univariate Gaussian PDFs
Let $N(\mu, \sigma)$ represent a Gaussian PDF with mean $\mu$ and standard deviation $\sigma$. Let subscript $i$ refer to an
individual Gaussian PDF in a product of $n$ univariate Gaussian PDFs. Furthermore, let the subscript $i=1\ldots n$
refer to the parameters of the distribution that is the product of $n$ individual Gaussian PDFs and subscripts of the
form $i=(1\ldots n-1)n$ refer to the parameters of a distribution that is the product of two Gaussian PDFs, one of
which is itself the product of $n-1$ Gaussian PDFs. Therefore, the results from Section 1 can be applied to the
first two Gaussian PDFs in the product of $n$ Gaussian PDFs to produce a Gaussian PDF and a scaling factor. The
remaining $n-2$ PDFs can then be introduced iteratively using the same expressions, i.e.

$$\begin{aligned}
\prod_{i=1}^{n} N(\mu_i, \sigma_i) &= S_{i=1\ldots 2}\, N(\mu_{i=1\ldots 2}, \sigma_{i=1\ldots 2}) \prod_{i=3}^{n} N(\mu_i, \sigma_i) \\
&= S_{i=1\ldots 2}\, S_{(i=1\ldots 2)3}\, N(\mu_{(i=1\ldots 2)3}, \sigma_{(i=1\ldots 2)3}) \prod_{i=4}^{n} N(\mu_i, \sigma_i) = \ldots \\
&= S_{i=1\ldots 2} \ldots S_{(\ldots((i=1\ldots 2)3)\ldots n)}\, N(\mu_{(\ldots((i=1\ldots 2)3)\ldots n)}, \sigma_{(\ldots((i=1\ldots 2)3)\ldots n)}) = S_{i=1\ldots n}\, N(\mu_{i=1\ldots n}, \sigma_{i=1\ldots n})
\end{aligned}$$


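Applying the expression for the standard deviation from Eq. 5 iteratively gives

$$\frac{1}{\sigma_{i=1\ldots n}^2} = \frac{1}{\sigma_{i=1\ldots n-1}^2} + \frac{1}{\sigma_n^2} = \frac{1}{\sigma_{i=1\ldots n-2}^2} + \frac{1}{\sigma_{n-1}^2} + \frac{1}{\sigma_n^2} = \ldots = \sum_{i=1}^{n} \frac{1}{\sigma_i^2} \tag{7}$$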

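Similarly, the mean is given by

$$\begin{aligned}
\mu_{i=1\ldots n} &= \left(\frac{\mu_{i=1\ldots n-1}}{\sigma_{i=1\ldots n-1}^2} + \frac{\mu_n}{\sigma_n^2}\right)\sigma_{i=1\ldots n}^2 = \left(\left(\frac{\mu_{i=1\ldots n-2}}{\sigma_{i=1\ldots n-2}^2} + \frac{\mu_{n-1}}{\sigma_{n-1}^2}\right)\frac{\sigma_{i=1\ldots n-1}^2}{\sigma_{i=1\ldots n-1}^2} + \frac{\mu_n}{\sigma_n^2}\right)\sigma_{i=1\ldots n}^2 \\
&= \left(\frac{\mu_{i=1\ldots n-2}}{\sigma_{i=1\ldots n-2}^2} + \frac{\mu_{n-1}}{\sigma_{n-1}^2} + \frac{\mu_n}{\sigma_n^2}\right)\sigma_{i=1\ldots n}^2 = \ldots = \left[\sum_{i=1}^{n} \frac{\mu_i}{\sigma_i^2}\right]\sigma_{i=1\ldots n}^2
\end{aligned} \tag{8}$$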

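By inspection of Eq. 6, state the form

$$S_{i=1\ldots n} = \frac{1}{(2\pi)^{(n-1)/2}} \sqrt{\frac{\sigma_{i=1\ldots n}^2}{\prod_{i=1}^{n}\sigma_i^2}} \exp\left[-\frac{1}{2}\left(\sum_{i=1}^{n}\frac{\mu_i^2}{\sigma_i^2} - \frac{\mu_{i=1\ldots n}^2}{\sigma_{i=1\ldots n}^2}\right)\right] \tag{9}$$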

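for the scaling factor. Similarly, using Eq. 7 to manipulate some of the standard deviation terms,

$$S_{(i=1\ldots n)(n+1)} = \frac{1}{(2\pi)^{1/2}} \sqrt{\frac{\sigma_{i=1\ldots n+1}^2}{\sigma_{i=1\ldots n}^2\,\sigma_{n+1}^2}} \exp\left[-\frac{1}{2}\left(\frac{\mu_{i=1\ldots n}^2}{\sigma_{i=1\ldots n}^2} + \frac{\mu_{n+1}^2}{\sigma_{n+1}^2} - \frac{\mu_{(i=1\ldots n)(n+1)}^2}{\sigma_{(i=1\ldots n)(n+1)}^2}\right)\right]$$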

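The scaling factor is the product of individual scaling factors for each pairwise multiplication, so

$$S_{i=1\ldots n+1} = S_{i=1\ldots n}\, S_{(i=1\ldots n)(n+1)} = \frac{1}{(2\pi)^{n/2}} \sqrt{\frac{\sigma_{i=1\ldots n}^2}{\prod_{i=1}^{n}\sigma_i^2}\,\frac{\sigma_{i=1\ldots n+1}^2}{\sigma_{i=1\ldots n}^2\,\sigma_{n+1}^2}} \exp\left[-\frac{1}{2}\left(\sum_{i=1}^{n}\frac{\mu_i^2}{\sigma_i^2} + \frac{\mu_{n+1}^2}{\sigma_{n+1}^2} - \frac{\mu_{(i=1\ldots n)(n+1)}^2}{\sigma_{(i=1\ldots n)(n+1)}^2}\right)\right]$$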
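This gives two terms to deal with. First, the standard deviation term (written here as the reciprocal of the
quantity under the square root)

$$\frac{\prod_{i=1}^{n}\sigma_i^2\;\sigma_{i=1\ldots n}^2\,\sigma_{n+1}^2}{\sigma_{i=1\ldots n}^2\,\sigma_{i=1\ldots n+1}^2} = \frac{\sigma_{n+1}^2\prod_{i=1}^{n}\sigma_i^2}{\sigma_{i=1\ldots n+1}^2} = \frac{\prod_{i=1}^{n+1}\sigma_i^2}{\sigma_{i=1\ldots n+1}^2}$$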

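Second, the term in the exponent; using Eq. 8 gives

$$\frac{\mu_{(i=1\ldots n)(n+1)}}{\sigma_{(i=1\ldots n)(n+1)}^2} = \frac{\mu_{i=1\ldots n}}{\sigma_{i=1\ldots n}^2} + \frac{\mu_{n+1}}{\sigma_{n+1}^2} = \sum_{i=1}^{n}\frac{\mu_i}{\sigma_i^2} + \frac{\mu_{n+1}}{\sigma_{n+1}^2} = \sum_{i=1}^{n+1}\frac{\mu_i}{\sigma_i^2} = \frac{\mu_{i=1\ldots n+1}}{\sigma_{i=1\ldots n+1}^2}$$

Since Eq. 7 gives $\sigma_{(i=1\ldots n)(n+1)}^2 = \sigma_{i=1\ldots n+1}^2$, it follows that $\mu_{(i=1\ldots n)(n+1)} = \mu_{i=1\ldots n+1}$ and hence
$\mu_{(i=1\ldots n)(n+1)}^2 / \sigma_{(i=1\ldots n)(n+1)}^2 = \mu_{i=1\ldots n+1}^2 / \sigma_{i=1\ldots n+1}^2$.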

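Therefore

$$\sum_{i=1}^{n}\frac{\mu_i^2}{\sigma_i^2} + \frac{\mu_{n+1}^2}{\sigma_{n+1}^2} - \frac{\mu_{(i=1\ldots n)(n+1)}^2}{\sigma_{(i=1\ldots n)(n+1)}^2} = \sum_{i=1}^{n+1}\frac{\mu_i^2}{\sigma_i^2} - \frac{\mu_{i=1\ldots n+1}^2}{\sigma_{i=1\ldots n+1}^2}$$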

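So

$$S_{i=1\ldots n+1} = \frac{1}{(2\pi)^{n/2}} \sqrt{\frac{\sigma_{i=1\ldots n+1}^2}{\prod_{i=1}^{n+1}\sigma_i^2}} \exp\left[-\frac{1}{2}\left(\sum_{i=1}^{n+1}\frac{\mu_i^2}{\sigma_i^2} - \frac{\mu_{i=1\ldots n+1}^2}{\sigma_{i=1\ldots n+1}^2}\right)\right]$$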
which, together with Eq. 6, constitutes a proof by induction of Eq. 9. As with the product of two univariate
Gaussian PDFs, the scaling factor is a Gaussian function. However, it is not a PDF, as it does not have the correct
normalisation.
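Eqs. 7, 8 and 9 translate directly into a few lines of code. The sketch below is illustrative only: the function names and the test parameters are assumptions, and `math.prod` requires Python 3.8 or later.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Univariate Gaussian PDF N(mu, sigma) evaluated at x."""
    return math.exp(-(x - mu) ** 2 / (2.0 * sigma ** 2)) / (math.sqrt(2.0 * math.pi) * sigma)

def product_of_gaussians(mus, sigmas):
    """Return (mu, sigma, S) for the product of the PDFs N(mu_i, sigma_i)."""
    n = len(mus)
    var = 1.0 / sum(1.0 / s ** 2 for s in sigmas)                # Eq. 7
    mu = var * sum(m / s ** 2 for m, s in zip(mus, sigmas))      # Eq. 8
    prod_var = math.prod(s ** 2 for s in sigmas)
    expo = sum(m ** 2 / s ** 2 for m, s in zip(mus, sigmas)) - mu ** 2 / var
    scale = (math.sqrt(var / prod_var) * math.exp(-0.5 * expo)
             / (2.0 * math.pi) ** ((n - 1) / 2.0))               # Eq. 9
    return mu, math.sqrt(var), scale

# Illustrative parameters (arbitrary choices).
mus = [0.0, 1.0, -2.0, 0.5]
sigmas = [1.0, 2.0, 0.5, 1.5]
mu, sigma, scale = product_of_gaussians(mus, sigmas)

# Largest relative difference between the direct product and S * N(mu, sigma).
max_rel_err = 0.0
for x in [-1.0, 0.0, 0.3, 1.0]:
    direct = math.prod(gaussian_pdf(x, m, s) for m, s in zip(mus, sigmas))
    closed = scale * gaussian_pdf(x, mu, sigma)
    max_rel_err = max(max_rel_err, abs(direct - closed) / direct)
print(mu, sigma, scale, max_rel_err)
```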

3 The Product of n Multivariate Gaussian PDFs


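The multivariate Gaussian PDF can be written as

$$p(\mathbf{x}) = \frac{1}{(2\pi)^{d/2}\sqrt{|\mathbf{V}|}} \exp\left[-\frac{1}{2}(\mathbf{x}-\boldsymbol{\mu})^T \mathbf{V}^{-1} (\mathbf{x}-\boldsymbol{\mu})\right]$$

where $d$ is the dimensionality of $\mathbf{x}$, $\boldsymbol{\mu}$ is the $d$-dimensional mean vector, and $\mathbf{V}$ is the $d$-by-$d$ covariance
matrix; this document adopts the standard notation of using bold face symbols to represent vectors and matrices.
The Gaussian PDF can also be written in canonical notation as

$$p(\mathbf{x}) = \exp\left[\zeta + \boldsymbol{\eta}^T\mathbf{x} - \frac{1}{2}\mathbf{x}^T\boldsymbol{\Lambda}\mathbf{x}\right] \tag{10}$$

where

$$\boldsymbol{\Lambda} = \mathbf{V}^{-1}, \quad \boldsymbol{\eta} = \mathbf{V}^{-1}\boldsymbol{\mu} \quad\text{and}\quad \zeta = -\frac{1}{2}\left(d\log 2\pi - \log|\boldsymbol{\Lambda}| + \boldsymbol{\eta}^T\boldsymbol{\Lambda}^{-1}\boldsymbol{\eta}\right)$$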
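So the product of $n$ Gaussian PDFs $i = 1\ldots n$ is

$$\prod_{i=1}^{n} p_i(\mathbf{x}) = \exp\left[\zeta_{i=1\ldots n} + \left(\sum_{i=1}^{n}\boldsymbol{\eta}_i\right)^T\mathbf{x} - \frac{1}{2}\mathbf{x}^T\left(\sum_{i=1}^{n}\boldsymbol{\Lambda}_i\right)\mathbf{x}\right]$$

where

$$\zeta_{i=1\ldots n} = \sum_{i=1}^{n}\zeta_i = -\frac{1}{2}\left(nd\log 2\pi - \sum_{i=1}^{n}\log|\boldsymbol{\Lambda}_i| + \sum_{i=1}^{n}\boldsymbol{\eta}_i^T\boldsymbol{\Lambda}_i^{-1}\boldsymbol{\eta}_i\right)$$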

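So

$$\begin{aligned}
\prod_{i=1}^{n} p_i(\mathbf{x}) &= \exp\left[\zeta_{i=1\ldots n} + \zeta_n - \zeta_n + \left(\sum_{i=1}^{n}\boldsymbol{\eta}_i\right)^T\mathbf{x} - \frac{1}{2}\mathbf{x}^T\left(\sum_{i=1}^{n}\boldsymbol{\Lambda}_i\right)\mathbf{x}\right] \\
&= \exp(\zeta_{i=1\ldots n} - \zeta_n)\,\exp\left[\zeta_n + \boldsymbol{\eta}_n^T\mathbf{x} - \frac{1}{2}\mathbf{x}^T\boldsymbol{\Lambda}_n\mathbf{x}\right]
\end{aligned} \tag{11}$$

where

$$\boldsymbol{\Lambda}_n = \sum_{i=1}^{n}\boldsymbol{\Lambda}_i, \quad \boldsymbol{\eta}_n = \sum_{i=1}^{n}\boldsymbol{\eta}_i$$

and

$$\zeta_n = -\frac{1}{2}\left(d\log 2\pi - \log|\boldsymbol{\Lambda}_n| + \boldsymbol{\eta}_n^T\boldsymbol{\Lambda}_n^{-1}\boldsymbol{\eta}_n\right) \tag{12}$$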
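Comparing Eqs. 10, 11 and 12 shows that the result is, as in the previous sections, a scaled Gaussian PDF over $\mathbf{x}$
with a mean vector and covariance matrix given by

$$\mathbf{V}_n^{-1} = \sum_{i=1}^{n}\mathbf{V}_i^{-1} \quad\text{and}\quad \mathbf{V}_n^{-1}\boldsymbol{\mu}_n = \sum_{i=1}^{n}\mathbf{V}_i^{-1}\boldsymbol{\mu}_i$$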

The scaling factor is again a Gaussian function.
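The multivariate result can be illustrated in two dimensions without any external libraries. The sketch below is an assumption-laden illustration: the helper names (`inv2`, `matvec`, `combine`, `pdf2`) and the test matrices are invented for this example, and matrices are stored as nested tuples. If the product is a scaled Gaussian, the ratio of the product to the combined Gaussian PDF must be the same constant at every point.

```python
import math

def inv2(m):
    """Inverse of a 2x2 matrix given as ((a, b), (c, d))."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))

def matvec(m, v):
    """Product of a 2x2 matrix and a 2-vector."""
    return (m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1])

def combine(means, covs):
    """Mean vector and covariance matrix of the Gaussian factor of the product."""
    precisions = [inv2(v) for v in covs]                        # Lambda_i = V_i^{-1}
    etas = [matvec(p, m) for p, m in zip(precisions, means)]    # eta_i = V_i^{-1} mu_i
    lam = tuple(tuple(sum(p[r][c] for p in precisions) for c in range(2)) for r in range(2))
    eta = tuple(sum(e[r] for e in etas) for r in range(2))
    cov = inv2(lam)                                             # V_n = Lambda_n^{-1}
    return matvec(cov, eta), cov                                # mu_n = V_n eta_n

def pdf2(x, mean, cov):
    """Bivariate Gaussian PDF evaluated at the point x."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx = (x[0] - mean[0], x[1] - mean[1])
    pdx = matvec(inv2(cov), dx)
    return math.exp(-0.5 * (dx[0] * pdx[0] + dx[1] * pdx[1])) / (2.0 * math.pi * math.sqrt(det))

# Illustrative parameters (arbitrary symmetric positive-definite covariances).
means = [(0.0, 0.0), (1.0, 2.0)]
covs = [((2.0, 0.5), (0.5, 1.0)), ((1.0, -0.3), (-0.3, 1.5))]
mean_n, cov_n = combine(means, covs)

points = [(0.0, 0.0), (1.0, 1.0), (-1.0, 2.0), (0.5, -0.5)]
ratios = [pdf2(x, means[0], covs[0]) * pdf2(x, means[1], covs[1]) / pdf2(x, mean_n, cov_n)
          for x in points]
print(mean_n, cov_n, ratios)
```

The common value of the ratios is the scaling factor $\exp(\zeta_{i=1\ldots n} - \zeta_n)$ for these parameters.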

4 The Convolution of Two Univariate Gaussian PDFs
We wish to find the convolution of two Gaussian PDFs

$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma_f}\, e^{-\frac{(x-\mu_f)^2}{2\sigma_f^2}} \quad\text{and}\quad g(x) = \frac{1}{\sqrt{2\pi}\,\sigma_g}\, e^{-\frac{(x-\mu_g)^2}{2\sigma_g^2}}$$

in the most general case, i.e. non-identical means. The convolution of two functions $f(t)$ and $g(t)$ over a finite
range¹ is defined as

$$\int_0^x f(x-\tau)\,g(\tau)\,d\tau = f \otimes g$$

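However, the usual approach is to use the convolution theorem [2],

$$\mathcal{F}^{-1}\left[\mathcal{F}(f(x))\,\mathcal{F}(g(x))\right] = f(x) \otimes g(x)$$

where $\mathcal{F}$ is the Fourier transform

$$\mathcal{F}(f(x)) = \int_{-\infty}^{\infty} f(x)\,e^{-2\pi i k x}\,dx$$

and $\mathcal{F}^{-1}$ is the inverse Fourier transform

$$\mathcal{F}^{-1}(F(k)) = \int_{-\infty}^{\infty} F(k)\,e^{2\pi i k x}\,dk$$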

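Using the transformation

$$x' = x - \mu_f$$

the Fourier transform of $f(x)$ is given by

$$\mathcal{F}(f(x)) = \frac{1}{\sqrt{2\pi}\,\sigma_f}\int_{-\infty}^{\infty} e^{-\frac{x'^2}{2\sigma_f^2}}\, e^{-2\pi i k (x'+\mu_f)}\,dx' = \frac{e^{-2\pi i k \mu_f}}{\sqrt{2\pi}\,\sigma_f}\int_{-\infty}^{\infty} e^{-\frac{x'^2}{2\sigma_f^2}}\, e^{-2\pi i k x'}\,dx' \tag{13}$$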

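Using Euler's formula [2],

$$e^{-i\theta} = \cos\theta - i\sin\theta$$

we can split the complex exponential term $e^{-2\pi i k x'}$ to give

$$\mathcal{F}(f(x)) = \frac{e^{-2\pi i k \mu_f}}{\sqrt{2\pi}\,\sigma_f}\int_{-\infty}^{\infty} e^{-\frac{x'^2}{2\sigma_f^2}} \left[\cos(2\pi k x') - i\sin(2\pi k x')\right] dx'$$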

The term in $\sin(2\pi k x')$ is odd and so its integral over all space will be zero, leaving

$$\mathcal{F}(f(x)) = \frac{e^{-2\pi i k \mu_f}}{\sqrt{2\pi}\,\sigma_f}\int_{-\infty}^{\infty} e^{-\frac{x'^2}{2\sigma_f^2}} \cos(2\pi k x')\,dx'$$

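This integral is given in standard form in [1]

$$\int_0^{\infty} e^{-at^2}\cos(2xt)\,dt = \frac{1}{2}\sqrt{\frac{\pi}{a}}\, e^{-\frac{x^2}{a}}$$

and so, since the integrand is even (the integral over the whole real line is twice that from $0$ to $\infty$), setting
$a = 1/(2\sigma_f^2)$ and $x = \pi k$ gives

$$\mathcal{F}(f(x)) = e^{-2\pi i k \mu_f}\, e^{-2\pi^2\sigma_f^2 k^2} \tag{14}$$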
The second term in this expression is a Gaussian function of $k$: the Fourier transform of a Gaussian PDF is another
Gaussian function. The first term is a phase term accounting for the mean of $f(x)$, i.e. its offset from zero. The Fourier
transform of $g(x)$ will give a similar expression, and so

$$\mathcal{F}(f(x))\,\mathcal{F}(g(x)) = e^{-2\pi i k \mu_f}\, e^{-2\pi^2\sigma_f^2 k^2}\, e^{-2\pi i k \mu_g}\, e^{-2\pi^2\sigma_g^2 k^2} = e^{-2\pi i k (\mu_f+\mu_g)}\, e^{-2\pi^2(\sigma_f^2+\sigma_g^2) k^2} \tag{15}$$
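Eq. 14 can be checked by evaluating the Fourier integral numerically. The sketch below uses a simple midpoint rule over a window of twelve standard deviations either side of the mean; the function names, the window, the step count and the test values are all assumptions made for this illustration.

```python
import cmath
import math

def gaussian_pdf(x, mu, sigma):
    """Univariate Gaussian PDF N(mu, sigma) evaluated at x."""
    return math.exp(-(x - mu) ** 2 / (2.0 * sigma ** 2)) / (math.sqrt(2.0 * math.pi) * sigma)

def fourier_numeric(k, mu, sigma, half_width=12.0, steps=20000):
    """Midpoint-rule approximation of the integral of f(x) exp(-2 pi i k x) dx."""
    lo = mu - half_width * sigma
    dx = 2.0 * half_width * sigma / steps
    total = 0j
    for j in range(steps):
        x = lo + (j + 0.5) * dx
        total += gaussian_pdf(x, mu, sigma) * cmath.exp(-2j * math.pi * k * x)
    return total * dx

def fourier_closed_form(k, mu, sigma):
    """Right-hand side of Eq. 14: a phase term times a Gaussian function of k."""
    return cmath.exp(-2j * math.pi * k * mu) * math.exp(-2.0 * math.pi ** 2 * sigma ** 2 * k ** 2)

# Illustrative parameters (arbitrary choices).
numeric = fourier_numeric(0.3, 1.5, 0.7)
closed = fourier_closed_form(0.3, 1.5, 0.7)
print(numeric, closed, abs(numeric - closed))
```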
¹ In practice, convolutions are more often performed over an infinite range

$$\int_{-\infty}^{\infty} f(x-\tau)\,g(\tau)\,d\tau = f \otimes g$$

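Comparing Eq. 15 to Eq. 14, we can see that it is the Fourier transform of a Gaussian PDF with mean and
standard deviation

$$\mu_{f\otimes g} = \mu_f + \mu_g \quad\text{and}\quad \sigma_{f\otimes g} = \sqrt{\sigma_f^2+\sigma_g^2}$$

and therefore, since the Fourier transform is invertible,

$$P_{f\otimes g}(x) = \mathcal{F}^{-1}\left[\mathcal{F}(f(x))\,\mathcal{F}(g(x))\right] = \frac{1}{\sqrt{2\pi(\sigma_f^2+\sigma_g^2)}}\, e^{-\frac{(x-(\mu_f+\mu_g))^2}{2(\sigma_f^2+\sigma_g^2)}}$$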
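The convolution result can also be confirmed by direct numerical integration, without going through the Fourier transform. This is a minimal sketch: `convolve_at`, the integration window (twelve standard deviations of $g$ either side of its mean) and the test values are assumptions for illustration.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Univariate Gaussian PDF N(mu, sigma) evaluated at x."""
    return math.exp(-(x - mu) ** 2 / (2.0 * sigma ** 2)) / (math.sqrt(2.0 * math.pi) * sigma)

def convolve_at(x, mu_f, sigma_f, mu_g, sigma_g, half_width=12.0, steps=20000):
    """Midpoint-rule approximation of the integral of f(x - tau) g(tau) d tau."""
    lo = mu_g - half_width * sigma_g
    d_tau = 2.0 * half_width * sigma_g / steps
    total = 0.0
    for j in range(steps):
        tau = lo + (j + 0.5) * d_tau
        total += gaussian_pdf(x - tau, mu_f, sigma_f) * gaussian_pdf(tau, mu_g, sigma_g)
    return total * d_tau

# Illustrative parameters (arbitrary choices).
mu_f, sigma_f, mu_g, sigma_g = 1.0, 2.0, 1.5, 0.5
mu_conv = mu_f + mu_g
sigma_conv = math.sqrt(sigma_f ** 2 + sigma_g ** 2)

max_err = max(
    abs(convolve_at(x, mu_f, sigma_f, mu_g, sigma_g) - gaussian_pdf(x, mu_conv, sigma_conv))
    for x in [0.0, 2.5, 5.0]
)
print(mu_conv, sigma_conv, max_err)
```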

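It may be worth noting a general result at this point: the area under a convolution is equal to the product of the
areas under the factors

$$\begin{aligned}
\int_{-\infty}^{\infty} (f \otimes g)\,dt &= \int_{-\infty}^{\infty}\left[\int_{-\infty}^{\infty} f(u)\,g(t-u)\,du\right] dt \\
&= \int_{-\infty}^{\infty} f(u)\left[\int_{-\infty}^{\infty} g(t-u)\,dt\right] du \\
&= \left[\int_{-\infty}^{\infty} f(u)\,du\right]\left[\int_{-\infty}^{\infty} g(t)\,dt\right]
\end{aligned}$$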

Therefore, the preservation of the normalisation when convolving PDFs i.e. the fact that the convolution is also a
PDF, normalised such that the area under the function is equal to unity, is a special case rather than being true
in general.

5 Summary
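It is well known that the product and the convolution of a pair of Gaussian PDFs are also Gaussian. In the case
of the product of two univariate Gaussian PDFs $N(\mu_f, \sigma_f)$ and $N(\mu_g, \sigma_g)$, the result is a scaled Gaussian PDF
where the scaling factor is itself a Gaussian PDF on both $\mu_f$ and $\mu_g$

$$N(\mu_f, \sigma_f)\,N(\mu_g, \sigma_g) = \frac{S_{fg}}{\sqrt{2\pi}\,\sigma_{fg}} \exp\left[-\frac{(x-\mu_{fg})^2}{2\sigma_{fg}^2}\right] \quad\text{where}\quad \frac{1}{\sigma_{fg}^2} = \frac{1}{\sigma_f^2} + \frac{1}{\sigma_g^2}, \quad \mu_{fg} = \left(\frac{\mu_f}{\sigma_f^2} + \frac{\mu_g}{\sigma_g^2}\right)\sigma_{fg}^2$$

$$\text{and}\quad S_{fg} = \frac{1}{\sqrt{2\pi\dfrac{\sigma_f^2\sigma_g^2}{\sigma_{fg}^2}}} \exp\left[-\frac{1}{2}\frac{(\mu_f-\mu_g)^2}{\sigma_f^2\sigma_g^2}\,\sigma_{fg}^2\right]$$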

It should be noted that this result is not the PDF of the product of two Gaussian random variates; in that case,
the product normal distribution applies.
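The product of $n$ univariate Gaussian PDFs is given by

$$\prod_{i=1}^{n} N(\mu_i, \sigma_i) = \frac{S_{i=1\ldots n}}{\sqrt{2\pi\sigma_{i=1\ldots n}^2}} \exp\left[-\frac{(x-\mu_{i=1\ldots n})^2}{2\sigma_{i=1\ldots n}^2}\right] \quad\text{where}\quad \frac{1}{\sigma_{i=1\ldots n}^2} = \sum_{i=1}^{n}\frac{1}{\sigma_i^2}, \quad \mu_{i=1\ldots n} = \left[\sum_{i=1}^{n}\frac{\mu_i}{\sigma_i^2}\right]\sigma_{i=1\ldots n}^2$$

$$\text{and}\quad S_{i=1\ldots n} = \frac{1}{(2\pi)^{(n-1)/2}} \sqrt{\frac{\sigma_{i=1\ldots n}^2}{\prod_{i=1}^{n}\sigma_i^2}} \exp\left[-\frac{1}{2}\left(\sum_{i=1}^{n}\frac{\mu_i^2}{\sigma_i^2} - \frac{\mu_{i=1\ldots n}^2}{\sigma_{i=1\ldots n}^2}\right)\right]$$

i.e. it is a Gaussian PDF scaled by a Gaussian function.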
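The product of $n$ multivariate Gaussian PDFs is given by

$$\prod_{i=1}^{n} N(\boldsymbol{\mu}_i, \mathbf{V}_i^{-1}) = \exp(\zeta_{i=1\ldots n} - \zeta_n)\,\exp\left[\zeta_n + \boldsymbol{\eta}_n^T\mathbf{x} - \frac{1}{2}\mathbf{x}^T\boldsymbol{\Lambda}_n\mathbf{x}\right]$$

where

$$\boldsymbol{\Lambda}_i = \mathbf{V}_i^{-1}, \quad \boldsymbol{\eta}_i = \mathbf{V}_i^{-1}\boldsymbol{\mu}_i, \quad \boldsymbol{\Lambda}_n = \sum_{i=1}^{n}\boldsymbol{\Lambda}_i, \quad \boldsymbol{\eta}_n = \sum_{i=1}^{n}\boldsymbol{\eta}_i,$$

$$\zeta_n = -\frac{1}{2}\left(d\log 2\pi - \log|\boldsymbol{\Lambda}_n| + \boldsymbol{\eta}_n^T\boldsymbol{\Lambda}_n^{-1}\boldsymbol{\eta}_n\right) \quad\text{and}$$

$$\zeta_{i=1\ldots n} = \sum_{i=1}^{n}\zeta_i = -\frac{1}{2}\left(nd\log 2\pi - \sum_{i=1}^{n}\log|\boldsymbol{\Lambda}_i| + \sum_{i=1}^{n}\boldsymbol{\eta}_i^T\boldsymbol{\Lambda}_i^{-1}\boldsymbol{\eta}_i\right)$$

i.e. a Gaussian PDF scaled by a Gaussian function.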
The convolution of two Gaussian PDFs is a Gaussian PDF with mean and standard deviation

$$\mu_{f\otimes g} = \mu_f + \mu_g \quad\text{and}\quad \sigma_{f\otimes g} = \sqrt{\sigma_f^2+\sigma_g^2}$$

These results can be useful in a number of applications; for example, the convolution of Gaussian distributions
frequently occurs in smoothing applied as an intermediate step in various machine vision algorithms. Products of
Gaussian PDFs may occur during the application of Bayes' theorem, and in some problems related to Gaussian
processes.

6 Acknowledgements
Since the original version of this memo was uploaded in 2003, several correspondents have been kind enough to
suggest extensions or corrections. In particular, thanks are due to:

• David Kirchheimer, University of Bristol, for pointing out the importance of the product of an arbitrary
number of Gaussian PDFs and providing Matlab code for numerical testing of the results.
• Thomas Dent, Institute for Gravitational Physics (Albert Einstein Institute), Hannover, Germany, for
pointing out several typographical errors.
• Abdulazeez Olaseni Atanda of the University of Tartu, Estonia, for pointing out a typo in Eq. 13.
• Duane A. McVay of Texas A&M University, for pointing out a correction in the discussion of Eq. 3.

References
[1] M Abramowitz and I A Stegun. Handbook of Mathematical Functions. National Bureau of Standards,
Washington DC, 1972.
[2] M L Boas. Mathematical Methods in the Physical Sciences. John Wiley and Sons Ltd., 1983.

A Rewriting the Scaling Factor


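Using Eqs. 7 and 8

$$\frac{\mu_{i=1\ldots n}^2}{\sigma_{i=1\ldots n}^4} = \left(\sum_{i=1}^{n}\frac{\mu_i}{\sigma_i^2}\right)^2 = \sum_{i=1}^{n}\frac{\mu_i^2}{\sigma_i^4} + 2\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}\frac{\mu_i\mu_j}{\sigma_i^2\sigma_j^2}$$

So

$$2\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}\frac{\mu_i\mu_j}{\sigma_i^2\sigma_j^2} = \frac{\mu_{i=1\ldots n}^2}{\sigma_{i=1\ldots n}^4} - \sum_{i=1}^{n}\frac{\mu_i^2}{\sigma_i^4}$$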

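The terms in the exponent of the scaling factor for the product of univariate Gaussians take the form

$$\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}\frac{(\mu_i-\mu_j)^2}{\sigma_i^2\sigma_j^2}\,\sigma_{i=1\ldots n}^2 = \sum_{i=1}^{n-1}\sum_{j=i+1}^{n}\left(\frac{\mu_i^2}{\sigma_i^2\sigma_j^2} - \frac{2\mu_i\mu_j}{\sigma_i^2\sigma_j^2} + \frac{\mu_j^2}{\sigma_i^2\sigma_j^2}\right)\sigma_{i=1\ldots n}^2$$

which, substituting the above expression for the cross term, gives

$$\begin{aligned}
&= \sum_{i=1}^{n-1}\sum_{j=i+1}^{n}\left(\frac{\mu_i^2}{\sigma_i^2\sigma_j^2}\,\sigma_{i=1\ldots n}^2 + \frac{\mu_j^2}{\sigma_i^2\sigma_j^2}\,\sigma_{i=1\ldots n}^2\right) + \sum_{i=1}^{n}\frac{\mu_i^2}{\sigma_i^4}\,\sigma_{i=1\ldots n}^2 - \frac{\mu_{i=1\ldots n}^2}{\sigma_{i=1\ldots n}^4}\,\sigma_{i=1\ldots n}^2 \\
&= \sum_{i=1}^{n}\mu_i^2\left(\sum_{\substack{j=1\\ j\neq i}}^{n}\frac{\sigma_{i=1\ldots n}^2}{\sigma_i^2\sigma_j^2} + \frac{\sigma_{i=1\ldots n}^2}{\sigma_i^4}\right) - \frac{\mu_{i=1\ldots n}^2}{\sigma_{i=1\ldots n}^2} = \sum_{i=1}^{n}\frac{\mu_i^2}{\sigma_i^2} - \frac{\mu_{i=1\ldots n}^2}{\sigma_{i=1\ldots n}^2}
\end{aligned}$$

where the last step uses Eq. 7 in the form $\sigma_{i=1\ldots n}^2 \sum_{j=1}^{n} 1/\sigma_j^2 = 1$.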
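The identity derived in this appendix is easy to test numerically. The following sketch compares the pairwise-difference form with the single-subscript form; the function names and the test values are assumptions for illustration.

```python
def combined_variance(sigmas):
    """Variance of the product Gaussian, from Eq. 7."""
    return 1.0 / sum(1.0 / s ** 2 for s in sigmas)

def pairwise_form(mus, sigmas):
    """Sum over pairs i < j of (mu_i - mu_j)^2 / (sigma_i^2 sigma_j^2), times the combined variance."""
    n = len(mus)
    total = sum(
        (mus[i] - mus[j]) ** 2 / (sigmas[i] ** 2 * sigmas[j] ** 2)
        for i in range(n - 1)
        for j in range(i + 1, n)
    )
    return total * combined_variance(sigmas)

def single_index_form(mus, sigmas):
    """Sum of mu_i^2 / sigma_i^2 minus the combined mu^2 / sigma^2 (Eqs. 7 and 8)."""
    var = combined_variance(sigmas)
    mu = var * sum(m / s ** 2 for m, s in zip(mus, sigmas))
    return sum(m ** 2 / s ** 2 for m, s in zip(mus, sigmas)) - mu ** 2 / var

# Illustrative parameters (arbitrary choices).
mus = [0.0, 1.0, -2.0, 0.5]
sigmas = [1.0, 2.0, 0.5, 1.5]
lhs = pairwise_form(mus, sigmas)
rhs = single_index_form(mus, sigmas)
print(lhs, rhs)
```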

Note: Appendices B and C are an older version of the derivation for the product of n univariate Gaussian PDFs;
they do not use the manipulation given in Appendix A and so are considerably more complicated. The aim here
was to illustrate the derivation using the proof for three Gaussian PDFs, and then to replicate each step with
n Gaussian PDFs. However, these derivations are made redundant by the simpler versions given in the main
document.

B The Product of Three Univariate Gaussian PDFs


Since the product of two Gaussian PDFs is a scaled Gaussian PDF, the above proof can be extended to give the
product of larger numbers of Gaussian PDFs. We adopt the following notation: $N(\mu, \sigma)$ denotes a Gaussian PDF
with mean $\mu$ and standard deviation $\sigma$; subscripts $f$, $g$, $h$ etc. indicate the parameters of individual Gaussian
PDFs in the product; subscripts e.g. $fg$ indicate the parameters of the products of those distributions; subscripts
e.g. $(fg)h$ indicate the parameters of the product of the distribution $h$ with a distribution that is itself the product
of the distributions $f$ and $g$. Therefore, the product of three Gaussian PDFs is

$$\begin{aligned}
N(\mu_f, \sigma_f)\,N(\mu_g, \sigma_g)\,N(\mu_h, \sigma_h) &= S_{fg}\,N(\mu_{fg}, \sigma_{fg})\,N(\mu_h, \sigma_h) \\
&= S_{fg}\,S_{(fg)h}\,N(\mu_{(fg)h}, \sigma_{(fg)h})
\end{aligned}$$

Defining

$$S_{fgh}\,N(\mu_{fgh}, \sigma_{fgh}) = S_{fg}\,S_{(fg)h}\,N(\mu_{(fg)h}, \sigma_{(fg)h}) \tag{16}$$

we have

$$S_{fgh} = S_{fg}\,S_{(fg)h}, \quad \mu_{fgh} = \mu_{(fg)h} \quad\text{and}\quad \sigma_{fgh} = \sigma_{(fg)h}$$

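Since the expressions for the mean and standard deviation in Eq. 5 are expressed as sums over individual
terms that feature only the parameters of a single distribution $f$, $g$, they can be extended to multiple distributions
easily

$$\frac{1}{\sigma_{fgh}^2} = \frac{1}{\sigma_{(fg)h}^2} = \frac{1}{\sigma_{fg}^2} + \frac{1}{\sigma_h^2} = \frac{1}{\sigma_f^2} + \frac{1}{\sigma_g^2} + \frac{1}{\sigma_h^2} \tag{17}$$

and

$$\mu_{fgh} = \mu_{(fg)h} = \left(\frac{\mu_{fg}}{\sigma_{fg}^2} + \frac{\mu_h}{\sigma_h^2}\right)\sigma_{(fg)h}^2 = \left(\frac{\mu_f}{\sigma_f^2} + \frac{\mu_g}{\sigma_g^2} + \frac{\mu_h}{\sigma_h^2}\right)\sigma_{fgh}^2 \tag{18}$$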

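The scaling factor is given by

$$\begin{aligned}
S_{fgh} = S_{fg}\,S_{(fg)h} &= \frac{1}{\sqrt{2\pi(\sigma_f^2+\sigma_g^2)}} \exp\left[-\frac{(\mu_f-\mu_g)^2}{2(\sigma_f^2+\sigma_g^2)}\right] \frac{1}{\sqrt{2\pi(\sigma_{fg}^2+\sigma_h^2)}} \exp\left[-\frac{(\mu_{fg}-\mu_h)^2}{2(\sigma_{fg}^2+\sigma_h^2)}\right] \\
&= \frac{1}{2\pi\sqrt{(\sigma_f^2+\sigma_g^2)(\sigma_{fg}^2+\sigma_h^2)}} \exp\left[-\frac{1}{2}\left(\frac{(\mu_f-\mu_g)^2}{\sigma_f^2+\sigma_g^2} + \frac{(\mu_{fg}-\mu_h)^2}{\sigma_{fg}^2+\sigma_h^2}\right)\right]
\end{aligned}$$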

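This can be dealt with as two separate terms; first

$$(\sigma_f^2+\sigma_g^2)(\sigma_{fg}^2+\sigma_h^2) = (\sigma_f^2+\sigma_g^2)\left(\frac{\sigma_f^2\sigma_g^2}{\sigma_f^2+\sigma_g^2} + \sigma_h^2\right) = \sigma_f^2\sigma_g^2 + \sigma_f^2\sigma_h^2 + \sigma_g^2\sigma_h^2$$

Second

$$\frac{(\mu_f-\mu_g)^2}{\sigma_f^2+\sigma_g^2} + \frac{(\mu_{fg}-\mu_h)^2}{\sigma_{fg}^2+\sigma_h^2} = \frac{(\mu_f-\mu_g)^2(\sigma_{fg}^2+\sigma_h^2) + (\mu_{fg}-\mu_h)^2(\sigma_f^2+\sigma_g^2)}{(\sigma_f^2+\sigma_g^2)(\sigma_{fg}^2+\sigma_h^2)}$$

The denominator is the same as the first term, above, so only the numerator need be dealt with

$$M = (\mu_f-\mu_g)^2(\sigma_{fg}^2+\sigma_h^2) + (\mu_{fg}-\mu_h)^2(\sigma_f^2+\sigma_g^2)$$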

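Substituting the expressions for $\mu_{fg}$ and $\sigma_{fg}$ from Eq. 4,

$$M = (\mu_f-\mu_g)^2\sigma_h^2 + (\mu_f-\mu_g)^2\frac{\sigma_f^2\sigma_g^2}{\sigma_f^2+\sigma_g^2} + \left[\frac{\mu_f\sigma_g^2 + \mu_g\sigma_f^2}{\sigma_f^2+\sigma_g^2} - \mu_h\right]^2(\sigma_f^2+\sigma_g^2)$$

One approach is to expand the expression fully and then pair all of the terms to give an overall factor of $\sigma_f^2+\sigma_g^2$.
However, this is impractical as a route to a proof for the product of arbitrary numbers of Gaussians. Instead,

$$\begin{aligned}
M &= (\mu_f-\mu_g)^2\sigma_h^2 + (\mu_f-\mu_g)^2\frac{\sigma_f^2\sigma_g^2}{\sigma_f^2+\sigma_g^2} + \frac{\left[\mu_f\sigma_g^2 + \mu_g\sigma_f^2 - \mu_h(\sigma_f^2+\sigma_g^2)\right]^2}{\sigma_f^2+\sigma_g^2} \\
&= (\mu_f-\mu_g)^2\sigma_h^2 + (\mu_f-\mu_g)^2\frac{\sigma_f^2\sigma_g^2}{\sigma_f^2+\sigma_g^2} + \frac{\left[(\mu_f-\mu_h)\sigma_g^2 + (\mu_g-\mu_h)\sigma_f^2\right]^2}{\sigma_f^2+\sigma_g^2} \\
&= (\mu_f-\mu_g)^2\sigma_h^2 + (\mu_f-\mu_g)^2\frac{\sigma_f^2\sigma_g^2}{\sigma_f^2+\sigma_g^2} + \frac{(\mu_f-\mu_h)^2\sigma_g^4}{\sigma_f^2+\sigma_g^2} + 2(\mu_f-\mu_h)(\mu_g-\mu_h)\frac{\sigma_f^2\sigma_g^2}{\sigma_f^2+\sigma_g^2} + \frac{(\mu_g-\mu_h)^2\sigma_f^4}{\sigma_f^2+\sigma_g^2}
\end{aligned}$$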
Now, observe that

$$\begin{aligned}
(A-B)^2 + 2(A-C)(B-C) &= A^2 - 2AB + B^2 + 2AB - 2AC - 2BC + 2C^2 \\
&= A^2 - 2AC + C^2 + B^2 - 2BC + C^2 = (A-C)^2 + (B-C)^2
\end{aligned} \tag{19}$$


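Therefore

$$(\mu_f-\mu_g)^2\frac{\sigma_f^2\sigma_g^2}{\sigma_f^2+\sigma_g^2} + 2(\mu_f-\mu_h)(\mu_g-\mu_h)\frac{\sigma_f^2\sigma_g^2}{\sigma_f^2+\sigma_g^2} = (\mu_f-\mu_h)^2\frac{\sigma_f^2\sigma_g^2}{\sigma_f^2+\sigma_g^2} + (\mu_g-\mu_h)^2\frac{\sigma_f^2\sigma_g^2}{\sigma_f^2+\sigma_g^2}$$

and so

$$\begin{aligned}
M &= (\mu_f-\mu_g)^2\sigma_h^2 + \frac{(\mu_f-\mu_h)^2\sigma_g^4}{\sigma_f^2+\sigma_g^2} + \frac{(\mu_g-\mu_h)^2\sigma_f^4}{\sigma_f^2+\sigma_g^2} + (\mu_f-\mu_h)^2\frac{\sigma_f^2\sigma_g^2}{\sigma_f^2+\sigma_g^2} + (\mu_g-\mu_h)^2\frac{\sigma_f^2\sigma_g^2}{\sigma_f^2+\sigma_g^2} \\
&= (\mu_f-\mu_g)^2\sigma_h^2 + (\mu_f-\mu_h)^2\left(\frac{\sigma_g^4 + \sigma_f^2\sigma_g^2}{\sigma_f^2+\sigma_g^2}\right) + (\mu_g-\mu_h)^2\left(\frac{\sigma_f^4 + \sigma_f^2\sigma_g^2}{\sigma_f^2+\sigma_g^2}\right) \\
&= (\mu_f-\mu_g)^2\sigma_h^2 + (\mu_f-\mu_h)^2\sigma_g^2 + (\mu_g-\mu_h)^2\sigma_f^2
\end{aligned}$$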


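Collecting terms, this gives

$$S_{fgh} = \frac{1}{2\pi\sqrt{\sigma_f^2\sigma_g^2 + \sigma_f^2\sigma_h^2 + \sigma_g^2\sigma_h^2}} \exp\left[-\frac{1}{2}\,\frac{(\mu_f-\mu_g)^2\sigma_h^2 + (\mu_f-\mu_h)^2\sigma_g^2 + (\mu_g-\mu_h)^2\sigma_f^2}{\sigma_f^2\sigma_g^2 + \sigma_f^2\sigma_h^2 + \sigma_g^2\sigma_h^2}\right]$$

which can be written more conveniently as

$$S_{fgh} = \frac{1}{2\pi\sqrt{\dfrac{\sigma_f^2\sigma_g^2\sigma_h^2}{\sigma_{fgh}^2}}} \exp\left[-\frac{1}{2}\left(\frac{(\mu_f-\mu_g)^2}{\sigma_f^2\sigma_g^2} + \frac{(\mu_f-\mu_h)^2}{\sigma_f^2\sigma_h^2} + \frac{(\mu_g-\mu_h)^2}{\sigma_g^2\sigma_h^2}\right)\sigma_{fgh}^2\right] \tag{20}$$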

Therefore, the product of three Gaussian PDFs is a scaled Gaussian PDF

$$f(x)g(x)h(x) = \frac{S_{fgh}}{\sqrt{2\pi}\,\sigma_{fgh}} \exp\left[-\frac{(x-\mu_{fgh})^2}{2\sigma_{fgh}^2}\right]$$

where $\sigma_{fgh}$, $\mu_{fgh}$ and $S_{fgh}$ are given by Eqs. 17, 18 and 20 respectively.


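As in Section 1, the scaling factor can be rewritten using Appendix A to give

$$S_{fgh} = \frac{1}{2\pi\sqrt{\dfrac{\sigma_f^2\sigma_g^2\sigma_h^2}{\sigma_{fgh}^2}}} \exp\left[-\frac{1}{2}\left(\frac{\mu_f^2}{\sigma_f^2} + \frac{\mu_g^2}{\sigma_g^2} + \frac{\mu_h^2}{\sigma_h^2} - \frac{\mu_{fgh}^2}{\sigma_{fgh}^2}\right)\right]$$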
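The three-factor scaling factor can be checked by comparing the product of the two pairwise factors $S_{fg} S_{(fg)h}$ against Eq. 20. The sketch below works with variances rather than standard deviations; the function names and test values are assumptions for illustration.

```python
import math

def pair_scale(mu_a, var_a, mu_b, var_b):
    """Scaling factor for the product of two Gaussian PDFs, given means and variances."""
    var_sum = var_a + var_b
    return math.exp(-(mu_a - mu_b) ** 2 / (2.0 * var_sum)) / math.sqrt(2.0 * math.pi * var_sum)

def sequential_scale(mus, variances):
    """S_fg * S_(fg)h obtained by introducing the third PDF after combining the first two."""
    (mu_f, mu_g, mu_h), (v_f, v_g, v_h) = mus, variances
    v_fg = v_f * v_g / (v_f + v_g)
    mu_fg = (mu_f * v_g + mu_g * v_f) / (v_f + v_g)
    return pair_scale(mu_f, v_f, mu_g, v_g) * pair_scale(mu_fg, v_fg, mu_h, v_h)

def triple_scale_eq20(mus, variances):
    """Scaling factor for three Gaussian PDFs in the form of Eq. 20."""
    v_fgh = 1.0 / sum(1.0 / v for v in variances)          # Eq. 17
    pairs = [(0, 1), (0, 2), (1, 2)]
    expo = sum((mus[i] - mus[j]) ** 2 / (variances[i] * variances[j]) for i, j in pairs) * v_fgh
    norm = 2.0 * math.pi * math.sqrt(math.prod(variances) / v_fgh)
    return math.exp(-0.5 * expo) / norm

# Illustrative parameters (arbitrary choices).
mus = (1.0, 4.0, -0.5)
variances = (4.0, 1.0, 2.25)
s_seq = sequential_scale(mus, variances)
s_eq20 = triple_scale_eq20(mus, variances)
print(s_seq, s_eq20)
```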

C The Product of n Univariate Gaussian PDFs
Let subscript $i$ refer to an individual Gaussian PDF in a product of $n$ univariate Gaussian PDFs. Based on
the derivations in Section 1 and Appendix B, it is clear that the product is also a Gaussian PDF, multiplied by a scaling
factor. The notation used in Appendix B is extended, so that the subscript $i=1\ldots n$ refers to the parameters of the
distribution that is the product of $n$ individual Gaussian PDFs and subscript $i=(1\ldots n-1)n$ refers to the parameters
of a distribution that is the product of two Gaussian PDFs, one of which is itself the product of $n-1$ Gaussian
PDFs. In addition, define

$$\alpha_n = \sum_{i=1}^{n}\left(\prod_{\substack{j=1\\ j\neq i}}^{n}\sigma_j^2\right) \quad\text{and}\quad \gamma_n = \prod_{i=1}^{n}\sigma_i^2$$

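By inspection of the results for the products of two and three Gaussian PDFs, state

$$\prod_{i=1}^{n} N(\mu_i, \sigma_i) = \frac{S_{i=1\ldots n}}{\sqrt{2\pi\sigma_{i=1\ldots n}^2}}\, e^{-\frac{(x-\mu_{i=1\ldots n})^2}{2\sigma_{i=1\ldots n}^2}}$$

where

$$\frac{1}{\sigma_{i=1\ldots n}^2} = \sum_{i=1}^{n}\frac{1}{\sigma_i^2} \quad\text{or}\quad \sigma_{i=1\ldots n}^2 = \frac{\gamma_n}{\alpha_n}, \qquad \mu_{i=1\ldots n} = \left[\sum_{i=1}^{n}\frac{\mu_i}{\sigma_i^2}\right]\sigma_{i=1\ldots n}^2$$

and

$$S_{i=1\ldots n} = \frac{1}{\sqrt{(2\pi)^{n-1}\alpha_n}} \exp\left[-\frac{\sigma_{i=1\ldots n}^2}{2}\left(\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}\frac{(\mu_i-\mu_j)^2}{\sigma_i^2\sigma_j^2}\right)\right] \tag{21}$$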

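The above expressions can be proved by observing that, following Eq. 16,

$$\begin{aligned}
S_{i=1\ldots n}\,N(\mu_{i=1\ldots n}, \sigma_{i=1\ldots n}) &= S_{i=1\ldots n-1}\,N(\mu_{i=1\ldots n-1}, \sigma_{i=1\ldots n-1})\,N(\mu_n, \sigma_n) \\
&= S_{i=1\ldots n-1}\,S_{(i=1\ldots n-1)n}\,N(\mu_{i=1\ldots n}, \sigma_{i=1\ldots n})
\end{aligned} \tag{22}$$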


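Therefore, using Eq. 5

$$\frac{1}{\sigma_{i=1\ldots n}^2} = \frac{1}{\sigma_{i=1\ldots n-1}^2} + \frac{1}{\sigma_n^2} \quad\text{and}\quad \mu_{i=1\ldots n} = \left(\frac{\mu_{i=1\ldots n-1}}{\sigma_{i=1\ldots n-1}^2} + \frac{\mu_n}{\sigma_n^2}\right)\sigma_{i=1\ldots n}^2$$

Eq. 5 can then be substituted to expand $\sigma_{i=1\ldots n-1}^2$ into $\sigma_{i=1\ldots n-2}^2$ and $\sigma_{n-1}^2$, and $\mu_{i=1\ldots n-1}$ into $\mu_{i=1\ldots n-2}$ and
$\mu_{n-1}$; repeating this gives

$$\frac{1}{\sigma_{i=1\ldots n}^2} = \frac{1}{\sigma_{i=1\ldots n-1}^2} + \frac{1}{\sigma_n^2} = \frac{1}{\sigma_{i=1\ldots n-2}^2} + \frac{1}{\sigma_{n-1}^2} + \frac{1}{\sigma_n^2} = \ldots = \sum_{i=1}^{n}\frac{1}{\sigma_i^2} \quad\text{Q.E.D.}$$

$$\begin{aligned}
\mu_{i=1\ldots n} &= \left(\frac{\mu_{i=1\ldots n-1}}{\sigma_{i=1\ldots n-1}^2} + \frac{\mu_n}{\sigma_n^2}\right)\sigma_{i=1\ldots n}^2 = \left(\left(\frac{\mu_{i=1\ldots n-2}}{\sigma_{i=1\ldots n-2}^2} + \frac{\mu_{n-1}}{\sigma_{n-1}^2}\right)\frac{\sigma_{i=1\ldots n-1}^2}{\sigma_{i=1\ldots n-1}^2} + \frac{\mu_n}{\sigma_n^2}\right)\sigma_{i=1\ldots n}^2 \\
&= \left(\frac{\mu_{i=1\ldots n-2}}{\sigma_{i=1\ldots n-2}^2} + \frac{\mu_{n-1}}{\sigma_{n-1}^2} + \frac{\mu_n}{\sigma_n^2}\right)\sigma_{i=1\ldots n}^2 = \ldots = \left[\sum_{i=1}^{n}\frac{\mu_i}{\sigma_i^2}\right]\sigma_{i=1\ldots n}^2 \quad\text{Q.E.D.}
\end{aligned}$$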

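Eq. 21 can be written using $\alpha$ as

$$S_{i=1\ldots n} = \frac{1}{\sqrt{(2\pi)^{n-1}\alpha_n}} \exp\left[-\frac{1}{2}\left(\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}(\mu_i-\mu_j)^2\prod_{\substack{k=1\\ k\neq i,j}}^{n}\sigma_k^2\right)\frac{1}{\alpha_n}\right]$$

Similarly, Eq. 5 gives the scaling factor for the product of $N(\mu_{i=1\ldots n}, \sigma_{i=1\ldots n})$ and $N(\mu_{n+1}, \sigma_{n+1})$ as

$$S_{(i=1\ldots n)n+1} = \frac{1}{\sqrt{2\pi(\sigma_{i=1\ldots n}^2 + \sigma_{n+1}^2)}} \exp\left[-\frac{1}{2}\,\frac{(\mu_{i=1\ldots n} - \mu_{n+1})^2}{\sigma_{i=1\ldots n}^2 + \sigma_{n+1}^2}\right]$$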

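Therefore, the aim here is to show that

$$S_{i=1\ldots n+1} = \frac{1}{\sqrt{(2\pi)^{n}\,\alpha_n(\sigma_{i=1\ldots n}^2 + \sigma_{n+1}^2)}} \exp\left[-\frac{1}{2}\left(\left(\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}(\mu_i-\mu_j)^2\prod_{\substack{k=1\\ k\neq i,j}}^{n}\sigma_k^2\right)\frac{1}{\alpha_n} + \frac{(\mu_{i=1\ldots n} - \mu_{n+1})^2}{\sigma_{i=1\ldots n}^2 + \sigma_{n+1}^2}\right)\right]$$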

The standard deviation term is

$$\alpha_n(\sigma_{i=1\ldots n}^2 + \sigma_{n+1}^2) = \alpha_n\sigma_{i=1\ldots n}^2 + \alpha_n\sigma_{n+1}^2 = \gamma_n + \alpha_n\sigma_{n+1}^2 = \alpha_{n+1}$$
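The recursion for $\alpha$ and the relation $\sigma_{i=1\ldots n}^2 = \gamma_n/\alpha_n$ can be confirmed with a short calculation. This is a sketch with assumed function names (`alpha`, `gamma`) and arbitrary variances; `math.prod` requires Python 3.8 or later.

```python
import math

def alpha(variances):
    """alpha_n: sum over i of the product of all variances except the i-th."""
    return sum(
        math.prod(v for j, v in enumerate(variances) if j != i)
        for i in range(len(variances))
    )

def gamma(variances):
    """gamma_n: product of all variances."""
    return math.prod(variances)

# Illustrative variances sigma_1^2 ... sigma_n^2 and a new variance sigma_{n+1}^2.
variances = [1.0, 4.0, 9.0]
new_variance = 2.0

alpha_n = alpha(variances)
gamma_n = gamma(variances)
alpha_next = alpha(variances + [new_variance])
combined_variance = 1.0 / sum(1.0 / v for v in variances)   # Eq. 7

print(alpha_n, gamma_n, alpha_next, gamma_n + alpha_n * new_variance)
```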

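The exponential term, ignoring the $-1/2$, is

$$\left(\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}(\mu_i-\mu_j)^2\prod_{\substack{k=1\\ k\neq i,j}}^{n}\sigma_k^2\right)\frac{1}{\alpha_n} + \frac{(\mu_{i=1\ldots n} - \mu_{n+1})^2}{\sigma_{i=1\ldots n}^2 + \sigma_{n+1}^2}$$

$$= \frac{\left(\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}(\mu_i-\mu_j)^2\prod_{\substack{k=1\\ k\neq i,j}}^{n}\sigma_k^2\right)(\sigma_{i=1\ldots n}^2 + \sigma_{n+1}^2) + (\mu_{i=1\ldots n} - \mu_{n+1})^2\,\alpha_n}{(\sigma_{i=1\ldots n}^2 + \sigma_{n+1}^2)\,\alpha_n}$$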
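The denominator is, as expected, the standard deviation term that was dealt with above; ignore this, and let the
numerator be called $M$

$$\begin{aligned}
M &= \left(\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}(\mu_i-\mu_j)^2\prod_{\substack{k=1\\ k\neq i,j}}^{n}\sigma_k^2\right)(\sigma_{i=1\ldots n}^2 + \sigma_{n+1}^2) + (\mu_{i=1\ldots n} - \mu_{n+1})^2\,\alpha_n \\
&= \left(\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}(\mu_i-\mu_j)^2\prod_{\substack{k=1\\ k\neq i,j}}^{n}\sigma_k^2\right)\sigma_{i=1\ldots n}^2 + \left(\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}(\mu_i-\mu_j)^2\prod_{\substack{k=1\\ k\neq i,j}}^{n}\sigma_k^2\right)\sigma_{n+1}^2 + \left[\left(\sum_{i=1}^{n}\frac{\mu_i}{\sigma_i^2}\right)\sigma_{i=1\ldots n}^2 - \mu_{n+1}\right]^2\alpha_n
\end{aligned}$$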

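Focussing on the last of these three terms

$$\begin{aligned}
\alpha_n\left[\left(\sum_{i=1}^{n}\frac{\mu_i}{\sigma_i^2}\right)\sigma_{i=1\ldots n}^2 - \mu_{n+1}\right]^2 &= \alpha_n\left[\left(\sum_{i=1}^{n}\frac{\mu_i}{\sigma_i^2}\right)\frac{\gamma_n}{\alpha_n} - \mu_{n+1}\right]^2 = \frac{1}{\alpha_n}\left[\sum_{i=1}^{n}\left(\mu_i\prod_{\substack{j=1\\ j\neq i}}^{n}\sigma_j^2\right) - \alpha_n\mu_{n+1}\right]^2 \\
&= \frac{1}{\alpha_n}\left[\sum_{i=1}^{n}\left(\mu_i\prod_{\substack{j=1\\ j\neq i}}^{n}\sigma_j^2\right) - \mu_{n+1}\sum_{i=1}^{n}\left(\prod_{\substack{j=1\\ j\neq i}}^{n}\sigma_j^2\right)\right]^2 = \frac{1}{\alpha_n}\left[\sum_{i=1}^{n}\left((\mu_i-\mu_{n+1})\prod_{\substack{j=1\\ j\neq i}}^{n}\sigma_j^2\right)\right]^2 \\
&= \frac{1}{\alpha_n}\left[\sum_{i=1}^{n}\left((\mu_i-\mu_{n+1})^2\prod_{\substack{j=1\\ j\neq i}}^{n}\sigma_j^4\right) + 2\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}\left((\mu_i-\mu_{n+1})(\mu_j-\mu_{n+1})\prod_{k=1}^{n}\sigma_k^2\prod_{\substack{l=1\\ l\neq i,j}}^{n}\sigma_l^2\right)\right]
\end{aligned}$$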

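However,

$$\frac{\prod_{k=1}^{n}\sigma_k^2}{\alpha_n} = \sigma_{i=1\ldots n}^2$$

so, recombining this with the first two terms of $M$ gives

$$\begin{aligned}
M ={}& \left(\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}(\mu_i-\mu_j)^2\prod_{\substack{k=1\\ k\neq i,j}}^{n}\sigma_k^2\right)\sigma_{n+1}^2 + \left(\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}(\mu_i-\mu_j)^2\prod_{\substack{k=1\\ k\neq i,j}}^{n}\sigma_k^2\right)\sigma_{i=1\ldots n}^2 \\
&+ \frac{1}{\alpha_n}\sum_{i=1}^{n}\left((\mu_i-\mu_{n+1})^2\prod_{\substack{j=1\\ j\neq i}}^{n}\sigma_j^4\right) + 2\left(\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}(\mu_i-\mu_{n+1})(\mu_j-\mu_{n+1})\prod_{\substack{k=1\\ k\neq i,j}}^{n}\sigma_k^2\right)\sigma_{i=1\ldots n}^2
\end{aligned}$$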

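Applying Eq. 19 to the second and fourth terms gives

$$\begin{aligned}
M ={}& \left(\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}(\mu_i-\mu_j)^2\prod_{\substack{k=1\\ k\neq i,j}}^{n}\sigma_k^2\right)\sigma_{n+1}^2 + \frac{1}{\alpha_n}\sum_{i=1}^{n}\left((\mu_i-\mu_{n+1})^2\prod_{\substack{j=1\\ j\neq i}}^{n}\sigma_j^4\right) \\
&+ \left(\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}\left[(\mu_i-\mu_{n+1})^2 + (\mu_j-\mu_{n+1})^2\right]\prod_{\substack{k=1\\ k\neq i,j}}^{n}\sigma_k^2\right)\sigma_{i=1\ldots n}^2 \\
={}& \left(\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}(\mu_i-\mu_j)^2\prod_{\substack{k=1\\ k\neq i,j}}^{n}\sigma_k^2\right)\sigma_{n+1}^2 + \frac{1}{\alpha_n}\sum_{i=1}^{n}(\mu_i-\mu_{n+1})^2\,\frac{\gamma_n^2}{\sigma_i^4} \\
&+ \left(\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}\left[(\mu_i-\mu_{n+1})^2 + (\mu_j-\mu_{n+1})^2\right]\frac{\gamma_n}{\sigma_i^2\sigma_j^2}\right)\sigma_{i=1\ldots n}^2 \\
={}& \left(\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}(\mu_i-\mu_j)^2\prod_{\substack{k=1\\ k\neq i,j}}^{n}\sigma_k^2\right)\sigma_{n+1}^2 + \frac{1}{\alpha_n}\sum_{i=1}^{n}(\mu_i-\mu_{n+1})^2\,\frac{\gamma_n^2}{\sigma_i^4} + \left(\sum_{i=1}^{n}(\mu_i-\mu_{n+1})^2\sum_{\substack{j=1\\ j\neq i}}^{n}\frac{\gamma_n}{\sigma_i^2\sigma_j^2}\right)\sigma_{i=1\ldots n}^2 \\
={}& \left(\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}(\mu_i-\mu_j)^2\prod_{\substack{k=1\\ k\neq i,j}}^{n}\sigma_k^2\right)\sigma_{n+1}^2 + \sum_{i=1}^{n}(\mu_i-\mu_{n+1})^2\left(\frac{\gamma_n^2}{\alpha_n\sigma_i^4} + \sum_{\substack{j=1\\ j\neq i}}^{n}\frac{\gamma_n\,\sigma_{i=1\ldots n}^2}{\sigma_i^2\sigma_j^2}\right)
\end{aligned}$$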

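Now, examine

$$\frac{\gamma_n^2}{\alpha_n\sigma_i^4} + \sum_{\substack{j=1\\ j\neq i}}^{n}\frac{\gamma_n\,\sigma_{i=1\ldots n}^2}{\sigma_i^2\sigma_j^2} = \frac{\gamma_n^2}{\alpha_n}\left(\frac{1}{\sigma_i^4} + \sum_{\substack{j=1\\ j\neq i}}^{n}\frac{1}{\sigma_i^2\sigma_j^2}\right) = \frac{\gamma_n^2}{\alpha_n}\sum_{j=1}^{n}\frac{1}{\sigma_i^2\sigma_j^2} = \sigma_{i=1\ldots n}^2\,\gamma_n\,\frac{1}{\sigma_i^2}\sum_{j=1}^{n}\frac{1}{\sigma_j^2} = \frac{\gamma_n}{\sigma_i^2} = \prod_{\substack{j=1\\ j\neq i}}^{n}\sigma_j^2$$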

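So,

$$M = \left(\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}(\mu_i-\mu_j)^2\prod_{\substack{k=1\\ k\neq i,j}}^{n}\sigma_k^2\right)\sigma_{n+1}^2 + \sum_{i=1}^{n}\left((\mu_i-\mu_{n+1})^2\prod_{\substack{j=1\\ j\neq i}}^{n}\sigma_j^2\right) = \sum_{i=1}^{n}\sum_{j=i+1}^{n+1}\left((\mu_i-\mu_j)^2\prod_{\substack{k=1\\ k\neq i,j}}^{n+1}\sigma_k^2\right)$$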

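Collecting terms

$$S_{i=1\ldots n}\,S_{(i=1\ldots n)n+1} = \frac{1}{\sqrt{(2\pi)^{n}\alpha_{n+1}}} \exp\left[-\frac{\sigma_{i=1\ldots n+1}^2}{2}\left(\sum_{i=1}^{n}\sum_{j=i+1}^{n+1}\frac{(\mu_i-\mu_j)^2}{\sigma_i^2\sigma_j^2}\right)\right] = S_{i=1\ldots n+1} \quad\text{Q.E.D.}$$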
