2003-003
Internal Report
P.A. Bromiley
Last updated
22 / 6 / 2018
Abstract
It is well known that the product and the convolution of Gaussian probability density functions (PDFs)
are also Gaussian functions. This document provides proofs of this for several cases: the product of
two univariate Gaussian PDFs, the product of an arbitrary number of univariate Gaussian PDFs, the
product of an arbitrary number of multivariate Gaussian PDFs, and the convolution of two univariate
Gaussian PDFs. These results are useful in calculating the effects of smoothing applied as an
intermediate step in various algorithms.
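$$f(x) = \frac{1}{\sqrt{2\pi}\sigma_f}\, e^{-\frac{(x-\mu_f)^2}{2\sigma_f^2}} \quad\text{and}\quad g(x) = \frac{1}{\sqrt{2\pi}\sigma_g}\, e^{-\frac{(x-\mu_g)^2}{2\sigma_g^2}}$$
Their product is
$$f(x)g(x) = \frac{1}{2\pi\sigma_f\sigma_g}\, e^{-\left(\frac{(x-\mu_f)^2}{2\sigma_f^2} + \frac{(x-\mu_g)^2}{2\sigma_g^2}\right)} \tag{1}$$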
Examine the term in the exponent
$$\beta = \frac{(x-\mu_f)^2}{2\sigma_f^2} + \frac{(x-\mu_g)^2}{2\sigma_g^2}$$
Expanding the two quadratics and collecting terms in powers of x gives
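$$\beta = \frac{(\sigma_f^2+\sigma_g^2)x^2 - 2(\mu_f\sigma_g^2+\mu_g\sigma_f^2)x + \mu_f^2\sigma_g^2 + \mu_g^2\sigma_f^2}{2\sigma_f^2\sigma_g^2}$$
Dividing through by the coefficient of $x^2$ gives
$$\beta = \frac{x^2 - 2\,\frac{\mu_f\sigma_g^2+\mu_g\sigma_f^2}{\sigma_f^2+\sigma_g^2}\,x + \frac{\mu_f^2\sigma_g^2+\mu_g^2\sigma_f^2}{\sigma_f^2+\sigma_g^2}}{2\,\frac{\sigma_f^2\sigma_g^2}{\sigma_f^2+\sigma_g^2}} \tag{2}$$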
This is again a quadratic in $x$, and so Eq. 1 is a Gaussian function. Compare the terms in Eq. 2 to the usual
Gaussian form
$$P(x) = \frac{1}{\sqrt{2\pi}\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}} = \frac{1}{\sqrt{2\pi}\sigma}\, e^{-\frac{x^2-2\mu x+\mu^2}{2\sigma^2}}$$
Since a term $\epsilon$ that is independent of $x$ can be added to complete the square in $\beta$, this is sufficient to complete the
proof in cases where the normalisation can be ignored. The product of two Gaussian PDFs is proportional to a
Gaussian PDF with a mean that is minus half the coefficient of $x$ in the numerator of Eq. 2 and a standard
deviation that is the square root of half of the denominator i.e.
$$\sigma_{fg} = \sqrt{\frac{\sigma_f^2\sigma_g^2}{\sigma_f^2+\sigma_g^2}} \quad\text{and}\quad \mu_{fg} = \frac{\mu_f\sigma_g^2+\mu_g\sigma_f^2}{\sigma_f^2+\sigma_g^2} \tag{3}$$
i.e. the variance $\sigma_{fg}^2$ is half the harmonic mean of the individual variances $\sigma_f^2$ and $\sigma_g^2$, and the mean $\mu_{fg}$ is the
sum of the individual means $\mu_f$ and $\mu_g$, each weighted by the inverse of its own variance and normalised by the
sum of those weights. In general, the product is not itself a PDF as, due to the presence of the scaling factor,
it will not have the correct normalisation.
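The product $f(x)g(x)$ can now be written in the usual Gaussian form directly, with an unknown scaling constant
(this may be sufficient in cases where renormalisation can be applied). Alternatively, proceeding from Eq. 2,
suppose that $\epsilon$ is the term required to complete the square in $\beta$ i.e.
$$\epsilon = \frac{\left(\frac{\mu_f\sigma_g^2+\mu_g\sigma_f^2}{\sigma_f^2+\sigma_g^2}\right)^2 - \left(\frac{\mu_f\sigma_g^2+\mu_g\sigma_f^2}{\sigma_f^2+\sigma_g^2}\right)^2}{2\,\frac{\sigma_f^2\sigma_g^2}{\sigma_f^2+\sigma_g^2}} = 0$$
Adding $\epsilon$ to $\beta$, grouping the positive part of $\epsilon$ with the terms in $x$ to form a perfect square, and simplifying the
remaining terms, which are independent of $x$, gives
$$\beta = \frac{(x-\mu_{fg})^2}{2\sigma_{fg}^2} + \frac{(\mu_f-\mu_g)^2}{2(\sigma_f^2+\sigma_g^2)}$$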
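Therefore, the product of two Gaussian PDFs $f(x)$ and $g(x)$ is a scaled Gaussian PDF
$$f(x)g(x) = \frac{S_{fg}}{\sqrt{2\pi}\sigma_{fg}} \exp\left[-\frac{(x-\mu_{fg})^2}{2\sigma_{fg}^2}\right]$$
where
$$\sigma_{fg} = \sqrt{\frac{\sigma_f^2\sigma_g^2}{\sigma_f^2+\sigma_g^2}} \quad\text{and}\quad \mu_{fg} = \frac{\mu_f\sigma_g^2+\mu_g\sigma_f^2}{\sigma_f^2+\sigma_g^2} \tag{4}$$
and the scaling factor $S_{fg}$ is itself a Gaussian PDF on both $\mu_f$ and $\mu_g$ with standard deviation $\sqrt{\sigma_f^2+\sigma_g^2}$
$$S_{fg} = \frac{1}{\sqrt{2\pi(\sigma_f^2+\sigma_g^2)}} \exp\left[-\frac{(\mu_f-\mu_g)^2}{2(\sigma_f^2+\sigma_g^2)}\right] \tag{5}$$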
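The result for two univariate PDFs can be checked numerically. The following is a minimal sketch, not part of the original memo: the function names `gaussian_pdf` and `product_of_two` and the parameter values are illustrative assumptions. It evaluates $f(x)g(x)$ directly and compares it with the scaled Gaussian form at a few points.

```python
import math


def gaussian_pdf(x, mu, sigma):
    """Univariate Gaussian PDF with mean mu and standard deviation sigma."""
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2)) / (math.sqrt(2.0 * math.pi) * sigma)


def product_of_two(mu_f, sigma_f, mu_g, sigma_g):
    """Return (S_fg, mu_fg, sigma_fg) for the product of two Gaussian PDFs."""
    var_sum = sigma_f ** 2 + sigma_g ** 2
    sigma_fg = math.sqrt(sigma_f ** 2 * sigma_g ** 2 / var_sum)
    mu_fg = (mu_f * sigma_g ** 2 + mu_g * sigma_f ** 2) / var_sum
    # S_fg is a Gaussian in (mu_f - mu_g) with standard deviation sqrt(var_sum).
    s_fg = gaussian_pdf(mu_f, mu_g, math.sqrt(var_sum))
    return s_fg, mu_fg, sigma_fg


# Arbitrary illustrative parameters (assumptions, not taken from the memo).
MU_F, SIGMA_F, MU_G, SIGMA_G = 1.0, 2.0, 3.0, 0.5
s_fg, mu_fg, sigma_fg = product_of_two(MU_F, SIGMA_F, MU_G, SIGMA_G)

# Largest relative difference between f(x)g(x) and S_fg * N(mu_fg, sigma_fg).
max_rel_err = 0.0
for x in (-1.0, 0.0, 2.5, 4.0):
    direct = gaussian_pdf(x, MU_F, SIGMA_F) * gaussian_pdf(x, MU_G, SIGMA_G)
    closed = s_fg * gaussian_pdf(x, mu_fg, sigma_fg)
    max_rel_err = max(max_rel_err, abs(direct - closed) / direct)
print(max_rel_err)
```

The printed value should be at the level of floating-point rounding if the closed form is implemented correctly.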
It is much easier to generate a proof by induction for the scaling factor of products of larger numbers of Gaussians
if it is written in the form of a sum of terms, each of which involves a single subscript i.e. the parameters of a
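single Gaussian PDF. Appendix A provides the necessary proof, giving
$$S_{fg} = \frac{1}{\sqrt{2\pi\,\frac{\sigma_f^2\sigma_g^2}{\sigma_{fg}^2}}} \exp\left[-\frac{1}{2}\left(\frac{\mu_f^2}{\sigma_f^2} + \frac{\mu_g^2}{\sigma_g^2} - \frac{\mu_{fg}^2}{\sigma_{fg}^2}\right)\right] \tag{6}$$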
2 The Product of n Univariate Gaussian PDFs
Let N (µ, σ) represent a Gaussian PDF with mean µ and standard deviation σ. Let subscript i refer to an
individual Gaussian PDF in a product of n univariate Gaussian PDFs. Furthermore, let the subscript i = 1...n
refer to the parameters of the distribution that is the product of n individual Gaussian PDFs and subscripts of the
form i = (1...n − 1)n refer to the parameters of a distribution that is the product of two Gaussian PDFs, one of
which is itself the product of n − 1 Gaussian PDFs. Therefore, the results from Section 1 can be applied to the
first two Gaussian PDFs in the product of n Gaussian PDFs to produce a Gaussian PDF and a scaling factor. The
remaining n − 2 PDFs can then be introduced iteratively using the same expressions i.e.
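$$\begin{aligned}
\prod_{i=1}^{n} N(\mu_i,\sigma_i) &= S_{i=1\ldots 2}\, N(\mu_{i=1\ldots 2},\sigma_{i=1\ldots 2}) \prod_{i=3}^{n} N(\mu_i,\sigma_i) \\
&= S_{i=1\ldots 2}\, S_{(i=1\ldots 2)3}\, N(\mu_{(i=1\ldots 2)3},\sigma_{(i=1\ldots 2)3}) \prod_{i=4}^{n} N(\mu_i,\sigma_i) = \ldots
\end{aligned}$$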
for the scaling factor. Similarly, using Eq. 7 to manipulate some of the standard deviation terms,
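$$S_{(i=1\ldots n)(n+1)} = \frac{1}{(2\pi)^{1/2}} \sqrt{\frac{\sigma_{i=1\ldots n+1}^2}{\sigma_{i=1\ldots n}^2\,\sigma_{n+1}^2}} \exp\left[-\frac{1}{2}\left(\frac{\mu_{i=1\ldots n}^2}{\sigma_{i=1\ldots n}^2} + \frac{\mu_{n+1}^2}{\sigma_{n+1}^2} - \frac{\mu_{(i=1\ldots n)(n+1)}^2}{\sigma_{(i=1\ldots n)(n+1)}^2}\right)\right]$$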
The scaling factor is the product of individual scaling factors for each pairwise multiplication, so
Therefore
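$$\sum_{i=1}^{n}\frac{\mu_i^2}{\sigma_i^2} + \frac{\mu_{n+1}^2}{\sigma_{n+1}^2} - \frac{\mu_{(i=1\ldots n)(n+1)}^2}{\sigma_{(i=1\ldots n)(n+1)}^2} = \sum_{i=1}^{n+1}\frac{\mu_i^2}{\sigma_i^2} - \frac{\mu_{(i=1\ldots n+1)}^2}{\sigma_{(i=1\ldots n+1)}^2}$$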
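So
$$S_{i=1\ldots n+1} = \frac{1}{(2\pi)^{n/2}} \sqrt{\frac{\sigma_{i=1\ldots n+1}^2}{\prod_{i=1}^{n+1}\sigma_i^2}} \exp\left[-\frac{1}{2}\left(\sum_{i=1}^{n+1}\frac{\mu_i^2}{\sigma_i^2} - \frac{\mu_{i=1\ldots n+1}^2}{\sigma_{i=1\ldots n+1}^2}\right)\right]$$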
which, together with Eq. 6, constitutes a proof by induction of Eq. 9. As with the product of two univariate
Gaussian PDFs, the scaling factor is a Gaussian function. However, it is not a PDF, as it does not have the correct
normalisation.
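The induction can be mirrored numerically. The sketch below is an illustration under stated assumptions (the names `product_of_n` and `product_pairwise` and the test parameters are hypothetical, not from the memo): it computes the scaling factor, mean and standard deviation of the product once from the closed-form expressions and once by folding in one PDF at a time with the two-PDF result from Section 1.

```python
import math


def gaussian_pdf(x, mu, sigma):
    """Univariate Gaussian PDF with mean mu and standard deviation sigma."""
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2)) / (math.sqrt(2.0 * math.pi) * sigma)


def product_of_n(mus, sigmas):
    """Closed-form (S, mu, sigma) for a product of n univariate Gaussian PDFs."""
    n = len(mus)
    var = 1.0 / sum(1.0 / s ** 2 for s in sigmas)
    mu = var * sum(m / s ** 2 for m, s in zip(mus, sigmas))
    quad = sum(m ** 2 / s ** 2 for m, s in zip(mus, sigmas)) - mu ** 2 / var
    prod_var = math.prod(s ** 2 for s in sigmas)
    scale = math.sqrt(var / prod_var) * math.exp(-0.5 * quad) / (2.0 * math.pi) ** ((n - 1) / 2.0)
    return scale, mu, math.sqrt(var)


def product_pairwise(mus, sigmas):
    """Same quantities, obtained by folding in one PDF at a time."""
    scale, mu, sigma = 1.0, mus[0], sigmas[0]
    for m, s in zip(mus[1:], sigmas[1:]):
        var_sum = sigma ** 2 + s ** 2
        scale *= gaussian_pdf(mu, m, math.sqrt(var_sum))  # pairwise scaling factor
        mu, sigma = ((mu * s ** 2 + m * sigma ** 2) / var_sum,
                     math.sqrt(sigma ** 2 * s ** 2 / var_sum))
    return scale, mu, sigma


# Arbitrary illustrative parameters (assumptions, not taken from the memo).
mus = [0.0, 1.0, -2.0, 0.5]
sigmas = [1.0, 2.0, 0.7, 1.5]
closed_form = product_of_n(mus, sigmas)
folded = product_pairwise(mus, sigmas)
print(closed_form)
print(folded)
```

The two printed tuples should agree to rounding error, since the closed form is the pairwise recursion written out in full.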
where d is the dimensionality of x, µ is the d-dimensional mean vector, and V is the d-by-d dimensional covariance
matrix; this document adopts the standard notation of using bold face symbols to represent vectors and matrices.
The Gaussian PDF can also be written in canonical notation as
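$$p(\mathbf{x}) = \exp\left[\zeta + \boldsymbol{\eta}^T\mathbf{x} - \frac{1}{2}\mathbf{x}^T\boldsymbol{\Lambda}\mathbf{x}\right] \tag{10}$$
where
$$\boldsymbol{\Lambda} = \mathbf{V}^{-1}, \quad \boldsymbol{\eta} = \mathbf{V}^{-1}\boldsymbol{\mu} \quad\text{and}\quad \zeta = -\frac{1}{2}\left(d\log 2\pi - \log|\boldsymbol{\Lambda}| + \boldsymbol{\eta}^T\boldsymbol{\Lambda}^{-1}\boldsymbol{\eta}\right)$$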
So the product of n Gaussian PDFs i = 1...n is
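$$\prod_{i=1}^{n} p_i(\mathbf{x}) = \exp\left[\zeta_{i=1\ldots n} + \left(\sum_{i=1}^{n}\boldsymbol{\eta}_i\right)^T\mathbf{x} - \frac{1}{2}\mathbf{x}^T\left(\sum_{i=1}^{n}\boldsymbol{\Lambda}_i\right)\mathbf{x}\right]$$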
where
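$$\zeta_{i=1\ldots n} = \sum_{i=1}^{n}\zeta_i = -\frac{1}{2}\left(nd\log 2\pi - \sum_{i=1}^{n}\log|\boldsymbol{\Lambda}_i| + \sum_{i=1}^{n}\boldsymbol{\eta}_i^T\boldsymbol{\Lambda}_i^{-1}\boldsymbol{\eta}_i\right)$$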
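So
$$\begin{aligned}
\prod_{i=1}^{n} p_i(\mathbf{x}) &= \exp\left[\zeta_{i=1\ldots n} + \zeta_n - \zeta_n + \left(\sum_{i=1}^{n}\boldsymbol{\eta}_i\right)^T\mathbf{x} - \frac{1}{2}\mathbf{x}^T\left(\sum_{i=1}^{n}\boldsymbol{\Lambda}_i\right)\mathbf{x}\right] \\
&= \exp(\zeta_{i=1\ldots n} - \zeta_n)\exp\left[\zeta_n + \boldsymbol{\eta}_n^T\mathbf{x} - \frac{1}{2}\mathbf{x}^T\boldsymbol{\Lambda}_n\mathbf{x}\right]
\end{aligned} \tag{11}$$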
where
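$$\boldsymbol{\Lambda}_n = \sum_{i=1}^{n}\boldsymbol{\Lambda}_i, \quad \boldsymbol{\eta}_n = \sum_{i=1}^{n}\boldsymbol{\eta}_i$$
and
$$\zeta_n = -\frac{1}{2}\left(d\log 2\pi - \log|\boldsymbol{\Lambda}_n| + \boldsymbol{\eta}_n^T\boldsymbol{\Lambda}_n^{-1}\boldsymbol{\eta}_n\right) \tag{12}$$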
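Comparing Eqs. 10, 11 and 12 shows that the result is, as in the previous sections, a scaled Gaussian PDF over $\mathbf{x}$
with a mean vector and covariance matrix given by
$$\mathbf{V}_n^{-1} = \sum_{i=1}^{n}\mathbf{V}_i^{-1} \quad\text{and}\quad \mathbf{V}_n^{-1}\boldsymbol{\mu}_n = \sum_{i=1}^{n}\mathbf{V}_i^{-1}\boldsymbol{\mu}_i$$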
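The multivariate result amounts to summing precision matrices and precision-weighted means. The sketch below illustrates this for the two-dimensional case using only the standard library; the helper names (`inv2`, `matvec2`, `combine`, `log_pdf2`) and the example parameters are assumptions made for illustration. It checks that the log of the product of the PDFs differs from the log of $N(\boldsymbol{\mu}_n, \mathbf{V}_n)$ by an amount that does not depend on $\mathbf{x}$, i.e. that the product is a scaled Gaussian PDF.

```python
import math


def inv2(m):
    """Inverse of a 2x2 matrix given as nested sequences."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def matvec2(m, v):
    """Product of a 2x2 matrix and a length-2 vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]


def combine(means, covs):
    """Mean vector and covariance matrix of the Gaussian factor of the product."""
    lam = [[0.0, 0.0], [0.0, 0.0]]  # Lambda_n: sum of the precision matrices
    eta = [0.0, 0.0]                # eta_n: sum of V_i^{-1} mu_i
    for mu, cov in zip(means, covs):
        prec = inv2(cov)
        contrib = matvec2(prec, mu)
        for r in range(2):
            eta[r] += contrib[r]
            for c in range(2):
                lam[r][c] += prec[r][c]
    cov_n = inv2(lam)
    return matvec2(cov_n, eta), cov_n


def log_pdf2(x, mu, cov):
    """Log of the 2-D Gaussian PDF with mean mu and covariance cov at x."""
    diff = [x[0] - mu[0], x[1] - mu[1]]
    pd = matvec2(inv2(cov), diff)
    quad = diff[0] * pd[0] + diff[1] * pd[1]
    det = cov[0][0] * cov[1][1] - cov[0][1] * cov[1][0]
    return -0.5 * (2.0 * math.log(2.0 * math.pi) + math.log(det) + quad)


# Arbitrary illustrative parameters (assumptions, not taken from the memo).
means = [[0.0, 0.0], [2.0, 1.0], [-1.0, 0.5]]
covs = [[[1.0, 0.2], [0.2, 2.0]], [[0.5, 0.0], [0.0, 0.5]], [[3.0, -0.4], [-0.4, 1.0]]]
mu_n, cov_n = combine(means, covs)

# The log of the product minus the log of N(mu_n, V_n) should not depend on x.
offsets = []
for x in ([0.0, 0.0], [1.0, -1.0], [-2.0, 3.0]):
    log_product = sum(log_pdf2(x, m, c) for m, c in zip(means, covs))
    offsets.append(log_product - log_pdf2(x, mu_n, cov_n))
print(offsets)
```

The common value of the offsets is the log of the scaling factor, $\zeta_{i=1\ldots n} - \zeta_n$ in the notation above.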
4 The Convolution of Two Univariate Gaussian PDFs
We wish to find the convolution of two Gaussian PDFs
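$$f(x) = \frac{1}{\sqrt{2\pi}\sigma_f}\, e^{-\frac{(x-\mu_f)^2}{2\sigma_f^2}} \quad\text{and}\quad g(x) = \frac{1}{\sqrt{2\pi}\sigma_g}\, e^{-\frac{(x-\mu_g)^2}{2\sigma_g^2}}$$
in the most general case i.e. non-identical means. The convolution of two functions $f(t)$ and $g(t)$ over a finite
range (see footnote 1) is defined as
$$\int_{0}^{x} f(x-\tau)g(\tau)\,d\tau = f \otimes g$$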
The term in sin (x′ ) is odd and so its integral over all space will be zero, leaving
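$$\mathcal{F}(f(x)) = \frac{e^{-2\pi ik\mu_f}}{\sqrt{2\pi}\sigma_f} \int_{-\infty}^{\infty} e^{-\frac{x'^2}{2\sigma_f^2}} \cos(2\pi kx')\,dx'$$
and so
$$\mathcal{F}(f(x)) = e^{-2\pi ik\mu_f}\, e^{-2\pi^2\sigma_f^2k^2} \tag{14}$$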
The second term in this expression is a Gaussian function of $k$: the Fourier transform of a Gaussian PDF is, up to
normalisation, another Gaussian. The first term is a phase term accounting for the mean of $f(x)$ i.e. its offset from
zero. The Fourier transform of $g(x)$ will give a similar expression, and so
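$$\mathcal{F}(f(x))\,\mathcal{F}(g(x)) = e^{-2\pi ik\mu_f}\, e^{-2\pi^2\sigma_f^2k^2}\, e^{-2\pi ik\mu_g}\, e^{-2\pi^2\sigma_g^2k^2} = e^{-2\pi ik(\mu_f+\mu_g)}\, e^{-2\pi^2(\sigma_f^2+\sigma_g^2)k^2} \tag{15}$$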
Footnote 1: In practice, convolutions are more often performed over an infinite range
$$\int_{-\infty}^{\infty} f(x-\tau)g(\tau)\,d\tau = f \otimes g$$
Comparing Eq. 15 to Eq. 14, we can see that it is the Fourier transform of a Gaussian PDF with mean and
standard deviation
$$\mu_{f\otimes g} = \mu_f + \mu_g \quad\text{and}\quad \sigma_{f\otimes g} = \sqrt{\sigma_f^2+\sigma_g^2}$$
and therefore, since the Fourier transform is invertible,
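$$P_{f\otimes g}(x) = \mathcal{F}^{-1}\left[\mathcal{F}(f(x))\,\mathcal{F}(g(x))\right] = \frac{1}{\sqrt{2\pi(\sigma_f^2+\sigma_g^2)}}\, e^{-\frac{(x-(\mu_f+\mu_g))^2}{2(\sigma_f^2+\sigma_g^2)}}$$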
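The convolution result can be checked without the Fourier transform by evaluating the convolution integral over an infinite range numerically. The sketch below is an illustration only: the function name `convolve_at`, the integration limits and the parameter values are assumptions, and a plain Riemann sum stands in for the integral.

```python
import math


def gaussian_pdf(x, mu, sigma):
    """Univariate Gaussian PDF with mean mu and standard deviation sigma."""
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2)) / (math.sqrt(2.0 * math.pi) * sigma)


def convolve_at(x, mu_f, sigma_f, mu_g, sigma_g, half_width=40.0, steps=8000):
    """Riemann-sum approximation of the integral of f(x - tau) g(tau) over tau."""
    d_tau = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps + 1):
        # The grid is centred on mu_g, where g(tau) carries its weight.
        tau = mu_g - half_width + k * d_tau
        total += gaussian_pdf(x - tau, mu_f, sigma_f) * gaussian_pdf(tau, mu_g, sigma_g)
    return total * d_tau


# Arbitrary illustrative parameters (assumptions, not taken from the memo).
MU_F, SIGMA_F, MU_G, SIGMA_G = 1.0, 2.0, 3.0, 0.5
sigma_conv = math.sqrt(SIGMA_F ** 2 + SIGMA_G ** 2)
for x in (0.0, 4.0, 7.5):
    numeric = convolve_at(x, MU_F, SIGMA_F, MU_G, SIGMA_G)
    closed = gaussian_pdf(x, MU_F + MU_G, sigma_conv)
    print(x, numeric, closed)
```

Each printed pair should agree closely, because the integrand is smooth and decays rapidly well inside the integration limits.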
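It may be worth noting a general result at this point: the area under a convolution is equal to the product of the
areas under the factors
$$\begin{aligned}
\int_{-\infty}^{\infty} (f \otimes g)\,dt &= \int_{-\infty}^{\infty}\left[\int_{-\infty}^{\infty} f(u)g(t-u)\,du\right]dt \\
&= \int_{-\infty}^{\infty} f(u)\left[\int_{-\infty}^{\infty} g(t-u)\,dt\right]du \\
&= \left[\int_{-\infty}^{\infty} f(u)\,du\right]\left[\int_{-\infty}^{\infty} g(t)\,dt\right]
\end{aligned}$$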
Therefore, the preservation of the normalisation when convolving PDFs, i.e. the fact that the convolution is also a
PDF with unit area, is a special case of this result that arises because both factors have unit area. For functions
with other areas, the area under the convolution differs from the area under either factor.
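The area result has an exact discrete analogue, which the following sketch uses as an illustration. The function names `sample` and `discrete_convolution`, the grid and the two unnormalised test functions are assumptions chosen for the example, not taken from the memo.

```python
import math


def sample(func, lo, hi, step):
    """Sample func on a regular grid from lo to hi inclusive."""
    count = int(round((hi - lo) / step)) + 1
    return [func(lo + k * step) for k in range(count)]


def discrete_convolution(f_vals, g_vals, step):
    """Discrete convolution scaled by the grid step, approximating the integral."""
    out = [0.0] * (len(f_vals) + len(g_vals) - 1)
    for i, f_val in enumerate(f_vals):
        for j, g_val in enumerate(g_vals):
            out[i + j] += f_val * g_val * step
    return out


STEP = 0.05
# Two unnormalised Gaussian-shaped functions (arbitrary illustrative choices).
f_vals = sample(lambda x: 3.0 * math.exp(-x ** 2 / 2.0), -15.0, 15.0, STEP)
g_vals = sample(lambda x: 0.5 * math.exp(-(x - 1.0) ** 2 / 8.0), -15.0, 15.0, STEP)

area_f = sum(f_vals) * STEP
area_g = sum(g_vals) * STEP
area_conv = sum(discrete_convolution(f_vals, g_vals, STEP)) * STEP
print(area_f * area_g, area_conv)
```

Neither factor has unit area here, so the area under the convolution is not unity; it matches the product of the two areas instead.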
5 Summary
It is well known that the product and the convolution of a pair of Gaussian PDFs are also Gaussian. In the case
of the product of two univariate Gaussian PDFs N (µf , σf ) and N (µg , σg ), the result is a scaled Gaussian PDF
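where the scaling factor is itself a Gaussian PDF on both $\mu_f$ and $\mu_g$
$$N(\mu_f,\sigma_f)\,N(\mu_g,\sigma_g) = \frac{S_{fg}}{\sqrt{2\pi}\sigma_{fg}} \exp\left[-\frac{(x-\mu_{fg})^2}{2\sigma_{fg}^2}\right] \quad\text{where}\quad \frac{1}{\sigma_{fg}^2} = \frac{1}{\sigma_f^2} + \frac{1}{\sigma_g^2}, \quad \mu_{fg} = \left(\frac{\mu_f}{\sigma_f^2} + \frac{\mu_g}{\sigma_g^2}\right)\sigma_{fg}^2$$
$$\text{and}\quad S_{fg} = \frac{1}{\sqrt{2\pi\,\frac{\sigma_f^2\sigma_g^2}{\sigma_{fg}^2}}} \exp\left[-\frac{1}{2}\,\frac{(\mu_f-\mu_g)^2}{\sigma_f^2\sigma_g^2}\,\sigma_{fg}^2\right]$$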
It should be noted that this result is not the PDF of the product of two Gaussian random variates; in that case,
the product normal distribution applies.
The product of n univariate Gaussian PDFs is given by
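$$\prod_{i=1}^{n} N(\mu_i,\sigma_i) = \frac{S_{i=1\ldots n}}{\sqrt{2\pi\sigma_{i=1\ldots n}^2}} \exp\left[-\frac{(x-\mu_{i=1\ldots n})^2}{2\sigma_{i=1\ldots n}^2}\right] \quad\text{where}\quad \frac{1}{\sigma_{i=1\ldots n}^2} = \sum_{i=1}^{n}\frac{1}{\sigma_i^2}, \quad \mu_{i=1\ldots n} = \left(\sum_{i=1}^{n}\frac{\mu_i}{\sigma_i^2}\right)\sigma_{i=1\ldots n}^2$$
$$\text{and}\quad S_{i=1\ldots n} = \frac{1}{(2\pi)^{(n-1)/2}} \sqrt{\frac{\sigma_{i=1\ldots n}^2}{\prod_{i=1}^{n}\sigma_i^2}} \exp\left[-\frac{1}{2}\left(\sum_{i=1}^{n}\frac{\mu_i^2}{\sigma_i^2} - \frac{\mu_{i=1\ldots n}^2}{\sigma_{i=1\ldots n}^2}\right)\right]$$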
i.e. is a Gaussian PDF scaled by a Gaussian function.
The product of n multivariate Gaussian PDFs is given by
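$$\prod_{i=1}^{n} N(\boldsymbol{\mu}_i, \mathbf{V}_i^{-1}) = \exp(\zeta_{i=1\ldots n} - \zeta_n)\exp\left[\zeta_n + \boldsymbol{\eta}_n^T\mathbf{x} - \frac{1}{2}\mathbf{x}^T\boldsymbol{\Lambda}_n\mathbf{x}\right]$$
where
$$\boldsymbol{\Lambda}_i = \mathbf{V}_i^{-1}, \quad \boldsymbol{\eta}_i = \mathbf{V}_i^{-1}\boldsymbol{\mu}_i, \quad \boldsymbol{\Lambda}_n = \sum_{i=1}^{n}\boldsymbol{\Lambda}_i, \quad \boldsymbol{\eta}_n = \sum_{i=1}^{n}\boldsymbol{\eta}_i,$$
$$\zeta_n = -\frac{1}{2}\left(d\log 2\pi - \log|\boldsymbol{\Lambda}_n| + \boldsymbol{\eta}_n^T\boldsymbol{\Lambda}_n^{-1}\boldsymbol{\eta}_n\right) \quad\text{and}$$
$$\zeta_{i=1\ldots n} = \sum_{i=1}^{n}\zeta_i = -\frac{1}{2}\left(nd\log 2\pi - \sum_{i=1}^{n}\log|\boldsymbol{\Lambda}_i| + \sum_{i=1}^{n}\boldsymbol{\eta}_i^T\boldsymbol{\Lambda}_i^{-1}\boldsymbol{\eta}_i\right)$$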
i.e. a Gaussian PDF scaled by a Gaussian function.
The convolution of two Gaussian PDFs is a Gaussian PDF with mean and standard deviation
$$\mu_{f\otimes g} = \mu_f + \mu_g \quad\text{and}\quad \sigma_{f\otimes g} = \sqrt{\sigma_f^2+\sigma_g^2}$$
These results can be useful in a number of applications; for example, the convolution of Gaussian distributions
frequently occurs in smoothing applied as an intermediate step in various machine vision algorithms. Products of
Gaussian PDFs may occur during the application of Bayes' theorem, and in some problems related to Gaussian
processes.
6 Acknowledgements
Since the original version of this memo was uploaded in 2003, several correspondents have been kind enough to
suggest extensions or corrections. In particular, thanks are due to:
• David Kirchheimer, University of Bristol, for pointing out the importance of the product of an arbitrary
number of Gaussian PDFs and providing Matlab code for numerical testing of the results.
• Thomas Dent, Institute for Gravitational Physics (Albert Einstein Institute), Hannover, Germany, for
pointing out several typographical errors.
• Abdulazeez Olaseni Atanda of the University of Tartu, Estonia, for pointing out a typo in Eq. 13.
• Duane A. McVay of Texas A&M University, for pointing out a correction in the discussion of Eq. 3.
So
$$2\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}\frac{\mu_i\mu_j}{\sigma_i^2\sigma_j^2} = \frac{\mu_{i=1\ldots n}^2}{\sigma_{i=1\ldots n}^4} - \sum_{i=1}^{n}\frac{\mu_i^2}{\sigma_i^4}$$
The terms in the exponent of the scaling factor for the product of univariate Gaussians take the form
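$$\begin{aligned}
\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}\frac{(\mu_i-\mu_j)^2}{\sigma_i^2\sigma_j^2}\,\sigma_{i=1\ldots n}^2 &= \sum_{i=1}^{n-1}\sum_{j=i+1}^{n}\left(\frac{\mu_i^2}{\sigma_i^2\sigma_j^2} - \frac{2\mu_i\mu_j}{\sigma_i^2\sigma_j^2} + \frac{\mu_j^2}{\sigma_i^2\sigma_j^2}\right)\sigma_{i=1\ldots n}^2 \\
&= \sigma_{i=1\ldots n}^2\left(\sum_{i=1}^{n}\sum_{\substack{j=1\\ j\neq i}}^{n}\frac{\mu_i^2}{\sigma_i^2\sigma_j^2} + \sum_{i=1}^{n}\frac{\mu_i^2}{\sigma_i^4} - \frac{\mu_{i=1\ldots n}^2}{\sigma_{i=1\ldots n}^4}\right) \\
&= \sum_{i=1}^{n}\frac{\mu_i^2}{\sigma_i^2} - \frac{\mu_{i=1\ldots n}^2}{\sigma_{i=1\ldots n}^2}
\end{aligned}$$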
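The identity between the pairwise form and the single-subscript form of the exponent is easy to test numerically. The sketch below is illustrative only; the function names `pairwise_exponent` and `single_index_exponent` and the parameter values are assumptions.

```python
def pairwise_exponent(mus, sigmas):
    """sigma_{1..n}^2 times the sum over pairs of (mu_i - mu_j)^2 / (sigma_i^2 sigma_j^2)."""
    n = len(mus)
    var = 1.0 / sum(1.0 / s ** 2 for s in sigmas)
    total = 0.0
    for i in range(n - 1):
        for j in range(i + 1, n):
            total += (mus[i] - mus[j]) ** 2 / (sigmas[i] ** 2 * sigmas[j] ** 2)
    return var * total


def single_index_exponent(mus, sigmas):
    """Sum of mu_i^2 / sigma_i^2 minus mu_{1..n}^2 / sigma_{1..n}^2."""
    var = 1.0 / sum(1.0 / s ** 2 for s in sigmas)
    mu = var * sum(m / s ** 2 for m, s in zip(mus, sigmas))
    return sum(m ** 2 / s ** 2 for m, s in zip(mus, sigmas)) - mu ** 2 / var


# Arbitrary illustrative parameters (assumptions, not taken from the memo).
mus = [0.0, 1.0, -2.0, 0.5]
sigmas = [1.0, 2.0, 0.7, 1.5]
print(pairwise_exponent(mus, sigmas), single_index_exponent(mus, sigmas))
```

The single-subscript form needs one pass over the parameters rather than a double sum over pairs, which is also what makes the induction in Section 2 straightforward.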
Note: Appendices B and C are an older version of the derivation for the product of n univariate Gaussian PDFs;
they do not use the manipulation given in Appendix A and so are considerably more complicated. The aim here
was to illustrate the derivation using the proof for three Gaussian PDFs, and then to replicate each step with
n Gaussian PDFs. However, these derivations are made redundant by the simpler versions given in the main
document.
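$$\begin{aligned}
&= S_{fg}\,N(\mu_{fg},\sigma_{fg})\,N(\mu_h,\sigma_h) \\
&= S_{fg}\,S_{(fg)h}\,N(\mu_{(fg)h},\sigma_{(fg)h})
\end{aligned}$$
Defining
$$S_{fgh}\,N(\mu_{fgh},\sigma_{fgh}) = S_{fg}\,S_{(fg)h}\,N(\mu_{(fg)h},\sigma_{(fg)h}) \tag{16}$$
we have
$$S_{fgh} = S_{fg}\,S_{(fg)h}, \quad \mu_{fgh} = \mu_{(fg)h} \quad\text{and}\quad \sigma_{fgh} = \sigma_{(fg)h}$$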
Since the expressions for the mean and standard deviation in Eq. 5 are expressed as the sums over individual
terms that feature only the parameters of a single distribution f , g, they can be extended to multiple distributions
easily
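$$\frac{1}{\sigma_{fgh}^2} = \frac{1}{\sigma_{(fg)h}^2} = \frac{1}{\sigma_{fg}^2} + \frac{1}{\sigma_h^2} = \frac{1}{\sigma_f^2} + \frac{1}{\sigma_g^2} + \frac{1}{\sigma_h^2} \tag{17}$$
and
$$\mu_{fgh} = \mu_{(fg)h} = \left(\frac{\mu_{fg}}{\sigma_{fg}^2} + \frac{\mu_h}{\sigma_h^2}\right)\sigma_{(fg)h}^2 = \left(\frac{\mu_f}{\sigma_f^2} + \frac{\mu_g}{\sigma_g^2} + \frac{\mu_h}{\sigma_h^2}\right)\sigma_{fgh}^2 \tag{18}$$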
Second
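$$\frac{(\mu_f-\mu_g)^2}{\sigma_f^2+\sigma_g^2} + \frac{(\mu_{fg}-\mu_h)^2}{\sigma_{fg}^2+\sigma_h^2} = \frac{(\mu_f-\mu_g)^2(\sigma_{fg}^2+\sigma_h^2) + (\mu_{fg}-\mu_h)^2(\sigma_f^2+\sigma_g^2)}{(\sigma_f^2+\sigma_g^2)(\sigma_{fg}^2+\sigma_h^2)}$$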
The denominator is the same as the first term, above, so only the numerator need be dealt with
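One approach is to expand the expression fully and then pair all of the terms to give an overall factor of $\sigma_f^2+\sigma_g^2$.
However, this is impractical as a route to a proof for the product of arbitrary numbers of Gaussians. Instead,
$$\begin{aligned}
M &= (\mu_f-\mu_g)^2\sigma_h^2 + (\mu_f-\mu_g)^2\frac{\sigma_f^2\sigma_g^2}{\sigma_f^2+\sigma_g^2} + \frac{\left[\mu_f\sigma_g^2+\mu_g\sigma_f^2-\mu_h(\sigma_f^2+\sigma_g^2)\right]^2}{\sigma_f^2+\sigma_g^2} \\
&= (\mu_f-\mu_g)^2\sigma_h^2 + (\mu_f-\mu_g)^2\frac{\sigma_f^2\sigma_g^2}{\sigma_f^2+\sigma_g^2} + \frac{\left[(\mu_f-\mu_h)\sigma_g^2+(\mu_g-\mu_h)\sigma_f^2\right]^2}{\sigma_f^2+\sigma_g^2} \\
&= (\mu_f-\mu_g)^2\sigma_h^2 + (\mu_f-\mu_g)^2\frac{\sigma_f^2\sigma_g^2}{\sigma_f^2+\sigma_g^2} + \frac{(\mu_f-\mu_h)^2\sigma_g^4}{\sigma_f^2+\sigma_g^2} + 2(\mu_f-\mu_h)(\mu_g-\mu_h)\frac{\sigma_f^2\sigma_g^2}{\sigma_f^2+\sigma_g^2} + \frac{(\mu_g-\mu_h)^2\sigma_f^4}{\sigma_f^2+\sigma_g^2}
\end{aligned}$$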
Now, observe that
$$2(\mu_f-\mu_h)(\mu_g-\mu_h) = (\mu_f-\mu_h)^2 + (\mu_g-\mu_h)^2 - (\mu_f-\mu_g)^2$$
and so
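$$M = (\mu_f-\mu_g)^2\sigma_h^2 + \frac{(\mu_f-\mu_h)^2\sigma_g^4}{\sigma_f^2+\sigma_g^2} + \frac{(\mu_g-\mu_h)^2\sigma_f^4}{\sigma_f^2+\sigma_g^2} + (\mu_f-\mu_h)^2\frac{\sigma_f^2\sigma_g^2}{\sigma_f^2+\sigma_g^2} + (\mu_g-\mu_h)^2\frac{\sigma_f^2\sigma_g^2}{\sigma_f^2+\sigma_g^2}$$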
C The Product of n Univariate Gaussian PDFs
Let subscript i refer to an individual Gaussian PDF in a product of n univariate Gaussian PDFs. Based on
the derivations in Sections 1 and B, it is clear that the product is also a Gaussian PDF, multiplied by a scaling
factor. The notation used in Section B is extended, so that the subscript i = 1...n refers to the parameters of the
distribution that is the product of n individual Gaussian PDFs and subscript i = (1...n − 1)n refers to the parameters
of a distribution that is the product of two Gaussian PDFs, one of which is itself the product of n − 1 Gaussian
PDFs. In addition, define
$$\alpha_n = \sum_{i=1}^{n}\prod_{\substack{j=1\\ j\neq i}}^{n}\sigma_j^2 \quad\text{and}\quad \gamma_n = \prod_{i=1}^{n}\sigma_i^2$$
By inspection of the results for the products of two and three Gaussian PDFs, state
$$\prod_{i=1}^{n} N(\mu_i,\sigma_i) = \frac{S_{i=1\ldots n}}{\sqrt{2\pi\sigma_{i=1\ldots n}^2}}\, e^{-\frac{(x-\mu_{i=1\ldots n})^2}{2\sigma_{i=1\ldots n}^2}}$$
where
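$$\frac{1}{\sigma_{i=1\ldots n}^2} = \sum_{i=1}^{n}\frac{1}{\sigma_i^2} \quad\text{or}\quad \sigma_{i=1\ldots n}^2 = \frac{\gamma_n}{\alpha_n}, \qquad \mu_{i=1\ldots n} = \left[\sum_{i=1}^{n}\frac{\mu_i}{\sigma_i^2}\right]\sigma_{i=1\ldots n}^2$$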
and
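$$S_{i=1\ldots n} = \frac{1}{\sqrt{(2\pi)^{n-1}\alpha_n}} \exp\left[-\frac{\sigma_{i=1\ldots n}^2}{2}\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}\frac{(\mu_i-\mu_j)^2}{\sigma_i^2\sigma_j^2}\right] \tag{21}$$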
The above expressions can be proved by observing that, following Eq. 16,
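Similarly, Eq. 5 gives the scaling factor for the product of $N(\mu_{i=1\ldots n}, \sigma_{i=1\ldots n})$ and $N(\mu_{n+1}, \sigma_{n+1})$ as
$$S_{(i=1\ldots n)n+1} = \frac{1}{\sqrt{2\pi(\sigma_{i=1\ldots n}^2+\sigma_{n+1}^2)}} \exp\left[-\frac{1}{2}\,\frac{(\mu_{i=1\ldots n}-\mu_{n+1})^2}{\sigma_{i=1\ldots n}^2+\sigma_{n+1}^2}\right]$$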
Therefore, the aim here is to show that
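$$S_{i=1\ldots n+1} = \frac{1}{\sqrt{(2\pi)^n\alpha_n(\sigma_{i=1\ldots n}^2+\sigma_{n+1}^2)}} \exp\left[-\frac{1}{2}\left(\frac{1}{\alpha_n}\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}(\mu_i-\mu_j)^2\prod_{\substack{k=1\\ k\neq i,j}}^{n}\sigma_k^2 + \frac{(\mu_{i=1\ldots n}-\mu_{n+1})^2}{\sigma_{i=1\ldots n}^2+\sigma_{n+1}^2}\right)\right]$$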
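$$\begin{aligned}
&= \frac{1}{\alpha_n}\left[\sum_{i=1}^{n}\mu_i\prod_{\substack{j=1\\ j\neq i}}^{n}\sigma_j^2 - \mu_{n+1}\sum_{i=1}^{n}\prod_{\substack{j=1\\ j\neq i}}^{n}\sigma_j^2\right]^2 = \frac{1}{\alpha_n}\left[\sum_{i=1}^{n}(\mu_i-\mu_{n+1})\prod_{\substack{j=1\\ j\neq i}}^{n}\sigma_j^2\right]^2 \\
&= \frac{1}{\alpha_n}\left[\sum_{i=1}^{n}(\mu_i-\mu_{n+1})^2\prod_{\substack{j=1\\ j\neq i}}^{n}\sigma_j^4 + 2\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}(\mu_i-\mu_{n+1})(\mu_j-\mu_{n+1})\prod_{k=1}^{n}\sigma_k^2\prod_{\substack{l=1\\ l\neq i,j}}^{n}\sigma_l^2\right]
\end{aligned}$$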
However,
$$\frac{\prod_{k=1}^{n}\sigma_k^2}{\alpha_n} = \sigma_{i=1\ldots n}^2$$
so, recombining this with the first two terms of M gives
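$$\begin{aligned}
M ={} & \sum_{i=1}^{n-1}\sum_{j=i+1}^{n}(\mu_i-\mu_j)^2\prod_{\substack{k=1\\ k\neq i,j}}^{n}\sigma_k^2\,\sigma_{n+1}^2 + \sum_{i=1}^{n-1}\sum_{j=i+1}^{n}(\mu_i-\mu_j)^2\prod_{\substack{k=1\\ k\neq i,j}}^{n}\sigma_k^2\,\sigma_{i=1\ldots n}^2 \\
& + \frac{1}{\alpha_n}\sum_{i=1}^{n}(\mu_i-\mu_{n+1})^2\prod_{\substack{j=1\\ j\neq i}}^{n}\sigma_j^4 + 2\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}(\mu_i-\mu_{n+1})(\mu_j-\mu_{n+1})\prod_{\substack{k=1\\ k\neq i,j}}^{n}\sigma_k^2\,\sigma_{i=1\ldots n}^2
\end{aligned}$$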
Applying Eq. 19 to the second and fourth terms gives
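$$\begin{aligned}
& \sum_{i=1}^{n-1}\sum_{j=i+1}^{n}(\mu_i-\mu_j)^2\prod_{\substack{k=1\\ k\neq i,j}}^{n}\sigma_k^2\,\sigma_{n+1}^2 + \frac{1}{\alpha_n}\sum_{i=1}^{n}(\mu_i-\mu_{n+1})^2\prod_{\substack{j=1\\ j\neq i}}^{n}\sigma_j^4 \\
& \qquad + \sum_{i=1}^{n-1}\sum_{j=i+1}^{n}\left[(\mu_i-\mu_{n+1})^2+(\mu_j-\mu_{n+1})^2\right]\prod_{\substack{k=1\\ k\neq i,j}}^{n}\sigma_k^2\,\sigma_{i=1\ldots n}^2 \\
={} & \sum_{i=1}^{n-1}\sum_{j=i+1}^{n}(\mu_i-\mu_j)^2\prod_{\substack{k=1\\ k\neq i,j}}^{n}\sigma_k^2\,\sigma_{n+1}^2 + \frac{1}{\alpha_n}\sum_{i=1}^{n}(\mu_i-\mu_{n+1})^2\frac{\gamma_n^2}{\sigma_i^4} \\
& \qquad + \sum_{i=1}^{n-1}\sum_{j=i+1}^{n}\left[(\mu_i-\mu_{n+1})^2+(\mu_j-\mu_{n+1})^2\right]\frac{\gamma_n}{\sigma_i^2\sigma_j^2}\,\sigma_{i=1\ldots n}^2 \\
={} & \sum_{i=1}^{n-1}\sum_{j=i+1}^{n}(\mu_i-\mu_j)^2\prod_{\substack{k=1\\ k\neq i,j}}^{n}\sigma_k^2\,\sigma_{n+1}^2 + \frac{1}{\alpha_n}\sum_{i=1}^{n}(\mu_i-\mu_{n+1})^2\frac{\gamma_n^2}{\sigma_i^4} + \sum_{i=1}^{n}\sum_{\substack{j=1\\ j\neq i}}^{n}(\mu_i-\mu_{n+1})^2\frac{\gamma_n}{\sigma_i^2\sigma_j^2}\,\sigma_{i=1\ldots n}^2 \\
={} & \sum_{i=1}^{n-1}\sum_{j=i+1}^{n}(\mu_i-\mu_j)^2\prod_{\substack{k=1\\ k\neq i,j}}^{n}\sigma_k^2\,\sigma_{n+1}^2 + \sum_{i=1}^{n}(\mu_i-\mu_{n+1})^2\left[\frac{\gamma_n^2}{\alpha_n\sigma_i^4} + \sum_{\substack{j=1\\ j\neq i}}^{n}\frac{\gamma_n\,\sigma_{i=1\ldots n}^2}{\sigma_i^2\sigma_j^2}\right]
\end{aligned}$$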
Now, examine
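$$\begin{aligned}
\frac{\gamma_n^2}{\alpha_n\sigma_i^4} + \sum_{\substack{j=1\\ j\neq i}}^{n}\frac{\gamma_n\,\sigma_{i=1\ldots n}^2}{\sigma_i^2\sigma_j^2} &= \frac{\gamma_n^2}{\alpha_n}\left[\frac{1}{\sigma_i^4} + \sum_{\substack{j=1\\ j\neq i}}^{n}\frac{1}{\sigma_i^2\sigma_j^2}\right] = \frac{\gamma_n^2}{\alpha_n}\sum_{j=1}^{n}\frac{1}{\sigma_i^2\sigma_j^2} \\
&= \sigma_{i=1\ldots n}^2\,\gamma_n\,\frac{1}{\sigma_i^2}\sum_{j=1}^{n}\frac{1}{\sigma_j^2} = \frac{\gamma_n}{\sigma_i^2} = \prod_{\substack{j=1\\ j\neq i}}^{n}\sigma_j^2
\end{aligned}$$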
So,
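$$M = \sum_{i=1}^{n-1}\sum_{j=i+1}^{n}(\mu_i-\mu_j)^2\prod_{\substack{k=1\\ k\neq i,j}}^{n}\sigma_k^2\,\sigma_{n+1}^2 + \sum_{i=1}^{n}(\mu_i-\mu_{n+1})^2\prod_{\substack{j=1\\ j\neq i}}^{n}\sigma_j^2 = \sum_{i=1}^{n}\sum_{j=i+1}^{n+1}(\mu_i-\mu_j)^2\prod_{\substack{k=1\\ k\neq i,j}}^{n+1}\sigma_k^2$$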
Collecting terms
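$$S_{i=1\ldots n}\,S_{(i=1\ldots n)n+1} = \frac{1}{\sqrt{(2\pi)^n\alpha_{n+1}}} \exp\left[-\frac{\sigma_{i=1\ldots n+1}^2}{2}\sum_{i=1}^{n}\sum_{j=i+1}^{n+1}\frac{(\mu_i-\mu_j)^2}{\sigma_i^2\sigma_j^2}\right] = S_{i=1\ldots n+1} \quad\text{Q.E.D.}$$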