
Chapter 10

Gaussian Mixture Model

10.1 Introduction
Each Gaussian distribution is characterized by its mean and variance. In a mixture of M Gaussian distributions, each component carries a third parameter: its weight in the mixture. A model of this form is called a Gaussian mixture model (GMM). The following equation represents a GMM with M components:
$$p(x \mid \theta) = \sum_{k=1}^{M} w_k \, p(x \mid \theta_k)$$

where $w_k$ represents the weight of the kth component. The mean and covariance of the kth component are collected in $\theta_k = (\mu_k, \Sigma_k)$, and $p(x \mid \theta_k)$, the Gaussian density of the kth component, is a D-variate Gaussian function of the following form:

$$p(x \mid \theta_k) = p(x \mid \mu_k, \Sigma_k) = \frac{1}{(2\pi)^{D/2}\,\lvert\Sigma_k\rvert^{1/2}}\, e^{-(1/2)(x-\mu_k)'\,\Sigma_k^{-1}\,(x-\mu_k)}$$
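For the one-dimensional case used in the example of Section 10.2 ($D = 1$ and $\Sigma_k = \sigma_k^2$), this density reduces to the familiar bell curve:

$$p(x \mid \mu_k, \sigma_k^2) = \frac{1}{\sigma_k\sqrt{2\pi}}\, e^{-(x-\mu_k)^2/(2\sigma_k^2)}$$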

The weights must be positive and sum to one: $\sum_{k=1}^{M} w_k = 1$ and $w_k > 0 \; \forall k$.
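As a minimal illustration of the mixture density, the following sketch evaluates a one-dimensional GMM at a single point (it assumes the Statistics Toolbox function normpdf; the weights, means, and variances anticipate the example of Section 10.2):

% Evaluate p(x) = sum_k w_k * N(x; mu_k, sigma_k^2) at one point.
w = [0.3 0.5 0.2];          % mixing weights, summing to 1
mu = [-1 0 3];              % component means
sd = sqrt([2.25 1 0.25]);   % standard deviations from the variances
x = 0.5;                    % query point
px = sum(w .* normpdf(x, mu, sd))   % mixture density at x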
When performing clustering with a GMM, the goal is to find the model parameters (the mean and covariance of each distribution as well as the weights) so that the resulting model best fits the data. Best fit here means maximizing the likelihood of the data under the GMM. This can be achieved with the iterative expectation-maximization (EM) algorithm, which requires initial parameter estimates. If these initial estimates are poor, the algorithm can get stuck in a local optimum. A common remedy is to run k-means first and use the means and covariances of the discovered clusters as the starting point for the EM algorithm, as sketched below.
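In MATLAB, one way to realize this initialization is to pass the k-means assignments to the fitting routine (a sketch, assuming the Statistics Toolbox and that 'Start' accepts a vector of initial component assignments; X is an N-by-D data matrix):

idx0 = kmeans(X, 3);                            % hard assignments from k-means
gm = gmdistribution.fit(X, 3, 'Start', idx0);   % EM starts from these groups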
Once we have fitted the mixture model, we can explore the clusters by computing the posterior probability of each data instance under each mixture component. The GMM then assigns each instance to the cluster whose component is most likely to have generated it.
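A minimal sketch of this step, assuming gm is a gmdistribution object fitted as in the sketch above:

P = posterior(gm, X);          % N-by-3 matrix of posterior probabilities
[~, labels] = max(P, [], 2);   % hard label = most probable component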
GMM is used in a number of applications, including
speaker identification [1] and biometric verification [2].

10.2 Learning the Concept by Example


In order to understand how GMMs can be used for clustering, we will describe the process of clustering with a one-dimensional GMM with three components. Each component has its own mean and variance, and the graph (bell curve) of its probability density peaks at the mean, so our example will have three bell curves with three different peaks. The parameters of each component are given in Table 10.1.
In MATLAB®, the relevant code to enter this information is simple and is described as follows:

Table 10.1 The Means and Variances of Three Gaussian Distributions

Gaussian Component    Mean    Variance
1                      −1      2.25
2                       0      1
3                       3      0.25

mu1 = [-1];       % mean of component 1
mu2 = [0];        % mean of component 2
mu3 = [3];        % mean of component 3
sigma1 = [2.25];  % variance of component 1
sigma2 = [1];     % variance of component 2
sigma3 = [.25];   % variance of component 3

We will randomly generate values from the three distributions in different proportions: 30% of the values will come from the first distribution, 50% from the second, and 20% from the third. The following MATLAB code generates such a sample of 1000 random values:

weight1 = [.3];   % mixing weight of component 1
weight2 = [.5];   % mixing weight of component 2
weight3 = [.2];   % mixing weight of component 3
% The sample counts follow the weights: 0.3, 0.5, and 0.2 of 1000 draws.
component_1 = mvnrnd(mu1,sigma1,300);
component_2 = mvnrnd(mu2,sigma2,500);
component_3 = mvnrnd(mu3,sigma3,200);
X = [component_1; component_2; component_3];   % pooled 1000-by-1 sample
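As a quick sanity check (a sketch; newer MATLAB releases prefer histogram over hist), a histogram of the pooled sample should show modes near the three means:

hist(X, 50)   % expect peaks near -1, 0, and 3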

In order to see what the distributions of the three components look like, we will plot all three in one graph. The following MATLAB code performs the required job:

% sigmaN holds the variance, so use sqrt(sigmaN) as the standard deviation.
sd1 = sqrt(sigma1); sd2 = sqrt(sigma2); sd3 = sqrt(sigma3);
gd1 = exp(-0.5 * ((component_1 - mu1) / sd1).^2) / (sd1 * sqrt(2 * pi));
gd2 = exp(-0.5 * ((component_2 - mu2) / sd2).^2) / (sd2 * sqrt(2 * pi));
gd3 = exp(-0.5 * ((component_3 - mu3) / sd3).^2) / (sd3 * sqrt(2 * pi));
plot(component_1,gd1,'.')
hold on
plot(component_2,gd2,'.')
plot(component_3,gd3,'.')
grid on
title('Bell curves of three components')
xlabel('Randomly produced numbers')
ylabel('Gauss distribution')

The result of the above code is shown in Figure 10.1.


Figure 10.1 The curves of the three Gaussian distributions with means and variances as in Table 10.1.

In order to find the mixture model that fits the pooled data, we can use the following MATLAB code:

gm1 = gmdistribution.fit(X,3);  % fit a 3-component GMM by EM
                                % (newer MATLAB releases: fitgmdist(X,3))
a = pdf(gm1,X);                 % mixture density at each data point
plot(X,a,'.')
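The fitted parameters can be inspected directly. A minimal sketch, assuming the gmdistribution properties mu, Sigma, and PComponents (the order of the components is arbitrary):

gm1.mu            % estimated means, roughly -1, 0, and 3
gm1.Sigma         % estimated variances
gm1.PComponents   % estimated mixing weights, roughly 0.3, 0.5, 0.2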

This will result in the mixed distribution shown in Figure 10.2. In order to find which points belong to which cluster, a single line of MATLAB code suffices.

idx = cluster(gm1,X);   % assign each point to its most probable component
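To verify the assignment counts, a quick check (a sketch, assuming the Statistics Toolbox function tabulate):

tabulate(idx)   % expect counts near 300, 500, and 200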

Since the 1000 random numbers from the three distributions were inserted sequentially into a 1000 × 1 array, idx will also be a 1000 × 1 array.


Figure 10.2 The mixed Gaussian distribution of the three Gaussian curves.

Each of its entries takes one of three possible values, 1, 2, or 3, giving the cluster number to which the corresponding observation belongs. Because the random numbers from the three distributions were inserted sequentially, we expect idx to contain a long run of 300 1s followed by 500 2s and then 200 3s. We will plot the data in one dimension, colored by idx, to visualize the allocation of points to clusters. The following code fulfills this requirement:

% Mark each point at y = 0 with a symbol determined by its cluster.
hold on;
for i = 1:1000
    if idx(i) == 1
        plot(X(i),0,'r*')    % cluster 1: red stars
    elseif idx(i) == 2
        plot(X(i),0,'b+')    % cluster 2: blue crosses
    else
        plot(X(i),0,'go')    % cluster 3: green circles
    end
end
title('Plot illustrating the cluster assignment');
ylim([-0.2 0.2]);
hold off
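The same picture can be produced without the loop (a sketch, assuming the Statistics Toolbox function gscatter):

gscatter(X, zeros(size(X)), idx, 'rbg', '*+o')   % group-colored scatter at y = 0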

The result of the above code is shown in Figure 10.3.


Figure 10.3 The three clusters corresponding to the three Gaussian distributions.
