
MEDICAL IMAGE PROCESSING
Title: K-Means Clustering

P Raja Rajeswari Chandni


19PEM008
I-M.E (Medical Electronics)
K-MEANS CLUSTERING - IMAGE SEGMENTATION

Clustering algorithms are unsupervised; they resemble classification algorithms, but the basis is different. In clustering, you do not know in advance what you are looking for: you are trying to identify segments or clusters in your data. When you apply a clustering algorithm to a dataset, unexpected structures, clusters, and groupings can emerge that you would never have thought of otherwise.

The K-Means clustering algorithm is an unsupervised algorithm, and it is used to segment the region of interest from the background. It clusters, or partitions, the given data into K clusters or parts based on the K centroids.

The algorithm is used when you have unlabeled data (i.e. data without defined categories or groups). The goal is to find groups based on some kind of similarity in the data, with the number of groups represented by K.

In K-Means clustering, the number of clusters is initialized and the center of each cluster is chosen at random. The Euclidean distance between each data point and every cluster center is computed, and based on the minimum distance each data point is assigned to a cluster. A new center is then computed for each cluster and the Euclidean distances are recalculated. This procedure iterates until convergence is reached.
The objective of K-Means clustering is to minimize the sum of squared distances between all points and their cluster centers.
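In standard notation (the symbols below are a conventional rendering of this objective, not taken from the text), the quantity being minimized can be written as:

```latex
J \;=\; \sum_{j=1}^{K} \;\sum_{x_i \in C_j} \left\lVert x_i - \mu_j \right\rVert^2
```

where C_j is the set of points assigned to cluster j and mu_j is the center of that cluster.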

Steps In K-Means Algorithm:


1. Choose the number of clusters K.
2. Select K points at random as the centroids (not necessarily from your dataset).
3. Assign each data point to the closest centroid → that forms K clusters.
4. Compute and place the new centroid of each cluster.
5. Reassign each data point to the new closest centroid. If any reassignment takes place, go to step
4, otherwise, the model is ready.
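The steps above can be sketched in a few lines of Python with NumPy. This is an illustrative sketch, not the MATLAB routine used later in this article; for a deterministic demonstration it seeds the centroids with the first K data points, whereas step 2 above picks them at random.

```python
import numpy as np

def kmeans(points, k, max_iter=100):
    # Steps 1-2: choose K; here we seed with the first K points for
    # reproducibility (the algorithm above picks K random points).
    centroids = points[:k].astype(float).copy()
    for _ in range(max_iter):
        # Step 3: assign every point to the closest centroid (Euclidean distance).
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Step 4: recompute each centroid as the mean of its assigned points
        # (keep the old centroid if a cluster happens to be empty).
        new_centroids = np.array([points[labels == j].mean(axis=0)
                                  if np.any(labels == j) else centroids[j]
                                  for j in range(k)])
        # Step 5: if no centroid moved, the model is ready; otherwise repeat.
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids

# Two well-separated toy blobs (made-up data for illustration).
pts = np.array([[0., 0.], [0., 1.], [1., 0.],
                [10., 10.], [10., 11.], [11., 10.]])
labels, centroids = kmeans(pts, 2)
```

On this toy data the algorithm converges in a few iterations, putting the three points near the origin in one cluster and the three points near (10, 10) in the other.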

CHOOSING THE OPTIMAL VALUE OF K


For a certain class of clustering algorithms (in particular K-Means, K-medoids, and the expectation-maximization algorithm), there is a parameter, commonly referred to as K, that specifies the number of clusters to detect. Hierarchical clustering avoids the problem altogether, but that is beyond the scope of this article.

For K-Means, the correct choice of K is often ambiguous, with interpretations depending on the shape and scale of the distribution of points in a data set and the clustering resolution desired by the user. In addition, increasing K without penalty will always reduce the amount of error in the resulting clustering, to the extreme case of zero error if each data point is considered its own cluster (i.e., when K equals the number of data points, n).
Intuitively then, the optimal choice of K will strike a balance between maximum
compression of the data using a single cluster, and maximum accuracy by assigning each
data point to its own cluster.
If an appropriate value of K is not apparent from prior knowledge of the properties of the
data set, it must be chosen somehow. There are several categories of methods for making
this decision, and the Elbow method is one such method.

ELBOW METHOD
The basic idea behind partitioning methods, such as K-Means clustering, is to define clusters such that the total intra-cluster variation, or in other words the total within-cluster sum of squares (WCSS), is minimized. The total WCSS measures the compactness of the clustering, and we want it to be as small as possible.

The Elbow method looks at the total WCSS as a function of the number of clusters: one should choose a number of clusters such that adding another cluster does not substantially improve the total WCSS.

STEPS TO CHOOSE THE OPTIMAL NUMBER OF CLUSTERS K (ELBOW METHOD)
1. Compute K-Means clustering for different values of K by varying K from 1 to 10 clusters.
2. For each K, calculate the total within-cluster sum of squares (WCSS).
3. Plot the curve of WCSS vs the number of clusters K.
4. The location of a bend (knee) in the plot is generally considered as an indicator of the
appropriate number of clusters.
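These steps can be sketched in Python with NumPy. The helper below runs a simple K-Means (seeded with the first K points for determinism, a simplification of mine) and returns the total WCSS; the toy data set is made up for illustration, and in practice one would plot the resulting curve and look for the knee.

```python
import numpy as np

def kmeans_wcss(points, k, max_iter=100):
    """Run a simple K-Means and return the total within-cluster sum of squares."""
    centroids = points[:k].astype(float).copy()   # deterministic init (simplification)
    for _ in range(max_iter):
        d = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        new = np.array([points[labels == j].mean(axis=0)
                        if np.any(labels == j) else centroids[j]
                        for j in range(k)])
        if np.allclose(new, centroids):
            break
        centroids = new
    # WCSS: sum of squared distances from each point to its own cluster center.
    return float(((points - centroids[labels]) ** 2).sum())

# Step 1-2: compute the total WCSS for a range of K on a small toy data set.
pts = np.array([[0., 0.], [0., 1.], [1., 0.],
                [10., 10.], [10., 11.],
                [20., 0.], [21., 1.]])
wcss = [kmeans_wcss(pts, k) for k in range(1, len(pts) + 1)]
# Steps 3-4: plotting wcss against K, the curve drops steeply until K matches
# the number of natural groups, then flattens; the bend marks the chosen K.
# WCSS reaches zero when K equals the number of data points.
```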

EXAMPLE OF K-MEANS CLUSTERING:

Data point position: X = 13, Y = 20
Cluster 1 position: X = 8, Y = 19
Cluster 2 position: X = 13, Y = 15

1. Find the Euclidean distance (D1) between the data point and cluster 1; similarly, find the Euclidean distance (D2) between the data point and cluster 2.
2. Distance D1 = sqrt((13-8)^2 + (20-19)^2) = 5.0990
3. Distance D2 = sqrt((13-13)^2 + (20-15)^2) = 5.0000
4. Find the minimum and assign the data point to a cluster: the minimum of the two distances is that of cluster 2.
5. So the data point with (X, Y) = (13, 20) is assigned to cluster/group 2.
6. Perform these steps for all the data points and assign groups accordingly.
7. Assign a new position to each cluster based on the clustering done: find the average position of the data points newly assigned to a particular cluster and use that average as the cluster's new position.
8. Iterate this procedure until the positions of the clusters are unchanged.

For the program below, the number of clusters = 3.
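The distance computation in this worked example can be checked with a short Python snippet (the variable names are mine, for illustration only):

```python
import math

# Data point and the two cluster positions from the example above.
point = (13, 20)
c1 = (8, 19)
c2 = (13, 15)

def euclid(p, q):
    """Euclidean distance between two 2-D points."""
    return math.sqrt((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2)

d1 = euclid(point, c1)   # sqrt(5^2 + 1^2) = sqrt(26) ≈ 5.0990
d2 = euclid(point, c2)   # sqrt(0^2 + 5^2) = 5.0000

# The point joins the cluster with the smaller distance: cluster 2.
assigned = 1 if d1 < d2 else 2
```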
PROGRAM:
he = imread(hestain.png);

imshow(he), title(H&E image);

text(size(he,2),size(he,1)+15,...Image courtesy of Alan Partin, Johns Hopkins University,


...FontSize,7,HorizontalAlignment,right);

lab_he = rgb2lab(he);

ab = lab_he(:,:,2:3);

ab = im2single(ab);

nColors = 3;

% repeat the clustering 3 times to avoid local minima

pixel_labels = imsegkmeans(ab,nColors,NumAttempts,3);

imshow(pixel_labels,[])

title(Image Labeled by Cluster Index);

5
mask1 = pixel_labels==1;

cluster1 = he .* uint8(mask1);

imshow(cluster1)

title('Objects in Cluster 1');

mask2 = pixel_labels==2;

cluster2 = he .* uint8(mask2);

imshow(cluster2)

title(Objects in Cluster 2);

mask3 = pixel_labels==3;

cluster3 = he .* uint8(mask3);

imshow(cluster3)

title(Objects in Cluster 3);

L = lab_he(:,:,1);

L_blue = L .* double(mask3);

L_blue = rescale(L_blue);

idx_light_blue = imbinarize(nonzeros(L_blue));

blue_idx = find(mask3);

mask_dark_blue = mask3;

mask_dark_blue(blue_idx(idx_light_blue)) = 0;

blue_nuclei = he .* uint8(mask_dark_blue);

imshow(blue_nuclei)

title(BlueNuclei);

RESULT
