Select a measure of similarity or dissimilarity
Choose a clustering algorithm
Clustering variables
1. There should be significant differences between the “dependent” variable(s) across the clusters
2. Avoid using an abundance of clustering variables, as this increases the odds that the variables are no longer dissimilar. If the variables are highly correlated, the specific aspects covered by these variables will be overrepresented in the clustering solution
3. Keep the sample size in mind (rule of thumb: the sample size should be at least 2^m, where m equals the number of clustering variables)
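The checks in points 2 and 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rule of thumb is read as 2^m (two to the power of m), the 0.9 correlation cut-off is an assumed value rather than one given on the slide, and the function names and the small data set are hypothetical.

```python
import math
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


def check_clustering_variables(data, threshold=0.9):
    """data maps a variable name to its list of observed values.

    threshold is an assumed cut-off for "highly correlated".
    """
    names = list(data)
    n_objects = len(data[names[0]])
    min_sample_size = 2 ** len(names)  # rule of thumb read as 2^m
    flagged = [
        (a, b)
        for a, b in combinations(names, 2)
        if abs(pearson(data[a], data[b])) > threshold
    ]
    return {
        "min_sample_size": min_sample_size,
        "sample_size_ok": n_objects >= min_sample_size,
        "highly_correlated": flagged,
    }


# Hypothetical data: the third variable is just the first one doubled.
example = {
    "price": [1, 2, 3, 4, 5, 6, 7, 8],
    "loyalty": [3, 7, 1, 8, 2, 6, 4, 5],
    "price_doubled": [2, 4, 6, 8, 10, 12, 14, 16],
}
report = check_clustering_variables(example)
```

With three variables the sketch asks for at least 8 objects, and it flags the redundant pair so that one of the two variables can be dropped before clustering.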
• Agglomerative clustering:
• At the beginning each object forms its own cluster
• The clusters are then sequentially merged according to their similarity
• Divisive clustering:
• At the beginning all objects are merged into a single cluster
• This cluster is then gradually split up
[Figure: agglomerative clustering runs from Step 1 to Step 5, divisive clustering runs through the same stages in reverse order. Agglomerative steps: Step 1: A | B | C | D | E; Step 2: A, B | C | D | E; Step 3: A, B | C, D | E; Step 4: A, B | C, D, E; Step 5: A, B, C, D, E]
Partitioning methods:
• k-means, k-medoids…
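The agglomerative procedure on this slide can be sketched as follows. The slide does not say how the similarity between two clusters is measured, so the sketch assumes single linkage (the distance between the two closest members) and Euclidean distance; the coordinates of A to E are hypothetical and were picked so that the merge order matches the five steps in the figure.

```python
import math
from itertools import combinations


def single_linkage(c1, c2, points):
    """Distance between two clusters = distance of their closest members.

    Single linkage is an assumption; the slide names no linkage rule.
    """
    return min(math.dist(points[a], points[b]) for a in c1 for b in c2)


def agglomerate(points):
    """Merge the two closest clusters until one cluster remains.

    Returns the cluster solution after every step, starting with the
    solution in which each object is its own cluster.
    """
    clusters = [frozenset([name]) for name in points]
    history = [sorted(sorted(c) for c in clusters)]
    while len(clusters) > 1:
        c1, c2 = min(
            combinations(clusters, 2),
            key=lambda pair: single_linkage(pair[0], pair[1], points),
        )
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
        history.append(sorted(sorted(c) for c in clusters))
    return history


# Hypothetical coordinates for the five objects.
objects = {
    "A": (1.0, 1.0),
    "B": (1.5, 1.0),
    "C": (5.0, 1.0),
    "D": (6.0, 1.0),
    "E": (8.0, 1.0),
}
steps = agglomerate(objects)
```

Reading `steps` from the last entry back to the first gives the sequence of solutions that a divisive procedure would pass through on this data, although a real divisive algorithm decides on splits rather than replaying merges.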
For Ordinal and Metric Variables Different Distance Measures
can be Used (Hierarchical Methods)
Distance measures*:
• Euclidean distance:
$d_{Euclidean}(B, C) = \sqrt{(x_B - x_C)^2 + (y_B - y_C)^2}$
• Chebychev distance:
$d_{Chebychev}(B, C) = \max(|x_B - x_C|, |y_B - y_C|)$
[Figure: objects B and C in a two-dimensional plot; axis label: Brand loyalty (y)]
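As a small worked example of the two distance measures, the sketch below computes both for a pair of objects. The coordinates of B and C are hypothetical and not taken from the slide; each tuple holds the (x, y) values of one object.

```python
import math


def euclidean(p, q):
    """Square root of the summed squared differences per variable."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def chebychev(p, q):
    """Largest absolute difference across the variables."""
    return max(abs(a - b) for a, b in zip(p, q))


# Hypothetical (x, y) values for objects B and C.
B = (6.0, 7.0)
C = (3.0, 3.0)

d_euclidean = euclidean(B, C)  # sqrt(3^2 + 4^2) = 5.0
d_chebychev = chebychev(B, C)  # max(3, 4) = 4.0
```

The Chebychev distance only reflects the variable on which the two objects differ most, whereas the Euclidean distance combines the differences on all variables.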
[Figure: k-means procedure in four panels (Step 1 to Step 4). Objects A to G are plotted against Brand loyalty (y); the cluster center labeled CC1 is shown in every panel, and its relocated position is labeled CC1' in Steps 3 and 4]
• Each cluster’s geometric center is computed (= mean values of the objects contained in the cluster regarding each of the clustering variables)
• The distances from each object to the newly located cluster centers are computed
• The objects are again assigned to a certain cluster
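The loop described in these bullets can be sketched in plain Python. This is a minimal illustration, not the exact procedure of any particular software package: the starting centers are passed in by the caller, the loop stops once the assignments no longer change, Euclidean distance is assumed, and the six data points are hypothetical.

```python
import math


def assign(points, centers):
    """Assign every object to the index of its nearest cluster center."""
    return [
        min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
        for p in points
    ]


def recompute(points, labels, centers):
    """Move each center to the mean of the objects assigned to it."""
    new_centers = []
    for k, old_center in enumerate(centers):
        members = [p for p, label in zip(points, labels) if label == k]
        if not members:
            new_centers.append(old_center)  # keep the center of an empty cluster
        else:
            new_centers.append(
                tuple(sum(values) / len(members) for values in zip(*members))
            )
    return new_centers


def k_means(points, centers, max_iter=100):
    """Alternate between relocating centers and reassigning objects."""
    labels = assign(points, centers)
    for _ in range(max_iter):
        centers = recompute(points, labels, centers)
        new_labels = assign(points, centers)
        if new_labels == labels:  # no object changed its cluster
            break
        labels = new_labels
    return labels, centers


# Hypothetical objects and deliberately poor starting centers.
data = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
labels, centers = k_means(data, centers=[(1, 1), (2, 1)])
```

On this data the third object starts out in the second cluster and moves to the first one after the centers have been relocated once, which is the kind of reassignment the bullets describe.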
Generally, k-means is More Flexible Than Hierarchical Methods
• Scree plot
• Dendrogram
• A priori knowledge
• Practical considerations
Are the results interpretable and meaningful?
Are the segments manageable?
Does the solution warrant strategic attention?
Example SPSS (I)
• Dataset thaltegos.sav
Example SPSS (II)
Select Pearson
Example SPSS (III)
Check the box Agglomeration schedule and continue.
Choose to display a dendrogram