
Chapter 12: Cluster analysis and segmentation of customers

Commercial applications
A chain of radio stores uses cluster analysis to identify three different customer types with varying needs. An insurance company uses cluster analysis to classify customers into segments such as the self-confident customer, the price-conscious customer, etc. A producer of copying machines succeeds in classifying industrial customers into satisfied and dissatisfied (quarrelling) customers.

[Figure: a data matrix with observations 1…m as rows and variables X1…Xn as columns. Cluster analysis classifies the rows: the output is a cluster membership for each observation, either as dummy variables (CL1, CL2) or as a single cluster variable CL with values 1, 2, …. Factor analysis classifies the columns: the output is a loading matrix with factors F1, F2, …, Fj on the variables X1…Xn (e.g. F1 loads 0,8 on X1, 0,2 on X2, -0,7 on X3).]

Figure 11.1

Relatedness of multivariate methods: cluster analysis and factor analysis

Dependence and Independence methods


Dependence Methods: We assume that a variable (e.g. Y) depends on (is caused or determined by) other variables (X1, X2, etc.). Examples: Regression, ANOVA, Discriminant Analysis.

Independence Methods: We do not assume that any variable is caused or determined by the others. Basically, we only have X1, X2, …, Xn (but no Y).

Examples: Cluster Analysis, Factor Analysis etc.

Dependence and Independence methods


Dependence Methods: The model is defined a priori (prior to the survey and/or estimation). Examples: Regression, ANOVA, Discriminant Analysis.

Independence Methods: The model is defined a posteriori (after the survey and/or estimation has been carried out). Examples: Cluster Analysis, Factor Analysis, etc. When using independence methods we let the data speak for themselves!

Dependence method: Multiple regression


Obs    Y (Sales)    X1 (Price)   X2 (Price Competitor)   X3 (Advertising)
1      1.700.000    95           100                     300.000
2      1.400.000    90           80                      200.000
3      1.200.000    80           75                      200.000
4      1.500.000    85           90                      250.000
…      …            …            …                       …
10     …            …            …                       …

The primary focus is on the variables!

Independence method: Cluster analysis


Obs    X1   X2   X3   X4
1      5    2    1    3
2      3    3    4    2
3      2    4    3    5
4      5    3    2    4
…      …    …    …    …
10     …    …    …    …

[Figure: the observations are grouped into Cluster 1, Cluster 2 and Cluster 3.]

The primary focus is on the observations!

Cluster analysis output: A new cluster variable with a cluster number for each respondent


Obs:      1   2   3   4   5   6   7   8   9   10
X1:       5   3   2   5   .   .   .   .   .   .
X2:       2   3   4   3   .   .   .   .   .   .
X3:       1   4   3   2   .   .   .   .   .   .
X4:       3   2   5   4   .   .   .   .   .   .
Cluster:  1   2   3   1   2   3   3   1   3   2

Cluster analysis: A cross-tab between the cluster variable and background variables + opinions is established

                  Cluster 1   Cluster 2   Cluster 3
Age               32          44          56
%-Females         31          54          46
Household size    1.4         2.9         2.1
Opinion 1         3.2         4.0         2.6
Opinion 2         2.1         3.4         3.2
Opinion 3         2.2         3.3         3.0
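Such a profiling cross-tab is simply a per-cluster average of each background variable, grouped on the cluster variable. A minimal sketch in Python: the cluster assignment follows the example's cluster variable, while the respondent-level ages are hypothetical values chosen only so that the cluster means match the table.

```python
# Profiling sketch: mean of a background variable per cluster.
# The individual ages are invented; only the per-cluster means
# (32, 44, 56) correspond to the profiling table.
from collections import defaultdict

cluster = [1, 2, 3, 1, 2, 3, 3, 1, 3, 2]          # cluster number per respondent
age     = [30, 45, 58, 33, 41, 55, 57, 33, 54, 46]

sums, counts = defaultdict(float), defaultdict(int)
for c, a in zip(cluster, age):
    sums[c] += a
    counts[c] += 1

for c in sorted(sums):
    print(f"Cluster {c}: mean age {sums[c] / counts[c]:.1f}")
```

The same grouping would be repeated for every background and opinion variable in the cross-tab.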

Cluster 1: Younger male nerds. Cluster 2: Core families with traditional values. Cluster 3: Senior relaxers.

Cluster profiling (hypothetical):

[Figure: profiles of Cluster 1 (Ecological shopper) and Cluster 2 (Traditional shopper) on the items "Buy ecological food", "Advertisements funny" and "Low price important", rated on a 1-5 scale where 1 = Totally agree.]

Note: Finally, the clusters' respective media behaviour needs to be uncovered.

A small example of cluster analysis

          Friendly (X02)   Stagnant (X08)   Cluster
John      5                1                A
Bob       1                5                B
Cathy     4                2                A

Distances (sums of absolute differences per variable):
John-Bob = 4 + 4 = 8, John-Cathy = 1 + 1 = 2, Bob-Cathy = 3 + 3 = 6
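The distances in the John/Bob/Cathy example are city-block distances, i.e. sums of absolute coordinate differences (e.g. John-Bob = |5-1| + |1-5| = 8). A short sketch reproducing them:

```python
# City-block (Manhattan) distance between respondents, using the
# Friendly (X02) and Stagnant (X08) scores from the example.
scores = {
    "John":  (5, 1),   # (Friendly, Stagnant)
    "Bob":   (1, 5),
    "Cathy": (4, 2),
}

def city_block(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

for p, q in [("John", "Bob"), ("John", "Cathy"), ("Bob", "Cathy")]:
    print(f"{p}-{q}: {city_block(scores[p], scores[q])}")
```

The smallest distance (John-Cathy = 2) explains why those two end up in the same cluster (A), while Bob forms his own cluster (B).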

Governing principle
Maximization of homogeneity within clusters and, simultaneously, maximization of heterogeneity across clusters.

Figure 12.1

Overview of clustering methods

Non-overlapping (exclusive) methods:
- Hierarchical
  - Agglomerative
    - Linkage methods: Average - Between (1), Within (2), Weighted; Single - Ordinary (3), Density, Two-stage density; Complete (4)
    - Centroid methods: Centroid (5), Median (6)
    - Variance methods: Ward (7)
  - Divisive
- Non-hierarchical / Partitioning / k-means: Sequential threshold, Parallel threshold, Neural networks, Optimized partitioning (8)

Overlapping methods:
- Non-hierarchical: Overlapping k-centroids, Overlapping k-means, Latent class techniques, Fuzzy clustering
- Q-type factor analysis (9)

Names in SPSS: (1) Between-groups linkage, (2) Within-groups linkage, (3) Nearest neighbour, (4) Furthest neighbour, (5) Centroid clustering, (6) Median clustering, (7) Ward's method, (8) K-means cluster, (9) (Factor)

Note: Methods in italics are available in SPSS. Neural networks require the SPSS data-mining tool Clementine.

Figure 12.2

Illustration of important clustering issues in Figure 12.1

[Figure: scatter-plot panels illustrating non-overlapping vs. overlapping clusters, hierarchical vs. non-hierarchical methods, and agglomerative vs. divisive clustering (a cluster 1 splitting into 1a, 1b, 1c, and 1b further into 1b1, 1b2), together with the distance criteria:]
- Single linkage: minimum distance
- Complete linkage: maximum distance
- Average linkage: average distance
- Centroid method: distance between centres
- Ward's method: minimization of within-cluster variance

Euclidean distance (default in SPSS):

For two points A (x1, y1) and B (x2, y2):

    d = √((x2 − x1)² + (y2 − y1)²)

Other distances available in SPSS: City-block uses absolute differences instead of squared differences of coordinates. Moreover: Minkowski distance, Cosine distance, Chebychev distance, Pearson correlation.

Euclidean distance example: A = (1, 2), B = (3, 5):

    d = √((3 − 1)² + (5 − 2)²) = √13 = 3,61
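As a quick check, the computation on this slide in Python, using only the standard library:

```python
import math

def euclidean(a, b):
    """Straight-line distance between two points (the default metric in SPSS)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

A, B = (1, 2), (3, 5)      # the two points from the slide
d = euclidean(A, B)        # sqrt(2**2 + 3**2) = sqrt(13)
print(round(d, 2))         # 3.61
```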

Which two pairs of points are to be clustered first?

[Scatter plot of points A-H]

Maybe A/B and D/E (depending on the algorithm!)



Quo vadis, C?

[Scatter plot: point C lies between the group A, B, G and the group D, E, H. Which cluster should C join?]

How does one decide which cluster a new point is to join?

Measuring distances from a point to clusters or points:
- Farthest neighbour (complete linkage)
- Nearest neighbour (single linkage)
- Neighbourhood centre (average linkage)

Quo vadis, C? (Continued)

[Scatter plot with the distances from C to the individual points: 7,0; 8,5; 8,5; 9,0; 9,5; 10,5; 11,0; 12,0]

Complete linkage
Minimize the longest distance from cluster to point.

[Figure: the longest distance from C to one candidate cluster is 10,5, to the other 9,5, so C joins the latter.]

Average linkage
Minimize the average distance from cluster to point.

[Figure: the average distance from C to one candidate cluster is 8,5, to the other 9,0, so C joins the former.]

Single linkage
Minimize the shortest distance from cluster to point.

[Figure: the shortest distance from C to one candidate cluster is 7,0, to the other 8,5, so C joins the former.]
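The three point-to-cluster criteria can be sketched in a few lines. The distances from C to the individual points below are hypothetical stand-ins, not the exact assignment from the slides; they are chosen so that the criteria disagree, which is precisely the point of the example.

```python
# Which cluster should point C join under each linkage rule?
# The distances from C to each member are hypothetical illustrations.
dist_to_ABG = {"A": 10.5, "B": 7.0, "G": 8.5}
dist_to_DEH = {"D": 8.5, "E": 9.5, "H": 9.0}

def complete(d): return max(d.values())           # farthest neighbour
def single(d):   return min(d.values())           # nearest neighbour
def average(d):  return sum(d.values()) / len(d)  # neighbourhood centre

for name, rule in [("Complete", complete), ("Single", single), ("Average", average)]:
    # C joins whichever cluster gives the smaller linkage distance.
    choice = "A/B/G" if rule(dist_to_ABG) < rule(dist_to_DEH) else "D/E/H"
    print(f"{name} linkage: C joins {choice}")
```

With these numbers, complete linkage sends C to D/E/H while single and average linkage send it to A/B/G: the clustering result genuinely depends on the chosen algorithm.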

Single linkage: Pitfall

[Figure: cluster formation begins; at every step the closest observation is added to an existing cluster, producing chaining or snake-like clusters. A and C end up merged into the same cluster, omitting B!]

Single linkage: Advantage

[Figure: a dense "entropy group" of points with a few distant outliers.]

Good outlier detection and removal procedure in cases with noisy data sets.

Cluster analysis
More potential pitfalls & problems:

Do our data permit the use of means at all? Some methods (e.g. Ward's) are biased toward producing clusters with approximately the same number of observations. Other methods (e.g. Centroid) require metric-scaled input data, so, strictly speaking, it is not allowable to use these algorithms when clustering data containing interval scales (Likert or semantic-differential scales).

Cluster analysis: Small artificial example

[Figure: six points with selected pairwise distances marked: 0,68; 0,42; 0,92; 0,58]

Note: 6 points yield 15 possible pairwise distances: [n·(n−1)]/2
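The count in the note can be checked directly; n·(n−1)/2 is the binomial coefficient C(n, 2):

```python
from math import comb

# Number of pairwise distances among n points.
n = 6
print(n * (n - 1) // 2)  # 15
print(comb(n, 2))        # the same count, as a binomial coefficient
```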


Dendrogram

[Figure: dendrogram for OBS 1-6 with a distance axis from 0,2 to 1,0]

Step 0: Each observation is treated as a separate cluster.
Step 1: The two observations with the smallest pairwise distance are clustered (Cluster 1).
Step 2: The two other observations with the smallest distance among the remaining points/clusters are clustered (Cluster 2).
Step 3: Observation 3 joins Cluster 1.
Step 4: Cluster 1 and Cluster 2 from Step 3 are joined into a supercluster. A single observation remains unclustered (an outlier).
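The agglomerative procedure behind a dendrogram can be sketched in plain Python. This is a minimal single-linkage implementation on six hypothetical one-dimensional observations; the coordinates are invented, but the merge pattern mirrors the steps above: two tight pairs merge first, a third point joins one of them, the two clusters form a supercluster, and one outlier joins last.

```python
# Minimal agglomerative (single-linkage) clustering on hypothetical 1-D data.
points = {1: 0.10, 2: 0.15, 3: 0.35, 4: 0.60, 5: 0.62, 6: 1.50}

clusters = [{k} for k in points]  # Step 0: every observation is its own cluster

def single_link(c1, c2):
    """Shortest distance between any pair of members (single linkage)."""
    return min(abs(points[i] - points[j]) for i in c1 for j in c2)

merge_history = []
while len(clusters) > 1:
    # Find the two closest clusters and merge them.
    i, j = min(
        ((a, b) for a in range(len(clusters)) for b in range(a + 1, len(clusters))),
        key=lambda ab: single_link(clusters[ab[0]], clusters[ab[1]]),
    )
    d = single_link(clusters[i], clusters[j])
    merged = clusters[i] | clusters[j]
    merge_history.append((sorted(merged), round(d, 2)))
    clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]

for members, dist in merge_history:
    print(f"merge at distance {dist}: {members}")
```

In practice one would hand this job to SPSS (or an equivalent statistics package) and read the merge order off the dendrogram; the loop above only makes the mechanics explicit.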

Textbooks in Cluster Analysis


- Brian S. Everitt, Cluster Analysis, 1981
- Maurice Lorr, Cluster Analysis for Social Scientists, 1983
- Charles Romesburg, Cluster Analysis for Researchers, 1984
- Aldenderfer and Blashfield, Cluster Analysis, 1984

Case: Clustering of beer brands


Brand profiles based on the 17 semantic differential scales. Purpose: to determine the market structure in terms of similar/different brands. Hypothesis: this structure reflects the competitive structure among brands due to consumer behaviour.

