
Journal of Quantitative Criminology, Vol. 22, No. 1, March 2006 (© 2006)

DOI: 10.1007/s10940-005-9003-6
On The Application of Fuzzy Clustering for Crime Hot Spot Detection
Tony H. Grubesic1
One of the fundamental challenges in crime mapping and analysis is pattern
recognition. Efforts and methods to detect crime hot-spots, or geographic areas of
elevated criminal activity, are wide ranging. For aggregate data, such as total
crime events in a census tract(s), measures of spatial autocorrelation have proven
useful. For disaggregate data (i.e. individual crime events), kernel density
smoothing and non-hierarchical cluster analysis (e.g. k-means), are widely used.
Non-hierarchical techniques are particularly effective in delineating geographic
space into areas of higher or lower crime concentrations, because each observa-
tion is assigned to one and only one cluster. The resulting set of partitions pro-
vides clear-cut spatial boundaries that can be used for hot-spot evaluation and
interpretation. However, the strength of non-hierarchical methods can also be
viewed as a weakness. Although the hard-clustering of observations into a set of
discrete clusters is helpful, there are many cases where ambiguity exists in the
data. In such cases, a more generalized approach for hot-spot detection would be
helpful. The purpose of this paper is to explore the use of a generalized parti-
tioning method known as fuzzy clustering for hot-spot detection. Functional and
visual comparisons of fuzzy clustering and two hard-clustering approaches
(medoid and k-means), across a range of cluster values are analyzed. The
empirical results suggest that a fuzzy clustering approach is better equipped to
handle intermediate cases and spatial outliers.
KEY WORDS: hot-spot detection; crime; cluster analysis; spatial analysis; geo-
graphic information system (GIS).
1. INTRODUCTION
The ‘‘art’’ and science of cluster analysis continues to reward and
frustrate criminologists, geographers, sociologists, and others in their efforts
to apply this statistical technique in crime mapping and analysis. Broadly
1
Department of Geography, University of Cincinnati, Cincinnati, OH, 45221-0131, USA;
Phone: +1-513-556-3357; E-mail: tony.grubesic@uc.edu
0748-4518/06/0300-0077/0 2006 Springer Science+Business Media, Inc.
defined, cluster analysis is a technique that seeks to place individual
observations into groups that minimize within-cluster variation and
maximize between-cluster variation (Gordon, 1996, 1999; Kaufman and
Rousseeuw, 1990). While this exploratory technique can provide significant
insight into patterns of crime events (Harries, 1999; Jefferis and Mamalian,
1998; Levine, 1999; Murray and Grubesic, 2002), the technique of cluster
analysis suffers from a wide variety of misuses and misapplications, par-
ticularly when used to identify hot-spots. By definition, a hot-spot is a
geographic area that exhibits a higher concentration of crime than its sur-
rounding areas (Grubesic and Murray, 2001; Levine, 1999). The analysis of
crime hot-spots is fundamental to the explanation of criminal activities and
their spatial trends. For example, certain environmental factors, such as the
physical layout of an area, proximity to various services and land use
mixes, are likely influences on criminal behavior (Greenburg and Rohe,
1984). Issues of access, exposure, opportunity and the availability of targets
are also important elements in helping explain crime from an environmental
perspective (Brantingham and Brantingham, 1981; Cohen and Felson,
1979). Not surprisingly, certain areas are more prone to higher concentra-
tions of crime. These ‘‘hot-spots’’ are often targets of increased manpower
from law enforcement agencies in an effort to reduce crime. Where resources
are concerned, the identification of hot-spots is helpful because most police
departments are understaffed. As such, the ability to prioritize intervention
through a geographic lens is appealing (Levine, 1999).2
Unfortunately, the process of delineating hot-spot boundaries is
somewhat arbitrary. Because crime densities are measured over a continu-
ous area, the actual boundaries separating hot-spots from areas with lower
crime densities are largely perceptual constructs (Harries, 1999). This means
that defined hot-spots can vary significantly depending on the scale of
analysis. The process of data aggregation between spatial units (e.g. block
crime counts to block groups), and any subsequent statistical or spatial
analysis, must account for the modifiable areal unit problem (MAUP)
(Openshaw, 1984). MAUP can create situations where a localized hot-spot
consisting of several census blocks might not even register if the analysis is
more concerned with regional concentrations of crime (Harries, 1999).
The surging interest in hot-spot detection methods, particularly the use
of cluster analysis, has evolved because analysts need an objective method
for determining these areas of elevated crime. Not only does this provide a
more solid foundation from which to allocate human resources (e.g. patrol
officers), it provides the opportunity to begin the process of implementing
2
Additional discussions of crime hot-spots can be found in Ratcliffe and McCullagh (2001),
Craglia et al. (2000, 2001), and Ackerman and Murray (2004).
pro-active policing as opposed to reactive policing. In other words, the

ability to predict crime through prospective hot-spotting (Bowers et al.,
2004) is an emerging area of importance in crime mapping and analysis.
However, before pro-active policing and prospective hot-spotting can be-
come a reality, the ability to accurately delineate crime hot-spots using
existing empirical data and cluster analysis must be established.
Obviously, cluster analysis is not the only method of determining crime
hot-spots. For example, both Jefferis and Mamalian (1998) and Craglia
et al. (2000) outline a wide variety of techniques for determining areas of
elevated risk, including visual/cartographic interpretation techniques,
choropleth mapping, kernel density estimation, spatial autocorrelation and
cluster analysis. Clearly, the statistically based approaches, when used in
conjunction with visual methods, can provide valuable insight for detecting
areas of concern (Craglia et al., 2000). It is also important to note that crime
mapping and analysis is not the only field where the identification of hot-
spots is considered important. Spatial epidemiology seeks to identify areas
of elevated disease risk (Gatrell and Rowlingson, 1994; Gatrell et al., 1996;
Lawson, 2001).
As mentioned previously, the application of cluster analysis for hot-
spot detection is relatively problematic for several reasons. First, there is a
general level of confusion regarding the appropriate family of clustering
models to use. For example, many analysts struggle when deciding between
hierarchical or non-hierarchical/partitioning approaches. Second, because
hot-spots are spatial in nature, there are concerns in the way that both
hierarchical and partitioning-based clustering methods treat geographic
space. Third, there is relatively little guidance for determining the appro-
priate number of clusters, k, for analysis. Moreover, because both ap-
proaches require users to specify the number of clusters a priori, analysts are
forced to examine a range of possible solutions to determine the best
configuration. Finally, the strict all-or-nothing assignment of cluster mem-
berships (hard clustering) is not always realistic for hot-spot delineation.
Real-world data are often ambiguous in nature and hard-clustering
approaches do not treat outliers or intermediate cases particularly well.
Although the aforementioned limitations can be problematic, there are
robust alternatives to hard-clustering methods. One approach, known
as fuzzy clustering, is a generalization of hard, partition-based methods
(Bezdek, 1981; Brimicombe, 2003; Kaufman and Rousseeuw, 1990). Instead
of using an all-or-nothing approach to cluster assignment, fuzzy clustering is
able to say that an object (e.g. crime event), belongs primarily to Cluster 1,
whereas a second object might belong equally to Clusters 1, 2 and 3. This
degree of belonging is quantified by using membership coefficients that range
from 0 to 1 (Kaufman and Rousseeuw, 1990). The resulting fuzzy clusters
provide a much more detailed snapshot of data structure, allowing analysts

to make better decisions on cluster memberships and delineated hot-spots.
Given the obvious potential associated with hot-spot delineation and a
generalized partitioning approach to crime events, the purpose of this paper
is two-fold. First, it provides a brief review of hierarchical and non-
hierarchical clustering methods, identifying their strengths and weaknesses
when used with spatial data and hot-spot detection. The study will also
address several myths associated with cluster analysis for crime hot-spot
detection, including the notion that non-hierarchical clustering approaches
are more difficult to use than both hierarchical clustering and density based
approaches (kernel interpolation) for identifying hot spots. Second, this
paper explores the relative utility of a generalized partitioning method
known as fuzzy clustering for crime hot-spot detection. This analysis in-
cludes both functional and visual comparisons of fuzzy clustering and two
hard-clustering approaches (median and k-means). Finally, this paper ex-
plores methods for visualizing differences in cluster probability surfaces with
data generated in the fuzzy clustering approach.
2. A COMPARISON OF CLUSTERING APPROACHES

2.1. Hierarchical Cluster Analysis
There are two basic hierarchical clustering methods, agglomerative and
divisive. Both of these methods begin with the calculation of an (n × n) matrix,
D, of dissimilarities between every pair of observations (Bailey and Gatrell,
1995). Euclidean and Manhattan distances are two of the most frequently
used metrics; however, measures of dissimilarity are actually quite varied.
For example, in the data analysis package, CrimeStat 2, single linkage
clustering (also called nearest neighbor clustering) is one of the choices for
cluster analysis.3 Specifically, CrimeStat 2 implements an agglomerative
hierarchical approach using nearest neighbor clustering. The approach
starts with all observations (e.g. crime events) as independent entities. It
then uses the nearest neighbor measure to construct D. The nearest neighbor
measure is a comparison of the distances between two points (or groups of
points) with the average distance between all points. If the distance meets
the a priori criterion (usually less than the calculated probabilities of a
threshold distance occurring by chance) observations are linked to form
a new cluster. This process is repeated until all points have been assigned to
a first-order cluster. First order clusters are then tested for second-order
clustering in the same manner. The routine ends when all of the sub-groups
converge into one cluster or the threshold distance criterion fails. The
3
Nearest neighbor measures are typically based on Euclidean distances.
divisive approach is similar in nature; however, it starts with all of the

observations in one group and iteratively splits observations into clusters.
The use of hierarchical techniques, particularly with spatial data, can be
problematic when trying to identify hot-spots. For example, Bailey and
Gatrell (1995) note that single linkage approaches can produce clusters that
are elongated. This is known as the ‘‘chaining effect’’. Simply put, an
observation can join a group based on its similarity with just one member of
the group, producing elongated clusters. This statistical bias is indicative of
the most fundamental problem with hierarchical cluster analysis: one can
never repair what was done in previous steps. Once an agglomerative
method has joined two observations, or the divisive method has split two
observations, the changes are permanent. Kaufman and Rousseeuw (1990)
note that this level of rigidity in hierarchical techniques is both the key to
their success (low computation times) and their major disadvantage
(inability to correct erroneous decisions). Furthermore, Bailey and Gatrell
(1995, p. 233) suggest ‘‘although hierarchical clustering optimizes a criterion
at each step, there is no guarantee that, if one ends up with k groups, this is
the partition of the observations which would optimize this same criterion
over all possible partitions of the observations into k groups.’’ Simply put,
hierarchical techniques often produce local rather than global optima. This
is clearly one of the fundamental differences between hierarchical- and
partitioning-based clustering methods, which attempt to find a global
optimum. A final issue worth noting for hierarchical approaches is the spatial
distribution of events in an analysis. Where crime is concerned, Levine
(1999) suggests that the type of crime can actually impact the type of
clustering approach required. For example, crime distributions with many
incidents, such as burglary, typically have lower threshold distances than
distributions with fewer incidents like rape. As a result, the production of
inconsistent hot-spots is a concern with hierarchical clustering.
2.2. Non-Hierarchical (Partitioning) Cluster Analysis

Non-hierarchical, partitioning- or optimization-based cluster analysis is
fundamentally different in its approach to classification. Partitioning-based
methods attempt to split observations into a pre-specified number of groups,
k, where the specified criterion is optimized globally over all possible splits.
This approach must satisfy the requirements of a partition (Kaufman and
Rousseeuw, 1990):
Each group must contain at least one object.
Each object must belong to exactly one group.
Fig. 1. A partition with n=16 and k=3.
Further, these conditions imply that there are at most as many
groups as there are objects, k ≤ n. The major disadvantage of this
approach is that a user must specify the number of groups a priori to the
analysis. As a result, this can lead to partitions that are somewhat artificial
in nature, but it is possible to rerun the analysis with different values of k
to determine which clustering appears to provide the most meaningful
interpretation. Figure 1 provides a graphical example of a partition based
clustering result.
There are a variety of partitioning-based cluster algorithms. One of the
most prominent and popular techniques is the k-means approach proposed
in Fisher (1958) and further refined by MacQueen (1967). The defining
characteristic of this approach is the use of centroids for grouping obser-
vations (Arabie and Hubert, 1996; Belbin, 1987; Hanson and Jaumard,
1997). The technique itself is based upon multivariate analysis of variance in
the evaluation of homogeneity among entities (Estivill-Castro and Murray,
2000). Specifically, the scatter matrix of similarity between entities may be
evaluated by its trace with homogeneity measured for each grouping
through the use of the sum of squares loss function (Aldenderfer and
Blashfield, 1984; Rousseeuw and Leroy, 1987). The following notation will
be used to specify the k-means approach:
i = index of observations;
$a_i$ = attribute weight of observation i (i.e. the number of crimes at a given
location);
k = index of clusters;
$d_{ik}$ = distance between observation i and cluster k;
$y_{ik}$ = 1 if observation i is assigned to cluster k, 0 otherwise.
K-means
$$\text{Minimize } Z = \sum_{i} \sum_{k} a_i \, d_{ik}^{2} \, y_{ik} \tag{1}$$
Subject to
$$\sum_{k} y_{ik} = 1 \quad \text{for all } i \tag{2}$$
$$y_{ik} \in \{0, 1\} \tag{3}$$

The objective (1) of the k-means model is to minimize the total weighted
squared difference in cluster group membership, which is equivalent to
minimizing the within group sum of squares (Kaufman and Rousseeuw,
1990). Constraints (2) ensure that each observation is assigned to a cluster
group. Constraints (3) impose integer restrictions on decision variables.
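As an illustration of objective (1), the following minimal Python sketch evaluates Z for a given hard assignment. The coordinates, weights, and helper names (`kmeans_objective`, `weighted_centroid`) are hypothetical and chosen only for this example; they are not drawn from the study data or from any particular software package.

```python
from math import dist

def weighted_centroid(points, weights):
    """Weighted mean location of a group of observations."""
    total_weight = sum(weights)
    dims = range(len(points[0]))
    return tuple(
        sum(a * p[d] for p, a in zip(points, weights)) / total_weight
        for d in dims
    )

def kmeans_objective(points, weights, centroids, assignment):
    """Objective (1): sum of a_i * d_ik^2 for each observation's assigned cluster."""
    return sum(
        a * dist(p, centroids[k]) ** 2
        for p, a, k in zip(points, weights, assignment)
    )

# Hypothetical observations (x, y), crime counts a_i, and a hard assignment y_ik.
points = [(0.0, 0.0), (2.0, 0.0), (10.0, 0.0), (12.0, 0.0)]
weights = [1, 1, 2, 2]
assignment = [0, 0, 1, 1]  # each observation belongs to exactly one cluster

centroids = [
    weighted_centroid(points[:2], weights[:2]),
    weighted_centroid(points[2:], weights[2:]),
]
z = kmeans_objective(points, weights, centroids, assignment)
print(centroids, z)
```

Because every observation here lies one unit from its centroid, each contributes its weight times one, and the weights sum to six.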
An important element of the k-means algorithm is the use of a squared
Euclidean distance measure. Murray and Grubesic (2002) note that the non-
linear form of objective (1) means that linear programming based
approaches are not possible for optimally solving k-means. As a result,
alternating heuristics must be implemented to solve k-means. This means
that the k-means approach is likely to find a local, rather than global, optimum
(Grubesic and Murray, 2001). More importantly, the use of a distance-
squared measure $d_{ik}^2$, versus a straight Euclidean distance measure $d_{ik}$, is
problematic for spatial applications. Specifically, Murray and Estivill-
Castro (1998) note that a distance-squared function gives too much
importance to spatial outliers. Further, it drastically skews differences
between actual distances (e.g. linearly increasing, real-world, physical
distance) and modeled distance (quadratically increasing).
The aforementioned issues with the k-means approach have clear
implications for crime-hot-spot detection. The presence of spatial outliers
can impact the degree of robustness associated with delineated hot-spots,
particularly if those hot-spots are identified with a k-means heuristic; i.e.
hot-spot locations generated with a k-means approach can be skewed spa-
tially (Grubesic and Murray, 2002).4 More importantly, from an operational
4
k-means is one of the clustering methods used in CrimeStat 2 for hot-spot detection.
perspective, such biases can mislead and distort the assignment of resources
(e.g. police patrols). Given the budget constraints that most police depart-
ments are facing, this is a significant concern. Further, efforts in crime
forecasting (Gorr and Harries, 2003) or prospective hot-spotting (Bowers
et al., 2004) must be built on a foundation of strong, unbiased empirical
results, stemming from recursive crime analysis. This clearly reiterates the
need for a careful evaluation and choice of statistical techniques, particu-
larly when using spatial data for analysis. Moreover, it underscores the
importance of accurate hot-spot generation and interpretation.
Given the need for spatial accuracy, and the problems associated with
the distance metric in the k-means heuristic, there is a need to implement a
clustering algorithm that is not geographically biased to support hot-spot
detection and delineation. A viable option is the median clustering
algorithm. This approach differs from both the k-means and central point
(Murray, 1999) clustering approaches in several important ways. First, instead
of using an artificial point, such as the centroid generated in the k-means
procedure, to identify spatial clusters, the median approach uses the actual
observations (e.g. crime events), to identify clusters (Hakimi, 1964; Murray
and Estivill-Castro, 1998; Vinod, 1969). The following notation will be
utilized in the specification of this alternative approach:
i = index of observations;
j = index of potential medians;
$a_i$ = attribute weight of observation i (i.e. the number of crimes at a given
location);
$d_{ij}$ = distance between observation i and potential median j;
p = number of cluster medians to be selected;
$x_j$ = 1 if cluster median j is selected, 0 otherwise;
$z_{ij}$ = 1 if observation i is assigned to cluster median j, 0 otherwise.
Median clustering problem (MCP)
$$\text{Minimize } Z = \sum_{i} \sum_{j} a_i \, d_{ij} \, z_{ij} \tag{4}$$
Subject to
$$\sum_{j} z_{ij} = 1 \quad \text{for all } i \tag{5}$$
$$z_{ij} \le x_j \quad \text{for all } i \text{ and } j \tag{6}$$
$$\sum_{j} x_j = p \tag{7}$$
$$z_{ij} \in \{0, 1\}, \quad x_j \in \{0, 1\} \tag{8}$$
From a functional standpoint, this approach is virtually identical to the
partitioning around medoids (PAM) algorithm outlined by Kaufman and
Rousseeuw (1990). The objective (4) of the MCP is to minimize the total
weighted assignment of observations to selected medians. This, in effect,
minimizes the average distance between observations and their assigned
median. Constraints (5) force all observations to be assigned to a median.
Constraints (6) require that a median be selected before it serves as a rep-
resentative location for grouping observations. Constraint (7) specifies that
p clusters are identified and constraint (8) imposes integer restrictions on the
decision variables.
It is important to note that the objective function of this model is linear,
which provides significant computation advantages over the k-means
approach (Murray and Estivill-Castro, 1998). Second, the use of a Euclid-
ean distance metric in the median model provides two superior features. Not
only can the distances be calculated prior to running the model, the
Euclidean metric also provides a real physical interpretation of distance,
representing travel, transportation or movement. This more realistic rep-
resentation of geographic space is critical for spatial cluster detection and
hot-spot delineation.
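To make the MCP concrete, the sketch below solves a very small instance by exhaustive enumeration: every set of p observations is tried as the medians, each observation is assigned to its nearest selected median, and the set with the lowest weighted distance is kept. This is an illustrative assumption about how one might check the formulation on toy data. It is not the solution method used in this paper, and enumeration is only practical for very small n. The data and the function name `solve_mcp` are hypothetical.

```python
from itertools import combinations
from math import dist

def solve_mcp(points, weights, p):
    """Brute-force MCP: return (cost, selected medians, assignment of each i)."""
    best = None
    for medians in combinations(range(len(points)), p):   # constraint (7)
        cost = 0.0
        assignment = []
        for i, pt in enumerate(points):
            # Constraints (5) and (6): one assignment, to a selected median only.
            j = min(medians, key=lambda m: dist(pt, points[m]))
            assignment.append(j)
            cost += weights[i] * dist(pt, points[j])       # objective (4)
        if best is None or cost < best[0]:
            best = (cost, medians, assignment)
    return best

# Hypothetical crime locations forming two groups, each with unit weight.
points = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0), (12, 0)]
weights = [1] * len(points)

cost, medians, assignment = solve_mcp(points, weights, p=2)
print(cost, medians, assignment)
```

Note that the selected medians are actual observations (indices into `points`), and that the pairwise distances could be computed once in advance, which reflects the two features of the median model described above.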
2.3. Problems in Application

Although both approaches outlined in this section have their associated
strengths and weaknesses, there remains a fundamental problem with the
entire family of non-hierarchical/partition-based approaches. Regardless of
the approach selected, observations are subject to an all-or-nothing
assignment routine. As noted previously, this is problematic for spatial
analysis and hot-spot detection, particularly if there are outliers or inter-
mediate observation locations. In both cases, the assignment of a single
outlier or intermediate observation to a cluster can spatially bias the
composition of a cluster. For example, Fig. 2 illustrates a dataset with two

outliers and an intermediate observation. Observation 5 is equidistant be-
tween clusters 1 and 2. Partition-based cluster analysis forces the all or
nothing assignment of this observation to one of the clusters, even though it
belongs equally to both. Where the outliers are concerned (obs. 10 and 11),
they will be assigned to the nearest cluster (1 and 2 respectively). Again, this
highlights the problem with non-hierarchical approaches; outliers can bias
the composition of a spatial cluster.
The aforementioned problems with non-hierarchical approaches for
hot-spot detection help to fuel several myths in the public and private
sectors. For example, partition-based approaches are generally viewed as
more difficult to both implement and interpret than other techniques (e.g.
hierarchical cluster analysis and kernel density interpolation) (Chainey and
Cameron, 2000; Jefferis and Mamalian, 1998; Levine, 2001). Granted,
while elements of non-hierarchical approaches can be difficult to interpret,
partition-based cluster algorithms are widely available in most commercial
statistical packages (e.g. SPSS, SAS, S-Plus, etc.) and the popular free-
ware, CrimeStat 2. The difficulty with interpretation stems from several
factors. Clustering algorithms do exactly what they are asked to
do—splitting n observations into k groups. Once this is accomplished,
analysts are forced to generate some type of visualization of these groups
and make interpretations of the data based on the clustering results.
Visualizing these data and results is typically done in a geographic infor-
mation system (GIS), which helps in the cartographic analysis of potential
hot-spots. However, the approach used for visualizing groups generated in
a partition-based clustering approach can significantly impact the utility of
the analysis. For example, CrimeStat 2 uses standard deviational ellipses
(SDE) for visualizing clusters generated by the k-means approach (Levine,
2001). SDEs are an abstraction of the actual cluster group and are not
necessarily indicative of the spatial arrangement of a cluster or hot-spot
(Levine, 2001). Because hot-spots are a spatial phenomenon, the accurate
representation of these areas is of great importance. Therefore, the use of
SDEs for hot-spot visualization can mislead analysts. Fortunately, more
accurate approaches for visualizing these clusters are available. For
example, a simple pin-map with each observation color-coded to its cor-
responding cluster membership can be an effective visualization technique.
Although this is quite simple, it provides value to analysts because cluster
groupings are easily visible and this type of approach is both easy and
efficient to implement. A second approach might include the generation of
a minimum-bounding polygon for each group. This can provide a crisp
delineation between observations and their cluster groupings. More
importantly, the geometric and spatial characteristics of these polygons
can be used to extract supplementary measures of cluster compactness and

event density (Grubesic and Murray, 2001).5
Given the strengths and weaknesses associated with partition-based
clustering approaches, the potential influence of spatial outliers and inter-
mediate cases, and the difficulty in visualizing the resulting clusters, it is
important to explore alternative approaches in this family of techniques. The
next section explores a method for cluster detection and hot-spot delineation
known as fuzzy clustering. Not only does it have the ability to deal with
intermediate observations and outliers, fuzzy clustering also holds significant
promise for the analysis of crime data and subsequent hot-spot generation.
3. FUZZY CLUSTERING
As mentioned previously, fuzzy clustering is a generalization of parti-
tioning. Instead of observations being assigned to one and only one cluster,
the fuzzy approach allows for some ambiguity in the data. That is, obser-
vations have a degree of belonging to one or more clusters (Hoppner et al.,
1999). In this way, cluster membership for each observation is spread out
over a range of groups. Fuzzy clustering is used in a wide variety of fields,
including finance (Boreiko, 2003), animal science (Chen et al., 2004), agri-
cultural risk assessment (Zhang, 2004) and meteorology (Liu and George,
2003). Kaufman and Rousseeuw (1990) note that the main advantage of
fuzzy clustering over strictly partition-based methods is that the fuzzy ap-
proach yields much more detailed information on the structure of the data.
As illustrated in Fig. 2, if a data set contains intermediate cases (i.e. inter-
mediate spatial locations) between relatively solid groups, the fuzzy ap-
proach is much better equipped to deal with this structure. The following
notation will be used to specify the fuzzy approach:
i, j = index of observations;
k = index of clusters;
$d_{ij}$ = distance between objects i and j;
p = number of cluster groups;
$u_{ik}$ = fractional membership of observation i in cluster k.
Fuzzy Clustering Problem (FCP)
$$\text{Minimize } Z = \sum_{k} \frac{\sum_{i} \sum_{j} u_{ik}^{2} \, u_{jk}^{2} \, d_{ij}}{2 \sum_{j} u_{jk}^{2}} \tag{9}$$
Subject to
$$\sum_{k} u_{ik} = 1 \quad \text{for all } i \tag{10}$$
$$u_{ik} \ge 0 \quad \text{for all } i \text{ and } k \tag{11}$$
5
These properties will be explored in the next section.
Fig. 2. Data with outliers and intermediate observations.

The objective of the FCP is to minimize the dissimilarity dij, between
observations i and j, whereas uik are the fractional values that provide the
smallest value of the objective function. They indicate, for each observation
i, how strongly it belongs to cluster k. It is important to note that the
dissimilarity index can be either Euclidean or Manhattan distance.
Constraint (10) requires that each observation have a constant total membership,
usually normalized to a value of 1, distributed over the different clusters.
Constraint (11) requires that memberships cannot be negative. To
phrase this somewhat differently, membership coefficients generated by the
FCP can be interpreted as the probability that an observation belongs to
each group. Finally, the outer sum is over all clusters k, so the objective
function is actually attempting to minimize total dispersion.6
6
There are a few additional quirks to this model. Each pair of observations i,j is encountered
twice because j,i also occurs. As a result, the sum must be divided by two (Kaufman and
Rousseeuw, 1990).
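The FCP objective can be evaluated directly for any candidate membership matrix. The sketch below is a minimal illustration under stated assumptions: the points and the two membership matrices are hypothetical, the function names are illustrative, and every cluster is assumed to have some positive membership so that the denominator is non-zero. It only evaluates the objective and does not perform the iterative optimization.

```python
from math import dist

def check_memberships(u, tol=1e-9):
    """Constraints (10) and (11): rows sum to 1 and no entry is negative."""
    return all(
        abs(sum(row) - 1.0) <= tol and all(x >= 0 for x in row) for row in u
    )

def fcp_objective(points, u):
    """Objective (9): sum over clusters of the membership-weighted dispersion."""
    n, clusters = len(points), len(u[0])
    total = 0.0
    for k in range(clusters):
        numerator = sum(
            u[i][k] ** 2 * u[j][k] ** 2 * dist(points[i], points[j])
            for i in range(n)
            for j in range(n)          # each pair (i, j) is visited twice
        )
        denominator = 2.0 * sum(u[j][k] ** 2 for j in range(n))
        total += numerator / denominator
    return total

# Hypothetical data: two well-separated pairs of observations.
points = [(0, 0), (1, 0), (10, 0), (11, 0)]
hard = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
fuzzy = [[0.5, 0.5]] * 4               # equal membership in both clusters

z_hard = fcp_objective(points, hard)
z_fuzzy = fcp_objective(points, fuzzy)
print(check_memberships(hard), check_memberships(fuzzy), z_hard, z_fuzzy)
```

For these well-separated points the hard memberships give the lower objective value, which is consistent with the idea that fractional memberships are only favored where the data are genuinely ambiguous.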
Not surprisingly, some fuzzy clusters are fuzzier than others. If observations
have equal memberships in all clusters, (1/k), the groupings are completely
fuzzy. If the observations have a value of 1 for only one cluster each, the
groupings are completely hard (partition based). Therefore, in order to
measure the degree of fuzziness in cluster groupings, one can implement
Dunn’s partition coefficient (1976), which is expressed by:
$$F_k = \frac{1}{n} \sum_{i} \sum_{k} u_{ik}^{2} \tag{12}$$
Expressed differently, for a completely fuzzy clustering, in which every
$u_{ik}$ equals 1/k, $F_k$ takes its minimal value of 1/k. A completely hard partition
yields the maximal value of $F_k = 1$. There are several important points worth
noting in this model. First, the FCP is a generalized version of the MCP
(Kaufman and Rousseeuw, 1990). Therefore, this approach is still
attempting to minimize the aggregate distance between observations and
potential medians when generating clusters.7 Second, the FCP is not for-
mulated explicitly as a location model, as was done for the k-means and
MCP in this paper. The basic difference is that a parameter for weighting
observations is not included in the objective function. The FCP is simply
concerned with minimizing the measure of dissimilarity, which in this case,
corresponds to the distance metric.
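Dunn's partition coefficient is straightforward to compute from a membership matrix. The following sketch uses hypothetical membership values to show the two limiting cases described above, along with a mixed case containing one intermediate observation. The function name is illustrative.

```python
def dunn_partition_coefficient(u):
    """Equation (12): sum of squared memberships divided by n observations."""
    n = len(u)
    return sum(x * x for row in u for x in row) / n

# Hypothetical membership matrices for n = 4 observations and k = 2 clusters.
hard = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
fuzzy = [[0.5, 0.5]] * 4
mixed = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0], [0.0, 1.0]]  # one intermediate case

f_hard = dunn_partition_coefficient(hard)    # maximal value, 1
f_fuzzy = dunn_partition_coefficient(fuzzy)  # minimal value, 1/k
f_mixed = dunn_partition_coefficient(mixed)  # between the two extremes
print(f_hard, f_fuzzy, f_mixed)
```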
The following section explores the differences in these three approaches
to cluster detection, focusing on the utility of a fuzzy approach for hot-spot
detection. This includes both a functional and graphical comparison of these
clustering methods. A method for determining the relative strength of
cluster composition will also be highlighted.
4. EMPIRICAL RESULTS
Comparative results for the three clustering approaches will be pre-
sented for a spatial application across a range of k cluster values using a
Pentium 4/2.53 GHz computer. Application data contain 613 crime events
from the Pleasant Ridge neighborhood of Cincinnati, Ohio, for the year
2003. These data were acquired from the Cincinnati Police Department and
geocoded using Centrus Desktop 4.0. ArcGIS version 8.2 was implemented
to manage, manipulate and analyze the crime observations. NCSS 2004 was
utilized for all statistical calculations, including the clustering approaches.8
7
The FCP can be solved using an iterative approach that stops when the objective function
converges (Kaufman and Rousseeuw, 1990).
8
NCSS (http://www.ncss.com) limits the sample size for both the MCP and FCP to 1000
observations.
Figure 3 illustrates the study area for this analysis. All geographic data
layers are projected to the Ohio State Plane coordinate system using North
American Datum 1983. The units are measured in feet. Boundary polygons
correspond to a specific police district, beat and zone for Cincinnati. In this
case, the Pleasant Ridge neighborhood is located approximately 8 miles
northeast of the central business district in District 2, Beat 3, Zone 1. As
mentioned previously, there were 613 crime events in this area during 2003,
ranging from petit theft to murder.
4.1. k-Means
Cluster model solutions for the k-means approach are displayed in
Fig. 4. The column representing ‘‘percent variation’’ in Fig. 4 provides the
within-cluster sum of squares for k, as a percentage of the within sum of
squares with no clustering. In other words, lower values suggest less spatial
variation between members in the cluster groups. This is reflected in the
gradual decline of the variance in Fig. 4; as the number of k (groups)
Fig. 3. Study area.
Fig. 4. k-Means sum of squares variance and percent change (horizontal axis: k = 2 to 14; vertical axes: percent, 0 to 60; series: Change and Variation).
increases, within cluster variation decreases. As mentioned previously, at-

tempts to evaluate the quality of the k-means output are difficult, particu-
larly where optimal cluster configurations are concerned. For example, how
many clusters should be generated for a given set of observations? A general
rule of thumb for the k-means procedure is to define k where the percent
variation for the within-cluster sum of squares stops decreasing dramatically
(Hintze, 2001). In this case, Fig. 4 suggests that k=8 might be a good
stopping point, because the sum of squares variance is no longer displaying
dramatic changes. The spatial configuration of k=8 is displayed in Fig. 5.
A convex hull is utilized to delineate each of the eight clusters.
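The "percent variation" diagnostic described above can be sketched in a few lines. This is a minimal illustration, not the NCSS implementation: the function names and the four coordinates are hypothetical, and the cluster labels are supplied by hand rather than produced by a k-means run.

```python
def centroid(points):
    """Mean x and mean y of a list of (x, y) tuples."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def sum_of_squares(points):
    """Sum of squared Euclidean distances from each point to the centroid."""
    cx, cy = centroid(points)
    return sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points)


def percent_variation(points, labels):
    """Within-cluster sum of squares as a percentage of the unclustered total."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    within = sum(sum_of_squares(members) for members in clusters.values())
    return 100.0 * within / sum_of_squares(points)


# Hypothetical coordinates (not the Pleasant Ridge data): two tight groups.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
no_clustering = percent_variation(points, [0, 0, 0, 0])  # one group: baseline
two_clusters = percent_variation(points, [0, 0, 1, 1])   # the natural split
print(no_clustering, two_clusters)
```

Evaluating this quantity over a range of k and looking for the point where it stops falling sharply corresponds to the rule of thumb attributed to Hintze (2001).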
The convex hull (i.e. minimum bounding polygon) has some interest-
ing geometric and computational properties that are worth mentioning
(Graham, 1972; Jarvis, 1973; Preparata and Shamos, 1985). First, convex
hulls are the most compact geometric shape (convex set) to enclose a discrete
set of points. This property assists in the delineation of clusters by providing
a crisp boundary between groups. Second, polygons generated in this pro-
cess also represent a ‘‘minimum’’ area for totally enclosing a set of discrete
points with straight lines. As a result, cluster groups delineated with smaller
convex hulls are more spatially compact than those with larger convex hulls.
Third, density measurements for convex hulls and their associated point sets
are both informative and easy to calculate. One simply divides the total
number of points enclosed by a convex hull by the area of that same hull. Higher
density ratios suggest elevated crime rates while lower density ratios suggest
the opposite.

Fig. 5. Spatial configuration of k-means solution (k=8). Hull densities in parentheses: Hull 1 (1222.22), Hull 2 (612.36), Hull 3 (433.33), Hull 4 (775.36), Hull 5 (1762.37), Hull 6 (371.42), Hull 7 (285.71), Hull 8 (580.35).
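The hull density calculation can be sketched as follows. This is an illustrative version only: it assumes planar coordinates in feet (as in the State Plane data used here), builds the hull with Andrew's monotone chain rather than the Graham or Jarvis algorithms cited above, and uses hypothetical points and function names.

```python
FEET_PER_MILE = 5280.0


def cross(o, a, b):
    """Z-component of the cross product of vectors OA and OB."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Hull vertices in counter-clockwise order (Andrew's monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def polygon_area(vertices):
    """Area of a simple polygon by the shoelace formula."""
    total = 0.0
    for i, (x1, y1) in enumerate(vertices):
        x2, y2 = vertices[(i + 1) % len(vertices)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def hull_density(points):
    """Events per square mile inside the convex hull of the events."""
    area_sq_miles = polygon_area(convex_hull(points)) / FEET_PER_MILE ** 2
    if area_sq_miles == 0.0:
        raise ValueError("hull has zero area; density is undefined")
    return len(points) / area_sq_miles


# Hypothetical events: four corners of a one-mile square plus one interior point.
events = [(0.0, 0.0), (5280.0, 0.0), (5280.0, 5280.0), (0.0, 5280.0), (2640.0, 2640.0)]
hull = convex_hull(events)
density = hull_density(events)
print(len(hull), density)
```

Events that share a location still count toward the numerator, since the density is based on the number of events rather than the number of distinct hull vertices.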
In the case of Fig. 5, simple geometric calculations reveal that both
Hull 1 (i.e. Cluster 1) and Hull 5 (i.e. Cluster 5) are areas of elevated crime.
Hull densities are 1222 and 1762 crimes per square mile, respectively. It is
interesting to note that the spatial distribution of these crimes and the areas
defined by each of these hulls are quite different. For example, Hull 1 is
more linear in nature, following a major commercial thoroughfare
(Losantiville Road) while Hull 5 constitutes a much greater expanse,
incorporating several major roads and commercial centers. Cartographic
investigation suggests several other areas where concentrations of crime
exist. For example, both Hull 2 and Hull 4 have relatively dense cores of
crime events. However, the cluster groups and convex hulls encompass a
relatively large area, making hull densities (612 and 775 crimes per square
mile, respectively) much lower.
This, in summary, is what makes the all-or-nothing assignment of crime
events in non-hierarchical cluster analysis problematic for hot-spot detec-
tion and analysis. In the case of Hull 4, because each crime event must be
assigned to one and only one cluster, outliers and intermediate cases can
both skew and hide the true spatial composition of hot-spots.
4.2. MCP
Figure 6 displays the results of the median (medoid) based approach
outlined in Section 2, which utilizes a standard Euclidean distance measure.
Specifically, Fig. 6 displays the diagnostic statistics for this approach, which
include the weighted distance and silhouette values for a series of k values.
Silhouettes are both graphical and quantitative diagnostics for cluster
memberships introduced by Rousseeuw (1987). Each cluster in the median/
medoid based approach is represented by a single silhouette. Silhouettes are
created by measuring the average dissimilarity of each object in a cluster
group to all other objects in the cluster group. The silhouette values s(i)
representing the membership of an event to its assigned cluster are
calculated for each object i. The value for s(i) always lies between -1 and 1.
This value may be interpreted as follows:
$s(i) \approx 1 \Rightarrow$ object $i$ is well classified.
$s(i) \approx 0 \Rightarrow$ object $i$ is an intermediate case.
$s(i) \approx -1 \Rightarrow$ object $i$ is poorly classified.
The overall quality of cluster groups is measured by the average silhouette
value for each grouping, that is, larger values suggest better classification.
Fig. 6. Silhouette diagnostics for MCP (plotted for k = 4 to 15; vertical axis labelled "Average Dissimilarity", 0 to 0.5).
Table I. FCP Diagnostics

Number of    Average       Average       Normalized Dunn's
clusters     distance      silhouette    partition coefficient (Fk)
4            154.121739    0.342155      0.5009
5            134.988543    0.347203      0.4761
6            119.443555    0.370325      0.5158
7            108.503284    0.371979      0.5049
8            100.140005    0.312643      0.4730
9            93.201119     0.402572      0.5089
10           83.513122     0.361899      0.5095
11           77.722786     0.341587      0.5059
12           77.118596     0.370702      0.5102
13           68.453108     0.389626      0.5342
14           63.960769     0.404988      0.5512
15           66.30627      0.360326      0.5276
Both the individual values and average values can be plotted for visual
comparison (see Kaufman and Rousseeuw, 1990). The graphed values in
Fig. 6 suggest that k=8 (0.4599) is a good choice for revealing a relatively
natural grouping of the crime events. Results also suggest that k=14
(0.4656) might be another good choice. These grouping schemes represent
the two highest average silhouette values for the crime data (Fig. 6). For
illustrative purposes, k=8 will be utilized for a comparative analysis.
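The silhouette diagnostic of Rousseeuw (1987) can be sketched as below. Following the standard definition, s(i) compares a(i), the average distance from object i to the other members of its own cluster, with b(i), the smallest average distance from i to the members of any other cluster: s(i) = (b(i) - a(i)) / max(a(i), b(i)). The code is a minimal illustration with hypothetical points, it assumes at least two clusters, and it is not the NCSS routine used in the paper.

```python
import math


def mean_distance(point, others):
    """Average Euclidean distance from point to every point in others."""
    return sum(math.dist(point, o) for o in others) / len(others)


def silhouette_values(points, labels):
    """s(i) for every object; requires at least two distinct cluster labels."""
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    values = []
    for idx, label in enumerate(labels):
        own = [points[j] for j in clusters[label] if j != idx]
        if not own:
            values.append(0.0)  # common convention for a singleton cluster
            continue
        a = mean_distance(points[idx], own)
        b = min(
            mean_distance(points[idx], [points[j] for j in members])
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        values.append((b - a) / denom if denom > 0 else 0.0)
    return values


def average_silhouette(points, labels):
    values = silhouette_values(points, labels)
    return sum(values) / len(values)


# Hypothetical coordinates: two well-separated pairs of events.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = average_silhouette(points, [0, 0, 1, 1])  # natural grouping
bad = average_silhouette(points, [0, 1, 0, 1])   # deliberately poor grouping
print(good, bad)
```

Computing the average value for each candidate k and comparing them mirrors the way Fig. 6 is read in the text.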
Figure 7 displays the spatial characteristics of this grouping. Although
the spatial differences between the k-means and MCP approaches are evi-
dent, there are still problems with the MCP solution for the delineation and
interpretation of crime hot-spots. The most significant hull densities still
belong to Hull 1 (1222) and Hull 5 (1453). However, similar to the k-means
approach, the median clustering model requires that each observation be
included in a group, with an all-or-nothing assignment. So, even though the
median approach accounts for geographic space in a much more realistic
fashion, spatial outliers must still be assigned to a cluster, regardless of
whether or not they actually contribute any interpretive value to a solution.
Perhaps the best examples are illustrated in Fig. 7, for Hull 7 and Hull 4. At
the far southern end of the Hull 7 grouping exists a series of five crime events
near an interstate highway exchange. These events drastically skew the
spatial composition of the convex hull. For example, if the five events in
question were completely removed from the analysis, the modified hull
would consist of 0.194 square miles (0.276 original) and a new hull density
of 376 versus 292 in the original. This represents a 22.34% change in density
value. Although these outliers create some interpretive difficulties in
identifying hot-spots, one would not want to arbitrarily remove crime events in
order to "cook" the data for a stronger result. This is why the all-or-nothing
assignment requirements of partition-based cluster analysis present a
problem. Every event must be evaluated and assigned to a cluster, regardless
of its geographic location or its potential contribution to hot-spot
delineation. Partition-based algorithms are not equipped to handle spatial
outliers in a meaningful way, particularly where hot-spot detection and
delineation are concerned. Fortunately, the ability to handle outliers is a
strength of the FCP.

Table II. FCP Output

                                                Generalized
Event  Offense                Address             FCP assignment  Cluster 1  Cluster 2  Cluster 3  Cluster 4  Cluster 5  Cluster 6  Cluster 7  Cluster 8
1      Grand Theft            2920 Highland DR    8               0.1247     0.1076     0.1460     0.0419     0.0450     0.0773     0.0593     0.3981*
2      Breaking and Entering  3467 Kimberly CT    2               0.0259     0.4047*    0.0408     0.0316     0.1623     0.1324     0.1556     0.0467
3      Breaking and Entering  6011 Montgomery Rd  6               0.0080     0.0353     0.0155     0.0123     0.0119     0.8441*    0.0664     0.0064
4      Vehicle Theft          3154 Troy Av        7               0.0222     0.0601     0.0291     0.2139     0.2058     0.1553     0.2973*    0.0163
5      Criminal Damaging      6334 Montgomery Rd  5               0.0018     0.0088     0.0025     0.0058     0.9438*    0.0108     0.0246     0.0019
*Denotes maximum probability value.
Fig. 7. Spatial configuration of MCP solution (k=8). Hull densities in parentheses: Hull 1 (1222.22), Hull 2 (639.05), Hull 3 (433.33), Hull 4 (834.86), Hull 5 (1453.70), Hull 6 (389.38), Hull 7 (282.60), Hull 8 (540.14). Annotations mark a spatial outlier, a group of spatial outliers, and a reduced hull (yields a 22.34% increase in crime density).

4.3. FCP
Table I displays the statistical diagnostics for the FCP solution. Similar
to the k-means and MCP approaches, the goal is to minimize within-group
variation and maximize between-group variation. Both the MCP and FCP
utilize average distance as the measure of within-group variation and the
average silhouette value as a measure of cluster quality. From an interpretive
standpoint, lower average distance and higher average silhouette values
suggest good clusters. A unique output of the FCP is highlighted in the final
column of Table I, a normalized version of the Dunn’s partition coefficient,
Fk. This normalized version ranges from 0 to 1, with 0 indicating a com-
pletely fuzzy cluster and 1 indicating a completely hard cluster. Therefore,
higher average values for Fk suggest ‘‘harder’’ groupings. This, combined
with average silhouette values provides guidance for selecting the appro-
priate value of k. In the case of the Pleasant Ridge crime data, Table I
suggests that either 9 clusters or 14 clusters would be good selections.
However, for comparative purposes and overall consistency, k=8 will be
used for discussion.9
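The normalized partition coefficient can be sketched as follows. The paper does not print the formula, so this assumes the usual definition from Kaufman and Rousseeuw (1990): Dunn's coefficient is the mean over observations of the sum of squared memberships, and the normalized version rescales it from the range [1/k, 1] to [0, 1]. Function names and the toy membership matrices are hypothetical.

```python
def dunn_partition_coefficient(memberships):
    """Mean over observations of the sum of squared cluster memberships."""
    n = len(memberships)
    return sum(u * u for row in memberships for u in row) / n


def normalized_dunn(memberships):
    """Rescale Dunn's coefficient so 0 is completely fuzzy and 1 is hard."""
    k = len(memberships[0])
    f = dunn_partition_coefficient(memberships)
    return (f - 1.0 / k) / (1.0 - 1.0 / k)


# Toy membership matrices (rows are observations, columns are clusters).
hard = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]    # every event fully in one cluster
fuzzy = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]   # every event split evenly
mixed = [[0.9, 0.1], [0.5, 0.5], [0.2, 0.8]]

fk_hard = normalized_dunn(hard)
fk_fuzzy = normalized_dunn(fuzzy)
fk_mixed = normalized_dunn(mixed)
print(fk_hard, fk_fuzzy, fk_mixed)
```

Values near the middle of the range, like those reported in Table I, indicate solutions that are neither crisp nor uniformly fuzzy.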
Figure 8 displays the FCP solution for k=8. The spatial configuration
of this solution is quite different than k=8 for the MCP and k-means ap-
proaches, although more subtle differences are not readily apparent. First
and foremost, Fig. 8 represents the generalized assignment of the FCP ap-
proach. Each crime event is assigned to a cluster group based on its prob-
ability of belonging to that cluster. These values are calculated for each of
the k groups in the FCP. Assignment to a cluster group is made based on the
maximum probability value for each crime event. Table II displays an
example of the FCP output for five crime events. In the case of Event 5,
criminal damaging, the FCP determined a 94% probability of membership
in Cluster 5. The remaining possibilities (i.e. other clusters) have negligible
probability values, so the assignment of Event 5 is made to Cluster 5. This is
the fundamental difference between the FCP and the MCP and k-means
algorithms. The FCP approach provides a more detailed snapshot of data
structure. As a result, analysts are able to evaluate each observation indi-
vidually, if needed, to make a more informed decision on cluster member-
ships and potential hot-spot memberships.10
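The generalized assignment step can be illustrated with the five rows of Table II. The membership values below are transcribed from that table; the function and variable names are hypothetical, and the sketch only reproduces the maximum-probability rule described above, not the fuzzy clustering itself.

```python
# Membership values for Clusters 1 to 8, transcribed from Table II.
MEMBERSHIPS = {
    1: [0.1247, 0.1076, 0.1460, 0.0419, 0.0450, 0.0773, 0.0593, 0.3981],
    2: [0.0259, 0.4047, 0.0408, 0.0316, 0.1623, 0.1324, 0.1556, 0.0467],
    3: [0.0080, 0.0353, 0.0155, 0.0123, 0.0119, 0.8441, 0.0664, 0.0064],
    4: [0.0222, 0.0601, 0.0291, 0.2139, 0.2058, 0.1553, 0.2973, 0.0163],
    5: [0.0018, 0.0088, 0.0025, 0.0058, 0.9438, 0.0108, 0.0246, 0.0019],
}


def generalized_assignment(row):
    """Return (cluster number, membership) for the largest membership value."""
    best = max(range(len(row)), key=lambda c: row[c])
    return best + 1, row[best]  # clusters are numbered from 1


assignments = {event: generalized_assignment(row)[0]
               for event, row in MEMBERSHIPS.items()}
strengths = {event: generalized_assignment(row)[1]
             for event, row in MEMBERSHIPS.items()}

for event in sorted(assignments):
    print(event, assignments[event], strengths[event])
```

Event 4 shows why the full row matters: its largest membership is below 0.30, with nearly as much weight on Clusters 4 and 5, so the hard label alone hides an intermediate case.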
One method for visualizing the output of the FCP approach is through
the spatial interpolation of probability values assigned to each crime event.
Although a wide variety of approaches are available (e.g. kriging, inverse
distance weighted, etc.) one of the more graphically appealing approaches is
a spline. Figure 9 displays a cluster membership probability surface for
Cluster 5. As one would expect, the surface displays higher values near the
center of the crime events, which corresponds to the median location for
Cluster 5. Probability values begin to taper off as distance from the median
increases. In this case, only values ranging between 32.1% and the maximum
value of 95.5% are displayed. It is important to note that this surface
is not a kernel density map for a hot-spot analysis; it simply represents a
probability surface for membership in Cluster 5 based on the FCP.
9
k=8 is clearly not the optimal solution for the fuzzy cluster analysis. However, this choice will
help readers make better and more informed visual comparisons between the FCP, k-means
and MCP results.
10
Similar to the MCP approach, silhouette values can be used to identify the strength/quality of
the derived clusters.
Fig. 8. Spatial configuration of FCP solution (k=8). Hull densities in parentheses: Hull 1 (337.66), Hull 2 (756.75), Hull 3 (433.33), Hull 4 (685.71), Hull 5 (1649.12), Hull 6 (438.91), Hull 7 (278.91), Hull 8 (2250.00).
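A membership surface of this kind can be approximated with any of the interpolators mentioned above. The sketch below uses inverse distance weighting rather than the spline used for Fig. 9, simply because it fits in a few lines; the sample locations, membership values and the power parameter are all hypothetical.

```python
import math


def idw_estimate(location, samples, power=2.0):
    """Inverse-distance-weighted estimate at location.

    samples is a list of ((x, y), value) pairs, e.g. crime events with
    their membership value for one cluster.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for point, value in samples:
        distance = math.dist(location, point)
        if distance == 0.0:
            return value  # exactly on a sample: return its value
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical events with membership values for a single cluster.
samples = [((0.0, 0.0), 0.9), ((10.0, 0.0), 0.1)]
at_sample = idw_estimate((0.0, 0.0), samples)
midpoint = idw_estimate((5.0, 0.0), samples)
near_core = idw_estimate((1.0, 0.0), samples)
print(at_sample, midpoint, near_core)
```

Evaluating the function over a regular grid of locations would give a raster comparable in spirit to Fig. 9, although the shape of the surface depends on the interpolator chosen.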
Fig. 9. Membership probability surface for cluster group 5.

Figure 10, however, utilizes the probability data for a more meaningful
interpretation of areas with elevated crime. Once again, the geometric
properties of the convex hull are leveraged for hot-spot delineation and
interpretation. Figure 10 displays two sets of convex hulls. The larger
geographic set represents the generalized assignment of each observation with
the FCP algorithm. Specifically, these hulls bound each crime event to its
corresponding spatial cluster using the maximized probability value for
each observation. The crime density statistics for each of these hulls are
displayed in the "original density" column in Fig. 10. The second set of
convex hulls, which are clearly smaller in geographic extent, highlights the
differences in probability values as they range from 0 to 1 for each cluster. In
this case, only crime events with a probability value of 50% or higher for
their respective clusters are enclosed by the convex hull. This serves two
functions. First, it effectively eliminates spatial outliers in the analysis. In the
case of Hull 7, the five observations previously mentioned are no longer
included in the modified (i.e. weighted) convex hull. The same can be said
for the spatial outlier that was highlighted in Hull 4 (Figure 7). This process
is one way to make an objective decision on the elimination of spatial
outliers for hot-spot detection. Instead of arbitrarily removing observations
to uncover areas of elevated crime density that were masked by the influence
of spatial outliers, one can actually make an objective decision based on the
calculated membership probabilities. Further inspection of Figure 10 re-
veals that many of the observations in the original hulls are no longer
included in the weighted versions. Clearly, this is a function of the cut-off
point of 50% for observation inclusion. If the probability cut-off point for
the weighted hulls was reduced to 30 or 40%, more observations would (and
could) be included. This level of flexibility is an attractive feature of the
FCP, providing the analyst an opportunity to make an informed decision on
which values make the best cut-off points. The second important function of
the weighted hulls is the ability to recalculate crime density and delineate
hot-spots. In this case, the new density values are in the "weighted density"
column of the chart included in Figure 10, with the percent change highlighted
in the final column. The most significant changes occurred in Hulls 1 and 4. Hull 1
displays a 154% increase in crime density. Is this indicative of a hot-spot?
Not necessarily. The weighted density is still far lower than comparable
densities in Hulls 5 and 8. However, it is interesting to note the radical
change in Hull 4. A 101% increase, with a weighted density of 1379, clearly
indicates the presence of a crime hot-spot at the neighborhood level for
Pleasant Ridge. Not only is this the third highest value of all cluster groups
in the application, the dense core of crime events for Hull 4 was masked by
the original hard-cluster assignment of the MCP and k-means approaches.
More importantly, the generalized assignment of the FCP, which utilized the
maximum probability values, also hid the core density associated with Hull 4.
It was not until the complete information from the probability scores
generated by the FCP was used that the area of elevated crime in Hull 4 was
apparent.

Hull Density Comparison

Hull  Original density  Weighted density  Percent change
1     337.66            859.64            154.59
2     756.75            1085.71           43.47
3     433.33            486.84            12.35
4     685.71            1379.31           101.15
5     1649.12           2807.69           70.25
6     438.91            730.76            66.49
7     278.91            482.14            72.87
8     2250.00           2500.00           11.11
Fig. 10. Comparison between original and weighted convex hulls for hot-spot analysis.
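The two functions of the weighted hull can be sketched with values already reported in the paper. The event rows come from Table II and the densities from Fig. 10; the function names are hypothetical, and the sketch stops short of rebuilding the hulls themselves, since the event coordinates are not published.

```python
# (event, assigned cluster, maximum membership), transcribed from Table II.
EVENTS = [(1, 8, 0.3981), (2, 2, 0.4047), (3, 6, 0.8441),
          (4, 7, 0.2973), (5, 5, 0.9438)]

# Hull -> (original density, weighted density), transcribed from Fig. 10.
HULL_DENSITIES = {
    1: (337.66, 859.64), 2: (756.75, 1085.71), 3: (433.33, 486.84),
    4: (685.71, 1379.31), 5: (1649.12, 2807.69), 6: (438.91, 730.76),
    7: (278.91, 482.14), 8: (2250.00, 2500.00),
}


def core_events(events, cutoff=0.5):
    """Events whose membership in their assigned cluster meets the cut-off."""
    return [event for event, _cluster, membership in events
            if membership >= cutoff]


def percent_change(original, weighted):
    """Change in density relative to the original hull, in percent."""
    return 100.0 * (weighted - original) / original


core_50 = core_events(EVENTS, cutoff=0.5)  # the 50% rule used for Fig. 10
core_30 = core_events(EVENTS, cutoff=0.3)  # a looser cut-off keeps more events
changes = {hull: round(percent_change(original, weighted), 2)
           for hull, (original, weighted) in HULL_DENSITIES.items()}
print(core_50, core_30, changes)
```

With the 50% cut-off only Events 3 and 5 would remain inside a weighted hull, while a 30% cut-off also retains Events 1 and 2, which illustrates the flexibility the analyst has in choosing the threshold.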
5. DISCUSSION AND CONCLUSION

The results presented in Section 4 highlight several important points
worth discussion. First, both solution quality and the spatial configuration
of clustering solutions can vary, with much of this variation due to the type
of approach used. Although many non-hierarchical clustering algorithms
are attempting to perform the same task, the way in which the algorithm
approaches the task can have a significant influence on the results. One of
the clearest examples of this is the use of a squared Euclidean distance
metric ($d_{ik}^2$) versus a standard Euclidean distance measure ($d_{ik}$) (Murray and
Grubesic, 2002). As a result, the k-means ($d_{ik}^2$), MCP ($d_{ik}$) and FCP ($d_{ik}$)
solutions presented in this paper had clear differences in spatial structure.
Corroborating evidence was also provided by the cluster analysis diagnos-
tics, suggesting that some of the solutions generated by a clustering algo-
rithm are better than others. Although this paper only highlights a select set
of relatively simple techniques for evaluating cluster solution quality, there
are a variety of more rigorous approaches available (Gordon, 1996; Kaufman
and Rousseeuw, 1990; Milligan and Cooper, 1985; Milligan and Mahajan,
1980). Second, although the NCSS statistical package limits sample size to
1000 for both the MCP and FCP approaches, basic aggregation routines
could allow for significant flexibility in a study area. For example, if 10,000
crime events are aggregated to 1000 census blocks for analysis, there is a
moderate loss in spatial resolution, but a relatively robust hot-spot analysis
can still be performed. However, as mentioned previously, observations that
constitute a cluster at the neighborhood level do not necessarily constitute a
cluster at the regional level. Widely known as the modifiable areal unit
problem, or MAUP, scale of study is an important consideration when using
statistical tests for exploring spatial data, particularly crime hot-spots
(Openshaw, 1984; Ratcliffe and McCullagh, 1999; Unwin, 1996). Local
knowledge of the study area is also important when making decisions
regarding data aggregation, the number of clusters to select and the overall
quality of the resulting cluster groupings (Harries, 1999; Levine, 2001).
Nothing replaces experience in the field or a strong familiarity with the data.
The actual comparative results between the three non-hierarchical
clustering approaches also proved to be useful. As mentioned throughout
the paper, partition-based clustering algorithms are not able to handle
spatial outliers or intermediate cases particularly well. The all-or-nothing
assignment of observations is problematic for data with some degree of
ambiguity. Specifically, the requirement to include all observations in a
cluster can actually hide areas of elevated density for hot-spot detection. In
this way, crime data and hot-spots can be prone to misinterpretation. In-
deed, hot-spots are perceptual constructs, because crime densities are mea-
sured over a continuous area. However, non-hierarchical cluster analysis is
actually a process of discretization that delineates continuous space into
specific zones (e.g. clusters). In part, efforts to categorize and delineate areas
of elevated crime are fueled by the need to better understand spatial patterns
and to allocate resources more effectively. However, by forcing observations
into an all-or-nothing assignment routine, much of the richness associated
with spatial data is lost. Therefore, a generalized partitioning algorithm,
such as the FCP, can provide the best of both worlds: a way to explore
spatial patterns in crime data with an added level of flexibility in cluster
assignment that hard-clustering approaches cannot provide. In turn, this
type of analysis can afford additional insight into the spatial characteristics
of crime hot-spots.
This paper provides an empirical investigation on the utility of fuzzy
cluster analysis for crime hot-spot detection. The results suggest that the
geometric properties of convex hulls are useful when combined with the
results from partition-based cluster analysis in the delineation of crime hot-
spots. Specifically, the implementation of a weighted convex hull is useful
for delineating hot-spots, particularly when combined with the probability
values from the FCP approach. Results also indicate that fuzzy clustering is
a better technique for handling intermediate points and spatial outliers when
compared to other non-hierarchical clustering algorithms used for spatial
applications. Finally, the empirical findings of this study demonstrate that in
many applications, fuzzy clustering offers a more realistic approach to the
delineation of crime hot-spots in urban settings.
REFERENCES
Ackerman, W. V., and Murray, A. T. (2004). Assessing spatial patterns of crime in Lima, Ohio.
Cities 21(5), 423–437.
Aldenderfer, M., and Blashfield, R. (1984). Cluster Analysis, Sage Publications, Beverly Hills.
Arabie, P., and Hubert, L. (1996). An Overview of Combinatorial Data Analysis: Clustering and
Classification, World Scientific Publishers, Singapore.
Bailey, T. C., and Gatrell, A. C. (1995). Interactive Spatial Data Analysis, Longman, Harlow.
Belbin, L. (1987). The use of non-hierarchical allocation methods for clustering large sets of
data. Aust. Comput. J. 19: 32–41.
Bezdek, J. C. (1981). Pattern Recognition with Fuzzy Objective Function Algorithms, Plenum
Press, New York.
Boreiko, D. (2003). EMU and accession countries: Fuzzy cluster analysis of membership. Int. J.
Finance Econ. 8(4), 309–325.
Bowers, K. J., Johnson, S. D., and Pease, K. (2004). Prospective hot-spotting: The future of
crime mapping? Br. J. Criminol. 44: 641–658.
Brantingham, P. J., and Brantingham, P. L. (1981). Environmental Criminology, Sage Publi-
cations, Beverly Hills.
Brimicombe, A. J. (2003). A variable resolution approach to cluster discovery in spatial data
mining. In Kumar, V. et al. (eds.), ICCSA 2003.
Chainey S., and Cameron J. (2000). Understanding Hot Spots. Presentation prepared for the
2000 Crime Mapping Research Center Conference. San Diego, CA.
Chen, G. H., Wu, X. S., Wang, D. Q., Qin, J., Wu, S. L., Zhou, Q. L., Xie, E., Cheng, R., Xu,
Q., Liu, B., Zhang, X. Y., and Olowofeso, O. (2004). Cluster analysis of 12 Chinese native
chicken population using microsatellite markers. Asian-australas. J. Anim. Sci. 17(8), 1047–
1052.
Cohen, L. E., and Felson, M. (1979). Social change and crime rate trends: A routine activity
approach. Am. Sociol. Rev. 44: 588–607.
Craglia, M., Haining, R., and Wiles, P. (2000). A comparative evaluation of approaches to
urban crime pattern analysis. Urban Stud. 37(4), 711–729.
Craglia, M., Haining, R., and Signoretta, P. (2001). Modelling high-intensity crime areas in
English cities. Urban Stud. 38(11), 1921–1941.
Estivill-Castro V., and Murray A. T. (2000). Hybrid optimization for clustering in data mining.
CLAIO 2000 on CD-ROM, IMSIO, Mexico.
Fisher, W. (1958). On grouping for maximum homogeneity. J. Am. Stat. Assoc. 53: 789–798.
Gatrell, A. C., and Rowlingson, B. S. (1994). Spatial point process modeling in a geographical
information system environment. In Fotheringham, S., and Rogerson, P. (eds.), Spatial
Analysis and GIS, Taylor and Francis, London.
Gatrell, A. C., Bailey, T. C., Diggle, P. J., and Rowlingson, B. S. (1996). Spatial point pattern
analysis and its application in geographical epidemiology. Trans. Inst. Br. Geogr. 21: 256–
274.
Gordon A. D. (1996). How Many Clusters? An investigation of five procedures for detecting
nested cluster structure. In Forer, P., Yeh A., and He, J. (eds.), Proceedings of 9th Inter-
national Symposium on Spatial Data Handling. Beijing International Geographical Union.
Gordon, A. D. (1999). Classification, Chapman and Hall, New York.
Gorr, W., and Harries, R. (2003). Introduction to crime forecasting. Int. J. Forecast. 19: 551–
555.
Graham, R. (1972). An efficient algorithm for determining the convex hull of a finite point set.
Info. Proc. Letters 1: 132–133.
Greenburg, S., and Rohe, W. (1984). Neighborhood design and crime. J. Am. Plann. Assoc. 50:
48–61.
Grubesic T. H., and Murray A. T. (2001). Detecting Hot Spots Using Cluster Analysis and GIS.
Fifth Annual International Crime Mapping Research Conference. Dallas, TX.
Grubesic, T. H., and Murray, A. T. (2002). Imperfect Spatial Information: Implications for
Crime Mapping and Analysis, Sixth Annual International Crime Mapping Research Con-
ference, Denver, CO.
Hakimi, S. L. (1964). Optimum locations of switching centers and the absolute centers and
medians of a graph. Oper. Res. 12: 450–459.
Hansen, P., and Jaumard, B. (1997). Cluster analysis and mathematical programming. Math.
Program. 79: 191–215.
Harries, K. (1999). Mapping Crime: Principle and Practice, National Institute of Justice (NCJ
178919), Washington, DC.
Hintze J. (2001). NCSS and PASS. Number Cruncher Statistical Systems. Kaysville, Utah.
Hoppner, F., Klawonn, F., Kruse, R., and Runkler, T. (1999). Fuzzy Cluster Analysis: Methods
for Classification, Data Analysis and Image Recognition, John Wiley, West Sussex.
Jarvis, R. A. (1973). On the identification of the convex hull of a finite set of points in the plane.
Info. Proc. Letters. 2: 18–21.
Jefferis E. S., and Mamalian C. A. (1998). Crime Mapping Research Center’s Hot Spot Project.
The Second Annual Crime Mapping Research Conference, December 1998. Arlington, VA.
Kaufman, L., and Rousseeuw, P. (1990). Finding Groups in Data: An Introduction to Cluster
Analysis, John Wiley, New York.
Lawson, A. B. (2001). Statistical Methods in Spatial Epidemiology, John Wiley and Sons,
Chichester.
Levine N. (1999). CrimeStat: A Spatial Statistics Program for the Analysis of Crime Incident
Locations, version 1.0. Ned Levine and Associates/National Institute of Justice, Washington
DC.
Levine N. (2001). CrimeStat: A Spatial Statistics Program for the Analysis of Crime Incident
Locations, version 2.0. Ned Levine and Associates/National Institute of Justice, Washington
DC.
Liu Z. J., and George R. (2003). Fuzzy cluster analysis of spatio-temporal data. Computer and
Information Sciences – ISCIS 2003 – Lecture Notes in Computer Science. 2869: 984–991.
MacQueen J. (1967). Some methods for classification and analysis of multivariate observations.
In Le Cam L. and Neyman, J. (eds.), Proceedings of the Fifth Berkeley Symposium on
Mathematical Statistics and Probability, Vol. I. University of California Press, Berkeley.
Milligan, G. W., and Mahajan, V. (1980). A note on procedures for testing the quality of a
clustering of a set of objects. Decision Sci. 11: 669–677.
Milligan, G. W., and Cooper, M. C. (1985). An examination of procedures for determining the
number of clusters in a dataset. Psychometrika 50(2), 159–179.
Murray, A. T. (1999). Spatial analysis using clustering methods: Evaluating central point and
median approaches. J. Geogr. Syst. 1: 367–383.
Murray, A. T. (2000). Spatial characteristics and comparisons of interaction and median
clustering models. Geogr. Anal. 32: 1–19.
Murray, A. T., and Estivill-Castro, V. (1998). Cluster Discovery Techniques for Exploratory
Spatial Data Analysis. Int. J. Geogr. Inf. Sci. 12: 431–443.
Murray, A. T., and Grubesic, T. H. (2002). Identifying non-hierarchical spatial clusters. Int. J.
Ind. Eng. Theory Appl. Practice 9(1), 86–95.
Openshaw, S. (1984). The modifiable areal unit problem. Concepts and Techniques in Modern
Geography, Vol. 38, Geo Books, Norwich.
Preparata, F. R., and Shamos, M. I. (1985). Computational Geometry: An Introduction,
Springer-Verlag, New York.
Ratcliffe, J. H., and McCullagh, M. J. (1999). Hotbeds of crime and the search for spatial
accuracy. J. Geogr. Syst. 1(4), 385–398.
Ratcliffe, J. H., and McCullagh, M. J. (2001). Chasing ghosts: Police perception of high-crime
areas. Br. J. Criminol. 41: 330–341.
Rousseeuw, P. (1987). Silhouettes: A graphical aid to the interpretation and validation of
cluster analysis. J. Comput. Appl. Math. 20: 53–65.
Rousseeuw, P., and Leroy, A. (1987). Robust Regression and Outlier Detection, John Wiley,
New York.
Unwin, D. J. (1996). GIS, Spatial Analysis and Spatial statistics. Progr. Hum. Geogr. 20(4),
540–551.
Vinod, H. (1969). Integer programming and the theory of grouping. J. Am. Stat. Assoc. 64: 506–
517.
Zhang, J. Q. (2004). Risk assessment of drought disaster in the maize-growing region of
Songliao Plain, China. Agric. Ecosyst. Environ. 102(2), 133–153.
