We use the burglary data (FBI code 05) for year 2014. There are 14,306 events, each with time t_i and location (x_i, y_i).
Model:

λ(x, y, t) = μ(x, y) + Σ_{i : t_i < t} ν_r(x − x_i, y − y_i) ν_t(t − t_i),

μ(x, y) = [1 / (2π L² T)] Σ_i a_i exp(−[(x − x_i)² + (y − y_i)²] / (2L²)),

where T is the total duration of the dataset (here 365 days). The two kernels ν_t and ν_r, as well as the background weights a_i, are to be inverted. The smoothing length L is also to be optimized. We here follow the approach of Marsan and Lengliné (2008) and use a simple histogram distribution for the two kernels: ν_t(t) = b_k for T_k ≤ t < T_{k+1}, and ν_r(r) = c_k for R_k ≤ r < R_{k+1}. We use the following discretization in time and distance:

T = {0; 0.1; 0.2; 0.5; 1; 2; 3; 4; 5; 7; 10; 15; 20; 30; 50; 100} days, and
R = {0; 0.1; 0.2; 0.3; 0.4; 0.5; 0.7; 1; 1.5; 2; 3; 5; 10; 20} km.
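Evaluating this rate at a point can be sketched as follows (a minimal numpy sketch, not the authors' code; the function name, array layout, and parameter values are illustrative):

```python
import numpy as np

# Histogram bin edges from the text
T_EDGES = np.array([0, 0.1, 0.2, 0.5, 1, 2, 3, 4, 5, 7, 10, 15, 20, 30, 50, 100.0])  # days
R_EDGES = np.array([0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.7, 1, 1.5, 2, 3, 5, 10, 20.0])     # km

def intensity(x, y, t, ev_x, ev_y, ev_t, a, b, c, L, T=365.0):
    """lambda(x, y, t) = mu(x, y) + sum over past events of nu_r * nu_t."""
    d2 = (x - ev_x) ** 2 + (y - ev_y) ** 2
    # Background mu(x, y): Gaussian smoothing over ALL events, weighted by a_i
    mu = np.sum(a * np.exp(-d2 / (2 * L ** 2))) / (2 * np.pi * L ** 2 * T)
    # Triggering term: histogram kernels, summed over past events only
    past = ev_t < t
    dt = t - ev_t[past]
    r = np.sqrt(d2[past])
    kt = np.searchsorted(T_EDGES, dt, side="right") - 1   # time bin of each dt
    kr = np.searchsorted(R_EDGES, r, side="right") - 1    # distance bin of each r
    ok = (kt < len(b)) & (kr < len(c))                    # beyond last edge: no contribution
    return mu + np.sum(b[kt[ok]] * c[kr[ok]])
```

Events falling beyond the last time or distance edge contribute nothing to the triggering sum, consistent with the bounded histogram support.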
Expectation-Maximization algorithm:

Knowing L, the parameters {a_i, b_k, c_k} are inverted by Expectation-Maximization. The influence of event i on event j is ν_ij = ν_r(x_j − x_i, y_j − y_i) ν_t(t_j − t_i), and the sum of all the influences of past events on j is ν_j = Σ_{i<j} ν_ij. The background rate density for event j is μ_j = μ(x_j, y_j) = [1 / (2π L² T)] Σ_i a_i exp(−[(x_j − x_i)² + (y_j − y_i)²] / (2L²)) (the sum running over all events i, thus including events i > j, and even j itself). We define the probabilities ω_ij = ν_ij / (μ_j + ν_j) that event j was triggered by event i, and ω_{0,ij} = a_i exp(−[(x_j − x_i)² + (y_j − y_i)²] / (2L²)) / [2π L² T (μ_j + ν_j)] that event j is a background event linked to the background node i. These probabilities are normalized by Σ_{i<j} ω_ij + Σ_i ω_{0,ij} = 1.
The parameters are obtained by maximizing the log-likelihood f(a, b, c) = Σ_j ln λ(x_j, y_j, t_j) − ∫_0^T dt ∫∫ dx dy λ(x, y, t). Maximizing f gives

a_i = Σ_j ω_{0,ij},

b_k = [Σ_{i, j>i : T_k ≤ t_j − t_i < T_{k+1}} ω_ij] / [Σ_i Δ_{i,k}],

where Δ_{i,k} = T_{k+1} − T_k if T − t_i ≥ T_{k+1}, Δ_{i,k} = T − t_i − T_k if T_k ≤ T − t_i < T_{k+1}, and Δ_{i,k} = 0 otherwise, and

c_k = [Σ_{i, j>i : R_k ≤ r_ij < R_{k+1}} ω_ij] / [N S_k],

where r_ij is the distance between events i and j, N is the total number of events, and S_k = π(R²_{k+1} − R²_k).
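One EM iteration can be sketched as follows (a toy O(N²) numpy sketch, not the authors' code; it assumes the responsibilities ω_ij = ν_ij / (μ_j + ν_j) and histogram-bin M-step updates with ring areas S_k as spatial normalization, and the function name is illustrative):

```python
import numpy as np

def em_step(x, y, t, a, b, c, L, t_edges, r_edges, T=365.0):
    """One EM iteration for {a_i, b_k, c_k} at fixed smoothing length L."""
    n = len(t)
    dt = t[None, :] - t[:, None]          # dt[i, j] = t_j - t_i
    r = np.hypot(x[None, :] - x[:, None], y[None, :] - y[:, None])
    kt = np.searchsorted(t_edges, dt, side="right") - 1
    kr = np.searchsorted(r_edges, r, side="right") - 1
    past = dt > 0                          # i strictly before j
    valid = past & (kt >= 0) & (kt < len(b)) & (kr < len(c))
    nu = np.where(valid,
                  b[np.clip(kt, 0, len(b) - 1)] * c[np.clip(kr, 0, len(c) - 1)], 0.0)
    # Background kernel of every node i evaluated at every event j (all i, even i >= j)
    g = np.exp(-r ** 2 / (2 * L ** 2)) / (2 * np.pi * L ** 2 * T)
    bg = a[:, None] * g                    # contribution of node i to mu_j
    lam = bg.sum(axis=0) + nu.sum(axis=0)  # mu_j + nu_j for each j
    w = nu / lam[None, :]                  # omega_ij  (triggered by i)
    w0 = bg / lam[None, :]                 # omega_0,ij (background, node i)
    # M-step
    a_new = w0.sum(axis=1)
    # Delta_{i,k}: overlap of [T_k, T_{k+1}) with the window available after t_i
    delta = np.clip(T - t[:, None] - t_edges[None, :-1], 0, np.diff(t_edges)[None, :])
    b_new = np.array([w[valid & (kt == k)].sum() for k in range(len(b))]) / delta.sum(axis=0)
    s = np.pi * np.diff(r_edges ** 2)      # ring areas S_k
    c_new = np.array([w[valid & (kr == k)].sum() for k in range(len(c))]) / (n * s)
    return a_new, b_new, c_new
```

In practice the 14,306-event pairwise matrices would be restricted to pairs within the kernel supports (100 days, 20 km) to keep memory manageable.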
Convergence is tested by requiring that all non-zero values b_k and c_k change by less than 5 % in logarithm, e.g., |ln b_k / ln b_k' − 1| < 0.05, where b_k is the value updated during the current iteration and b_k' the previous value. When the model is then applied to the 2015 data, the parameters are kept unchanged (in particular the background rate-density μ(x, y) is thus estimated from the 2014 data only), but the triggering term Σ_i ν_r(x − x_i, y − y_i) ν_t(t − t_i) is now computed by summing over both 2014 and 2015 data. Remarkably, the log-likelihood is systematically found to be lower with this approach, see Table 1. This is counter-intuitive, as using more recent data to update the triggering term is expected to improve the prediction. A closer look at the time series (Figure 3) shows that there were significantly fewer events in the first 81 days of 2015 than predicted from the 2014 data. Since including the new 2015 events in the calculation of the triggering term results in a larger predicted number, doing so only strengthens the over-estimation.
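The 5 % log-change test can be sketched as (the function name is illustrative):

```python
import numpy as np

def converged(b_old, b_new, tol=0.05):
    """True when all non-zero kernel values satisfy |ln b_new / ln b_old - 1| < tol.
    Note: values exactly equal to 1 would need special-casing, since ln b = 0."""
    mask = (b_old > 0) & (b_new > 0)
    return np.all(np.abs(np.log(b_new[mask]) / np.log(b_old[mask]) - 1) < tol)
```

The same test would be applied to both the b_k and the c_k values before stopping the EM iterations.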
The over-estimation of the number of events in 2015 highlights the fact that, practically speaking, one would like to predict just where, rather than both when and where, the next event will occur, so that only the predicted marginal density λ(x, y, t) / ∫∫ dx dy λ(x, y, t) is of actual interest, instead of the complete space-time rate-density λ(x, y, t). We therefore introduce a second measure of the capacity of the model to predict the future locations of the subsequent events, as g(a, b, c) = Σ_i ln [λ(x_i, y_i, t_i) / ∫∫ dx dy λ(x, y, t_i)], where the summation is done on the 2015 events only, and the triggering term of λ(x, y, t) is computed by summing over all preceding events (including those of 2015). We show in Figure 4 that type 2 models perform better than type 1, but more importantly
that a simple (exponential) smoothing of all the previous events actually does better at predicting the location of the next event, although the improvement is only marginal. This is particularly surprising, since accounting for memory in the system should a priori improve the prediction compared to a memory-less prediction as done with a simple smoothing. This is here due to a change in the spatial properties of the burglary events in 2015 (compared to 2014), which are found to be more distant from each other: the mean distance between any two burglaries was 13.58 km in 2014, and 14.05 km in 2015. For both years, consecutive events tend to be less distant than average, but there still exists a significant difference between the two time periods, cf. Figure 5. Exploiting the temporal clustering as done with our models leads to predicted events too close to the immediately preceding (past) event, while the simple smoothing predicts a slightly larger distance, hence a better prediction.
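The location-only score g can be sketched by normalizing the intensity over a spatial grid at each event time (an illustrative discretization; `intensity_fn` stands for any λ(x, y, t) evaluator and is an assumption of this sketch):

```python
import numpy as np

def location_log_score(events, intensity_fn, xgrid, ygrid):
    """g = sum_i ln[ lambda(x_i, y_i, t_i) / integral of lambda over space at t_i ]."""
    dx = xgrid[1] - xgrid[0]
    dy = ygrid[1] - ygrid[0]
    score = 0.0
    for x, y, t in events:
        # Riemann-sum approximation of the spatial integral at time t
        lam_grid = np.array([[intensity_fn(xx, yy, t) for yy in ygrid] for xx in xgrid])
        norm = lam_grid.sum() * dx * dy
        score += np.log(intensity_fn(x, y, t) / norm)
    return score
```

Because only the spatial shape of λ matters here, any overall over- or under-prediction of the event count cancels out of this score, unlike in the full space-time log-likelihood.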
These results cast strong doubt on the capacity of the models proposed here to outperform simple hotspot maps obtained by smoothing, for the dataset analyzed. The triggering contribution to the occurrence of future events is small (it accounts for only 1.7 % for the best model). Accounting for memory in the system can therefore provide only a very modest contribution to the effectiveness of the prediction scheme.
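The exponential-smoothing baseline referred to above can be sketched as follows (our guess at its form, since the text does not specify it: exponential down-weighting in time and Gaussian smoothing in space; the function name and the parameters tau and L are illustrative):

```python
import numpy as np

def smoothed_density(x, y, t, ev_x, ev_y, ev_t, tau=30.0, L=1.0):
    """Memory-less hotspot map: exponentially down-weight older events,
    smooth them spatially with a Gaussian, and normalize to a spatial pdf."""
    past = ev_t < t
    w = np.exp(-(t - ev_t[past]) / tau)     # exponential time weights
    g = np.exp(-((x - ev_x[past]) ** 2 + (y - ev_y[past]) ** 2) / (2 * L ** 2))
    return np.sum(w * g) / (2 * np.pi * L ** 2 * np.sum(w))
```

Such a map carries no triggering structure at all: every past event contributes the same spatial footprint, only faded with age, which is exactly the memory-less behavior the models above were expected to beat.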
More importantly, it is assumed that the dynamics of the process stays the same over time. Possible non-stationarity of the process is thus clearly an issue, as it will prevent the use of past information to predict the future. This is for example experienced in this analysis, as 2015 burglary events are clearly not distributed (in time and in space) as they were in 2014. This non-stationarity is likely due to uncontrolled evolutions in the way these acts are performed but, in situations where new prediction algorithms are set up and exploited by police patrols, it could also be a response by burglars to such a change. Unlike natural processes like earthquakes, analyses like the one presented here could therefore have the ability to modify the observed process, making it more difficult to correctly predict future events.
Table 1:

L (km)    (%)       Difference in log-likelihood
0         100 %     < 10^-15
0.01      99.9 %    < 10^-7
0.02      98.7 %    -0.012
0.05      98.1 %    -0.017
0.1       93.5 %    -0.056
0.2       73.5 %    -0.17
0.4       45.8 %    -0.33
Figure 2: Interaction kernels ν_t (top graphs) and ν_r (bottom graphs) for model type 2 with L = 0.1 km. The two dashed lines show power-laws with exponents -1.5 (for ν_t) and -7 (for ν_r).

Figure 3: Number of events (in blue) and predicted number, using (magenta) or not using (green) the 2015 events in the triggering term summation.

Figure 5: Mean distance between pairs of events separated by (n-1) events, for the two time periods analyzed separately.