Automatic Estimation of Crowd Density Using Texture

“Automatic estimation of crowd density using texture”, International Workshop on Systems and Image Processing, IWSIP’97, May 28-30,
Poland.
A.N. Marana†, S.A. Velastin‡, L.F. Costa♣, R.A. Lotufo♠
†DEMAC, IGCE, UNESP, Rio Claro, SP, Brazil
e-mail: nilceu@demac.igce.unesp.br
‡EEE, King's College London, London, UK
e-mail: sergio.velastin@kcl.ac.uk
♣IFSC, USP, São Carlos, SP, Brazil
e-mail: luciano@ifqsc.sc.usp.br
♠DCA, FEE, UNICAMP, Campinas, SP, Brazil
e-mail: lotufo@dca.fee.unicamp.br
Abstract:

This paper considers the problem of automatic estimation of crowd densities, an important part of the problem of automatic crowd monitoring and control. A new technique based on texture description of the images of the area under surveillance is proposed. Two methods based on different approaches to texture analysis, one statistical and the other spectral, are applied to real images captured in an area of Liverpool Street Railway Station, London, UK. The results obtained show that both methods present similar general rates of correct estimation, and that the potential use of texture description for the problem of automatic estimation of crowd densities is encouraging.

1. Introduction

The management and control of crowds is a crucial problem for human safety, since many lives can be lost when an accident happens where there is congestion of people [1].

Two important aspects of the problem of correct management and control of crowds are the design of environments where crowd congestion is expected to arise and the real-time monitoring of crowds within existing, typically urban, structures.

The development of models of crowd behaviour provides a basis for informing architects and town-planners so that they can design safer buildings. Sime [2] reviews crowd psychology in terms of its relationship to engineering and crowd safety and stresses the need to validate computer simulations of crowd movement and escape behaviour against psychological as well as engineering criteria.

For the problem of real-time crowd monitoring there is an established practice of using extensive closed-circuit television systems. As routine crowd monitoring is tedious, human observers, responsible for the simultaneous monitoring of many different areas through an array of television monitors, are likely to lose concentration. The advantages and necessity of automatic surveillance for routine crowd monitoring are, therefore, clear.

This paper describes a new technique based on texture description for the problem of automatic estimation of crowd density.

2. Previous Technique for Automatic Estimation of Crowd Density

Davies et al. [3] have proposed a technique to estimate crowd densities based on two measures extracted from the input image of the area under surveillance. The first measure is the number of foreground picture elements, computed by subtracting the input image from a reference image containing no people. The second measure is the number of edge picture elements of the image, computed by an edge detection followed by a thinning operation.

Davies et al. verified that there exists a linear relationship between the number of people present in the area under surveillance and the two measures, which were combined into an “optimal” estimation of crowd density through a linear Kalman filter [3].

Despite the success of this technique for estimating crowd density in areas containing moderate numbers of people, it cannot be applied successfully in areas with high density crowds, because the linear relationship does not hold when there are many superimposed people in the image.

3. A New Technique for Automatic Estimation of Crowd Density

Images of crowds with different densities tend to present distinct texture patterns. Images of high density crowded areas are often made up of fine patterns (which correspond to high frequencies in the frequency domain), while images of low density crowded areas are mostly made up of coarse patterns (which correspond to low frequencies in the frequency domain), especially when their backgrounds are also made up of coarse patterns. This fact, which may be verified in the images presented in Figure 1, is considered in one of the techniques presented in this paper to estimate the level of congestion in areas under surveillance.

The results presented in Section 4 were obtained by applying a statistical and a spectral method for texture description of the crowded images. Texture descriptors provided by such methods are then used by a neural network, implemented according to Kohonen's model [4], in order to estimate the crowd densities. The estimation is given in terms of discrete ranges such as very low, low, moderate, high and very high densities. The number of people in each range and the number of ranges depend on the specific application and the particular characteristics of the area being monitored.
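
The classification step described above can be illustrated with a small self-organising map. The paper does not state the size of the network, its training schedule, or how density labels are attached to it, so everything in this sketch is an assumption made for illustration: a 1-D map, a linearly decaying learning rate and neighbourhood, majority-vote labelling of nodes, and the hypothetical names `train_som`, `label_nodes` and `classify`.

```python
import numpy as np

def train_som(features, n_nodes, epochs=100, lr0=0.5, seed=0):
    """Train a 1-D Kohonen map and return its weight vectors (assumed setup)."""
    rng = np.random.default_rng(seed)
    features = np.asarray(features, dtype=float)
    # Start each node at a randomly chosen training vector.
    weights = features[rng.integers(0, len(features), size=n_nodes)].copy()
    positions = np.arange(n_nodes)
    for epoch in range(epochs):
        decay = 1.0 - epoch / epochs
        lr = lr0 * decay                         # learning rate shrinks over time
        sigma = max(0.5 * n_nodes * decay, 0.5)  # so does the neighbourhood width
        for x in features[rng.permutation(len(features))]:
            winner = np.argmin(np.linalg.norm(weights - x, axis=1))
            # Nodes close to the winner on the map are pulled towards x as well.
            pull = np.exp(-((positions - winner) ** 2) / (2.0 * sigma ** 2))
            weights += lr * pull[:, None] * (x - weights)
    return weights

def label_nodes(weights, features, labels):
    """Give each node the most frequent label among the training vectors it wins."""
    votes = {}
    for x, label in zip(np.asarray(features, dtype=float), labels):
        winner = int(np.argmin(np.linalg.norm(weights - x, axis=1)))
        votes.setdefault(winner, []).append(label)
    return {node: max(set(v), key=v.count) for node, v in votes.items()}

def classify(weights, node_labels, x):
    """Return the label of the nearest node that received a label."""
    labelled = sorted(node_labels)
    dists = np.linalg.norm(weights[labelled] - np.asarray(x, dtype=float), axis=1)
    return node_labels[labelled[int(np.argmin(dists))]]
```

In use, `features` would hold one texture feature vector per training image and `labels` the manually assigned density range of each image; a new image is then assigned the range of the node closest to its feature vector.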
3.1 Statistical Texture Analysis
The grey level dependence matrix (GLDM), proposed by Haralick [5] to carry out texture analysis, is a statistical method which is based on the estimation of second-order joint conditional probability density functions, $f(i, j \mid d, \theta)$. Each $f(i, j \mid d, \theta)$ is the probability of the pair of grey levels $(i, j)$ occurring in a pair of pixels of the image, given that these pixels are separated by a distance $d$ along the direction $\theta$. The estimated values form a two-dimensional histogram which can be written in matrix form, the so-called grey level dependence matrix. For a given pair of parameters $(d, \theta)$, the histogram obtained for fine (high frequency) texture tends to be more uniformly dispersed than the histogram for coarse (low frequency) texture. Texture coarseness can be measured in terms of the relative spread of histogram occupancy cells about the main diagonal of the histogram.
For the problem of automatic estimation of crowd density, four spread indicators for texture measurements proposed by Haralick have been used in the new technique: contrast, homogeneity, energy and entropy. These four measures are obtained from each of four GLDMs, calculated with parameters $(d, \theta)$ = (1, 0°), (1, 45°), (1, 90°) and (1, 135°), giving a total of 16 measures, which make up the crowd density feature vector.
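
The construction of this 16-element vector can be sketched as follows. The paper names the four measures but does not give their formulas or the number of grey levels used, so the sketch assumes the common Haralick-style definitions (contrast as the $(i-j)^2$-weighted sum, homogeneity as the inverse difference moment, energy as the sum of squared probabilities), a symmetric matrix, and an image already quantised to integer grey levels below `levels`. The names `gldm`, `texture_measures` and `gldm_feature_vector` are hypothetical.

```python
import numpy as np

# (row step, column step) for d = 1 along 0, 45, 90 and 135 degrees.
OFFSETS = [(0, 1), (-1, 1), (-1, 0), (-1, -1)]

def gldm(image, offset, levels):
    """Symmetric, normalised grey level dependence matrix for one (d, theta)."""
    dr, dc = offset
    rows, cols = image.shape
    # Keep only pixels whose neighbour at (row + dr, col + dc) is inside the image.
    r0, r1 = max(0, -dr), min(rows, rows - dr)
    c0, c1 = max(0, -dc), min(cols, cols - dc)
    first = image[r0:r1, c0:c1]
    second = image[r0 + dr:r1 + dr, c0 + dc:c1 + dc]
    counts = np.zeros((levels, levels))
    np.add.at(counts, (first.ravel(), second.ravel()), 1.0)
    counts += counts.T  # count every pair in both orders
    return counts / counts.sum()

def texture_measures(p):
    """Contrast, homogeneity, energy and entropy (assumed definitions)."""
    i, j = np.indices(p.shape)
    diff2 = (i - j) ** 2
    nonzero = p[p > 0]
    contrast = np.sum(diff2 * p)
    homogeneity = np.sum(p / (1.0 + diff2))
    energy = np.sum(p ** 2)
    entropy = -np.sum(nonzero * np.log(nonzero))
    return [contrast, homogeneity, energy, entropy]

def gldm_feature_vector(image, levels=16):
    """Four measures for each of the four directions: 16 values in total."""
    image = np.asarray(image, dtype=int)
    features = []
    for offset in OFFSETS:
        features.extend(texture_measures(gldm(image, offset, levels)))
    return np.array(features)
```

With these definitions, a perfectly uniform image puts all probability on the main diagonal (zero contrast and entropy, unit energy), while a fine pattern spreads probability away from the diagonal, which raises contrast and lowers energy.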
Figure 1: 512x512 images from Liverpool Street Railway Station (UK) for various densities: (a) very low, (b) low, (c) moderate, (d) high, (e) very high.

3.2 Spectral Texture Analysis

The Fourier spectrum can be expressed in polar co-ordinates as $S(r, \theta)$, where $S$ is the spectrum function, and $r$ and $\theta$ are the variables in this co-ordinate system. For a fixed value of $\theta$, $S(r, \theta)$ may be considered a 1-D function $S_\theta(r)$, which gives the behaviour of the spectrum along the radial direction given by $\theta$. Similarly, for a fixed value of $r$, $S(r, \theta)$ may be considered a 1-D function $S_r(\theta)$, which gives the behaviour of the spectrum along the circle of radius $r$ centred on the origin.

Global texture descriptors can be obtained by integrating such 1-D functions [6]. Therefore, in the discrete case, these descriptors are:

$$S(r) = \sum_{\theta=0}^{\pi} S_\theta(r) \quad \text{and} \quad S(\theta) = \sum_{r=0}^{R} S_r(\theta),$$

where $R$ is the radius of a circle centred at the origin. For an $N \times N$ spectrum, $R$ is typically chosen as $N/2$.

For the problem of automatic estimation of crowd density by using frequency information, spectrum values have been used in the new technique to obtain texture descriptors from the Fourier spectrum (actually, only 1/3 of the spectrum around the origin is considered):

$$S(r_1, r_2) = \sum_{r=r_1}^{r_2} \sum_{\theta=0}^{\pi} S_\theta(r)$$

$$S(\theta_1, \theta_2) = \sum_{\theta=\theta_1}^{\theta_2} \sum_{r=1}^{N/3} S_r(\theta).$$

Eight ranges of frequencies, defined by pairs $(r_1, r_2)$, and 16 ranges of angles, defined by pairs $(\theta_1, \theta_2)$, are taken, giving a total of 24 measures, which make up the crowd density feature vector.

4. Results and Conclusion

Density   VL     L      M      H      VH
VL        94%    6%     -      -      -
L         13%    54%    33%    -      -
M         -      8%     85%    8%     -
H         -      -      -      94%    6%
VH        -      -      -      6%     94%
Table 1: Results for GLDM method (VL: Very Low, L: Low, M: Moderate, H: High, VH: Very High)

Density   VL     L      M      H      VH
VL        88%    12%    -      -      -
L         13%    72%    15%    -      -
M         -      19%    73%    8%     -
H         -      -      11%    83%    6%
VH        -      -      -      13%    88%
Table 2: Results for spectral method (VL: Very Low, L: Low, M: Moderate, H: High, VH: Very High)

Tables 1 and 2 show results obtained when 148 images captured from an area of Liverpool Street Railway Station (London, UK) were used to assess the accuracy of the crowd density estimation based on texture measures. The rows of the tables show the distribution of the estimation of crowd densities for each group of the crowd density test set. The values on the diagonal indicate the correct classification percentages for each group.

As the estimation of crowd density is supervised, a further set of 151 images was used to train the neural network, giving 299 images in total.

Crowd densities of the images of the training and test sets were manually estimated in advance, in order to establish a comparison standard. Using the manual estimation, the images were separated into groups of very low density (0-15 people), low density (16-30 people), moderate density (31-45 people), high density (46-60 people) and very high density (more than 60 people). Examples of images of such groups are presented in Figure 1.

The results of the estimation based on the statistical GLDM method, presented in Table 1, reached a mean of 82% correct estimation. The results were quite good for all groups except for the one made up of low density crowd images, which reached only 54% correct estimation.

Table 2 shows the results obtained by the spectral method, which reached a mean of 80% correct estimation. The per-group estimations obtained by the spectral method are generally worse than those obtained by the statistical method, but the overall rate is quite similar. Moreover, the worst estimation obtained by the spectral method (72% for the low density crowd group) is much better than the one obtained by the statistical method (54% for the low density crowd group).

A common characteristic of both methods is that the wrongly estimated densities are always assigned to neighbouring density groups. This fact contributes to the performance of the technique of crowd density estimation by using texture description, and is due to the quantisation of crowd densities into ranges. The ideal would be to obtain larger training sets of images in order to increase the resolution of the estimation, reducing the widths of the ranges of densities.

The results of the estimations obtained during the tests allow us to consider both methods successful. While the statistical method reached quite good correct estimation rates (around 94%) for most groups, the spectral method presented small discrepancies between the best and the worst estimation, reaching, on average, practically the same rates of correct estimation as the statistical method.
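
Returning to the spectral descriptors of Section 3.2, the 24-element feature vector can be sketched as below. Several details are not specified in the text and are assumptions here: the spectrum is taken as the FFT magnitude, the eight radial ranges and 16 angular ranges are equal-width bins, the zero-frequency term is excluded, and "1/3 of the spectrum" is read as radii up to $N/3$. The function name `spectral_features` and its parameters are hypothetical.

```python
import numpy as np

def spectral_features(image, n_radial=8, n_angular=16, max_radius=None):
    """Sum the Fourier magnitude over radial rings and angular wedges."""
    image = np.asarray(image, dtype=float)
    rows, cols = image.shape
    if max_radius is None:
        max_radius = min(rows, cols) / 3.0  # assumed reading of "1/3 of the spectrum"
    # Magnitude spectrum with the zero frequency moved to the centre.
    spectrum = np.abs(np.fft.fftshift(np.fft.fft2(image)))
    y, x = np.indices((rows, cols))
    y = y - rows // 2
    x = x - cols // 2
    radius = np.hypot(x, y)
    # The spectrum of a real image is symmetric about the origin,
    # so opposite directions are folded into the half-circle [0, pi).
    theta = np.mod(np.arctan2(y, x), np.pi)
    inside = (radius >= 1) & (radius <= max_radius)
    r_edges = np.linspace(1.0, max_radius, n_radial + 1)
    t_edges = np.linspace(0.0, np.pi, n_angular + 1)
    # S(r1, r2): total magnitude in each ring; S(theta1, theta2): in each wedge.
    radial, _ = np.histogram(radius[inside], bins=r_edges, weights=spectrum[inside])
    angular, _ = np.histogram(theta[inside], bins=t_edges, weights=spectrum[inside])
    return np.concatenate([radial, angular])
```

For a fine pattern the radial part of the vector peaks in an outer ring, while for a coarse pattern it peaks near the origin, which is the property the technique relies on to separate high and low density crowds.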
Acknowledgements
The authors are grateful to Railtrack PLC (London-UK) for granting access to their sites, and to Maria Alicia Vicencio Silva, who first suggested the use of texture features to measure crowd densities. Aparecido Nilceu Marana is also grateful to CNPq (Proc. 200823/95-7), and Luciano da Fontoura Costa thanks FAPESP (Procs. 94/3536-6 and 94/4691-5) and CNPq (Proc. 301422/92-13), for their financial help.
References
[1] J.F. Dickie, “Major crowd catastrophes”, Safety Science, vol. 18, pp. 309-320, 1995.
[2] J.D. Sime, “Crowd psychology and engineering”, Safety Science, vol. 21, pp. 1-14, 1995.
[3] A.C. Davies, J.H. Yin, S.A. Velastin, “Crowd monitoring using image processing”, Electronics and Communications Engineering Journal, February, pp. 37-47, 1995.
[4] T. Kohonen, “The Self-Organizing Map”, Proceedings of the IEEE, vol. 78, pp. 1464-1480, 1990.
[5] R.M. Haralick, “Statistical and Structural Approaches to Texture”, Proceedings of the IEEE, vol. 67, pp. 786-804, 1979.
[6] R.C. Gonzalez, R. Woods, Digital Image Processing, Addison-Wesley Publishing Company, 1993.