
Published in IET Intelligent Transport Systems


Received on 13th November 2008
Revised on 20th August 2009
doi: 10.1049/iet-its.2008.0092

ISSN 1751-956X

Video sensor network for real-time traffic monitoring and surveillance

T. Semertzidis, K. Dimitropoulos, A. Koutsia, N. Grammalidis
Informatics and Telematics Institute, Centre for Research and Technology Hellas, 1st km Thermi-Panorama Road,
Thessaloniki 57001, Greece
E-mail: ngramm@iti.gr

Abstract: Sensor networks and their associated infrastructures are becoming ever more important to traffic monitoring and control because of increasing traffic demands in terms of congestion and safety. These systems allow authorities not only to monitor the traffic state at the detection sites, but also to obtain related information in real time (e.g. traffic loads). This study presents a real-time vision system for automatic traffic monitoring based on a network of autonomous tracking units (ATUs) that capture and process images from one or more pre-calibrated cameras. The proposed system is flexible, scalable and suitable for a broad field of applications, including traffic monitoring of tunnels at highways and aircraft parking areas at airports. Another objective of this work is to test and evaluate different image processing and data fusion techniques in order to select those to be incorporated into the final system. The output of the image processing unit is a set of information for each moving object in the scene, such as target ID, position, velocity and classification, which is transmitted to a remote traffic control centre with remarkably low bandwidth requirements. This information is analysed and used to provide real-time output (e.g. alerts, electronic road signs, ramp meters etc.) as well as to extract useful statistical information (traffic loads, lane changes, average velocity etc.).

1 Introduction

Increasing traffic demands render necessary the use of intelligent systems for monitoring, control and management of traffic. These systems support traffic data collection mechanisms along with intelligent incident detection and prediction models, as well as dynamic applications of data analysis techniques for target classification. Improved management of traffic is expected to alleviate present and future congestion and safety problems in a broad field of traffic applications (highways, airports etc.). One approach for addressing this need is by using new imaging technologies stemming from the significant advances in the field of computer vision. Video surveillance along with advanced computer vision methods is already in use and provides significant aid to human traffic control operators in traffic monitoring and management tasks. However, robust and accurate detection and tracking of moving objects still remains a difficult problem for the majority of computer vision applications. Especially in the case of outdoor video surveillance, such as traffic monitoring, the visual tracking problem is particularly challenging because of illumination and background changes, occlusion problems etc.

Many commercial and research systems use video processing, aiming to solve specific problems in traffic monitoring. An efficient application for monitoring and surveillance from multiple cameras is the reading people tracker (RPT) [1], which was later used as a base for the development of a system called AVITRACK, which monitors airplane servicing operations [2]. An example of a commercial system for monitoring and controlling road traffic is the Autoscope Solo Wide Area Video Vehicle Detection System [3], which was also used in the FP5 INTERVUSE project [4] for the development of an artificial vision network-based system for monitoring the ground traffic at airports. Other commercial products for road traffic monitoring are VisioPad [5], VideoTrak 910 [6], TrafiCam [7], the Vantage Iteris Monitoring Video Detection System [8] and ABT2000 [9]. In the Vitus-1 study [10], more video-based systems, especially for tunnel applications, are presented.


Moreover, several studies have been made on the evaluation of traffic video systems. Specifically, the University of Utah has summarised the status of detector technologies [11], whereas the NIT (non-intrusive technologies) project of the University of Minnesota evaluates technologies for traffic detection [12]. Recently, the Texas Transportation Institute has presented an investigation of vehicle detector performance, in which four video image vehicle detection systems are tested [13].

In this paper, we present a novel multi-camera video surveillance system, which supports functionalities such as detection, tracking and classification of objects moving within the supervised area. Whereas the majority of existing commercial and research traffic surveillance systems have been designed to cope with specific scenarios, the proposed system is applicable to a broad field of traffic surveillance applications (highways, airports, tunnels etc.). Its main advantages are extensibility, parameterisation and its capability to support various image processing and data fusion techniques so as to be easily adaptable to different traffic conditions.

More specifically, the system is based on a network of intelligent autonomous tracking units (ATUs), which capture and process images from a network of pre-calibrated visual sensors. ATUs provide results to a central sensor data fusion (SDF) server, which is responsible for tracking and visualising moving objects in the scene as well as collecting statistics and providing alerts when specific situations are detected. In addition, depending on the available network bandwidth, images captured from specific video sensors may also be coded and transmitted to the SDF server, to allow inspection by a human observer (e.g. traffic controller). The topology of the ATU network varies in each application depending on the existing infrastructure, geomorphologic facts and/or bandwidth as well as cost limitations. The network architecture is based on a wired or wireless transmission control protocol and internet protocol suite (TCP/IP) connection. These topologies can be combined to produce a hybrid network of ATUs. The proposed solution has been developed within the framework of the TRAVIS (TRAffic VISual monitoring) research project. In our previous publications [14-16], we presented evaluation results from the supported image processing and data fusion techniques. In this paper, we focus on the evaluation of the overall system's performance in two completely different traffic scenarios such as tunnels and airports.

2 ATUs

Each ATU is a powerful processing unit (PC or embedded PC), which periodically obtains frames from one or more video sensors. The video sensors are standard closed-circuit television (CCTV) cameras equipped with a casing appropriate for outdoor use and telephoto lenses for distant observation. They are also static (fixed field of view) and pre-calibrated. Each ATU consists of the following modules:

Calibration module (off-line unit to calibrate each video sensor): To obtain the exact position of the targets in the real world, the calibration of each camera is required, so that any point can be converted from image coordinates (measured in pixels from the top left corner of the image) to ground coordinates and vice versa. A calibration technique, which is based on a 3 × 3 homographic transformation and uses both point and line correspondences, was used [17]. The observed targets are small with respect to the distance from the video sensors and they are moving on a ground surface, which therefore can be approximated by a plane. For the calibration of a camera, the world coordinates of at least four points (or lines) within its field of view, as well as their corresponding image coordinates, are required. In the case of cameras with partially overlapping fields of view, relative calibration with respect to an already calibrated camera can be used. For this purpose, a tool for calibrating camera pairs with partially overlapping fields of view was developed. The tool visualises the camera views, one of which is the calibrated view according to which the other camera is calibrated. It then allows the user to specify corresponding points on the two views before it warps them on the ground plane as shown in Fig. 1. The procedure is repeated until all cameras with partially overlapping fields of view are calibrated.
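As an illustration, the following minimal sketch shows the point-based core of such a calibration using OpenCV; the technique of [17] additionally exploits line correspondences, and the coordinate values below are placeholders rather than actual survey data.

```python
# Hypothetical example: estimate the 3x3 image-to-ground homography for one
# camera from four surveyed point correspondences, then map blob centres from
# pixel coordinates to ground-plane coordinates (metres).
import numpy as np
import cv2

img_pts = np.array([[102, 540], [830, 512], [640, 130], [215, 155]], dtype=np.float32)
gnd_pts = np.array([[0.0, 0.0], [30.0, 0.0], [30.0, 60.0], [0.0, 60.0]], dtype=np.float32)

H, _ = cv2.findHomography(img_pts, gnd_pts)   # image -> ground
H_inv = np.linalg.inv(H)                      # ground -> image

def to_ground(xc, yc):
    """Convert an image point to ground coordinates via the homography."""
    p = H @ np.array([xc, yc, 1.0])
    return p[0] / p[2], p[1] / p[2]           # perspective division

print(to_ground(102, 540))                    # ~ (0.0, 0.0)
```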
Background extraction and update module: Each ATU of the system can automatically deal with background changes (e.g. grass or trees moving in the wind) or lighting changes (e.g. day, night etc.), supporting several robust background extraction algorithms, namely mixture of Gaussians modelling [18], the Bayes algorithm [19], the Lluis-Miralles-Bastidas method [20] and non-parametric modelling [21]. In the experimental results presented in our previous publications [14, 15], non-parametric modelling emerges as the one having the best trade-off between result quality and execution time.
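As a rough sketch of this per-frame loop, the fragment below uses OpenCV's Gaussian mixture subtractor as a stand-in for the four algorithms cited above; the stream URL and parameter values are placeholders.

```python
# Background maintenance loop of a single ATU (illustrative stand-in).
import cv2

cap = cv2.VideoCapture("rtsp://atu-camera-1/stream")   # hypothetical source
subtractor = cv2.createBackgroundSubtractorMOG2(history=500, detectShadows=True)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    # The model is updated on every frame, so gradual lighting changes and
    # waving vegetation are absorbed into the background.
    fg = subtractor.apply(frame)                       # 0 / 127 (shadow) / 255
    fg_mask = cv2.threshold(fg, 200, 255, cv2.THRESH_BINARY)[1]
```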
Foreground segmentation module: Connected component labelling is applied to identify individual foreground objects.

Blob tracking module (optional): The multiple hypothesis tracker [22] was used, although association and tracking of very fast moving objects may be problematic.

Blob classification module: A set of classes of moving objects (e.g. person, car etc.) is initially defined for each application. Then, each blob is classified by calculating its membership probability for each class, using a previously trained back-propagation neural network. Specifically, nine attributes, characteristic of its shape and size, are used as input to the neural network: the lengths of the major and minor axes of the blob's ellipse and the seven Hu moments [23] of the blob, which are invariant to both rotations and translations. The number of outputs of the neural network equals the predefined number of classes. The class is determined by the maximum output value.
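A sketch of the nine-attribute feature extraction feeding this classifier is given below; the trained network itself is represented by a hypothetical `mlp` object, and the exact feature scaling used in the real system is not specified in the paper.

```python
# Per-blob features: major/minor ellipse axis lengths plus the seven Hu moments.
import numpy as np
import cv2

def blob_features(fg_mask):
    """Return one 9-element feature vector per foreground blob in a binary mask."""
    contours, _ = cv2.findContours(fg_mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    feats = []
    for c in contours:
        if len(c) < 5:                                # fitEllipse needs >= 5 points
            continue
        (_, _), axes, _ = cv2.fitEllipse(c)
        major, minor = max(axes), min(axes)
        hu = cv2.HuMoments(cv2.moments(c)).flatten()  # 7 invariant moments
        feats.append(np.concatenate(([major, minor], hu)))
    return np.array(feats)

# classes = mlp.predict(blob_features(fg_mask))   # argmax over the class outputs
```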


Figure 1 Two camera views with partially overlapping fields of view inserted in the calibration tool and the final output after calibration
a Right view with corresponding point
b Left view with corresponding point
c Two views are warped on the ground plane

Three-dimensional observation extraction module: It uses the available camera calibration information to estimate the accurate position of targets in the scene. Since the camera calibration is based on homographies, an estimate for the position (x_w, y_w) of a target in world coordinates can be directly determined from the centre of each blob. Each observation is also associated with a reliability matrix R, depending on the camera geometry and its position on the camera plane. This matrix is calculated using the calibration information [24]

    R(x_w, y_w) = J(x_c, y_c) L J(x_c, y_c)^T    (1)

where L is the measurement covariance at location (x_c, y_c) on the camera plane, which is assumed to be a fixed diagonal matrix, and J is the Jacobian matrix of the partial derivatives of the mapping functions between the camera and the world coordinate systems. Supposing that g_x and g_y are the two mapping functions, with g_x(x_c, y_c) = x_w and g_y(x_c, y_c) = y_w, the Jacobian matrix is defined as

    J(x_c, y_c) = \begin{bmatrix} \partial g_x / \partial x_c & \partial g_x / \partial y_c \\ \partial g_y / \partial x_c & \partial g_y / \partial y_c \end{bmatrix}
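A numerical sketch of (1) is shown below: the Jacobian of the image-to-ground mapping (the `to_ground` function from the calibration sketch) is approximated by central differences around the blob centre, and the fixed diagonal covariance L is an assumed value.

```python
import numpy as np

def reliability(to_ground, xc, yc, L=np.diag([4.0, 4.0]), eps=0.5):
    """R(xw, yw) = J L J^T with J estimated by finite differences (pixels -> metres)."""
    gx = lambda u, v: to_ground(u, v)[0]
    gy = lambda u, v: to_ground(u, v)[1]
    J = np.array([
        [(gx(xc + eps, yc) - gx(xc - eps, yc)) / (2 * eps),
         (gx(xc, yc + eps) - gx(xc, yc - eps)) / (2 * eps)],
        [(gy(xc + eps, yc) - gy(xc - eps, yc)) / (2 * eps),
         (gy(xc, yc + eps) - gy(xc, yc - eps)) / (2 * eps)],
    ])
    return J @ L @ J.T                 # grows far from the camera, small nearby
```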
The final output of each ATU is a small set of parameters (ground coordinates, classification, reliability), which is transmitted to the SDF server through wired or wireless transmission. If the foreground map fusion technique is used, a greyscale image is provided at each polling cycle, indicating the probability of each pixel belonging to the foreground.

3 Network communications

The TRAVIS system is based on a client-server network model where each ATU acts as a client to the listening SDF server. The clients are connected to the server via a wired connection or a wireless point-to-point or point-to-multipoint WiFi/WiMAX connection. The ATUs are remotely controlled from the SDF server through a signalling channel that enables the SDF server user to start/stop the capture of frames and to change the operational mode, capture cycle and other parameters. The signalling channel has been implemented in order to separate the data transmission of the ATUs from the control signals sent by the SDF.

Data fusion and multiple hypothesis tracking algorithms, run by the SDF server, require synchronised capture of frames of the scene using a constant sampling period. To fulfil this requirement, the network time protocol (NTP) is used to synchronise the system clocks of each computer in the TRAVIS network with reference to the server's system clock.


For this configuration, the SDF server was set at stratum 1 clock level in the NTP hierarchy, whereas every ATU was set at stratum 2 clock level.

Even with synchronised system clocks, a capture command from the server will not arrive at every ATU at the same moment of time because of network latency, computer process delays and so on, which makes coinstantaneous frame grabbing from all ATUs with millisecond accuracy impossible. For this reason, a synchronised capture procedure is used, which was first proposed in Litos et al. [25]. According to this algorithm, after the synchronisation of the system clocks, the SDF server sends to each ATU a specific timestamp, indicating the time in the future when the ATUs will start capturing frames, as well as the value of the (constant) frame sampling period.
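A minimal sketch of the resulting capture loop on an ATU follows; `grab_frame` is a hypothetical camera callback, and the loop assumes the local clock has already been disciplined by NTP.

```python
import time

def capture_loop(t0, period, grab_frame):
    """Grab frames at t0, t0 + period, t0 + 2*period, ... on the NTP-synced clock."""
    k = 0
    while True:
        target = t0 + k * period          # common absolute schedule for all ATUs
        delay = target - time.time()
        if delay > 0:
            time.sleep(delay)             # wait for the shared capture instant
        frame = grab_frame()
        # ... process frame, send observations stamped with `target` to the SDF ...
        k += 1
```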
3.1 Handshake procedure

After installation, all ATUs in the TRAVIS network are set in standby mode. In this phase, each ATU is listening on a user datagram protocol (UDP) port, defined during the installation procedure, for a command by the server. When the SDF server user clicks the start capturing button, the handshake procedure starts and a command is sent through the signalling channel notifying the ATUs of the SDF's IP address and port number. The next step is for the ATUs to request a TCP connection from the listening SDF server. After all connections have been established and the meeting time has been reached, the system enters normal operation. In the normal operation phase, the SDF and ATU software maintain timers to control frame capture and synchronisation.

This handshake scheme simplifies both the first installation and the re-configuration of the nodes of the TRAVIS network, since

- the SDF or ATU IP addresses are not required a priori;
- only the UDP port number where the ATUs listen needs to be known a priori;
- the SDF creates a list of the connected ATUs and waits for their data.
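A bare-bones sketch of this exchange from the ATU side is given below; the port numbers and the announcement format are assumptions, not the actual TRAVIS wire protocol.

```python
import socket

UDP_PORT = 5000                                  # agreed at installation time

# Standby mode: wait for the SDF server's announcement on the known UDP port.
udp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
udp.bind(("", UDP_PORT))
msg, _ = udp.recvfrom(1024)                      # blocks until handshake starts
sdf_ip, sdf_port = msg.decode().split(":")       # e.g. "10.0.0.1:6000" (assumed)

# Next step: open the TCP data connection back to the listening server.
tcp = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
tcp.connect((sdf_ip, int(sdf_port)))
```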
3.2 Synchronisation maintenance

During normal operation, the ATUs calculate the timestamp for capturing every next frame in order to keep being synchronised with each other. After capturing the frame, the image processing techniques presented in the previous section are applied to the captured frame, observations are extracted and finally a packet is sent to the SDF. On the other side, the server receives one packet from every ATU, checks the timestamps to certify that the frames are synchronised and determines that the data are ready to be fused. If one of the ATUs does not complete frame processing within the available time cycle (abnormal function), the time window for sending data is lost and the data will arrive at the SDF server with delay. In this case, the packet is discarded by the server, while the ATU will wait for the next valid window to capture a new frame and thus re-synchronise with the rest.

3.3 Secondary video streaming system

As a secondary backup system, the ATUs support on-demand media streaming directly to the control centre, where the SDF server is installed, in order to assist operators when an accident or other abnormal situation is reported. Depending on the available bandwidth, this secondary backup system is able to stream compressed video or just a number of compressed images. The application installed at the control centre may render this input as Motion JPEG, MPEG4 or any other available format and present it in a new window. This service allows the operator to select the specific ATU, watch the scene and validate the alarm.

4 SDF server

The SDF server collects information from all ATUs using a constant polling cycle, produces fused estimates of the position and velocity of each moving target and tracks these targets using a multi-target tracking algorithm. It also produces a synthetic ground situation display, collects statistical information about the moving targets and provides alerts when specific situations (e.g. accidents) are detected.
4.1 Data fusion

A target present simultaneously in the field of view of multiple cameras will result in multiple observations because the blob centres of the same object in two different cameras correspond to close but different 3-D points. The system supports two different techniques for grouping together all the observations that correspond to the same target. The first technique is grid-based fusion, which uses a grid to separate the overlap area into cells. Optimal values for the cell size are determined considering the application requirements (e.g. car size). Observations belonging to the same cell or to neighbouring cells are grouped together into a single fused observation [14].
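A compact sketch of such grouping follows, with the cell size as an assumed application parameter (roughly a car length); groups are formed by flood-filling over occupied, touching cells.

```python
from collections import defaultdict

def grid_fuse(observations, cell=4.0):
    """observations: list of ground points (xw, yw); returns one fused (x, y) per group."""
    cells = defaultdict(list)
    for x, y in observations:
        cells[(int(x // cell), int(y // cell))].append((x, y))
    seen, fused = set(), []
    for c in cells:
        if c in seen:
            continue
        stack, group = [c], []
        while stack:                       # flood-fill over neighbouring cells
            cx, cy = stack.pop()
            if (cx, cy) in seen or (cx, cy) not in cells:
                continue
            seen.add((cx, cy))
            group += cells[(cx, cy)]
            stack += [(cx + dx, cy + dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)]
        fused.append((sum(p[0] for p in group) / len(group),
                      sum(p[1] for p in group) / len(group)))
    return fused
```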


The second technique is called foreground map fusion. In this technique, as mentioned in Section 2, each ATU provides the SDF server with one greyscale image per polling cycle, indicating the probability of each pixel belonging to the foreground. The SDF server fuses these maps together by warping them to the ground plane and multiplying them [26]. The fused observations are then generated from these fused maps using connected component analysis, and classification information is computed as in the ATU's blob classification module. Although this technique has increased computational and network bandwidth requirements when compared to grid-based fusion, it can very robustly resolve occlusions between multiple views and provide more accurate results [14, 15].
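The fragment below sketches this fusion step; the ground-raster size and the image-to-raster homographies stand in for the real calibration data, and only the overlap area of the views is meaningful after the multiplication.

```python
import numpy as np
import cv2

def fuse_maps(prob_maps, homographies, raster_size=(500, 500)):
    """prob_maps: float32 images in [0, 1]; homographies: 3x3 image -> ground raster."""
    fused = np.ones(raster_size, dtype=np.float32)
    for pm, H in zip(prob_maps, homographies):
        warped = cv2.warpPerspective(pm, H, raster_size[::-1])
        fused *= warped                   # stays high only where all views agree
    # Fused observations then follow from thresholding + connected components.
    return (fused > 0.5).astype(np.uint8) * 255
```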
4.2 Multiple target tracking

The tracking unit is based on the multiple hypothesis tracking (MHT) algorithm, which starts tentative tracks on all observations and uses subsequent data to determine which of these newly initiated tracks are valid. Specifically, MHT [27] is a deferred decision logic algorithm in which alternative data association hypotheses are formed whenever there are observation-to-track conflict situations. Then, rather than combining these hypotheses, the hypotheses are propagated in anticipation that subsequent data will resolve the uncertainty. Generally, hypotheses are collections of compatible tracks. Tracks are defined to be incompatible if they share one or more common observations. MHT is a statistical data association algorithm that integrates the capabilities of (i) track initiation: automatic creation of new tracks as new targets are detected; (ii) track termination: automatic termination of a track when the target is no longer visible for an extended period of time; (iii) track continuation: continuation of a track over several frames in the absence of observations; (iv) explicit modelling of spurious observations; and (v) explicit modelling of uniqueness constraints: an observation may only be assigned to a single track at each polling cycle and vice versa.

Specifically, the tracking unit was based on a fast implementation of the MHT algorithm [22]. A 2-D Kalman filter was used to track each target, and additional gating computations are performed to discard observation-track pairs. More specifically, a gate region is defined around each target at each frame and only observations falling within this region are possible candidates to update the specific track. Accurate modelling of the target motion is very difficult, since a target may stop, move, accelerate and so on. Since only position measurements are available, a simple four-state (position and velocity along each axis) CV (constant velocity) target motion model, in which the target acceleration is modelled as white noise, provides satisfactory results.
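A condensed sketch of one predict-gate-update cycle of such a filter is given below; the sampling period and noise magnitudes are illustrative assumptions, not the tuned values of the deployed system.

```python
import numpy as np

dt = 0.5                                           # assumed polling period (s)
F = np.array([[1, 0, dt, 0], [0, 1, 0, dt],
              [0, 0, 1, 0], [0, 0, 0, 1]], float)  # constant-velocity model
Hm = np.array([[1, 0, 0, 0], [0, 1, 0, 0]], float) # only position is measured
Q = 0.5 * np.eye(4)                                # white-noise acceleration
Rm = 4.0 * np.eye(2)                               # measurement noise (m^2)

def step(x, P, z, gate=9.21):                      # 99% chi-square gate, 2 dof
    x, P = F @ x, F @ P @ F.T + Q                  # predict
    S = Hm @ P @ Hm.T + Rm                         # innovation covariance
    nu = z - Hm @ x
    if nu @ np.linalg.inv(S) @ nu > gate:
        return x, P, False                         # observation outside the gate
    K = P @ Hm.T @ np.linalg.inv(S)                # Kalman gain
    return x + K @ nu, (np.eye(4) - K @ Hm) @ P, True
```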
5 Experimental results

Two prototypes were installed, each focusing on a different aspect of traffic monitoring (airports, tunnels at highways). Extensive tests were made during pilot installations both at Macedonia airport of Thessaloniki, Greece and at a tunnel near Piraeus harbour of Athens, in order to optimise the proposed system for each application. The maximum vehicle speed that the system can detect depends on the installation, while the system can cope well with small velocities. Parameters that affect the maximum speed limitation are the area of coverage and the frame rate of the system. For example, a system configured to run at 2 f/s and cover a road distance of 40 m can estimate target velocities up to 144 km/h (144 km/h corresponds to 40 m/s, so such a target remains within the 40 m covered area for 1 s, i.e. for two inter-frame periods at 2 f/s). In the following subsections, the two prototype installations are discussed.

5.1 Airport prototype system

The first prototype system concerns the traffic control of aircraft parking areas (APRON). The system calculates the position, velocity and direction of the targets and classifies them according to their type (car, man, long vehicle etc.). Alerts are displayed in case of potentially dangerous situations, such as speeding. This application also provides a graphical display of the ground situation at the APRON. This information can be accessed by the person responsible, even if they are situated in a distant area with no direct eye contact with the APRON. The prototype system installed at Macedonia airport of Thessaloniki consists of three cameras, which were installed on the fourth floor of the control tower, mounted on the balcony ledge. The cameras cover a great percentage of the APRON area (approximately 25 000 m²), while focusing on parking place number 4 of the airport APRON. The two ATUs (one ATU is connected with two cameras) and the SDF server were placed in the APRON control room of the tower and are interconnected through an Ethernet network. Fig. 2 presents the views from two of the cameras installed at Thessaloniki airport and the corresponding foreground maps using the non-parametric modelling method.

Table 1 presents the execution times per frame of the basic modules of the system for the airport application and the percentage of time spent at each procedure. All times have been acquired using the non-parametric modelling method and the map fusion mode. The execution times are in the range of milliseconds, which shows that the proposed solution is suitable for real-time traffic surveillance at airports.

In order to analyse the results acquired by the system, ground truth data had to be acquired. For the airport prototype, tests have been conducted using cars equipped with global positioning system (GPS) devices. The ground truth data have been correlated with the system results and then inserted into a database, along with relevant environmental parameters. A test analysis tool has also been implemented in order to calculate and display various statistics that can be acquired by comparing the results of the system with the ground truth. Table 2 presents the average position error for each camera as well as the average position error of the system. Hence, the theoretically obtainable accuracy of the system is approximately 3.69 m, which compares favourably to the performance of surface-movement radars (less than 7.5 m) according to the ICAO specifications [28]. Fig. 3 presents the trajectory of a follow-up car (equipped with a GPS) moving on the yellow line of parking position 4 and the corresponding trajectory estimated by the proposed system.

The position error distribution for the three cameras installed at the airport application is shown in Fig. 4.


Figure 2 Detection of moving vehicles at Thessaloniki airport


a and b Views from two cameras at Thessaloniki airport
c and d Corresponding foreground maps

The majority of the recorded position errors are between 0 and 5 m, while there is also a significant percentage between 5 and 10 m. In total, more than 91% of the recorded observations have been accurately detected (the GPS error should also be considered).

Crucial statistics for this kind of evaluation are the false positive and the false negative errors. A false positive occurs when the system observes a target while in truth there is none. On the other hand, a false negative means the system's failure to detect a target when in truth there is one. The false negative error is considered of great importance, since a high value of this statistic means that the system is vulnerable to errors that may have severe consequences, such as failing to identify an emergency situation. Tests showed that the system is robust enough to false negative errors, with a probability of 4.5% (Table 3), whereas the false positive error is approximately 15.3%, since the system is affected by sudden lighting changes. The majority of errors are classified as position errors, that is, detected targets with a position deviation from the corresponding GPS record.

A basic drawback of all video-based surveillance systems is the degradation of the system's performance under low visibility conditions. In the case of thin mist, where the visibility of the air traffic controller was limited, the system performance was satisfactory, since a uniform background was created, facilitating the detection of targets. However, in the case of thick fog, the system could not distinguish the target from the background, producing high false negative errors. For this reason, the cameras should be installed as close as possible to the surveyed area to reduce the possibility of false negative errors under extremely low visibility.

Table 1 Execution times in airport application

Modules                 Time, ms   Percentage
background extraction   43.56      49.09
data fusion             31.7       35.73
tracker                 0.22       0.25
display                 13.25      14.93
total                   88.73      100

Table 2 Average position errors

Camera     Average position error, m
camera 1   3.65
camera 2   3.73
camera 3   3.69
total      3.69


Table 3 Error types for airport application

Error type       Percentage
false negative   4.5
false positive   15.3
position error   80.2

At night, the system showed adequate capability to detect targets; nevertheless, slightly higher position errors are produced due to the detection of the aircraft/vehicle lights.

5.2 Tunnel prototype system

Another prototype was applied for traffic control of tunnels at highways. The focus of this application is on the collection of traffic statistics, such as speed and loads per lane. It can also identify potentially dangerous situations, such as objects falling from a vehicle or traffic jams. These results can be provided to traffic surveillance centres or they can be used to accordingly activate local road signs and traffic lights. The prototype was installed at a highway tunnel near Piraeus Harbour, Athens, Greece. Taking into account existing installation constraints (limitations from tunnel authorities, slight left turn near the entrance etc.), two cameras were finally installed at the entrance of the tunnel, placed in a way that the two fields of view partially overlap. The tunnel coverage resulting from this pilot installation is about 35 m. Each ATU consists of an embedded PC and a camera. The embedded PCs were installed on top of the tunnel roof and inside an IP-66 box, which is highly protective against environmental conditions. A WiMAX interface was used to connect the ATUs to the SDF server, which was installed in a building about 1 km away (line of sight).

Figure 3 Comparison of a target trajectory to the ground truth
The light grey line indicates the GPS signal, whereas the dark grey line is the output of the proposed system

In Figs. 5a and b, two synchronised frames from the different cameras are presented, whereas Figs. 5c and d present the corresponding thresholded foreground maps. The SDF provides a visualisation window, which displays the fused observations as well as the calculated traffic statistics in real time. As in the case of the airport prototype system, the average execution times for each module of the system are in the range of milliseconds (Table 4). However, the execution times of the background extraction (non-parametric modelling) and the data fusion (map fusion mode) modules are relatively high, since the observed moving targets are close to the cameras and consequently cover larger regions of the camera frame.

For the tunnel prototype, the ground truth data were collected by viewing image sequences through a specially developed manual marking tool, which allows the user to specify the correct position of the moving targets. The ground truth data were correlated with the system results and then inserted into the database.

Figure 4 Position error distribution for the airport application


Figure 5 Detection of moving vehicles at Piraeus tunnel


a Frame from camera 1
b Frame from camera 2
c Mask from camera 1 frame
d Mask from camera 2 frame

Table 5 presents the average position error for each camera as well as the average system position error, which is approximately 6.345 m. Although the cameras are closer to the targets and the system is not affected by light changes or other environmental conditions, the position error is higher than in the case of the airport application. This is mainly due to two reasons: (i) long vehicles, for example trucks, cover a big part of the camera's view, creating high position errors, and (ii) the position of targets at the back of the scene cannot be estimated accurately because of the bad camera perspective. Fig. 6 shows the trajectory of a vehicle, moving in the supervised area of the tunnel, estimated by the manual marking tool (light grey dashed line) and the corresponding trajectory produced by the proposed system (dark grey solid line).

Table 4 Execution times for each module in tunnel application

Modules                 Time, ms   Percentage
background extraction   61.78      35.86
data fusion             109.35     63.47
tracker                 0.405      0.23
display                 0.751      0.44
total                   172.286    100

Table 5 Average position errors

Camera     Average position error, m
camera 1   6.39
camera 2   6.30
total      6.345

Figure 6 Comparison of a target trajectory to the ground truth
The light grey dashed line indicates the ground truth (manual marking tool), whereas the dark grey solid line is the output of the proposed system


Figure 7 Position error distribution for the tunnel application

The deviation between the two trajectories at the last observation is due to the bad perspective of the cameras. Although this effect can be addressed by increasing the number of cameras covering the monitored area, high accuracy is not as crucial in a tunnel application as in an airport application.

The position error distribution for both cameras is shown in Fig. 7. The majority of the recorded position errors are between 0 and 5 m, while there is also a significant percentage between 5 and 10 m. This means that approximately 80% of the recorded observations have been accurately detected (the manual marking error should also be considered).

Finally, Table 6 contains statistics related to the error types in the tunnel application. The majority of errors recorded are position errors, 85.9%, while the system seems to be robust enough to false negative errors, since it is unaffected by weather conditions. However, false positive errors seem to be relatively high because long vehicles may cover a big part of the camera's view, thus erroneously producing multiple targets which actually correspond to only one moving vehicle.

Table 6 Error types for tunnel application

Error type       Percentage
false negative   2.94
false positive   11.07
position error   85.9

6 Conclusions

A novel multi-camera video surveillance system focusing on traffic monitoring applications was presented. The proposed solution is extensible and parameterised, and it supports various image processing and data fusion techniques so as to be easily applicable to a broad field of traffic applications. The modular architecture of the system, which is based on autonomous units (ATUs) that work completely independently from each other, allows the support of a large number of cameras without significantly increasing the computational requirements. Test results from the two pilot applications for tunnel and APRON traffic monitoring and surveillance show the great potential of the proposed technology. Possible future work includes the extension of the two pilot applications by adding more autonomous tracking units and the distribution of road/tunnel traffic information through the web to remote clients.

7 Acknowledgments

This work was supported by the General Secretariat of Research and Technology Hellas under the InfoSoc TRAVIS: TRAffic VISual monitoring project and the EC under the FP6 IST Network of Excellence: 3DTV - Integrated Three-Dimensional Television Capture, Transmission, and Display (contract FP6-511568). The authors would like to thank MARAC Electronic S.A. for the fruitful cooperation within the TRAVIS project, as well as A. Smolic and HHI Berlin for supplying the video sequences for the first lab tests.

8 References

[1] LE BOUFFANT T., SIEBEL N.T., COOK S., MAYBANK S.: Reading people tracker reference manual (Version 1.12). Technical Report No. RUCS/2002/TR/11/001/A, Department of Computer Science, University of Reading, 2002


[2] THIRDE D., BORG M., FERRYMAN J., ET AL.: A real-time scene understanding system for airport apron monitoring. Proc. Fourth IEEE Int. Conf. on Computer Vision Systems, January 2006

[3] MICHALOPOULOS P.G.: Vehicle detection video through image processing: the autoscope system, IEEE Trans. Veh. Technol., 1991, 40, (1), pp. 21-29

[4] PAVLIDOU N., GRAMMALIDIS N., DIMITROPOULOS K., ET AL.: Using intelligent digital cameras to monitor aerodrome surface traffic, IEEE Intell. Syst., 2005, 20, (3), pp. 76-81

[5] Citilog: http://www.citilog.fr, accessed September 2008

[6] Peek Traffic: http://www.peek-traffic.com, accessed September 2008

[7] Traficon: http://www.traficon.com, accessed September 2008

[8] Iteris Odetics IST: http://www.iteris.com/, accessed September 2008

[9] ArtiBrain: http://www.artibrain.at, accessed September 2008

[10] SCHWABACH H., HARRER M., HOLZMANN W., ET AL.: Video based image analysis for tunnel safety - Vitus-1: a tunnel video surveillance and traffic control system. 12th World Congress on Intelligent Transport Systems, November 2005

[11] MARTIN P., FENG Y., WANG X.: Detector technology evaluation. Department of Civil and Environmental Engineering, University of Utah Traffic Laboratory, Final Report, November 2003

[12] Minnesota Department of Transportation: Evaluation of non-intrusive technologies for traffic detection, Phase II, November 2004. Available at: http://projects.dot.state.mn.us/nit/index.html

[13] MIDDLETON D., PARKER R., LONGMIRE R.: Investigation of vehicle detector performance and ATMS interface. Report 0-4750-2, Project Title: Long-Term Research into Vehicle Detection Technologies, March 2007

[14] KOUTSIA A., SEMERTZIDIS T., DIMITROPOULOS K., GRAMMALIDIS N., GEORGOULEAS K.: Automated visual traffic monitoring and surveillance through a network of distributed units. ISPRS 2008, Beijing, China, July 2008

[15] KOUTSIA A., SEMERTZIDIS T., DIMITROPOULOS K., GRAMMALIDIS N., GEORGOULEAS K.: Intelligent traffic monitoring and surveillance with multiple cameras. Sixth Int. Workshop on Content-Based Multimedia Indexing (CBMI 2008), June 2008

[16] SEMERTZIDIS T., KOUTSIA A., DIMITROPOULOS K., ET AL.: TRAVIS: an efficient traffic monitoring system. 10th Int. Conf. on Applications of Advanced Technologies in Transportation, May 2008

[17] DIMITROPOULOS K., GRAMMALIDIS N., SIMITOPOULOS D., PAVLIDOU N., STRINTZIS M.: Aircraft detection and tracking using intelligent cameras. IEEE Int. Conf. on Image Processing, 2005, pp. 594-597

[18] KAEWTRAKULPONG P., BOWDEN R.: An improved adaptive background mixture model for real-time tracking with shadow detection. Second European Workshop on Advanced Video-based Surveillance Systems, 2001

[19] LI L., HUANG W., GU I.Y.H., TIAN Q.: Foreground object detection from videos containing complex background. Proc. 11th ACM Int. Conf. on Multimedia, 2003, pp. 2-10

[20] LLUIS J., MIRALLES X., BASTIDAS O.: Reliable real-time foreground detection for video surveillance applications. Proc. Third ACM Int. Workshop on Video Surveillance and Sensor Networks, 2005, pp. 59-62

[21] ELGAMMAL A., HARWOOD D., DAVIS L.: Non-parametric model for background subtraction. Sixth European Conf. on Computer Vision, June/July 2000

[22] COX I.J., HINGORANI S.L.: An efficient implementation of Reid's multiple hypothesis tracking algorithm and its evaluation for the purpose of visual tracking, IEEE Trans. Pattern Anal. Mach. Intell., 1996, 18, pp. 138-150

[23] HU M.-K.: Visual pattern recognition by moment invariants, IRE Trans. Inf. Theory, 1962, 8, pp. 179-187

[24] BORG M., THIRDE D., FERRYMAN J., ET AL.: Visual surveillance for aircraft activity monitoring. VS-PETS 2005, Beijing, China

[25] LITOS G., ZABULIS X., TRIANTAFYLLIDIS G.A.: Synchronous image acquisition based on network synchronization. IEEE Workshop on Three-Dimensional Cinematography, 2006

[26] KHAN S., SHAH M.: A multiview approach to tracking people in crowded scenes using a planar homography constraint. Ninth European Conf. on Computer Vision, 2006

[27] BLACKMAN S., POPOLI R.: Design and analysis of modern tracking systems (Artech House, Boston, USA, 1999)

[28] ICAO Document AN-Conf/11-IP/4: Manual of advanced surface movement, guidance, and control systems (A-SMGCS)
