a,* Fondazione Bruno Kessler-irst, Via Sommarive 18, I-38050 Povo, Trento, Italy
b Università degli Studi di Trento, Via Mesiano 77, Trento, Italy
* Corresponding author. Tel.: +39 0461 314 592; fax: +39 0461 314 591. E-mail address: messelod@itc.it (S. Messelodi).
Received 1 August 2005; received in revised form 2 February 2007
Available online 13 May 2007
Communicated by H.H.S. Ip
Abstract
We present a feature-based classifier that distinguishes bicycles from motorcycles in real-world traffic scenes. The algorithm extracts visual features focusing on the wheel regions of the vehicles. It splits the problem into two sub-cases depending on the computed motion direction. The classification is performed by non-linear Support Vector Machines. Tests lead to a successful vehicle classification rate of 96.7% on video sequences taken from different road junctions in an urban environment.
© 2007 Elsevier B.V. All rights reserved.
Keywords: Traffic monitoring; Feature extraction; Support Vector Machine; Vehicle classification; Image analysis
1. Introduction
Image analysis techniques have been shown to be effective and cost competitive in various traffic control applications (Kastrinaki et al., 2003; Foresti et al., 2003; Hu et al., 2004). In spite of some drawbacks, mainly related to a dependence on scene illumination, vision-based systems offer several advantages over traditional traffic control techniques: low impact on the road infrastructure, low maintenance costs and the possibility for a remote operator to receive images. Furthermore, a vision-based system can be adapted to detect and classify particular kinds of vehicles on the basis of visual features. This is the case when discriminating between bicycles and motorcycles. This capability provides important information to traffic managers in order to evaluate the need to build bicycle lanes or to establish correlations between traffic and air or acoustic pollution.
When necessary, bicycle counting is usually performed
manually by transportation personnel, or automatically
by means of special purpose-built equipment. For temporary sessions of data acquisition, pneumatic rubber tube detectors placed across the road are often used. For continuous monitoring, permanent detectors are used, i.e. devices such as loop detectors and infrared or video detection systems. A comparison of different bicycle detection technologies is included in (SRF Consulting Group, 2003).
Few vision-based algorithms devoted to bicycle counting have been proposed in the literature (Dukesherer and Smith, 2001; Rogers and Papanikolopoulos, 2000). The algorithm proposed by Rogers and Papanikolopoulos (2000) detects objects moving through the scene by means of a background differencing technique. The estimation of the movement direction enables their system to localize the wheels by searching for ellipses using the generalized Hough transform in the edge map. They claim to be able to count the number of bicycles on a trail with an accuracy up to 70%, for a variety of weather conditions. Furthermore, the authors refer to a previous method (Rogers and Papanikolopoulos, 1999) where bicycles were detected in the image by a template-matching technique, concluding that the Hough-based method is a better alternative, mainly for computational reasons. Dukesherer and
frames, i.e. regions containing edges or corners. The correspondences among these regions between two successive images provide information about the object movement. The result of this module is a set of objects, each one represented by a data structure that stores information about the different views of the object in the scene and its displacements δ between consecutive frames.
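As a rough illustration of this data structure (the paper does not give an implementation, and all field names here are hypothetical), a Python sketch could be:

```python
from dataclasses import dataclass, field

@dataclass
class TrackedObject:
    """One moving object as produced by the localization/tracking module.

    Hypothetical layout: the text only states that each object stores its
    successive views and the displacements between consecutive frames.
    """
    object_id: int
    views: list = field(default_factory=list)          # per-frame view data (see the descriptor in Section 3)
    displacements: list = field(default_factory=list)  # delta = (dx, dy) pixel shift between consecutive frames
```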
2.2. Parameter extraction
The parameter extraction module analyzes the output of the previous module in order to provide a description in terms of class, speed and path for each detected and tracked object. For the purpose of this paper only the classification step needs a brief description. It works in two stages: a model-based classification step, followed by a feature-based one, when needed.
The model-based classifier makes use of a set of 3D models which provide a rough description of the shapes of different vehicle categories. Eight models are adopted: a single model (called CYCLE) represents motorcycles and bicycles, three models represent cars, two represent vans, and one represents lorries and buses; an eighth model is defined to represent pedestrians, but it is used only to detect false alarms.
The 3D model classifier considers, for each view of the moving object, the best match with each 3D model placed at different positions and along different directions on the ground plane. The match score is computed as the overlap between the support set Bj of the jth view of the object and the projection of the ith model onto the image plane. The object is then assigned to the 3D model having the highest average score computed over the set of its views. Focusing on the best 3D model, a list is associated to each object view containing the following data: the estimated position and orientation on the ground plane, the overlap score, and a number in the interval (0, 1] (inside factor) that specifies what fraction of the projected model is visible in the image.
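The exact overlap formula is not given in this excerpt; a minimal sketch of one plausible choice (intersection over union between the view's support mask and the projected model mask, which is our assumption) is:

```python
import numpy as np

def overlap_score(B_j: np.ndarray, model_proj: np.ndarray) -> float:
    """Overlap between the support set B_j of the j-th view and the
    projection of a 3D model onto the image plane.

    Both inputs are boolean masks of the same size. Intersection over
    union is an assumption; the paper only calls the score an 'overlap'.
    """
    inter = np.logical_and(B_j, model_proj).sum()
    union = np.logical_or(B_j, model_proj).sum()
    return float(inter) / union if union else 0.0
```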
If the best model corresponds to a single vehicle category, then the classification terminates straightforwardly. Otherwise, specialized classifiers are applied in order to determine the correct class among the vehicle categories associated to that 3D model. These classifiers use specific features extracted from the views of the vehicle.
3. Feature selection
At the end of the model-based classification phase, an object descriptor stores the following information for each view of the vehicle:
B: the support set of the unknown vehicle, i.e. a binary mask corresponding to the convex hull of the vehicle in the image;
I: the subimage of the input image corresponding to the bounding box of B;
D: the absolute difference map between I and the background image in the same location;
δ: the displacement vector in the image plane of the vehicle blob with respect to the previous frame;
(x0, y0, θ): the estimated position and direction of the vehicle on the road;
the score of the model-based classification step;
the inside factor, i.e. the fraction of the real-world positioned model that is visible in the image.
An example of this information is reported in Fig. 1.
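In code, such a descriptor could be sketched as follows (field names are ours, mirroring the list above):

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class ViewDescriptor:
    """Per-view information produced by the model-based stage
    (hypothetical field names)."""
    B: np.ndarray         # binary support mask (convex hull of the vehicle)
    I: np.ndarray         # input subimage over the bounding box of B
    D: np.ndarray         # |I - background| absolute difference map
    delta: tuple          # displacement vector in the image plane
    pose: tuple           # (x0, y0, theta): position and direction on the road
    score: float          # model-based classification score
    inside_factor: float  # visible fraction of the projected model, in (0, 1]
```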
The world coordinate system is chosen by having the (X, Y) plane in correspondence with the road plane, the origin at the vertical projection of the camera optical center onto (X, Y), and the Y-axis directed as the projection of the optical axis onto the road plane.
In order to deal with the great variability of the visual appearance of a motorcycle or bicycle, mainly due to the different perspectives under which it can be observed by the camera, we choose to distinguish two different contexts according to the moving direction of the vehicle, θ, with respect to the Y-axis. If θ is close to the Y-axis direction, i.e. the angle between them is less than a fixed value Tθ, a front or rear view of the moving vehicle appears in the image (Fig. 2a and b). Otherwise, the image depicts a side view of the vehicle (Fig. 2c and d). The selection of the threshold values will be discussed at the end of this section.
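The decision rule can be sketched as follows (angles in degrees; the modular-distance formulation is ours):

```python
def view_context(theta_deg: float, t_theta_deg: float) -> str:
    """Select the classification context from the vehicle direction theta,
    measured on the road plane with respect to the Y-axis."""
    # Distance of theta from the Y-axis direction (0 or 180 degrees), so
    # that both front views (theta near 180) and rear views (theta near
    # 0/360) fall in the same context.
    d = min(theta_deg % 180.0, 180.0 - (theta_deg % 180.0))
    return "front_rear" if d < t_theta_deg else "side"
```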
Side view. In this case, the underlying idea of the algorithm is that the luminance of the region inside the wheels of a bicycle is more similar to the background than the same region of a motorcycle. The purpose is to localize the wheel regions in the image and compute the average value of D over those regions. The feature extraction proceeds as follows (a code sketch is given below, after the footnote):
(1) Let ω be the direction of the displacement vector δ in the image plane.
(2) Compute the direction ω0 of the minimum bounding rectangle¹ (MBR) of the support set B, with direction constrained in a centered neighborhood (±5°) of ω.
(3) Estimate the location of the regions R1, R2 corresponding to the wheels:
- compute the projections p0 and p1 of the subimage of D in B, respectively along the direction ω0 and its normal;
- in order to reduce boundary noise (mainly due to shadow) in both directions, in particular in the wheel area, consider the portion p′0 of p0 between the 3rd and 100th percentile and the portion p′1 of p1 between the 1st and 99th percentile;
- the wheel regions are approximated by two rectangular regions R1 and R2 (Fig. 3), obtained from the intersection of the backprojection of the first
¹ MBR(ω,d)(B) is the rectangle of minimum area which contains all the points of the set B and has a side with slope in the range (ω − d, ω + d).
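A sketch of the projection and trimming steps above, under two assumptions of ours: the projections are accumulated by binning pixel coordinates along the two directions, and the percentile rule is applied to the cumulative mass of each profile:

```python
import numpy as np

def trimmed_projections(D: np.ndarray, B: np.ndarray, omega0_deg: float):
    """Projections p0, p1 of the difference map D restricted to the support
    set B, along the MBR direction omega0 and its normal, followed by the
    percentile trimming described in step (3)."""
    ys, xs = np.nonzero(B)
    w = D[ys, xs].astype(float)
    t = np.deg2rad(omega0_deg)
    u = xs * np.cos(t) + ys * np.sin(t)    # pixel coordinate along omega0
    v = -xs * np.sin(t) + ys * np.cos(t)   # pixel coordinate along the normal
    # p0: projection along omega0 (profile indexed by the normal coordinate);
    # p1: projection along the normal (profile indexed by the omega0 coordinate).
    p0 = np.bincount(np.round(v - v.min()).astype(int), weights=w)
    p1 = np.bincount(np.round(u - u.min()).astype(int), weights=w)

    def trim(p: np.ndarray, lo_pct: float, hi_pct: float) -> np.ndarray:
        # Keep the part of the profile between the given percentiles of its
        # cumulative mass, suppressing boundary noise such as shadows.
        c = np.cumsum(p) / max(p.sum(), 1e-9)
        lo = int(np.searchsorted(c, lo_pct / 100.0))
        hi = int(np.searchsorted(c, hi_pct / 100.0))
        return p[lo:hi + 1]

    return trim(p0, 3, 100), trim(p1, 1, 99)
```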
Fig. 1. Information associated to a view of a vehicle labeled as CYCLE by the model-based classifier: region I of the input image, background difference map D, support set B (here its boundary overlapped to I). Other information: δ = (12, 23), (x0, y0, θ) = (1504, 11741, 50°), model-based classification score = 0.85, inside factor = 1.0.
Fig. 2. (a) Front view of a bicycle with estimated motion direction θ = 200° in the real world. (b) Rear view of a motorcycle: θ = 20°. (c) Side view of a bicycle: θ = 90°. (d) Side view of a motorcycle: θ = 100°.
bicycles, and this fact can be detected by analyzing a profile obtained from the image portion that contains a wheel of the vehicle (actually the wheel which is closest to the camera). The feature extraction algorithm works as follows:
(1) Taking advantage of the information about the position and direction of the vehicle on the road plane, and the expected displacement of the wheels with respect to the vehicle middle point, the real-world location of the wheel closest to the camera is estimated. To focus the analysis on the bottom part of the wheel, a vertical segment of fixed height (40 cm in the experiments) is virtually placed at that location and its back projection onto the image plane is computed. Let Hw be its length, in pixels, on the image plane.
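A sketch of this back projection, assuming a calibrated 3 × 4 world-to-image projection matrix P (in the real system this comes from the monitoring system's calibration):

```python
import numpy as np

def wheel_segment_height_px(P: np.ndarray, x_w: float, y_w: float,
                            seg_height_m: float = 0.40) -> float:
    """Back-project a vertical segment of fixed height placed at the
    estimated wheel location (x_w, y_w, 0) on the road plane and return
    its length Hw in pixels on the image plane."""
    def project(X_world: np.ndarray) -> np.ndarray:
        u = P @ np.append(X_world, 1.0)   # homogeneous projection
        return u[:2] / u[2]
    bottom = project(np.array([x_w, y_w, 0.0]))
    top = project(np.array([x_w, y_w, seg_height_m]))
    return float(np.linalg.norm(top - bottom))
```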
Fig. 3. Side views. The projections p0 and p1 of the difference map, along the direction ω0 and its normal, used to determine the wheel regions R1 and R2 (boxes). Left (bicycle): ω0 = 85.3°; the extracted features lead to S1 = 24.3 and S2 = 37.4. Right (motorcycle): ω0 = 89.7°; the extracted features lead to S1 = 79.8 and S2 = 75.6.
The values of the thresholds involved in this first classification algorithm (both for side and front/rear views) have been set during a parameter estimation stage, on a training set of labeled vehicle views. Only the views having score and inside factor greater than prefixed values (0.3 and 0.95, respectively) are considered. Ranges of reasonable values have been assigned to Tθ, Ts, Tp, Twp, and the configuration that maximized the classification rate on the training set has been selected. The percentile values, used to remove noise at the tails of the projections, have been estimated by comparing, for a small set of images, the projections of the automatically extracted blobs with those of manually labeled blobs.
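This parameter estimation amounts to a grid search; a sketch, where `classify` and the per-view fields (`score`, `inside_factor`, `label`) are assumptions mirroring the description above:

```python
from itertools import product

def fit_thresholds(training_views, classify, grids):
    """Pick the threshold combination (e.g. T_theta, T_s, T_p, T_wp) that
    maximizes the classification rate on labeled training views, keeping
    only views with score > 0.3 and inside factor > 0.95 as stated above.

    grids maps each threshold name to its range of candidate values;
    classify(view, params) returns 'bicycle' or 'motorcycle'.
    """
    views = [v for v in training_views
             if v.score > 0.3 and v.inside_factor > 0.95]
    best_params, best_rate = None, -1.0
    for combo in product(*grids.values()):
        params = dict(zip(grids.keys(), combo))
        rate = sum(classify(v, params) == v.label for v in views) / len(views)
        if rate > best_rate:
            best_params, best_rate = params, rate
    return best_params, best_rate
```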
The classification performance of this algorithm, presented in Section 5, suggests that the considered visual cues have a sufficiently good discrimination power. Therefore they are adopted in the selection of the features of the SVM classifiers described in the following section.
4. The SVM bicycle/motorcycle classifier
Support Vector Machines are based on the learning
theory developed by Vapnik (1995). They are a method
(P0 and P1). Their computation is different in the two contexts, and consists in the following steps (refer to Figs. 6 and 7; a code sketch follows their captions):
(1) let ω be the direction of the displacement vector δ in the image plane;
(2) let ω0 be the slope of MBR(ω,5°)(B); let V0, V1, V2, V3 be the vertexes of the MBR, counterclockwise, where the side V0V1 is the lower side in image coordinates, having slope ω0 for side views, and slope orthogonal to ω0 for front/rear views;
(3) extract two rectangular zones Zi, i = 1, 2 from the MBR, whose vertexes V0, V1, V′2, V′3 are defined parametrically with respect to factors fi, as V′2 = V1 + fi (V2 − V1) and V′3 = V0 + fi (V3 − V0);
(4) compute the projection p0 by projecting the region of D enclosed in Z1 along the direction ω0, and p1 by projecting the region of D enclosed in Z2 along the direction normal to ω0;
(5) P0 and P1 are then obtained by the quantization of p0 and p1 into fixed dimensions D0 and D1.
Fig. 6. Computation of the projections p0 and p1 for a side view of a motorcycle. The MBR and the zones Z1 and Z2 are highlighted.
Fig. 7. Computation of the projections p0 and p1 for a rear view of a bicycle. The MBR and the zones Z1 and Z2 are highlighted.
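Step (5) can be sketched as a simple resampling; the bin counts D0 and D1 below are placeholders (their values are not reported in this excerpt), and the final normalization is an assumption of ours:

```python
import numpy as np

def quantize_profile(p: np.ndarray, n_bins: int) -> np.ndarray:
    """Resample a projection profile to a fixed number of bins, so that
    profiles of vehicles with different image sizes become comparable."""
    idx = np.linspace(0.0, len(p) - 1.0, n_bins)
    return np.interp(idx, np.arange(len(p)), p)

def svm_input(p0: np.ndarray, p1: np.ndarray, d0: int = 16, d1: int = 16) -> np.ndarray:
    """Build the fixed-length feature vector (P0, P1) fed to the SVM."""
    P0 = quantize_profile(p0, d0)
    P1 = quantize_profile(p1, d1)
    # Normalize each profile to unit mass (assumed; not specified in the text).
    P0 /= max(P0.sum(), 1e-9)
    P1 /= max(P1.sum(), 1e-9)
    return np.concatenate([P0, P1])
```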
Table 1
Classification result at vehicle level of the early classifier applied to three sequences coming from two different junctions

Input class    No.    Classified bicycle    Classified motorcycle    Error (%)
Bicycle         45            27                     18                40.0
Motorcycle     144             5                    139                 3.5
Total          189                                                     12.2
Table 2
Number of side views, front views, total views and vehicles for each class

              Side views    Front views    Total views    Total vehicles
Bicycle          324            150            474              78
Motorcycle       675            107            782             197
Total            999            257           1256             275
Table 3
Error rates of the SVM classifiers for side views and for front views, at view level and at vehicle level, with a breakdown over the two classes

               Global error (%)    Bicycle error (%)    Motorcycle error (%)
Side views           6.2                 10.2                   4.3
Front views          6.2                  6.0                   6.5
All views            6.3                  9.5                   4.3
All vehicles         3.3                  3.8                   3.0
The major source of classification errors is the inaccurate detection of the vehicle boundary in the previous localization and tracking steps. This is mainly due to the partial or total inclusion of the vehicle shadow in the blob, or to the inclusion of moving background regions, typically generated by moving leaves or their shadows.
6. Conclusions
In this paper, we have described an algorithm for the discrimination between bicycles and motorcycles. It is part of a video-based traffic monitoring system that aims to detect, track and classify vehicles at urban road junctions. The algorithm is applied after a model-based classification that is unable to discriminate between the two vehicle classes, but that ensures (with a certain confidence) that the vehicle belongs to one of the two considered classes. The visual features used by the classifier are computed starting from the vehicle image, the background image and an estimated position and orientation of the vehicle in the real world. These data are provided by other modules of the monitoring system. The algorithm focuses on the image regions that correspond to the wheels of the vehicle, and acts differently depending on the vehicle orientation with respect to the camera view (side or front/rear).
The application of a rough classifier has shown that the selected zones and features are discriminative. Support Vector Machines have then been trained using analogous features, based on the skewed projection profiles of the lower part of the vehicle, leading to a global error rate of 6.3% at view level and 3.3% at vehicle level. These figures cannot be directly compared to other works, as this is a relatively unexplored task. In fact, as far as we know, no other monitoring system for urban junctions currently exists that is able to classify vehicles including the bicycle class. Considering that in traffic surveillance applications aiming at collecting data for statistical purposes a classification error rate around 5% is typically accepted, our results can be considered satisfactory.