www.elsevier.com/locate/imavis
Abstract
This paper proposes a novel edge-based stitching method to detect moving objects and construct mosaics from images. The method is a
coarse-to-fine scheme: it first estimates a good initialization of the camera parameters with two complementary methods and then refines the
solution through an optimization process. The two complementary methods are the edge alignment approach and the correspondence-based
approach. The edge alignment method estimates the desired image translations by checking the consistency of edge positions between
images, which makes it better able to cope with larger displacements and lighting variations between images. The correspondence-based
approach estimates the desired parameters from a set of correspondences by using a new feature extraction scheme and a new
correspondence building method; it can handle more general camera motions than the edge alignment method. Since these two
methods are complementary, the desired initial estimate can be obtained more robustly. A Monte-Carlo style method
is then proposed to integrate the two methods. In this approach, a grid partition scheme is used to increase the accuracy of
each try at finding the correct parameters. Finally, an optimization process is applied to refine the initial parameters. Unlike
other optimization methods, which minimize errors over the whole images, the proposed scheme minimizes errors only at the positions of feature
points. Since the initialization found is very close to the exact solution and only errors at feature positions are considered, the optimization
process converges very quickly. Experimental results are provided to verify the superiority of the proposed method.
© 2004 Elsevier B.V. All rights reserved.
Keywords: Image registration; Image-based rendering; Mosaics; Moving object detection; Video retrieval
where $N_b^v$ is the number of elements in $P_b^v$. Let $T_d$ denote a threshold, set to 4. Given a number $k$, we want to determine the number $N_p^v$ of elements in $P_a^v$ whose $d_v(i,k)$ is less than $T_d$. In addition, we denote the average value of $d_v(i,k)$ for these $N_p^v$ elements as $E_k^v$, which can be used as an index to measure the goodness of $k$, i.e. whether it is a suitable translation solution. If $E_k^v$ is small enough and $N_p^v$ is large enough, the position $k$ can be considered a good horizontal translation. More precisely, if $E_k^v \le T_e$ and $N_p^v \ge T_p$, then $k$ is collected as an element of the set $S_x$ of possible horizontal translations, where the two thresholds $T_p$ and $T_e$ are set to 5 and 2, respectively. Let $W_b$ be the width of the input image $I_b$. By examining different $k$ for all $|k| < W_b$, the set $S_x$ can be obtained.

On the other hand, let $P_a^h$ and $P_b^h$ be the sets of horizontal edge positions in $I_a$ and $I_b$, respectively. With $P_a^h$ and $P_b^h$, we can define a distance function $d_h$ as follows:

$$d_h(i,k) = \min_{1 \le j \le N_b^h} \left| P_a^h(i) - k - P_b^h(j) \right|, \qquad (5)$$

where $N_b^h$ is the number of elements in $P_b^h$. Let $H_b$ denote the height of the input image $I_b$. According to $d_h$, with a method similar to the one used to obtain $S_x$, the set $S_y$ of possible vertical translations can be obtained by examining different $k$ for all $|k| < H_b$. With $S_x$ and $S_y$, the set $S_{xy}$ of possible translations can be obtained as follows: $S_{xy} = \{(x,y) \mid x \in S_x,\ y \in S_y\}$.

Once $S_{xy}$ is obtained, we want to determine the best translation from $S_{xy}$ through a correlation technique. In this technique, two commonly used measures are the sum of intensity differences and the normalized cross-correlation, respectively defined as:

$$D(p,q) = \sum_{x,y=-K}^{K} \left| I_a(x+p_x, y+p_y) - \mu_a - I_b(x+q_x, y+q_y) + \mu_b \right|, \qquad (6)$$

and

$$C(p,q) = \frac{1}{\sigma_a \sigma_b (2K+1)^2} \sum_{x,y=-K}^{K} \left[ I_a(x+p_x, y+p_y) - \mu_a \right] \left[ I_b(x+q_x, y+q_y) - \mu_b \right], \qquad (7)$$

where $\mu_i$ and $\sigma_i$ are the local mean and variance of $I_i$, respectively, and $(2K+1)^2$ represents the area of the matching window. For efficiency, the measure $D(p,q)$ is preferred and adopted in this paper. It is well known that the computation of the sum of differences is time-consuming. To solve this problem, this paper uses a branch-and-bound (or pruning) technique to speed up the calculation of $D(p,q)$. First, a matrix is obtained by recording all the temporary values when accumulating the previous sum of differences. The matrix is then used to check the accumulated result when calculating the sum of differences. If the current result is larger than its corresponding threshold stored in this matrix, further accumulation toward the final sum of differences is not necessary. Since the set $S_{xy}$ is small and many unnecessary accumulations are avoided, the best translation can be obtained quickly. In addition, since many impossible translations have been filtered out in advance by edge alignment, the proposed method is better able to overcome the problem of image lighting changes. Fig. 3 shows the block diagram of this edge-based translation estimation algorithm. Details of the whole algorithm are summarized as follows.

3.1.1. Edge-based translation estimation algorithm
$I_a$ and $I_b$: two adjacent images prepared to be stitched.

Step 1. Apply a vertical edge detector to find the sets $P_a^v$ and $P_b^v$ of vertical edge positions from $I_a$ and $I_b$, respectively.
Step 2. Determine the set $S_x$ of possible horizontal translations from $P_a^v$ and $P_b^v$ based on $d_v(i,k)$ (see Eq. (4)).
Step 3. Apply a horizontal edge detector to find the sets $P_a^h$ and $P_b^h$ of horizontal edges from $I_a$ and $I_b$, respectively.
Step 4. Determine the set $S_y$ of possible vertical translations from $P_a^h$ and $P_b^h$ based on $d_h(i,k)$ (see Eq. (5)).
Step 5. Let $S_{xy}$ denote the set of possible translations, i.e. $S_{xy} = \{(x,y) \mid x \in S_x,\ y \in S_y\}$.
Step 6. Determine the best solution $(t_x, t_y)$ from $S_{xy}$ through a correlation technique and a branch-and-bound method.
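Steps 2, 4 and 5 of the algorithm above can be sketched in Python as follows. This is a minimal illustration, not the author's implementation: the function names `candidate_shifts` and `candidate_translations` are hypothetical, edge positions are assumed to be given as plain 1D coordinate lists, and $d_v$ is assumed to have the same form as $d_h$ in Eq. (5). Step 6 (picking the best translation by correlation) is not covered.

```python
import numpy as np

def candidate_shifts(pa, pb, extent, td=4.0, te=2.0, tp=5):
    """Collect shifts k with |k| < extent that align enough edge positions.

    pa, pb : 1D edge positions of the two images (assumed input format).
    td, te, tp : the thresholds T_d, T_e, T_p with the values quoted in the text.
    """
    pa = np.asarray(pa, dtype=float)
    pb = np.asarray(pb, dtype=float)
    shifts = []
    for k in range(-extent + 1, extent):
        # d(i, k) = min_j |pa[i] - k - pb[j]|, following the form of Eq. (5)
        d = np.abs(pa[:, None] - k - pb[None, :]).min(axis=1)
        close = d[d < td]  # the N_p elements whose distance is below T_d
        # keep k when enough edges agree (N_p >= T_p) and their mean distance is small (E_k <= T_e)
        if close.size >= tp and close.mean() <= te:
            shifts.append(k)
    return shifts

def candidate_translations(pva, pvb, pha, phb, width, height):
    """Build S_xy as the Cartesian product of S_x and S_y."""
    sx = candidate_shifts(pva, pvb, width)   # possible horizontal translations
    sy = candidate_shifts(pha, phb, height)  # possible vertical translations
    return [(x, y) for x in sx for y in sy]
```

Because the thresholds tolerate small misalignments, several neighbouring shifts usually survive, which is why a correlation step is still needed afterwards to select a single translation.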
When the translation $(t_x, t_y)$ is found, the $M$ of Eq. (1) can be set as: $m_0 = 1$, $m_1 = 0$, $m_2 = -t_x$, $m_3 = 0$, $m_4 = 1$, $m_5 = -t_y$, $m_6 = 0$, and $m_7 = 0$.

3.2. Motion parameter estimation by feature matching

As described in Fig. 1, two strategies are used to find respective initial estimates of camera parameters for the further optimization process. In this section, details of the correspondence-based method are described. In Section 3.2.1, we propose a new method to extract a set of useful feature points from images based on edges. Then, details of building correspondences between features are described in Section 3.2.2. However, due to noise, many false matches will also be generated. In Section 3.2.3, a new scheme is proposed to eliminate all impossible false matches.

3.2.1. Feature extraction
In this section, we use several edge operators to extract a set of useful feature points as keys to derive the desired registration parameters. First of all, let $G^\sigma(x,y)$ denote the 2D Gaussian smoothing function:

$$G^\sigma(x,y) = \exp\left( -\frac{x^2 + y^2}{2\sigma^2} \right),$$

where $\sigma$ is the standard deviation of the associated probability distribution. Let $G_x^\sigma(x,y)$ and $G_y^\sigma(x,y)$ denote the first partial derivatives of $G^\sigma$ in the $x$ and $y$ directions, respectively, i.e.

$$G_x^\sigma(x,y) = -\frac{x}{\sigma^2} \exp\left( -\frac{x^2 + y^2}{2\sigma^2} \right) \quad \text{and} \quad G_y^\sigma(x,y) = -\frac{y}{\sigma^2} \exp\left( -\frac{x^2 + y^2}{2\sigma^2} \right).$$

The gradients of an image $I(x,y)$ smoothed by $G^\sigma(x,y)$ at scale $\sigma$ in the $x$ and $y$ directions can then be defined, respectively, as:

$$I_x^\sigma(x,y) = I * G_x^\sigma(x,y) \quad \text{and} \quad I_y^\sigma(x,y) = I * G_y^\sigma(x,y),$$

where $*$ denotes convolution. Then, the modulus of the gradient vector of $I(x,y)$ is:

$$|\nabla I^\sigma(x,y)| = \sqrt{ |I_x^\sigma(x,y)|^2 + |I_y^\sigma(x,y)|^2 }.$$

If all the local maxima of $|\nabla I^\sigma(x,y)|$ are located and thresholded with a preset value, then all edge points of $I(x,y)$ at scale $\sigma$ can be detected. Since we are interested in some specific feature points for image stitching, additional constraints have to be introduced. Basically, this paper defines a feature point as one whose edge response is the strongest within a local area. In addition, in order to suppress the effect of noise, $\sigma$ is set to 2. The two conditions adopted here for judging whether a point $P(x,y)$ is a feature point are summarized as follows:

Condition 1. $P(x,y)$ must be an edge point of the image $I(x,y)$. This means that $P(x,y)$ is a local maximum of $|\nabla I^\sigma(x,y)|_{\sigma=2}$ and $|\nabla I^\sigma(x,y)|_{\sigma=2} >$ a threshold, i.e. 20;
Condition 2. $|\nabla I^\sigma(x,y)|_{\sigma=2} = \max_{(x',y') \in N_p} \{ |\nabla I^\sigma(x',y')|_{\sigma=2} \}$, where $N_p$ is a neighborhood of $P(x,y)$ within a 27 × 27 window.

3.2.2. Correspondence establishment
In Section 3.2.1, we have described how the feature points of $I_a(x,y)$ and $I_b(x,y)$ are derived. Now, we are ready to find the matching pairs between $I_a$ and $I_b$. Let $FP_{I_a} = \{p_i = (p_i^x, p_i^y)\}$ and $FP_{I_b} = \{q_i = (q_i^x, q_i^y)\}$ be two sets of feature points extracted from the two images $I_a$ and $I_b$, respectively. In addition, $N_{f_a}$ and $N_{f_b}$ represent the number of elements in $FP_{I_a}$ and $FP_{I_b}$, respectively. The similarity between two feature points $p$ and $q$ is measured by their normalized cross-correlation. For each point $p_i$ in $FP_{I_a}$, find the maximum peak of the similarity measure as its best matching point $q$ in the other image $I_b$. Then, a pair $\{p_i \leftrightarrow q_i\}$ is qualified as a matching pair if two conditions are satisfied:

$$C_{I_a I_b}(p_i, q_i) = \max_{q_k \in FP_{I_b}} C_{I_a I_b}(p_i, q_k) \quad \text{and} \quad C_{I_a I_b}(p_i, q_i) \ge T_c, \qquad (8)$$

where $T_c = 0.75$. The first condition requires finding the feature point $q_k \in FP_{I_b}$ for which the measure $C_{I_a I_b}$ is maximized. The second condition forces the value $C_{I_a I_b}$ of a matching pair to be larger than a threshold (0.75 in this case).

3.2.3. Eliminating false matches
In the previous section, a set of matching pairs has been extracted through matching. However, if the relative geometries of features are considered, the matching results can be refined further. Therefore, in this section, we define a matching goodness for refining the matching results. Let $MP_{I_a,I_b} = \{p_i \leftrightarrow q_i\}_{i=1,2,\ldots}$ be the set of matching pairs, where $p_i$ is an element in $FP_{I_a}$ and $q_i$ another element in $FP_{I_b}$. Let $Ne_{I_a}(p_i)$ and $Ne_{I_b}(q_i)$ denote the neighbors of $p_i$ and $q_i$ within a disc of radius $R$, respectively, where $R$ is set to 200 in this paper. Assume that $NP_{p_i q_j} = \{n_k^1 \leftrightarrow n_k^2\}_{k=1,2,\ldots}$ is the set of matching pairs, where $n_k^1 \in Ne_{I_a}(p_i)$, $n_k^2 \in Ne_{I_b}(q_j)$, and all elements of $NP_{p_i q_j}$ belong to $MP_{I_a,I_b}$. The proposed method is based on the concept that if $\{p_i \leftrightarrow q_i\}$ and $\{p_j \leftrightarrow q_j\}$ are two good matches, the relation between $p_i$ and $p_j$ should be similar to the one between $q_i$ and $q_j$. Based on this assumption, we can measure the goodness of a matching pair $\{p_i \leftrightarrow q_i\}$ according to how many matches $\{n_k^1 \leftrightarrow n_k^2\}$ in $NP_{p_i q_i}$ have a distance $d(p_i, n_k^1)$ similar to the distance $d(q_i, n_k^2)$, where $d(u_i, u_j) = \|u_i - u_j\|$ is the Euclidean distance between two points $u_i$ and $u_j$. With this concept, the measure of goodness for a match $\{p_i \leftrightarrow q_i\}$ can be defined as:

$$G_{I_a I_b}(i) = \sum_{\{n_k^1 \leftrightarrow n_k^2\} \in NP_{p_i q_i}} \frac{C(n_k^1, n_k^2)\, r(i,k)}{1 + dist(i,k)},$$
296 J.-W. Hsieh / Image and Vision Computing 22 (2004) 291–306
where $dist(i,k) = [d(p_i, n_k^1) + d(q_i, n_k^2)]/2$, $C(n_k^1, n_k^2)$ is the correlation measure between $n_k^1$ and $n_k^2$,

$$r(i,k) = \begin{cases} e^{-u(i,k)/T_1}, & \text{if } u(i,k) < T_2, \\ 0, & \text{otherwise}, \end{cases}$$

with the two predefined thresholds $T_1$ and $T_2$, and

$$u(i,k) = \frac{|d(p_i, n_k^1) - d(q_i, n_k^2)|}{dist(i,k)}.$$

The contribution of a pair $\{n_k^1 \leftrightarrow n_k^2\}$ in $NP_{p_i q_i}$ monotonically decreases with the value of $dist(i,k)$. Besides, if the value of $u(i,k)$ is larger than the threshold $T_2$, the contribution of $\{n_k^1 \leftrightarrow n_k^2\}$ is set to zero.

After calculating the goodness of each pair $\{p_i \leftrightarrow q_i\}$ in $MP_{I_a,I_b}$, we can use their relative goodness $G_{I_a I_b}(i)$ to further eliminate false matches. Assume $\bar{G}$ is the average value of $G_{I_a I_b}(i)$ over all matching pairs. If the value of $G_{I_a I_b}(i)$ is less than $0.75\,\bar{G}$, the matching pair $\{p_i \leftrightarrow q_i\}$ is eliminated.

After eliminating impossible false matches, a set $MP_r$ of remaining pairs is obtained from $MP_{I_a,I_b}$. Clearly, if four correct matching pairs can be selected from $MP_r$, the desired solution $M$ can be found by solving the following equation:

$$A M^T = b, \qquad (9)$$

where

$$A = \begin{bmatrix}
x_1 & y_1 & 1 & 0 & 0 & 0 & -x'_1 x_1 & -x'_1 y_1 \\
x_2 & y_2 & 1 & 0 & 0 & 0 & -x'_2 x_2 & -x'_2 y_2 \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\
x_4 & y_4 & 1 & 0 & 0 & 0 & -x'_4 x_4 & -x'_4 y_4 \\
0 & 0 & 0 & x_1 & y_1 & 1 & -y'_1 x_1 & -y'_1 y_1 \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\
0 & 0 & 0 & x_4 & y_4 & 1 & -y'_4 x_4 & -y'_4 y_4
\end{bmatrix}, \quad
b = \begin{bmatrix} x'_1 \\ x'_2 \\ \vdots \\ x'_4 \\ y'_1 \\ \vdots \\ y'_4 \end{bmatrix},$$

and $\{(x_k, y_k)^t \leftrightarrow (x'_k, y'_k)^t\}_{k=1,\ldots,4}$ is the set of four selected pairs. Eq. (9) can be solved by using the Householder transform [15]. However, since $MP_r$ still has some false matching pairs, it is not guaranteed that four correct pairs will always be selected. In what follows, a Monte-Carlo style method is proposed to find a good initialization of the camera parameters through a series of tries and tests.

3.3. Motion parameter estimation using Monte Carlo method

In Sections 3.1 and 3.2, two different strategies have been proposed to obtain different motion parameters from different views. In this section, a Monte-Carlo-style method is proposed for integrating these methods together for the further optimization process.

The spirit of the Monte Carlo method is to use many tries to find (or hit) the wanted correct solution. Assume each try can generate a solution and the probability of finding a correct solution in each try is $r$. After $k$ tries, the probability of continuous failure to find a correct solution is $s = (1 - r)^k$. Clearly, even though $r$ is very small, after hundreds or thousands of tries $s$ will be very close to zero. In other words, if we define a try as a random selection of four matching pairs, each try will generate a solution by solving Eq. (9). Then, it can be expected that a correct solution $M$ will be obtained after hundreds or thousands of tries.

For each try, four matching pairs are selected to obtain one possible solution of Eq. (1). If $MP_r$ has $N_r$ elements and $N_c$ of them are correct, the probability of selecting four correct pairs in each try is

$$\frac{N_c (N_c - 1)(N_c - 2)(N_c - 3)}{N_r (N_r - 1)(N_r - 2)(N_r - 3)}.$$

In what follows, a method is proposed to improve the probability that each try finds a correct solution by separating images into grids. Assume all the correct and false matching pairs are distributed randomly. Then, if the input images are segmented into several grids, in each grid the probability of selecting a correct matching pair is still $N_c / N_r$. Therefore, we can select four different grids first and then take one matching pair from each grid. With this method, the probability of selecting four correct matching pairs becomes $N_c^4 / N_r^4$. Clearly,

$$\frac{N_c (N_c - 1)(N_c - 2)(N_c - 3)}{N_r (N_r - 1)(N_r - 2)(N_r - 3)} < \frac{N_c^4}{N_r^4}$$

if $N_c < N_r$. Thus, the suggested method enhances the hit rate of finding four correct matching pairs to derive the desired parameters.

On the other hand, since the Monte Carlo method uses many tries to find the final desired solution, a verification process is needed to determine which try is the best.
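As an illustration of Eq. (9), the sketch below builds the 8 × 8 system for four point pairs and solves it for $(m_0, \ldots, m_7)$. It is only a sketch under stated assumptions: the helper names `solve_projective` and `apply_projective` are hypothetical, NumPy's general linear solver `np.linalg.solve` stands in for the Householder transform cited in the text, and the coefficient of $m_2$ and $m_5$ is taken as 1 so that the system agrees with the mapping used in Eq. (10).

```python
import numpy as np

def solve_projective(src, dst):
    """Solve A m = b (the system of Eq. (9)) from four point pairs.

    src, dst : sequences of four (x, y) points; dst holds the primed coordinates.
    Returns the parameter vector m = (m0, ..., m7).
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    x, y = src[:, 0], src[:, 1]
    xp, yp = dst[:, 0], dst[:, 1]
    zeros, ones = np.zeros(4), np.ones(4)
    # first four rows: equations for x'; last four rows: equations for y'
    top = np.column_stack([x, y, ones, zeros, zeros, zeros, -xp * x, -xp * y])
    bottom = np.column_stack([zeros, zeros, zeros, x, y, ones, -yp * x, -yp * y])
    a = np.vstack([top, bottom])
    b = np.concatenate([xp, yp])
    # general LU-based solver, used here in place of a Householder-based one
    return np.linalg.solve(a, b)

def apply_projective(m, pts):
    """Map points with x' = (m0 x + m1 y + m2) / (m6 x + m7 y + 1), and likewise for y'."""
    pts = np.asarray(pts, dtype=float)
    x, y = pts[:, 0], pts[:, 1]
    w = m[6] * x + m[7] * y + 1.0
    return np.column_stack([(m[0] * x + m[1] * y + m[2]) / w,
                            (m[3] * x + m[4] * y + m[5]) / w])
```

A quick sanity check is to map four non-collinear points with a known parameter vector using `apply_projective` and confirm that `solve_projective` recovers it; the solve fails (singular matrix) when three of the four points are collinear.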
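The hit-rate argument for the grid partition can be checked numerically. The short sketch below compares the two probabilities from the text and estimates how many tries are needed before the failure probability $s = (1 - r)^k$ drops below a chosen level. The function names and the example counts ($N_c = 20$, $N_r = 50$) are illustrative assumptions, not values from the paper.

```python
from math import ceil, comb, log

def hit_rate_plain(nc, nr):
    """Probability that four pairs drawn without replacement from MP_r are all correct.

    comb(nc, 4) / comb(nr, 4) equals Nc(Nc-1)(Nc-2)(Nc-3) / [Nr(Nr-1)(Nr-2)(Nr-3)].
    """
    return comb(nc, 4) / comb(nr, 4)

def hit_rate_grid(nc, nr):
    """Probability when one pair is drawn from each of four grids: (Nc / Nr)^4."""
    return (nc / nr) ** 4

def tries_needed(r, fail=0.01):
    """Smallest k such that the failure probability (1 - r)^k is at most `fail`."""
    return ceil(log(fail) / log(1.0 - r))

plain = hit_rate_plain(20, 50)  # about 0.021
grid = hit_rate_grid(20, 50)    # 0.4 ** 4 = 0.0256
print(plain, grid, tries_needed(plain), tries_needed(grid))
```

For these example counts the grid scheme raises the per-try hit rate from roughly 0.021 to 0.0256, so fewer tries are needed to reach the same failure probability.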
Fig. 4. Intensity adjustment: (a) original images Ia and Ib ; (b) after adjusting, the intensities between Ia and Ib are getting closer.
Assume $M^i = (m_0^i, m_1^i, \ldots, m_7^i)$ is the solution obtained from the $i$th try. The verification can be achieved by counting how many matching pairs in $MP_r$ are consistent with $M^i$. Let $\{p \leftrightarrow q\}$ be a matching pair and let the consistency error $e(p, q, M^i)$ of this pair with respect to $M^i$ be defined as:

$$e(p, q, M^i) = \sqrt{ \left( q^x - \frac{m_0^i p^x + m_1^i p^y + m_2^i}{m_6^i p^x + m_7^i p^y + 1} \right)^2 + \left( q^y - \frac{m_3^i p^x + m_4^i p^y + m_5^i}{m_6^i p^x + m_7^i p^y + 1} \right)^2 }. \qquad (10)$$

For each matching pair $\{p_k \leftrightarrow q_k\}$ in $MP_r$, if $e(p_k, q_k, M^i) < T_e$, the pair $\{p_k \leftrightarrow q_k\}$ is said to be consistent with $M^i$, where $T_e$ is a threshold set to 6 for the consistency check. Based on Eq. (10), a counter $c(M^i)$ is used to record how many matching pairs in $MP_r$ are consistent with $M^i$. After several tries, the best solution $M$ can be obtained as follows:

$$M = \arg\max_{M^i} c(M^i). \qquad (11)$$

At initialization ($i = 0$), $M^0$ is obtained from the edge alignment approach (see Section 3.1). Based on the above description, details of the proposed method can be summarized as follows:

3.3.1. Integrated parameter estimation algorithm
MaxIterations: maximum number of iterations, set to 300 here.
$L$: grid dimension, set to 8 here. $M$: the desired solution.

Step 4. Let $i = 1$, $k = 0$, $C = c(M^0)$, and $M = M^0$. Repeat the following steps:
Step 4.1. Randomly generate four different real numbers $\{a_m\}_{m=1,\ldots,4}$ such that $0 \le a_m < 1$;
Step 4.2. Determine four different integers $\{b_m\}_{m=1,\ldots,4}$ satisfying $Pr(b_m) \le a_m < Pr(b_m + 1)$. If the set $\{b_m\}_{m=1,\ldots,4}$ fails to be found, go to Step 4.1;
Step 4.3. Obtain the set $S_P^i$ of four matching pairs by selecting one matching pair from the $b_m$th grid for $m = 1, 2, \ldots, 4$;
Step 4.4. Get the solution $M^i$ from $S_P^i$ by solving Eq. (9);
Step 4.5. Calculate $c(M^i)$, the number of matching pairs in $MP_r$ which are consistent with $M^i$;
Step 4.6. If $c(M^i) > C$ then $C = c(M^i)$ and $M = M^i$;
Step 4.7. $i = i + 1$; if $i <$ MaxIterations, go to Step 4.1.

3.4. Parameter refinement through optimization

With the Monte Carlo method, the best estimate $M$ can be found from $MP_r$. However, if an optimization process is applied, $M$ can be further refined. In Section 2, a method
Table 1
Two sets of synthetic camera motions used to generate the synthetic images of Figs. 6 and 7, respectively

Images                  Parameters   m0       m1       m2      m3       m4       m5       m6         m7
Synthetic image pair 1  Real values  1.0      0.1      -42.0   0.1      1.0      -20.0    0.0        0.0
                        Estimated    1.00003  0.09960  -41.35  0.09954  1.00034  -20.587  0.0000018  0.000001
Synthetic image pair 2  Real values  1.0      0.1      -40     -0.1     1.05     -90      0.0        0.0
                        Estimated    1.00005  0.0993   -39.01  -0.0998  1.05006  -89.5    0.0000078  0.000004

The estimated model parameters are shown in the rows labeled "Estimated".
for deriving the desired parameters has been described, which minimizes the discrepancy in intensities of all pixels between two images. In this section, instead of minimizing over the whole image, we describe a method that finds the desired parameters by minimizing errors only at the positions of feature points.
In Section 3.2, two sets of feature points, i.e. $FP_{I_a}$ and $FP_{I_b}$, have been extracted from two images $I_a$ and $I_b$,
Fig. 6. Stitching result of two synthetic temple images generated with the camera parameters $m_0 = 1.0$, $m_1 = 0.1$, $m_2 = -42$, $m_3 = 0.1$, $m_4 = 1.0$, $m_5 = -20$, $m_6 = 0$, and $m_7 = 0$; (a) and (b) are the pair of synthetic images and (c) is the stitching result.
respectively. For each point $p_i$ in $FP_{I_a}$ and $q_j$ in $FP_{I_b}$, according to Eq. (10) and $M$, if $e(p_i, q_j, M) < T_e$, we denote $\{p_i \leftrightarrow q_j\}$ as a new match. Then, after checking all elements in $FP_{I_a}$ and $FP_{I_b}$, a new set $MP_M$ of matching pairs can be obtained as:

$$MP_M = \{p_k \leftrightarrow q_k,\ k = 1, 2, \ldots, N_M\},$$

where $p_k \in FP_{I_a}$, $q_k \in FP_{I_b}$, and $e(p_k, q_k, M) < T_e$. Then, we can define an error function as:

$$F(M) = \sum_{k=1}^{N_M} e(p_k, q_k, M), \qquad (12)$$

where $\{p_k \leftrightarrow q_k\}$ is an element in $MP_M$. By calculating the gradient and Hessian matrix of $F$, $M$ can be updated with the iterative form:

$$M_{t+1}^T = M_t^T + (A + \lambda)^{-1} B, \qquad (13)$$

where $t$ is the iteration number,

$$[A]_{ij} = \sum_{k=1}^{N_M} \frac{\partial e_k}{\partial m_i} \frac{\partial e_k}{\partial m_j}, \qquad [B]_i = \sum_{k=1}^{N_M} e_k \frac{\partial e_k}{\partial m_i},$$

and $\lambda$ is a coefficient obtained by the Levenberg–Marquardt method [15]. The above minimization process quickly
Fig. 7. Stitching result of the synthetic 'White House' images generated with the camera parameters $m_0 = 1.0$, $m_1 = 0.1$, $m_2 = -40$, $m_3 = -0.1$, $m_4 = 1.05$, $m_5 = -90$, $m_6 = 0$, and $m_7 = 0$; (a) and (b) are the synthetic images and (c) is the stitching result.
converges, since only the coordinates of feature positions are considered in the minimization and the initial estimate of $M$ is very close to the final solution.

3.5. Blending technique for mosaic construction

where $|A|$ is the overlapping area of $I_a$ and $I_b$, $p_i$ a pixel in $I_a$, and $q_i$ its corresponding pixel in $I_b$. Assume $W_a$ and $W_b$ are the widths of $I_a$ and $I_b$, respectively. According to $\Delta I$, $W_a$ and $W_b$, the intensities of $I_a$ and $I_b$ will be adjusted as:
Fig. 8. Stitching results when different feature masks are used. (a) and (b) Pairs of results of feature extraction and matching. The sizes of the masks used are 15 × 15, 23 × 23, 35 × 35 and 51 × 51, respectively. (c) Stitching results obtained by stitching the corresponding pairs of images in (a) and (b).
Table 2
Estimation results of camera parameters obtained from pairs of images shown in Fig. 8 when different feature masks are used

               m0       m1       m2      m3       m4       m5       m6         m7
True values    1.0      0.1      -42.0   0.1      1.0      -20.0    0.0        0.0
15 × 15 mask   1.00001  0.09999  -41.55  0.09997  1.00024  -20.224  0.0000019  0.000002
23 × 23 mask   1.00002  0.09989  -41.15  0.09955  1.00025  -20.157  0.0000017  0.000001
27 × 27 mask   1.00003  0.09960  -41.35  0.09954  1.00034  -20.587  0.0000018  0.000001
35 × 35 mask   0.99995  0.09998  -42.01  0.09994  1.00023  -19.998  0.0000016  0.000001
43 × 43 mask   0.99991  0.09999  -41.45  0.09996  1.00034  -20.614  0.0000012  0.000004
51 × 51 mask   0.99991  0.09999  -41.45  0.09996  1.00034  -20.614  0.0000012  0.000004
be obtained by:

$$I_c(r_i) = \frac{d_b^e\, I_a(p_i) + d_a^e\, I_b(q_i)}{d_a^e + d_b^e}, \qquad (16)$$

where $d_a$ is the distance between $p_i$ and $l_a$, $d_b$ the distance between $q_i$ and $l_b$, and $e$ an exponential order for weighting. With Eq. (16), the intensities of $I_a$ will be gradually changed to those of $I_b$.

3.6. Complexity analysis

In order to understand the efficiency of the proposed method, in what follows, details of complexity analysis of
Fig. 9. Stitching result of a series of panoramic images. (a) Series of panoramic images. (b) Stitching result.
each proposed algorithm will be given. Assume all input images have the same dimension $N_I \times N_I$. Then, the complexity of extracting edge features is $O(N_I^2)$ and thus the proposed edge alignment method has the complexity $O(N_I^2)$. As to the feature matching method, the time complexity of feature extraction is $O(N_I^2)$ and the number of feature points will increase according to the order $O(N_I^2)$. However, since the feature points used are extracted along edges and constrained by a window mask, the number of feature points can be controlled to increase with the order $O(N_I)$. Thus, the complexity of correlation matching is $O(K^2 N_I^2)$, where $K^2$ is the mask size for the correlation calculation (Eq. (7)). As to the algorithm to eliminate false matches, since the number of matching pairs will increase according to the order $O(N_I^2)$, the complexity to eliminate false matches is $O(R^2 N_I^2)$, where $R$ is the radius of the disc used to calculate the goodness of a feature point. However, the number of feature points located within the radius $R$ of a feature point is less than $K^2$. Therefore, the complexity to eliminate false matches is $O(K^2 N_I^2)$. As to the Monte Carlo method, its complexity depends on the number of iterations, i.e. $O(\mathrm{MaxIterations})$. Since $K^2 N_I^2 \gg \mathrm{MaxIterations}$, the complexity to find an initial solution of $M$ (see Eq. (1)) through feature matching is still $O(K^2 N_I^2)$. Then, the scheme that combines the edge alignment and the feature matching method to obtain a good initial solution of $M$ is still $O(K^2 N_I^2)$.

As to the final optimization method (Eq. (13)), the cost of each iteration to refine the desired camera parameters is a fraction of the image size, i.e. $r N_I^2$, where $r \ll 1$. Thus, the optimization method has the complexity $O(t_{max}\, r N_I^2)$, where $t_{max}$ is the maximum number of iterations. Thereby, the total complexity of finding the desired camera parameters is $O((t_{max}\, r + K^2) N_I^2)$. Since the optimization is focused on feature points, the term $t_{max}\, r$ will be less than $K^2$. In addition, the proposed blending technique has the time complexity $O(N_I^2)$. Therefore, the proposed algorithm for mosaic construction and object detection has the time complexity $O(K^2 N_I^2)$.

4. Experimental results

verified by comparing the differences between the estimated camera parameters and the true ones.

Fig. 6 shows the stitching result of the synthetic temple image. The red symbol, '+', indicates the positions of located feature points. Symbols with the same index in (a) and (b) denote a matching pair. From (c), clearly, even though the displacement between (a) and (b) is larger, the proposed method still works well to stitch them together. Fig. 7 is another result of synthetic images, i.e. the White House. Although the overlapping area between these images is small, the proposed method still successfully stitches them together. On the other hand, in order to examine the robustness and sensitivity of the proposed feature extraction and matching method, we used several masks with different sizes to locate different features for matching (see Section 3.2.1). If the number of feature points is too large or too small, considerable errors in feature matching will be produced. In this experiment, ten masks with different sizes are used, i.e. 15 × 15, 19 × 19,
Fig. 11. Stitching result when images have moving objects. (a and b) Two adjacent images with moving objects. (c) Stitching result.
Fig. 12. Stitching result when the camera has some rotation change. (a and b) Two adjacent images. (c) Stitching result.
23 × 23, …, and 51 × 51. For this examination, the edge alignment method (see Section 3.1) is not used. Fig. 8(a) and (b) show the results of feature extraction and matching, listed according to the sizes 15 × 15, 23 × 23, 35 × 35 and 51 × 51, respectively. Table 2 shows details of the camera parameters estimated when these masks are used. Clearly, even though different masks are used, proper matching pairs can still be found and thus the desired camera parameters can be estimated very accurately. The proposed method failed to stitch images when a 55 × 55 mask was used, since too few features were extracted for matching.

Fig. 9 shows the result of mosaic construction when a series of panoramic images is used. In this case, before stitching, all the images are projected onto a cylindrical map [5]. Then, only the translation parameters need to be estimated. Fig. 10 shows the case when images have larger intensity differences: (a) and (b) are the original images and (c) is the stitching result. Large lighting changes lead to instability of feature matching in traditional matching techniques like block matching or phase correlation. However, in this paper, the proposed edge alignment algorithm tries to find all possible translations by checking the consistency of edge positions instead of comparing the intensity similarity of images. Therefore, even though images have larger lighting changes, the proposed method still works well to find all desired camera parameters for stitching. Fig. 11 shows the case when images have some moving objects. The moving objects disturb image stitching. However, the proposed method still successfully stitches the images together. Fig. 12 shows the result when images have some rotation and skewing effects. In this case, the proposed Monte Carlo method still works well to find the correct camera parameters.

The proposed method can also be used for camera compensation when extracting moving objects from a video sequence. Fig. 13 shows two frames taken from a movie. In order to detect the moving object, a static background should be constructed. With the proposed method, the camera motion between Fig. 13(a) and (b) can be found and compensated. Fig. 13(c) is the mosaic of Fig. 13(a) and (b). Then, the moving object can be detected by image differencing, as in Fig. 13(d). The detection result is very useful for various applications such as intelligent transportation systems, video indexing, and video surveillance. Fig. 14 is another case, in which a moving car appears in the video sequence. From the experimental results, it is clear that the proposed method is an efficient, robust, and accurate method for image stitching.
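The image differencing step mentioned above can be sketched as follows. This is a minimal illustration that assumes the two frames are grayscale arrays already warped into the same mosaic coordinate frame (i.e. the camera motion has been compensated). The function name `moving_object_mask` and the threshold value of 30 are hypothetical choices, not taken from the paper.

```python
import numpy as np

def moving_object_mask(frame_a, frame_b, threshold=30):
    """Mark pixels whose absolute intensity difference exceeds a threshold.

    frame_a, frame_b : aligned grayscale images of identical shape.
    Returns a boolean mask that is True where the frames differ strongly.
    """
    # cast to a signed type so the subtraction of uint8 images cannot wrap around
    a = np.asarray(frame_a, dtype=np.int32)
    b = np.asarray(frame_b, dtype=np.int32)
    return np.abs(a - b) > threshold
```

In practice the raw mask is usually noisy along misaligned edges, so some post-processing (for example removing very small connected regions) would typically follow; that step is outside what the text describes.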
Fig. 13. Mosaic construction and object detection when images have a moving object: (a and b) are two adjacent images; (c) is the mosaic result of (a) and (b);
(d) is the object detection result by image differencing.
Fig. 14. Mosaics and object detection when images have a moving object: (a and b) are two adjacent images; (c) is the mosaic result of (a) and (b); (d) is the
object detection result by image differencing.