
On the Design and Performance of ELC and APB Algorithms for the Reconstruction of Shredded Documents

R. Lotus (1), Justin Varghese (2)
(1) Centre for Information Technology & Engineering, Manonmaniam Sundaranar University, Tirunelveli, Tamil Nadu, India
(2) College of Computer Science, King Khalid University, Abha, Saudi Arabia

Abstract: Reconstruction of hand-torn paper documents is a challenging task in forensic and investigative sciences. In this paper, the design aspects of two important reconstruction algorithms for hand-torn documents, proposed by Edson Justino, Luiz S. Oliveira and Cinthia Freitas (ELC) and by Arindam Biswas, Partha Bhowmick and Bhargab B. Bhattacharya (APB), are analysed, their reconstruction results are compared, and the merits of each algorithm are identified.
I. INTRODUCTION
Documents store, organize and elucidate information for the education, enlightenment and enrichment of civilization. Reconstruction of torn documents is essential for recovering that information and has wide application in forensic science, art conservation and archaeology [10]. Documents deteriorate due to insects, moisture, temperature, humidity, constant handling, obliteration and shredding. Shredding can be performed by machine or by hand. Manual reconstruction of a shredded document is time consuming and demands the effort of experienced personnel. Digitization makes the job easier, and automating the reconstruction through image processing algorithms offers an effective solution.

Wolfson [1] proposed an efficient algorithm for matching two curves, applicable to puzzle solving, in which boundaries are represented by shape feature strings obtained by polygonal approximation. Kong and Kimia [2] resampled the boundaries using polygonal approximation to reduce the complexity of curve matching, and used dynamic programming to align fragments. Recently, fields such as art conservation and archaeology have adopted jigsaw puzzle solving techniques to reconstruct wall paintings of ancient buildings [12], [6] and pottery fragments [11], [7]. Justino et al. [5] proposed an algorithm to reconstruct hand-shredded paper documents in which features extracted from a simplified polygon determine the matching pieces. Biswas et al. [8] proposed a method that uses the chain code of the contours of the fragments and its Minkowski sum for reconstruction. Pimenta et al. [9] proposed an algorithm to reconstruct hand-shredded paper documents in which the extracted features of the simplified polygon are fed into a longest common subsequence (LCS) dynamic programming algorithm; the LCS scores are then used in a modified Prim's algorithm to determine the matching fragments. Zhu et al. [14] proposed a global approach to the reconstruction of ripped-up documents.
This work analyses the design aspects of two important hand-torn document reconstruction algorithms, proposed by Edson Justino, Luiz S. Oliveira and Cinthia Freitas [5] (ELC algorithm) and by Arindam Biswas, Partha Bhowmick and Bhargab B. Bhattacharya [8] (APB algorithm), compares their reconstruction results, and describes the merits of the two algorithms. The paper is organised into five sections. Section II explains the general methodology for the reconstruction of hand-torn documents. Section III describes the design aspects of the ELC and APB reconstruction algorithms. Experimental results and a comparative analysis are provided in Section IV. Section V concludes the paper.
II. GENERAL METHODOLOGY FOR THE RECONSTRUCTION OF SHREDDED DOCUMENTS
A general methodology for the reconstruction of shredded documents is illustrated in Fig. 1. It consists of pre-processing, feature extraction and reconstruction stages. Pre-processing the scanned images of the torn fragments is essential for extracting adequate and effective features; typical pre-processing operations are contour tracing [8] and contour simplification [5]. The features extracted from the pre-processed fragment images, such as the distances, angles and colours at the vertices, depend on the reconstruction principle of the particular algorithm. Fragments that satisfy the matching criteria are merged to form the reconstructed original document.
Fig. 1. General methodology for document reconstruction. Flowchart blocks: scanned images of input fragments; pre-processing; feature extraction; matching criteria (yes/no); merge the two fragments and form a new fragment; reconstructed document.
III. ELC ALGORITHM
The ELC algorithm was proposed by Edson Justino, Luiz S. Oliveira and Cinthia Freitas for the reconstruction of hand-shredded documents. The stages of the algorithm are explained in the following steps:
Step 1: Pre-processing. The scanned images $I_f$ ($f = 1, 2, \ldots, n$, where $n$ is the total number of fragments) of the hand-shredded document fragments have irregular boundaries. The Douglas-Peucker (DP) polyline simplification algorithm [3] is applied to the contours $C_f$ of the fragment images $I_f$ to obtain a well-defined, simplified polyline boundary with fewer irregularities. The simplification reduces the number of vertices in each fragment's contour and produces a simplified polygon that approximates the original contour shape. DP is based on the closeness of the contour vertices to the edges of the polygon. The simplification starts with an initial edge segment joining the first vertex $v_1(i_1, j_1)$ and the last vertex $v_l(i_l, j_l)$, and the intermediate vertices are checked for closeness to that segment. The contour vertex that lies farthest from the segment, at a distance exceeding the specified tolerance, is retained as a vertex of the simplified polygon and splits the segment in two; vertices that lie within the tolerance are discarded. The procedure is repeated on the resulting segments until all remaining contour vertices fall within the specified tolerance. Finally, the retained vertices form a polygon $P$ that approximates the original contour shape.
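The simplification described in Step 1 can be sketched as follows. This is a minimal illustration of the standard recursive Douglas-Peucker procedure on an open polyline, not the authors' Matlab implementation; the function names, the sample contour and the tolerance value are assumptions made for the example. A closed contour would first have to be split into open polylines.

```python
import math

def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        # Degenerate segment: fall back to point-to-point distance.
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the projection to its ends.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def douglas_peucker(points, tolerance):
    """Simplify an open polyline, keeping vertices farther than tolerance."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, dmax = 0, 0.0
    for k in range(1, len(points) - 1):
        d = point_segment_distance(points[k], first, last)
        if d > dmax:
            index, dmax = k, d
    if dmax <= tolerance:
        # Every intermediate vertex is within tolerance: discard them all.
        return [first, last]
    # Keep the farthest vertex and simplify both halves recursively.
    left = douglas_peucker(points[:index + 1], tolerance)
    right = douglas_peucker(points[index:], tolerance)
    return left[:-1] + right

# Hypothetical noisy L-shaped boundary (pixel coordinates).
contour = [(0, 0), (1, 0.05), (2, 0), (3, 0.02), (4, 0),
           (4, 1), (4.03, 2), (4, 3)]
simplified = douglas_peucker(contour, tolerance=0.5)
print(simplified)
```

A larger tolerance removes more vertices and gives a coarser polygon, so the choice of tolerance directly affects the features extracted in the next step.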
Step 2: Feature extraction. The vertices of the pre-processed, simplified polygon $P_f$ are subjected to the feature extraction process. For each vertex $v_{fm}$ of the simplified polygon $P_f$, where $m = 1, 2, \ldots, t$ and $t$ is the total number of vertices of polygon $f$, the process computes the Euclidean distance $d_{v_p}$ from $v_{fm}$ to the previous vertex $v_{pr}$, the Euclidean distance $d_{v_n}$ from $v_{fm}$ to the next vertex $v_{ne}$, and the angle $\alpha_v$ between the edges to the previous and next vertices, such that

$$d_{v_p} = \sqrt{(x_i - x_{i-1})^2 + (y_i - y_{i-1})^2}, \qquad (1)$$

$$d_{v_n} = \sqrt{(x_i - x_{i+1})^2 + (y_i - y_{i+1})^2}, \qquad (2)$$

$$\alpha_v = \cos^{-1}\left(\frac{\vec{d}_{v_p} \cdot \vec{d}_{v_n}}{d_{v_p}\, d_{v_n}}\right), \qquad (3)$$

where $(x_i, y_i)$ are the coordinates of the current vertex, $(x_{i-1}, y_{i-1})$ are the coordinates of the previous vertex, $(x_{i+1}, y_{i+1})$ are the coordinates of the next vertex, and $\vec{d}_{v_p}$ and $\vec{d}_{v_n}$ are the edge vectors from the current vertex to the previous and next vertices, whose lengths are given by (1) and (2). Thus, for every vertex $v_{fm}$, the feature list contains the vertex coordinates, the distances to the previous and next neighbours, and the angle, as shown in Table 1.
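The per-vertex features of Step 2 can be computed as in the sketch below. It is an illustration under stated assumptions: the function name, the dictionary keys and the sample rectangle are hypothetical, the angle is returned in degrees, and consecutive vertices are assumed to be distinct. Because the inverse cosine only returns values between 0 and 180 degrees, a real implementation would also need the polygon orientation to tell reflex vertices apart, which this sketch does not attempt.

```python
import math

def vertex_features(polygon):
    """Feature rows for a closed polygon given as a list of (x, y) vertices."""
    t = len(polygon)
    rows = []
    for m, (x, y) in enumerate(polygon):
        xp, yp = polygon[m - 1]          # previous vertex (wraps around)
        xn, yn = polygon[(m + 1) % t]    # next vertex (wraps around)
        d_prev = math.hypot(x - xp, y - yp)   # distance to previous vertex
        d_next = math.hypot(x - xn, y - yn)   # distance to next vertex
        # Dot product of the edge vectors leaving the current vertex.
        dot = (xp - x) * (xn - x) + (yp - y) * (yn - y)
        # Clamp to [-1, 1] to guard against floating-point round-off.
        cos_a = max(-1.0, min(1.0, dot / (d_prev * d_next)))
        rows.append({
            "vertex": (x, y),
            "d_prev": d_prev,
            "d_next": d_next,
            "angle": math.degrees(math.acos(cos_a)),
        })
    return rows

# Hypothetical 4 x 3 rectangle: every corner should measure 90 degrees.
rectangle = [(0, 0), (4, 0), (4, 3), (0, 3)]
features = vertex_features(rectangle)
for row in features:
    print(row)
```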
Step 3: Matching. The features of the vertices extracted in the previous step determine the degree to which any two fragments being compared can match. The matching criterion requires that the sum of the angles at a vertex $v_{im}$ of the $i$th polygon ($1 \le i \le f$) and a vertex $v_{jm}$ of the $j$th polygon ($1 \le j \le f$, $i \ne j$) be equal to 360°. If the two angles sum to 360°, a matching parameter $w_{angles}$ is set to 1. The distances to the previous and next vertices $(v_{pr}, v_{ne})$ of the $m$th vertex $v_{im}$ of the $i$th fragment are then compared with those of the $m$th vertex $v_{jm}$ of the $j$th fragment. The fragments considered for matching are allotted a matching degree $w_{matching}$ such that

$$w_{matching} = \begin{cases} 1, & \text{if } \big(d_{v_{im}(pr)} = d_{v_{jm}(pr)} \text{ or } d_{v_{im}(ne)} = d_{v_{jm}(ne)}\big) \text{ and } w_{angles} = 1 \\ 5, & \text{if } \big(d_{v_{im}(pr)} = d_{v_{jm}(pr)} \text{ and } d_{v_{im}(ne)} = d_{v_{jm}(ne)}\big) \text{ and } w_{angles} = 1 \end{cases} \qquad (4)$$

$w_{matching}$ is increased if the polygons $P_i$ and $P_j$ of the fragments $i$ and $j$ under consideration satisfy certain degrees of matching, as follows:

$$w_{matching} = \begin{cases} w_{matching} + 2, & \text{if } 1/5 \text{ of the polygons' perimeter matches} \\ w_{matching} + 1, & \text{if } 1/10 \text{ of the polygons' perimeter matches} \\ w_{matching}, & \text{otherwise} \end{cases} \qquad (5)$$
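The scoring rules of Eqs. (4) and (5) can be written down as a small function. The sketch below is one possible reading rather than the published implementation: the feature dictionaries, the `perimeter_match_fraction` argument and the numerical tolerance `tol` are assumptions, a score of 0 is returned when neither case of Eq. (4) applies (the paper leaves that case unspecified), and the perimeter conditions are interpreted as "at least 1/5" and "at least 1/10".

```python
import math

def matching_weight(vi, vj, perimeter_match_fraction=0.0, tol=1e-6):
    """Score a pair of vertices; 0 means the pair is not a candidate match."""
    # w_angles = 1 only when the two angles sum to 360 degrees.
    if not math.isclose(vi["angle"] + vj["angle"], 360.0, abs_tol=tol):
        return 0
    prev_equal = math.isclose(vi["d_prev"], vj["d_prev"], abs_tol=tol)
    next_equal = math.isclose(vi["d_next"], vj["d_next"], abs_tol=tol)
    if prev_equal and next_equal:
        w = 5                      # both neighbouring distances agree
    elif prev_equal or next_equal:
        w = 1                      # only one neighbouring distance agrees
    else:
        return 0                   # assumption: case not covered by Eq. (4)
    # Bonus for the share of the polygons' perimeter that matches.
    if perimeter_match_fraction >= 1 / 5:
        w += 2
    elif perimeter_match_fraction >= 1 / 10:
        w += 1
    return w

# Hypothetical vertices: a 270-degree notch and a 90-degree corner.
notch = {"angle": 270.0, "d_prev": 3.0, "d_next": 4.0}
corner = {"angle": 90.0, "d_prev": 3.0, "d_next": 4.0}
print(matching_weight(notch, corner, perimeter_match_fraction=0.2))
```

With real scanned fragments exact equality of distances and angles is unlikely, so the tolerance would in practice be much larger than the default used here.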
Step 4: Reconstruction. Once the metric for finding matching fragments has been determined, the process enters the reconstruction phase. The algorithm compares each fragment with all other fragments and takes as the best match the pair that maximizes $w_{matching}$. The fragments with the maximum $w_{matching}$ are merged to form a new fragment. The features of the new fragment are added to the feature list, the merged vertices are removed, and the matching process continues; that is, if fragments $i$ and $j$ are merged to form a new fragment, the polygon $P_{ij}$ of the newly formed fragment is added to the fragment list and the whole reconstruction procedure is repeated from the first step for the remaining fragments.
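The merge loop of Step 4 amounts to a greedy pairwise procedure, sketched below. The `score` and `merge` callables are placeholders supplied by the caller (hypothetical names), and the toy fragments are integer runs rather than polygons, so the example shows only the control flow: pick the best-scoring pair, merge it, put the result back in the list, and repeat.

```python
from itertools import combinations

def best_pair(fragments, score):
    """Return (w, i, j) for the pair of fragment ids with the highest score."""
    return max((score(fragments[i], fragments[j]), i, j)
               for i, j in combinations(sorted(fragments), 2))

def reconstruct(fragments, score, merge):
    """Greedily merge the best-matching pair until no pair matches."""
    fragments = dict(fragments)
    next_id = max(fragments) + 1
    while len(fragments) > 1:
        w, i, j = best_pair(fragments, score)
        if w <= 0:
            break                  # no remaining pair satisfies the criteria
        # The merged fragment replaces its two parents in the list.
        fragments[next_id] = merge(fragments.pop(i), fragments.pop(j))
        next_id += 1
    return fragments

# Toy stand-ins: fragments are runs of integers that match when adjacent.
def toy_score(a, b):
    return 1 if a[-1] + 1 == b[0] or b[-1] + 1 == a[0] else 0

def toy_merge(a, b):
    return tuple(sorted(a + b))

pieces = {0: (1, 2), 1: (5,), 2: (3, 4)}
result = reconstruct(pieces, toy_score, toy_merge)
print(result)
```

Each pass compares every remaining pair, so the number of comparisons grows quadratically with the number of fragments.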
IV. APB ALGORITHM
The APB algorithm was proposed by Arindam Biswas, Partha Bhowmick and Bhargab B. Bhattacharya for the reconstruction of hand-shredded documents. The stages of the algorithm are explained in the following steps:
Step 1: Pre-processing. The contours $C_f$ of the scanned images $I_f$ of the $f$ torn fragments are extracted using differential operators [4]. The corners $v_m$ (vertices) of each contour are detected using the bending values [13] of the discrete points constituting $C_f$.
Step 2: Feature extraction. After pre-processing, a feature list is generated for all contours $C_f$. The feature list for the APB algorithm includes the distance between consecutive corners, taken in the clockwise direction. The distance between consecutive corners $v_m$ and $v_n$, with $n = (m + 1) \bmod f$, of an individual contour is calculated in the clockwise direction as

$$d_{mn} = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2}, \qquad (6)$$

where $(x_i, y_i)$ and $(x_j, y_j)$ are the coordinates of $v_m$ and $v_n$. The chain code ($cc$) of the edge segment between $v_m$ and $v_n$ of each fragment's contour is determined. The edge identification number $e_m$ and the length of the chain code of the edge are also recorded. With all these features, a single height-balanced AVL tree is constructed. The Euclidean distances of all edges in all contours are stored as primary keys, and the edge numbers $e_m$ and their respective feature lists are stored as auxiliary keys.
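The edge features of Step 2 can be illustrated with the sketch below. The paper does not say which chain code variant is used, so an 8-direction Freeman chain code is assumed here; the direction numbering, the function names and the record layout are likewise hypothetical.

```python
import math

# Assumed 8-connected Freeman directions: code k is the (dx, dy) step below.
DIRECTIONS = [(1, 0), (1, 1), (0, 1), (-1, 1),
              (-1, 0), (-1, -1), (0, -1), (1, -1)]
STEP_TO_CODE = {step: code for code, step in enumerate(DIRECTIONS)}

def chain_code(pixels):
    """Chain code of a path of 8-connected pixel coordinates."""
    return [STEP_TO_CODE[(x1 - x0, y1 - y0)]
            for (x0, y0), (x1, y1) in zip(pixels, pixels[1:])]

def edge_features(fragment_id, edge_id, pixels):
    """Feature record for the contour edge running between two corners."""
    (x0, y0), (x1, y1) = pixels[0], pixels[-1]
    cc = chain_code(pixels)
    return {
        "fragment": fragment_id,
        "edge": edge_id,
        "distance": math.hypot(x1 - x0, y1 - y0),  # corner-to-corner distance
        "chain_code": cc,
        "cc_length": len(cc),
    }

# Hypothetical edge traced pixel by pixel between two detected corners.
path = [(0, 0), (1, 0), (2, 1), (2, 2), (1, 2)]
record = edge_features(fragment_id=0, edge_id=0, pixels=path)
print(record)
```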
Step 3: Matching. After the AVL tree has been constructed, for each distance $d_{mn}$ of the edge segments of all contours, the tree is searched for distances that fall within the range $[d_{m,f} - \epsilon,\ d_{m,f} + \epsilon]$, where $d_{m,f}$ denotes the distance of edge $m$ of fragment $f$ and $\epsilon$ is about 4 to 5 pixels. The distances found are arranged in ascending order of their difference from the distance being compared.
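The range search of Step 3 can be sketched with a sorted list and binary search. Python's standard library has no AVL tree, so a sorted list with `bisect` is used here as a stand-in that supports the same kind of range query; this is a substitution, not the data structure used by the algorithm. The record layout `(distance, fragment_id, edge_id)`, the function names and the exclusion of edges from the same fragment are assumptions.

```python
from bisect import bisect_left, bisect_right

def build_index(edges):
    """Sort (distance, fragment_id, edge_id) records by distance."""
    return sorted(edges)

def candidates(index, fragment_id, distance, eps=5.0):
    """Edges of other fragments whose distance lies within distance +/- eps."""
    keys = [rec[0] for rec in index]   # recomputed per query for brevity
    lo = bisect_left(keys, distance - eps)
    hi = bisect_right(keys, distance + eps)
    # Assumption: an edge cannot match another edge of its own fragment.
    hits = [rec for rec in index[lo:hi] if rec[1] != fragment_id]
    # Closest distance first, as in the ascending array of differences.
    return sorted(hits, key=lambda rec: abs(rec[0] - distance))

# Hypothetical edge lengths in pixels.
edges = [(100.0, 0, 0), (52.0, 0, 1), (103.0, 1, 0),
         (98.5, 2, 3), (110.0, 1, 1), (99.0, 0, 2)]
index = build_index(edges)
matches = candidates(index, fragment_id=0, distance=100.0, eps=5.0)
print(matches)
```

The first element of the returned list plays the role of the best candidate edge examined first in the reconstruction step.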
Step 4: Reconstruction. In the reconstruction phase, the best match for the contour $C_f$ with edge number $e_m$ is the contour with edge number $e_{m,1}$, which corresponds to the first distance in the ascending array of distances. The Minkowski sum is defined for the edge numbers $(e_m, e_{m,1})$ by taking the union of the edge with a circular disc $D$ of a prescribed radius, which yields an envelope around the edge. The required transformations, such as translation and rotation, are applied to these contours. If the edge $e_{m,1}$ is contained inside the envelope, the fragments corresponding to the edge numbers $e_m$ and $e_{m,1}$ are matching pieces. Otherwise, the next candidate edge in the distance array is checked against $e_m$. If a match is found, a new fragment $C_{ij}$ replaces the matching fragments $C_i$ and $C_j$. The newly formed contour is added to the fragment list during the second iteration of reconstruction. The first iteration continues until the above steps have been carried out on all the remaining $f - 2$ contours.
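The envelope test of Step 4 can be approximated on discrete edge points as below. The Minkowski sum of an edge with a disc is the set of points within the disc radius of the edge, so containment reduces to a nearest-point distance check. The function names, the chord-based alignment and the sample edges are assumptions made for this sketch; in particular, the two edges are assumed to be densely sampled and traversed in the same direction, whereas a real implementation would have to reverse one of them when both contours are traced clockwise.

```python
import math

def align(edge_b, edge_a):
    """Rotate and translate edge_b so that its chord lies along edge_a's chord."""
    (ax0, ay0), (ax1, ay1) = edge_a[0], edge_a[-1]
    (bx0, by0), (bx1, by1) = edge_b[0], edge_b[-1]
    theta = (math.atan2(ay1 - ay0, ax1 - ax0)
             - math.atan2(by1 - by0, bx1 - bx0))
    c, s = math.cos(theta), math.sin(theta)
    # Rotate about the first point of edge_b, then move it onto edge_a's start.
    return [(ax0 + c * (x - bx0) - s * (y - by0),
             ay0 + s * (x - bx0) + c * (y - by0)) for x, y in edge_b]

def inside_envelope(edge_a, edge_b, radius):
    """True if every point of edge_b is within radius of some point of edge_a."""
    return all(min(math.hypot(bx - ax, by - ay) for ax, ay in edge_a) <= radius
               for bx, by in edge_b)

# Hypothetical edges: a horizontal edge and the same shape scanned vertically.
edge_a = [(x, 0) for x in range(11)]
edge_b = [(5, y) for y in range(11)]
diagonal = [(x, x) for x in range(11)]

aligned = align(edge_b, edge_a)
print(inside_envelope(edge_a, aligned, radius=2.0))
print(inside_envelope(edge_a, diagonal, radius=2.0))
```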
IV. EXPERIMENTAL RESULTS
The experiments on the reconstruction of shredded documents were implemented in Matlab on an Intel Core(TM) 2.53 GHz machine. The manually shredded test documents used for the ELC and APB algorithms are shown in Fig. 2 and Fig. 3. The shredded documents are reconstructed by the ELC and APB algorithms.
Both the ELC and APB algorithms are designed to reconstruct hand-torn documents. Fragments that have jigsaw-like edge segments [Fig. 5(a)] satisfy the matching criteria of the ELC algorithm, which requires that the angles at the two vertices under consideration sum to 360° and that the respective previous and next distances of the two fragments be equal. Natural shredding does not always produce fragments of the type that can be reconstructed under these criteria. Consequently, the fragments of test image 2, although they do belong together, are not merged by the ELC algorithm because they do not satisfy its matching criteria. The degrees of freedom allowed in matching some of the angles are explained imprecisely in the algorithm's description. Given the normal way in which people tear paper, the probability of obtaining hand-torn fragments of the kind shown in Fig. 6(a) is low. The ELC algorithm therefore cannot produce a reconstruction for the shredded document of test image 2.
The APB algorithm uses the distances between the corners of the edge segments of different fragments in its matching phase. Fragments whose edge distances differ by no more than a predefined threshold $\epsilon$ (4 to 5 pixels) are taken as matching candidates by the APB algorithm for merging. Matching fragments suffer from a variance of one or two pixels due to acquisition defects or pre-processing. Although the pre-processing steps of the APB algorithm introduce this one- or two-pixel variance, the distance matching criterion with the predefined threshold $\epsilon$ absorbs it. Since the reconstruction stage of the APB algorithm deals only with distance, it works with any type of hand-torn fragment.
Comparing the two algorithms, the ELC algorithm, although it passes through improved pre-processing stages, is limited to the reconstruction of specific fragments such as those illustrated in Fig. 6(a), owing to its strict angle matching criterion, whereas the APB algorithm reconstructs all kinds of hand-shredded documents, as shown in Fig. 6(a) and (b). The paper therefore suggests the APB algorithm as a general-purpose algorithm for the reconstruction of hand-torn document fragments.

Table 1. Feature list for the ELC algorithm

Vertex   Coordinate    d_{v_p}        d_{v_n}        \alpha_v
v_1      (x_1, y_1)    d_v(1, m)      d_v(1, 2)      \alpha_{v_1}
v_2      (x_2, y_2)    d_v(2, 1)      d_v(2, 3)      \alpha_{v_2}
v_3      (x_3, y_3)    d_v(3, 2)      d_v(3, 4)      \alpha_{v_3}
...      ...           ...            ...            ...
v_m      (x_m, y_m)    d_v(m, m-1)    d_v(m, 1)      \alpha_{v_m}

Fig. 6(b). Fragments that do not satisfy the matching criteria of the ELC algorithm.
V. CONCLUSION
This work analyses the design aspects and the reconstruction results of the ELC and APB algorithms for hand-torn fragments. The ELC algorithm yields a reconstruction only for fragments of a specific tearing style, whereas the APB algorithm reconstructs hand-torn fragments of all tearing styles. Future work will focus on a novel technique for the reconstruction of hand-torn fragments with lower time complexity.

REFERENCES
1. H. Wolfson, "On curve matching," IEEE Trans. Pattern Anal. and Machine Intell., vol. 12, pp. 483-489, 1990.
2. W. Kong and B. Kimia, "On solving 2D and 3D puzzles under curve matching," in CVPR, 2001, pp. 583-590.
3. D. Douglas and T. Peucker, "Algorithms for the reduction of the number of points required to represent a digitized line or its caricature," The Canadian Cartographer, vol. 10, pp. 112-122, 1973.
4. R. C. Gonzalez and R. E. Woods, Digital Image Processing, Addison-Wesley Pub. Co., 1993.
5. E. Justino, L. S. Oliveira, and C. Freitas, "Reconstructing shredded documents through feature matching," Forensic Science Intern., vol. 160, 2005.
6. C. Papaodysseus, T. Panagopoulos, M. Exarhos, C. Triantafillou, D. Fragoulis, and C. Doumas, "Contour-shape based reconstruction of fragmented, 1600 B.C. wall paintings," IEEE Trans. Signal Processing, vol. 50, 2002, pp. 1277-1288.
7. H. C. G. Leitão and J. Stolfi, "A multiscale method for the reassembly of two-dimensional fragmented objects," IEEE Trans. PAMI, vol. 24, 2002, pp. 1239-1251.
8. A. Biswas, P. Bhowmick, and B. B. Bhattacharya, "Reconstruction of torn documents using contour maps," in Proc. IEEE Int. Conf. on Image Processing, 2005.
9. A. Pimenta, E. Justino, L. S. Oliveira, and R. Sabourin, "Document reconstruction using dynamic programming," IEEE, Acoustics, Speech and Signal Processing, 2009.
10. F. Kleber and R. Sablatnig, "A survey of techniques for document and archaeology artefact reconstruction," IEEE, Document Analysis and Recognition, 2009.
11. M. Kampel and R. Sablatnig, "On 3D mosaicing of rotationally symmetric ceramic fragments," IEEE, 2004.
12. C. Papaodysseus, M. Exarhos, M. Panagopoulos, P. Rousopoulos, C. Triantafillou, and T. Panagopoulos, "Image and pattern analysis of 1650 B.C. wall paintings and reconstruction," IEEE Trans. Systems, Man and Cybernetics, vol. 38, no. 4, July 2008.
13. M.-J. J. Wang, W.-Y. Wu, L.-K. Huang, and D.-M. Wang, "Corner detection using bending value," Patt. Rec. Letters, 1995, pp. 575-583.
14. L. Zhu, Z. Zhou, and D. Hu, "Globally consistent reconstruction of ripped-up documents," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 30, no. 1, January 2008.
Fig. 2. Test image 1.
Fig. 3. Test image 2.
Fig. 4. Reconstruction result of the ELC algorithm (test image 1).
Fig. 5. Reconstruction results of the APB algorithm (test image 1 and test image 2).
Fig. 6(a). Fragments that satisfy the matching criteria of the ELC algorithm.
