
Neurocomputing 237 (2017) 272–280


Broken and degraded document images binarization


Yiping Chen a,b, Liansheng Wang a,⁎

a Department of Computer Science, School of Information Science and Engineering, Xiamen University, Xiamen, China
b School of Electronic Science and Engineering, National University of Defense Technology, Changsha 410073, China

ARTICLE INFO

Communicated by X. Li

Keywords: Binarization; Thresholding; Document image

ABSTRACT

Document image binarization refers to the conversion of a document image into a binary image. For broken and severely degraded document images, binarization is a very challenging process. Unlike traditional methods that only separate the foreground from the background, this paper presents a new framework that binarizes broken and degraded document images and restores their quality. In our approach, the non-local means method is extended and used to remove noise from the input document image in a pre-processing step. The proposed method then binarizes the document image by building on the quick adaptive thresholding proposed by Pierre D. Wellner. To obtain more pleasing binarization results, the binarized document image is finally post-processed. The post-processing step applies three measures: de-speckling, preserving stroke connectivity and improving the quality of text regions. Experimental results show significant improvement in the binarization of broken and degraded document images collected from various sources, including degraded and broken books, magazines and document files.

1. Introduction

Historical and ancient document collections available in libraries throughout the world are of great cultural and scientific importance. The transformation of such documents into digital form is essential for maintaining the quality of the originals while providing scholars with full access to the information they contain [1]. However, the document images may be broken and degraded because of the poor quality of the paper, the printing process, ink blots and fading, document aging, extraneous marks, scanning noise, etc. [2]. Restoring and enhancing the quality of broken and degraded document images is therefore a common and essential requirement in libraries.

Compared with traditional restoration filters, e.g. median filtering [3], Wiener filtering [4] and Bayesian filtering [5], document image binarization plays a key role in document processing, since its performance critically affects the degree of success of subsequent character segmentation and recognition [6]. The commonly used binarization approaches can be classified into two types: global methods and local adaptive methods. Global document image binarization methods find a single threshold for the whole document image; they can deal with background noise well but usually over-threshold the image, which results in broken character strokes. Unlike global approaches, local adaptive document image binarization [7,1] determines the threshold for each pixel adaptively and therefore behaves better than global methods on degraded and broken document images. Adaptive binarization can also reduce noise while segmenting text from the background.

Recent efforts on adaptive methods can be found in [8,9]. Specifically for broken document images (see Fig. 1), Stubberud et al. [10] trained an adaptive restoration filter and applied it to distorted text images that an OCR system could not recognize. Whichello and Yan [11] proposed linking broken character borders with variable-sized masks to improve recognition accuracy. Banerjee et al. [2] proposed a contextual restoration that learns the text model from the degraded document itself; touching and broken characters were corrected by the algorithm. Lazzara et al. [12] recently improved Sauvola's method using a multiscale scheme. Milyaev et al. [13] proposed a binarization method for accurate scene text understanding. Recently, Ntirogiannis et al. [14] proposed a pixel-based binarization evaluation methodology for historical handwritten/machine-printed document images.

However, these methods cannot prevent missing punctuation marks. In addition to broken document images, various methods for degraded document images have been proposed. Bal et al. [15] proposed a language-independent semi-automated system for enhancing degraded document images that is capable of exploiting inter- and intra-document coherence. Likforman-Sulem et al. [16] compared two image restoration approaches for the pre-processing of printed documents and argued that high-pass filtering may remove stains and holes. Lelore and Bouchara [17] introduced a technique based on a Markov Random Field (MRF) model of the handwritten document, whose observation model was estimated with an expectation-maximization (EM) algorithm; the method showed a greater sensitivity to thin lines than other techniques. In recent efforts, Biswas et al. [18] proposed a global-to-local approach for this issue, and Singh et al. [19] proposed a method for severely degraded and non-uniformly illuminated documents.


Corresponding author at: Department of Computer Science, School of Information Science and Engineering, Xiamen University, Xiamen, China.

http://dx.doi.org/10.1016/j.neucom.2016.12.058
Received 17 February 2015; Received in revised form 2 September 2016; Accepted 26 December 2016
Available online 06 January 2017
0925-2312/ © 2017 Elsevier B.V. All rights reserved.

Fig. 1. A broken document image (a) and the result (c) of our proposed binarization. (b) is the direct OCR result of (a), and (d) is the corresponding OCR result of (c).

To overcome the shortcomings mentioned above, we propose a framework that consists of three steps. First, the document images are pre-processed using the non-local means denoising method. Second, we extend Wellner's quick adaptive thresholding method with a histogram and use Rosenfeld's method to determine the threshold for each pixel. Finally, in order to obtain pleasing binarization results, three measures are employed to post-process the thresholded document images: de-speckling, preserving strokes and improving the quality of the text regions. Experiments and evaluations are performed to test the proposed method.

The structure of the paper is organized as follows. Section 2 describes quick adaptive thresholding. Section 3 presents the framework of our proposed method in detail. The experimental results and discussions are described in Section 4. Section 5 contains conclusions and future work.

2. Quick adaptive thresholding

Our proposed algorithm is an extension of Wellner's adaptive thresholding method [20]. The main idea of Wellner's algorithm is that each pixel is compared to the average of its neighborhood pixels. Specifically, the method runs through the image while calculating a moving average of the last s pixels (see Fig. 2). When the value of a pixel is significantly lower than this average, by t percent, it is set to black; otherwise it is left white.

Fig. 2. Treat the image as a single row of pixels composed of all the rows in the image lined up next to each other.

Wellner's method first treats the image as a single row of pixels composed of all the rows of the image lined up next to each other, see Fig. 2. The sum f_s(n) of the values of the last s pixels is computed using Eq. (1). The value of the resulting image T(n) is either 1 (for black) or 0 (for white), depending on whether the pixel is t percent darker than the average value of the previous s pixels, according to Eq. (2):

$$ f_s(n) = \sum_{i=0}^{s-1} p_{n-i} \qquad (1) $$

$$ T(n) = \begin{cases} 1 & \text{if } p_n < \left( \dfrac{f_s(n)}{s} \right) \dfrac{100 - t}{100} \\ 0 & \text{otherwise} \end{cases} \qquad (2) $$

In order to save computational cost and obtain more pleasing results, Wellner also gave a quick adaptive thresholding method. Instead of traversing the image from left to right or from right to left, the improved method traverses it alternately from the left and from the right, as illustrated in Fig. 3, and uses an exponentially weighted moving average:

$$ g_s(n) = \sum_{i=0}^{n} (1 - 1/s)^{i}\, p_{n-i} \qquad (3) $$

$$ h(n) = \frac{g(n) + g(n - \mathrm{width})}{2} \qquad (4) $$

Fig. 3. Traverse the image alternately from the left and from the right.

where width is the width of the image. g(n) (in Eq. (3)) and h(n) (in Eq. (4)) are further approximations of the sum f(n). Averaging with the previous row in h(n) removes the every-other-line effect and produces consistently good results across a wide range of images. The method works well because comparing a pixel to the average of its neighborhood preserves hard contrast lines while ignoring soft gradient changes, and it needs only one pass through the image. However, Wellner's method has several problems. First, it is very sensitive to the scanning order. Second, the moving average is not a good representation of the neighborhood, since similar pixels are not evenly distributed in all directions.
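To make the procedure of Eqs. (1)-(4) concrete, the following is a minimal Python sketch, not the authors' code, of Wellner-style quick adaptive thresholding. The serpentine traversal and the parameters s and t follow the description above, while details such as the mid-gray initialization of the running sum are our own assumptions.

```python
import numpy as np

def wellner_threshold(img, s=None, t=15):
    """Quick adaptive thresholding in the spirit of Wellner [20].

    img : 2-D uint8 grayscale array.
    s   : window length for the moving average (default: width / 8).
    t   : a pixel is set to black if it is t percent darker than the average.
    Returns a binary array with 1 for black (text) and 0 for white (background).
    """
    h, w = img.shape
    s = s or max(w // 8, 1)
    out = np.zeros_like(img, dtype=np.uint8)
    g_prev = np.full(w, 127.0 * s)          # running sums of the previous row (assumption: mid-gray start)
    for y in range(h):
        g_cur = np.empty(w, dtype=np.float64)
        g = 127.0 * s                        # running sum g_s(n), Eq. (3)
        xs = range(w) if y % 2 == 0 else range(w - 1, -1, -1)  # serpentine scan (Fig. 3)
        for x in xs:
            g = g - g / s + img[y, x]        # exponential moving sum, Eq. (3)
            g_cur[x] = g
            avg = (g + g_prev[x]) / (2 * s)  # average of this row and the row above, Eq. (4)
            out[y, x] = 1 if img[y, x] < avg * (100 - t) / 100 else 0  # Eq. (2)
        g_prev = g_cur
    return out
```

With s around one eighth of the image width and t around 15, this reproduces the behavior described above; our method replaces the fixed percentage test with the histogram-based threshold of Section 3.2.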
3. Our proposed method

The proposed method for broken and degraded document image binarization is illustrated in Fig. 4. All steps of the method are described in the following subsections.

Fig. 4. The flowchart of our proposed method.

3.1. Noise removal using non-local means

Broken and degraded document images contain a large amount of noise, so noise removal is needed before we apply thresholding to the input document images. In this pre-processing step, the non-local means denoising method [21] is used to smooth the document images. The non-local means method assumes that the image contains an extensive amount of self-similarity. It is a neighborhood filter, but its neighborhood is the whole image. The non-local means method has been shown to work well and robustly for image noise removal.

Given a discrete noisy image v = {v(i) | i ∈ I}, the denoised value NL[v](i) is defined by the following equation:

$$ NL[v](i) = \sum_{j \in I} w(i, j)\, v(j) \qquad (5) $$

where the weight w(i, j) is defined as

$$ w(i, j) = \frac{1}{Z(i)} \exp\left( -\frac{\lVert v(N_i) - v(N_j) \rVert_{2,a}^{2}}{h^{2}} \right) \qquad (6) $$

where a > 0 is the standard deviation of the Gaussian kernel, h is a filtering parameter which usually depends on the standard deviation of the noise, ‖·‖_{2,a} is the weighted Euclidean distance between the patches N_i and N_j, and Z(i) is the normalizing constant.

However, the non-local means algorithm is slow, since it searches the whole image and computes a weight for every pixel pair. We propose a multi-scale method that evenly divides the whole image into portions of a given size S and then chooses a single portion in which to compute the weights for denoising the pixel p_i. The scale S is S = 2^n (n = 6, 5, …, 1). The chosen portion is the one most similar to the neighborhood of the pixel p_i; here, we use the average gradient orientation as the similarity criterion. The improved non-local means method greatly reduces the computational cost and works well, see Fig. 5.

Fig. 5. Document image pre-processing: (a) input document image; (b) de-noised image using the non-local means method.
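For reference, here is a minimal sketch of the plain non-local means weights of Eqs. (5)-(6). It is not the authors' accelerated implementation; the patch radius, the unweighted patch distance and the restriction to a local search window (rather than the portion-selection scheme described above) are our own simplifying assumptions.

```python
import numpy as np

def nlm_denoise_pixel(img, i, j, patch=3, search=10, h=10.0):
    """Denoise pixel (i, j) with non-local means, Eqs. (5)-(6).

    img    : 2-D float array (grayscale).
    patch  : half-size of the square patch N_i.
    search : half-size of the search window (a stand-in for the whole image I).
    h      : filtering parameter tied to the noise standard deviation.
    """
    H, W = img.shape
    pad = np.pad(img, patch, mode="reflect")
    ref = pad[i:i + 2 * patch + 1, j:j + 2 * patch + 1]   # patch around pixel i

    num, Z = 0.0, 0.0                                     # weighted sum and normalizer Z(i)
    for y in range(max(0, i - search), min(H, i + search + 1)):
        for x in range(max(0, j - search), min(W, j + search + 1)):
            cand = pad[y:y + 2 * patch + 1, x:x + 2 * patch + 1]
            d2 = np.mean((ref - cand) ** 2)               # squared patch distance (unweighted here)
            w = np.exp(-d2 / (h * h))                     # un-normalized weight, Eq. (6)
            num += w * img[y, x]
            Z += w
    return num / Z                                        # NL[v](i), Eq. (5)
```

The multi-scale variant described above restricts this search to the image portion whose average gradient orientation best matches the neighborhood of p_i, which is what makes the filter fast enough to use as a pre-processing step.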

3.2. Adaptive thresholding with histogram

A final adaptive threshold is determined in this step.


Our proposed method is a variant of Wellner's adaptive thresholding, but it is based on the histogram. The method traverses the image alternately from the left and from the right. Text regions are located where the value of the pre-processed image I′(x, y) exceeds a threshold d. The threshold d is determined using Rosenfeld's method [22], which analyzes the concavities of the histogram h(I′) with respect to its convex hull Hull(I′), that is, the set-theoretic difference |Hull(I′) − p(I′)|, where p(I′) is the probability mass function of the image I′. Once the convex hull of the histogram has been computed, the deepest concavity points become candidates for the threshold. When the background is similar in intensity to the text regions, we use the following logistic sigmoid function, which exhibits the desired saturation behavior for large and small values of the background:

$$ d\big(I'(x, y)\big) = q\,\delta \left( \frac{1 - p_2}{1 + \exp\!\left( \dfrac{-4\,I'(x, y)}{b\,(1 - p_1)} + \dfrac{2\,(1 + p_1)}{1 - p_1} \right)} + p_2 \right) \qquad (7) $$

where q is a weighting parameter, δ is the average difference between the black and white pixels, and p_1 and p_2 are constants. We suggest the parameter values q = 0.6, p_1 = 0.5 and p_2 = 0.8 in the experiments. The results of our proposed adaptive thresholding method are demonstrated in Fig. 6.

Fig. 6. Document image after thresholding: (a) de-noised document image; (b) result after our proposed thresholding method.
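As an illustration of Eq. (7), the following hedged sketch evaluates the sigmoid-shaped threshold for a pre-processed pixel value. The interpretation of b (taken here as an average background value, by analogy with the similar formulation in Gatos et al. [6]) and the way δ is supplied are assumptions on our part, not details fixed by the text.

```python
import math

def sigmoid_threshold(i_val, b, delta, q=0.6, p1=0.5, p2=0.8):
    """Threshold d(I'(x, y)) of Eq. (7).

    i_val : pre-processed pixel value I'(x, y).
    b     : average background value (assumption; not defined explicitly in the text).
    delta : average difference between black and white pixels.
    q, p1, p2 : parameters, defaults as suggested in the paper.
    """
    expo = -4.0 * i_val / (b * (1.0 - p1)) + 2.0 * (1.0 + p1) / (1.0 - p1)
    return q * delta * ((1.0 - p2) / (1.0 + math.exp(expo)) + p2)

# A pixel (x, y) is assigned to the text class when I'(x, y) exceeds d(I'(x, y)).
```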
3.3. Post-processing

After the proposed adaptive thresholding, the output images still contain some outliers, especially speckles, see Fig. 6(b). The quality of the text regions in the output images is also poor, and stroke connectivity is not preserved. We therefore apply three measures to remove the speckles, preserve the stroke connectivity and improve the quality of the text regions. The post-processing steps are described in detail below.

3.3.1. De-speckle

In this step, the adaptive-window anisotropic diffusion for speckle reduction [23] proposed by Liu et al. is used. The speckle-reducing anisotropic diffusion can be modeled as follows:

$$ \frac{\partial I_t(x, y, t)}{\partial t} = \operatorname{div}\!\big[ c(q) \cdot \nabla I_t(x, y, t) \big] \qquad (8) $$

$$ I_t(x, y, t)\big|_{t=0} = I_t(x, y, 0), \qquad \frac{\partial I_t(x, y, t)}{\partial n}\bigg|_{\partial \Omega} = 0 \qquad (9) $$

where I_t is the document image after our thresholding, ∂Ω is the boundary of Ω, n is the unit vector perpendicular to the boundary, ∇ is the gradient operator, div is the divergence operator, I_t(x, y, 0) is the initial noisy image, I_t(x, y, t) is the image at time t, and c(q) is the diffusion coefficient, an "edge-stopping", non-negative, monotonically decreasing function. The diffusion coefficient is given by

$$ c(q) = \frac{1}{1 + (q - q_0)^2} \qquad (10) $$

where q is the instantaneous coefficient of variation; the definition of q can be found in [23]. The value q_0 is defined by

$$ q_0 = \frac{C}{2}\, \mathrm{MAD}\big( \lVert \nabla \log I^{t}_{x,y} \rVert \big) \qquad (11) $$

where MAD(‖∇ log I^t_{x,y}‖) is the median absolute deviation, defined as

$$ \mathrm{MAD}\big( \lVert \nabla \log I^{t}_{x,y} \rVert \big) = \operatorname{median}_I \Big\{ \big| \lVert \nabla \log I^{t}_{x,y} \rVert - \operatorname{median}_I\big[ \lVert \nabla \log I^{t}_{x,y} \rVert \big] \big| \Big\} \qquad (12) $$

and C is the constant C = 1.4826. The symbols ∇, ‖·‖ and |·| represent the gradient, the gradient magnitude and the absolute value, respectively.
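A minimal sketch of one explicit diffusion iteration with the coefficient of Eq. (10) is given below. The 4-neighbor discretization, the time step and the crude finite-difference estimate of q are our own simplifying assumptions; this is not the adaptive-window scheme of [23].

```python
import numpy as np

def diffusion_step(I, q0, dt=0.1, eps=1e-6):
    """One explicit iteration of Eq. (8) with the diffusion coefficient of Eq. (10).

    I  : 2-D float array (the document image being despeckled).
    q0 : noise threshold from Eq. (11).
    """
    # Neighbor images with replicated borders, mimicking the zero-flux condition of Eq. (9).
    north = np.roll(I, 1, axis=0); north[0, :] = I[0, :]
    south = np.roll(I, -1, axis=0); south[-1, :] = I[-1, :]
    west  = np.roll(I, 1, axis=1); west[:, 0] = I[:, 0]
    east  = np.roll(I, -1, axis=1); east[:, -1] = I[:, -1]

    grad_mag = np.sqrt((east - I) ** 2 + (south - I) ** 2)
    q = grad_mag / (np.abs(I) + eps)          # crude stand-in for the coefficient of variation
    c = 1.0 / (1.0 + (q - q0) ** 2)           # edge-stopping function, Eq. (10)

    laplacian = north + south + east + west - 4.0 * I
    return I + dt * c * laplacian             # div[c * grad I], with c treated as locally constant
```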

3.3.2. Preserve stroke connectivity

Because of brokenness, degradation and thresholding, stroke connectivity in document images is very difficult to preserve. We propose two steps to retrieve the strokes, as shown in Fig. 7. In the first step, we extract the strokes of the characters based on a combination of a simple feature-point detection scheme and the stroke-segment connecting method [24] proposed by Lin and Tang. We first extract the character feature points based on Rutovitz's definition of the crossing number [25] for a pixel p. We then trace the skeletons of the stroke segments from one feature point to another. Finally, we use a bi-directional graph to determine which pairs of stroke segments need to be connected at each fork point.

Fig. 7. Procedure of preserving stroke connectivity.
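As an illustration of the feature-point detection mentioned above, the sketch below computes the Rutovitz crossing number on a binary skeleton. Treating skeleton pixels whose crossing number differs from 2 as end or fork points is a common convention and an assumption here, not a detail taken from [24].

```python
def crossing_number(skel, y, x):
    """Rutovitz crossing number [25] of the interior pixel (y, x) in a 0/1 skeleton image.

    Returns 1 for stroke end points, 2 for ordinary stroke points,
    and 3 or more for fork (branch) points.
    """
    # 8-neighbors listed in circular order around (y, x).
    offs = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]
    nb = [skel[y + dy][x + dx] for dy, dx in offs]
    return sum(abs(nb[i] - nb[(i + 1) % 8]) for i in range(8)) // 2

# Feature points for the stroke graph are skeleton pixels whose crossing number is not 2.
```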
After this first step, we have the structural information for each character. In the second step, a swell filter is used to fill all possible breaks, gaps or holes in the characters. This swell filter is combined with the bi-directional graph built in the first step. Let P_sw be the number of black pixels in an n × n sliding window whose central pixel is a white pixel (x, y) lying on a connection path of the bi-directional graph, and let x_a, y_a be the average coordinates of all black pixels in the window. The pixel is changed to black if P_sw > k_sw and |x − x_a| < d_x and |y − y_a| < d_y, where k_sw is experimentally set to k_sw = 0.001125 · v_h and v_h is the largest peak in the height histogram of the connected components. The latter two conditions prevent an increase in the thickness of the character strokes, since we examine only white pixels that lie among uniformly distributed black pixels. A sketch of this test is given below.
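The following is a hedged sketch of the swell-filter test described above; it is not the authors' implementation, and it assumes that the window size n, the tolerances d_x and d_y, and the set of connection-path pixels are supplied by the surrounding pipeline.

```python
import numpy as np

def swell_fill(binary, path_pixels, n=5, k_sw=10, dx=2, dy=2):
    """Fill gaps on connection paths of the stroke graph.

    binary      : 2-D array with 1 for black (text) and 0 for white.
    path_pixels : iterable of white (y, x) positions lying on a connection path.
    n           : sliding window size; k_sw, dx, dy as in Section 3.3.2.
    """
    out = binary.copy()
    r = n // 2
    for (y, x) in path_pixels:
        win = binary[max(0, y - r):y + r + 1, max(0, x - r):x + r + 1]
        ys, xs = np.nonzero(win)                        # black pixels inside the window
        p_sw = len(ys)
        if p_sw <= k_sw:
            continue
        ya = ys.mean() + max(0, y - r)                  # average coordinates of the black pixels
        xa = xs.mean() + max(0, x - r)
        if abs(x - xa) < dx and abs(y - ya) < dy:       # keep stroke thickness under control
            out[y, x] = 1                               # change the white pixel to black
    return out
```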

3.3.3. Improve quality of text regions

We further use the swell filter to scan the entire binary image, examining each white pixel. In addition, an edge-stopping function is used to keep edges sharp. The Lorentzian edge penalty function [26] is employed to preserve sharp edges; it is defined by

$$ \rho_L(x) = \log\!\left\{ 1 + \frac{1}{2} \left( \frac{x}{\sigma_L} \right)^{2} \right\} \qquad (13) $$

where σ_L is the contrast parameter, which controls the shape of the edge-stopping function [26], and x is the pixel value in the binary image. Fig. 8 demonstrates the result of our post-processing measures.

Fig. 8. Result of document image post-processing: (a) binarized document image; (b) post-processed image.
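For completeness, a tiny sketch of the Lorentzian penalty of Eq. (13) follows; the associated edge-stopping weight is obtained by differentiating Eq. (13) and is our own addition rather than part of the original text.

```python
import math

def lorentzian_penalty(x, sigma):
    """Lorentzian edge penalty rho_L(x) of Eq. (13)."""
    return math.log(1.0 + 0.5 * (x / sigma) ** 2)

def lorentzian_weight(x, sigma):
    """Edge-stopping weight rho_L'(x) / x: large gradients x receive small weights,
    which is what keeps edges sharp."""
    return 2.0 / (2.0 * sigma ** 2 + x ** 2)
```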

4. Experimental results and discussions

The experiments and evaluation were designed to test and analyze the proposed method on various document images, including broken and badly degraded document images.

We test the proposed method on three types of document images: historical handwritten documents, old newspapers and broken documents. All 150 images vary in noise, resolution, stroke size and illumination contrast. Part of the test data was collected from Chou et al. [27]. That dataset contains 122 document images photographed with an ORITE I-CAM 1300 one-chip color camera; the authors set the camera lens at about 6 cm from each document and captured a rectangular region of approximately 4.9 × 3.4 cm, resulting in a 320 × 240 pixel grayscale image, so the resolution of each captured image is about 166 dots per inch. Another 28 document images were collected from the internet. The sensitivity of the proposed method to the parameter settings of q, p_1 and p_2 is shown in Fig. 9. The results suggest that the parameter values q = 0.6, p_1 = 0.5 and p_2 = 0.8 used in the experiments achieve the highest precision.

Fig. 9. Precision of our method over 150 document images with different q, p_1 and p_2.

The performance of our proposed method is compared with six state-of-the-art binarization algorithms: Niblack's adaptive thresholding method [28], the Sauvola et al. adaptive method [29], the Kim et al. adaptive method [30], the Gatos et al. adaptive method [6], the Chou et al. learning-built method [27] and the Wagdy et al. global thresholding method [31].

4.1. Quantitative evaluation of performance

There are two goals for the performance evaluation: effectiveness in practice and effectiveness compared with other algorithms. Our evaluation measures consist of OCR results and the PSNR value. To verify the efficiency of the proposed binarization method, experiments are performed with several well-known OCR systems, including Tesseract-2.01 OCR from Google, Microsoft Office Document Scanning and ABBYY FineReader.

Table 1 shows example binarized document image portions and the corresponding OCR results.

Table 1. Test document image portions and their corresponding OCR results. Underlined words indicate erroneous OCR results.

In terms of OCR performance, our method achieves the same promising results as Gatos et al. and outperforms the other five methods for the example document images. It also produces 100% OCR detection quality under inadequate illumination conditions. The corresponding PSNR values are shown in Table 2. Throughout all the experiments, the proposed method achieves a larger improvement in the binarization of broken and degraded document images than the other six approaches.

Table 2. PSNR results of the corresponding document image binarizations in Table 1.

  Method                 PSNR
  Niblack [28]           12.7642
  Sauvola et al. [29]    12.3087
  Kim et al. [30]        14.9765
  Gatos et al. [6]       16.6665
  Chou et al. [27]       16.9498
  Wagdy et al. [31]      17.2156
  Our proposed           18.2381

The performance of our method in the different cases is quantitatively analyzed with three metrics. Following common evaluation protocols [32], we use the F-measure (Fm) to compare our method with the other approaches. Fm is expressed as a percentage:

$$ F_m = \frac{2 \times \mathrm{Recall} \times \mathrm{Precision}}{\mathrm{Recall} + \mathrm{Precision}} \qquad (14) $$

where Recall = TP / (TP + FN) and Precision = TP / (TP + FP), with TP, FP and FN standing, respectively, for true positives (the total number of well-classified foreground pixels), false positives (the total number of misclassified foreground pixels in the binarization result compared to the ground truth) and false negatives (the total number of foreground pixels misclassified as background).
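To make the evaluation measures concrete, here is a small sketch (our own, not the authors' evaluation code) that computes precision, recall, the F-measure of Eq. (14) and a PSNR value between a binarization result and its ground truth, assuming a peak value of 1 for binary images.

```python
import numpy as np

def evaluate(binary, gt):
    """binary, gt : 2-D arrays with 1 for foreground (text) and 0 for background."""
    tp = np.sum((binary == 1) & (gt == 1))      # foreground pixels correctly classified as foreground
    fp = np.sum((binary == 1) & (gt == 0))      # background pixels wrongly classified as foreground
    fn = np.sum((binary == 0) & (gt == 1))      # foreground pixels wrongly classified as background
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    fm = 100.0 * 2 * recall * precision / (recall + precision)          # Eq. (14), in percent
    mse = np.mean((binary.astype(float) - gt.astype(float)) ** 2)
    psnr = 10.0 * np.log10(1.0 / mse) if mse > 0 else float("inf")      # peak value 1 assumed
    return precision, recall, fm, psnr
```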
The quantitative evaluation results are shown in Table 3. They demonstrate that our method outperforms the other six binarization methods in terms of average precision, recall, F-measure and running time over the whole set of 150 document images.

Table 3. Quantitative evaluation of the binarization results of the different methods over 150 documents extracted from a digital issue of a French magazine.

  Method           Precision   Recall   Fm     Time (s)
  Niblack's        0.70        0.75     73.2   165
  Sauvola et al.   0.89        0.91     90.4   180
  Kim et al.       0.88        0.88     87.1   190
  Gatos et al.     0.90        0.91     90.5   202
  Chou et al.      0.92        0.91     92.8   190
  Wagdy et al.     0.93        0.92     93.3   175
  Ours             0.97        0.95     95.7   156

Figs. 10, 11, 12 and 13 show example documents with severe breaks and scratches on the original pages.

Fig. 10. Binarization of a severely broken document image: (a) original document image; (b) Niblack's method [28]; (c) Sauvola et al.'s method [29]; (d) Kim et al.'s method [30]; (e) Gatos et al.'s method [6]; (f) Chou et al.'s method [27]; (g) Wagdy et al.'s method [31]; (h) our proposed method.

Fig. 11. Binarization of a historical handwritten document image: (a) original document image; (b) Niblack's method [28]; (c) Sauvola et al.'s method [29]; (d) Kim et al.'s method [30]; (e) Gatos et al.'s method [6]; (f) Chou et al.'s method [27]; (g) Wagdy et al.'s method [31]; (h) our proposed method.

Fig. 12. Binarization of a historical handwritten document image: (a) original document image; (b) Niblack's method [28]; (c) Sauvola et al.'s method [29]; (d) Kim et al.'s method [30]; (e) Gatos et al.'s method [6]; (f) Chou et al.'s method [27]; (g) Wagdy et al.'s method [31]; (h) our proposed method.

Fig. 13. Binarization of a historical handwritten document image: (a) original document image; (b) Niblack's method [28]; (c) Sauvola et al.'s method [29]; (d) Kim et al.'s method [30]; (e) Gatos et al.'s method [6]; (f) Chou et al.'s method [27]; (g) Wagdy et al.'s method [31]; (h) our proposed method.

In summary, we observe that: (1) Niblack's method suffers from a great amount of background noise;


(2) the Sauvola et al. method produces no background noise, but many characters are broken; (3) the Kim et al. approach achieves good results, but much noise and many broken characters remain; (4) the Gatos et al. method performs well, but leaves many speckles; (5) the Chou et al. method misses some strokes, because it depends heavily on learned features; (6) the Wagdy et al. method may fail to retrieve characters, because it uses a global threshold; and (7) compared with the other methods, our proposed algorithm performs better on general degraded document images. In particular, the current method outperforms all the other algorithms on document images that are badly broken and contain a high level of noise.

5. Conclusions and future work

We presented a new framework for binarizing degraded document images and restoring their quality. The document images are first pre-processed using the non-local means denoising method. We then extend Wellner's quick adaptive thresholding method with a histogram and use Rosenfeld's method to determine the threshold for each pixel. To obtain pleasing binarization results, three measures are employed in a final step to post-process the thresholded document images: de-speckling, preserving strokes and improving the quality of the text regions. Experimental results and evaluation analysis show a significant improvement in the binarization of document images collected from various sources, including broken and degraded books, magazines and document files. The improvement in recognition is especially large for broken document images pieced together from several torn fragments.

The proposed method primarily uses a non-local means method to remove noise from the document image, and its computational cost is still high. Integrating the non-local means denoising with the adaptive thresholding could further improve the binarization performance.

Acknowledgement

This work was supported by the National Natural Science Foundation of China (Grant nos. 61601392, 61671399), the Research Fund for the Doctoral Program of Higher Education (20130121120045) and the Fundamental Research Funds for the Central Universities (Grant no. 20720150110).

References

[1] E. Kavallieratou, E. Stamatatos, Improving the quality of degraded document images, in: Proceedings of the Second International Conference on Document Image Analysis for Libraries (DIAL'06), IEEE, 2006, pp. 10.
[2] J. Banerjee, A. Namboodiri, C. Jawahar, Contextual restoration of severely degraded document images, in: CVPR, 2009, pp. 517–524.
[3] A.C. Bovik, T.S. Huang, D.C. Munson Jr., A generalization of median filtering using linear combinations of order statistics, IEEE Trans. Acoust. Speech Signal Process. ASSP-31 (6).
[4] R.C. Gonzalez, R.E. Woods, Digital Image Processing, Prentice Hall, Pearson, 2010.
[5] B. Han, Y. Zhu, D. Comaniciu, L.S. Davis, Kernel-based Bayesian filtering for object tracking, in: CVPR (1), 2005, pp. 227–234.
[6] B. Gatos, I. Pratikakis, S.J. Perantonis, Adaptive degraded document image binarization, Pattern Recognit. 39 (3) (2006) 317–327.
[7] Ø.D. Trier, A.K. Jain, Goal-directed evaluation of binarization methods, IEEE Trans. Pattern Anal. Mach. Intell. 17 (12) (1995) 1191–1201.
[8] R.F. Moghaddam, M. Cheriet, AdOtsu: an adaptive and parameterless generalization of Otsu's method for document image binarization, Pattern Recognit. 45 (6) (2012) 2419–2431.
[9] R. Hedjam, R.F. Moghaddam, M. Cheriet, A spatially adaptive statistical method for the binarization of historical manuscripts and degraded document images, Pattern Recognit. 44 (9) (2011) 2184–2196.
[10] P. Stubberud, J. Kanai, V. Kalluri, Adaptive image restoration of text images that contain touching or broken characters, in: ICDAR, 1995, pp. 778–781.
[11] A.P. Whichello, H. Yan, Linking broken character borders with variable sized masks to improve recognition, Pattern Recognit. 29 (8) (1996) 1429–1435.
[12] G. Lazzara, T. Géraud, Efficient multiscale Sauvola's binarization, Int. J. Doc. Anal. Recognit. (IJDAR) 17 (2) (2014) 105–123.
[13] S. Milyaev, O. Barinova, T. Novikova, P. Kohli, V. Lempitsky, Fast and accurate scene text understanding with image binarization and off-the-shelf OCR, Int. J. Doc. Anal. Recognit. (IJDAR) (2015) 1–14.
[14] K. Ntirogiannis, B. Gatos, I. Pratikakis, Performance evaluation methodology for historical document image binarization, IEEE Trans. Image Process. 22 (2) (2013) 595–609.
[15] G. Bal, G. Agam, O. Frieder, G. Frieder, Interactive degraded document enhancement and ground truth generation, in: Electronic Imaging 2008, International Society for Optics and Photonics, 2008, pp. 68150Z–68150Z.
[16] L. Likforman-Sulem, J. Darbon, E.H.B. Smith, Pre-processing of degraded printed documents by non-local means and total variation, in: Proceedings of the 10th IEEE International Conference on Document Analysis and Recognition, ICDAR'09, 2009, pp. 758–762.
[17] T. Lelore, F. Bouchara, Document image binarisation using Markov field model, in: ICDAR, 2009, pp. 551–555.
[18] B. Biswas, U. Bhattacharya, B.B. Chaudhuri, A global-to-local approach to binarization of degraded document images, in: Proceedings of the 22nd IEEE International Conference on Pattern Recognition (ICPR), 2014, pp. 3008–3013.
[19] B.M. Singh, R. Sharma, D. Ghosh, A. Mittal, Adaptive binarization of severely degraded and non-uniformly illuminated documents, Int. J. Doc. Anal. Recognit. (IJDAR) 17 (4) (2014) 393–412.
[20] P. Wellner, Adaptive thresholding for the DigitalDesk, Xerox, EPC1993-110.
[21] A. Buades, B. Coll, J. Morel, A non-local algorithm for image denoising, in: IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Vol. 2, 2005, p. 60.
[22] A. Rosenfeld, P. De La Torre, Histogram concavity analysis as an aid in threshold selection (in image processing), IEEE Trans. Syst. Man Cybern. 13 (1983) 231–235.
[23] G. Liu, X. Zeng, F. Tian, Z. Li, K. Chaibou, Speckle reduction by adaptive window anisotropic diffusion, Signal Processing.
[24] F. Lin, X. Tang, Off-line handwritten Chinese character stroke extraction, in: Proceedings of the International Conference on Pattern Recognition, Vol. 16, 2002, pp. 249–252.
[25] L. Lam, S. Lee, C. Suen, Thinning methodologies - a comprehensive survey, IEEE Trans. Pattern Anal. Mach. Intell. 14 (9) (1992) 869–885.
[26] A. Pizurica, I. Vanhamel, H. Sahli, W. Philips, A. Katartzis, A Bayesian approach to nonlinear diffusion based on a Laplacian prior for ideal image gradient, in: Proceedings of the 13th Workshop on Statistical Signal Processing, IEEE/SP 2005, 2005, pp. 477–482.
[27] C.-H. Chou, W.-H. Lin, F. Chang, A binarization method with learning-built rules for document images produced by cameras, Pattern Recognit. 43 (4) (2010) 1518–1530.
[28] W. Niblack, An Introduction to Digital Image Processing, Strandberg Publishing Company, Birkeroed, Denmark, 1985.
[29] J. Sauvola, M. Pietikainen, Adaptive document image binarization, Pattern Recognit. 33 (2) (2000) 225–236.
[30] I. Kim, D. Jung, R. Park, Document image binarization based on topographic analysis using a water flow model, Pattern Recognit. 35 (1) (2002) 265–277.
[31] M. Wagdy, I. Faye, D. Rohaya, Document image binarization using retinex and global thresholding, ELCVIA Electronic Letters on Computer Vision and Image Analysis 14 (1).
[32] N.R. Howe, A Laplacian energy for document binarization, in: Proceedings of the IEEE International Conference on Document Analysis and Recognition (ICDAR), 2011, pp. 6–10.

Yiping Chen is currently a postdoc at Xiamen University. She received her Ph.D. degree from the School of Electronic Engineering, National University of Defense Technology, China. Her research interests include image processing and 3D point cloud processing.

Liansheng Wang received the Ph.D. degree in Computer Science from the Chinese University of Hong Kong in 2012. He is currently an Assistant Professor in the Department of Computer Science, Xiamen University, Xiamen, China. His research interests include medical image processing and analysis.
