
Offline Handwriting Recognition System - A Review

Neha Tripathi, Dr. Pushpinder Singh Patheja
School of Computer Science and Engineering, VIT Bhopal, MP, India

Abstract
In the drive to go paperless and digitize every feasible process, an efficient and accurate handwriting recognition system is more critical than ever. The research community is actively working to improve such systems with the help of the most recent and advanced technologies. The aim of this paper is to review the stages of the handwriting recognition process. The techniques employed in the field of handwriting recognition are also analyzed to understand their approach, effectiveness, and efficiency.
Handwriting recognition systems have various applications, such as postal address recognition, recognition of manually filled-out tax forms, and bank cheque recognition. The remainder of the paper has the following sections: Introduction, History of Handwriting Recognition System, Literature Review, Conclusion and Future Scope.

1. Introduction

Handwriting recognition is a very popular research area in the field of image processing. Handwriting recognition systems are of two types: online and offline. Offline recognition is more difficult than online recognition because no real-time input, such as pen strokes or applied pressure, is available to an offline system. An offline recognition system takes already written documents, scripts, and images as input. Such systems face many problems, including distorted images, ambiguous letters, and complex handwriting. Input with these issues significantly reduces the system's efficiency and accuracy.

Offline handwriting recognition works on the principles of Optical Character Recognition (OCR). In OCR, handwritten and printed documents are converted into images, which are later used for character recognition. OCR consists of the following stages in the handwriting recognition process.

Fig 1: Stages of Handwriting Recognition (Image Acquisition, Pre-Processing, Segmentation, Feature Extraction, Classification and Recognition)

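As a rough illustration of how the stages in Fig 1 chain together, with the output of each stage feeding the next, the following is a minimal NumPy sketch on a tiny synthetic image. It is not a method taken from the reviewed literature. The function names, the fixed threshold of 128, the column-projection segmentation, the 2x2 zoning features, and the nearest-neighbour classifier are all assumptions chosen for brevity, and the synthetic array merely stands in for the image acquisition stage.

```python
import numpy as np

def preprocess(gray, threshold=128):
    """Pre-processing: binarize so ink (dark) pixels are 1, background is 0."""
    return (gray < threshold).astype(np.uint8)

def segment(binary):
    """Segmentation: cut a text line into characters at ink-free columns."""
    has_ink = binary.sum(axis=0) > 0
    chars, start = [], None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x                          # a character begins here
        elif not ink and start is not None:
            chars.append(binary[:, start:x])   # a character ended at x - 1
            start = None
    if start is not None:                      # character touching the right edge
        chars.append(binary[:, start:])
    return chars

def extract_features(char, zones=2):
    """Feature extraction: ink density in each cell of a zones x zones grid."""
    rows = np.array_split(np.arange(char.shape[0]), zones)
    cols = np.array_split(np.arange(char.shape[1]), zones)
    feats = []
    for r in rows:
        for c in cols:
            cell = char[np.ix_(r, c)]
            feats.append(cell.mean() if cell.size else 0.0)
    return np.array(feats)

def classify(features, templates):
    """Classification: label of the nearest stored feature vector (1-NN)."""
    return min(templates, key=lambda label: np.linalg.norm(features - templates[label]))

def recognize(gray, templates):
    """Run the stages in order; each stage consumes the previous stage's output."""
    binary = preprocess(gray)
    return [classify(extract_features(ch), templates) for ch in segment(binary)]

# Stand-in for image acquisition: a 4x9 grayscale image (255 = paper, 0 = ink)
# containing a vertical bar followed by a low horizontal bar.
gray = np.full((4, 9), 255, dtype=np.uint8)
gray[:, 1:3] = 0      # "|" occupies columns 1-2, full height
gray[2:, 4:8] = 0     # "_" occupies columns 4-7, bottom half only

# Hypothetical reference feature vectors for the two toy classes.
templates = {
    "|": np.array([1.0, 1.0, 1.0, 1.0]),
    "_": np.array([0.0, 0.0, 1.0, 1.0]),
}

print(recognize(gray, templates))  # ['|', '_']
```

A real system would replace each function with a far more robust method, but the data flow from image to binary image to character images to feature vectors to labels stays the same.
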
2. History

Handwriting recognition has been under research since the 1950s, and researchers have worked on it continuously ever since. Earlier, much of the world's information was held captive in hard copy documents. OCR systems convert this information into electronic form, but with many flaws.

An OCR system that makes too many mistakes has much room for improvement. In 1965 Reader's Digest and RCA collaborated to build an optical character recognition system. In 1985 structural approaches were proposed alongside the statistical methods. These approaches focused on the shape of the characters, as illustrated in Fig 2.

Fig 2: Different shapes of lowercase alphabets

3. Stages of Handwriting Recognition

The handwriting recognition system follows the 5 stages listed in Fig 1. The output of one stage becomes the input for the next stage.

3.1 Image Acquisition

The very first stage of the HCR system is the acquisition of images using a camera or an optical scanner. All the handwritten documents are converted to images of a standard format such as .jpg, .bmp, or .png.

3.2 Pre-Processing

The acquired image is not always in the desired condition and may contain a lot of information that is not required for character recognition. The scanned image may also contain defects and distortions that hinder the identification process. The pre-processing stage deals with the noise in the image and prepares it for further processing.

To begin with, the image of handwritten text is converted into grayscale format, which helps in separating the characters from the background. The grayscale image is then converted to a binary image to keep the important information and remove the unwanted information. Binarization increases the processing speed and reduces the space requirement.

3.3 Segmentation

Segmentation is the next important stage in the process of handwriting recognition. In segmentation, characters are separated from each other so that they can be analyzed more accurately, giving a higher recognition rate. There are two types of segmentation:

External Segmentation divides the document into logical writing units such as paragraphs, sentences, or words. This is a significant step in layout analysis.

Internal Segmentation is the isolation of letters and characters.

At the end of segmentation we get images of different sizes and shapes because characters vary in height and width.

Although many methods have been developed for segmentation, the segmentation of cursive handwritten documents remains an unsolved problem. Better segmentation algorithms are still needed to deal with tightly packed words of diverse style and handwriting size.

3.4 Feature Extraction

The next step is feature extraction, in which the features that can identify a character uniquely are captured. These features are used to classify and recognize the characters accurately. In this stage each character is represented as a feature vector, which becomes its identity. It is important to extract the correct features from the dataset, as wrongly chosen features can give false information and cause many problems for the recognition stage.

3.5 Classification and Recognition

There are many approaches for classification and recognition, such as template matching, statistical techniques, and neural networks.

3.5.1 Template matching

Template matching is an OCR process in which the images of two characters are compared. The process involves finding a sub-image, called a template, inside the image. In template matching the degree of similarity between two vectors (groups of pixels, shapes, curvatures, etc.) in the feature space is determined. Template matching is fast, but it is not very effective when the handwriting is unclear.

3.5.2 Statistical Techniques

In this approach the variabilities are captured through probabilistic models. Several methods come under this technique.

K-Nearest Neighbor rule (KNN) is one of the simplest algorithms and is mostly used for classification. It stores all the available cases and classifies a new case based on a similarity measure (e.g. a distance function). A case is classified by a majority vote of its neighbors: it is assigned to the class most common amongst its K nearest neighbors, as measured by the distance function. The drawback of this algorithm is the cost of searching for the nearest neighbors of each sample.

Bayesian Classifier is primarily used for text classification, which involves high-dimensional training data sets. It works well when probability is used to represent uncertainty in all parts of a statistical model. It is also called the probabilistic classifier, as it has the least probability of misclassifying a random pattern.

Hidden Markov Model (HMM) formalizes sequential observation of a system without perfect access to its state (i.e., the state is "hidden"). It is used in situations where the data consists of a sequence of observations that depend (probabilistically) on the internal state of a dynamical system. HMM extends the concept of a Markov model to the case in which the observation is a probabilistic function of the state. This model provides a segmentation-free recognition system.

Fig 3: State Diagram in HMM

Support Vector Machine (SVM) is a supervised learning technique that analyzes data and performs pattern recognition using hyperplanes that act as decision boundaries between classes. It can also be applied to non-linear data by using a kernel function. The accuracy rate of this technique is very high.

3.5.3 Artificial Neural Network

A neural network is inspired by the biological neural networks in the brain and works in a broadly similar way. It is represented as interconnections that are capable of calculations and is hence used for pattern recognition. It works in three main paradigms:
• Supervised Learning - The cost function is calculated based on the given data.
• Unsupervised Learning - The cost function can be any function of the input data; it need not be dependent.
• Reinforcement Learning - The data generated by the agent's interaction with the environment is given. At each point of action an instantaneous cost function is generated.

ANN is first trained on a set of samples and is then used to recognize characters having a similar feature set. ANN is a highly noise-tolerant recognition system.

Multi-Layer Perceptron (MLP) is the most widely studied and used neural network. MLP with backpropagation is among the most popular and versatile forms of neural network classifiers for handwriting recognition. For printed characters and handwritten digits, artificial neural networks have proven to be excellent recognizers, but recognition of handwritten words still requires more research. The use of neural networks has become extremely popular and is expected to move researchers closer to solving the handwriting recognition problem.

Fig 4: Neural Network Layers

4. Conclusion

In this paper we have reviewed techniques such as Template Matching, KNN, Bayesian Classifier, Support Vector Machine, and Neural Networks for offline handwriting recognition. This material serves as a guide for readers pursuing future research in this area. There is still room for improvement in the recognition accuracy and effectiveness of these techniques.
