
Combined Online and Offline Assamese Handwritten Numeral Recognizer

G. Siva Reddy, Puspanjali Sharma, S. R. M. Prasanna, C. Mahanta and L N Sharma


Dept. of EEE, IIT Guwahati, Guwahati 781039, India Email: {r.gangireddy, puspanjalisharma, prasanna, chitra, lns}@iitg.ernet.in

Abstract: This work describes the development of an Assamese handwritten numeral recognizer. The online handwritten numeral recognition system is developed using x, y coordinates as the feature and the Hidden Markov Model (HMM) as the modelling technique. The offline handwritten numeral recognition system is developed using vertical projection profile and horizontal projection profile (VPP-HPP), zonal discrete cosine transform (DCT), chain code histogram (CCH) and pixel level information as features, and vector quantization (VQ) as the modelling technique. The confusion patterns of the online and offline systems are analysed. Based on this analysis, the two systems are combined to obtain a final numeral recognition system. The combined system exhibits improved performance over the individual approaches, demonstrating the significance of the different natures of information present in each mode.

978-1-4673-0816-8/12/$31.00 2012 IEEE

I. INTRODUCTION
With the advancements in computational and hand held devices, the input-output modes of information for these devices are also changing. Apart from keypad, speech and visual modes, handwriting is a preferable mode for entering information into these devices. The advantage of handwriting is that, like speech, it is an effortless mode of communication. If there is a solution for recognizing the message from handwriting, it may be viewed as an alternative to the existing keyboard based approach. Most text input systems are built for English, whose characters are machine read by their corresponding ASCII codes. ASCII codes do not effectively represent the complexity of Indian languages. For native Indian languages, text input via existing keyboards is a cumbersome process. Alternatively, text input in the user's own handwriting, backed by a recognition system, is a simpler and more efficient way.

The process of automatically recognizing the message from handwritten data is termed handwriting or character recognition. The main difficulty associated with handwriting recognition is the variability of handwriting from person to person, and also cursiveness. In human to human communication, most of the information is perceived with the help of a few visual hints in the sequence, with the human brain extrapolating the message, a process that is difficult to articulate. Thus, building an automatic handwriting recognizer that recognizes the written message from cursive handwritten data is a distant goal. However, simple handwriting recognizers like numeral recognizers may have good practical applications such as automatic telephone number, roll number, cheque number, employee code or pincode recognition and many other form filling applications. For instance, in a hand held device, we can input the telephone number as a handwritten digit sequence, and the numeral recognizer recognizes it and dials the required person. This work describes the development of a handwritten numeral recognizer in Assamese.

Handwriting recognition can be classified according to the mode of data acquisition as online and offline [2]. In online handwriting recognition, two dimensional spatial coordinates (x, y) as a function of time are captured by writing on an electronic surface such as a digitizer using an electronic pen or stylus. In offline handwriting recognition, the content to be recognized is available as an image, which is captured by a scanner. Efforts have been made to build offline, online and also combined systems [8]. These works include the Bangla basic character recognizer using Hidden Markov Model (HMM) and sub-stroke features [4], an online handwritten character recognition system for Telugu symbols using HMM [5], an online Tamil character recognition system using HMM and spatial dynamic time warping (SDTW) [6], an online handwritten character recognizer developed for Telugu and Malayalam [7], a handwriting recognition system for Kannada numerals using a K-nearest neighbour classifier [11] and an Arabic handwriting recognition system using projection profiles [12]. Further, notes written on a white board are recognized by combining offline and online systems [8].

This work aims at developing an Assamese numeral recognition system by integrating online and offline recognition systems built using different features and classifiers. In the present work, the online handwritten numeral recognition system is developed using x, y coordinates as the feature and HMM as the modelling technique. The offline handwritten numeral recognition system is developed using vertical projection profile and horizontal projection profile (VPP-HPP), zonal discrete cosine transform (DCT), chain code histogram (CCH) and pixel level features, and vector quantization (VQ) as the modelling technique. The images for the offline system are constructed from the x, y coordinates. The online and offline systems are later combined to obtain an improved numeral recognizer. In the proposed combined recognizer, the improvement in performance is achieved due to the different features, models and modes of recognition employed.

The organization of the remaining work is as follows: Section II describes the collection of the database. Section III describes the development of the online handwritten numeral recognizer. Section IV describes the development of the offline handwritten numeral recognition system. The development of the combined online and offline recognizer is described in Section V. The summary and conclusions of the present work and the future scope are mentioned in Section VI.

II. ASSAMESE HANDWRITTEN NUMERAL DATABASE
The Assamese numeral set consists of 10 numerals, depicted in Fig. 1. The handwritten numeral examples were collected from native Assamese writers in three different sessions. In the first session, 53 writers provided one example for each of the ten numerals. In the second session, 44 of the 53 writers from the first session gave one example for each numeral. In the third session, 11 writers gave ten examples for each numeral. Out of these eleven writers, eight were common to the first and second sessions. Thus, for each numeral we have 53, 44 and 110 examples from the first, second and third sessions, respectively, and a total of 207 examples. Out of the 207 examples, the first 165 were used for training and the remaining 42 for testing. An HP Tablet PC was used to collect the data, using an open source tool provided by HP, at a sampling rate of 120 Hz. The writer was instructed to write each Assamese numeral in separate boxes displayed on the tablet PC's screen using the stylus.

Fig. 1. Assamese Numerals.

III. ONLINE ASSAMESE NUMERAL RECOGNIZER
The online recognition system is viewed in four stages, namely, preprocessing, feature extraction, modelling and testing.

A. Preprocessing
In online handwriting, a numeral is a sequence of points in the xy plane. The preprocessing stage removes duplicate points and performs size normalization, smoothing, linear interpolation and re-sampling. The size normalization is performed as follows [3]:

$$x_i' = \frac{x_i - x_{\min}}{x_{\max} - x_{\min}} \, W \tag{1}$$

$$y_i' = \frac{y_i - y_{\min}}{y_{\max} - y_{\min}} \, H \tag{2}$$

where $(x_i, y_i)$ denotes the original point, $(x_i', y_i')$ is the corresponding point after normalization, $x_{\min} = \min\{x_i\}$, $y_{\min} = \min\{y_i\}$, $x_{\max} = \max\{x_i\}$, $y_{\max} = \max\{y_i\}$, and $W$ and $H$ are the width and height of the normalized character, respectively. Here $1 \le i \le L$, and $L$ is the number of points in a numeral. Stroke smoothing is performed by a moving average filter of size three. Smoothing removes noise due to unpredictable pen motion. Before performing the linear interpolation, the cumulative distance is calculated along the arc of each numeral and used to interpolate the missing points in the raw data. The variability in the number of points in the raw data depends on the writing speed of the writer and the sampling rate of the digitizer. These variations are removed by re-sampling the coordinate sequence spatially. Each numeral is re-sampled to 60 equidistant points. The handwritten numeral before and after each stage of preprocessing is shown in Fig. 2.

[Figure: five panels (a) to (e), each plotting y coordinate values against x coordinate values]

Fig. 2. Numeral four of Assamese script (a) before normalization, (b) after normalization, (c) after smoothing, (d) after interpolation, (e) after re-sampling.

B. Feature Extraction
The preprocessed x, y coordinates are considered as features.

C. Online Numeral Models
A continuous density HMM with $N$ states and $M$ distinct observations $v_1, v_2, \ldots, v_M$ can be characterized by the state transition probabilities $A = \{a_{ij}\}$, where $a_{ij} = P(q_{t+1} = s_j \mid q_t = s_i)$, $1 \le i, j \le N$, the state conditional probabilities $b_j(k) = P(O_t = v_k \mid s_t = j)$, $k = 1, 2, \ldots, M$, where $O_t$ is the observation and $s_t$ is the active state at time $t$, and the initial state distribution $\pi_i = P(q_1 = i)$ [1]. For each numeral, an online model is developed using HMM by choosing the feature vectors from the training examples. After preprocessing, we have a 120 dimensional feature vector per numeral (60 re-sampled points with two coordinates each). The dimensionality of the feature vector, the number of states and the number of Gaussian mixtures for each state are optimized for the best performance. The HTK toolkit was used for training and testing the online models. Experimentally, the maximum performance was obtained for 4 feature vectors per numeral example, i.e., the 120 dimensions divided into four 30 dimensional vectors, with the HMM trained for 3 states and 19 Gaussian mixtures.

D. Testing
For the best performing feature dimension and HMM configuration, the performance on the testing set of 42 examples is given in the form of a confusion matrix in Table I. There are confusions among many numeral pairs. For instance, in the case of one and four, the lower curvature is similar (Fig. 1).
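The preprocessing chain of Section III-A (duplicate removal, size normalization by Eqs. (1) and (2), moving average smoothing of size three, and arc-length re-sampling to 60 points) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the normalized size W = H = 50, the handling of the stroke end points during smoothing and the zero-range guard are all assumptions made for illustration.

```python
import math

def normalize(points, width=50.0, height=50.0):
    """Size normalization following Eqs. (1) and (2); width/height are assumed values."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x_min, y_min = min(xs), min(ys)
    # Assumption: fall back to 1.0 when a coordinate has zero range (e.g. a straight vertical stroke).
    dx = (max(xs) - x_min) or 1.0
    dy = (max(ys) - y_min) or 1.0
    return [((x - x_min) / dx * width, (y - y_min) / dy * height) for x, y in points]

def smooth(points):
    """Moving average filter of size three; the two end points are kept unchanged (assumption)."""
    if len(points) < 3:
        return list(points)
    out = [points[0]]
    for a, b, c in zip(points, points[1:], points[2:]):
        out.append(((a[0] + b[0] + c[0]) / 3.0, (a[1] + b[1] + c[1]) / 3.0))
    out.append(points[-1])
    return out

def resample(points, n=60):
    """Linearly interpolate n points equally spaced along the cumulative arc length."""
    cum = [0.0]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        cum.append(cum[-1] + math.hypot(x1 - x0, y1 - y0))
    total = cum[-1]
    if total == 0.0:
        return [points[0]] * n
    out, seg = [], 0
    for k in range(n):
        target = total * k / (n - 1)
        # Advance to the segment whose arc-length interval contains the target.
        while seg < len(cum) - 2 and cum[seg + 1] < target:
            seg += 1
        span = cum[seg + 1] - cum[seg]
        t = 0.0 if span == 0.0 else (target - cum[seg]) / span
        (x0, y0), (x1, y1) = points[seg], points[seg + 1]
        out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return out

def preprocess(raw, n=60):
    """Remove consecutive duplicates, then normalize, smooth and re-sample."""
    dedup = [p for i, p in enumerate(raw) if i == 0 or p != raw[i - 1]]
    return resample(smooth(normalize(dedup)), n)
```

Note that the re-sampled points are equidistant along the pen trajectory, so the spacing no longer depends on writing speed or on the sampling rate of the digitizer. Flattening the 60 resulting (x, y) pairs gives the 120 values per numeral mentioned in the text; how those values are ordered before being split into four vectors is not stated in the paper.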

TABLE I. CONFUSION MATRIX OF THE ONLINE NUMERAL RECOGNITION SYSTEM (IN %).

class 0 1 2 3 4 5 6 7 8 9

0 1 2 3 4 5 6 7 8 9 100 - 92.8 - 7.2 - 100 - 95.2 - 4.8 - 100 2.4 - 7.1 88.1 - 2.4 - 4.8 - 95.2 - 100 - 100 - 2.4 - 2.4 - 95.2

IV. OFFLINE ASSAMESE NUMERAL RECOGNIZER
Offline numeral recognition is developed using the images constructed from the x, y coordinates. The offline recognizer is also viewed in four stages, namely, preprocessing, feature extraction, modelling and testing.

A. Preprocessing
The images are digitized in gray tone and converted to binary images. Cropping is done in order to extract the numeral for further compaction. Due to the variation of shapes and sizes, the cropped images are non-uniform, and size normalization is done to get images of size 64 × 64.

B. Feature Extraction
Different features, namely vertical and horizontal projection profiles (VPP-HPP), zonal discrete cosine transform (DCT), chain code histograms (CCH) and pixel level values, are then extracted from the preprocessed images.

1) VPP-HPP: VPP and HPP are defined as the sum of pixel values along every column and every row of a given input image, respectively. VPP shows the variation of the image along its width, while HPP shows the variation of the image along its height. The VPP and HPP vectors are unique features for a given numeral and vary from numeral to numeral. Mathematically, VPP and HPP can be represented as follows:

$$VPP(j) = \sum_{i=1}^{M} A(i, j); \quad j = 1, 2, \ldots, N \tag{3}$$

$$HPP(i) = \sum_{j=1}^{N} A(i, j); \quad i = 1, 2, \ldots, M \tag{4}$$

where $M$ and $N$ are the number of rows and columns in the given image, respectively, $i$ and $j$ are the row and column index, respectively, and $A(i, j)$ is the pixel value at the $i$th row and $j$th column. We have used 64 × 64 images and hence VPP and HPP features of 64 dimensions each. By combining the two, we have a feature of 128 dimensions for each numeral example image.

2) Zonal Discrete Cosine Transform: DCT coefficients represent an image as a sum of sinusoids of varying magnitudes and frequencies. Most of the visually significant information may be concentrated in just a few coefficients of the DCT. The 64 × 64 images were divided into 64 blocks of size 8 × 8. The low frequency coefficients occur in the upper left corner of the DCT coefficient matrix. Ten DCT coefficients were extracted from each of the 64 blocks of size 8 × 8, and thus a feature of 640 dimensions was formed for each numeral example image. The equation below shows the two dimensional DCT of an image:

$$B_{i,j} = \alpha_i \alpha_j \sum_{m=0}^{M-1} \sum_{n=0}^{N-1} A_{m,n} \cos\frac{\pi (2m+1) i}{2M} \cos\frac{\pi (2n+1) j}{2N} \tag{5}$$

where

$$\alpha_i = \begin{cases} 1/\sqrt{M}, & i = 0 \\ \sqrt{2/M}, & 1 \le i \le M-1 \end{cases} \qquad \alpha_j = \begin{cases} 1/\sqrt{N}, & j = 0 \\ \sqrt{2/N}, & 1 \le j \le N-1 \end{cases} \tag{6}$$

3) Chain Code Histogram: CCH provides the directional information of the exterior of the numeral image. In this process, the direction between two consecutive points is approximated to the nearest quantized direction. We have divided the 64 × 64 images into 16 blocks of size 16 × 16. Four CCH are used, taking the four directions into account as depicted in Fig. 3, and hence a feature of dimension 16 × 4 = 64 is formed.

Fig. 3. Chain Code Histogram computation.

4) Pixel Level Feature: The pixel intensity values of the images are taken directly after the images are resized to 16 × 16 [10]. For a particular image, a feature of 256 dimensions is formed.

5) Combination of Features: All the above four features provide different types of information about the image and hence are combined at the score level to improve the performance.

C. Offline Models using Vector Quantization
Modelling builds, for every numeral class, a model that holds all the variations of that numeral class. This reduces the computational complexity and time. We have used vector quantization (VQ) as the modelling technique, which employs the binary split algorithm and k-means clustering [9]. One offline model per numeral and feature is developed.

D. Testing
In order to evaluate the performance of the numeral recognition system, all the feature extraction techniques described above and also a combination of them are used. In the combined features case, the features are combined at the score level and, to compensate for the different ranges of the feature scores, maximum normalization and the simple sum rule for fusion are used. Confusion matrices for the different cases are shown below. Among the individual features, CCH gives the maximum recognition accuracy and VPP-HPP gives the least. ZDCT and pixel level features give intermediate recognition accuracies.
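The score level fusion described in the testing stage (maximum normalization followed by the simple sum rule) can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, the scores are assumed to be VQ distances (so the smallest fused score wins), and the numbers are made up purely to show the effect of normalization.

```python
def max_normalize(scores):
    """Maximum normalization: divide every class score by the largest score of that feature."""
    peak = max(scores)
    return [s / peak for s in scores] if peak > 0 else list(scores)

def fuse_sum_rule(scores_per_feature):
    """Simple sum rule: add the max-normalized scores of all features, class by class."""
    normalized = [max_normalize(scores) for scores in scores_per_feature]
    return [sum(column) for column in zip(*normalized)]

# Hypothetical distances for three classes from two features with very different ranges.
vpp_hpp_scores = [40.0, 10.0, 20.0]
cch_scores = [0.5, 0.75, 0.25]

fused = fuse_sum_rule([vpp_hpp_scores, cch_scores])
# Assumption: scores are distances, so the class with the smallest fused score is chosen.
predicted = min(range(len(fused)), key=fused.__getitem__)
```

In this toy case the un-normalized sum would be dominated by the feature with the larger numeric range and would select class 1, whereas after maximum normalization both features contribute comparably and class 2 is selected.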
TABLE II. CONFUSION MATRIX OF THE OFFLINE NUMERAL RECOGNITION SYSTEM USING VPP-HPP (IN %).

TABLE IV. CONFUSION MATRIX OF THE OFFLINE NUMERAL RECOGNITION SYSTEM USING CCH (IN %).

class 0 1 2 3 4 5 6 7 8 9

0 95.2 9.5 2.4 7.1 2.4 -

1 64.3 9.5 2.4 11.9

2 2.4 80.9 2.4 -

3 2.4 73.8 4.8 23.8 2.4 -

4 9.5 92.9 2.4 2.4 2.4

5 4.7 78.5 4.8 7.1 -

6 4.8 19.1 9.5 76.2 2.4 -

7 8 9 2.4 2.4 - 11.9 - 2.4 92.8 - 88.1 2.4 - 83.3

class 0 1 2 3 4 5 6 7 8 9

0 95.2 2.4 2.4 7.1 4.8

1 95.2 2.4 2.4 19

2 2.4 90.5 2.4

3 2.4 88.1 7.1 16.7 -

4 5 6 7 8 9 - 2.4 - 2.4 2.4 - 4.7 - 9.5 95.2 - 2.4 - 83.3 2.4 - 83.3 2.4 - 97.6 2.4 - 95.2 2.4 - 2.4 69.0

TABLE V. CONFUSION MATRIX OF THE OFFLINE NUMERAL RECOGNITION SYSTEM USING PIXEL LEVEL FEATURE (IN %).

In Table II, confusion is seen to arise between patterns whose structural information is similar. For example, Assamese numeral 0 is confused with Assamese numerals 3 and 7 because the vertical and horizontal projections are similar in all three cases.
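Why projection profiles alone can confuse different shapes is easy to see on a toy example. The sketch below computes VPP and HPP as the column sums and row sums of a binary image, as in Eqs. (3) and (4); the function name and the 3 × 3 images are hypothetical and chosen only for illustration.

```python
def vpp_hpp(image):
    """Return (VPP, HPP): column sums and row sums of a binary image given as a list of rows."""
    vpp = [sum(column) for column in zip(*image)]  # one value per column, Eq. (3)
    hpp = [sum(row) for row in image]              # one value per row, Eq. (4)
    return vpp, hpp

# Two clearly different strokes: a falling diagonal and a rising diagonal.
falling = [[1, 0, 0],
           [0, 1, 0],
           [0, 0, 1]]
rising = [[0, 0, 1],
          [0, 1, 0],
          [1, 0, 0]]

profiles_falling = vpp_hpp(falling)
profiles_rising = vpp_hpp(rising)
```

Both images yield exactly the same VPP and HPP vectors even though the strokes differ, which is the kind of ambiguity that the other features (CCH, zonal DCT, pixel level values) can help resolve.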
TABLE III. CONFUSION MATRIX OF THE OFFLINE NUMERAL RECOGNITION SYSTEM USING ZDCT (IN %).

class 0 1 2 3 4 5 6 7 8 9

0 97.6 2.4 16.7 28.5

1 97.6 2.4 4.8 4.8

2 78.6 9.5 2.4

3 2.4 7.1 88.1 2.4 16.7 -

4 2.4 97.6 2.4 2.4 2.4 -

5 71.4 2.4 2.4 -

6 7 8 9 - 2.4 - 9.5 2.4 9.5 7.1 78.5 - 2.4 - 97.6 - 71.4 9.5 - 64.3

TABLE VI. CONFUSION MATRIX OF THE OFFLINE NUMERAL RECOGNITION SYSTEM (IN %) FOR COMBINED FEATURE.

class 0 1 2 3 4 5 6 7 8 9

0 88.1 2.4 2.4 2.4 2.4 -

1 2 3 4 5 6 7 8 9 2.4 - 2.4 - 4.7 2.4 76.2 - 4.7 2.4 - 16.7 - 90.5 4.7 - 2.4 - 76.2 - 21.4 9.5 2.4 - 85.7 - 2.4 - 7.1 - 76.2 11.9 - 2.4 - 16.7 - 76.2 - 4.7 2.4 - 2.4 - 92.8 2.4 2.4 - 2.4 - 92.8 2.4 16.7 - 2.4 - 80.9

class 0 1 2 3 4 5 6 7 8 9

0 1 2 3 4 5 6 7 8 9 100 - 97.6 - 2.4 - 100 - 95.2 - 4.8 - 100 2.4 - 2.4 - 95.2 - 2.4 - 97.6 - 100 - 100 - 7.1 2.4 - 90.5

The shapes of Assamese numerals like 3 and 6 are similar. Hence, the energy distributions of the two patterns in the spectral domain may be somewhat similar, leading to confusion. The corresponding confusion pairs can be seen in Table III. Confusion also arises in patterns having similar directional information. For example, Assamese numerals like 3 and 6 have similar directions in their blocks of pixels. The corresponding confusion pairs can be seen in Table IV. The confusion between Assamese numerals 2 and 9 that appears when the energy distribution is considered in the spectral domain vanishes when their energy distributions are taken in the spatial domain, because their shapes are different.
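The spectral feature discussed here, the zonal DCT, can be sketched directly from Eqs. (5) and (6): a two dimensional DCT per 8 × 8 block, keeping ten low frequency coefficients from the upper left corner of each block. The paper does not state which ten coefficients are kept, so the zigzag selection in `ZONE` is an assumption, as are the function names; the code also assumes square blocks (M = N).

```python
import math

def dct2(block):
    """Two dimensional DCT of a square block, following Eqs. (5) and (6) with M = N."""
    n = len(block)
    cos = [[math.cos(math.pi * (2 * m + 1) * k / (2 * n)) for m in range(n)]
           for k in range(n)]
    alpha = [math.sqrt(1.0 / n)] + [math.sqrt(2.0 / n)] * (n - 1)
    return [[alpha[i] * alpha[j] * sum(block[m][q] * cos[i][m] * cos[j][q]
                                       for m in range(n) for q in range(n))
             for j in range(n)]
            for i in range(n)]

# Assumption: the ten coefficients are the first ten in zigzag order (row, column).
ZONE = [(0, 0), (0, 1), (1, 0), (2, 0), (1, 1),
        (0, 2), (0, 3), (1, 2), (2, 1), (3, 0)]

def zonal_dct_feature(image, block=8):
    """Concatenate the ZONE coefficients of every block x block tile of the image."""
    feature = []
    for r in range(0, len(image), block):
        for c in range(0, len(image[0]), block):
            tile = [row[c:c + block] for row in image[r:r + block]]
            coeffs = dct2(tile)
            feature.extend(coeffs[i][j] for i, j in ZONE)
    return feature

# A uniform 64 x 64 image: 64 tiles times 10 coefficients gives 640 values.
feature = zonal_dct_feature([[1] * 64 for _ in range(64)])
```

For a 64 × 64 image this produces the 640 dimensional feature described in Section IV. On a uniform image only the first (DC) coefficient of each block is non-zero, which illustrates how the low frequency zone captures the coarse energy distribution of a shape.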

In numeral patterns like 1 and 4, confusion exists when using ZDCT, pixel level features and CCH, as their energy distributions in the spectral and spatial domains as well as their directional information remain more or less the same. However, this confusion is compensated by the VPP-HPP feature, as their horizontal projections are easily distinguishable. A similar phenomenon occurs in the other confusing pairs.

V. COMBINED ASSAMESE NUMERAL RECOGNIZER
The scheme for the proposed combined Assamese numeral recognizer is given in Fig. 4. During testing, the Euclidean distances are calculated in the offline numeral recognition system, and the log likelihoods are computed in the online numeral recognition system. After normalizing these Euclidean distances and log likelihoods to the range 0 to 1, both are added, and the test example is assigned to the class for which the combined value is minimum.
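A minimal sketch of this combination rule is given below. Two details are assumptions rather than statements from the paper: min-max scaling is used to bring both score types into the range 0 to 1, and the log likelihoods are negated before scaling so that, like the distances, a smaller value means a better match (otherwise a minimum rule would favour unlikely classes). Function names and numbers are hypothetical.

```python
def min_max(values):
    """Scale values linearly to the range 0 to 1 (assumed normalization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]

def combine(offline_distances, online_log_likelihoods):
    """Return (predicted class index, combined scores) for one test numeral."""
    # Assumption: negate log likelihoods so that smaller is better, as for distances.
    online_costs = [-ll for ll in online_log_likelihoods]
    combined = [d + c for d, c in zip(min_max(offline_distances), min_max(online_costs))]
    return min(range(len(combined)), key=combined.__getitem__), combined

# Hypothetical scores for three classes.
distances = [4.0, 2.0, 3.0]             # offline system alone would pick class 1
log_likelihoods = [-50.0, -90.0, -40.0]  # online system alone would pick class 2
label, scores = combine(distances, log_likelihoods)
```

Here the two systems disagree, and the combined score settles on class 2 because the online evidence for it is strong while the offline evidence against it is moderate.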

TABLE VIII. AVERAGE RECOGNITION RATES OF THE ONLINE, OFFLINE AND COMBINED SYSTEMS (IN %).

System     Features                                            Classifier   Recognition rate
Online     x, y coordinates                                    HMM          96.6
Offline    VPP-HPP, Zonal DCT, CCH and pixel level features    VQ           97.6
Combined                                                                    99.3

Fig. 4. Block diagram of the proposed combined numeral recognizer.

The confusion matrix for the combined numeral recognizer is given in Table VII. Compared to the performance of the individual systems, namely, online given in Table I and offline given in Table VI, the performance of the combined recognizer is better. For instance, in the individual systems, the recognition rates of the Assamese numerals 5 and 9 are comparatively lower. After combining the scores, the corresponding recognition rates improve. Some of the confusion pairs that arose in the individual systems are eliminated or minimized in the combined system. The average recognition rates of the online, offline and combined systems are shown in Table VIII.

TABLE VII. CONFUSION MATRIX OF THE COMBINED NUMERAL RECOGNITION SYSTEM (IN %).

class 0 1 2 3 4 5 6 7 8 9

0 100 2.4 -

1 100 2.4

2 100 -

3 100 2.4 -

4 5 6 7 8 9 100 - 97.6 - 97.6 - 100 - 100 - 97.6

VI. SUMMARY AND CONCLUSIONS
We have presented a combined online and offline approach for the recognition of Assamese numerals. The average recognition rate of the combined system increased to 99.3%, compared with 96.6% and 97.6% for the individual online and offline systems, respectively. Since we did not use any script dependent information, the approach can be extended to other Indian scripts similarly. Future work may include developing applications based on the numeral recognizer and extending similar work to isolated character recognition. There is also a need for a feature set that can further reduce the misclassification rate at the individual system level.

VII. ACKNOWLEDGEMENT
This work is part of the ongoing project on the development of an online Assamese handwriting recognition system funded by the TDIL division, DIT, Govt. of India.

REFERENCES
[1] L. R. Rabiner, "A tutorial on Hidden Markov Models and selected applications in speech recognition," Proc. of IEEE, vol. 77, no. 2, pp. 257-286, 1989.
[2] N. Arica and F. T. Yarman-Vural, "An Overview of Character Recognition Focused on Off-line Handwriting," IEEE Trans. Systems, Man, Cybernetics Part C: Applications and Reviews, vol. 31, no. 2, pp. 216-233, May 2001.
[3] X. Li and D.-Y. Yeung, "Online handwritten alphanumeric character recognition using dominant points in strokes," Pattern Recognition, vol. 30, no. 1, pp. 31-44, 1997.
[4] S. K. Parui, K. Guin, U. Bhattacharya and B. B. Chaudhuri, "Online Bangla Handwritten Character Recognition using HMM," in Proc. 19th Int. Conf. on Pattern Recognition (ICPR), Tampa FL, 2008, pp. 1-4.
[5] V. J. Babu, L. Prasanth, R. R. Prasanth, R. R. Sharma, G. V. P. Rao and A. Bharath, "HMM-based online handwriting recognition system for Telugu symbols," in Proc. 9th Int. Conf. on Document Analysis and Recognition (ICDAR), Curitiba, Brazil, 2007, pp. 63-67.
[6] K. Shashikiran, K. S. Prasad, R. Kunwar and A. G. Ramakrishnan, "Comparison of HMM and SDTW for Tamil handwritten character recognition," in Proc. Int. Conf. on Signal Processing and Communications, IISc Bangalore, India, 2010, pp. 1-4.
[7] A. Arora and A. M. Namboodiri, "A Hybrid Model for Recognition of Online Handwriting in Indian Scripts," in Proc. of Int. Conf. on Frontiers in Handwriting Recognition, Kolkata, 2010, pp. 433-438.
[8] M. Liwicki and H. Bunke, "Combining On-line and Offline systems for Handwriting Recognition," in Proc. Int. Conf. on Document Analysis and Recognition, Curitiba, Brazil, 2007, pp. 372-376.
[9] L. Rabiner and B.-H. Juang, Fundamentals of Speech Recognition, PTR Prentice-Hall, Inc., pp. 242-285, 1993.
[10] D. Keysers, C. Gollan and H. Ney, "Local Context in Non-Linear Deformation Models for Handwritten Character Recognition," in Proc. of Int. Conf. on Pattern Recognition, Germany, 2004, pp. 511-514.
[11] H. R. Mamatha, K. S. Murthy, S. Sudan, Vinay G. Raj and Sumukh S. Jois, "Fan Beam Projection Based Features to Recognize Handwritten Kannada Numerals," in Int. Conf. on Software and Computer Applications, Singapore, 2011, pp. 257-262.
[12] H. Aljuaid, D. Mohamad and M. Sarfraz, "Arabic Handwriting Recognition Using Projection Profile and Genetic Approach," in Int. Conf. on Signal-Image Technology and Internet Based Systems, Marrakesh, 2009, pp. 118-125.
