IMPROVEMENTS ON HANDWRITTEN DIGIT RECOGNITION BY COOPERATION OF MODULAR NEURAL NETWORKS
Claudio A Perez
Patricio A Galdames
Carlos A Holzmann
Department of Electrical Engineering, University of Chile. Av. Tupper 2007, Santiago, Chile
ABSTRACT
In this paper modular neural networks are used to improve handwritten digit recognition. To evaluate the performance of modular networks, a comparison is made with a global neural network on the same database. Two basic kinds of modular networks are considered. In the first one, seven expert modular networks are used. Five of them are provided for digits 0, 1, 2, 5, 6, 7. The other two modular networks are for the pairs of digits 3-8 and 4-9, respectively. The second kind of modular neural network considers an expert module for each feature extracted from the handwritten digit image. The cooperation is among modules extracting slope and radial projection from each digit. Two types of cooperation among modular networks are considered: a neural network and a weighted combination of the modules' outputs. The models were trained with a set of 1,837 handwritten digits and tested on a different set of 918 digits, where the best weight set was selected for each neural network; finally, results were validated on a different set of 919 digits. Results show that by using modular networks for features, it is possible to improve classification performance on handwritten digits from 91.0% in the case of global networks to 93.5% for modular networks.
1. INTRODUCTION
Specialization is found in the nervous system as modules dedicated to processing specific functions such as vision, touch, hearing, etc. [1,2]. The notion of specialization or modularity was implemented in expert systems for decision making in the past decade. In these applications problems were decomposed into subproblems and then solved by specialist modules [3,4]. This notion was also implemented in neural networks to create expert modules for subproblems. It has been stated that global neural networks have disadvantages compared to modular networks in plasticity, in difficulties learning heterogeneous tasks and in training time when large networks are needed [5]. Since the early 90's, the combination of multiple classifiers has been proposed as a new direction for the development of character recognition systems [5,6]. Several forms of cooperation between modules, such as voting, Bayes and confusion matrices, were considered in [7,8]. Jacobs et al. [9] considered a model composed of modular and gating networks to combine their outputs. For each input the gating network stochastically determines the appropriate expert module to respond [9]. Hashem [10] considered a weighted combination of the outputs of the expert networks. This approach was applied to approximate functions of one variable. A similar cooperation scheme was presented by Tresp [11], applied to handwritten digits. Fuzzy logic has also been applied to the cooperation among modules [12].
0-7803-4778-1/98 $10.00 © 1998 IEEE
In the recognition of handwritten digits, Sebire [13] defined a number of experts less than or equal to the total number of classes. The cooperation among experts is performed using perceptrons and a winner-takes-all technique. In other applications one feature has been extracted as input to each classifier [14,15]. Cao and Ahmadi [16] used principal components in the cooperation of multiple experts. The dimension of the input was reduced significantly and results were better than those obtained with a backpropagation network. Error rates were less than 1% for rejection rates of 10-15%. For rejection rates higher than 18% both classifiers yield similar results.
In the literature, results on handwritten digit recognition rates vary between 68% [17] and 97.7% [14]. It is not possible to compare different systems only on the basis of the correct classification rate, since most systems have been tested on different data sets and under different conditions [14]. Finding a method to determine the optimum architecture for a modular neural network, and the best form of cooperation among modules, are among the problems being addressed by the scientific community [8,9,18].
Improvements on the recognition rates of handwritten digits by standard neural networks (fully connected, feedforward, backpropagation) were introduced by augmenting the training set by shifting and magnification [19-21]. In [20-22] genetic algorithms were applied to select the appropriate number of hidden units of the network. In this paper, modular networks are used to improve handwritten digit recognition. Two types of cooperation schemes are used: one with a neural network and another with adjustable weights.
2. COOPERATION AMONG MODULAR NETWORKS
Two types of modular networks are considered. The first one considers a specialist module for each digit. Therefore, the number of modules of the modular network will be equal to the defined number of classes. The second type of modular network uses modules specialized in one feature extracted from the handwritten digit image. One of the features is the slope, which is extracted by a gradient operator. The other feature is the radial projection with respect to the geometrical center.
Experts in Digits. An expert module is created for each digit (or subset of digits) and trained to recognize only that digit (or subset of digits). For the pairs of digits 3-8 and 4-9, where most confusions are produced, only one module was defined for each pair. Module i should have a high output only when digit i appears in the input. Modules for subsets of digits have a number of outputs equal to the number of digits in the subset. In Figure 1, a general scheme for expert modules per digit is shown. In Fig. 1(a) cooperation is restricted to the outputs of the modules. In Fig. 1(b), cooperation is extended to the input patterns.
Figure 1: (a) Modular network with expert modules per digit and with cooperation among its outputs. (b) Cooperation is extended to the input patterns.
Figure 2: (a) Modular network with expert modules per feature and cooperation among its outputs. (b) Cooperation is extended to the input patterns.
Expert Modules in Features. Cooperation schemes for the expert modules in one feature are shown in Figure 2. In Fig. 2(a) there is cooperation only among the outputs of the modular networks. In Fig. 2(b) cooperation is extended to the input patterns.
Slope detection. Slopes are detected in several directions for each character. The operation is implemented by convolution of the 23x15 handwritten digit image with a 3x3 Prewitt gradient operator [23] rotated for the desired directions. Four directions are considered for the gradient: 0°, 45°, 90°, and 135°. The rotated Prewitt operators are shown in Figure 3. The result of the convolution, $C_{ij}$, between the gradient operator, $G_{kl}$, and the image, $A_{ij}$, is represented in equation (1). For the case of a 3x3 operator, n=m=3.

$$C_{ij} = \sum_{k=1}^{n} \sum_{l=1}^{m} G_{kl}\, A_{i+k-1,\, j+l-1} \qquad (1)$$

Figure 4 shows the result of convolving the gradient operator with the image of a handwritten 5, considering (a) the original image and (b-d) the four rotations.
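The slope detection step can be illustrated with a minimal sketch in plain Python. Several details here are assumptions and not taken from the paper: the base kernel chosen for 0°, the direction of rotation, the function names (`rotate45`, `prewitt_kernels`, `apply_kernel`), and the handling of borders (only fully covered positions are computed, so a 23x15 image gives a 21x13 result). The sliding-window sum is written without flipping the kernel; for these antisymmetric kernels a flip would only change the sign of the response.

```python
def rotate45(kernel):
    """Rotate a 3x3 kernel by 45 degrees by shifting its outer ring one step."""
    ring = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    values = [kernel[r][c] for r, c in ring]
    rotated = [row[:] for row in kernel]
    for idx, (r, c) in enumerate(ring):
        rotated[r][c] = values[(idx - 1) % 8]
    return rotated


def prewitt_kernels():
    """Return Prewitt kernels for 0, 45, 90 and 135 degrees (orientation assumed)."""
    kernels = {0: [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]]}
    for angle in (45, 90, 135):
        kernels[angle] = rotate45(kernels[angle - 45])
    return kernels


def apply_kernel(image, kernel):
    """Sliding-window sum of kernel times image over fully covered positions."""
    rows, cols = len(image), len(image[0])
    n, m = len(kernel), len(kernel[0])
    return [
        [
            sum(kernel[k][l] * image[i + k][j + l]
                for k in range(n) for l in range(m))
            for j in range(cols - m + 1)
        ]
        for i in range(rows - n + 1)
    ]


def slope_features(image):
    """Apply the four rotated kernels and return one response map per angle."""
    return {angle: apply_kernel(image, k) for angle, k in prewitt_kernels().items()}
```

Calling `slope_features(image)` on a 23x15 list of lists returns four response maps, one per direction, which could then be flattened into the input vector of a slope module.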
Radial Projections Relative to the Geometric Center. The radial projection relative to the geometric center of the handwritten digit is obtained by determining the radial distribution of the digit's pixels with respect to its geometric center. First, the geometric center of the character is obtained. A convex region is formed to include the mass of the character by segments tangential to its contour, as shown in Figure 5. The average vertical and horizontal lengths of the convex region define the geometric center of the character. Second, the radial projection is obtained by integrating the digit's mass in the direction of the segment joining the geometric center with one of the pixels of the perimeter of the 23x15 image. Therefore, the resulting vector of radial projections has 76 elements.
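A rough sketch of the radial projection follows, under stated assumptions. The geometric center is approximated here by the center of the bounding box of the non-zero pixels, which is a simplification of the convex-region construction described above. The perimeter is walked side by side with corners counted twice, which gives 2 x (23 + 15) = 76 rays and matches the stated vector length; whether the paper counts the perimeter this way is not known. The function names and the `steps` sampling parameter are hypothetical.

```python
def perimeter_pixels(rows, cols):
    """Walk the four image sides; corners appear twice, giving 2*(rows+cols) points."""
    top = [(0, c) for c in range(cols)]
    right = [(r, cols - 1) for r in range(rows)]
    bottom = [(rows - 1, c) for c in reversed(range(cols))]
    left = [(r, 0) for r in reversed(range(rows))]
    return top + right + bottom + left


def bounding_box_center(image):
    """Center of the bounding box of non-zero pixels (stand-in for the convex region)."""
    ink = [(r, c) for r, row in enumerate(image) for c, v in enumerate(row) if v]
    rs = [r for r, _ in ink]
    cs = [c for _, c in ink]
    return (min(rs) + max(rs)) / 2, (min(cs) + max(cs)) / 2


def radial_projection(image, steps=32):
    """Sum the pixel values met along each segment from the center to a perimeter pixel."""
    rows, cols = len(image), len(image[0])
    cr, cc = bounding_box_center(image)
    projection = []
    for pr, pc in perimeter_pixels(rows, cols):
        visited = set()
        for s in range(steps + 1):
            t = s / steps
            visited.add((round(cr + t * (pr - cr)), round(cc + t * (pc - cc))))
        projection.append(sum(image[r][c] for r, c in visited))
    return projection
```

For a 23x15 binary image, `radial_projection(image)` returns a list of 76 values. The image must contain at least one non-zero pixel, since the center is undefined otherwise.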
Figure 3: Prewitt operator rotated for (a) 0°, (b) 45°, (c) 90°, (d) 135°.
Figure 4: (a) Original image of a handwritten 5. (b-d) Result of the convolution of the Prewitt operator for 0°, 45°, 90° and 135°.
Figure 5: (a) Original handwritten 3 digitized in a 23x15 image. (b) Convex region containing the digit.
Types of Cooperation. Two types of cooperation are considered. In the first type, the cooperation is performed by a neural network receiving the outputs of the modular networks. Additionally, the input patterns could also be considered as inputs for this network. In the second type of cooperation, adjustable weights are used to combine the modular networks' outputs. The weights are adjusted iteratively to maximize the overall recognition rate. The cooperation scheme provides 10 outputs (one for each digit) and the maximum is selected as the system response.
3. TRAINING AND TESTING
Network Dimensions. According to previous work [19,21] using global neural networks for the problem of handwritten digit recognition, the dimensions of a two-hidden-layer network are 345xN1xN2x10. Each network is trained by backpropagation using a training set augmented by shifting the patterns in the input.
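To make the 345xN1xN2x10 dimensions concrete (345 = 23 x 15 input pixels, 10 output classes), the following is a minimal forward-pass sketch. The sigmoid activation, the initial weight range, the hidden sizes used in the usage note, and the names `init_layer`, `build_network` and `forward` are assumptions made for illustration; training by backpropagation is not shown.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """One weight row per output unit; the last entry of each row is the bias."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in + 1)] for _ in range(n_out)]


def build_network(n1, n2, seed=0, n_inputs=345, n_outputs=10):
    """Create random weights for a n_inputs x n1 x n2 x n_outputs network."""
    rng = random.Random(seed)
    sizes = [n_inputs, n1, n2, n_outputs]
    return [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]


def forward(layers, x):
    """Propagate an input vector through fully connected sigmoid layers."""
    activation = list(x)
    for weights in layers:
        activation = [
            1.0 / (1.0 + math.exp(-(row[-1] + sum(w * a for w, a in zip(row, activation)))))
            for row in weights
        ]
    return activation
```

For example, `forward(build_network(20, 15), pixels)` with a flattened 345-element image returns 10 values in (0, 1), one per digit class; the hidden sizes 20 and 15 are placeholders, not the values selected in [19,21].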
Training, Testing and Validation Sets
A database of 3,674 handwritten digits, obtained from university students, is used for training, testing and validation. The database was segmented into three subsets. The training set is composed of 1,837 patterns and is used for training global and modular networks. Augmentation of the training set was performed by shifting the input patterns [19,21], which is in part equivalent to artificially expanding the training set to 9,185 patterns. The testing set is formed by 918 patterns, different from those of the training set. The testing set is used to adjust the cooperation scheme after the modular networks have been trained individually. The validation set is composed of 919 different patterns and is used only to determine the generalization performance of the network. Figure 6 shows a sample of the handwritten digit database: in (a), 110 digits used in training and in (b), a set of 110 digits used for validation.
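The expansion from 1,837 to 9,185 patterns is a factor of five, which is consistent with keeping each original pattern and adding four shifted copies. The sketch below assumes one-pixel shifts up, down, left and right with zero fill at the border; the actual shift offsets are described in [19,21] and are not restated here, so these choices and the function names are illustrative only.

```python
def shift(image, dr, dc, fill=0):
    """Return a copy of image moved by dr rows and dc columns, padding with fill."""
    rows, cols = len(image), len(image[0])
    return [
        [
            image[r - dr][c - dc] if 0 <= r - dr < rows and 0 <= c - dc < cols else fill
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def augment(images, offsets=((0, 0), (-1, 0), (1, 0), (0, -1), (0, 1))):
    """Return every image under every offset (original included via (0, 0))."""
    return [shift(img, dr, dc) for img in images for dr, dc in offsets]
```

With the five default offsets, a training set of 1,837 images yields 1,837 x 5 = 9,185 patterns.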
Figure 6: Sample of the handwritten digit database. (a) 110 digits used in training; (b) a set of 110 digits used for validation.
Weight Adjustment. Based on work by Hashem [10] and Tresp [11], an algorithm to adjust the weights was developed to maximize the total recognition rate for a testing database. The performance of the system is measured on the validation set. The algorithm does not guarantee the optimum weight set because the method performs a random search to find the best set [24]. For K modular networks, the ith output of the classifier $e_k(x)$ is denoted as $s_i^k(x)$. The cooperation consists of a linear combination of the modules' outputs.
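$$P_i(x) = \sum_{k=1}^{K} \alpha_i^k\, s_i^k(x) \qquad (2)$$

with $P_i(x)$ the ith output of the classification system, $\alpha_i^k$ the weight applied to $s_i^k(x)$, $\forall i \in \Lambda$, $1 \le k \le K$, and $\Lambda = \{1, 2, \ldots, 10\}$ the set of symbols identifying each digit. The weights $\alpha_i^k$ should be computed so that each component of the output vector is normalized to a maximum of 1. Therefore the weights are normalized according to equation (3).

$$\sum_{k=1}^{K} \alpha_i^k = 1, \quad \forall i \in \Lambda \qquad (3)$$

The weights $\alpha_i^k$ are computed using an algorithm similar to [24].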
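The weighted cooperation can be sketched as follows. This is not the authors' algorithm: the search below is a plain accept-if-better random perturbation, used as a stand-in for the method similar to [24], and the normalization assumes that, for each class, the weights of the K modules sum to 1 and that module outputs lie in [0, 1]. All function names, the perturbation size and the iteration count are hypothetical.

```python
import random


def normalize(weights):
    """Scale weights[k][i] so that, for each class i, the K module weights sum to 1."""
    K, n = len(weights), len(weights[0])
    totals = [sum(weights[k][i] for k in range(K)) for i in range(n)]
    return [[weights[k][i] / totals[i] for i in range(n)] for k in range(K)]


def combine(module_outputs, weights):
    """Linear combination of module outputs: one combined score per class."""
    K, n = len(weights), len(weights[0])
    return [sum(weights[k][i] * module_outputs[k][i] for k in range(K)) for i in range(n)]


def recognition_rate(samples, labels, weights):
    """Fraction of samples whose largest combined output matches the label."""
    correct = 0
    for outputs, label in zip(samples, labels):
        scores = combine(outputs, weights)
        if scores.index(max(scores)) == label:
            correct += 1
    return correct / len(labels)


def random_search(samples, labels, K, n_classes, iterations=200, seed=0):
    """Perturb the weights at random and keep a candidate only if the rate improves."""
    rng = random.Random(seed)
    best = normalize([[1.0] * n_classes for _ in range(K)])
    best_rate = recognition_rate(samples, labels, best)
    for _ in range(iterations):
        candidate = normalize(
            [[max(1e-6, w + rng.gauss(0.0, 0.1)) for w in row] for row in best]
        )
        rate = recognition_rate(samples, labels, candidate)
        if rate > best_rate:
            best, best_rate = candidate, rate
    return best, best_rate
```

Here `samples[n][k][i]` is the ith output of module k for pattern n. Following the protocol in the text, the search would be run on the testing set and the resulting weights evaluated once on the validation set.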
Recognition Rate
All networks were trained for at least 100 epochs, selecting the set of weights that maximizes the recognition rate of the network on the testing set. Once the weight set was chosen, the network was applied to the validation set to determine the generalization performance of the network.
To show with some level of confidence that the improvements in recognition rate are not due to local minima, all networks were trained from ten different random starting weight sets. Results are presented as the average recognition rate of the ten runs and the standard deviation. The Student t-test was used to determine whether differences between models were statistically significant. A p<0.05 was considered to be statistically significant.
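The comparison between two models can be sketched with a two-sample Student t statistic. The pooled-variance (equal variance) form is assumed here, since the paper does not state which variant of the test was used, and the function name is hypothetical.

```python
import math
import statistics


def student_t(a, b):
    """Two-sample t statistic with pooled variance; returns (t, degrees of freedom)."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)) / (
        na + nb - 2
    )
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2
```

With ten runs per model there are 18 degrees of freedom, and an absolute t value above roughly 2.10 corresponds to p<0.05 in a two-sided test.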
4. RESULTS
Table 1 shows the results of the classification for the validation set, considering a rejection threshold of 0.5. The first column shows the average recognition rate in % for the 10 simulations. The second column shows the recognition rate in % for the best trained network. The rows correspond to the following architectures: GNI=global network with an image as input; GNS=global network with slopes as inputs; GNP=global network with radial projection; MNCN=modular network for image, radial projection and slope with cooperation performed by a neural network; MNCW=modular network for image, radial projection and slope with cooperation by weights; MNDN=modular networks for digits and cooperation by a neural network; MNDP=modular networks for digits and cooperation by weights.
In the case of MNCN, no significant improvement was measured if the input patterns were included in the cooperation in addition to the outputs of the modular networks. Therefore, the increase in the dimensions of the network for cooperation is not justified. It is observed from Table 1 that the modular network with cooperation by weights (MNCW) presents a higher recognition rate than global networks (p<0.001).
Table 2 has the same organization as Table 1. In Table 2, no rejection was considered. It is observed that modular networks for features achieve the best classification performance. The differences in classification rate between global and modular neural networks are highly significant (p<0.0001). There were no significant differences between cooperation with the neural network and that obtained by the weights (p=0.068). There were no significant differences when input patterns were included in the cooperation in addition to the outputs from the modular networks. Figure 7 shows the error rate as a function of the rejection rate for global networks and for modular networks for features. It is observed that the lowest error rates are obtained for the modular network and for the global network with slopes as inputs. The highest error rates correspond to the case of the global network with radial projection as input.
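The trade-off between rejection and error can be sketched as below. The rejection rule is an assumption: a pattern is rejected when its largest output falls below the threshold, and both rates are computed over all patterns. The paper does not spell out either convention, and the function names are hypothetical.

```python
def classify(outputs, threshold):
    """Return the index of the largest output, or None if it is below the threshold."""
    best = max(range(len(outputs)), key=outputs.__getitem__)
    return best if outputs[best] >= threshold else None


def rejection_and_error_rates(samples, labels, threshold):
    """Return (rejection rate, error rate), both as fractions of all patterns."""
    decisions = [classify(outputs, threshold) for outputs in samples]
    total = len(labels)
    rejected = sum(d is None for d in decisions)
    errors = sum(d is not None and d != y for d, y in zip(decisions, labels))
    return rejected / total, errors / total


def error_curve(samples, labels, thresholds):
    """Evaluate the two rates for each threshold, e.g. to plot error against rejection."""
    return [rejection_and_error_rates(samples, labels, t) for t in thresholds]
```

Sweeping the threshold from 0 upward produces pairs of rejection and error rates of the kind plotted for each architecture; a threshold of 0.5 and a threshold of 0 correspond to the two settings reported in the tables.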
Table 1: Results of classification performance on the validation set, considering a rejection threshold of 0.5. The first column shows the average recognition rate in % for the 10 simulations. The second column shows the recognition rate in % for the best trained network. The rows correspond to the following architectures: GNI=global network with an image as input; GNS=global network with slopes as inputs; GNP=global network with radial projection; MNCN=modular network for image, radial projection and slope with cooperation performed by a neural network; MNCW=modular network for image, radial projection and slope with cooperation by weights; MNDN=modular networks for digits and cooperation by a neural network; MNDP=modular networks for digits and cooperation by weights.

Rejection threshold = 0.5
Network               Correct classification, mean ± STD [%]   Best result [%]
GNI                   90.0 ± 0.2                                [illegible]
GNS                   [illegible]                               [illegible]
GNP                   71.9 ± 0.6                                [illegible]
MNCN without Image    92.0 ± 0.8                                92.7
MNCN with Image       91.9 ± 0.3                                92.4
MNCW                  93.1 ± 0.2                                93.5
MNDN                  90.4 ± 0.5                                91.2
MNDP                  90.4 ± 0.9                                91.9
Table 3 shows the confusion matrix obtained for the modular network for image, radial projection and slope with cooperation by weight adjustment when applied to the validation set. Each row shows the number of confusions of the network for the digit specified on that row. From this table it is possible to identify the cases with the largest number of confusions and therefore to develop strategies to eliminate the sources of confusion. It is observed that the largest numbers of confusions are for the digit pairs 3-5, 4-9 and 5-6. They account for 16 cases of confusion.
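A confusion matrix with an extra rejection column can be built as in the sketch below. It assumes digits are labeled 0 to 9, that column index 10 counts rejected patterns, and that a rejected pattern is represented by `None`; the function names are hypothetical.

```python
def confusion_matrix(true_labels, decisions, n_classes=10):
    """Rows: true digit. Columns 0..n_classes-1: assigned digit; last column: rejected."""
    matrix = [[0] * (n_classes + 1) for _ in range(n_classes)]
    for truth, decision in zip(true_labels, decisions):
        matrix[truth][n_classes if decision is None else decision] += 1
    return matrix


def largest_confusions(matrix, top=3):
    """Return the top off-diagonal entries as (count, true digit, assigned digit)."""
    n = len(matrix)
    pairs = [
        (matrix[t][p], t, p)
        for t in range(n)
        for p in range(n)
        if t != p and matrix[t][p] > 0
    ]
    return sorted(pairs, reverse=True)[:top]
```

`largest_confusions` ignores the diagonal and the rejection column, so it lists only true misclassifications, which is the information used above to identify the most frequently confused digit pairs.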
Table 2: Rejection threshold = 0
Network               Correct classification, mean ± STD [%]   Best result [%]
GNI                   92.3 ± 0.3                                92.7
GNS                   92.7 ± 0.4                                93.1
GNP                   76.8 ± 0.8                                78.1
MNCN without Image    93.7 ± 0.5                                94.0
MNCN with Image       93.3 ± 0.5                                94.7
MNCW                  94.4 ± 0.2                                94.9
MNDN                  91.[illegible] ± 0.4                      92.3
MNDP                  90.9 ± 0.7                                92.2
Figure 7: Error rate [%] as a function of the rejection rate [%] for global networks and for modular networks for features.
Table 3: Confusion matrix for the modular network considering image, radial projection and slope with cooperation by weight adjustment on the validation set. Column number 10 indicates the number of patterns rejected for each digit. Each row shows the confusions per digit.
5. CONCLUSIONS
In this work two types of cooperation among modular neural networks were presented and results were compared to global neural networks for the problem of handwritten digit recognition. Results show that by using modular networks for features, it is possible to improve classification performance on handwritten digits from 91.0% in the case of global networks to 93.5% for modular networks. This improvement was achieved for a rejection threshold of 0.5. Furthermore, when no rejection is applied, it is possible to improve recognition rates from 93.1%, for the case of the global network, to 94.9% for the case of a modular network.
ACKNOWLEDGEMENTS
This research has been funded by FONDECYT, grant no. 1960921, and by the Dept. of Electrical Engineering, U. of Chile.
REFERENCES
[1] Fischler MA, Firschein O, "The Brain and the Computer", in Intelligence: The Eye, the Brain and the Computer, Addison-Wesley, pp. 23-58, 1987.
[2] Churchland PM, "Representation and High-Speed Computation in Neural Networks", in The Foundations of Artificial Intelligence: A Sourcebook, Partridge & Wilks eds., Cambridge Univ. Press, pp. 337-372, 1990.
[3] Erman LD, Hayes-Roth F, Lesser VR, Reddy DR, "The Hearsay-II Speech-Understanding System: Integrating Knowledge to Resolve Uncertainty", ACM Computing Surveys, 12(2), 1980.
[4] Rich E, Knight K, "Distributed Reasoning Systems", in Artificial Intelligence, McGraw-Hill, pp. 433-446, 1991.
[5] Ronco E, Gawthrop P, "Modular Neural Networks: A State of the Art", Technical Report CSC-95026, University of Glasgow, May 12, 22p, 1995.
[6] Tumer K, Ghosh J, "Analysis of decision boundaries in linearly combined neural classifiers", Pattern Recognition, 29(2): 341-348, February 1996.
[7] Xu L, Krzyzak A, Suen C, "Methods of Combining Multiple Classifiers and their Applications to Handwriting Recognition", IEEE Transactions on Systems, Man and Cybernetics, Vol. 22, No. 1, pp. 418-435, May/June 1992.
[8] Ho KT, Hull J, "Decision Combination in Multiple Classifiers Systems", IEEE Trans. on Pattern Anal. and Machine Intelligence, Vol. 16, No. 1, pp. 66-75, Jan 1994.
[9] Jacobs R, Jordan M, Barto A, "Task decomposition through competition in a modular connectionist architecture: The what and where vision tasks", Cognitive Science, 15: pp. 219-250, 1991.
[10] Hashem S, Schmeiser B, "Improving model accuracy using linear combinations of trained neural networks", IEEE Transactions on Neural Networks, 6(3), pp. 792-794, 1995.
[11] Tresp V, Taniguchi M, "Combining estimators using non-constant weighting functions", NIPS 7, MIT Press, Cambridge MA, pp. 419-426, 1995.
[12] Cho S, Kim J, "Combining Multiple Neural Networks by Fuzzy Integral for Robust Classification", IEEE Transactions on Systems, Man, and Cybernetics, Vol. 25, pp. 380-384, 1995.
[13] Sebire P, Dorizzi B, "MLP Modular Networks for Multiclass Recognition", Proceedings ESANN'93, Brussels, Belgium, pp. 111-116, 7-9 April 1993.
[14] Cho SB, "Neural-Network Classifiers for Recognizing Totally Unconstrained Handwritten Numerals", IEEE Transactions on Neural Networks, Vol. 8, No. 1, pp. 43-53, 1997.
[15] Knerr S, Personnaz L, Dreyfus G, "Handwritten Digit Recognition by Neural Networks with Single-Layer Training", IEEE Transactions on Neural Networks, Vol. 3, No. 6, pp. 962-968, Nov. 1992.
[16] Cao J, Ahmadi M, Shridhar M, "A Hierarchical Neural Network Architecture for Handwritten Numeral Recognition", Pattern Recognition, Vol. 30, No. 2, pp. 289-294, 1997.
[17] Lee DS, Srihari SN, Pawlicki T, "Experiments with Neural Network Models for Handwritten Digit Recognition", in Systems and Signal Processing, R.N. Madan, N. Viswanadham, R.L. Kashyap (eds.), pp. 757-774, 1991.
[18] Huang Y, Suen C, "A Method of Combining Multiple Experts for the Recognition of Unconstrained Handwritten Numerals", IEEE Trans. on Pattern Anal. & Machine Intelligence, Vol. 17, No. 1, pp. 90-94, Jan 1995.
[19] Perez CA, Holzmann CA, Morelli IR, "Optimization of One and Two Hidden Layer Neural Networks Architecture for Handwritten Digit Recognition", Proceedings of the 1995 IEEE International Conference on Systems, Man and Cybernetics, Vancouver, Canada, Oct. 22-25, pp. 2795-2799, 1995.
[20] Perez CA, Holzmann CA, Diaz E, "Genetic Selection of Multilayer Neural Networks for Handwritten Digit Recognition to Aid the Blind", 18th Annual International Conference IEEE-EMBS, Amsterdam, The Netherlands, Oct. 31-Nov. 3, 3p., 1996.
[21] Perez CA, Holzmann CA, "Improvements on Handwritten Digit Recognition by Genetic Selection of Neural Network Topology and by Augmented Training", 1997 IEEE International Conference on Systems, Man and Cybernetics, Orlando, USA, Oct. 12-15, pp. 1487-1491, 1997.
[22] Perez CA, Holzmann CA, Diaz E, "Genetic Selection of NN's Hidden Units Improving Handwritten Digit Recognition Aiding the Blind to Read", Med. & Biol. Eng. & Computing, Abstracts of the World Congress on Medical Physics and Biomedical Engineering, Nice, France, Sept. 14-19, V.135, pp. 513, 1997.
[23] Jain AK, "Edge Detection", in Fundamentals of Digital Image Processing, Prentice Hall, 1989, pp. 347-356.
[24] Rutenbar R, "Simulated Annealing Algorithms: An Overview", IEEE Circuits and Devices Magazine, pp. 19-26, January 1989.