
IMPROVEMENTS ON HANDWRITTEN DIGIT RECOGNITION BY COOPERATION OF MODULAR NEURAL NETWORKS

Claudio A. Perez, Patricio A. Galdames, Carlos A. Holzmann

Department of Electrical Engineering, University of Chile
Av. Tupper 2007, Santiago, Chile

ABSTRACT

In this paper modular neural networks are used to improve handwritten digit recognition. To evaluate the performance of modular networks, a comparison is made with a global neural network on the same database. Two basic kinds of modular networks are considered. In the first one, seven expert modular networks are used. Five of them are provided for digits 0, 1, 2, 5, 6, 7. The other two modular networks are for the pairs of digits 3-8 and 4-9 respectively. The second kind of modular neural network considers an expert module for each feature extracted from the handwritten digit image. The cooperation is among modules extracting slope and radial projection from each digit. Two types of cooperation among modular networks are considered: a neural network and a weighted combination of the modules' outputs. The models were trained with a set of 1,837 handwritten digits, tested on a different set of 918 digits, where the best weight set was selected for each neural network, and finally validated on a different set of 919 digits. Results show that by using modular networks for features, it is possible to improve classification performance on handwritten digits from 91.0% in the case of global networks to 93.5% for modular networks.

1. INTRODUCTION

Specialization is found in the nervous system as modules dedicated to processing specific functions such as vision, touch, hearing, etc. [1,2]. The notion of specialization or modularity was implemented in expert systems for decision making in the past decade. In these applications problems were decomposed into subproblems and then solved by specialist modules [3,4]. This notion was also implemented in neural networks to create expert modules for subproblems. It has been stated that global neural networks have disadvantages compared to modular networks in plasticity, in difficulties in learning heterogeneous tasks and in training time when large networks are needed [5]. Since the early 90's the combination of multiple classifiers has been proposed as a new direction for the development of character recognition systems [5,6]. Several forms of cooperation between modules, such as voting, Bayes and confusion matrices, were considered in [7,8]. Jacobs et al. [9] considered a model composed of modular and gating networks to combine their outputs. For each input the gating network determines stochastically the appropriate expert module to respond [9]. Hashem [10] considered a weighted combination of the outputs of the expert networks. This approach was applied to approximate functions of one variable. A similar cooperation scheme was presented by Tresp [11], applied to handwritten digits. Fuzzy logic has also been applied to the cooperation among experts [12].

In the recognition of handwritten digits, Sebire [13] defined a number of experts less than or equal to the total number of classes. The cooperation among experts is performed using perceptrons and by a winner-takes-all technique. In other applications one feature has been extracted as input to each classifier [14,15]. Cao and Ahmadi [16] used principal components in the cooperation of multiple experts. The dimension of the input was reduced significantly and results were better than those obtained with a backpropagation network. Error rates were less than 1% for rejection rates of 10-15%. For rejection rates higher than 18% both classifiers yield similar results.

In the literature, reported handwritten digit recognition rates vary between 68% [17] and 97.7% [14]. It is not possible to compare different systems only on the basis of the correct classification rate. Most systems have been tested on different data sets and under different conditions [14]. Finding a method to determine the optimum architecture for a modular neural network and the best form of cooperation among modules are among the problems being addressed by the scientific community [8,9,18].

Improvements on the recognition rates of handwritten digits by standard neural networks (fully connected, feed-forward, backpropagation) were introduced by augmenting the training set by shifting and magnification [19-21]. In [20-22] genetic algorithms were applied to select the appropriate number of hidden units of the network. In this paper, modular networks are used to improve handwritten digit recognition. Two types of cooperation schemes are used: one with a neural network and another with adjustable weights.

0-7803-4778-1/98 $10.00 © 1998 IEEE


2. COOPERATION AMONG MODULAR NETWORKS

Two types of modular networks are considered. The first one considers a specialist module for each digit. Therefore, the number of modules of the modular network will be equal to the defined number of classes. The second type of modular network uses modules specialized in one feature extracted from the handwritten digit image. One of the features is the slope, which is extracted by a gradient operator. The other feature is the radial projection with respect to the geometric center.

Experts in Digits
An expert module is created for each digit (or subset of digits) and trained to recognize only that digit (or subset of digits). For the pairs of digits 3-8 and 4-9, where most confusions are produced, only one module was defined for each pair. Module i should have a high output only when digit i appears in the input. Modules for subsets of digits have a number of outputs equal to the number of digits in the subset. In Figure 1, a general scheme for expert modules per digit is shown. In Fig. 1(a) cooperation is restricted to the outputs of the modules. In Fig. 1(b), cooperation is extended to the input patterns.

Expert Modules in Features
Cooperation schemes for the expert modules in one feature are shown in Figure 2. In Fig. 2(a) there is cooperation only among the outputs of the modular networks. In Fig. 2(b) cooperation is extended to the input patterns.

Slope detection
Slopes are detected in several directions for each character. The operation is implemented by convolution of the 23x15 handwritten digit image with a 3x3 Prewitt gradient operator [23] rotated for the desired directions. Four directions are considered for the gradient: 0°, 45°, 90° and 135°. The rotated Prewitt operators are shown in Figure 3. The result of the convolution, $C_{ij}$, between the gradient operator and the image, $A_{ij}$, is represented in equation (1). For the case of a 3x3 operator, n=m=3.

$$C_{ij} = \sum_{k=1}^{n} \sum_{l=1}^{m} \ldots \qquad (1)$$

Figure 4 shows the result of convolving the gradient operator with the image of a handwritten 5 considering (a) the original image and (b-d) the four rotations.
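The slope detection step can be illustrated with a minimal sketch. It assumes a common textbook form of the rotated 3x3 Prewitt masks (the exact masks of Figure 3 are not reproduced here), zero padding at the image border so that each response map keeps the 23x15 size, and a sliding-window sum without mask flipping; for a strict convolution the mask would be flipped first. The names `PREWITT` and `slope_maps` are hypothetical.

```python
import numpy as np

# Assumed textbook form of the 3x3 Prewitt masks for the four directions;
# the paper's own masks (Figure 3) may differ in sign or orientation.
PREWITT = {
    0: np.array([[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]]),
    45: np.array([[0, 1, 1], [-1, 0, 1], [-1, -1, 0]]),
    90: np.array([[1, 1, 1], [0, 0, 0], [-1, -1, -1]]),
    135: np.array([[1, 1, 0], [1, 0, -1], [0, -1, -1]]),
}


def slope_maps(image):
    """Return one response map per direction, each the size of the image."""
    image = np.asarray(image, dtype=float)
    rows, cols = image.shape
    padded = np.pad(image, 1)  # zero padding (an assumption) keeps the size
    maps = {}
    for angle, mask in PREWITT.items():
        out = np.zeros_like(image)
        # Double sum over the n = m = 3 mask elements (k, l).
        for k in range(3):
            for l in range(3):
                out += mask[k, l] * padded[k:k + rows, l:l + cols]
        maps[angle] = out
    return maps
```

For a 23x15 input, `slope_maps` returns four 23x15 arrays, one per direction, which could then be flattened and fed to the slope module.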

Radial Projections Relative to the Geometric Center
The radial projection relative to the geometric center of the handwritten digit is obtained by determining the radial distribution of the digit's pixels with respect to its geometric center. First, the geometric center of the character is obtained. A convex region is formed to include the mass of the character by segments tangential to its contour, as shown in Figure 5. The average vertical and horizontal lengths of the convex region define the geometric center of the character. Second, the radial projection is obtained by integrating the digit's mass in the direction of the segment joining the geometric center with one of the pixels of the perimeter of the 23x15 image. Therefore, the resulting vector with the radial projections has 76 elements.

Figure 1: (a) Modular network with expert modules per digit and with cooperation among its outputs. (b) Cooperation is extended to the input patterns.

Figure 2: (a) Modular network with expert modules per feature and cooperation among its outputs. (b) Cooperation is extended to the input patterns.

Figure 3: Prewitt operator rotated for (a) 0°, (b) 45°, (c) 90°, (d) 135°.
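A rough sketch of the radial projection feature follows. It is illustrative only and rests on several assumptions not taken from the paper: the geometric center is approximated by the midpoint of the digit's bounding box rather than by the convex region of Figure 5, each ray is sampled at rounded pixel positions, and every pixel on a ray is counted once. The helper names are hypothetical. Note that enumerating the border pixels of a 23x15 grid without repeats gives 72 positions, whereas the paper reports 76 elements, so the original counting convention evidently differs from this one.

```python
import numpy as np


def perimeter_pixels(rows, cols):
    """Border pixels of a rows x cols grid, clockwise, without repeats."""
    top = [(0, c) for c in range(cols)]
    right = [(r, cols - 1) for r in range(1, rows)]
    bottom = [(rows - 1, c) for c in range(cols - 2, -1, -1)]
    left = [(r, 0) for r in range(rows - 2, 0, -1)]
    return top + right + bottom + left


def radial_projection(image):
    """Sum the digit's mass along rays from the center to each border pixel.

    Assumes the image contains at least one nonzero pixel.
    """
    image = np.asarray(image, dtype=float)
    rows, cols = image.shape
    ys, xs = np.nonzero(image)
    # Assumption: bounding-box midpoint stands in for the geometric center.
    cy = (ys.min() + ys.max()) / 2.0
    cx = (xs.min() + xs.max()) / 2.0
    t = np.linspace(0.0, 1.0, 2 * max(rows, cols))
    projection = []
    for py, px in perimeter_pixels(rows, cols):
        r = np.rint(cy + t * (py - cy)).astype(int)
        c = np.rint(cx + t * (px - cx)).astype(int)
        visited = set(zip(r.tolist(), c.tolist()))  # count each pixel once
        projection.append(sum(image[y, x] for y, x in visited))
    return np.array(projection)
```

The resulting vector has one entry per border pixel and could serve as the input of the radial projection module.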

Figure 4: (a) Original image of a handwritten 5, (b-d) result of the convolution of the Prewitt operator for 0°, 45°, 90° and 135°.

Figure 5: (a) Original handwritten 3 digitized in a 23x15 image. (b) Convex region containing the digit.

Types of Cooperation
Two types of cooperation are considered. In the first type, the cooperation is performed by a neural network receiving the outputs of the modular networks. Additionally, the input patterns could also be considered as inputs for this network. In the second type of cooperation, adjustable weights are used to combine the modular networks' outputs. The weights are adjusted iteratively to maximize the overall recognition rate. The cooperation scheme provides 10 outputs (one for each digit) and the maximum is selected as the system response.

3. TRAINING AND TESTING

Network Dimensions
According to previous work [19,21] using global neural networks for the problem of handwritten digit recognition, the dimensions of a two hidden layer network are 345xN1xN2x10. Each network is trained by backpropagation using a training set augmented by shifting the patterns in the input.

Training, Testing and Validation Sets
A database of 3,674 handwritten digits, obtained from university students, is used for training, testing and validation. The database was segmented into three subsets. The training set is composed of 1,837 patterns and it is used for training global and modular networks. Augmentation of the training set was performed by shifting the input patterns [19,21], which is in part equivalent to artificially expanding the training set to 9,185 patterns. The testing set is formed by 918 patterns, different from those of the training set. The testing set is used to adjust the cooperation scheme after the modular networks have been trained individually. The validation set is composed of 919 different patterns and it is used only to determine the generalization performance of the network. Figure 6 shows a sample of the handwritten digit database: in (a), 110 digits used in training and in (b), a set of 110 digits used for validation.

Figure 6: Sample of the handwritten digit database. (a) 110 digits used in training; (b) a set of 110 digits used for validation.

Weight Adjustment
Based on work by Hashem [10] and Tresp [11], an algorithm to adjust the weights was developed to maximize the total recognition rate for a testing database. The performance of the system is measured on the validation set. The algorithm does not
guarantee the optimum weight set because the method performs a random search to find the best set [24]. For K modular networks, the ith output of the classifier $e_k(x)$ is denoted as $d_{ik}(x)$. The cooperation consists of a linear combination of the modules' outputs:

$$s_i(x) = \sum_{k=1}^{K} \alpha_{ik}\, d_{ik}(x) \qquad (2)$$

with $\alpha_{ik}$ the weights and $s_i(x)$ the ith output of the classification system, for all $i \in A$, $1 \le k \le K$, where $A = \{1, 2, \ldots, 10\}$ is the set of symbols identifying each digit. The weights $\alpha_{ik}$ should be computed so that each component of the output vector is normalized to a maximum of 1. Therefore the weights are normalized according to equation (3):

$$\sum_{k=1}^{K} \alpha_{ik} = 1 \qquad (3)$$

The weights $\alpha_{ik}$ are computed using an algorithm similar to [24].

Recognition Rate
All networks were trained for at least 100 epochs, and the set of weights that maximizes the recognition rate of the network on the testing set was selected. Once the weight set was chosen, the network was applied to the validation set to determine the generalization performance of the network.

To show with some level of confidence that the improvements in recognition rate are not due to local minima, all networks were trained from ten different random starting weight sets. Results are presented as the average recognition rate of the ten runs and the standard deviation. The Student t-test was used to determine whether differences between models were statistically significant. A p<0.05 was considered to be statistically significant.

4. RESULTS

Table 1 shows the results of the classification for the validation set, considering a rejection threshold of 0.5. The first column shows the average recognition rate in % for the 10 simulations. The second column shows the recognition rate in % for the best trained network. The rows correspond to the following architectures: GNI=global network with an image as input; GNS=global network with slopes as inputs; GNP=global network with radial projection; MNCN=modular network for image, radial projection and slope with cooperation performed by a neural network; MNCW=modular network for image, radial projection and slope with cooperation by weights; MNDN=modular networks for digits and cooperation by a neural network; MNDP=modular networks for digits and cooperation by weights.

Table 1: Results of classification performance on the validation set, considering a rejection threshold of 0.5.

Rejection threshold = 0.5   Correct classification [%], mean ± STD   Best result [%]
...                         90.0 ± 0.2                               ...
GNP                         71.9 ± 0.6                               ...
MNCN without Image          92.0 ± 0.8                               92.7
MNCN with Image             91.9 ± 0.3                               92.4
MNCW                        93.1 ± 0.2                               93.5
MNDN                        90.4 ± 0.5                               91.2
MNDP                        90.4 ± 0.9                               91.9

In the case of MNCN, no significant improvement was measured if the input patterns were included in the cooperation in addition to the modular networks. Therefore, the increase in the dimensions of the network for cooperation is not justified. It is observed from Table 1 that MNCW, with cooperation by weights, presents a higher recognition rate than global networks (p<0.001).

Table 2 has the same organization as Table 1. In Table 2, no rejection rate was considered. It is observed that modular networks for features achieve the best classification performance. The differences in classification rate between global and modular neural networks are highly significant (p<0.0001). The differences between cooperation with the neural network and that obtained by the weights did not reach the 0.05 significance level (p<0.068). There were no significant differences when input patterns were included in the cooperation in addition to the outputs from the modular networks. Figure 7 shows the error rate as a function of the rejection rate for global networks and for modular networks for features. It is observed that the lowest error rates are obtained for the modular network and for the global network with slopes as inputs. The highest error rates correspond to the case of the global network with radial projection as input.
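The cooperation by weights and the role of a rejection threshold can be sketched as follows. This is a simplified illustration, not the authors' algorithm: it assumes that each output's weights are normalized to sum to 1 over the K modules, that a pattern is rejected when the largest combined output falls below the threshold, and it replaces the search similar to [24] with a plain random search. Digits are indexed 0 to 9, and all function names are hypothetical.

```python
import numpy as np


def combine(outputs, weights):
    """Linear combination of module outputs.

    outputs, weights: arrays of shape (K, 10). Returns 10 combined outputs.
    """
    weights = weights / weights.sum(axis=0, keepdims=True)  # sum to 1 over k
    return (weights * outputs).sum(axis=0)


def classify(scores, threshold=0.5):
    """Return the winning digit, or None (rejected) below the threshold."""
    best = int(np.argmax(scores))
    return best if scores[best] >= threshold else None


def recognition_rate(module_outputs, labels, weights, threshold=0.0):
    """Fraction of patterns classified correctly; module_outputs is (N, K, 10)."""
    hits = 0
    for outputs, label in zip(module_outputs, labels):
        if classify(combine(outputs, weights), threshold) == label:
            hits += 1
    return hits / len(labels)


def random_search(module_outputs, labels, iterations=200, seed=0):
    """Plain random search over weight sets (a stand-in, not the paper's method)."""
    rng = np.random.default_rng(seed)
    k = module_outputs.shape[1]
    best_w = np.ones((k, 10))
    best_rate = recognition_rate(module_outputs, labels, best_w)
    for _ in range(iterations):
        candidate = rng.random((k, 10)) + 1e-9  # strictly positive weights
        rate = recognition_rate(module_outputs, labels, candidate)
        if rate > best_rate:
            best_w, best_rate = candidate, rate
    return best_w / best_w.sum(axis=0, keepdims=True), best_rate
```

In this sketch the weights would be searched on the testing set and the resulting recognition rate reported on the validation set, mirroring the split described in Section 3.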

Table 2: Results of classification performance on the validation set with a rejection threshold of 0.

Rejection threshold = 0     Correct classification [%], mean ± STD   Best result [%]
GNI                         92.3 ± 0.3                               92.7
GNS                         92.7 ± 0.4                               93.1
GNP                         76.8 ± 0.8                               78.1
MNCN without Image          93.3 ± 0.5                               94.0
MNCN with Image             93.7 ± 0.5                               94.7
MNCW                        94.4 ± 0.2                               94.9
MNDN                        91.? ± 0.4                               92.3
MNDP                        90.9 ± 0.7                               92.2

Table 3 presents the confusion matrix obtained for the modular network for image, radial projection and slope with cooperation by weight adjustment when applied to the validation set. Each row shows the number of confusions of the network for the digit specified on that row. From this table it is possible to identify the cases with the largest number of confusions and therefore to develop strategies to eliminate the sources of confusion. It is observed that the largest numbers of confusions are for digits 3-5, 4-9 and 5-6. They account for 16 cases of confusion.

Table 3: Confusion matrix for the modular network considering image, radial projection and slope with cooperation by weight adjustment on the validation set. Column number 10 indicates the number of patterns rejected for each digit. Each row shows the confusions per digit.

5. CONCLUSIONS

In this work two types of cooperation among modular neural networks were presented and results were compared to global neural networks for the problem of handwritten digit recognition. Results show that by using modular networks for features, it is possible to improve classification performance on handwritten digits from 91.0% in the case of global networks to 93.5% for modular networks. This improvement was achieved for a rejection threshold of 0.5. In addition, when no rejection is applied, it is possible to improve recognition rates from 93.1%, for the case of the global network, to 94.9% for the case of a modular network.

Figure 7: Error rate as a function of the rejection rate for global networks and for modular networks for features. (Horizontal axis: rejection rate [%], 0% to 100%; vertical axis: 0% to 8%.)

ACKNOWLEDGEMENTS
This research has been funded by FONDECYT, grant no. 1960921, and by the Dept. of Electrical Engineering, U. of Chile.

REFERENCES

[1] Fischler MA, Firschein O, "The Brain and the Computer", in Intelligence: The Eye, the Brain and the Computer, Addison-Wesley, pp. 23-58, 1987.

[2] Churchland PM, "Representation and High Speed Computation in Neural Networks", in The Foundations of Artificial Intelligence: A Sourcebook, Partridge & Wilks eds., Cambridge Univ. Press, pp. 337-372, 1990.

[3] Erman LD, Hayes-Roth F, Lesser VR, Reddy RD, "The Hearsay-II Speech Understanding System: Integrating Knowledge to Resolve Uncertainty", ACM Computing Surveys 12(2), 1980.

[4] Rich E, Knight K, "Distributed Reasoning Systems", in Artificial Intelligence, McGraw-Hill, pp. 433-446, 1991.

[5] Ronco E, Gawthrop P, "Modular Neural Networks: A State of the Art", Technical Report CSC-95026, University of Glasgow, May 12, 22p, 1995.

[6] Tumer K, Ghosh J, "Analysis of decision boundaries in linearly combined neural classifiers", Pattern Recognition, 29(2): 341-348, February 1996.

[7] Xu L, Krzyzak A, Suen C, "Methods of Combining Multiple Classifiers and their Applications to Handwriting Recognition", IEEE Transactions on Systems, Man and Cybernetics, Vol. 22, No. 1, pp. 418-435, May/June 1992.

[8] Ho KT, Hull J, "Decision Combination in Multiple Classifiers Systems", IEEE Trans. on Pattern Anal. and Machine Intelligence, Vol. 16, No. 1, pp. 66-75, Jan 1994.

[9] Jacobs R, Jordan M and Barto A, "Task decomposition through competition in a modular connectionist architecture: The what and where vision tasks", Cognitive Science, 15: pp. 219-250, 1991.

[10] Hashem S and Schmeiser B, "Improving model accuracy using linear combinations of trained neural networks", IEEE Transactions on Neural Networks, 6(3), pp. 792-794, 1995.

[11] Tresp V, Taniguchi M, "Combining estimators using non-constant weighting functions", NIPS 7, MIT Press, Cambridge MA, pp. 419-426, 1995.

[12] Cho S and Kim J, "Combining Multiple Neural Networks by Fuzzy Integral for Robust Classification", IEEE Transactions on Systems, Man, and Cybernetics, Vol. 25, pp. 380-384, 1995.

[13] Sebire P, Dorizzi B, "MLP Modular Networks for Multiclass Recognition", Proceedings ESANN'93, Brussels, Belgium, pp. 111-116, 7-9 April 1993.

[14] Cho SB, "Neural-Network Classifiers for Recognizing Totally Unconstrained Handwritten Numerals", IEEE Transactions on Neural Networks, Vol. 8, No. 1, pp. 43-53, 1997.

[15] Knerr S, Personnaz L and Dreyfus G, "Handwritten Digit Recognition by Neural Networks with Single-Layer Training", IEEE Transactions on Neural Networks, Vol. 3, No. 6, pp. 962-968, Nov. 1992.

[16] Cao J, Ahmadi M, Shridhar M, "A Hierarchical Neural Network Architecture for Handwritten Numeral Recognition", Pattern Recognition, Vol. 30, No. 2, pp. 289-294, 1997.

[17] Lee DS, Srihari SN, and Pawlicki T, "Experiments with Neural Network Models for Handwritten Digit Recognition", in Systems and Signal Processing, R.N. Madan, N. Viswanadham, R.L. Kashyap (eds.), pp. 757-774, 1991.

[18] Huang Y, Suen C, "A Method of Combining Multiple Experts for the Recognition of Unconstrained Handwritten Numerals", IEEE Trans. on Pattern Anal. & Machine Intelligence, Vol. 17, No. 1, pp. 90-94, Jan 1995.

[19] Perez CA, Holzmann CA, Morelli IR, "Optimization of One and Two Hidden Layer Neural Networks Architecture for Handwritten Digit Recognition", Proceedings of the 1995 IEEE International Conference on Systems, Man and Cybernetics, Vancouver, Canada, Oct. 22-25, pp. 2795-2799, 1995.

[20] Perez CA, Holzmann CA, Diaz E, "Genetic Selection of Multilayer Neural Networks for Handwritten Digit Recognition to Aid the Blind", 18th Annual International Conference IEEE/EMBS, Amsterdam, The Netherlands, Oct. 31-Nov. 3, 3p., 1996.

[21] Perez CA, Holzmann CA, "Improvements on Handwritten Digit Recognition by Genetic Selection of Neural Network Topology and by Augmented Training", 1997 IEEE International Conference on Systems, Man and Cybernetics, Orlando, USA, Oct. 12-15, pp. 1487-1491, 1997.

[22] Perez CA, Holzmann CA, Diaz E, "Genetic Selection of NN's Hidden Units Improving Handwritten Digit Recognition Aiding the Blind to Read", Med. & Biol. Eng. & Computing, Abstracts of the World Congress on Medical Physics and Biomedical Engineering, Nice, France, Sept. 14-19, V.135, pp. 513, 1997.

[23] Jain AK, "Edge Detection", in Fundamentals of Digital Image Processing, Prentice Hall, 1989, pp. 347-356.

[24] Rutenbar R, "Simulated Annealing Algorithms: An Overview", IEEE Circuits and Devices Magazine, pp. 19-26, January 1989.

