
Effective Implementation of Linear Discriminant Analysis for Face Recognition and Verification

Yongping Li, Josef Kittler, Jiri Matas


Centre for Vision, Speech and Signal Processing, University of Surrey, Guildford, Surrey GU2 5XH, England
{Y.Li, J.Kittler, G.Matas}@ee.surrey.ac.uk

Abstract. The algorithmic techniques for the implementation of Linear Discriminant Analysis (LDA) play an important role when LDA is applied to high-dimensional pattern recognition problems such as face recognition or verification. The implementation of LDA in the context of face recognition and verification is investigated in this paper. Three main algorithmic techniques are proposed and tested: matrix transformation, the Cholesky factorisation with the QR algorithm, and the Kronecker canonical form with the QZ algorithm. They are evaluated on four publicly available face databases (M2VTS, YALE, XM2FDB, HARVARD)^1. Extensive experimental results support the conclusion that the implementation based on the Kronecker canonical form and the QZ algorithm achieves the best performance in all experiments.

1 Introduction
The linear discriminant analysis (LDA) approach to feature extraction is well known [5]. A detailed description of LDA for pattern recognition can be found in [2]. Theoretically, LDA-based features should exhibit classification performance superior to that achievable with features computed using Principal Component Analysis (PCA). However, in the context of face recognition or verification the LDA method has only occasionally been reported to outperform the PCA approach [7]. Better performance of the LDA method as compared with the PCA approach was reported in [1][3][4]. Notably, no details regarding the implementation of the LDA algorithm were presented in these papers. So far little attention has been paid to the implementation of LDA. In this work we test the hypothesis that the poor performance of LDA in face recognition experiments (as an example of high-dimensional problems with a small training set) can be at least partially explained by an incorrect selection of the numerical method for solving the associated eigenvalue problem.
1. See http://www.tele.ucl.ac.be/M2VTS for the M2VTS database; http://cvc.yale.edu/projects/yalefaces/yalefaces.html for the YALE face database; http://www.ee.surrey.ac.uk/Research/VSSP/xm2fdb for the XM2FDB database; ftp://ftp.hrl.harvard.edu/pub/faces for the HARVARD face database.

The work presented here investigates the efficacy of various algorithms for face recognition and verification based on Linear Discriminant Analysis. An implementation which employs the Kronecker canonical form and the QZ algorithm is recommended because of its stability. This robust algorithm achieves the best performance in all experiments. The experiments were undertaken on four publicly available face databases: the M2VTS database, the XM2FDB database, the Yale face database and the Harvard face database. The paper is organised as follows. In Section 2 we briefly describe the theoretical framework of face recognition approaches based on the eigenvalue analysis of matrices of second-order statistical moments. Computational considerations and algorithmic techniques for the implementation of LDA are presented in Section 3. A description of the experiments, including the face databases, the experimental protocol, the algorithms involved and the experimental results obtained, is given in Section 4. Finally a summary and some conclusions are presented in Section 5.

2 Linear Discriminant Analysis


For a set of vectors x_i, i = 1, ..., M, x_i ∈ R^D, each belonging to one of c classes {C_1, C_2, ..., C_c}, the between-class scatter matrix S_B and the within-class scatter matrix S_W are defined as

    S_B = Σ_{i=1}^{c} (μ_i − μ)(μ_i − μ)^T                          (1)

    S_W = Σ_{i=1}^{c} Σ_{x_k ∈ C_i} (x_k − μ_i)(x_k − μ_i)^T        (2)

where μ is the grand mean and μ_i is the mean of class C_i. The objective of LDA is to find a transformation matrix W_opt maximising the ratio of determinants |W^T S_B W| / |W^T S_W W|. W_opt is known to be the solution of the following eigensystem problem ([2]):

    S_B W − S_W W Λ = 0                                             (3)

Premultiplying both sides by S_W^{-1}, (3) becomes

    (S_W^{-1} S_B) W = W Λ                                          (4)

where Λ is a diagonal matrix whose elements are the eigenvalues of Equation (3). In the context of face recognition, the column vectors w_i (i = 1, ..., c − 1) of the matrix W are referred to as fisherfaces [1].

Dimensionality reduction. In high-dimensional problems (e.g. when the x_i are images and D is of the order of 10^5), S_W is almost always singular, since the number of training samples M is much smaller than D. Therefore a dimensionality reduction must be applied before solving the eigenproblem (3). Commonly, the dimensionality reduction is achieved by Principal Component Analysis [11][1]; the first (M − c) eigenprojections are used to represent the vectors x_i. This also makes S_W and S_B computable on a machine with a normal memory size. The optimal linear feature extractor W_opt is then defined as

    W_opt = W_fld W_pca                                             (5)

where W_pca is the PCA projection matrix and W_fld is the optimal projection obtained by maximising

    W_fld = arg max_W |W^T W_pca^T S_B W_pca W| / |W^T W_pca^T S_W W_pca W|    (6)
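As a concrete illustration, the scatter matrices of Eqs. (1) and (2) can be computed directly in NumPy. This is a sketch under our own naming; note that S_B is formed exactly as written in Eq. (1), without the class-size weighting used in some other formulations:

```python
import numpy as np

def scatter_matrices(X, y):
    """Between-class S_B (Eq. 1) and within-class S_W (Eq. 2) scatter.

    X : (M, D) array, one sample vector per row.
    y : (M,) array of class labels.
    """
    mu = X.mean(axis=0)                       # grand mean
    D = X.shape[1]
    S_B = np.zeros((D, D))
    S_W = np.zeros((D, D))
    for c in np.unique(y):
        Xc = X[y == c]
        mu_c = Xc.mean(axis=0)                # class mean
        d = (mu_c - mu)[:, None]
        S_B += d @ d.T                        # (mu_i - mu)(mu_i - mu)^T
        S_W += (Xc - mu_c).T @ (Xc - mu_c)    # sum over x_k in C_i
    return S_B, S_W
```

Both matrices are symmetric by construction; for image data (D of the order of 10^5) they would in practice only be formed after the PCA projection of Eq. (5).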

3 Algorithms for the S_B − λS_W (pencil) eigenproblem


Though both S_B and S_W are real symmetric matrices, the product matrix S_P, where S_P = S_W^{-1} S_B, need not be symmetric [6]. Therefore directly solving the eigenproblem of the single matrix S_P can lead to unstable eigenvalues and eigenvectors. Several algorithmic techniques can be used to solve the eigenproblem S_B w = λ S_W w. The set of matrices of the form S_B − λ S_W is referred to as a linear pencil. As both S_B and S_W are covariance matrices, they are either positive definite or positive semi-definite real symmetric matrices.

The matrix transformation technique. A matrix transformation technique described in [2] can be used to convert the pencil to a transformed real symmetric matrix. To obtain this transformation, the eigenvalues and eigenvectors of S_W are first computed. They are denoted Λ_w and V_w and satisfy

    S_W V_w − V_w Λ_w = 0                                           (7)

If S_W is a symmetric positive definite matrix, there exists a matrix B such that

    B^T S_W B = I                                                   (8)

Comparing (7) and (8), we have

    B = V_w Λ_w^{-1/2}                                              (9)

Let us define a transformed matrix S_B' = B^T S_B B. Then the eigensystem problem becomes

    S_B' V' − V' Λ̃ = 0                                             (10)

As the rank of S_B' is at most c − 1, it has only d (d ≤ c − 1) non-zero eigenvalues. This means that all the relevant information is compressed into the d eigenvectors associated with the non-zero eigenvalues of S_B'. Denoting the system of these eigenvectors by

    V' = {v_1, v_2, ..., v_d}                                       (11)

the optimal feature extractor W_opt is given as

    W_opt = B V' = V_w Λ_w^{-1/2} V'                                (12)
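A minimal NumPy sketch of this technique (function name ours; it assumes S_W is symmetric positive definite, as required by Eq. (8)):

```python
import numpy as np

def lda_matrix_transform(S_B, S_W, d):
    """Matrix transformation technique of Eqs. (7)-(12)."""
    lam_w, V_w = np.linalg.eigh(S_W)          # Eq. (7): S_W V_w = V_w Lambda_w
    B = V_w @ np.diag(lam_w ** -0.5)          # Eq. (9): B^T S_W B = I
    S_Bp = B.T @ S_B @ B                      # transformed symmetric matrix S_B'
    lam, V = np.linalg.eigh(S_Bp)             # Eq. (10)
    keep = np.argsort(lam)[::-1][:d]          # d largest (non-zero) eigenvalues
    return B @ V[:, keep]                     # Eq. (12): W_opt = B V'
```

By construction W_opt^T S_W W_opt = I_d, since the columns of V' are orthonormal.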

Cholesky factorisation and QR algorithm. When the dimensions of S_W and S_B are reduced from D to n (n ≤ M − c), S_W becomes positive definite and the pencil turns out to be a symmetric definite pencil. Problem (3) can then be converted to (13) below using a congruence transformation:

    (X^T S_B X) W − (X^T S_W X) W Λ = 0                             (13)

where S_W = S_W^T ∈ R^{n×n}, S_B = S_B^T ∈ R^{n×n}, and the matrix X satisfies

    X^T S_W X = I,   X^T S_B X = diag(λ_1, ..., λ_n)                (14)

The steps to compute W_opt are as follows.
– The Cholesky factorisation S_W = L L^T is computed first, using the method given in [8]. Formula (4) now becomes

    L^{-T} L^{-1} S_B W = W Λ                                       (15)

Multiplying both sides by L^T, we get

    L^{-1} S_B W = L^T W Λ                                          (16)

which can be rewritten as (L^{-1} S_B L^{-T})(L^T W) = (L^T W) Λ, or

    P Y − Y Λ = 0                                                   (17)

where P = L^{-1} S_B L^{-T} and Y = L^T W.
– Then the symmetric QR algorithm can be applied to compute the Schur decomposition

    Q^T P Q = diag(λ_1, ..., λ_n)                                   (18)

– Finally, the eigenvectors are calculated by

    W_opt = L^{-T} Q                                                (19)
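The steps above can be sketched in NumPy as follows (a sketch; the symmetric QR iteration of Eq. (18) is delegated to numpy.linalg.eigh, which calls a LAPACK symmetric eigensolver):

```python
import numpy as np

def lda_cholesky_qr(S_B, S_W):
    """Solve the symmetric definite pencil via Eqs. (15)-(19).

    Requires S_W positive definite (i.e. dimensionality already reduced).
    Returns the eigenvalues and W_opt = L^{-T} Q.
    """
    L = np.linalg.cholesky(S_W)               # S_W = L L^T
    L_inv = np.linalg.inv(L)
    P = L_inv @ S_B @ L_inv.T                 # Eq. (17): P = L^{-1} S_B L^{-T}
    lam, Q = np.linalg.eigh(P)                # Eq. (18), symmetric eigensolver
    W_opt = np.linalg.solve(L.T, Q)           # Eq. (19): W_opt = L^{-T} Q
    return lam, W_opt
```

The returned pair satisfies S_B W_opt = S_W W_opt diag(λ_1, ..., λ_n).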

Kronecker canonical form and QZ algorithm. In the more general situation both S_W and S_B are nearly singular, so neither the matrix transformation nor the Cholesky factorisation can be applied. In such situations the QZ algorithm must be employed. The main idea of the QZ algorithm, introduced in [9], is to transform the matrices S_W and S_B simultaneously to triangular matrices B̃ and Ã that satisfy

    B̃ = Q S_W Z,   Ã = Q S_B Z                                     (20)

where the matrices Q and Z are derived as products of Gauss transformations. Hence the eigenproblem S_B W = λ S_W W is equivalent to

    Ã W' = λ B̃ W'                                                  (21)

If the diagonal elements of the matrix B̃ are non-zero, i.e., b̃_ii ≠ 0, then the eigenvalues and eigenvectors of the original pencil are obtained by

    λ_i = ã_ii / b̃_ii,   W_opt = Z W'                              (22)

The behaviour of the QZ algorithm on pencils that are not only regular but also nearly singular is analysed in [15]. The results reported in that paper strongly suggest that when the pencil is singular or nearly singular the QZ algorithm should be preceded by an algorithm which extracts the singular part of the pencil. This situation is likely to arise when LDA is used for face identification. If the pencil is converted into the Kronecker canonical form (see [13]), the general QZ algorithm always works well. This is also borne out by our experimental results in Section 4.
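In practice the QZ algorithm is available through LAPACK; for instance, SciPy's scipy.linalg.eig solves the generalized problem S_B w = λ S_W w this way when given two matrix arguments. A minimal sketch (our own wrapper; the Kronecker pre-processing step discussed above is not included):

```python
import numpy as np
from scipy.linalg import eig

def lda_qz(S_B, S_W, d):
    """Solve the pencil S_B w = lambda * S_W w with the QZ algorithm.

    Passing two matrices to scipy.linalg.eig invokes LAPACK's QZ
    routine, so S_W is allowed to be singular or nearly singular;
    infinite eigenvalues are simply discarded here.
    """
    lam, W = eig(S_B, S_W)
    lam, W = np.real(lam), np.real(W)         # symmetric pencil: real spectrum
    finite = np.isfinite(lam)
    order = np.argsort(lam[finite])[::-1]     # largest ratios lambda_i first
    return W[:, finite][:, order[:d]]
```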

4 Experiments
4.1 Face Verification Experiments
The comparative performance of the algorithms presented in Section 3 was tested as part of a face verification experiment performed on data from three publicly available face databases. All implementations of LDA were tested using the same protocol.

Experimental ensembles. Three experimental ensembles were selected from the three different face databases. They are all registered using an approach based on the eye positions.
– The EPFL ensemble: 37 persons, 4 images per person, selected from four shots of the M2VTS database.
– The YALE ensemble: 15 subjects, 11 images per subject, containing all images in the YALE face database.
– The SURREY ensemble: 100 individuals, 4 sessions, one image per session, selected from the XM2FDB database.

Experimental protocol. The experimental protocol was first designed for the performance evaluation of various methods carried out on the M2VTS database. It combines the 'leave-one-out' strategy and the rotation scheme [2]. For a general ensemble of c persons and s sessions (shots), each person in turn is labelled as an impostor, whilst the (c − 1) others are considered as clients. The training set consists of (s − 1) shots of the (c − 1) clients. The remaining shot is used as the test set. Each client tries to access under his or her own identity (ID), and the impostor tries to access under the IDs of the (c − 1) clients. After all rotations, the number of client and impostor tests is c · s · (c − 1). This procedure leads to 5328 tests for the EPFL ensemble, 2310 tests for the YALE ensemble and 39600 tests for the SURREY ensemble.

Results of the face verification experiments
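The rotation counts quoted above follow directly from the protocol; a few lines verify the arithmetic (ensemble parameters taken from the list above):

```python
def rotation_tests(c, s):
    """Client (equally, impostor) accesses after all rotations:
    c choices of impostor, s choices of test shot, and (c - 1)
    claimed identities tested in each rotation."""
    return c * s * (c - 1)

for name, c, s in [("EPFL", 37, 4), ("YALE", 15, 11), ("SURREY", 100, 4)]:
    print(name, rotation_tests(c, s))   # 5328, 2310, 39600
```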

The following implementations are compared in the experiments:
– the eigenface approach based on Principal Component Analysis, abbreviated as "PCA";
– the LDA method implemented using the Cholesky factorisation and the QR algorithm, abbreviated as "QR";
– the LDA method implemented using the Kronecker canonical form and the QZ algorithm, abbreviated as "QZ";
– the LDA method implemented using the matrix transformation technique, abbreviated as "MT".
The eigenface (PCA) method used all of the available eigenfaces to make the results as good as possible, and so did all the LDA algorithms. Experimental results are presented in terms of receiver operating characteristic (ROC) curves. All ROC curves are displayed in Figures 1, 2 and 3. The equal error rates (EER) for all experiments are given in Table 1. From these results we find that the "QZ" algorithm achieves the best performance on every ensemble.
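The equal error rate is the point on the ROC curve where the false-rejection rate equals the false-acceptance rate. A simple way to estimate it from raw scores (a sketch; the paper does not specify its matching score, so we assume higher scores mean a better match):

```python
import numpy as np

def equal_error_rate(client_scores, impostor_scores):
    """Estimate the EER by scanning candidate decision thresholds."""
    best_gap, eer = np.inf, None
    for t in np.sort(np.concatenate([client_scores, impostor_scores])):
        frr = np.mean(client_scores < t)      # false rejection rate
        far = np.mean(impostor_scores >= t)   # false acceptance rate
        if abs(frr - far) < best_gap:
            best_gap, eer = abs(frr - far), (frr + far) / 2
    return eer
```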
Fig. 1. ROC curves of LDA algorithms on the EPFL ensemble (images from the M2VTS database): false rejection vs. false acceptance for PCA (Eigenface), LDA (QZ), LDA (QR) and LDA (MT), with the EER marked.

Ensemble   "PCA"   "MT"   "QR"   "QZ"
EPFL        22.3    26.2   26.2    3.1
YALE        33.3    18.2    6.5    2.3
SURREY       9.3    19.6   19.6    2.9

Table 1. Equal error rates (%) of the verification experiments using the eigenprojection method (PCA), the Cholesky factorisation (QR), the Kronecker canonical form (QZ) and the matrix transformation technique (MT).

Fig. 2. ROC curves of LDA algorithms on the YALE ensemble (images from the YALE database): false rejection vs. false acceptance for PCA (Eigenface), LDA (QZ), LDA (QR) and LDA (MT), with the EER marked.

Fig. 3. ROC curves of LDA algorithms on the SURREY ensemble (images from the XM2FDB database): false rejection vs. false acceptance for PCA (Eigenface), LDA (QZ), LDA (QR) and LDA (MT), with the EER marked.

Results analysis and improvement considerations. From Table 1 we find not only that there are big differences between the results when LDA is implemented using various algorithms, but also that the eigenface method in some cases achieves better results than the "QR" and "MT" algorithms. Similar results have been obtained by other researchers. One example can be found in [7], where the results of the eigenface method were better than those yielded by the MDF (most discriminant feature) method^2. This contrasts with the theory, which favours LDA. We were surprised that the LDA method was so seldom reported to exhibit better results than the eigenface approach. Eventually, we developed a robust implementation ("QZ") of LDA which always outperforms the eigenface method. While comparing the intermediate results, we found that an important processing step was omitted in both the "QR" and "MT" algorithms. Checking the background of these algorithmic techniques, some prerequisites play an important role in their application. For example, the Cholesky factorisation requires the matrix to be sufficiently well-conditioned; the Jacobi method, which finds the eigensolution of a real symmetric matrix, performs the eigenvector normalisation at every iteration with a round-off thresholding precision. If the matrix is reasonably well-conditioned, the orthonormalisation of the eigenvectors will be satisfactory for both the "QR" and the "MT" algorithm. The QZ algorithm is designed to deal directly with the pencil problem (in the form Ax = λBx) and its performance is unaffected by singularity or near-singularity of A, B or A − λB. However, except when S_W is truly singular, both the "QR" and the "MT" algorithm can achieve nearly the same results as the "QZ" algorithm by applying a supplementary normalisation of the final eigenvectors.

2. The MDF method is a different name for the LDA approach.

4.2 Face Recognition Experiments


As a comparison, the face recognition experiments described in [1] were repeated using the LDA approach implemented with the "QZ" algorithm. Images from both the Harvard database and the Yale database were centred and cropped to emulate the data shown in [1]. The results are given in Tables 2 and 3.
Error rate (%) of face recognition experiments on the Harvard database

Method        Reduced   Extrapolating from subset 1    Interpolating between subsets 1, 5
              space     Subset 1  Subset 2  Subset 3   Subset 2  Subset 3  Subset 4
fisherface^4     4         0.0       0.0      4.6         0.0       0.0      1.2
LDA ("QZ")       4         0.0       0.0      3.1         0.0       0.0      0.0

Table 2. Extrapolation and interpolation experimental results under variation in lighting.


"Leave-one-out" experiments on the Yale database

Method        Reduced   Error rate (%)
              space     Close crop  Full face
fisherface^4    14^5        7.3        0.6
LDA ("QZ")      14          3.6        0.6

Table 3. Experimental results under variation in facial expression and lighting.

The results show that the LDA approach implemented using the "QZ" algorithm achieves better results in all three face recognition experiments compared with the best results reported in [1].

4. No description of the fisherface method implementation was given in [1].
5. There are only 15 subjects (11 images/subject) in the Yale database; the maximum reduced space is 14.

5 Conclusions

In this paper, a detailed study of three numerical algorithms for linear discriminant analysis (LDA) in the context of face verification has been presented. A robust algorithm has been proposed and tested. It achieves very good performance when face images are registered using a semi-automatic face image registration method based on eye positions. LDA is a powerful classifier; it is robust to lighting conditions, facial expressions and small pose changes. The experiments performed show that the selection of an appropriate implementation of LDA significantly influences the verification performance. The "QZ" algorithm always outperforms the other algorithms regardless of the face ensemble.

Acknowledgements

The research work has been carried out within the framework of the European ACTS-M2VTS project. Acknowledgement is also due to the Centre for Computational Vision & Control at Yale University and the Harvard Robotics Laboratory for permission to use their face databases.

References
1. P. Belhumeur et al., "Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection," in Proc. of ECCV'96, pp. 45-58, Cambridge, UK, 1996.
2. P. A. Devijver and J. Kittler, Pattern Recognition: A Statistical Approach, Prentice-Hall, Englewood Cliffs, N.J., 1982.
3. K. Etemad and R. Chellappa, "Face Recognition using Discriminant Eigenvectors," in Proc. of ICASSP'96, pp. 2148-2151, 1996.
4. K. Etemad and R. Chellappa, "Discriminant Analysis for Recognition of Human Face Images," in Proc. of AVBPA'97, pp. 127-142, 1997.
5. R. Fisher, "The use of multiple measures in taxonomic problems," Ann. Eugenics, vol. 7, pp. 179-188, 1936.
6. G. H. Golub and C. F. Van Loan, Matrix Computations, The Johns Hopkins University Press, Baltimore and London, 1989.
7. C. Liu and H. Wechsler, "Face Recognition Using Evolutionary Pursuit," in Proc. of ECCV'98, vol. II, pp. 596-612, June 1998.
8. R. S. Martin et al., "Symmetric decomposition of a positive definite matrix," Numer. Math., vol. 7, pp. 362-383, 1965.
9. C. B. Moler et al., "An algorithm for the generalized matrix eigenvalue problem Ax = λBx," SIAM J. Numer. Anal., vol. 10, pp. 99-130, 1973.
10. W. H. Press et al., Numerical Recipes in C: The Art of Scientific Computing, Cambridge University Press, 1989.
11. L. Sirovich et al., "Low-dimensional procedure for the characterization of human faces," J. Opt. Soc. Am. A, vol. 4, no. 3, pp. 519-524, 1987.
12. M. Turk and A. Pentland, "Eigenfaces for Recognition," Journal of Cognitive Neuroscience, vol. 3, no. 1, pp. 70-86, 1991.
13. P. Van Dooren, "The computation of Kronecker's form of a singular pencil," Linear Algebra and Appl., vol. 27, pp. 103-140, 1979.
14. J. Wilkinson et al., Handbook for Automatic Computation, Vol. II: Linear Algebra, Springer-Verlag, Berlin Heidelberg New York, 1971.
15. J. H. Wilkinson, "Kronecker's Canonical Form and the QZ Algorithm," Linear Algebra and Appl., vol. 28, pp. 285-303, 1979.
