
Face Recognition using PCA-based Eigenfaces
by

Sourav Gupta
Hitesh Tikmani
Dipanjan Das
Mayank Shekhar
Manish Chakraborty

Mentor: Prof. Sourav Mitra

Overview

Introduction
Objectives
Problem Statement
Principal Component Analysis
Methodology
Flowchart
Data Flow Diagram
Results
Discussion
Conclusion
Future Scopes
References

Introduction

Face recognition can be applied to a wide variety of problems such as image and film processing, human-computer interaction, and criminal identification.
Our goal is to implement a model that identifies a particular face and distinguishes it from a large number of stored faces, even under some real-time variations.
APPLICATION AREAS:

Access Control
Entertainment
Smart Cards
Information Security
Law Enforcement & Surveillance

Objectives

To create a face recognition system using Principal Component Analysis (PCA) based Eigenfaces.

To comprehend the Eigenfaces method of recognizing face images and test its accuracy.

To set up a test platform for determining the accuracy of this technique.

Problem Statement
Given an image, the task is to identify it as a face and/or extract face images from it, and to retrieve similar images (based on a heuristic) from a given database of face images.

Another problem is the identification of criminals, especially in the identification techniques used by the police. Face recognition helps to recognize a facial image more efficiently and accurately in order to match it against an identity stored in the database.

In security systems, many types of passwords are used to access private and confidential data. Passwords and PINs are hard to remember and can be stolen or guessed. Face recognition is more secure because the facial image itself serves as the ID; it also helps to avoid duplicate identification.

Principal Component Analysis


Introduction

PCA is the general name for a technique that uses sophisticated underlying mathematical principles to transform a number of possibly correlated variables into a smaller number of variables called principal components.

PCA uses a vector space transform to reduce the dimensionality of large data sets. Using mathematical projection, the original data set, which may have involved many variables, can often be interpreted in just a few variables (the principal components).

PCA reduces the dimension of the data and decomposes the face structure into orthogonal, uncorrelated components, which are the eigenfaces. A face image can then be represented as a weighted sum of the eigenfaces.

When a new face image is to be recognized, its eigenvalues and weights are calculated. These weights are then compared with the weights of the known face images in the training set by computing the Euclidean distance between the new face image and each face in the training set. If the minimum Euclidean distance falls below a threshold, the face is known; otherwise the face is unknown.

Training Set

A training set is formed by combining different face images. The images of all the different individuals are collected in a set called the training set. Face images are collected from different types of sources, such as a video camera or a face image database. The collected face images should have variations in pose, illumination, expression, etc., in order to measure the efficiency of the face recognition system under these conditions. The face database sometimes requires preprocessing before it is used for recognition.

EigenFace

Eigenfaces are a set of features, in the form of vectors, that describe the variation between faces. The basic idea behind eigenfaces is to think of a face as a weighted combination of some component or base faces. These components are the principal components of the face images. Both the training face images and a new face image can be represented as linear combinations of eigenfaces.
To generate eigenfaces, a large set of human face images is taken first. The images are then normalized, and the eigenvectors of the covariance matrix of the face images are extracted. The eigenvectors are obtained by converting each image matrix into vector form; each vector must satisfy the eigenvalue equation.
Hence eigenfaces provide a means of applying data compression to images for identification purposes.

Weight Vector
For recognition, the weights of the largest eigenfaces are calculated from the training faces. When a new face image is to be recognized, we calculate the weights associated with the eigenfaces, which linearly approximate the face and can be used to reconstruct it. These weights are then compared with the weights of the known face images so that the input can be classified as a known or unknown face.

Euclidean Distance
The Euclidean distance, or Euclidean metric, is the ordinary distance between two points that one would measure with a ruler, given by the Pythagorean formula; with this distance, Euclidean space becomes a metric space. The Euclidean distance between points p and q is the length of the line segment connecting them. Here we use the Euclidean distance to compare the training faces with the input face: we calculate the Euclidean distance between the input image and each training face. The known face is the one with the minimum Euclidean distance. The input face is considered to belong to a class, and hence to be a known face, if its minimum Euclidean distance is below a threshold; if it is above the threshold, the face is classified as unknown.
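The distance-and-threshold test described above can be sketched in a few lines of NumPy (the project itself is implemented in MATLAB; the weight values and the threshold here are made up purely for illustration):

```python
import numpy as np

# Hypothetical weight vectors: one for the input face and one per training face.
input_w = np.array([2.0, -1.0, 0.5])
train_w = np.array([[2.1, -0.9, 0.4],   # close to the input image
                    [8.0,  3.0, 5.0]])  # far from the input image

# Euclidean distance between the input weights and each training face's weights.
dists = np.linalg.norm(train_w - input_w, axis=1)

threshold = 1.0  # illustrative; the project tunes this value empirically
best = int(np.argmin(dists))
label = best if dists[best] < threshold else "unknown"
print(dists, label)
```

If no training face falls under the threshold, the input is reported as unknown rather than forced onto the nearest class.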

Methodology

The development of the eigenface method involves several stages. The first stage is to load the images from the database: this process loads all training images and returns their contents (intensity values). The second stage constructs the images, which involves calculating the mean image, the normalized images, and the covariance matrix, and determining its eigenvectors. The third stage classifies a new image: given a test image, this function determines whether it is a face image, and if it is, whether it belongs to any of the existing face classes and, if so, which face class it corresponds to. The result appears as a bar chart once the process is done.

1. Loading Database
The database used in this project is the AT&T Database of Faces.
The database contains 400 images of 40 different persons, i.e. 10 images per person.
For some subjects, the images were taken at different times, varying the lighting, facial expressions (open / closed eyes, smiling / not smiling) and facial details (glasses / no glasses).
Later, for one test case the database is reduced to only 3 images per person, and for another test case to 5 images per person.
No extra code is needed to load the database, as the images are stored in the same folder as the MATLAB file containing the project source code, which saves another block of code.
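A hedged sketch of this loading stage in NumPy (the project loads the AT&T images directly in MATLAB; here random arrays stand in for the 92x112 grayscale image files):

```python
import numpy as np

# Stand-ins for AT&T face images (92x112 pixels, grayscale); in the real
# project these intensity values come from the image files themselves.
rng = np.random.default_rng(0)
images = [rng.integers(0, 256, size=(112, 92)) for _ in range(10)]

# Flatten every image into a column vector and stack the columns into one
# data matrix A of shape (pixels, M) -- one column per training image.
A = np.column_stack([img.reshape(-1) for img in images]).astype(float)
print(A.shape)  # (10304, 10)
```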

Shows the training set of the project, one image per person

2. Normalization of Images
Here we adjust the mean and standard deviation of all images, i.e. we normalize all images. This is done to reduce the error due to lighting conditions. The transposes of the images are used to decrease the dimensionality of the data, and the images are converted into column vectors as shown in the figure.

Shows the normalized images of database images
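A minimal NumPy sketch of this normalization step, assuming every image column is brought to a common mean and standard deviation (the target values are illustrative, not taken from the project):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.random((10304, 5)) * 255          # columns are flattened face images

# Give every image the same mean and standard deviation to reduce the
# error due to lighting conditions.
target_mean, target_std = 100.0, 80.0
A_norm = (A - A.mean(axis=0)) / A.std(axis=0)      # per-image zero mean, unit std
A_norm = A_norm * target_std + target_mean

print(A_norm.mean(axis=0), A_norm.std(axis=0))     # ~100 and ~80 per column
```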

3. Calculation of Mean
After the images are normalized, the features common to all images are extracted as the mean face. The mean face is then subtracted from each image in the database, so that every image contains only its unique features.

Imagine merging all the image columns into one column: add all the image columns pixel by pixel, then divide by M, the total number of images. The result is a single column containing the mean pixel values. This is what we call the mean face.
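The mean-face computation described above, sketched in NumPy with a small random stand-in matrix:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.random((10304, 8))                # M = 8 flattened images, one per column

# Mean face: add the image columns pixel by pixel and divide by M.
mean_face = A.mean(axis=1, keepdims=True)

# Subtract the mean face from every image so that each column keeps only
# the features that distinguish it from the average face.
Phi = A - mean_face
print(Phi.mean(axis=1)[:3])               # ~0 for every pixel
```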

4. Calculation of covariance matrix


After calculating the mean face and subtracting the mean from each image in the database, the resulting matrix is multiplied by its own transpose:

L = A * A^T

Using the function eig(L), the eigenvectors and eigenvalues are calculated, and eigenvectors with zero eigenvalues are discarded. The eigenvectors are then sorted by eigenvalue, and from them the eigenfaces are derived in sequence: the eigenface capturing the most features comes first, followed by eigenfaces with progressively fewer, down to the eigenface with the fewest features at the end.

Shows block diagram of eigenface

Shows the eigenfaces of images
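This covariance-and-eigendecomposition step can be sketched as follows (small synthetic sizes so that decomposing L = A*A^T stays cheap; real images have 10304 pixels, and NumPy's eigh, unlike MATLAB's eig, returns eigenvalues in ascending order, so we re-sort descending):

```python
import numpy as np

rng = np.random.default_rng(3)
N, M = 100, 8                              # pixels per image, number of images
A = rng.random((N, M))
Phi = A - A.mean(axis=1, keepdims=True)    # mean-subtracted image columns

# Covariance matrix L = A * A^T, then its eigen-decomposition.
L = Phi @ Phi.T
vals, vecs = np.linalg.eigh(L)             # symmetric matrix; ascending order

# Discard (near-)zero eigenvalues and sort descending, so the eigenface
# carrying the most variance comes first.
order = np.argsort(vals)[::-1]
keep = order[vals[order] > 1e-10]
eigenfaces = vecs[:, keep]
print(eigenfaces.shape)                    # (100, 7): at most M-1 nonzero eigenvalues
```

For full-size images, the standard trick is to decompose the much smaller M x M matrix A^T * A instead and map its eigenvectors back; it has the same nonzero eigenvalues.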

5. Finding the weight of each face in the training set
Eigenfaces are shadows of the original faces, or a lower-dimensional representation of them; the features associated with each eigenface are unique to the corresponding face images, except for the mean face, which is the feature common to all face images. Each image in the dataset can then be represented in terms of these k principal components.
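Representing each training image by its k principal-component weights can be sketched like this (SVD is used here simply as a convenient way to obtain orthonormal eigenfaces; the sizes are synthetic):

```python
import numpy as np

rng = np.random.default_rng(4)
N, M, k = 100, 8, 5
Phi = rng.random((N, M)) - 0.5             # mean-subtracted training images
U, _, _ = np.linalg.svd(Phi, full_matrices=False)
eigenfaces = U[:, :k]                      # top-k orthonormal eigenfaces

# Weight vector of image j: its projection onto each of the k eigenfaces.
# Column j of W holds the k weights that represent training image j.
W = eigenfaces.T @ Phi
print(W.shape)                             # (5, 8)

# Reconstruction from only k weights approximates the original image.
recon = eigenfaces @ W
print(np.linalg.norm(recon - Phi) < np.linalg.norm(Phi))
```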

6. ACQUIRE NEW IMAGE AND RECONSTRUCTION


Take the input image, read it, and reconstruct it. Here we adjust the mean and standard deviation of the input image, i.e. we normalize it, to reduce the error due to lighting conditions. The transpose of the input image is used to decrease the dimensionality, just as for the training images.

Shows the input image and the reconstructed input image

7. EUCLIDEAN DISTANCE AND RECOGNITION


Just as a weight column vector was calculated for the database set, the same is calculated for the input image, giving the eigenfaces it is associated with and the proportion of each eigenface.
The weight column vector of the input image is then matched against all the weight column vectors of the dataset. If an exact match is not found, the training image with the minimum difference (Euclidean distance) is taken; if that distance is under the recognition threshold, the output is the recognized image, otherwise the image is reported as unknown.

Shows the final result of the project, with the weight of the input image, the Euclidean distance, and the recognized/unknown face
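Putting the stages together, here is a hedged end-to-end NumPy sketch (synthetic data; the threshold value and the `recognize` helper are hypothetical illustrations, not the project's MATLAB code):

```python
import numpy as np

rng = np.random.default_rng(5)
N, M, k = 100, 8, 5
train = rng.random((N, M))                     # flattened training images
mean_face = train.mean(axis=1, keepdims=True)
Phi = train - mean_face
U, _, _ = np.linalg.svd(Phi, full_matrices=False)
eigenfaces = U[:, :k]
W_train = eigenfaces.T @ Phi                   # weight vectors of training faces

def recognize(img, threshold):
    """Project the input onto the eigenfaces; return the index of the
    closest training image, or None if the best distance exceeds the
    threshold (i.e. the face is unknown)."""
    w = eigenfaces.T @ (img.reshape(-1, 1) - mean_face)
    dists = np.linalg.norm(W_train - w, axis=0)
    best = int(np.argmin(dists))
    return (best if dists[best] < threshold else None), dists[best]

# A slightly noisy copy of training image 3 should still match image 3.
query = train[:, 3] + 0.01 * rng.standard_normal(N)
match, dist = recognize(query, threshold=1.0)
print(match)                                   # 3
```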

Data Flow Diagram

Shows level 0 DFD

Shows level 1 DFD

RESULTS
M is the number of face images in the training set.
1. Training set with 400 images: 40 persons with 10 images each.
In total M = 400, threshold = 1.2200e+04.
* The input image is one of the images in the database itself.

NOTE: "exact" in the output means that exactly the same image was recognized; "Image No." is the serial number of that image in the database; "exact" is the favorable output.
REMARK: Since the input image is itself in the database used in this case, the output is exactly the same image.
ACCURACY: 100%

2. Training set with 350 images: 35 persons with 10 images each.
In total M = 350, threshold = 1.2200e+04.
* The input image is one of the 50 left-out images (5 persons with 10 images each), so this time the database does not contain the input image.

REMARK: Since all images of the persons in the input images were removed from the database, every output should have been UNKNOWN; instead, 5/10 were recognized. Accuracy can be improved by decreasing the threshold.
ACCURACY: 50%

3. All datasets, input images, and values of M are the same as in test case 2, except that the threshold is changed to 1.1800e+04.

REMARK: After observing and analyzing the behavior of the system, the threshold was changed, resulting in a more accurate system, but only for this database, because the dataset contains 10 images of each person, providing enough features per person. The more images of a single person used in the database, the more accurate the result will be; just be careful while choosing the threshold.
ACCURACY: 90%
* Overall accuracy of test cases 1 and 3 with threshold 1.1800e+04 is 95%.

4. Training set with 280 images: 40 persons with 7 images each.
In total M = 280, threshold = 1.2200e+04.
* The input image is one of the 120 left-out images (40 persons with 3 images each).

REMARK: Decreasing the number of images per person in the database leaves fewer features per person, so the threshold may have to be kept a little higher. In this case, keeping the threshold at 1.1800e+04 did not affect the result, but it certainly will when not even a single image of the person in the input image is present in the database.
ACCURACY: 90%

5. Training set decreased from 280 to 210 images by removing all images of 10 persons from the current database.
M = 210, threshold = 1.2200e+04.
* The input image is one of the 70 images just removed from the database.

REMARK: Since all images of the persons in the input images were removed from the database, every output should have been UNKNOWN; instead, 3/10 were recognized. Increasing the number of images per person in the database will certainly help, and carefully choosing the threshold might help as well.
ACCURACY: 70%
* Overall accuracy of test cases 4 and 5 with threshold 1.2200e+04 is 80%.

6. Training set with 200 images: 40 persons with 5 images each.
In total M = 200, threshold = 1.2200e+04.
* The input image is one of the 200 left-out images (40 persons with 5 images each).

REMARK: Decreasing the number of images per person in the database leaves fewer features per person, so the threshold may have to be kept somewhat higher than in test case 4.
ACCURACY: 70%

7. Training set decreased from 200 to 150 images by removing all images of 10 persons from the current database.
M = 150, threshold = 1.2200e+04.
* The input image is one of the 50 images just removed from the database.

REMARK: Since all images of the persons in the input images were removed from the database, every output should have been UNKNOWN; instead, 3/10 were recognized. Increasing the number of images per person in the database will certainly help, and carefully choosing the threshold might help as well.
ACCURACY: 70%
* Overall accuracy of test cases 6 and 7 with threshold 1.2200e+04 is 70%.

Discussion
Only the PCA-based eigenfaces algorithm was analyzed in this work. The results are therefore on a rough scale; many other issues were ignored to simplify the research scope.

The database used in this project is the AT&T Database of Faces. There are ten different images of each of 40 distinct subjects. For some subjects, the images were taken at different times, varying the lighting, facial expressions (open / closed eyes, smiling / not smiling) and facial details (glasses / no glasses).

At least 10 images of each person in the database are recommended to get good results.

The threshold must be chosen after carefully analyzing and observing the system; otherwise the results may not be good.

Conclusion

The experiment was done in a short period of time, and only one algorithm was analyzed, so the results generalize only on a rough scale. As many other issues were ignored to simplify the research scope, this generalization may not be entirely relevant to a real-life dataset.

The approach is robust, simple, and easy and fast to implement compared to other algorithms, and it provides a practical solution to the recognition problem. We are currently investigating in more detail the issues of robustness to changes in head size and orientation.

This technology could be used commercially, for example as a security measure at ATMs: instead of using a bank card or personal identification number, the ATM would capture an image of the customer's face and compare it to the photo in the bank's database to confirm the customer's identity. In elections, governments could employ facial recognition software to prevent voter fraud.

Future Scopes

This project is based on the eigenface approach, which gives a maximum accuracy of 92.5%. In the future, a neural network technique could give better results than the eigenface approach and improve the accuracy. The whole software depends on the database, and the database depends on the resolution of the camera, so results will improve if a good-resolution digital or analog camera is used. The software therefore has very good future scope if a good-resolution camera and a neural network technique are used. Proper help for the user has not yet been developed; help messages for the user, along with software documentation, can be added in the future.

References

Matthew A. Turk and Alex P. Pentland, "Eigenfaces for Recognition", Journal of Cognitive Neuroscience, Volume 3, Number 1, 1991.

https://www.cs.princeton.edu/picasso/mats/PCA-TutorialIntuition_jp.pdf

IOSR Journal of Engineering, e-ISSN: 2250-3021, p-ISSN: 2278-8719, www.iosrphr.org, Vol. 2, Issue 12 (Dec. 2012), ||V4||, pp. 15-23.

International Journal of Advanced Research in Computer Science and Software Engineering, Volume 3, Issue 5, May 2013, ISSN: 2277-128X. Available online at: www.ijarcsse.com

IJRET: International Journal of Research in Engineering and Technology, eISSN: 2319-1163 | pISSN: 2321-7308.
