
Real-time Face Detection and Emotion & Gender Classification Using a Convolutional Neural Network

Presented by:
Md. Abu Rumman Refat
Roll:1218005
Reg:1264
Session:2013-14
Dept. of Information & Communication Engineering
Islamic University, Kushtia-7003,
Bangladesh.
Synopsis
o Introduction
o Motivation
o Problem description
o Deep learning
o Deep convolutional neural network
o Experiment and evaluation
o Live demo of the project
o Conclusion
o Future Scope
Introduction
o Facial emotion and gender recognition is one of the cognitive functions that our brains perform most efficiently.
o Facial emotion and gender are important cues for understanding human behavior and sensing people's intentions.
o Facial expression recognition has multiple application areas:
1. Medical research and Assistive healthcare
2. Security
3. Targeted marketing
4. Pain detection and Detection of truthfulness
5. Augmented reality
6. Better selfie
Motivation
o Computer scientists are now empowered to tackle problems in computer vision.
o In spite of the difficulties, facial emotion and gender recognition systems can be made robust and computationally efficient.
o This work presents the design and implementation of a real-time vision system that accomplishes the following tasks:

• Face detection
• Facial emotion classification
• Gender classification
Problem description
o Gender classification has two classes, men and women, whereas human facial emotion recognition is commonly divided into seven universal emotions: happy, angry, sad, surprise, fear, disgust and neutral.
o The real-time facial expression recognition problem has three important tasks:
• Face detection
• Feature extraction
• Classification
Figure: Recognition pipeline (training data, input image, face detection, feature extraction, classification, prediction)
Face detection
o OpenCV is used to capture the live image.
o Haar cascade classifiers are used to detect human faces.
o The Haar cascade classifier is a machine-learning-based approach created by Paul Viola and Michael Jones.
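The detection step can be sketched as follows. This is a minimal illustration, not the project's actual code: it assumes the `opencv-python` package (which ships `haarcascade_frontalface_default.xml` under `cv2.data.haarcascades`), the helper names `clamp_box` and `detect_faces` are hypothetical, and the `scaleFactor`, `minNeighbors` and `margin` values are illustrative defaults.

```python
def clamp_box(box, frame_w, frame_h, margin=0):
    """Grow an (x, y, w, h) box by `margin` pixels and clip it to the frame."""
    x, y, w, h = box
    x0 = max(x - margin, 0)
    y0 = max(y - margin, 0)
    x1 = min(x + w + margin, frame_w)
    y1 = min(y + h + margin, frame_h)
    return x0, y0, x1, y1  # corner coordinates, usable as frame[y0:y1, x0:x1]


def detect_faces(frame, detector):
    """Return a list of (x0, y0, x1, y1) face boxes for one BGR frame."""
    import cv2  # imported lazily so clamp_box works without OpenCV installed

    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Assumed, illustrative detection parameters.
    boxes = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    height, width = gray.shape
    return [clamp_box(tuple(b), width, height, margin=10) for b in boxes]


# Usage with a webcam (assumption: the default camera is at index 0):
#   import cv2
#   detector = cv2.CascadeClassifier(
#       cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
#   ok, frame = cv2.VideoCapture(0).read()
#   if ok:
#       print(detect_faces(frame, detector))
```

Each returned box can then be cropped from the frame and passed to the emotion and gender classifiers.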
Deep Learning
o Deep learning is an artificial intelligence function that enables a machine to imitate the workings of the human brain for use in decision making.
o Deep learning maps inputs to outputs by finding correlations between them.

Applications:
o Colorization of black-and-white images
o Automatic machine translation
o Object classification in photographs
o Image caption generation
Convolutional neural network (CNN)
o CNNs are artificial neural networks (ANNs) that impose constraints on the weights in order to account for more specialized data.
o CNNs replace the matrix multiplication of fully connected ANNs with the convolution operation.
What is Convolution
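A small worked example may help here. The sketch below slides a kernel over an image and sums the element-wise products at each position, which is the operation CNN layers call convolution (strictly a cross-correlation, with stride 1 and no padding). The function name `conv2d`, the image and the kernel are all made up for illustration.

```python
def conv2d(image, kernel):
    """Valid 2-D convolution as used in CNNs (cross-correlation, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Weighted sum of the window whose top-left corner is (i, j).
            total = 0
            for di in range(kh):
                for dj in range(kw):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        out.append(row)
    return out


image = [
    [1, 2, 3, 0],
    [4, 5, 6, 1],
    [7, 8, 9, 2],
    [1, 0, 1, 3],
]
kernel = [
    [1, 0],
    [0, -1],
]
feature_map = conv2d(image, kernel)
print(feature_map)  # [[-4, -4, 2], [-4, -4, 4], [7, 7, 6]]
```

The same four kernel weights are reused at every position, which is the weight constraint that distinguishes a convolutional layer from a fully connected one.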
Activation function: ReLU
Advantages:
o Greatly accelerates the convergence of SGD.
o Avoids expensive computations: ReLU only thresholds its input at zero.
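To make the cost argument concrete, the sketch below places ReLU next to a sigmoid, used here only as a point of comparison. ReLU needs a single comparison, while the sigmoid needs an exponential. The sample activations are made up.

```python
import math


def relu(x):
    """Rectified linear unit: pass positive values, zero out the rest."""
    return x if x > 0 else 0.0


def sigmoid(x):
    """Logistic sigmoid, shown only for comparison (requires exp)."""
    return 1.0 / (1.0 + math.exp(-x))


activations = [-2.0, -0.5, 0.0, 1.5, 3.0]
print([relu(a) for a in activations])  # [0.0, 0.0, 0.0, 1.5, 3.0]
```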
Max-pooling

o It reduces the dimensionality of each feature map but retains the most important information.
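As an illustration, the sketch below applies 2x2 max-pooling with stride 2 to a made-up 4x4 feature map. Each window is replaced by its largest value, so the map shrinks to 2x2 while the strongest responses are kept. The window size and stride are assumed, commonly used values.

```python
def max_pool2d(feature_map, size=2, stride=2):
    """Keep the maximum of each size x size window."""
    out = []
    for i in range(0, len(feature_map) - size + 1, stride):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, stride):
            window = [feature_map[i + di][j + dj]
                      for di in range(size) for dj in range(size)]
            row.append(max(window))
        out.append(row)
    return out


feature_map = [
    [1, 3, 2, 0],
    [4, 6, 1, 1],
    [0, 2, 9, 5],
    [1, 1, 3, 7],
]
pooled = max_pool2d(feature_map)
print(pooled)  # [[6, 2], [2, 9]]
```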
Fully connected layer

o The purpose of the Fully Connected layer is to use high-level features for classifying the input
image into various classes based on the training dataset
Proposed Model
o Our proposed model is inspired by the Xception architecture.
o This architecture combines two of the most successful experimental assumptions in CNNs: the use of residual modules and depth-wise separable convolutions.
o We also apply global average pooling and a softmax operation.
Residual modules
o Residual modules modify the desired mapping between two subsequent layers, so that the learned features become the difference between the original feature map and the desired features.

Figure: Residual modules
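The idea can be sketched on plain feature vectors. The block outputs `transform(x) + x`, so `transform` only has to learn the difference between the input and the desired output. This shows the simplest identity-skip form with hypothetical names; it is not the exact module used in the proposed model, where the layers are convolutional.

```python
def residual_block(x, transform):
    """Return transform(x) + x element-wise (the skip connection)."""
    fx = transform(x)
    if len(fx) != len(x):
        raise ValueError("transform must preserve the feature size")
    return [f + xi for f, xi in zip(fx, x)]


def small_correction(x):
    """Stand-in for the learned layers: a small adjustment of the input."""
    return [0.1 * xi for xi in x]


features = [1.0, -2.0, 4.0]
output = residual_block(features, small_correction)

# If the learned layers output zeros, the block reduces to the identity.
unchanged = residual_block(features, lambda x: [0.0] * len(x))
```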


Depth-wise separable convolution
o The main purpose of these layers is to separate the spatial cross-correlations from the channel cross-correlations.
o The use of depth-wise separable convolutions reduces the computation with respect to the standard convolution.

Figure: Difference between (a) standard convolution and (b) depth-wise separable convolution
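The saving can be checked by counting weights. A standard convolution with a k x k kernel uses k·k·C_in·C_out weights. A depth-wise separable convolution uses k·k·C_in weights for the per-channel spatial filters plus C_in·C_out weights for the 1x1 point-wise step. The layer sizes below are made up for illustration, and biases are ignored.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (no bias)."""
    return k * k * c_in * c_out


def separable_conv_params(k, c_in, c_out):
    """Weights in a depth-wise (k x k per channel) plus point-wise (1 x 1) pair."""
    depthwise = k * k * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


k, c_in, c_out = 3, 16, 32  # illustrative layer sizes
standard = standard_conv_params(k, c_in, c_out)
separable = separable_conv_params(k, c_in, c_out)
print(standard, separable)  # 4608 656
```

For this example layer the separable version needs roughly one seventh of the weights. In general the ratio is 1/C_out + 1/k².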
Global average pooling
o A solution to reduce the number of parameters is to use global average pooling.

Figure: Global average pooling layer.
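A minimal sketch of how global average pooling and softmax produce class probabilities without a fully connected layer: each feature map is averaged to a single score, and softmax turns the scores into probabilities. The sketch assumes the last convolutional layer emits one feature map per class, and the two 2x2 maps are made up.

```python
import math


def global_average_pool(feature_maps):
    """Reduce each 2-D feature map to its mean value."""
    return [sum(sum(row) for row in fmap) / (len(fmap) * len(fmap[0]))
            for fmap in feature_maps]


def softmax(scores):
    """Convert scores to probabilities (shifted by the max for stability)."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


feature_maps = [
    [[1.0, 3.0], [5.0, 7.0]],  # mean 4.0
    [[0.0, 2.0], [2.0, 0.0]],  # mean 1.0
]
scores = global_average_pool(feature_maps)
probs = softmax(scores)
```

Averaging has no trainable weights, which is why it reduces the parameter count compared with a fully connected output layer.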


Experiment and Evaluation
Dependency:
o Programming language used: Python
o Deep learning frameworks: Keras and TensorFlow
o Real-time implementation: webcam, OpenCV
o Training environment: Google Colaboratory (provides free GPU services)
Dataset:
o FER-2013 dataset (emotion classification)
o Training set: 28,709 patterns
o Validation set: 3,589 patterns
o Testing set: 3,589 patterns

Figure: FER-2013 dataset


o IMDB face images dataset (gender classification)
o Training set: 460,723 face images
o Validation set: 91,145 images
o Testing set: 73,715 images
Figure: IMDB-face images dataset
Model Evaluation:
o Our model architecture is a fully convolutional neural network that contains 4 residual depth-wise separable convolution modules.
o It has approximately 60,000 parameters (a reduction of more than 10 times compared to other pretrained CNN models).
o Our complete pipeline, including OpenCV face detection and gender and facial emotion recognition, takes:
◦ 0.11 ± 0.02 seconds on an i5-4210M CPU.
Confusion Matrix

From the matrix, we can infer that fear is confused with sadness
and sadness is confused with neutral.
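For reference, a confusion matrix can be built by counting (true label, predicted label) pairs. Rows are true classes and columns are predicted classes. The labels and predictions below are toy values chosen only to show the mechanics. They are not results from this project.

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """Count predictions: matrix[i][j] = true class i predicted as class j."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(true_labels, predicted_labels):
        matrix[index[t]][index[p]] += 1
    return matrix


classes = ["fear", "sad", "neutral"]
# Toy data, made up for illustration only.
true_labels = ["fear", "fear", "sad", "sad", "neutral", "neutral"]
predicted = ["fear", "sad", "sad", "neutral", "neutral", "neutral"]

matrix = confusion_matrix(true_labels, predicted, classes)
correct = sum(matrix[i][i] for i in range(len(classes)))
accuracy = correct / len(true_labels)
```

Off-diagonal entries show which classes are mistaken for each other. In this toy data one "fear" sample lands in the "sad" column and one "sad" sample lands in the "neutral" column.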
Live demo of the project
Conclusion
o Our proposed model achieves an accuracy of 95% in gender classification and 66% in facial emotion recognition.
o It achieves almost the same classification accuracy as the methods in our literature review while reducing the number of parameters, which improves computational efficiency and saves memory.
Future Scope
Future plan:
o Extend the data samples (eliminate the dataset bias): most of the samples in the gender classification dataset (IMDB) contain Western faces.
o Reduce the number of parameters further.
o Incorporate more classification tasks, such as age.
What do you think?

Can we build a machine that can efficiently recognize our facial emotion and gender using a CNN?

Yes or no?
