
Multimedia Tools and Applications
https://doi.org/10.1007/s11042-019-7243-y

An efficient face recognition system based on hybrid optimized KELM

S. Anantha Padmanabhan (1) and Jayanna Kanchikere (2)

Received: 5 November 2018 / Revised: 4 January 2019 / Accepted: 17 January 2019
© Springer Science+Business Media, LLC, part of Springer Nature 2019

Abstract
Face recognition (FR) from video is a challenging problem in image analysis and computer vision, and it has received a great deal of attention over the past years owing to its many applications across domains. The chief challenges in video-based FR are the constraints of the camera hardware, the arbitrary poses captured by the camera when the subject is non-cooperative, and changes in resolution owing to differing lighting conditions, noise and blurriness. Numerous FR algorithms have been developed over the previous decade; although these approaches perform well, their recognition accuracy remains limited. To overcome such difficulties, an efficient FR system based on a hybrid optimized Kernel ELM is proposed. The proposed work encompasses five phases, namely (i) preprocessing, (ii) face detection, (iii) feature extraction, (iv) feature reduction, and (v) classification. In the preliminary phase, the database video clips are converted into frames, which are pre-processed using a modified Wiener filter to eliminate noise. The succeeding phase detects the face in the pre-processed image via the Viola-Jones (V-J) algorithm. After that, the features are extracted and provided as input to the Modified PCA approach for reduction. Classification is then performed using the hybrid (PSO-GA) optimized Kernel ELM approach. The same process is repeated for query images (QI), and finally the recognized image is obtained. Experimental results are contrasted with the previous ANFIS classifier and existing methods in terms of precision, accuracy, recall, F-measure, sensitivity and specificity. The proposed FR system shows better accuracy when compared with the prevailing methods.

Keywords Kernel extreme learning machine (KELM) · Modified principal component analysis (MPCA) · Hybrid particle swarm optimization-genetic algorithm (PSO-GA) · Adaptive neuro-fuzzy inference system (ANFIS) · Modified Wiener filter (MWF)

* S. Anantha Padmanabhan
  ananthu.padmanabhan@gmail.com

  Jayanna Kanchikere
  jayannak69@gmail.com

(1) Department of ECE, Gopalan College of Engineering and Management, Bangalore, India
(2) Department of EEE, St. Peters Engineering College, Hyderabad, India

1 Introduction

FR is a computer technology that ascertains the location and size of a person's face in a digital image and is a key technology in facial information processing. It has been extensively applied in pattern recognition, electronic commerce, human-computer interfaces, authentication, identification, automated video surveillance, health and finance, among others [8]. There are two common FR applications: i) identification and ii) verification. Face identification means identifying an individual from their face image. FR establishes the presence of an authorized individual rather than merely checking whether a legitimate identification (ID) or key is being used, or whether the secret personal identification numbers (PINs) or passwords are known [20]. In face verification/authentication, there is a one-to-one matching that contrasts a QI against a template image whose identity is being claimed [13].
The most important clue for person recognition is the face. If the information is related to the face, then the individual can be identified and details can be searched using a face query. This is highly promising, as cameras are now present on every portable device, not to mention the wearable-device era initiated by Google Glass. A system that performs FR using a portable device and server communication is therefore promising [19].
Although numerous applications and systems have applied FR technology, it remains a particularly challenging problem to determine the best methods that provide high accuracy along with low computational cost. Many researchers have worked on FR for years, yet several challenges remain. To date, numerous methods have been employed, from customary approaches such as Eigenfaces and Fisherfaces, implemented through algorithms like Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA) [2] and Independent Component Analysis (ICA), et cetera. The classification methods of Support Vector Machine (SVM) [18] and Local Binary Pattern (LBP) [3, 11, 21] are also extensively utilized for FR. Additionally, neural network (NN) approaches [12] and deep learning (DL) approaches, for instance the convolutional neural network (CNN), have been used in facial recognition to overcome these challenges.
DL is a new area of computer vision and machine learning that has been productively used for image dimensionality reduction and recognition. DL is a machine learning technique that adapts neural network architectures and consists of a multilayer perceptron (MLP) with multiple hidden layers. By utilizing a model architecture consisting of several non-linear (NL) transformations, DL is able to find high-level features in the data; these features are derived from the lowest level upward to form a hierarchical representation [14].
Every FR algorithm suffers a performance drop whenever the facial appearance changes on account of occlusion, expression, illumination, accessories, pose or aging [23].
The remainder of the paper is organized as follows: Section 2 surveys the works related to the proposed approach; Section 3 presents a concise discussion of the proposed methodology; Section 4 analyzes the experimental outcomes; and Section 5 conveys the conclusion of this paper.

2 Related work

Sasirekha and Thangavel [22] suggested a scheme to recognize human faces based on the K-nearest neighbor (KNN) classifier employing particle swarm optimization (PSO). Initially, the features were extracted using LBP. Metaheuristic optimizations, namely GA, PSO and ant colony optimization, were examined for feature selection. The KNN was optimized using the population-based PSO. Lastly, FR was performed via the PSO-KNN. Experiments were conducted on real-time face images collected from 155 subjects, each with 10 orientations captured using a Logitech webcam, along with the ORL face dataset. The experimental outcome of the PSO-KNN was weighed against other standard recognition techniques, namely decision tables, SVM, MLP and traditional KNN, to deduce the approach's effectiveness. The drawback of this approach is that the variants of KNN incorporated with PSO optimization did not give a better result.
Suchitra et al. [1] presented a Biogeography-PSO (BPSO) based Counter Propagation Network (CPN), namely BPSO-CPN, aimed at Sketch-Based Face Recognition (SBFR). A rule for choosing the exemplar vector using biogeography learning-based PSO was employed to minimize the Mean Square Error (MSE) between the feature vectors of sketch and photo. Here, the Histogram of Oriented Gradients (HOG) feature vector was employed as the similarity gauge between sketch and photo. A photo was chosen as the QI from the database, and BPSO-CPN was used to retrieve similar photos from the database. BPSO-CPN was tested on the CUHK and IIITD datasets enclosing almost 1000 sketches and photos. The experimental outcome showed that BPSO-CPN provided promising results and attained higher precision compared with other existing methods and NNs. The motivation for this work came from discovering missing/wanted individuals involved in anti-national behavior [5, 7]; furthermore, it aids investigating agencies in confining suspects swiftly. The drawback of this approach is that fewer images were taken for testing.
Harihara et al. [9] offered an algorithm for FR along with human tracking. Individuals were tracked employing a Gaussian mixture model (GMM). For tracking the individual, the GMM model was split into 4 regions that were stacked one over the other and tracked concurrently. For identifying the person, the HOG features of the face area were provided to the SVM. Three experiments were performed on the selection of training images: every tenth, every fifth and every third frame of the first hundred frames were regarded. The remaining frames of the video were used for testing with the SVM. Three datasets, namely AITAM1 (easy), AITAM2 (moderate) and AITAM3 (intricate), were utilized. The experimental outcomes illustrated that as the complexity of the dataset increased, the performance metrics lessened, and the greater the number of training images for preparing a classifier, the better the FR; this held for all the datasets. The performance outcomes demonstrated that the blend of tracking with the FR algorithm not merely tracks the individual but also identifies that person. This exclusive property of combined tracking and recognition makes it well suited for video surveillance applications.
Gabriel et al. [10] recommended an FR approach based on combining thermal and visible descriptors. It was bifurcated into 2 steps: i) training and ii) validation. In the former, the system attained the optimum weights from the PSO to maximize the recognition rates (RR) obtained from disparate blends of local descriptor (LD) techniques using a thermal face database (Equinox). The weights were thereafter utilized to combine visible and thermal face descriptors to attain higher RR in the latter stage using the Pontificia Universidad Católica de Valparaíso - Visible Thermal Face (PUCV-VTF) database. Three local matching techniques were utilized to perform FR: LBP, HOG and LD Pattern. Additionally, the article included a comparison with the following methods: a preceding work based on GA together with a customized PSO. The outcomes illustrated RR above ninety-nine percent for the PUCV-VTF, largely outdoing the outcomes for GA. The fusion method was determined to be invariant to changes in illumination and expression conditions, uniting the visible with the thermal information competently via the PSO, and thereby selecting the optimum regions in which a specified spectrum is more pertinent. The con of this approach is that it does not explore new solutions based upon deep learning using deep convolutional networks to merge visible and thermal images.
Ze et al. [16] projected a color space LuC1C2 based on a framework for building effectual color spaces for FR. It is made up of one luminance component Lu and two chrominance components C1 and C2. Lu was chosen from four disparate luminance candidates by contrasting their R, G, B and color-sensor properties. For C1 and C2, the directions of the transform vectors (TV) were ascertained by means of discriminant and covariance analysis on the chrominance sub-space of the RGB. The magnitudes of the TV were ascertained via the discriminant values of LuC1C2. Wide-ranging experiments were carried out on four standard databases to assess this color space. The experimental outcomes, attained using two disparate color features along with three disparate dimension-reduction methods, showed that LuC1C2 attained consistently better FR performance than the top-notch color spaces on three databases. It also showed that the color space achieved a higher face RR than the state of the art on FRGC. Additionally, the face identification performance was enhanced considerably by uniting CNN features with simple raw-pixel features from the LuC1C2 color space on LFW and FRGC.
Yang et al. [25] suggested a recurrent regression neural network (RRNN) to unite two typical tasks of cross-pose FR in still images and videos. For emulating the modifications of images, the possible dependence of sequence images is explicitly built to regularize the final learning model. By performing progressive transforms for successively adjoining images, RRNN adaptively memorizes and forgets the information gained from the previous classification. For FR of still images, given any one image with any one pose, the images with its sequence of poses are repeatedly predicted in the expectation of capturing useful information from other poses. For video-based FR, the recurrent regression takes one whole sequence rather than one image as its input. Lastly, RRNN was verified on the still-face-image dataset MultiPIE and the face-video dataset YouTube Celebrities (YTC). The comprehensive experimental outcomes showed the efficiency of RRNN.
Jingjing et al. [15] proposed an FR approach based on K-SVD learning for resolving the big-sample issue by utilizing joint sparse representation. The core notion of this method was to learn variation dictionaries from the gallery and probe face images independently; afterward an enhanced joint sparse representation was suggested that employed the information learned from the gallery and probe samples efficiently. Lastly, the method was contrasted with some related methods on numerous recognized face databases, including YaleB, CMU-PIE, AR, Georgia and LFW. The experimental results delineated that this method outshone numerous related FR approaches. This approach has high computational time, and the dictionary size was too small.
Pratibha et al. [24] posited a GA [17] based approach for FR. The algorithm identified an unknown image by contrasting it with the known training images stored in the database and provided details concerning the individual identified [4, 6]. The algorithm was then weighed against other recognized FR methods, viz. PCA and LDA, and it was observed that the RR of this algorithm was superior. However, the optimization technique did not give the perfect output.

3 Proposed methodology

FR is turning into a demanding chore for researchers these days. In video processing, FR has drawn increasing interest during the previous years. Video FR plays a crucial role today in numerous applications, for example video surveillance, biometric identification and content-based video indexing/search. Its applications may well be extremely valuable for individual authentication and recognition. However, it is very complicated to implement because of all the dissimilar circumstances in which a human face can be found. The proposed video FR system proceeds through the following phases: initially, the database video clips are converted into frames. The pre-processing phase is done using a modified Wiener filter. From the pre-processed image, the face is identified utilizing the V-J algorithm. Image features of contrast, energy, correlation, cluster prominence, maximal probability, entropy, homogeneity, cluster shade, local homogeneity, sum of squares or variance, dissimilarity, autocorrelation and inverse variance moment are extracted once the face detection is accomplished. Then, the feature-reduction operation is performed using the Modified PCA algorithm. The feature values are presented as the input to the Kernel ELM, and the parameters of the Kernel ELM are optimized by the hybrid (PSO-GA) algorithm to achieve precise, high recognition. The structural design of the proposed work is delineated in Fig. 1.

3.1 Input video

Consider I_v as the video database and

I_v = { f^1_{a,b}, f^2_{a,b}, ..., f^m_{a,b} }     (1)

as the set of frames existing in the video, where f^m_{a,b} is the m-th frame f_m with pixel position (a, b).
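As a concrete illustration of Eq. (1), the sketch below splits a video clip into its frames with OpenCV; the clip name is only a placeholder, not a file from the paper's database.

```python
# Sketch: split a database video clip into its frames (Eq. 1), assuming OpenCV is available.
import cv2

def video_to_frames(path):
    frames = []
    cap = cv2.VideoCapture(path)
    while True:
        ok, frame = cap.read()     # next frame f_m, or failure at end of clip
        if not ok:
            break
        frames.append(frame)       # each entry corresponds to f^m_{a,b} in Eq. (1)
    cap.release()
    return frames

frames = video_to_frames("subject01.avi")   # hypothetical file name
print(f"extracted {len(frames)} frames")
```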

Fig. 1 Block Diagram for proposed System



3.2 Preprocessing

At first, the input image is pre-processed. Preprocessing is of vital importance in FR. To eradicate noise, the processing comprises filtering. Here, the preprocessing of the input image is executed via a modified Wiener Filter (WF).

3.2.1 MWF

The WF implements an optimal trade-off between noise smoothing and inverse filtering (IF). It eradicates additive noise and reverses blurring concurrently. This filtering is optimal with respect to the MSE; that is, it minimizes the overall MSE in the process of IF along with noise smoothing. The grayscale image is given as input to the MWF, and the quantity of noise present in the image is minimized. This is executed by contrasting the received image with an estimate of the desired noise-less signal. In this work, the Wiener filter is combined with a median-based error term: the MSE is calculated first, and the filter minimizes the median square error given below.

b(i, j) = M' + ((σ^2 + r^2) / σ^2) (G(i, j) − M')     (2)

where M' is the representation of the MWF in the time domain, G(i, j) is the local median around every pixel, r^2 is the noise variance and σ^2 is the local variance of the image. The filtered image is mathematically represented in the frequency domain as shown below.

G(u, v) = H(u, v) P_xx(u, v) / ( |H(u, v)|^2 P_xx(u, v) + P_ηη(u, v) )     (3)

where G(u, v) denotes the Wiener filter, P_xx(u, v) and P_ηη(u, v) are the corresponding power spectra of the real image and the additive noise, and H(u, v) is the blurring filter. It is easy to see that the WF encompasses two separate parts: i) an IF part and ii) a noise-smoothing part. It not merely executes the deconvolution via IF (high-pass filtering) but also eradicates the noise by means of a compression process (low-pass filtering).
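The following sketch illustrates a locally adaptive, Wiener-style smoothing step of the kind described above, with the local mean replaced by a local median in the spirit of Eq. (2). It uses the classical Wiener gain rather than Eq. (2) verbatim, and the window size and noise estimate are illustrative assumptions.

```python
# Minimal sketch of an adaptive Wiener-style filter with a local-median centre.
import numpy as np
from scipy.ndimage import median_filter, uniform_filter

def modified_wiener(gray, win=5):
    gray = gray.astype(np.float64)
    local_med = median_filter(gray, size=win)                 # M' / G(i, j): local median
    local_var = uniform_filter(gray**2, win) - uniform_filter(gray, win)**2
    noise_var = np.mean(local_var)                            # crude global noise estimate
    gain = np.maximum(local_var - noise_var, 0) / np.maximum(local_var, 1e-12)
    return local_med + gain * (gray - local_med)              # smooth strongly where variance is low
```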

3.3 Face detection using V-J algorithm

The pre-processed image (f_m') obtained from the modified Wiener filter then undergoes face detection, which is accomplished via the V-J algorithm. With the support of this technique, the face, the left and right eyes, the nose and the mouth of the particular image are identified. The algorithm incorporates four phases, namely:

- Haar feature selection,
- generating an integral image,
- AdaBoost training,
- cascading classifiers.

3.3.1 Haar feature selection

The preliminary phase of the V-J approach is the selection of Haar features from the pre-processed image. The Haar features can be of various heights and widths. For a feature applied to the face, the sum of the black pixels and the sum of the white pixels are computed and subtracted to obtain a single value. If this value is large in an area, it signifies a face part such as the eyes, cheek or nose. The feature applied to the pre-processed image is delineated as

V_RF(f_m) = Σ_{1≤u≤N} Σ_{1≤v≤N} P(u, v)_{A(black)} − Σ_{1≤u≤N} Σ_{1≤v≤N} P(u, v)_{A(white)}     (4)

where V_RF(f_m) is the rectangular feature value, P(u, v)_{A(black)} are the pixels of the black area and P(u, v)_{A(white)} are the pixels of the white area.
Figure 2 shows five feature types: type 1 and type 2 are two-rectangle features, type 3 and type 4 are three-rectangle features, and type 5 is a four-rectangle feature. The value of a two-rectangle feature is the difference between the sums of the pixels within the two rectangular regions. In a three-rectangle feature, the value of the center rectangle is deducted from the sum of the two surrounding rectangles. A four-rectangle feature estimates the difference between the diagonal pairs of rectangles.

3.3.2 Generating an integral image

The succeeding phase of V-J face detection is to convert the input image into an integral image, obtained by making each pixel equal to the sum of all the pixels above it and to its left. The integral image at location (u, v) holds the sum of the pixels above and to the left of (u, v), inclusive:

I_i(u, v) = Σ_{u' ≤ u, v' ≤ v} i(u', v')     (5)

in which I_i(u, v) is the integral image and i(u', v') is the actual image intensity.
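A minimal sketch of Eq. (5), and of how any rectangular (Haar) sum can then be read off in constant time from four lookups:

```python
# Integral image (Eq. 5) via cumulative sums, plus constant-time rectangle sums.
import numpy as np

def integral_image(img):
    return img.astype(np.int64).cumsum(axis=0).cumsum(axis=1)

def rect_sum(ii, r0, c0, r1, c1):
    # Sum of img[r0:r1+1, c0:c1+1] computed from the integral image ii.
    total = ii[r1, c1]
    if r0 > 0: total -= ii[r0 - 1, c1]
    if c0 > 0: total -= ii[r1, c0 - 1]
    if r0 > 0 and c0 > 0: total += ii[r0 - 1, c0 - 1]
    return total
```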

3.3.3 Ada Boost training

The third phase of the V-J approach is AdaBoost training, a boosting algorithm capable of creating a strong classifier from a weighted combination of weak classifiers. To match these terms to the existing hypothesis, each feature is evaluated as a potential weak classifier.

Fig. 2 Haar Features in V-J

A weak classifier is formally stated below:



h(i, q, k, T) = 1 if k·q(i) > k·T, and 0 otherwise     (6)

where i is a 24×24-pixel sub-window, q is the applied feature, k is the polarity and T is the threshold that decides whether the sub-window is classified as positive (face) or negative (non-face).

3.3.4 Cascading classifier

The final phase is the cascade of weak classifiers. The task of each stage is to determine whether a particular sub-window is definitely not a face or possibly a face. If a sub-window is classified as a non-face by any stage, it is immediately discarded. In contrast, a sub-window classified as possibly a face is passed to the succeeding stage of the cascade. A sub-window that survives all stages of the cascade actually contains a face, with mouth, nose and left/right eyes. Through the V-J algorithm, the face f in the given image is identified.
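For reference, the same detection step can be reproduced with OpenCV's stock Viola-Jones (Haar cascade) detector; the scale factor and neighbour count below are common defaults, not values taken from this paper.

```python
# Sketch: Haar-cascade face detection on a grayscale frame with OpenCV.
import cv2

def detect_faces(gray):
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    # Returns a list of (x, y, w, h) face rectangles.
    return cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

# faces = detect_faces(cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY))
```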

3.4 Features extraction

This is the method of converting the detected face image into a series of features. Features of contrast, entropy, correlation, local homogeneity, energy, optimum (maximal) probability, cluster shade, sum of squares, homogeneity, variance, cluster prominence, autocorrelation, dissimilarity and inverse variance moment are utilized for ascertaining the image content.

3.4.1 Contrast (C)

Contrast is the difference between the higher and lower values of an adjacent set of pixels. It gauges the amount of local variation present in the image.

C = Σ_{u,v} |u − v|^2 P(u, v)     (7)

in which C is the contrast and P(u, v) is the value at location (u, v).

3.4.2 Cluster shade and cluster prominence

Cluster shade is a measure of the asymmetry of the matrix and is expected to gauge the perceptual notion of uniformity.

Shade = Σ_{u=0}^{G−1} Σ_{v=0}^{G−1} (u + v − μ_u − μ_v)^3 p(u, v)     (8)

where μ_u and μ_v denote the means of the clustering pixels. Cluster prominence is also an asymmetry measure; when this value is high, the image is less symmetrical.

P = Σ_{u=0}^{G−1} Σ_{v=0}^{G−1} (u + v − μ_u − μ_v)^4 p(u, v)     (9)

3.4.3 Correlation and auto correlation

The correlation feature is a gauge of gray-level linear dependencies in the image; the remaining textural features are secondary to it.

C_R = Σ_u Σ_v [ (u v) P(u, v) − μ_u μ_v ] / (σ_x σ_y)     (10)

3.4.4 Homogeneity (H)

Homogeneity gauges the image homogeneity, as it assumes large values for small gray-level differences between pair elements. It is more sensitive to the presence of near-diagonal elements in the GLCM and has its maximal value when all the elements in the image are the same.

H = Σ_{u,v} P(u, v) / (1 + |u − v|)     (11)

3.4.5 Inverse variance moment (local homogeneity)

The Inverse Difference Moment (IDM) measures regional (local) homogeneity. It is large when the regional gray level is uniform. Its weight value is the inverse of the contrast weight: GLCM contrast and homogeneity are strongly but inversely correlated with respect to the equal distribution of the pixel pairs, meaning that homogeneity decreases as contrast increases while energy is kept constant.

IDM = Σ_{u=0}^{G−1} Σ_{v=0}^{G−1} p(u, v) / (1 + (u − v)^2)     (12)

3.4.6 Sum of squares

This is a mathematical technique for determining the dispersion of data points. In an investigation, the intent is to ascertain how well a data series fits the function that might assist in illustrating how the data series was generated.

S = Σ_{u,v} P(u, v)     (13)

3.4.7 Energy (E)

Energy returns the sum of squared elements of the GLCM. For a constant image, the energy is one.

E = Σ_{u,v} P(u, v)^2     (14)

3.4.8 Dissimilarity

Dissimilarity is a numerical measure of how different two data objects are; it ranges from 0 (objects are identical) to ∞ (objects are completely different).

3.4.9 Optimum probability

Probability is measured as a number between 0 and 1 (in which 0 denotes impossibility and 1 signifies certainty). The optimum (maximal) probability of an event indicates how certain it is that the event will occur.

3.4.10 Entropy (e)

Entropy is a measure of randomness.

e = − Σ_{v=0}^{M} P(u, v) log_2 P(u, v)     (15)

in which M is the number of distinct values that the pixels may adopt.

3.4.11 Variance (V)

Variance measures how much the gray levels vary from the mean.

V = Σ_v Σ_u (u − μ)^2 P(u, v)     (16)
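The GLCM-based features of this section can be computed, for example, with scikit-image (the graycomatrix/graycoprops API of recent releases); entropy, cluster shade and cluster prominence are added by hand since graycoprops does not provide them. The distance/angle settings and the assumption of an 8-bit grayscale face crop are illustrative choices.

```python
# Sketch: a subset of the Section 3.4 texture features from a single GLCM.
import numpy as np
from skimage.feature import graycomatrix, graycoprops

def glcm_features(gray, levels=256):
    glcm = graycomatrix(gray, distances=[1], angles=[0], levels=levels,
                        symmetric=True, normed=True)
    p = glcm[:, :, 0, 0]                                            # normalised co-occurrences
    u, v = np.meshgrid(np.arange(levels), np.arange(levels), indexing="ij")
    mu_u, mu_v = (u * p).sum(), (v * p).sum()
    feats = {name: graycoprops(glcm, name)[0, 0]
             for name in ("contrast", "dissimilarity", "homogeneity",
                          "energy", "correlation")}
    feats["entropy"] = -np.sum(p[p > 0] * np.log2(p[p > 0]))            # Eq. (15)
    feats["cluster_shade"] = np.sum((u + v - mu_u - mu_v) ** 3 * p)     # Eq. (8)
    feats["cluster_prominence"] = np.sum((u + v - mu_u - mu_v) ** 4 * p)  # Eq. (9)
    return feats
```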

3.5 Features reduction using modified PCA

After the completion of feature extraction, the feature-reduction operation is executed utilizing the Modified PCA (MPCA) algorithm. The PCA methodology is commonly utilized for feature reduction; it is employed for detecting patterns in the data and emphasizes differences and similarities. The chief upside of PCA is that once the pattern is found, the number of dimensions in the dataset is diminished. Here, MPCA is utilized. MPCA is an enhanced variant of PCA in which every image is partitioned into numerous sub-block images and, subsequently, PCA is employed on every sub-block image. In the PCA technique, the modification is made through the Gaussian kernel equation, used in the second step and referred to as the square-exponential kernel.

3.5.1 MPCA steps

Step 1: Identify the prevailing dataset.

Step 2: Apply the Gaussian kernel.

temp = sum((data_in(:, row) − data_in(:, col)).^2);     (17)

k(x_i, x) = exp( − ||x_i − x||^2 / 2 )     (18)

k' = k + k^T     (19)

where k(x_i, x) denotes the Gaussian kernel function.

Step 3: The covariance matrix is calculated; the covariance is gauged in multiple dimensions and is given by:

Cov(u, v) = Σ_{i=1}^{N} (U_i − Ū)(V_i − V̄) / (n − 1)     (20)

In a covariance matrix, if all the non-diagonal components' values are positive, it means that the variables U, V, W increase together.

Step 4: The eigenvectors and eigenvalues of this covariance matrix are computed.
Step 5: Interpret the data in terms of the principal components.

After completing the MPCA process, the features are optimized utilizing the hybridized (PSO-GA) optimization process.
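The sketch below gives one plausible reading of the MPCA reduction step: a Gaussian (square-exponential) kernel over the feature vectors followed by an eigen-decomposition, roughly as in Steps 2-5. Kernel centring, the sub-block partitioning and the kernel width are omitted or assumed, since the paper does not fix them numerically.

```python
# Sketch: Gaussian-kernel-based feature reduction in the spirit of the MPCA steps.
import numpy as np

def gaussian_kernel_matrix(X, gamma=0.5):
    # X: (n_samples, n_features); k(x_i, x_j) = exp(-gamma * ||x_i - x_j||^2)  (cf. Eq. 18)
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * d2)

def mpca_reduce(X, n_components=10, gamma=0.5):
    K = gaussian_kernel_matrix(X, gamma)
    K = 0.5 * (K + K.T)                                   # keep the matrix symmetric (cf. Eq. 19)
    vals, vecs = np.linalg.eigh(K)                        # eigen decomposition (Step 4)
    order = np.argsort(vals)[::-1][:n_components]
    alpha = vecs[:, order] / np.sqrt(np.maximum(vals[order], 1e-12))
    return K @ alpha                                      # projections onto leading components (Step 5)
```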

3.6 Features optimization using hybrid (PSO-GA) algorithm

The features derived from the preprocessed images are optimized utilizing the algorithm termed hybrid PSO-GA, which is the integration of the PSO and GA algorithms. PSO is a population-based optimization tool that can be implemented and applied effortlessly to resolve various function-optimization issues, while GA evaluates the data structure and assigns reproductive opportunities such that the chromosomes signifying a superior solution to the objective issue are favored. PSO-GA is more dependable in offering better-quality solutions with sensible computational time, while the hybrid strategy avoids premature convergence of the search process to local optima and gives better exploration of the search space. This is the chief reason the proposed work uses the hybrid (PSO-GA) algorithm, which is delineated below.

Step 1: Assign the input population/particles with arbitrary positions and velocities in d dimensions of the problem space. Every parameter is treated as a particle.
Step 2: For each particle, assess the preferred optimization fitness function in d variables.
Step 3: Compare the particle's fitness evaluation with its pbest. If its value is better than pbest, then set the pbest value equal to the current value and set the pbest location to the current position in d-dimensional space.
Step 4: Obtain the current best fitness (BF) value over the complete set of inputs considered. If this current BF value is better than the global best g(best), then assign the current BF value to g(best) and the current coordinates to the g(best) coordinates.
Step 5: Pick the particle with the BF value and re-initialize its position. Along with this, assess the particle having the worst fitness value and check whether its new position is suitable. If it is in an acceptable range, then update this position; otherwise a new position is randomly allotted to the particle in its neighborhood. Then renew the position and velocity of the other particles utilizing the expressions proffered below:

x_i(t_n + 1) = x_i(t_n) + d_1 e_1 (g_i(t_n) − h_i(t_n)) + d_2 e_2 (g_i'(t_n) − h_i(t_n))     (21)

h_i(t_n + 1) = h_i(t_n) + x_i(t_n + 1)     (22)

Once the velocity and the position are evaluated, crossover and mutation are done to make the optimization more effectual.

Step 6: Amongst the disparate categories of crossover, the 2-point crossover is chosen. Here, 2 points are selected in the parent chromosomes utilizing Eqs. (23) and (24). The genes in between the 2 points are interchanged between the parental chromosomes to obtain the child chromosomes. The crossover points are ascertained as

x_1 = |v(p_i)| / 3     (23)

x_2 = x_1 + |v(p_i)| / 2     (24)

Now, these child chromosomes are stored individually and their matching indices are saved.

Step 7: Subsequently, mutation is performed by substituting a number of genes of every chromosome with fresh genes. These substituted genes are arbitrarily created genes with no repetitions inside the chromosome. Then, the chromosomes picked for crossover and the ones obtained from the mutation process are integrated, and so the population pool is filled with chromosomes.
Step 8: The process is repeated until the solution with the best fitness value is obtained.

In the above equations, d_1 and d_2 signify the acceleration constants required for integrating every particle with the pbest and gbest. The update of the finest position of a particle is proffered as

g_i(t_n + 1) = g_i(t_n),        if S(h_i(t_n + 1)) ≥ S(g_i(t_n))
g_i(t_n + 1) = h_i(t_n + 1),    if S(h_i(t_n + 1)) < S(g_i(t_n))     (25)

The particle velocity in each dimension is kept within the interval [±X_max]. It is gauged and weighed against X_max, which is a notable parameter: X_max aids in ascertaining the resolution with which the region between the present position and the target position is searched. Contingent on the X_max values, the particles are expected to proffer a finer solution. The aforementioned equations are utilized to evaluate a solution's fitness and to pick the better solution centered on those fitness values (Fig. 3).
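To show how Steps 1-8 fit together, the following compact sketch runs a standard PSO velocity/position update (Eqs. 21-22) followed by a two-point crossover and random mutation on the swarm, roughly as in Steps 5-7. The swarm size, coefficients, mutation rate and the assumption that the fitness is maximised are illustrative, not values reported in the paper.

```python
# Sketch of the hybrid PSO-GA loop of Section 3.6 (fitness assumed to be maximised).
import numpy as np
rng = np.random.default_rng(0)

def hybrid_pso_ga(fitness, dim, n_particles=20, iters=100, d1=1.5, d2=1.5, w=0.7):
    pos = rng.uniform(-1, 1, (n_particles, dim))
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_fit = np.array([fitness(p) for p in pos])
    gbest = pbest[pbest_fit.argmax()].copy()
    for _ in range(iters):
        e1, e2 = rng.random((2, n_particles, 1))
        vel = w * vel + d1 * e1 * (pbest - pos) + d2 * e2 * (gbest - pos)   # Eq. (21)
        pos = pos + vel                                                     # Eq. (22)
        # Genetic operators: two-point crossover between two random parents, then mutation.
        i, j = rng.choice(n_particles, 2, replace=False)
        x1, x2 = dim // 3, dim // 3 + dim // 2                              # Eqs. (23)-(24)
        pos[i, x1:x2], pos[j, x1:x2] = pos[j, x1:x2].copy(), pos[i, x1:x2].copy()
        mask = rng.random(pos.shape) < 0.02
        pos[mask] = rng.uniform(-1, 1, mask.sum())
        fit = np.array([fitness(p) for p in pos])
        better = fit > pbest_fit
        pbest[better], pbest_fit[better] = pos[better], fit[better]
        gbest = pbest[pbest_fit.argmax()].copy()     # keep the best personal best (cf. Eq. 25)
    return gbest
```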

3.7 Classification utilizing KELM

The optimized features obtained from the hybrid PSO-GA optimization are classified utilizing the classifier termed KELM, which encompasses three layers of nodes. ELM is a nonlinear neural network that maps the input features to a feature space utilizing a non-linear (NL) activation operation. In the Extreme Learning Machine (ELM) network, the biases and weights of the hidden layer (HL) are arbitrarily picked, and the output weights are analytically evaluated from the output matrix of the HL. In KELM, kernel functions are utilized at the HL nodes instead of the activation function. Kernel ELM has the dominance over ELM that there is no requisite of choosing the number of hidden neurons for the HL or of arbitrarily creating the biases and weights, unlike ELM (Fig. 4). KELM augments the robustness of ELM by transmuting linearly non-separable data in a lower-dimensional space onto a linearly separable one. It uses the input as well as output data to fit a nonlinear function.

Fig. 3 General Flow Diagram for Hybrid PSO with genetic operators

Fig. 4 Architecture for Kernel ELM network

The KELM algorithm is delineated as follows.
In KELM regression, the output weights of the HL are proffered by Eq. (26):

H β = I_Ou     (26)

where β signifies the output weights of the HL, H indicates the output matrix of the HL, and I_Ou specifies the recognized face image for the input image.
The ELM output weights are proffered by Eq. (27):

β = H^⊥ I_Ou     (27)

Here, H^⊥ signifies the Moore-Penrose inverse of the matrix H. To assess the inverse of the matrix H, the orthogonal projection methodology is mostly employed. As per this methodology, if H H^T is non-singular, H^⊥ is given by Eq. (28):

H^⊥ = H^T (H H^T)^{-1}     (28)

Here, the regularization coefficient R is added to the term H H^T to make the network more stable. Therefore, to augment the stability of the KELM regression network, Eq. (27) is re-written as

β = H^T ( 1/R + H H^T )^{-1} I_Ou     (29)

The output function of KELM is proffered by Eq. (30):

y = h(x) β     (30)

The kernel matrix is delineated based on Mercer's condition, while the HL feature mapping h(x) is unknown, as in Eq. (31):

Ω_ELM = H H^T     (31)

where Ω_ELM indicates the kernel (mapping) matrix. The KELM is implemented in one single step. The preferred face image is assessed by Eq. (32):

I_Ou = k(I_Ol, I_Om) · β     (32)

where I_Ol and I_Om denote the test images. Lastly, Eq. (32) is utilized to obtain the final recognized face image.
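A minimal sketch of the KELM classification of Eqs. (29), (31) and (32) is given below, using an RBF kernel in place of the unspecified kernel; the regularization value R, the kernel width and the one-hot label encoding are assumptions for illustration.

```python
# Sketch: kernel ELM with closed-form output weights and kernel-based prediction.
import numpy as np

def rbf_kernel(A, B, gamma=0.1):
    d2 = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * d2)

class KELM:
    def fit(self, X, y, R=100.0, gamma=0.1):
        self.X, self.gamma = X, gamma
        T = np.eye(y.max() + 1)[y]                                    # one-hot targets
        omega = rbf_kernel(X, X, gamma)                               # Omega_ELM = HH^T (Eq. 31)
        self.beta = np.linalg.solve(omega + np.eye(len(X)) / R, T)    # Eq. (29)
        return self

    def predict(self, Q):
        scores = rbf_kernel(Q, self.X, self.gamma) @ self.beta        # Eq. (32)
        return scores.argmax(axis=1)
```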

4 Results and discussion

The FR system is implemented in MATLAB R2014a and executed on a personal computer comprising an Intel(R) Core(TM) i3 processor with a 2.40 GHz CPU and 4 GB RAM. The FR procedure is analyzed with dissimilar video frames, and the outcome of the intended system is delineated below.
Discussion: Fig. 5 displays the pre-processed version of the input image; pre-processing is executed by the MWF for different pose frames. The preprocessing performs the RGB-to-gray conversion for the recognized and non-recognized images with the assistance of a Gaussian filter (GF) within the MWF. After that, the face in the pre-processed image is detected using the V-J algorithm.

4.1 Performance analysis

4.1.1 Accuracy

This measures how close the recognized image is to the QI.

Accuracy = (TP + TN) / (TP + FP + TN + FN)     (33)

Fig. 5 Pre-processed image using Modified WF for different pose



4.1.2 Sensitivity

This estimates the proportion of images relevant to the QI that are successfully recognized.

Sensitivity = TP / (TP + FN)     (34)

4.1.3 Specificity

This evaluates the proportion of images not pertinent to the QI that are correctly rejected.

Specificity = TN / (FP + TN)     (35)

4.1.4 Precision

This is the fraction of the identified images that are pertinent to the QI.

Precision = TP / (TP + FP)     (36)

TP signifies true positives, TN true negatives, FP false positives and FN false negatives.

4.1.5 Recall

Recall ascertains the fraction of images pertinent to the QI that are successfully recognized.

Recall = TP / (TP + FN)     (37)

4.1.6 F-measure

This is the weighted harmonic mean (W.H.M) of recall and precision.

F-measure = 2 × (Precision × Recall) / (Precision + Recall)     (38)

4.1.7 False discovery rate

This rate is stated as the expected proportion of false positives among all rejected hypotheses.

FDR = FP / (FP + TP)     (39)

4.1.8 False positive rate

This is gauged as the ratio of the number of negative events incorrectly categorized as positives to the aggregate number of real negative events.

FPR = FP / (FP + TN)     (40)

4.1.9 Matthews correlation coefficient

MCC takes true and false positives and negatives into account and is perceived as a balanced gauge that can be employed even if the classes are of very different sizes.

MCC = (TP × TN − FP × FN) / √((TP + FP)(TP + FN)(TN + FP)(TN + FN))     (41)

4.1.10 Negative prediction value

NPV considers the image's negative predictive value.

NPV = TN / (TN + FN)     (42)

4.1.11 Positive prediction value

PPV considers the image's positive predictive value.

PPV = TP / (TP + FP)     (43)
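All of the metrics of Section 4.1 follow directly from the four confusion-matrix counts, as the short sketch below shows.

```python
# Sketch: evaluation metrics of Section 4.1 from raw TP/TN/FP/FN counts.
import math

def metrics(tp, tn, fp, fn):
    precision = tp / (tp + fp)                                        # Eq. (36)
    recall = tp / (tp + fn)                                           # Eqs. (34)/(37)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),                  # Eq. (33)
        "sensitivity": recall,
        "specificity": tn / (tn + fp),                                # Eq. (35)
        "precision": precision,
        "recall": recall,
        "f_measure": 2 * precision * recall / (precision + recall),   # Eq. (38)
        "fdr": fp / (fp + tp),                                        # Eq. (39)
        "fpr": fp / (fp + tn),                                        # Eq. (40)
        "npv": tn / (tn + fn),                                        # Eq. (42)
        "mcc": (tp * tn - fp * fn) / math.sqrt(
            (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),           # Eq. (41)
    }
```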

4.2 Comparative analysis

The performance assessment of the proposed MWF against the WF and GF for the disparate metrics is delineated in Table 1 below.
Discussion: The comparison in Table 1 illustrates the proposed MWF's performance against the WF and GF; the comparison is done in percentage terms. The proposed MWF removes the noise from the clipped image. When contrasted with the existing filter methods, the proposed MWF clearly removes noise better, which is proved by the values in Table 1. The comparison graph is delineated in Fig. 6.

Table 1 Performance of Filters

S. No   Filter            Percentage (%)
1       MWF               46.0367
2       WF                37.6691
3       Gaussian Filter   36.78769

Fig. 6 Comparisons of Filters

Discussion: Fig. 6 demonstrates the assessment of the MWF against the WF and the Gaussian filter. The proposed MWF gave 46.0367%, whereas the WF and the Gaussian filter offered 37.6691% and 36.78769%, respectively. The performance of the proposed MWF is thus about 9.24 percentage points higher than that of the Gaussian filter. Hence, from the comparison, it can be perceived that the proposed filter offers better performance than the other filters.
Discussion: In Table 2, the proposed Kernel ELM is contrasted with the ANFIS, NN and KNN classifiers concerning precision, specificity, F-measure, accuracy, sensitivity, recall, FDR, FNR, FAR, MCC, FPR and FRR. From Table 2, the accuracy of the proposed Kernel ELM system is 97.3%, whereas the existing ANFIS-based system provides 96.9%, the NN-based classification system provides 85.8% and KNN provides 95.7%. In the sensitivity measure, the proposed KELM and the existing ANFIS system achieve 100%, whereas the existing NN achieves 95.3% and KNN scores 98.9% sensitivity. Thus, it can be concluded that the proposed video FR system performs better when compared with the existing systems. The graph of the comparison is given in Fig. 7 below.
Table 2 Performance table for Classification

Performance measure   Proposed Kernel ELM   ANFIS   Neural Network   KNN
Precision             97                    95      81               94
Recall                100                   100     95.3             98.9
F-measure             98.6                  97.4    87.6             96.4
Accuracy              97.3                  96.9    85.8             95.7
Sensitivity           100                   100     95.3             98.9
Specificity           94.5                  92.5    75.3             91
FDR                   4                     5       19               6
FNR                   0                     0       4.71             1.05
FPR                   6.27                  7.46    24.7             8.96
FAR                   4                     5       19               6
FRR                   0                     0       4                1
MCC                   95                    93.8    72.6             91.2

Fig. 7 Comparison graph for classification of Performance measure

Discussion: Fig. 7 delineates the comparison between the proposed Kernel ELM and the ANFIS, NN and KNN classifiers. The proposed technique proffers a high precision of 97% along with 100% recall and sensitivity, accuracy of 97.3%, specificity of 94.5%, F-measure of 98.6%, FDR of 4%, FNR of 0%, FPR of 6.27%, FAR of 4%, FRR of 0% and MCC of 95%. In contrast, the existing ANFIS classifier offers only a precision of 95% in addition to 100% recall and sensitivity, accuracy of 96.9%, specificity of 92.5%, F-measure of 97.4%, FDR of 5%, FNR of 0%, FPR of 7.46%, FAR of 5%, FRR of 0% and MCC of 93.8%. Similarly, the NN and KNN offer lower values when weighed against the proposed work. Henceforth, from the overall comparison, the proposed system gives superior accuracy and performance.

5 Conclusions

Here, an efficient FR system centered on a hybrid optimized Kernel ELM has been proposed. The proposed system's performance was analyzed utilizing face images. The chief contribution of this work is that the initial input video is clipped into frames, and then the noise is removed from the images using the MWF. Next, the face in the preprocessed image is detected using the V-J algorithm and the features are extracted; the extracted features are then given as input to the MPCA for feature reduction. Finally, the reduced features are input to the classification using the hybrid optimized Kernel ELM approach. The performance analysis illustrated that the proposed FR gives an incredible rate of accuracy, sensitivity and specificity. The comparison result illustrates that the proposed FR system has higher accuracy, sensitivity and specificity than the existing methods. At the accuracy level, the proposed work achieves 97.3% accuracy and efficiency, whereas the existing techniques only offer 96.9% for the ANFIS classifier, 85.8% for NN and 95.7% for KNN. Consequently, the proposed FR recognizes images more accurately than the existing methods. Future work would involve implementing a robust FR system on a video surveillance system.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and
institutional affiliations.

References

1. Agrawal S, Singh RK, Singh UP, Jain S (2018) Biogeography particle swarm optimization based counter
propagation network for sketch based face recognition, Multimedia Tools and Applications, pp. 1–25

2. Alapati A, Kang D (2015) An efficient approach to face recognition using a modified center-symmetric
local binary pattern (MCS-LBP). International Journal of Multimedia and Ubiquitous Engineering 10(8):
13–22
3. Alapati A, Kang D (2015) An efficient approach to face recognition using a modified center-symmetric
local binary pattern (MCS-LBP). Int J Multimed Ubiquitos Eng 10(8):13–22
4. Anupriya K, Gayathri R, Balaanand M, Sivaparthipan CB (2018) Eshopping scam identification using
machine learning. International Conference on Soft-computing and Network Security (ICSNS),
Coimbatore, India 2018:1–7. https://doi.org/10.1109/ICSNS.2018.8573687
5. BalaAnand M, Karthikeyan N, Karthik S (2018) Designing a framework for communal software: based on
the assessment using relation modelling. Int J Parallel Prog. https://doi.org/10.1007/s10766-018-0598-2
6. BalaAnand M, Karthikeyan N, Karthick S, Sivaparthipan CB (2018) Demonetization: a visual exploration
and pattern identification of people opinion on tweets. International Conference on Soft-computing and
Network Security (ICSNS), Coimbatore, India 2018:1–7. https://doi.org/10.1109/ICSNS.2018.8573616
7. BalaAnand M, Sankari S, Sowmipriya R, Sivaranjani S Identifying Fake User’s in Social Networks Using
Non Verbal Behavior, International Journal of Technology and Engineering System (IJTES), Vol.7(2), pg:
157–161
8. Cheng Y, Jin Z, Gao T, Chen H, Kasabov N (2016) An improved collaborative representation based
classification with regularized least square (CRC-RLS) method for robust face recognition.
Neurocomputing 215
9. Dadi HS, Pillutla GKM, Makkena ML (2017) Face recognition and human tracking using GMM, HOG and
SVM in surveillance videos, Annals of Data Science, pp. 1–23
10. Hermosilla G, Rojas M, Mendoza J, Farías G, Pizarro FT, Martín CS, Vera E (2018) Particle swarm
optimization for the fusion of thermal and visible descriptors in face recognition systems. IEEE Access 6:
42800–42811
11. Journal I, Trends C (2012) Study of musical influence on face using the local binary pattern ( LBP )
approach. Int J Comput Trends Technol 3:150–153
12. Kasar MM, Bhattacharyya D, Kim T (2016) ‘Face recognition using neural network : A review’, vol. 10, no.
3, pp. 81–100
13. Kavita M, Kau M (2016) A survey paper for face recognition technologies, International Journal of
Scientific and Research Publications, vol. 6, no. 7
14. Li M, Yu C, Nian F, Li X (2015) ‘A face detection algorithm based on deep learning’, vol. 8, no. 11, pp.
285–296
15. Liu J, Liu W, Ma S, Wang M, Li L, Chen G (2018) Image-set based face recognition using k-svd dictionary
learning, International Journal of Machine Learning and Cybernetics, pp. 1–14
16. Lu Z, Jiang X, Kot A (2018) Color space construction by optimizing luminance and chrominance
components for face recognition, Pattern Recognition
17. Maram B, Gnanasekar JM, Manogaran G et al (2018) SOCA. https://doi.org/10.1007/s11761-018-0249-x
18. Mellal B (2012) A new approach for face recognition based on PCA & double LDA treatment combined
with SVM. IOSR J Eng 2(4):685–691
19. Nasution AL, Bima Sena Bayu D, Miura J (2014) Person identification by face recognition on portable
device for teaching-aid system: Preliminary report, In Advanced Informatics: Concept, Theory and
Application (ICAICTA), 2014 International Conference of, IEEE, pp. 171–176
20. Parmar DN, Brijesh B Mehta (2014) Face recognition methods & applications
21. Patinge PB (2015) Local binary pattern base face recognition system. Int J Sci Eng Technol Res 4(5):1356–
1361
22. Sasirekha K, Thangavel K (2018) ‘Optimization of K-nearest neighbor using particle swarm optimization
for face recognition’, Neural Computing and Applications, pp. 1–10
23. Shermina J (2010) Impact of locally linear regression and fisher linear discriminant analysis in pose
invariant face recognition. International Journal of Computer Science and Network Security 10(10):111–115
24. Sukhija P, Behal S, Singh P (2016) Face recognition system using genetic algorithm. Procedia Computer
Science 85:410–417
25. Yang L, Zheng W, Cui Z, Zhang T (2018) Face recognition based on recurrent regression neural network,
Neurocomputing

Dr. S. Anantha Padmanabhan is working as a Professor in the Department of ECE at Gopalan College of Engineering and Management, Bangalore. He has published many articles in reputed journals and international conferences. He obtained his Ph.D. from Anna University, Chennai, in the field of Digital Signal Processing, and his areas of research are signal processing, control systems, field theory and electrical machines.

Dr. Jayanna Kanchikere is working as a Professor in the Department of EEE at St. Peters Engineering College, Hyderabad. He has also published many articles in reputed journals and international conferences.
