
Multimodal Mixed Emotion Detection

Soumili Kundu
M.Tech I.T (CWE)
Roll no. 06
SET, Jadavpur University

4/23/2019
INTRODUCTION

Emotion recognition is an emerging field with important applications in medicine,
education, marketing, security and surveillance. Machines can enhance
human-computer interaction by accurately recognizing human emotions and
responding to them. Existing research has mainly examined automatic detection of
a single emotion. However, studies in psychology and behavioral science have shown
that humans can concurrently experience and express mixed emotions. For instance,
a person can feel happy and sad at the same time. In this research, combinations of
the six basic emotions (happiness, sadness, surprise, anger, fear and disgust)
together with the neutral state were used. The aim of this study is to develop
features that capture data from facial expressions, head and hand gestures, and
body postures. Recognition of multiple concurrent emotions is a multi-label
classification problem: each feature vector instance is associated with multiple
labels, such as the presence or absence of each of the six basic emotions.
Multi-label classification is receiving increased attention and is being applied to
many domains, such as text, music, image- and video-based systems, security and
bioinformatics. This study examined recognition of concurrent emotional
ambivalence and mixed emotions. Additionally, the study restricted itself to two
concurrent emotions (emotion duality) to limit the scope of the research based on
the availability of scenarios.
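
To make the multi-label setup concrete, the short sketch below (an illustration only; the paper does not specify its tooling) encodes each sample's set of concurrently expressed emotions as a presence/absence vector over the six basic emotions, which is the label format a multi-label classifier would consume.

```python
# Minimal sketch of the multi-label encoding: each sample's set of concurrent
# emotions becomes a presence/absence vector over the six basic emotions.
from sklearn.preprocessing import MultiLabelBinarizer

EMOTIONS = ["happiness", "sadness", "surprise", "anger", "fear", "disgust"]

samples = [{"happiness", "sadness"},   # emotional ambivalence: happy and sad at once
           {"surprise", "fear"},
           set()]                      # neutral state: no basic emotion present

mlb = MultiLabelBinarizer(classes=EMOTIONS)
Y = mlb.fit_transform(samples)
print(Y)
# [[1 1 0 0 0 0]
#  [0 0 1 0 1 0]
#  [0 0 0 0 0 0]]
```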

LITERATURE SURVEY

Ismail, et al. [1] (2016) proposed Human Emotion Detection via Brain Waves
Study by Using Electroencephalogram (EEG). This research was conducted to
detect or identify human emotion via the study of brain waves. In addition, the
research aims to develop computer software that can detect human emotions
quickly and easily; the main objective of this recognition is to develop "mind-
implementation of Robots". The research methodology is divided into four parts:
(i) both visual and EEG data were extracted from the respondent at the same
time, (ii) the complete data-recording process includes the capture of images
using a camera and EEG, (iii) pre-processing, feature extraction and
classification are carried out, and (iv) the extracted features are classified into
emotional faces using artificial intelligence techniques.

Tyagi, et al. [2] (2018) proposed Emotion Detection Using Speech Analysis. In
this paper the basic emotions of humans are detected through speech analysis.
The authors note that 95% of communication today is vocal; the voice carries
different characteristics of a person, and emotion is one of them, showing
attributes such as fear, anxiety, happiness, sadness and anger. Hence, voice and
speech analysis can reveal the emotion of a speaker and is very beneficial in
different areas of communication. The main objective of the paper is to develop
a system that will be helpful in near-future real-time systems and will improve
human-technology interaction. The main challenge in the project is to develop a
system that can detect emotion in a manner that is time-saving, efficient in all
test cases and user-friendly.

Kudiri, et al. [3] (2016) proposed Human Emotion Detection through Speech
and Facial Expressions. This research revealed that feature extraction through
speech and facial expression, together with the proposed fusion technique, is the
most prominent aspect affecting an emotion detection system. Although some
other aspects were considered to affect the emotion detection system, their effect
is relatively minor. It was observed that the performance of the bimodal emotion
detection system is lower than that of the unimodal emotion detection system
when deliberate facial expressions are used. The results indicated that the
proposed emotion detection system performed better on the basic emotional
classes than on the rest. Feature extraction from visual data is possible in two
ways, namely geometric-based and appearance-based.

PROPOSED FRAMEWORK

Framework overview: Input Dataset → Feature Extraction → Output

i) Input Dataset:
The proposed methodology used 6 different individuals, each expressing
combinations of mixed emotions drawn from the 6 basic emotions; 3 participants
were female and 3 were male. In this work, however, a dataset downloaded from
the internet is intended to be used.

ii) Feature Extraction:


Humans use facial expressions to convey emotions, and the eyes, eyebrows, lips,
cheeks and the region around the mouth are the expressive parts of the face.
Co-ordinates were extracted using a face recognition Application Programming
Interface. This research used 76 expressive points out of the 121 features from
the face. The intuition behind this initial selection was that features from the
expressive part of the face carry higher discriminatory power for recognizing
emotions. The per-modality features described below are built from pairwise
distances, angles with the horizontal plane and point movement; a small sketch
of these computations follows the list.

a) The facial features contained the x, y, z co-ordinates of the eyes, nose, lips,
eye-lids, chin, cheeks and forehead. The distance between each pair of
tracked points and the angle made by each pair with the horizontal plane
were measured.

b) For the head modality, 12 points along the border of the skull were
tracked. The intuition behind these features was that the tracked points
would describe the shape of the head as well as capture the pitch, yaw,
roll, nod, shake, and lateral, backward and forward motion of the head. The
distance between each pair of tracked points, the angle with the horizontal
plane and the movement of each tracked point were measured.

c) For tracking of the hands, the palms, wrists, elbows and shoulder joints
of both hands were used, resulting in 8 features. These tracked points were
chosen because they capture the abrupt movement of the arms along
all three axes. The distance between each pair of joints, the angle with the
horizontal plane and the velocity and displacement of each joint were
calculated.

d) For body modality tracking, the spine centre and the left and right hip, knee
and ankle joints were tracked. The feature vector was created using the
distance between each pair of joints, the angle with the horizontal plane and
the velocity and displacement of each joint.
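
As a rough illustration of the feature computations described above, the following sketch derives pairwise distances, angles with the horizontal plane and frame-to-frame displacement from an array of tracked 3D points. The array shape, frame rate and function names are assumptions for illustration only; they are not taken from the paper.

```python
# Illustrative sketch (not the paper's implementation): pairwise distance,
# angle with the horizontal plane, and frame-to-frame displacement features
# from tracked 3D points of one modality.
import numpy as np
from itertools import combinations

def pairwise_features(points):
    """points: (n_points, 3) array of x, y, z co-ordinates for one frame."""
    distances, angles = [], []
    for i, j in combinations(range(len(points)), 2):
        d = points[j] - points[i]
        distances.append(np.linalg.norm(d))
        # Angle between the segment and the horizontal (x-z) plane,
        # assuming y is the vertical axis.
        horizontal = np.linalg.norm([d[0], d[2]])
        angles.append(np.arctan2(d[1], horizontal))
    return np.array(distances), np.array(angles)

def movement_features(prev_points, points, dt=1 / 30):
    """Displacement and velocity of each tracked point between two frames."""
    displacement = np.linalg.norm(points - prev_points, axis=1)
    velocity = displacement / dt          # assumes a 30 fps capture rate
    return displacement, velocity

# Example: 12 head points over two consecutive frames (random placeholders).
rng = np.random.default_rng(1)
frame0, frame1 = rng.normal(size=(12, 3)), rng.normal(size=(12, 3))
dist, ang = pairwise_features(frame1)
disp, vel = movement_features(frame0, frame1)
feature_vector = np.concatenate([dist, ang, disp, vel])
```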

iii) Output:
The proposed methodology gives the final output, which includes the recognition
of combinations of emotions across various scenarios, giving an idea of how mixed
emotion detection is paving its way into various aspects of our lives. The paper
also reports the accuracy percentage achieved when using the various modalities.

EXPERIMENTAL RESULT

In the paper, the overall accuracy of recognizing mixed, simultaneously expressed
emotions using data from the hand modality alone was 77.5%. The overall
recognition accuracy from the body modality alone was 65.2%. Results for both
the head (94.3%) and face (92.4%) modalities showed higher overall recognition
accuracy than those for the hand and body modalities. The overall recognition
accuracy increased further when combined features from multiple modalities were
used (96.6%).
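
The paper does not detail how the modality features were combined or how accuracy was scored. The sketch below simply assumes feature-level fusion (concatenating per-modality feature vectors) and exact-match accuracy as one plausible way to reproduce such a per-modality versus combined comparison; the classifier, feature sizes and data are placeholders.

```python
# Hypothetical per-modality vs. fused-feature comparison (exact-match accuracy).
# The fusion strategy, metric and classifier are assumptions, not from the paper.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score  # exact-match score for multi-label Y

rng = np.random.default_rng(2)
n = 300
modalities = {                       # placeholder feature blocks per modality
    "face": rng.normal(size=(n, 76)),
    "head": rng.normal(size=(n, 12)),
    "hand": rng.normal(size=(n, 8)),
    "body": rng.normal(size=(n, 7)),
}
Y = rng.integers(0, 2, size=(n, 6))  # presence/absence of 6 basic emotions

def evaluate(X, Y):
    X_tr, X_te, Y_tr, Y_te = train_test_split(X, Y, random_state=0)
    clf = OneVsRestClassifier(RandomForestClassifier(random_state=0)).fit(X_tr, Y_tr)
    return accuracy_score(Y_te, clf.predict(X_te))

for name, X in modalities.items():
    print(name, evaluate(X, Y))
print("fused", evaluate(np.hstack(list(modalities.values())), Y))
```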
SCOPE OF DEVELOPMENT
The scope for development lies in using more than two emotion combinations
amongst a larger and much more diverse study group, and in extending the
analysis of emotion detection to social media. On numerous social media
platforms, such as YouTube, Facebook, or Instagram, people share their opinions
on all kinds of topics in the form of posts, images, and video clips. With the ease
of content sharing, people increasingly share their opinions on newly released
products or on other topics in the form of video reviews or comments. Large
companies can capitalize on this efficiently by extracting user sentiment,
suggestions, and complaints about their products from these video reviews. This
information also opens new horizons for improving quality of life by making
informed decisions on the choice of products to buy, services to use, places to
visit, or movies to watch, based on the experience and opinions of other users.

CONCLUSION
Existing studies have primarily examined automatic recognition of a single
emotion. This paper created 3D features from co-ordinates, positions, movement
and knowledge-based behavioral patterns, and then used the combined feature
vector to recognize mixed simultaneous emotions. A combination of head, face,
hand, body and audio data was used to generate the feature vector. The head and
face modality accuracy was higher than the results from the hand and body
modalities, which indicates that although the hands and body have greater and
more frequent displacement than other body parts, facial expressions and head
movement have more discriminating power for predicting mixed concurrent
emotions.
REFERENCES

[1] Ismail, WOAS Wan, et al. "Human emotion detection via brain waves study
by using electroencephalogram (EEG)." International Journal on Advanced
Science, Engineering and Information Technology 6.6 (2016): 1005-1011.

[2] Tyagi, Riya, and Anmol Agarwal. "Emotion Detection Using Speech
Analysis." Science 3.3 (2018): 18-20.

[3] Kudiri, Krishna Mohan, Abas Md Said, and M. Yunus Nayan. "Human
emotion detection through speech and facial expressions." 2016 3rd
International Conference on Computer and Information Sciences (ICCOINS).
IEEE, 2016.
