Abstract
The goal of this research is to realize a face-to-face communication environment with a machine by giving a facial expression to a computer system. In this paper, modeling methods for facial expression, including a 3D face model, an expression model and an emotion model, are presented. Facial expression is parameterized with the Facial Action Coding System (FACS), which is translated into motion of the grid points of a face model constructed from 3D range sensor data. An emotion condition is described compactly by a point in a 3D space generated by a 5-layered neural network, and its evaluation results show the high performance of this model.
Keywords." Face model; Range data; Emotion model; Neural network; Nonverbal communication;FACS
1. Introduction
[Figure: digitizer unit and the sampling grid, with grid points P(i, j), P(i, j+1) and P(i-1, j+1).]
2. Modeling of face
2.1. 3D digitizer
Fig. 6. Feature point.
[Figure: feature point Q(o, h) with its Voronoi region, Delaunay net and resulting mesh model.]
3. Modeling of expression
3.1. Generic model
Fig. 9. Reconstructed mesh model.
Table 1
Action unit selection (generic and personal models)

AU No.    FACS name
AU1       Inner brow raiser
AU2       Outer brow raiser
AU4       Brow lowerer
AU5       Upper lid raiser
AU6       Cheek raiser
AU7       Lid tightener
AU9       Nose wrinkler
AU10      Upper lip raiser
AU12      Lip corner puller
AU14      Dimpler
AU15      Lip corner depressor
AU16      Lower lip depressor
AU17      Chin raiser
AU20      Lip stretcher
AU23      Lip tightener
AU25      Lip part
AU26      Jaw drops
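These FACS action units are translated into motion of the face model's grid points. The following is a minimal sketch of that mapping, not the paper's implementation: it assumes, purely for illustration, that each AU defines a per-vertex displacement field on the mesh and that an expression is synthesized by blending these fields linearly, weighted by AU intensity.

```python
import numpy as np

def apply_action_units(neutral_vertices, au_displacements, au_intensities):
    """Deform a face mesh by blending per-AU displacement fields.

    neutral_vertices : (V, 3) array of grid-point rest positions.
    au_displacements : dict AU name -> (V, 3) displacement field (assumed).
    au_intensities   : dict AU name -> intensity in [0, 1].
    """
    deformed = neutral_vertices.copy()
    for au, weight in au_intensities.items():
        deformed += weight * au_displacements[au]  # linear blend per AU
    return deformed

# Toy usage: 60% AU12 (lip corner puller) on a 4-vertex placeholder mesh.
rng = np.random.default_rng(0)
verts = np.zeros((4, 3))
fields = {"AU12": 0.01 * rng.normal(size=(4, 3))}  # illustrative field
smiling = apply_action_units(verts, fields, {"AU12": 0.6})
```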
4. Modeling of emotion
[Figures: 5-layered identity-mapping network with input layer (analysis), middle layer (emotion space) and output layer (synthesis); training points and learning stages 1-4.]
From the experiment mentioned above, it was found that by linking the input layer and the middle layer of the identity-mapping network, the features of the training data could be extracted. Following this, the six basic expressions described by Action Unit parameters are used as the training data. By changing the intensity of the degree of emotion in steps of 25% up to 100%, the amount of learning data comes to 25 expressions including neutral (four for each expression). The intensity of the degree of emotion is defined at the parameter level, not at the impression level. Table 2 shows the combinations of AU parameters used to describe the six basic expressions; the numbers in parentheses indicate the intensities of the individual AU parameters.
Table 2
Combination of AUs for basic emotions: each of Surprise, Fear, Disgust, Anger, Happiness and Sadness is described by its own combination of AU parameters.
[Figure: locations of the basic emotions (Fear, Happiness, Disgust, Anger, Sadness and Neutral) in the emotion space.]
5.1. 2D space
Fig. 20. Schematic face for analysis.
[Figures: histograms of subjects' categorical responses for Anger, Happiness, Fear, Disgust, Sadness and Surprise.]
Fig. 23. Locations of basic emotions (3D emotion space).
5.2. 3D space
In this experiment, subjects forming another group were asked to classify the schematic expression faces produced in the former experiment into basic categories of emotion. A canonical discriminant analysis of the relationship between the displacements of the feature points of each expression face and the distributions of categorical judgments revealed three major canonical variables. By taking these variables as rectangular coordinates, the basic expressions could be placed in a 3D space as shown in Fig. 22. Compared with the locations of the basic emotions obtained in the middle layer of the network (Fig. 23), many points were the same as those in the previous 2D experiment.
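As an illustration of this analysis step, the sketch below uses scikit-learn's linear discriminant analysis as a stand-in for canonical discriminant analysis (its discriminant axes play the role of the canonical variables); the displacement and judgment data are random placeholders with assumed dimensions.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(1)
n_faces = 120                     # number of schematic faces (assumed)
n_features = 10                   # e.g. 5 feature points x (dx, dy); assumed
displacements = rng.normal(size=(n_faces, n_features))
judgments = rng.integers(0, 6, size=n_faces)   # 6 basic emotion categories

# With 6 categories at most 5 discriminant axes exist; keep the top 3,
# mirroring the three major canonical variables reported in the text.
cda = LinearDiscriminantAnalysis(n_components=3)
coords_3d = cda.fit_transform(displacements, judgments)   # (n_faces, 3)
```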
x "~''-
Haooiness
Fig. 25. New locus from subjectivetest, (a) locus drawn in 3D space;
(b) locus mapped onto 2D space.
Moving from left to right along the horizontal axis in Fig. 24, one can see that the support for Sadness is gradually reduced, Disgust appears briefly for a few subjects, and then Happiness appears for many; with time, the support for Surprise gradually increases. This result confirms the interpolation effect of the 5-layered network: all of the expressions in this emotion space are continuously located at an impression level. The appearance of Happiness is explained by the fact that the line between Surprise and Sadness passes near the point of Happiness in the space shown in Fig. 23; the line is far from the point of Disgust, so the influence of Disgust is weak. Similar effects appear for other pairs of emotions.
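This interpolation effect can be sketched with the hypothetical identity-mapping network from the earlier code: walk along the straight segment between two emotion-space points and decode each point back to AU parameters with the synthesis half of the net.

```python
import torch

def interpolate_emotions(net, p_start, p_end, steps=11):
    """Decode AU parameters along the segment between two 3D emotion points.

    net     : the 5-layered identity-mapping net from the earlier sketch.
    p_start : (3,) tensor, e.g. the Surprise point from `emotion_space`.
    p_end   : (3,) tensor, e.g. the Sadness point.
    """
    decoder = net[4:]                          # synthesis half: 3D -> AUs
    ts = torch.linspace(0.0, 1.0, steps).unsqueeze(1)
    line = (1.0 - ts) * p_start + ts * p_end   # points on the segment
    with torch.no_grad():
        return decoder(line)                   # (steps, n_AUs)

# If the segment passes near the Happiness point (as in Fig. 23), the
# intermediate AU vectors should momentarily resemble Happiness.
```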
Fig. 31. Synthesized happiness.

7. Conclusion

In this paper, three kinds of modeling methods are presented. For face modeling, a 3D model constructed from range data and a mouth model are described.

Acknowledgements

I would like to give special thanks to Prof. Demetri Terzopoulos, University of Toronto, and Mr. Shoichiro Iwasawa, Seikei University, for making a face model and assisting in the experiment. I would also like to thank Prof. Hiroshi Yamada, Kawamura College, and Mr. Fumio Kawakami, Seikei University, for performing the psychological evaluation of emotion space.