
Facial Pose Estimation (BETA)

Version 3.0

The DIVA3D Facial Pose Estimation module extracts facial pose information. It
consists of two sub-modules: “Facial Pose Angle Estimation” and “Facial Pose
Detection”. The module can operate either on entire video frames or on facial images
(regions of interest, ROIs) produced by the DIVA3D face tracker. The ROIs are read from
the relevant ANTHROPOS7 XML output file, which contains the facial metadata. An example
ROI is shown in Figure 1.

Figure 1. Example of a ROI produced by the DIVA tracker.

The first sub-module estimates the facial pose angle of a person at every video
frame, as shown in Figure 2. The user must specify the initial face pose angles for the
x, y and z axes on the first video frame. The results describe the angles of deflection
from the absolute frontal position in the x and y axes.

Figure 2. Head pose angles.

The second sub-module detects facial poses and assigns a pose label (e.g., frontal
pose) by using reference facial pose images. Preferably, the reference image shows the
same person whose pose is to be detected. If no such image is available, any facial image
representing the particular pose can be used. Alternatively, the user can select one or
more of the five default pose images contained in the module (frontal, left, right,
mid-right and mid-left poses) by clicking the respective box in the user interface.
To install the module, copy PoseEstimation.dll into the DIVA3D working directory.
DIVA3D loads it automatically the next time it is started. After installation, the entry
Facial Pose Estimation is displayed under the DIVA3D Modules menu.
The Facial Pose Angle Estimation sub-module requires the following files to
exist in the DIVA3D working directory: vector1.txt, vector2.txt, vector3.txt, weight1.txt,
weight2.txt, weight3.txt, and either ipl.dll and ipla6.dll or ipfDll1.dll, ipfDll2.dll and
ipfDll3.dll.
This sub-module also requires the video stream buffer size to be larger than 1.
The Facial Pose Detection sub-module requires the following DLL files to exist in the
DIVA3D working directory: previewer.dll and parser.dll. The file parser.dll is needed
for parsing, generating and validating XML documents.
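
The file requirements above can be checked before starting DIVA3D. The following is a minimal sketch, not part of the module: the function name missing_files and the grouping of the files into lists are assumptions made for illustration, while the file names themselves are those listed in this manual.

```python
from pathlib import Path

# File names as listed in this manual; the grouping is an assumption.
MODULE_FILE = "PoseEstimation.dll"
ANGLE_COMMON = ["vector1.txt", "vector2.txt", "vector3.txt",
                "weight1.txt", "weight2.txt", "weight3.txt"]
# Either one of these DLL sets is sufficient for angle estimation.
# All three names in the second set are taken to end in .dll.
ANGLE_DLL_SETS = [["ipl.dll", "ipla6.dll"],
                  ["ipfDll1.dll", "ipfDll2.dll", "ipfDll3.dll"]]
DETECTION_FILES = ["previewer.dll", "parser.dll"]


def missing_files(workdir):
    """Return the required files that are absent from workdir."""
    root = Path(workdir)

    def absent(names):
        return [n for n in names if not (root / n).is_file()]

    missing = absent([MODULE_FILE] + ANGLE_COMMON + DETECTION_FILES)
    # One complete DLL set is enough; if none is complete,
    # report the set with the fewest missing files.
    dll_gaps = [absent(s) for s in ANGLE_DLL_SETS]
    if all(dll_gaps):
        missing += min(dll_gaps, key=len)
    return missing
```

Calling missing_files with the path of the DIVA3D working directory returns an empty list when every required file is in place.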
First, the user imports the input video file through the File IO Import menu
found under the Modules menu of DIVA3D. After the input video is selected, the
following dialog box is displayed (Figure 3).

Figure 3. Dialog box illustrating the input video selection.


The options of the dialog box shown in Figure 4 are the following:
a) Facial Pose Angle Estimation. This algorithm calculates the angles of deflection
φ and θ from the origin of the x and y axes, which is assumed to correspond to the
absolute frontal face. The user has to initialize the system by entering the three angles
of the face for the x, y and z axes, which correspond to the pan, tilt and roll angles
respectively, as shown in Figure 2. These angles describe the facial pose at the first
frame of the input video and are entered in the dialog box of Figure 5. Where possible,
facial pose estimation should start from an entirely frontal facial image.

Figure 4. Dialog box for the selection of the appropriate algorithm.

Figure 5. Dialog box for the initialization of the algorithm.

After the angles are entered and the Proceed button is pressed, the algorithm calculates
the angles of deflection for every subsequent video frame. The results are stored in an
ANTHROPOS7 XML file produced at the end of the procedure. The XML file contains the
angles φ and θ for every frame of the video sequence. An example of an output XML file
is shown in Figure 6.
Figure 6. Example of an XML file for the Facial Pose Angle Estimation module.
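
Since the output file holds one pair of angles φ and θ per frame, it can be post-processed outside DIVA3D. The sketch below is only an illustration: the element and attribute names (PoseAngles, Frame, number, phi, theta) are hypothetical and do NOT reproduce the real ANTHROPOS7 schema, and the angles are assumed to be in degrees. The names must be adapted to the actual file shown in Figure 6.

```python
import xml.etree.ElementTree as ET

# Hypothetical layout for illustration only, not the ANTHROPOS7 schema.
SAMPLE = """<PoseAngles>
  <Frame number="0" phi="0.0" theta="0.0"/>
  <Frame number="1" phi="4.5" theta="-1.0"/>
  <Frame number="2" phi="12.0" theta="3.5"/>
</PoseAngles>"""


def read_angles(xml_text):
    """Return {frame_number: (phi, theta)} from the assumed layout."""
    root = ET.fromstring(xml_text)
    return {int(f.get("number")): (float(f.get("phi")), float(f.get("theta")))
            for f in root.iter("Frame")}


def near_frontal(angles, tolerance_deg=5.0):
    """Frames whose deflection stays within the tolerance on both axes."""
    return sorted(n for n, (phi, theta) in angles.items()
                  if abs(phi) <= tolerance_deg and abs(theta) <= tolerance_deg)


angles = read_angles(SAMPLE)
print(near_frontal(angles))  # prints [0, 1]
```

The tolerance of 5 degrees is an arbitrary choice for the example and not a value used by the module.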

b) Facial Pose Detection. This algorithm finds video frames containing a facial
image that has the same pose as a reference facial pose. The user is asked to import an
image that shows the requested facial pose. At this point, the user can load any image
of their choice or use one of the available default pose images. The facial image ROI in a
video frame is read from the corresponding input ANTHROPOS7 XML file. This
requires that an XML file produced by the Face Detection and Face/Object Tracking
module for the specific video is available. The corresponding dialog box is displayed in
Figure 7.
The current version of the module supports the selection of more than one default
pose. This is done by clicking several of the default poses found under the menu
“Poses Selection” (see Figure 7). The entries of this dialog box are:
• Load Image. It loads the input reference pose image. The user must specify the type
of pose by clicking the corresponding box (e.g., frontal, right pose, etc.).
• Poses Selection. The user can use one or more of the default pose images instead of
loading a new facial pose image.
• Use tracker info. This requires that the Face Detection and Face/Object Tracking
module has first been applied to this video file. The user can then use the
information produced by that module by reading the corresponding XML file.
• Input Video Name. It displays the input video name.
• Input Model Name. It displays the input reference facial pose image name.
• Facial Tracking XML. It displays the name of the XML file produced by the Face
Detection and Face/Object Tracking module.
• Proceed. It begins the facial pose estimation/detection procedure.

Figure 7. Dialog box for the Facial Pose Detection sub-module.

The program searches for all the selected poses and stores the results in the
output XML file. The module can also operate on multiple facial ROIs in the same
video frame, so it handles video sequences in which more than one person appears.
At the end of the processing cycle, the facial pose results are saved in an XML file.
An Actor Instance found to contain a requested facial pose is labeled with the type of
pose it contains (e.g., right pose). Every other Actor Instance carries the facial
pose label Not Matched. Figure 8 shows an example of the output XML file. In the illustrated
example, the Actor Instance “216” contains the requested pose.
Figure 8. Example of ANTHROPOS7 XML output of the Facial pose detection sub-module.

Figure 9. Actor Instance (facial image) containing the requested facial pose for the example
of Figure 8.
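
The labeling rule described above (a pose type for matching Actor Instances, Not Matched for all others) makes it easy to list the matches from the output file. The following sketch assumes a simplified, hypothetical layout: the element name ActorInstance and the attributes id and pose are placeholders chosen for illustration and are NOT the real ANTHROPOS7 element names, which should be taken from the file shown in Figure 8.

```python
import xml.etree.ElementTree as ET

# Hypothetical layout for illustration only, not the ANTHROPOS7 schema.
SAMPLE = """<Actors>
  <ActorInstance id="215" pose="Not Matched"/>
  <ActorInstance id="216" pose="right pose"/>
  <ActorInstance id="217" pose="Not Matched"/>
</Actors>"""

NOT_MATCHED = "Not Matched"  # label given to non-matching instances


def matched_instances(xml_text):
    """Return {instance_id: pose_label} for instances that matched a pose."""
    root = ET.fromstring(xml_text)
    return {inst.get("id"): inst.get("pose")
            for inst in root.iter("ActorInstance")
            if inst.get("pose") != NOT_MATCHED}


print(matched_instances(SAMPLE))  # prints {'216': 'right pose'}
```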
