Lee, Yung-Hui
Dynamic / Static
Gestures / Unintentional Movements
Manipulative / Communicative
Acts / Symbols
Mimetic / Deictic / Referential / Modalizing
Fig.1: A Taxonomy of hand gestures for Human-computer Interaction. Meaningful gestures are
differentiated from unintentional movements. Gestures used for manipulation of objects are separated
from the gestures which possess inherent communicational character. Symbols are those gestures
having a linguistic role. They symbolize some referential action or are used as modalizers, often of
speech.
3.3 Temporal Modeling of Gestures
• A dynamic process
• Gesture interval consists of: preparation, stroke, and
retraction
• Hand pose during the stroke follows a classifiable path in
the parameter space
• Gestures are confined to a specified spatial volume
• Repetitive hand movements are gestures
• Manipulative gestures have a longer gesture interval than
communicative gestures
Fig. 5: Gesture-controlled panoramic map browser. (a)
System setting; (b) User interface.
3.3 Gesture Command Set
• Four translation gesture commands
– move up (1); move down (2); move left (3);
move right (4)
• Six rotation gesture commands
– yaw right (7); yaw left (8); roll clockwise (9);
roll counterclockwise (10); pitch down (11);
pitch up (12)
• Two other gesture commands
– zoom in (5); zoom out (6).
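As a minimal sketch, the twelve numbered commands above could be held in an enumeration. The names `GestureCommand` and `command_group` are illustrative and not from the slides, and `PITCH_UP` is an assumption where the source lists only "pitch (12)".

```python
from enum import IntEnum


class GestureCommand(IntEnum):
    """Command IDs as numbered on the slide (class name is hypothetical)."""
    MOVE_UP = 1
    MOVE_DOWN = 2
    MOVE_LEFT = 3
    MOVE_RIGHT = 4
    ZOOM_IN = 5
    ZOOM_OUT = 6
    YAW_RIGHT = 7
    YAW_LEFT = 8
    ROLL_CLOCKWISE = 9
    ROLL_COUNTERCLOCKWISE = 10
    PITCH_DOWN = 11
    PITCH_UP = 12  # assumption: the slide says only "pitch (12)"


def command_group(cmd: GestureCommand) -> str:
    """Return the slide's grouping: translation, rotation, or other (zoom)."""
    if cmd <= GestureCommand.MOVE_RIGHT:
        return "translation"
    if cmd <= GestureCommand.ZOOM_OUT:
        return "other"
    return "rotation"
```

A recognizer that outputs an integer label can then be decoded with `GestureCommand(label)`.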
4 Real-Time Segmentation of
Continuous Dynamic Hand Gestures
• Goals
– segment the moving hand from background.
– partition gesture streams into meaningful
sections.
• Methodology
– integrating multiple cues: skin color, motion.
– post-processing (morphological filtering
techniques).
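The methodology above can be illustrated with a small pure-Python sketch. It assumes the skin-color and motion cues are already available as binary masks (lists of rows of 0/1), fuses them with a pixel-wise AND, and cleans the result with a 3x3 morphological opening. The slides do not say which fusion rule or which morphological operator was used, so both choices, and all function names, are assumptions.

```python
def fuse_cues(skin_mask, motion_mask):
    """Pixel-wise AND of two equally sized binary masks."""
    return [[s & m for s, m in zip(srow, mrow)]
            for srow, mrow in zip(skin_mask, motion_mask)]


def _window(mask, r, c):
    """Values in the 3x3 window around (r, c); outside the image counts as 0."""
    h, w = len(mask), len(mask[0])
    return [mask[rr][cc] if 0 <= rr < h and 0 <= cc < w else 0
            for rr in (r - 1, r, r + 1)
            for cc in (c - 1, c, c + 1)]


def erode(mask):
    """Keep a pixel only if its whole 3x3 neighbourhood is set."""
    return [[int(all(_window(mask, r, c))) for c in range(len(mask[0]))]
            for r in range(len(mask))]


def dilate(mask):
    """Set a pixel if any pixel in its 3x3 neighbourhood is set."""
    return [[int(any(_window(mask, r, c))) for c in range(len(mask[0]))]
            for r in range(len(mask))]


def opening(mask):
    """Erosion followed by dilation: removes specks smaller than the window."""
    return dilate(erode(mask))
```

Opening removes isolated noise pixels while leaving a solid hand-sized blob unchanged; a closing (dilate, then erode) would fill small holes instead.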
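[Flowchart: stack-based segmentation of the gesture stream. Recoverable box labels:]
– Let t = 0, read a frame from the video buffer, named I_t, then push I_t into a stack.
– Push I_{t+1} into the stack and increase t by 1.
– Decision: is t larger than L2? (Y branch shown; other decision labels were lost in extraction.)
– Analysis and recognition of the gesture sample stored in the stack.
– Empty the stack.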
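Quadratic: $\rho(x) = x^2$
Truncated quadratic: $\rho(x, \alpha, \lambda) = \begin{cases} \lambda x^2, & |x| < \sqrt{\alpha/\lambda} \\ \alpha, & \text{otherwise} \end{cases}$
Geman-McClure function: $\rho(x, \sigma) = \dfrac{x^2}{\sigma + x^2}$
[A further expression containing the terms 1/(...) and (x/(...))^2 was not recoverable from the extraction.]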
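The three named error functions can be compared numerically with a short sketch. The truncation threshold `sqrt(alpha / lam)` is an assumption: it is the point where `lam * x**2` equals `alpha`, which makes the function continuous. The Geman-McClure denominator follows the slide as `sigma + x**2`.

```python
import math


def quadratic(x):
    """Least-squares error: grows without bound, so outliers dominate."""
    return x * x


def truncated_quadratic(x, alpha, lam):
    """Quadratic near zero, constant (alpha) beyond the assumed threshold."""
    if abs(x) < math.sqrt(alpha / lam):
        return lam * x * x
    return alpha


def geman_mcclure(x, sigma):
    """Saturates towards 1 for large |x|, limiting the influence of outliers."""
    return x * x / (sigma + x * x)


if __name__ == "__main__":
    for residual in (0.1, 1.0, 10.0):
        print(residual,
              quadratic(residual),
              truncated_quadratic(residual, alpha=1.0, lam=1.0),
              geman_mcclure(residual, sigma=1.0))
```

For a large residual the quadratic keeps growing, while the other two flatten out; that bounded growth is what makes them robust.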
Where,
$T(a_i) \ge \dfrac{\partial^2 E(\Theta)}{\partial a_i^2}$
$\sigma_{n+1} = 0.95\,\sigma_n$
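The update σ_{n+1} = 0.95 σ_n is a geometric schedule: the scale parameter shrinks by 5% per iteration. A minimal sketch follows; the starting value and the number of steps are illustrative assumptions, not values from the slides.

```python
def sigma_schedule(sigma0, steps, rate=0.95):
    """Yield sigma_0, sigma_1, ... with sigma_{n+1} = rate * sigma_n."""
    sigma = sigma0
    for _ in range(steps):
        yield sigma
        sigma *= rate


if __name__ == "__main__":
    # Hypothetical start value; each estimation pass would use the next sigma.
    for n, sigma in enumerate(sigma_schedule(10.0, 5)):
        print(n, round(sigma, 4))
```

Starting with a large σ keeps the robust error function close to quadratic, and lowering it gradually rejects outliers more strongly as the estimate improves.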
5.5 Multi-resolution Analysis
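[Diagram: three-stage warp-and-estimate pipeline applied to frames $I_t$ and $I_{t+1}$. Starting from $\Theta_0$, each stage warps, estimates an increment $\Delta\Theta_k$, and adds it to the current parameters: $\Theta_1 = \Theta_0 + \Delta\Theta_0$, $\Theta_2 = \Theta_1 + \Delta\Theta_1$, and the last stage adds $\Delta\Theta_2$.]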
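Isotropic expansion: $m_3 = a_1 + a_5$;
$s_1 = a$, $\quad s_2 = \dfrac{a}{b}$, $\quad s_3 = \begin{cases} \theta, & \theta \in [0, \frac{\pi}{2}] \\ \pi - \theta, & \theta \in (\frac{\pi}{2}, \pi] \end{cases}$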
a , b , and θ represent the length of the major axis, length ratio of the major axis
to the minor axis, and the normalized angle between the major axis and the x-axis
of the image plane.
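The shape features can be sketched as a small function. It assumes `a` and `b` are the major and minor axis lengths of the fitted ellipse, so that s2 = a / b is the axis ratio, and that `theta` is given in radians in [0, π]. The slide's wording leaves the exact meaning of `b` open, so treat this as one possible reading.

```python
import math


def shape_features(a, b, theta):
    """Return (s1, s2, s3) for an ellipse with axes a >= b > 0 and angle theta."""
    if not 0.0 <= theta <= math.pi:
        raise ValueError("theta must lie in [0, pi]")
    if b <= 0:
        raise ValueError("b must be positive")
    s1 = a                      # length of the major axis
    s2 = a / b                  # assumed: ratio of major to minor axis
    # Fold the angle into [0, pi/2] so mirrored orientations share a value.
    s3 = theta if theta <= math.pi / 2 else math.pi - theta
    return s1, s2, s3
```

For example, `shape_features(4.0, 2.0, 3 * math.pi / 4)` gives an axis ratio of 2.0 and a folded angle of π/4.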
6.3. Spatio-temporal Appearance
$g_L = [f_0, f_1, \ldots, f_{L-1}]$
Where,
$f_t = [m[t], s[t]]^T$
Fig. 11: DTW assumes that the endpoints of the two patterns have been
accurately located and formulates the pattern matching problem as finding
the optimal path from the start to the end on a finite grid. The optimal path
can be found efficiently by dynamic programming.
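The grid search described in the caption can be sketched as the classic DTW recurrence. This is the textbook algorithm with the usual three-way step pattern, not the modified variant used in this work. The absolute-difference local cost and the name `dtw_distance` are assumptions for illustration.

```python
import math


def dtw_distance(a, b, dist=lambda x, y: abs(x - y)):
    """Cost of the optimal warping path aligning sequences a and b end to end."""
    n, m = len(a), len(b)
    # D[i][j] = minimal accumulated cost of aligning a[:i] with b[:j].
    D = [[math.inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = dist(a[i - 1], b[j - 1])
            D[i][j] = cost + min(D[i - 1][j],      # a advances, b repeats
                                 D[i][j - 1],      # b advances, a repeats
                                 D[i - 1][j - 1])  # both advance
    return D[n][m]
```

Because both endpoints are fixed at the grid corners, the result is only meaningful when the start and end of each pattern have been located correctly, as the caption notes.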
7.2 Modified DTW
• Our experiments find that the traditional DTW is not
adequate to match two spatio-temporal appearance
patterns.
– Unlike the high sampling rate used in speech recognition, the
sampling rate is usually 10 Hz in hand gesture recognition.
Therefore, the fluctuation in the time axis of hand gesture
patterns is much sharper than that of speech patterns.
– A modified DTW algorithm, a kind of non-linear re-sampling
technique, is developed to dynamically warp each
spatio-temporal pattern to a fixed temporal length, which preserves
the necessary temporal information and spatial distribution of
the original patterns.
7.3 Template based Recognition
• The distance between two spatio-temporal appearance patterns
is calculated based on the correlation between their warped
patterns.
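$D(A, B) = 1 - \dfrac{\sum_{j=0}^{K-1} \sum_{i=0}^{9} (w_i \hat{a}_{ij}) \times (w_i \hat{b}_{ij})}{\sqrt{\sum_{j=0}^{K-1} \sum_{i=0}^{9} (w_i \hat{a}_{ij})^2} \, \sqrt{\sum_{j=0}^{K-1} \sum_{i=0}^{9} (w_i \hat{b}_{ij})^2}}$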
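The correlation-based distance can be sketched as follows. It assumes each warped pattern is a list of K frames, each frame a list of feature values, with one weight per feature. It also assumes the denominator is the product of the two root-sum-of-squares norms, as in a normalized correlation. The function name is hypothetical.

```python
import math


def pattern_distance(A, B, w):
    """D(A, B) = 1 - weighted normalized correlation of two warped patterns."""
    num = norm_a = norm_b = 0.0
    for frame_a, frame_b in zip(A, B):              # j = 0 .. K-1
        for wi, a, b in zip(w, frame_a, frame_b):   # i over features
            num += (wi * a) * (wi * b)
            norm_a += (wi * a) ** 2
            norm_b += (wi * b) ** 2
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("patterns must not be all zero")
    return 1.0 - num / math.sqrt(norm_a * norm_b)
```

Under this reading, identical patterns give a distance of 0, uncorrelated patterns 1, and opposite patterns 2, so a template-based recognizer would pick the template with the smallest D.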
Components of f_t by frame index t:
t    m1        m2       m3        m4        m5       s1        s2       s3
0     0.9338   1.1024   -0.1660    0.0572   0.1337   41.4148   2.1573   1.5349
1     0.2979   1.3516   -0.1949    0.0249   0.1628   41.5656   1.8194   1.5041
2    -0.0969   0.6288    0.0427   -0.0021   0.0772   44.0707   2.3240   1.4043
3    -0.8565   0.6642    0.0445    0.1200   0.1662   43.7648   2.4416   1.4120
8.6 Motion Appearance Vs Shape Appearance