2.3.3.2. Segmentation
The task of the segmentation is to recognise and detect the placeholder objects (PHOs) and pointers onto which the visual output of the system is projected, as well as the hands, in the captured 2D image. To achieve invariance to the changing size and shape of the objects to be detected, the research used a colour pixel-based approach that segments blobs of similar colour in the image. Problems such as lighting settings, changing illumination, and skin-colour detection are discussed, and solutions to them are given (Moeslund T., 2004).
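To make the colour pixel-based idea concrete, the following sketch (an illustration of the general technique, not the cited authors' implementation) thresholds a colour range in HSV space using OpenCV; the skin-tone bounds shown are placeholder values that would need calibration for the actual camera and illumination.

    import cv2
    import numpy as np

    def segment_colour(frame_bgr, lower_hsv, upper_hsv):
        # Return a binary mask of pixels whose HSV colour lies in the given range.
        hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
        mask = cv2.inRange(hsv, lower_hsv, upper_hsv)
        # Morphological opening removes small speckles caused by sensor noise.
        kernel = np.ones((5, 5), np.uint8)
        return cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)

    # Rough skin-tone range in HSV (placeholder values, not from the paper).
    skin_lower = np.array([0, 40, 60], dtype=np.uint8)
    skin_upper = np.array([25, 255, 255], dtype=np.uint8)
    # mask = segment_colour(frame, skin_lower, skin_upper)

The same routine can be run once per colour range to separate several placeholder objects and pointers in a single frame.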
2.3.3.3. Gesture Recognition
A basic approach is taken to solve this problem: counting the number of fingers. The hand and fingers can be approximated by a circle and a number of rectangles, where the number of rectangles equates to the number of extended fingers. A polar transformation is performed around the centre of the hand, and the number of fingers (rectangles) present at each radius is counted. The algorithm deliberately contains no information regarding the relative distances between two fingers, firstly because this makes the system more general, and secondly because different users have different hand shapes and sizes (Moeslund T., 2004).
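A minimal sketch of this counting idea is given below; it is a simplified illustration under stated assumptions, not the cited algorithm. Given a binary hand mask (e.g. from the colour segmentation above) and an estimated palm centre, it samples a circle of a chosen radius and counts the contiguous runs of foreground pixels crossed, which approximates the number of extended fingers at that radius.

    import numpy as np

    def count_fingers_at_radius(mask, centre, radius, samples=360):
        # mask:   2D binary array (1 = hand pixel)
        # centre: (row, col) of the estimated palm centre
        # radius: sampling radius in pixels, larger than the palm circle
        angles = np.linspace(0.0, 2.0 * np.pi, samples, endpoint=False)
        rows = np.clip((centre[0] + radius * np.sin(angles)).astype(int),
                       0, mask.shape[0] - 1)
        cols = np.clip((centre[1] + radius * np.cos(angles)).astype(int),
                       0, mask.shape[1] - 1)
        ring = mask[rows, cols].astype(int)
        # Each finger appears as a run of 1s; count the 0 -> 1 transitions
        # around the ring (np.roll handles the wrap-around at 2*pi).
        return int(np.sum((ring == 1) & (np.roll(ring, 1) == 0)))

Repeating the count over several radii, as the polar transformation suggests, makes the estimate more robust than a single ring.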
2.3.3.4. System Performance
Gesture recognition has been implemented as part of the computer vision system of an AR multi-user application. The low-level segmentation described above can robustly segment seven different colours from the background (skin colour and six colours for the PHOs and pointers), provided there are no large changes in the illumination colour (Moeslund T., 2004).
[Figure: Segmentation results (Moeslund T., 2004)]
2.3.4. A Design Tool for Camera-Based Interaction
Constructing a camera-based interface can be difficult for most programmers, as it requires an understanding of the machine-learning algorithms involved. In a camera-based interface, a camera serves as the sensor, the "eyes" of the system, for capturing user input. The goal is to make the system interactive without requiring the user to wear any special device, and without traditional inputs such as a keyboard. This moves computing into the environment rather than keeping it on our desktops. The problem lies in designing a camera-based system: the programming and the mathematics involved are complicated enough that ordinary programmers do not have the skills for them, especially when bare-hand input is considered. The main item to be considered in camera-based interaction is a classifier that takes an image and identifies which pixels are of interest. Acquiring the skill of building such a classifier is essential to pursuing the idea (Fails, J.A., 2003).
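As an illustration of the pixel-classifier idea, the sketch below (an assumption for brevity, not the Crayons tool itself) trains an off-the-shelf decision tree on hand-labelled pixel colours and then labels every pixel of a new image; the choice of raw RGB features and of scikit-learn is this sketch's, not the cited paper's.

    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    def train_pixel_classifier(pixels, labels):
        # pixels: (N, 3) array of RGB values from user-labelled strokes
        # labels: (N,) array, 1 = object of interest, 0 = background
        clf = DecisionTreeClassifier(max_depth=10)
        clf.fit(pixels, labels)
        return clf

    def classify_image(clf, image):
        # Label every pixel of an (H, W, 3) image; returns an (H, W) mask.
        h, w, _ = image.shape
        return clf.predict(image.reshape(-1, 3)).reshape(h, w)

The appeal of a tool like Crayons is that the designer supplies only the labelled strokes; the training step above is exactly what such a tool hides.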
Crayons is one such tool for building a classifier, which can be exported in a form that can be read by Java. Crayons helps user interface (UI) designers build camera-based interfaces even without detailed knowledge of image processing. Its classifiers are unable to distinguish shapes or object orientation, but they do well in object detection and in hand and object tracking (Fails, J.A., 2003).
2.3.5. Using Marking Menus to Develop Command Sets for Computer Vision Based
Hand Gesture Interfaces
This work explores the use of hand gestures for interaction in an approach based on computer vision. The purpose is to study whether marking menus could, with practice, support the development of autonomous command sets for gestural interaction. Some early problems are reported, mainly concerning user fatigue and the precision of gestures (Lenman, S., 2002).
Remote control of electronic appliances in a home environment, such as TV sets and DVD players, was chosen as a starting point. Normally this requires a number of devices, and there are clear benefits to a device-free approach. Only a first prototype for exploring pie and marking menus for gesture-based interaction was implemented (Lenman, S., 2002).
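The core of a pie or marking menu is a simple mapping from gesture direction to menu item. The sketch below is a generic illustration of that mapping, not the authors' prototype: it converts a hand-movement vector into the index of one of n equally sized slices.

    import math

    def pie_menu_selection(dx, dy, n_items):
        # Map a movement vector (dx, dy), in image coordinates where y grows
        # downwards, to a pie-menu slice index. Slice 0 is centred on 'up'
        # and indices increase clockwise (with 4 items: 0 = up, 1 = right,
        # 2 = down, 3 = left).
        angle = math.atan2(dx, -dy) % (2.0 * math.pi)
        slice_width = 2.0 * math.pi / n_items
        # Offset by half a slice so each item is centred on its direction.
        return int(((angle + slice_width / 2.0) % (2.0 * math.pi)) // slice_width)

In a marking menu the same mapping applies, but with practice the user articulates the stroke from memory, without waiting for the menu to be displayed.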
2.3.5.1. Perceptive and Multimodal User Interfaces
Perceptive User Interfaces (PUI) strive for automatic recognition of natural human gestures, integrated with other human expressions such as body movements, gaze, facial expression, and speech. The second approach to gestural interfaces is Multimodal User Interfaces (MUI), where hand poses and specific gestures are used as commands in a command language. In this approach, gestures serve as a replacement for other interaction tools, such as remote controls, mice, or other interaction devices. The gestures need not be natural gestures but could be developed for the situation, or based on a standard sign language.
There is a growing interest in designing multimodal interfaces that incorporate vision-based technologies. This contrasts the passive mode of PUI with the active input mode addressed here: although passive modes may be less obtrusive, active modes are generally more reliable indicators of user intent and less prone to error.
The design space for such commands can be characterized along three
dimensions: Cognitive aspects, Articulatory aspects, and Technological aspects.
Cognitive aspects refer to how easy commands are to learn and to remember. It
is often claimed that gestural command sets should be natural and intuitive,
meaning that they should inherently make sense to the user.
Articulatory aspects refer to how easy gestures are to perform, and how tiring
they are for the user. Gestures involving complicated hand or finger poses
should be avoided, because they are difficult to articulate.
Regarding the technological aspects, however, the performance of such recognition approaches was far from real-time. The closest approach represented the hand poses as elastic graphs, with local jets of Gabor filters computed at each vertex. In order to maximize speed and accuracy in the prototype, gesture recognition is currently tuned to work against a uniform background, within a limited area approximately 0.5 m by 0.65 m in size, at a distance of approximately 3 m from the camera, and under relatively fixed lighting conditions (Lenman, S., 2002).
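For illustration, a "jet" at an image point is simply the vector of responses of a bank of Gabor filters, at several orientations and scales, sampled at that point. The sketch below is a generic example built with OpenCV, not the cited implementation; the kernel size and filter parameters are placeholder choices.

    import cv2
    import numpy as np

    def gabor_jet(gray, point, n_thetas=8, sigmas=(2.0, 4.0)):
        # Responses of a small Gabor filter bank at one (row, col) point.
        responses = []
        for sigma in sigmas:
            for k in range(n_thetas):
                theta = k * np.pi / n_thetas  # filter orientation
                kernel = cv2.getGaborKernel(ksize=(21, 21), sigma=sigma,
                                            theta=theta, lambd=4.0 * sigma,
                                            gamma=0.5, psi=0)
                filtered = cv2.filter2D(gray.astype(np.float32), -1, kernel)
                responses.append(filtered[point[0], point[1]])
        # One jet: n_thetas * len(sigmas) values describing local texture.
        return np.array(responses)

Comparing the jets stored at the vertices of a model graph with those computed in the input image is what lets the elastic-graph approach tolerate small deformations of the hand pose.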
References:
Hardenberg, C., & Bérard, F. (2001). Bare-Hand Human-Computer Interaction. Orlando, FL, USA.
Kjeldsen, R., Levas, A., & Pinhanez, C. (2003). Dynamically Reconfigurable Vision-Based User Interfaces. Retrieved from http://www.research.ibm.com/ed/publications/icvs03.pdf
DLP and LCD Projector Technology Explained. (n.d.). Retrieved June 2, 2006, from http://www.projectorpoint.co.uk/projectorLCDvsDLP.htm
Moeslund, T., Liu, Y., & Storring, M. (2004, September). Computer Vision-Based Gesture Recognition for an Augmented Reality Interface. Marbella, Spain. Retrieved from http://www.cs.sfu.ca/~mori/courses/cmpt882/papers/augreality.pdf
Fails, J. A., & Olsen, D. (2003). A Design Tool for Camera-Based Interaction. Brigham Young University, Utah. Retrieved from http://icie.cs.byu.edu/Papers/CameraBaseInteraction.pdf
Lenman, S., Bretzner, L., & Thuresson, B. (2002, October). Using Marking Menus to Develop Command Sets for Computer Vision Based Hand Gesture Interfaces. Retrieved from http://delivery.acm.org/10.1145/580000/572055/p239-lenman.pdf?key1=572055&key2=1405429411&coll=GUIDE&dl=ACM&CFID=77345099&CFTOKEN=54215790
Webcam. (n.d.). Wikipedia. Retrieved June 3, 2006, from Answers.com: http://www.answers.com/topic/web-cam
Intel. (n.d.). Open Source Computer Vision Library. Retrieved June 4, 2006, from http://www.intel.com/technology/computing/opencv/index.htm
The Microsoft Vision SDK. (2000, May). Retrieved June 4, 2006, from http://robotics.dem.uc.pt/norberto/nicola/visSdk.pdf