A Real Time Hand Gesture Recognition System

Karthik DC and Laxmikantha Herle P

Department of Electronics and Communication Engineering, Sri Venkateshwara College of Engineering, Visvesvaraya
Technological University, Karnataka, India

Abstract— In this paper, we introduce a hand gesture recognition system that recognizes gestures in real time in unconstrained environments. Because of the effects of lighting and complex backgrounds, most visual hand gesture recognition systems work only in restricted environments, which is why such systems are still not common in daily life. We explain how to design a human-computer interface built around a method for recognizing hand gestures. The proposed system uses a single camera to recognize the user's hand gesture. Hand gestures are hard to recognize because the human hand is an object with a high degree of freedom, a well-known problem in vision-based recognition. However, we use a simple background subtraction technique along with segmentation to increase the processing speed, which improves human-computer interaction in a real-time environment. In the proposed method we divide the image into four segments called quadrants, and a decision is made based on the quadrant in which the object lies; the decision can be an action or some [...]

Keywords— Hand gesture recognition, human-computer interface, background subtraction technique, processing speed, quadrant.

I. INTRODUCTION
In human-human interaction, multiple communication modalities such as speech, gestures and body movements are frequently used. The standard input methods, such as text input via the keyboard and pointer/location information from a mouse, do not provide a natural, intuitive interaction between humans and machines. Therefore, it is essential to create devices for natural and intuitive communication between humans and machines. Furthermore, for intuitive gesture-based interaction between human and machine, the machine should understand the meaning of a gesture with respect to society and culture. The ability to understand hand gestures will improve the naturalness and efficiency of human interaction with the machine, and allow the user to communicate in complex tasks without using tedious sets of detailed instructions. This interactive system uses cameras or web cameras to identify and recognize gestures based on hand poses.

The term gesture is defined as "movement to convey meaning" or "the use of motions of the limbs or body as a means of expression", a movement usually of the body or limbs that expresses or emphasizes an idea. The main purpose of gesture recognition research is to identify a particular human gesture and convey to the user the information pertaining to that gesture. From among many gestures, a specific gesture of interest can be identified, and on that basis a specific command for the execution of an action can be given to the system. The overall aim is to make the computer understand human body language, thereby bridging the gap between machine and human. Hand gesture recognition can be used to enhance human-computer interaction without depending on traditional input devices such as the keyboard and mouse.

A gesture is defined as a string of movements with specific breaks that are reached progressively over time. Some simple gestures commonly have only one position to reach from the beginning to the end of the gesture. Other gestures cover multiple positions and rarely remain in a stationary pose. Modeling and recognizing a gesture is a difficult challenge since gestures vary dynamically both in shape and in duration. Gestures can broadly be divided into two categories: communicative (meaningful) gestures and non-communicative (transitional) gestures. In order to identify different types of communicative motions, it is important to classify gestures.

The proposed system is designed mainly for controlling a machine by merely showing hand gestures in front of a camera. A simple video camera is used for computer vision, which helps in monitoring gesture presentation. This approach consists of four modules: (a) real-time hand gesture formation monitoring and gesture capture, (b) feature extraction, (c) background subtraction, and (d) determination of the command corresponding to the shown gesture and execution of the action by the system. A real-time hand tracking technique is used for object detection within the range of vision. The primary goal of hand gesture recognition research is to create [...]

First, we discuss the components involved in the design, explained using the block diagram, in Section II. The background subtraction technique is discussed in Section III, followed by the decision-making process in Section IV. The experimental results are presented in Section V. Finally, we describe future directions for the technology and conclude in Section VI.
[Block diagram: Image Acquisition -> Image Processing -> Background Subtraction -> Decision Making]

Fig. 1 system components


II. SYSTEM COMPONENTS
One of the main objectives of our approach is a low-cost computer vision system that can run on a common PC equipped with a USB web cam. The system should be able to work under different degrees of scene background complexity and illumination conditions, which should not change during execution. The following processes compose the general framework, which is shown in Fig. 1.

A. Image acquisition
Image acquisition can be achieved with a web cam or a digital video camera. This device captures the image of the object and sends it to the PC for further processing.

B. Image processing
A color image is a combination of RGB planes. Here we convert RGB color images into gray images by separating the three planes. Finally, we consider only the red plane for further processing.

C. Background subtraction
The object has the maximum intensity whereas the background has the minimum intensity, so we set a threshold level to convert the grayscale image into a binary image. This technique is applied to remove the background noise, leaving only the object (palm). Finally, we calculate the centroid of the resulting binary image.

D. Decision making
Here we decide in which quadrant the object centroid lies. Based on that, we control the devices. This is done through a hardware implementation using serial communication from the PC to a microcontroller, which in turn controls the device in response to the hand gesture.

The camera is first controlled through its driver software, which is provided with the camera hardware. This driver is also customized for the specific operating system. The application then needs to contact the operating system for camera access. Once this process is complete, the live camera view is displayed in a supporting control such as a picture box. Because this is a real-time live view, it cannot be processed directly, so we need to take the current frame out of the live stream for processing. We call this frame extraction. Even with the frame in hand, it is not easy to identify the color from it, so we need to perform image processing. Each pixel is made up of three RGB components, so we extract the RGB value for each pixel and convert the frame into a gray image. This image is further processed using a simple background subtraction technique to recognize the color pattern of the object. The image is converted into a binary image, where black represents the background and white represents the object, which is the palm. Finally, we calculate the centroid of the object, and the respective decision is taken based on the quadrant in which the centroid lies.

This whole process repeats for the next frame. Because we perform both frame capture and frame processing, the two processes must be synchronized for smooth performance. This is how real-time image processing works and how it is used in our project.
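The per-frame steps described above (take the red plane as the gray image, threshold it to a binary image, then compute the centroid of the object) can be sketched as follows. This is a minimal illustration in plain Python rather than the MATLAB implementation used in this work; the function names, the threshold value of 128 and the 4x4 test frame are assumptions chosen only for the example.

```python
def red_plane(rgb_image):
    """Return the red channel of an image given as rows of (r, g, b) tuples."""
    return [[pixel[0] for pixel in row] for row in rgb_image]


def to_binary(gray, threshold):
    """Mark pixels brighter than the threshold as object (1), the rest as background (0)."""
    return [[1 if value > threshold else 0 for value in row] for row in gray]


def centroid(binary):
    """Return the (row, column) centroid of the object pixels, or None if there are none."""
    count = row_sum = col_sum = 0
    for r, row in enumerate(binary):
        for c, value in enumerate(row):
            if value:
                count += 1
                row_sum += r
                col_sum += c
    if count == 0:
        return None
    return (row_sum / count, col_sum / count)


# Hypothetical 4x4 frame: a bright 2x2 "palm" in the lower-right corner.
DARK, BRIGHT = (10, 10, 10), (220, 180, 160)
frame = [
    [DARK, DARK, DARK, DARK],
    [DARK, DARK, DARK, DARK],
    [DARK, DARK, BRIGHT, BRIGHT],
    [DARK, DARK, BRIGHT, BRIGHT],
]
binary = to_binary(red_plane(frame), threshold=128)  # threshold is an assumed value
print(centroid(binary))  # (2.5, 2.5)
```

Returning None for an empty binary image lets the caller skip the decision step for frames in which no object is visible.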
III. BACKGROUND SUBTRACTION
The acquired image contains both the objects of interest and the unwanted background. It is hard to analyze the object in the presence of unwanted background, so it is necessary to remove the background from the image.
Background subtraction is a commonly used class of techniques for segmenting out objects of interest. It involves comparing an observed image with an estimate of the image if it contained no objects of interest. The areas of the image plane where there is a significant difference between the observed and estimated images indicate the location of the objects of interest. The name background subtraction comes from the simple technique of subtracting the observed image from the estimated image and thresholding the result to generate the objects of interest.
The figures below show an example of the steps involved in the background subtraction technique. Fig. 2 shows the input image. This image is converted into grayscale, as shown in Fig. 3. The grayscale image is converted into a binary image using predefined threshold values. The binary image so obtained is shown in Fig. 4.

IV. DECISION MAKING
Extracting the object from the background and recognizing the gesture is a very difficult task. There are various methods for recognizing a gesture. Some of them maintain a database in memory of all possible gestures that the system can recognize, which consumes a lot of time. In the proposed method we calculate the centroid of the object obtained. We divide the frame into segments and examine in which segment the centroid of the object lies. Depending on the segment in which the centroid lies, a particular decision is made, as shown in Fig. 5.
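The centroid-to-decision step can be illustrated with a short Python sketch. The quadrant numbering and the command names below are hypothetical, since the paper does not specify which action is tied to which quadrant; sending the chosen command to the microcontroller over the serial link is left out.

```python
def quadrant(centroid, height, width):
    """Return 1-4 for the quadrant of the frame that contains the centroid.

    Assumed numbering (not from the paper): 1 = top-left, 2 = top-right,
    3 = bottom-left, 4 = bottom-right. The centroid is (row, column).
    """
    row, col = centroid
    top = row < height / 2
    left = col < width / 2
    if top:
        return 1 if left else 2
    return 3 if left else 4


# Hypothetical mapping from quadrant to a command for the controlled device.
COMMANDS = {1: "FORWARD", 2: "RIGHT", 3: "LEFT", 4: "STOP"}


def decide(centroid, height, width):
    """Return the command for the current frame, or None when no object was found."""
    if centroid is None:
        return None
    return COMMANDS[quadrant(centroid, height, width)]


print(decide((2.5, 2.5), height=4, width=4))  # STOP
print(decide((0.5, 3.0), height=4, width=4))  # RIGHT
```

Because the decision needs only two comparisons per frame, no stored gesture has to be searched or matched, which is consistent with the speed argument made for the proposed method.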

Fig. 5 segmentation and centroid
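The general background subtraction idea, comparing an observed image with a background estimate and thresholding the difference, can be sketched in Python as below. This is an illustration under stated assumptions, not the implementation used in this work: the absolute difference, the threshold of 30 and the 3x3 sample images are choices made only for the example.

```python
def subtract_background(observed, background, threshold):
    """Return a binary mask: 1 where the frame differs from the background estimate.

    Both images are grayscale, given as equally sized lists of rows of intensities.
    """
    if len(observed) != len(background):
        raise ValueError("images must have the same number of rows")
    mask = []
    for obs_row, bg_row in zip(observed, background):
        if len(obs_row) != len(bg_row):
            raise ValueError("images must have the same row length")
        mask.append([1 if abs(o - b) > threshold else 0 for o, b in zip(obs_row, bg_row)])
    return mask


# Hypothetical 3x3 example: the background estimate is an empty scene, and the
# observed frame has a bright object in the centre plus slight sensor noise.
background = [
    [20, 22, 21],
    [19, 20, 23],
    [21, 20, 20],
]
observed = [
    [21, 22, 20],
    [19, 200, 24],
    [20, 21, 20],
]
for row in subtract_background(observed, background, threshold=30):
    print(row)
# [0, 0, 0]
# [0, 1, 0]
# [0, 0, 0]
```

The threshold absorbs small pixel-level noise, so only the region with a significant difference is marked as the object.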

Fig. 2 input image

Fig. 3 grayscale image

Fig. 4 binary image

V. EXPERIMENTAL RESULTS
The above system was implemented using a web cam (AstraPix) as the input device. MATLAB R2007b was used for processing the image on an Intel 2.8 GHz processor with 1 GB RAM. The experiment was conducted as shown in the figures above, and the results are listed as follows:

A. The gestures were recognized with a high degree of accuracy when the background was of lesser [...]
B. The processing time was found to be short, since there is no need to store the given gesture and compare it with a stored one.
C. As the number of available gestures increased, the complexity increased, which made giving accurate gestures very difficult.
D. The ability of the system to recognize gestures decreased as the distance between the webcam and the object increased.

Fig. 6 input image

Fig. 7 grayscale image

Fig. 8 binary image

Fig. 9 segmented image

VI. CONCLUSION AND FUTURE WORK
There are many approaches to hand gesture recognition, and each approach has its strengths and weaknesses. The strengths of the method proposed in this paper are a simple algorithm and fast processing. The proposed algorithm achieves a high recognition rate.
The weakness of the system is that the device performs actions only over small distances, so in our future work the system will be converted to a wireless model so that actions can be performed from long distances.
We can also combine this technology with other gesture recognition technologies to get better performance. For example, each segment of the image could be linked to a set of stored gestures, thereby reducing the number of image comparisons and increasing efficiency.