Scientific Bulletin of the „Petru Maior” University of Tîrgu Mureş
Vol. 11 (XXVIII) no. 1, 2014
ISSN-L 1841-9267 (Print), ISSN 2285-438X (Online), ISSN 2286-3184 (CD-ROM)
HAND GESTURES MOUSE CURSOR CONTROL
Marian-Avram VINCZE 1, Horaţiu-Ştefan GRIF 2
1, 2 “Petru Maior” University of Tîrgu Mureş, Nicolae Iorga Street, no. 1, 540088, Tîrgu Mureş, Romania
1 vincze.marian.avram@gmail.com
2 horatiu.grif@ing.upm.ro
Abstract
The paper describes the implementation of a human-computer interface for controlling the mouse cursor. The tests reveal that a low-cost web camera and some image processing algorithms are enough to control the mouse cursor on a computer. Even if the system is influenced by the illuminance level on the plane of the hand, the current study may represent a starting point for further studies in the hand tracking and gesture recognition field.
Keywords: human-computer interface, mouse cursor, hand tracking, hand gesture, HSV color model
1. Introduction
In Human-Computer Interaction (HCI), traditional input devices such as the keyboard, mouse and joystick become less suitable in the interaction between human and computer, since in Human-Human Communication (HHC) it is more natural to communicate with the body. Hands and fingers, especially, have a significant role in HHC. They are used to accompany or even replace spoken communication in some cases, such as in sign languages, where hands completely replace spoken communication.

Nowadays, computer vision gives a lot of potential for motion capture without putting limits on the user, and this offers the possibility to create more natural gestural interfaces [14]. The ability to recognize hand gestures is indispensable for successful interpersonal social interaction. Vision-based hand gesture recognition, enabling computers to understand hand gestures as humans do, is an important technology for intelligent HCI. Therefore, visual analysis of hand gestures has attracted much attention in the last decade, and much progress has been made [13]. Reference [8] introduces a gesture recognition system based on image blocking where the gestures are recognized using a suggested brightness factor matching algorithm. In [13] is presented the realization of a real-time hand tracking system in dynamic environments for a wheelchair, using a mean shift embedded particle filter. In [9] the authors propose a method for the recognition of the gestures of two hands using the depth map of the frame. The depth map is calculated at hardware level by a VisionST stereo camera. In [11] is proposed and implemented an application for object manipulation in a virtual environment using hand gestures. In [15] the authors present a gesture recognition system based on the accelerometer from a mobile device. The system, able to recognize a collection of 10 different hand gestures, is finally configured to control the actions of a wheeled robot in an indoor space. In [6] is described a motion tracking system based on integrated accelerometers which is tested to control the navigation in virtual spaces.

2. Color models
Gonzalez and Woods (2001) define the color model (color space or color system) as a specification of a coordinate system and a subspace within that system where each color is represented by a single point [7]. The RGB color model is the “most commonly used and popular color space” (Sarifuddin 2005, p. 1). In this model each color is represented using the three additive primary colors: red, green and blue [3], [7]. Combined additively, the spectral components of these colors will produce a resultant color [3].

According to the definitions mentioned above, the RGB model uses as a coordinate system a Cartesian coordinate system and as a color subspace a cube (fig. 1) which has three corners colored in red, green and blue, while the other corners are colored in cyan, magenta, yellow, black and white. Black is the origin of the RGB system, and white is situated diagonally opposite the black corner. Each color of this model is defined as a point on the surface of the RGB cube or as a point within the RGB cube [7], [1]. The straight line which joins the black corner with the white corner defines the gray scale level [7] (each point on this line has equal values for the three RGB components).

According to [12], “this space presents some limitations: (i) the presence of a negative part in the spectra, which does not allow the representation of certain colors by a superposition of the three spectra,
(ii) the difficulty to determine color features like the presence or the absence of a given color, and (iii) the inability of the Euclidean distance to correctly capture color differences in the RGB space” (Sarifuddin 2005, p. 1), the authors of the present paper have considered a perceptual color space [4]. More precisely, they chose the HSV color model (fig. 2) because its color components relate to the human perception of color, saturation and luminance [5].

Fig. 1. RGB color model [7][1]

Fig. 2: HSV color model [4]

The V, S and H components of the HSV model are computed by [1], [5], [10]

V = max(R, G, B)   (1)

S = 0, if V = 0;  S = (V − X)/V, otherwise   (2)

where

X = min(R, G, B)   (3)

H = 0, if V = X;
H = (60° · (G − B)/(V − X) + 360°) mod 360°, if V = R;
H = 60° · (B − R)/(V − X) + 120°, if V = G;
H = 60° · (R − G)/(V − X) + 240°, if V = B   (4)

where R, G and B are the components of the RGB model with values in the interval [0, 1].

3. The system architecture
The system architecture is presented in fig. 3 and is composed of three blocks:
- Image acquisition block;
- Hand tracking and gesture recognition block;
- Mouse cursor control block.

Fig. 3: System architecture (Image acquisition → Frame → Hand tracking & hand gesture recognition → Mouse event → Mouse cursor control)

The image acquisition block consists of a web camera which captures images of the human hand. The hand tracking and hand gesture recognition block groups the methods used to detect the human hand and interpret the hand gestures. The mouse cursor control block is responsible for forwarding the appropriate mouse event to the operating system.

The system overview is presented in fig. 4.

Fig. 4: The system overview

The web camera (A4Tech PK-635M), denoted by (1), captures the image of the human hand placed on a hand pad (3) and sends it to the computer (2). In the computer, the image of the hand is processed and, based on the detected gesture, a command for the mouse cursor is generated. The ring (4) is used for an easier spatial detection of the hand on the plane of the hand pad. Fig. 5 presents a picture of the hand and the ring attached to the hand during system operation.
Fig. 5: Hand and the ring

4. The proposed procedure
To achieve the mouse cursor control, the following general algorithm was applied:
Step 1. Acquire the current image of the hand pad area;
Step 2. Remove the noise from the acquired image;
Step 3. Detect the position of the hand and set the mouse cursor (x, y) coordinates;
Step 4. Detect the hand pixels;
Step 5. Based on the detected hand gesture, send the corresponding command to the mouse cursor;
Step 6. Jump to Step 1.

The noise removal in Step 2 is done by applying a blurring filter with a 7x7 Gaussian mask. The hand (x, y) position (detected in Step 3) on the hand pad plane is obtained by detecting the (x, y) position of the ring, which has a blue color. To detect the pixels of the ring, the authors converted the acquired image from the RGB model to the HSV model. After conversion, the HSV image was thresholded by the hue and saturation ranges of the ring area. After thresholding, a morphological closing operation was applied. The morphological closing operation was achieved by applying first a dilation and then an erosion [2]. The (x, y) coordinates of the ring are computed by

x = (1/N_ring) · Σ_{i=1..N_ring} x_i,   y = (1/N_ring) · Σ_{i=1..N_ring} y_i   (5)

where N_ring is the total number of pixels corresponding to the ring area.

The pixels corresponding to the hand area (Step 4) were detected in the same way as the pixels corresponding to the ring: the same HSV image was thresholded by the hue and saturation ranges of the hand skin (Caucasian skin color) area. After thresholding, the previously mentioned morphological closing operation was applied.

The detection of the hand gesture was done by counting the pixels corresponding to the hand. The authors define three hand gestures (default, right click and left click) which are presented in fig. 6 – 8 and have different numbers of pixels.

Fig. 6: Default hand gesture

Fig. 7: Left click hand gesture

Fig. 8: Right click hand gesture

5. Experimental results
The source code of the software part of the system was written in the C language using the OpenCV library [10].

The system was tested for different levels of daylight illuminance on the hand pad. For each considered level of illuminance, the authors performed 20 hand gestures. Fig. 9 presents the number of detected hand gestures for each considered level of illuminance.

Fig. 9: The number of hand gestures detected for different levels of the daylight illuminance on the hand pad plane (8 of 20 at 3 lx, 11 at 6 lx, 13 at 10 lx, 20 at 25 lx)
6. Conclusions
This paper showed that, with a few hand gestures and image processing algorithms, full control over the mouse cursor can be achieved, eliminating the physical mouse completely.

From the tests we have made, the conclusion is that the application is best used in the highest daylight or in a room with a high artificial light level.

As with every research effort, some benefits and disadvantages appeared during the research process. The benefits of our research are as follows:
- the ability to adapt to various skin colors;
- the ability to adapt to different hand gestures based on the number of pixels of each gesture made.

The disadvantage of this research is that the application cannot be used at any time of day, because daylight can vary from one illuminance value to another in a few seconds depending on the weather. For example, if a cloud passes by and blocks the sunlight, the illuminance changes drastically, making the detection process more inaccurate and, therefore, fewer hand gestures are recognized.

As we presented initially, this research can be extended to a broader goal: fully controlling a PC or gadget not just with hand gestures, but with full body language (eyes, facial expressions, mouth movements, arm gestures). All that is needed is a low-cost web camera and other image processing algorithms.

Motto: That's one small click for man, one giant gesture for mankind.

References

[1] Bardoşi F. I., Grif H. S. and Răţoi O. (2009), Mouse Controlled by Hand Gestures Using Digital Image Processing, International Conference Interdisciplinarity in Engineering INTER-ENG 2009, Tg. Mureş, Romania, pp. 258-263.
[2] Bradski G. and Kaehler A. (2008), Learning OpenCV, O'Reilly, US.
[3] Crane R. (1997), A simplified approach to image processing: classical and modern techniques in C, Prentice-Hall.
[4] De Campos T. E. (2006), 3D Visual Tracking of Articulated Objects and Hands, PhD Thesis, University of Oxford.
[5] El-Hafeez T. A. (2010), A New System for Extracting and Detecting Skin Color Regions from PDF Documents, International Journal on Computer Science and Engineering, Vol. 02, No. 09, pp. 2838-2846.
[6] Farella E., Benini L., Riccò B. and Acquaviva A. (2007), MOCA: A Low-Power, Low-Cost Motion Capture System Based on Integrated Accelerometers, Advances in Multimedia, vol. 2007, Article ID 82638, pp. 1-11. doi:10.1155/2007/82638. [Online]. Available: http://www.hindawi.com/journals/am/2007/082638/abs/
[7] Gonzalez R. C. and Woods R. E. (2001), Digital Image Processing, Prentice Hall.
[8] Hasan M. M. and Misra P. K. (2011), Brightness factor matching for gesture recognition system using scaled normalization, International Journal of Computer Science & Information Technology, Vol. 3, No. 2, pp. 35-46.
[9] Igorevich R. R., Park P., Choi J. and Min D. (2013), Two Hand Gesture Recognition Using Stereo Camera, International Journal of Computer and Electrical Engineering, Vol. 5, No. 1, pp. 69-72.
[10] OpenCV documentation (2013). [Online]. Available: http://docs.opencv.org/index.html
[11] Rautaray S. S. and Agrawal A. (2012), Real time hand gesture recognition system for dynamic applications, International Journal of UbiComp, Vol. 3, No. 1, pp. 21-31.
[12] Sarifuddin M. and Missaoui R. (2005), A New Uniform Color Space with Associated Color Similarity Measure for Content-Based Image and Video Retrieval, Multimedia Information Retrieval Workshop, 28th annual ACM SIGIR conference on Information Retrieval, Salvador, Brasil, pp. 1-8. [Online]. Available: http://mmis.doc.ic.ac.uk/mmir2005/CameraReadyMissaoui.pdf
[13] Shan C., Tan T. and Wei Y. (2007), Real-time hand tracking using a mean shift embedded particle filter, Pattern Recognition, 40, pp. 1958-1970.
[14] Tsagaris A. and Manitsaris S. (2013), Colour space comparison for skin detection in finger gesture recognition, International Journal of Advances in Engineering & Technology, Vol. 6, Issue 4, pp. 1431-1441.
[15] Wang X., Tarrío P., Bernardos A. M., Metola E. and Casar J. R. (2012), User-independent accelerometer-based gesture recognition for mobile devices, Advances in Distributed Computing and Artificial Intelligence Journal, Vol. 1, No. 3, pp. 11-25. [Online]. Available: http://oa.upm.es/16199/1/INVE_MEM_2012_132745.pdf