Virtual Mouse Control Using Hand Gesture

A Project Report submitted towards the partial fulfilment of
BACHELOR OF ENGINEERING
IN
ELECTRONICS AND TELECOMMUNICATION
BY
Student Name Exam Seat No
Gaurav Munjewar 71804123C
Geetesh G Kongre 71726431K
CERTIFICATE
This is to certify that the Project Stage-I entitled "Virtual Mouse Control Using Hand
Gesture", submitted by Gaurav Munjewar (71804123C) and Geetesh G Kongre (71726431K),
is towards the partial fulfilment of the degree of Bachelor of Engineering in Electronics and
Telecommunication as awarded by the Savitribai Phule Pune University, at NBN Sinhgad
School of Engineering.
CONTENTS
ACKNOWLEDGEMENT i
ABSTRACT ii
Contents iii
1. INTRODUCTION 1
1.1 Background 2
1.2 Objectives 2
1.3 Problem Statement 4
1.4 Organization of Report 6
2. LITERATURE SURVEY 7
2.1 Review 8
2.2 Summary of Review 9
3. PROPOSED METHODOLOGY 18
3.1 Introduction 19
3.2 Block Diagram 19
3.3 Hardware Specifications 20
3.4 Software Specifications 21
4. HARDWARE IMPLEMENTATION 23
4.1 Introduction 24
4.2 Working Blocks 25
5. SOFTWARE IMPLEMENTATION 26
5.1 Algorithms 27
5.2 Flowchart (Working of Modules) 27
6. RESULTS AND DISCUSSION 23
6.1 Results 23
6.2 Discussions 25
CONCLUSION & FUTURE SCOPE 27
REFERENCES 30
CHAPTER 1
INTRODUCTION
1.1 Background
The most efficient and expressive way of human communication is through hand gestures,
which are a universally accepted language, expressive enough that deaf and mute people can
understand them. In this work, a real-time hand gesture system is proposed. The experimental
setup uses a fixed-position, low-cost web camera with high-definition recording, mounted on
top of the computer monitor or built into a laptop, which captures snapshots in the
Red-Green-Blue (RGB) colour space from a fixed distance. The work is divided into four
stages: image pre-processing, region extraction, feature extraction, and feature matching.
Recognition and interpretation of sign language is one of the major issues in communicating
with deaf and mute people. In this project, an effective hand gesture segmentation technique
is proposed based on pre-processing, background subtraction, and edge detection.
Pre-processing is the procedure of preparing data for another process; its main objective is to
transform the data into a form that can be processed more effectively and effortlessly. In the
proposed work, the pre-processing techniques are built from different combinations of hand
gesture image processing operations, namely image capture, noise removal, background
subtraction, and edge detection, and these image processing methods are discussed below.
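The sketch below illustrates the pre-processing and segmentation steps described above
(noise removal, background subtraction, and edge detection), assuming Python with OpenCV;
the exact operators and parameters used in the project may differ.

import cv2

cap = cv2.VideoCapture(0)                        # fixed webcam mounted above the monitor
back_sub = cv2.createBackgroundSubtractorMOG2()  # background subtraction model

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)   # work on intensity only
    blurred = cv2.GaussianBlur(gray, (5, 5), 0)      # remove sensor noise
    fg_mask = back_sub.apply(blurred)                # foreground (hand) region
    edges = cv2.Canny(fg_mask, 100, 200)             # edge detection on the hand region
    cv2.imshow("edges", edges)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()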
Within the past few years, as computer technology has continued to develop, people have
wanted more compact electronic devices, and Human-Computer Interaction (HCI),
particularly gesture recognition and object recognition, has become increasingly important.
In our project, we introduce a method for performing mouse tasks using a video device. In
today's world, most cell phones communicate with the user via touch-screen technology;
however, this technology is still prohibitively expensive for use on desktop and laptop
computers.
Generally, a gesture is a symbol of physical or emotional behaviour and includes both body
and hand gestures. Gestures can be used to communicate between humans and computers.
Human-computer interaction (HCI) began in the early 1980s as a field of study and practice.
The name "virtual mouse" conveys a clear idea of our project: the virtual mouse establishes a
virtual connection between the user and the machine without the use of any hardware. This
gesture recognition system can capture and track the
fingertips of a person wearing a colour cap using a webcam; the system detects the colour
and movement of the hand and moves the cursor along with it.
1.2 Objectives
The main objective of the proposed AI virtual mouse system is to develop an alternative to
the regular, traditional mouse for performing and controlling mouse functions. This is
achieved with a web camera that captures the hand gestures and hand tip and then processes
these frames to perform particular mouse functions such as left click, right click, and
scrolling.
This Virtual Mouse Hand Recognition application uses a simple colour cap on the finger,
without any additional hardware, to control the cursor through simple gestures and hand
movement. This is done using vision-based hand gesture recognition with input from a
webcam.
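As an illustration of how recognised gestures could be mapped to the mouse functions listed
above, the following is a minimal sketch assuming Python with the pyautogui library; the
gesture names and the mapping are placeholders, not the exact ones used in the project.

import pyautogui  # cross-platform mouse control

# Hypothetical mapping from a recognised gesture label to a mouse action.
def perform_mouse_action(gesture, x=None, y=None):
    if gesture == "move" and x is not None and y is not None:
        pyautogui.moveTo(x, y)           # move the cursor to screen coordinates
    elif gesture == "left_click":
        pyautogui.click()                # left click
    elif gesture == "right_click":
        pyautogui.click(button="right")  # right click
    elif gesture == "scroll_up":
        pyautogui.scroll(100)            # positive value scrolls up
    elif gesture == "scroll_down":
        pyautogui.scroll(-100)

# Example usage: move the cursor and left-click.
perform_mouse_action("move", 400, 300)
perform_mouse_action("left_click")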
CHAPTER 2
LITERATURE SURVEY
The current system consists of a generic mouse and trackpad for controlling the monitor,
with no hand gesture control; using a hand gesture to operate the screen from a distance is
not possible. Even where such control has been attempted, its scope in the virtual mouse
field remains limited.
The existing virtual mouse control systems offer simple mouse operations through hand
recognition, with which we can control the mouse pointer, left click, right click, drag, and so
on. Although a variety of systems for hand recognition exist, most of them use static hand
recognition, which simply recognizes the shape made by the hand and assigns an action to
each shape; this limits the system to a few defined actions and causes a lot of confusion. As
technology advances, there are more and more alternatives to using a mouse.
For the detection of hand gestures and hand tracking, the MediaPipe framework is used, and
the OpenCV library is used for computer vision. The algorithm applies machine learning
concepts to track and recognize the hand gestures and the hand tip.
MediaPipe
MediaPipe uses a single-shot detector (SSD) model for detecting and recognizing a hand or
palm in real time. In the hand detection module, a palm detection model is trained first,
because estimating the bounding box of a rigid object such as a palm or fist is simpler than
detecting a full hand with articulated fingers.
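A minimal sketch of hand landmark tracking with MediaPipe and OpenCV is shown below;
it assumes the mediapipe and opencv-python packages and is not the project's exact
implementation.

import cv2
import mediapipe as mp

hands = mp.solutions.hands.Hands(max_num_hands=1, min_detection_confidence=0.7)
cap = cv2.VideoCapture(0)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)    # MediaPipe expects RGB input
    results = hands.process(rgb)
    if results.multi_hand_landmarks:
        landmarks = results.multi_hand_landmarks[0].landmark
        tip = landmarks[8]                          # index finger tip landmark
        h, w, _ = frame.shape
        cv2.circle(frame, (int(tip.x * w), int(tip.y * h)), 8, (0, 255, 0), -1)
    cv2.imshow("hand tracking", frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()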
OpenCV
We created our own dataset to implement a hand gesture recognition (HGR) system for
multimedia interaction. Each video stream was recorded for approximately 10 seconds at 30
frames per second and a resolution of 1280×720, using an 8-megapixel digital camera. The
experiments were performed in three different sessions, classified according to the distances
and places at which the images (i.e., frames) were taken from the recorded video streams.
Each session consists of 300 images from 6 different classes, with 50 images per class.
Samples of images from the different sessions are used to calculate the hand gesture
recognition accuracy. The whole system was designed using image processing and computer
vision techniques implemented in MATLAB 2013 under the Windows 8 operating system.
The hardware used for the hand motion recognition system was a 64-bit PC with a 2.40 GHz
processor and 2 GB of RAM.
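The dataset above was built in MATLAB; the following is an equivalent sketch in
Python/OpenCV of extracting frames from a recorded ~10 s, 30 fps gesture clip into per-class
image files. The file names and folder layout are illustrative only.

import cv2
import os

def extract_frames(video_path, out_dir, max_frames=50):
    """Save up to max_frames frames of a recorded gesture video as images."""
    os.makedirs(out_dir, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    count = 0
    while count < max_frames:
        ok, frame = cap.read()
        if not ok:
            break
        cv2.imwrite(os.path.join(out_dir, f"frame_{count:03d}.png"), frame)
        count += 1
    cap.release()
    return count

# Example usage (illustrative paths): one ~10 s clip per gesture class.
extract_frames("session1/swipe_left.mp4", "dataset/session1/swipe_left")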
CHAPTER 3
PROPOSED METHODOLOGY
3.1 Introduction
The proposed system follows the real-time hand gesture pipeline introduced in Section 1.1: a
fixed, low-cost web camera (or the built-in camera of a laptop) captures RGB snapshots from
a fixed distance, and each frame passes through four stages, namely image pre-processing
(noise removal, background subtraction, and edge detection), region extraction, feature
extraction, and feature matching. The fingertips of a hand wearing a colour cap are then
detected and tracked, and the mouse cursor is moved along with them without the use of any
additional hardware. This chapter presents the block diagram of the proposed system
together with its hardware and software specifications.
3.2 Block Diagram
Fig. 3.2(a)
CHAPTER 4
HARDWARE IMPLEMENTATION
4.1 Introduction
As described in Chapter 1, the proposed system replaces the physical mouse with a gesture
recognition pipeline: a webcam captures and tracks the fingertips of a hand wearing a colour
cap, and the detected colour and movement of the hand are used to move the cursor. This
section describes the working blocks of the pipeline, from video capture to the simulation of
mouse actions; a short code sketch illustrating the complete pipeline follows the steps below.
1) Capturing real-time video using a web camera: the system needs a sensor to detect the
user's hand movements, and the computer's webcam is used as that sensor. The webcam
records real-time video at a fixed frame rate and resolution determined by the camera
hardware; the frame rate and resolution used by the system can be changed if necessary.
2) Converting the captured video into HSV format: each frame is converted to HSV (hue,
saturation, value, also called HSB), an alternative representation of the RGB colour model
created by computer graphics researchers to better reflect how human vision perceives
colour attributes.
3) Processing each image frame separately: after the video is captured, it goes through a
brief pre-processing stage and is then processed one frame at a time.
5) Calibrating the colour ranges: the device then enters calibration mode, which assigns to
each colour its hue, saturation, and value ranges according to the HSV rule. Every colour has
predetermined values, and the user can adjust the ranges for more accurate colour detection;
the ranges of values used to detect each colour cap are shown in the accompanying diagram.
6) Calculating the centroid of the detected region: to guide the mouse pointer, a point must
be chosen whose coordinates can be sent to the cursor. The device monitors cursor
movement using these coordinates, which change over time as the object travels around the
frame.
7) Tracking the mouse pointer: after the coordinates are determined, the mouse driver is
accessed and the coordinates are sent to the cursor. The cursor positions itself at the required
location using these coordinates, so the mouse moves proportionally across the screen as the
user moves a hand across the camera's field of view.
8) Simulating mouse actions: in simulation mode, the user makes hand gestures to trigger the
control actions. The computation time is reduced by the use of colour pointers.
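The following is a minimal sketch of the colour-cap pipeline described in the steps above,
assuming Python with OpenCV and pyautogui; the HSV range shown is an illustrative
calibration for a red cap, not the project's exact values.

import cv2
import numpy as np
import pyautogui

screen_w, screen_h = pyautogui.size()       # target screen resolution
lower_hsv = np.array([0, 120, 120])         # illustrative lower bound for a red cap
upper_hsv = np.array([10, 255, 255])        # illustrative upper bound for a red cap

cap = cv2.VideoCapture(0)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)    # step 2: BGR -> HSV
    mask = cv2.inRange(hsv, lower_hsv, upper_hsv)   # step 5: threshold the calibrated range
    moments = cv2.moments(mask)
    if moments["m00"] > 0:                          # step 6: centroid of the detected region
        cx = moments["m10"] / moments["m00"]
        cy = moments["m01"] / moments["m00"]
        frame_h, frame_w = mask.shape
        x = screen_w * cx / frame_w                 # step 7: map frame coords to screen coords
        y = screen_h * cy / frame_h
        pyautogui.moveTo(x, y)                      # move the cursor along with the cap
    cv2.imshow("mask", mask)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()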
CHAPTER 5
SOFTWARE IMPLEMENTATION
5.1 Introduction
Python is easy and accurate to use for accessing the camera and tracking all hand motion,
and it comes with many built-in libraries that keep the code short and easily understandable.
The Python version required for building this application is 3.7. OpenCV library: OpenCV
(Open Source Computer Vision) is a library of programming functions for real-time
computer vision and is also used in this program; it can read image pixel values and can
support features such as real-time eye tracking and blink detection. Tkinter: the tkinter
package is the standard Python interface to the Tk GUI toolkit. Both Tk and tkinter are
available on most Unix platforms as well as on Windows systems, and Tkinter is used to
build the UI of the application.
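A minimal sketch of a Tkinter front end that starts the gesture-tracking loop is shown below;
the window title, button label, and start_tracking callback are illustrative placeholders rather
than the project's actual UI.

import tkinter as tk

def start_tracking():
    # Placeholder: the actual gesture-tracking loop (webcam + OpenCV) would run here.
    status.config(text="Tracking started...")

root = tk.Tk()
root.title("Virtual Mouse")        # illustrative window title
root.geometry("300x120")

tk.Button(root, text="Start Virtual Mouse", command=start_tracking).pack(pady=10)
status = tk.Label(root, text="Idle")
status.pack()

root.mainloop()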
Through this application we want to show that a hand gesture-based system can stand in for
a keyboard- or mouse-based system. We therefore developed a practical application with
hand gesture recognition, as shown in Figure 12. When the user stands in front of the RGB-D
camera, video frames are captured and the hand is detected. When the user interacts with the
computer, a deep learning technique is used internally for hand gesture recognition, so users
can easily use hand gestures to interact with and control applications.
CHAPTER 6
RESULTS AND DISCUSSION
6.1 Results
We compare our results on the hand gesture dataset against other state-of-the-art methods in
Table 1. Overall, the deep learning (DL) algorithms showed higher accuracy than the
traditional machine learning (ML) model. The 3DCNN model achieves 92.6% accuracy,
while SVM and CNN reach 60.50% and 64.28% respectively, 32.1 and 28.3 percentage
points lower than the 3DCNN result. The results show that the 3DCNN network outperforms
SVM and 2DCNN in video classification on our hand gesture dataset; one reason is that
learning from multiple frames gives the 3DCNN a more flexible decision boundary.
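As an illustration of the kind of model compared above, the following is a minimal sketch of
a 3D CNN video classifier assuming TensorFlow/Keras; the layer sizes, input shape, and
number of classes are placeholders rather than the architecture actually evaluated.

import tensorflow as tf
from tensorflow.keras import layers, models

def build_3dcnn(num_classes=6, frames=30, height=64, width=64):
    """Small 3D CNN that classifies a short video clip of a hand gesture."""
    model = models.Sequential([
        layers.Input(shape=(frames, height, width, 3)),       # (time, H, W, channels)
        layers.Conv3D(16, kernel_size=3, activation="relu"),  # spatio-temporal features
        layers.MaxPooling3D(pool_size=2),
        layers.Conv3D(32, kernel_size=3, activation="relu"),
        layers.MaxPooling3D(pool_size=2),
        layers.Flatten(),
        layers.Dense(64, activation="relu"),
        layers.Dense(num_classes, activation="softmax"),      # one score per gesture class
    ])
    model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
    return model

model = build_3dcnn()
model.summary()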
In terms of the training and testing time of the convolutional neural networks, the 3DCNN
required a longer run-time than SVM and 2DCNN. However, all architectures could
recognize a hand gesture in less than 5.29 s, which shows the feasibility of implementing
neural networks in real-time systems. Table 2 shows the results of hand gesture recognition
by our proposed method when using ensemble learning. The ensemble learning method
clearly outperforms the single 3DCNN in video classification: an ensemble of 15 3DCNN
models achieves 97.12% accuracy. The experimental results indicate that a 3DCNN
ensemble with five models is already substantially more accurate than an individual 3DCNN
on the dataset, whereas increasing the number of models to 10 or 15 brings no significant
change. On the other hand, ensemble learning with many models also requires a long
training time, which translates into a high computational cost. The confusion matrix of the
ensemble of 15 3DCNN models is shown in Table 3. As expected, the highest accuracy was
obtained for the easier gestures 'swipe left (SL)', 'swipe right (SR)', and 'meaningless (M)',
while the lowest accuracy was obtained for the harder gestures 'zoom in (ZI)' and 'zoom out
(ZO)'. The accuracy was also reduced for the 'drag' gesture because, with its starting point
and ending point, the gesture was sometimes confused with others owing to fast hand
movement.
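The ensemble result above is obtained by combining several independently trained 3DCNNs;
a minimal sketch of averaging the softmax outputs of multiple Keras models (a common
ensembling scheme, assumed here rather than taken from the experiments) is shown below.

import numpy as np

def ensemble_predict(models, video_batch):
    """Average the softmax outputs of several trained models and pick the best class."""
    probs = np.mean([m.predict(video_batch) for m in models], axis=0)
    return np.argmax(probs, axis=1)   # predicted gesture class per clip

# Example usage (illustrative): several independently trained 3DCNNs built as above.
# models = [build_3dcnn() for _ in range(5)]
# labels = ensemble_predict(models, test_videos)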
6.2 Discussions
This work presented a new approach to hand gesture recognition that combines a geometric
algorithm with a deep learning method to perform fingertip detection and gesture
recognition. The approach produced not only highly accurate gesture estimates but also
proved suitable for practical applications. The proposed method has many advantages: it
works well under changing light levels and with complex backgrounds, it detects hand
gestures accurately at a longer distance, and it recognizes the hand gestures of multiple
people. The experimental results indicate that this approach is a promising technique for
real-time hand-gesture-based interfaces. For future work, we intend to extend the system to
handle more hand gestures and to apply our method to other practical applications.
CONCLUSION & FUTURE SCOPE
Conclusion:
The main objective of the AI virtual mouse system is to control the mouse cursor functions
by using hand gestures instead of a physical mouse. The proposed system uses a webcam or a
built-in camera that detects the hand gestures and hand tip and processes these frames to
perform the particular mouse functions.
From the results of the model, we can conclude that the proposed AI virtual mouse system
performs very well and has higher accuracy than the existing models, and that it overcomes
most of their limitations. Because of this higher accuracy, the AI virtual mouse can be used
for real-world applications; it can also help reduce the spread of COVID-19, since the mouse
functions can be performed virtually using hand gestures without touching a traditional
physical mouse. The model has some limitations, such as a small decrease in accuracy of the
right-click function and some difficulty in clicking and dragging to select text. We will
therefore work next to overcome these limitations by improving the fingertip detection
algorithm to produce more accurate results.
In this system, a web camera drives the mouse cursor, which leads to a new level of
human-computer interaction (HCI) that does not require physical contact with the device.
The machine can perform all mouse tasks based on colour recognition, making it useful as a
contactless input mode and helpful for people who do not use a touchpad. The proposed
architecture could dramatically change the way people interact with computers: it works
with any webcam and would eliminate the need for a mouse completely, and it can also be
used in gaming or any other independent application. Free movement, left click, right click,
drag/select, scroll up, and scroll down can all be performed using only gestures in this
multi-functional system. Most comparable applications require additional hardware, which
can be quite costly; the goal was to develop this technology as cheaply as possible while
using a standard operating system.
Future Scope:
After the COVID-19 situation, it is not safe to use devices by touching them, because
touching shared devices can spread the virus; the proposed AI virtual mouse can be used to
overcome this problem, since hand gesture and hand tip detection controls the PC mouse
functions through a webcam or a built-in camera.
The proposed AI virtual mouse has some limitations, such as a small decrease in accuracy of
the right-click function, and it has some difficulty in executing clicking and dragging to
select text; these limitations will be overcome in future work. Furthermore, the proposed
method can be extended to handle keyboard functionalities along with the mouse
functionalities virtually, which is another future scope of Human-Computer Interaction
(HCI).
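As a pointer to this future work, the following is a minimal sketch of how keyboard
functions could be simulated with pyautogui once a gesture is recognised; the gesture names
and key bindings are illustrative placeholders only.

import pyautogui

# Hypothetical mapping from recognised gestures to keyboard actions (future work).
def perform_keyboard_action(gesture):
    if gesture == "swipe_left":
        pyautogui.press("left")         # single key press (arrow key)
    elif gesture == "swipe_right":
        pyautogui.press("right")
    elif gesture == "open_palm":
        pyautogui.hotkey("ctrl", "s")   # key combination, e.g. save

perform_keyboard_action("swipe_right")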
The proposed system is also helpful for persons who have problems with their hands and are
not able to control a physical mouse.
REFERENCES
[1] Abhik Banerjee, Abhirup Ghosh, Koustuvmoni Bharadwaj, "Mouse Control using a Web
Camera based on Colour Detection", IJCTT, vol. 9, March 2014.
[2] Angel, Neethu S., "Real Time Static & Dynamic Hand Gesture Recognition",
International Journal of Scientific & Engineering Research, Volume 4, Issue 3, March 2013.
[3] Chen-Chiung Hsieh and Dung-Hua Liou, "A Real Time Hand Gesture Recognition
System Using Motion History Image", 2010.
[4] Hojoon Park, "A Method for Controlling the Mouse Movement using a Real Time
Camera", Brown University, Providence, RI, USA, Department of Computer Science, 2008.