
A PROJECT STAGE-I REPORT

ON

“Virtual Mouse Control Using Hand Class Gesture”


SUBMITTED TOWARDS THE PARTIAL FULFILLMENT
OF THE REQUIREMENTS OF DEGREE OF

BACHELOR OF ENGINEERING
IN
ELECTRONICS AND TELECOMMUNICATION
BY
Student Name Exam Seat No
Gaurav Munjewar 71804123C
Geetesh G Kongre 71726431K

UNDER THE GUIDANCE OF


PROF. S. P. DESHMUKH

DEPARTMENT OF ELECTRONICS AND TELECOMMUNICATION


STES’s NBN SINHGAD TECHNICAL INSTITUTES CAMPUS
NBN SINHGAD SCHOOL OF ENGINEERING
AMBEGAON (BK),
PUNE – 411041
2021-22
NBN SINHGAD TECHNICAL INSTITUTES CAMPUS
NBN SINHGAD SCHOOL OF ENGINEERING, PUNE
DEPARTMENT OF ELECTRONICS AND TELECOMMUNICATION ENGINEERING

CERTIFICATE
This is to certify that the Project Stage-I entitled

“Virtual Mouse Control Using Hand Class Gesture”

Submitted by,

Student Name Exam Seat No


Gaurav Munjewar 71804123C
Geetesh G Kongre 71726431K

towards the partial fulfilment of the degree of Bachelor of Engineering in Electronics and
Telecommunication as awarded by the Savitribai Phule Pune University, at NBN Sinhgad
School of Engineering.

Prof. S. P. Deshmukh          Dr. Makarand M. Jadhav          Dr. Shivprasad P. Patil
Guide                         Head of Department              Principal
Virtual Mouse

CONTENTS

Ch. No.   Title

          ACKNOWLEDGEMENT
          ABSTRACT
          CONTENTS

1.        INTRODUCTION
          1.1 Background
          1.2 Objectives
          1.3 Problem Statement
          1.4 Organization of Report

2.        LITERATURE SURVEY
          2.1 Review
          2.2 Summary of Review

3.        PROPOSED METHODOLOGY
          3.1 Introduction
          3.2 Block Diagram
          3.3 Hardware Specifications
          3.4 Software Specifications

4.        HARDWARE IMPLEMENTATION
          4.1 Introduction
          4.2 Working of Blocks

5.        SOFTWARE IMPLEMENTATION
          5.1 Algorithms
          5.2 Flowchart (Working of Modules)

6.        RESULTS & DISCUSSION

7.        CONCLUSION & FUTURE SCOPE

          REFERENCES


CHAPTER 1

INTRODUCTION


1.1 Background
Hand gesture is among the most natural and expressive forms of human communication, and sign languages built on it allow deaf and speech-impaired people to communicate. In this work, a real-time hand gesture system is proposed. The experimental setup uses a fixed, low-cost web camera with high-definition recording, mounted on top of the computer monitor, or a laptop's built-in camera, which captures snapshots in the Red-Green-Blue (RGB) colour space from a fixed distance. The work is divided into four stages: image pre-processing, region extraction, feature extraction, and feature matching. Recognition and interpretation of sign language is one of the major issues in communicating with deaf and speech-impaired people. In this project, an effective hand gesture segmentation technique is proposed based on pre-processing, background subtraction, and edge detection. Pre-processing is the procedure of preparing data for a subsequent process; its main objective is to transform the data into a form that can be processed more effectively and effortlessly. In the proposed work, the pre-processing stage is built from combinations of the following image-processing operations, each discussed in this report: image capture, noise removal, background subtraction, and edge detection.
Over the past few years, as computer technology has continued to develop, people have come to want more compact electronic devices. Human-Computer Interaction (HCI), particularly gesture recognition and object recognition, is becoming increasingly important. In this project, we introduce a method for performing mouse tasks using only a video device. Most of today's cell phones communicate with the user via touch-screen technology, but that technology is still prohibitively expensive to apply to desktop and laptop computers.
Generally, a gesture is a symbol of physical or emotional behaviour, and it includes both body and hand gestures. Gestures can be used for communication between humans and computers. Human-computer interaction (HCI) emerged as a field of study and practice in the early 1980s. The name "virtual mouse" conveys the idea of our project clearly: the virtual mouse establishes a connection between the user and the machine without any dedicated hardware. The gesture recognition system captures and tracks, through a webcam, the fingertips of a person wearing a colour cap; the system detects the cap's colour and movement and moves the cursor along with it.

1.2 Objectives
The main objective of the proposed AI virtual mouse system is to develop an alternative to the traditional mouse for performing and controlling mouse functions. This is achieved with a web camera that captures the hand gestures and the hand tip, then processes these frames to perform particular mouse functions such as left click, right click, and scrolling.

This virtual mouse application uses a simple colour cap on the finger, with no additional hardware, to control the cursor through simple hand gestures. This is done using vision-based hand gesture recognition with input from a webcam.
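As a rough sketch of how such a system turns a detected fingertip position into a cursor position, the mapping below converts camera-frame coordinates to screen coordinates. The frame size, screen size, margin, and function name are illustrative assumptions, not values fixed by the proposed system.

```python
# Sketch: mapping a fingertip position in the camera frame to screen
# coordinates. A margin is clipped off the frame edges so the cursor can
# reach the screen borders without the hand leaving the camera's view.

def frame_to_screen(x, y, frame_w, frame_h, screen_w, screen_h, margin=100):
    """Map (x, y) in camera-frame pixels to screen pixels."""
    # Clamp into the active region, then normalise to 0..1.
    nx = (min(max(x, margin), frame_w - margin) - margin) / (frame_w - 2 * margin)
    ny = (min(max(y, margin), frame_h - margin) - margin) / (frame_h - 2 * margin)
    # Mirror horizontally so the cursor follows the hand like a mirror.
    return round((1 - nx) * screen_w), round(ny * screen_h)

# A fingertip at the frame centre lands at the screen centre.
print(frame_to_screen(320, 240, 640, 480, 1920, 1080))  # (960, 540)
```

In a real application the returned coordinates would be handed to a mouse-control library rather than printed.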


1.3 Problem Statement

• To design a motion-tracking mouse that detects finger movements and gestures in place of a physical mouse.
• To package the application (as an .exe file) with a user-friendly interface that exposes the motion-tracking mouse features.
• The camera should detect the motions of the hand and perform the corresponding mouse operations.
• To implement drag-and-drop and scrolling features in the motion-tracking mouse.
• The user interface must be simple and easy to understand.
• To implement code in which the camera recognises individual finger movements and responds accordingly.

A physical mouse also has inherent drawbacks that motivate this work:

• It is subject to mechanical wear and tear.
• It requires dedicated hardware and a suitable surface to operate.
• It does not adapt easily to different environments, and its performance varies with the environment.
• It offers limited functions even in present operating environments.
• Both wired and wireless mice have a limited lifespan.
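The requirement that each recognised movement trigger a distinct mouse operation can be sketched as a simple dispatch table. The gesture names and logging callbacks below are hypothetical stand-ins; a real implementation would call into a mouse-control library such as pyautogui instead of appending to a list.

```python
# Sketch: dispatching recognised gestures to mouse actions. Each callback
# here only records the action name; in the application it would drive
# the OS cursor.

log = []

actions = {
    "index_up":        lambda: log.append("move"),
    "index_middle_up": lambda: log.append("left_click"),
    "thumb_up":        lambda: log.append("right_click"),
    "fist":            lambda: log.append("drag"),
    "two_finger_roll": lambda: log.append("scroll"),
}

def dispatch(gesture):
    """Run the mouse action bound to a recognised gesture, if any."""
    action = actions.get(gesture)
    if action is not None:
        action()

for g in ["index_up", "index_middle_up", "unknown", "two_finger_roll"]:
    dispatch(g)

print(log)  # ['move', 'left_click', 'scroll']
```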


CHAPTER 2

LITERATURE SURVEY


2.1 Review

Current systems comprise generic mouse and trackpad control; no hand-gesture control system is in common use, and using a hand gesture to operate the monitor screen from a distance is not yet possible. Although such systems have been attempted, the scope of existing work in the virtual-mouse field remains limited.

Existing virtual mouse control systems support simple mouse operations through hand recognition: moving the pointer, left click, right click, drag, and so on. Although a variety of hand-recognition systems exist, most use static hand recognition: they only recognise the shape the hand makes, with a fixed action defined per shape. This limits the system to a few defined actions and causes considerable confusion. As technology advances, more and more alternatives to the mouse are emerging.


Review of Machine-Learning Techniques

For detecting hand gestures and tracking the hand, the MediaPipe framework is used, together with the OpenCV library for computer vision. The algorithm applies machine-learning models to track and recognise the hand gestures and the hand tip.

MediaPipe

MediaPipe is an open-source framework from Google for building machine-learning pipelines. It is useful for cross-platform development and for processing time-series data, and it is multimodal, so it can be applied to diverse audio and video streams. Developers use MediaPipe to build and analyse systems as graphs, and to develop systems for application purposes. The steps of a MediaPipe system are expressed in a pipeline configuration, and the resulting pipeline runs on multiple platforms, scaling from mobile devices to desktops. The framework rests on three fundamental parts: performance evaluation, a framework for retrieving sensor data, and a collection of reusable components called calculators. A pipeline is a graph of calculators connected by streams through which packets of data flow; developers can replace or define custom calculators anywhere in the graph to create their own applications. The calculators and streams together form a data-flow diagram: each node of the MediaPipe graph is a calculator, and the nodes are connected by streams.

A single-shot detector (SSD) model is used by MediaPipe to detect and recognise a hand or palm in real time. The hand-detection module is first trained as a palm detector, because palms are easier to train on; furthermore, non-maximum suppression works significantly better on small, rigid objects such as palms or fists [13]. The hand-landmark model then locates 21 joint (knuckle) coordinates within the detected hand region.
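The non-maximum suppression step mentioned above can be illustrated with a minimal greedy implementation. The box format (x1, y1, x2, y2, score) and the 0.5 IoU threshold are common conventions assumed here, not details taken from MediaPipe.

```python
# Sketch of greedy non-maximum suppression: among overlapping palm
# detections, keep only the highest-scoring box in each cluster.

def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2, score) boxes."""
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, thresh=0.5):
    """Take the best box, then drop everything overlapping it too much."""
    kept = []
    for box in sorted(boxes, key=lambda b: b[4], reverse=True):
        if all(iou(box, k) <= thresh for k in kept):
            kept.append(box)
    return kept

# Two near-duplicate palm boxes and one separate box: the duplicate is dropped.
dets = [(10, 10, 50, 50, 0.9), (12, 12, 52, 52, 0.6), (100, 100, 140, 140, 0.8)]
print([d[4] for d in nms(dets)])  # [0.9, 0.8]
```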

OpenCV

OpenCV is a computer vision library that contains image-processing algorithms for object detection [14]. It provides Python bindings, through which real-time computer vision applications can be developed. The OpenCV library is used for image and video processing and for analysis tasks such as face detection and object detection.
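As a sketch of the kind of per-pixel colour test OpenCV's cv2.inRange performs when isolating the colour cap, the pure-Python stand-in below marks pixels whose HSV values fall inside a target range. The frame data and the bounds are illustrative, not calibrated values from this project.

```python
# Sketch: build a binary mask marking pixels inside an HSV colour range,
# the operation used to isolate the colour cap in each frame.

def in_range(pixel, lower, upper):
    """True if every HSV channel lies within [lower, upper]."""
    return all(lo <= c <= hi for c, lo, hi in zip(pixel, lower, upper))

def mask_frame(frame, lower, upper):
    """Binary mask: 1 where the pixel matches the colour range."""
    return [[1 if in_range(p, lower, upper) else 0 for p in row] for row in frame]

# A 2x2 "frame" with one red-cap pixel (low hue, high saturation and value).
frame = [[(2, 220, 200), (90, 40, 40)],
         [(130, 30, 30), (60, 50, 50)]]
print(mask_frame(frame, lower=(0, 100, 100), upper=(10, 255, 255)))
# [[1, 0], [0, 0]]
```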

2.2 Summary of Review

We created our own dataset to implement a hand gesture recognition (HGR) system for multimedia interaction. Each video stream was recorded for approximately 10 seconds at 30 frames per second and a resolution of 1280×720, using an 8-megapixel digital camera. The experiments were performed in three separate sessions, classified by the distances and places at which the frames were taken from the recorded streams. Each session consists of 300 images across 6 classes, with 50 images per class. Samples of images from the different sessions are used to compute hand gesture recognition accuracy. The whole system was implemented with image-processing and computer-vision techniques in MATLAB 2013 under the Windows 8 operating system. The hardware used for the hand-motion recognition system was a 64-bit PC with a 2.40 GHz processor and 2 GB of RAM.


CHAPTER 3

PROPOSED
METHODOLOGY


3.1 Introduction
As described in Section 1.1, the proposed system is a real-time hand gesture recogniser built around a fixed, low-cost web camera (or a laptop's built-in camera) that captures RGB snapshots from a fixed distance. The work is divided into four stages: image pre-processing, region extraction, feature extraction, and feature matching. The pre-processing stage combines image capture, noise removal, background subtraction, and edge detection to prepare each frame for segmentation. On top of this pipeline, the virtual mouse tracks the fingertips of a user wearing a colour cap and moves the cursor along with the detected colour and movement, so that mouse tasks can be performed without any dedicated hardware.

3.2 Block Diagram

Fig. 3.2(a): Block diagram of the proposed virtual mouse system.


3.3 Hardware Specifications

Table 1. Hardware Specifications

Property              Details

Application platform  Computer desktop or laptop. The desktop or laptop runs the
                      vision software and displays what the webcam has captured.
                      A notebook (a small, lightweight, inexpensive laptop) is
                      proposed to increase the mobility of the application.

Operating conditions  Processor: Core 2 Duo. Main memory: 2 GB RAM (minimum).
                      Hard disk: 512 GB (minimum). Display: 14" monitor (for
                      comfort). The webcam is used for image processing: it
                      continuously captures images so the program can process
                      each frame and find pixel positions.

Display               Computer desktop or laptop monitor


3.4 Software Specifications

Table 2. Software Specifications

Property          Details

Language          Python, which is easy and accurate to use for accessing the
                  camera and tracking hand motion. Python ships with many
                  built-in libraries that keep the code short and easily
                  understandable. Python 3.7 is required to build this
                  application, together with the OpenCV library.

Library           OpenCV (Open Source Computer Vision), a library of
                  programming functions for real-time computer vision. OpenCV
                  can read image pixel values and also supports real-time eye
                  tracking and blink detection.

RAM               2 GB RAM (minimum)


CHAPTER 4

HARDWARE
IMPLEMENTATION


4.1 Introduction
As outlined in Chapter 1, this project controls the mouse through a video device rather than dedicated hardware. The gesture recognition system captures and tracks, through a webcam, the fingertips of a person wearing a colour cap; the system detects the cap's colour and movement and moves the cursor along with it. This chapter describes the working blocks that make this possible.

4.2 Working of Blocks

1) Capturing real-time video using a web camera: The system needs a sensor to detect the user's hand movements, and the computer's webcam serves as that sensor. The webcam records real-time video at a fixed frame rate and resolution determined by its hardware; the system's frame rate and resolution can be changed if necessary.

2) Converting the captured video into HSV format: Each frame is converted to HSV (hue, saturation, value, also called HSB), an alternative representation of the RGB colour model designed by computer-graphics researchers to better reflect how human vision perceives colour attributes.
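The RGB-to-HSV conversion in step 2 can be sketched with the standard library's colorsys module; OpenCV's cv2.cvtColor performs the same per-pixel conversion, though it scales hue to 0-179 rather than to degrees.

```python
# Sketch: per-pixel RGB-to-HSV conversion using the standard library.
import colorsys

def rgb_to_hsv(r, g, b):
    """Convert 0-255 RGB to (hue in degrees, saturation %, value %)."""
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    return round(h * 360), round(s * 100), round(v * 100)

# Pure red: hue 0 degrees, fully saturated, full value.
print(rgb_to_hsv(255, 0, 0))  # (0, 100, 100)
# Pure green sits a third of the way round the hue circle.
print(rgb_to_hsv(0, 255, 0))  # (120, 100, 100)
```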


3) Processing each image frame separately: After capture, the video goes through a brief pre-processing stage and is then processed one frame at a time.

4) Converting each frame to a greyscale image: A greyscale image has lower computational complexity than a colour image, and it aids faster colour calibration without introducing external noise. The necessary operations are carried out after this conversion.

5) Calibrating the colour ranges: The device then enters calibration mode, which assigns each colour its hue, saturation, and value ranges under the HSV rule. Every colour has predetermined values, and the user can adjust the ranges of each colour cap for more accurate colour detection.
6) Calculating the image's centroid: To guide the mouse pointer, a single point must be chosen whose coordinates can be sent to the cursor, so the system locates the detected colour region and computes its centroid. The device monitors cursor movement through these coordinates, which change over time as the object travels around the frame.

7) Tracking the mouse pointer: Once the coordinates are determined, the mouse driver is accessed and the coordinates are sent to the cursor, which positions itself accordingly. As a result, the cursor moves proportionally across the screen as the user moves a hand across the camera's field of view.

8) Simulating mouse actions: In simulation mode, the user makes hand gestures to trigger the control actions. Using colour pointers keeps the computation time low.
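Steps 6 and 7 above can be sketched as follows: compute the centroid of the binary colour mask, then blend it with the previous cursor position so the pointer does not jitter. The mask, the smoothing factor, and the function names are illustrative assumptions.

```python
# Sketch: centroid of the detected colour region (step 6) plus simple
# exponential smoothing of the pointer position (step 7).

def centroid(mask):
    """Mean (x, y) of the 1-pixels in a binary mask (rows of 0/1)."""
    pts = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not pts:
        return None
    return (sum(x for x, _ in pts) / len(pts),
            sum(y for _, y in pts) / len(pts))

def smooth(prev, new, alpha=0.5):
    """Blend the new centroid with the previous cursor position."""
    if prev is None:
        return new
    return (prev[0] + alpha * (new[0] - prev[0]),
            prev[1] + alpha * (new[1] - prev[1]))

mask = [[0, 1, 1],
        [0, 1, 1],
        [0, 0, 0]]
c = centroid(mask)
print(c)                      # (1.5, 0.5)
print(smooth((0.0, 0.0), c))  # (0.75, 0.25)
```

In the real system the smoothed point would be passed to the mouse driver each frame.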


CHAPTER 5

SOFTWARE
IMPLEMENTATION


5.1 Introduction
Python is easy and accurate to use for accessing the camera and tracking hand motion, and it ships with many built-in libraries that keep the code short and easily understandable. Python 3.7 is required to build this application. OpenCV (Open Source Computer Vision), a library of programming functions for real-time computer vision, is also used: it can read image pixel values and supports real-time eye tracking and blink detection. Tkinter, the standard Python interface to the Tk GUI toolkit, is available on most Unix platforms as well as on Windows, and it is used to build the application's user interface.

Fig. 5.1(a): Flowchart of the virtual mouse.


Through this application we want to show that a hand-gesture-based system can stand in for a keyboard- or mouse-based one, so we developed a practical application with hand gesture recognition following the flowchart above. When the user stands in front of the camera, video frames are captured and the hand is detected. As the user interacts with the computer, a deep-learning technique performs the hand gesture recognition internally, so users can easily use hand gestures to interact with and control applications.

CHAPTER 6

RESULTS AND
DISCUSSION


6.1 Results
We compare our results on the hand gesture dataset against other state-of-the-art methods in Table 1. Overall, the deep-learning algorithms showed higher accuracy than the traditional machine-learning model. The 3DCNN model achieves 92.6% accuracy, while SVM and CNN reach 60.50% and 64.28% respectively, which is 32.1 and 28.3 percentage points lower than the 3DCNN result. The results show that the 3DCNN network outperforms SVM and the 2DCNN at video classification on our hand gesture dataset. One reason is that learning across multiple frames gives the 3DCNN a more flexible decision boundary.


In terms of training and testing time, the 3DCNN took longer to train than SVM and the 2DCNN. However, all architectures could recognise a hand gesture in less than 5.29 s, which shows the feasibility of deploying these networks in real-time systems. Table 2 shows the results of hand gesture recognition by our proposed method when using ensemble learning. The ensemble method clearly outperforms the single 3DCNN at video classification: an ensemble of 15 3DCNN models achieves 97.12% accuracy. The experimental results indicate that a 3DCNN ensemble of five models is already substantially more accurate than an individual 3DCNN on this dataset, while increasing the ensemble to 10 or 15 models brings no significant further change. On the other hand, ensembles with many models also require long training times, with correspondingly expensive computation. The confusion matrix of the 15-model 3DCNN ensemble is shown in Table 3. As expected, the highest accuracy was obtained on the easier gestures 'swipe left (SL)', 'swipe right (SR)', and 'meaningless (M)', and the lowest on the harder gestures 'zoom in (ZI)' and 'zoom out (ZO)'. Accuracy was also reduced for the 'drag' gesture because, given its distinct start and end points, it was sometimes confused with other gestures owing to fast hand movement.
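The ensemble result described above rests on majority voting across independently trained models, which can be sketched as follows; the per-model predictions below are made up purely for illustration.

```python
# Sketch: several classifiers vote on one clip; the majority label wins.
from collections import Counter

def majority_vote(predictions):
    """Most common label across an ensemble's predictions."""
    return Counter(predictions).most_common(1)[0][0]

# Three models disagree on a clip; the ensemble recovers the majority.
votes = ["swipe_left", "swipe_left", "zoom_in"]
print(majority_vote(votes))  # swipe_left
```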


6.2 Discussions
This chapter presented a new approach to hand gesture recognition that combines a geometric algorithm with a deep-learning method to perform fingertip detection and gesture recognition. The approach produced highly accurate gesture estimates and proved suitable for practical applications. The proposed method has several advantages: it works well under changing light levels and with complex backgrounds, detects hand gestures accurately at longer distances, and can recognise the hand gestures of multiple people. The experimental results indicate that this is a promising technique for real-time hand-gesture-based interfaces. In future work we intend to extend the system to handle more hand gestures and to apply the method in further practical applications.


CHAPTER 7

CONCLUSION &
FUTURE SCOPE


Conclusion: -
The main objective of the AI virtual mouse system is to control the mouse cursor functions using hand gestures instead of a physical mouse. The proposed system uses a webcam or built-in camera that detects the hand gestures and the hand tip and processes these frames to perform the particular mouse functions.

From the results of the model, we conclude that the proposed AI virtual mouse system performs very well, with greater accuracy than existing models, and that it overcomes most of their limitations. Given this accuracy, the AI virtual mouse can be used in real-world applications; it can also help reduce the spread of COVID-19, since the mouse is operated virtually through hand gestures rather than through a shared physical device. The model still has some limitations, such as a small decrease in accuracy for the right-click function and some difficulty with clicking and dragging to select text. We will work next to overcome these limitations by improving the fingertip detection algorithm to produce more accurate results.

A web camera drives the mouse cursor, which also opens a new level of human-computer interaction (HCI) that requires no physical contact with the device. The machine performs all mouse tasks based on colour recognition, making it useful as a contactless input mode, and it is helpful for people who do not use a touchpad. The proposed architecture would dramatically change how people interact with computers: a webcam is available on virtually every machine, and the system would eliminate the need for a mouse completely. It can also be used in gaming or any other standalone application. Free movement, left click, right click, drag/select, scroll up, and scroll down can all be performed with gestures alone in this multi-functional system. Most comparable applications require additional hardware, which can be quite costly; the goal here was to develop the technology as cheaply as possible on a standard operating system.

Future Scope:-
 In the wake of COVID-19, touching shared devices is unsafe because it may spread the virus. The proposed AI virtual mouse overcomes this problem, since hand gesture and hand-tip detection control the PC's mouse functions through a webcam or built-in camera.

 The proposed AI virtual mouse has some limitations, such as a small decrease in accuracy for the right-click function and some difficulty executing click-and-drag text selection. These limitations will be addressed in our future work.

 Furthermore, the proposed method can be extended to handle keyboard functionality virtually alongside the mouse functionality, which is another future scope of Human-Computer Interaction (HCI).

 The AI virtual mouse, developed with Python and OpenCV and a real-time camera, detects hand landmarks and tracks gesture patterns in place of a physical mouse, and this approach can be developed further.

 The proposed AI virtual mouse system can also be used to overcome real-world problems, such as situations where there is no space to use a physical mouse, and by people with hand impairments who are not able to control a physical mouse.

REFERENCES


[1] Abhik Banerjee, Abhirup Ghosh, Koustuvmoni Bharadwaj, "Mouse Control using a Web Camera based on Colour Detection", IJCTT, vol. 9, Mar. 2014.
[2] Angel, Neethu S., "Real Time Static & Dynamic Hand Gesture Recognition", International Journal of Scientific & Engineering Research, vol. 4, issue 3, Mar. 2013.
[3] Chen-Chiung Hsieh and Dung-Hua Liou, "A Real Time Hand Gesture Recognition System Using Motion History Image", 2010.
[4] Hojoon Park, "A Method for Controlling the Mouse Movement using a Real Time Camera", Department of Computer Science, Brown University, Providence, RI, USA, 2008.
