PROJECT REPORT ON
Sign Language Recognition and Interpretation Using Python
A Report
Submitted By
Aavash Koirala (304, 12, MG)
Hritik Shrestha (316, 12, MG)
Anurag Gaida (309, 12, MG)
Kiran Adhikari (320, 12, MG)
Email id: [email protected]
[email protected]
[email protected]
Submitted To
Department of Computer Science
Khwopa Secondary School
ACKNOWLEDGEMENT
As partial fulfillment of the requirements of the Plus Two (Computer Science) degree under
Khwopa Secondary School, second-year students are required to prepare a field report, and we have
chosen Sign Language Recognition and Interpretation. Apart from the efforts of the team, the success of
any project depends largely on the encouragement and guidance of many others. We take this
opportunity to express our gratitude to the teachers who have been instrumental in the
successful completion of this project. Special thanks go to our team members for their hard work and
determination.
We are eternally grateful to our teachers UMESH DUWAL and RABIN PHASIKAWA of the
Department of Computer Science for their willingness to give us valuable advice and direction,
under which we executed this project. Your invaluable contributions have played a vital role in
bringing this project to this point. Thank you for your commitment and assistance.
DEPARTMENT OF COMPUTER SCIENCE
APPROVAL SHEET
We, the undersigned, have examined the field report entitled Sign Language Recognition and
Interpretation Using Python, submitted by AAVASH KOIRALA, HRITIK SHRESTHA, ANURAG GAIDA, and
KIRAN ADHIKARI, candidates for the degree of Plus Two (Computer Science). We hereby approve
the report and recommend its acceptance.
(Signature……………….…)
Er. Rabin Phasikawa
Designation: Project Supervisor
Department of Computer Science
Khwopa Secondary School
Dekocha-06, Bhaktapur
(Signature………………….……)
Umesh Duwal
Designation: Head of Department
Department of Computer Science
Khwopa Secondary School,
Dekocha-06, Bhaktapur
CERTIFICATE OF AUTHORSHIP
I certify that the work in this report has not previously been submitted for a degree, nor has it been
submitted as part of the requirements for a degree, except as fully acknowledged within the text. I also
certify that the report has been written by me. Any help that I have received in my research work and
the preparation of the report itself has been acknowledged. In addition, I certify that all information
sources and literature used are indicated in the report.
This report has not been submitted to any other organization/institution for the award of any degree.
(Signature………..……...)
AAVASH KOIRALA
(Grade-12, Sec: -MG, Roll no: -304)
(Signature…………..…...)
HRITIK SHRESTHA
(Grade-12, Sec-MG, Roll-316)
(Signature………………...)
ANURAG GAIDA
(Grade-12, Sec: -MG, Roll no: -309)
(Signature………..……...)
KIRAN ADHIKARI
(Grade-12, Sec: -MG, Roll no: -320)
ABSTRACT
This project focuses on the development of a Sign Language Recognition and Interpretation
system using Python. The objective is to bridge communication gaps between the hearing-
impaired community and the broader population by creating a robust and efficient solution.
Leveraging computer vision techniques and machine learning algorithms, the system
interprets sign language gestures captured through a camera and translates them into text.
The work involves building a sign language dataset, employing real-time gesture
recognition, and integrating natural language processing for accurate interpretation. The
outcomes of this project aim to enhance accessibility, promote inclusivity, and provide a
technological approach to effective communication for the hearing-impaired.
The project presented here represents the impactful harmony between technology and inclusivity.
By using the power of Python, computer vision, and deep learning, we have developed a
system that recognizes sign language gestures and interprets them in real time, fostering
effective communication between individuals with hearing impairments and the broader
community. The user-friendly interface and robust model underscore the potential of
technology to break down barriers and create a more inclusive society. As we continue to
innovate, this project stands as proof of the positive impact that thoughtful and accessible
technology solutions can have on the lives of individuals with diverse communication needs.
TABLE OF CONTENTS
ACKNOWLEDGEMENT..............................................................................................................i
APPROVAL SHEET....................................................................................................................ii
CERTIFICATE OF AUTHORSHIP............................................................................................iii
ABSTRACT.................................................................................................................................iv
TABLE OF CONTENTS..............................................................................................................v
CHAPTER 1..................................................................................................................................1
        INTRODUCTION.............................................................................................................1
        1.1 OBJECTIVES..............................................................................................................2
CHAPTER 2..................................................................................................................................3
        IMPORTANCE.................................................................................................................3
        DISADVANTAGES.........................................................................................................4
CHAPTER 3..................................................................................................................................5
        PRODUCT PERSPECTIVE.............................................................................................5
CHAPTER 4..................................................................................................................................7
        DATASET FLOW............................................................................................................7
CHAPTER 5..................................................................................................................................8
        DATASETS......................................................................................................................8
CHAPTER 6..................................................................................................................................9
CHAPTER 7................................................................................................................................10
        SYSTEM ARCHITECTURE.........................................................................................10
CHAPTER 8................................................................................................................................11
        FEATURES AND FUNCTIONALITIES......................................................................11
CHAPTER 9................................................................................................................................12
        RESEARCH METHODOLOGY...................................................................................12
CHAPTER 10..............................................................................................................................13
CHAPTER 11..............................................................................................................................19
        CONCLUSION...............................................................................................................19
REFERENCES............................................................................................................................20
CODING......................................................................................................................................22
EXPECTED OUTPUT................................................................................................................34
LIST OF FIGURES
Figure 1.1: Charts.........................................................................................................................1
CHAPTER 1
INTRODUCTION
Sign language, a rich and expressive form of communication used primarily
by the Deaf and Hard of Hearing community, serves as a bridge for those who navigate a world
without sound. Despite its significance, there exists a communication gap between individuals
proficient in sign language and those who are not. Recognizing the importance of inclusive
communication, this project delves into the development of a Sign Language Recognition and
Interpretation system using Python. Through the combination of computer vision and machine
learning techniques, we aim to create a powerful tool that not only recognizes diverse sign language
gestures but also interprets them, thereby facilitating unified communication between individuals
with hearing impairments and the broader community.
Python, with its versatility and extensive libraries, serves as an ideal programming language
for this project. The project involves the use of computer vision techniques to capture and
process video input, extracting meaningful features from sign language gestures. Additionally, the
implementation of machine learning algorithms, particularly deep learning models, plays a critical
role in training the system to accurately recognize and interpret these gestures. The inherent
flexibility of Python allows for seamless integration of these components, resulting in a solid and
efficient system.
The scope of this project extends beyond the technical aspects, highlighting the social
impact and broader implications of raising inclusivity. Effective communication is
fundamental to human connection and understanding, and this project aims to empower individuals
with hearing impairments by providing them with a tool that enables them to express themselves
freely and interact with others without barriers. Moreover, by creating a system that can interpret
sign language in real time, we envision a future where communication is not only accessible but also
rapid, promoting a more inclusive and understanding society.
1.1 OBJECTIVES
Here are the main objectives of building a Sign Language Recognition and Interpretation system:
✓ To develop a robust system for recognizing and interpreting sign language gestures
accurately.
✓ To ensure the system can process video input in real-time to provide timely interpretations.
✓ To integrate a user-friendly interface for seamless interaction and display of interpreted sign
language.
✓ To design the system to be accessible to users with varying abilities and accommodate
different sign languages.
CHAPTER 2
IMPORTANCE
The Sign Language Recognition and Interpretation project using Python holds
immense importance in promoting inclusivity and breaking communication barriers for individuals
with hearing impairments. Here are key points highlighting its significance:
1. Accessibility Enhancement:
- The project plays a key role in making information accessible to the deaf and hard-of-hearing
community, allowing them to engage in real-time conversations without relying solely on traditional
sign language interpreters.
2. Facilitating Education:
- In educational settings, the system can bridge communication gaps between students with hearing
impairments and their peers and teachers, facilitating a more inclusive learning environment.
3. Technological Innovation:
- This project demonstrates the potential of technology to create innovative solutions for
individuals with various communication needs, paving the way for further advancements in assistive
technologies that cater to a broad range of disabilities.
4. Community Integration:
- By facilitating communication between individuals with hearing impairments and the broader
community, the project promotes understanding, empathy, and integration, fostering a society that
embraces diversity.
DISADVANTAGES
1. Limited Accuracy:
- Achieving high accuracy in recognizing a wide range of sign language gestures can be
challenging. Variability in signing styles, lighting conditions, and the complexity of certain gestures
may lead to inaccuracies.
2. Limited Vocabulary:
- Depending on the system's design and complexity, there may be limitations in recognizing a vast
vocabulary of sign language gestures. This can be a challenge for users who rely on a broader
range of expressions.
3. Cultural Sensitivity:
- Sign language can vary across cultures and regions. A system designed for one sign language
may not be applicable or effective for others, necessitating adaptation or customization for different
user groups.
4. Loss of Physical Nuance:
- Traditional sign language involves physical nuances and expressions that may be challenging to
capture accurately in a digital format. The loss of these subtleties can impact the richness of
communication.
CHAPTER 3
PRODUCT PERSPECTIVE
The Sign Language Recognition and Interpretation system operates within the
broader context of assistive technologies, aiming to enhance communication for individuals with
hearing impairments. As a component of the assistive technology landscape, its product perspective
is rooted in inclusivity, accessibility, and technological innovation. Integrated seamlessly into users'
daily lives, the system serves as a bridge between the deaf and hearing communities, fostering a
more inclusive society. Its design considers the user's perspective, offering a user-friendly interface
that prioritizes ease of use, real-time interpretation, and customization options. The product aligns
with the evolving landscape of assistive technologies, contributing to a future where technology
plays a vital role in breaking down communication barriers and empowering individuals with diverse
needs.
❖ User Interfaces
▪ The interface follows specific interaction protocols to handle user inputs, gestures, or
commands effectively. This involves defining how the system interprets different user
actions and responds accordingly.
▪ Real-time interpretation results are displayed prominently, ensuring immediate feedback for
users.
❖ Hardware Interfaces
▪ Laptop/Desktop/PC
▪ The system typically relies on a camera interface to capture video input, either from a
webcam or an external camera. High-quality cameras with good resolution and frame rates
enhance the accuracy of gesture recognition.
▪ The output of the system, such as real-time interpretation, textual feedback, or graphical
elements, is presented through monitors, touchscreens, or other visual output devices.
3.1.2 System Requirements
❖ Hardware requirement
▪ Minimum 8 GB RAM for efficient handling of datasets and real-time processing
▪ High-resolution display (Full HD or higher) for clear presentation of interpreted
gestures.
▪ Multi-core processor (e.g., Intel Core i5 or equivalent).
▪ Built-in or external microphone for clear audio input (if speech synthesis or recognition
is included).
❖ Software requirement
▪ Windows 10, MacOS, or Linux.
▪ OpenCV.
▪ Visual Studio Code or Atom.
CHAPTER 4
DATASET FLOW
Data Collection → Data Preprocessing → Train/Test Split → Augmentation → Model Training →
Model Evaluation → Model Deployment → Hand Tracking → Sign Interpretation
Figure 2: Dataset Flow Diagram
Creating a dataset flow for sign language recognition involves
multiple steps, including data collection, preprocessing, and splitting into training and testing sets.
The steps of the simplified dataset flow diagram are described below, followed by a minimal code
sketch of the training-related steps:
1. Data Collection:
- Collect a dataset of sign language gestures. Ensure that it is labeled with the corresponding signs.
2. Data Preprocessing:
- Preprocess the images to make them suitable for training. This may include resizing,
normalization, and other image preprocessing techniques.
3. Train/Test Split:
- Divide the dataset into training and testing sets to evaluate the model's performance accurately.
4. Augmentation:
- Optionally, perform data augmentation to increase the diversity of the training set. This may
include random rotations, flips, and other transformations.
5. Model Training:
- Train the Convolutional Neural Network (CNN) using the preprocessed and augmented dataset.
6. Model Evaluation:
- Evaluate the trained model on the testing set to assess its accuracy and generalization
performance.
7. Model Deployment:
- Deploy the trained model so that it can be used for real-time recognition.
8. Hand Tracking:
- Implement hand tracking using a library like OpenCV or MediaPipe to detect and track the user's
hand in real time.
9. Sign Interpretation:
- Based on the detected hand landmarks, map them to the corresponding sign and provide an
interpretation.
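As a concrete illustration of steps 3 to 6, the following is a minimal sketch using TensorFlow/Keras.
The directory layout ("Data/<label>/*.jpg"), the network shape, the 26-class output, and all
hyperparameters are illustrative assumptions rather than the exact configuration used in this project:

import tensorflow as tf

IMG_SIZE = (200, 200)

# Step 3: split one labeled image folder into training and testing sets.
train_ds = tf.keras.utils.image_dataset_from_directory(
    "Data", validation_split=0.2, subset="training", seed=42,
    image_size=IMG_SIZE, batch_size=32)
test_ds = tf.keras.utils.image_dataset_from_directory(
    "Data", validation_split=0.2, subset="validation", seed=42,
    image_size=IMG_SIZE, batch_size=32)

# Step 4: optional augmentation (small random rotations and zooms),
# active only during training.
augment = tf.keras.Sequential([
    tf.keras.layers.RandomRotation(0.1),
    tf.keras.layers.RandomZoom(0.1),
])

# Step 5: a small CNN; pixels are rescaled from 0-255 to 0-1 inside the model.
model = tf.keras.Sequential([
    tf.keras.layers.Rescaling(1.0 / 255, input_shape=IMG_SIZE + (3,)),
    augment,
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(64, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(26, activation="softmax"),  # assumes 26 classes (A-Z)
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(train_ds, epochs=10)

# Step 6: evaluate generalization on the held-out testing set.
model.evaluate(test_ds)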
CHAPTER 5
DATASETS
In this chapter, we present the datasets used in this project, chiefly the
AMERICAN SIGN LANGUAGE (ASL) dataset along with a new dataset collected by ourselves. We
discuss some basic Sign Language datasets which can be used for the interpretation of signs and
gestures. Given the increasing number of people with hearing impairment, we believe this project
may help them participate in society without barriers.
To our knowledge, there are several publicly available datasets for sign language recognition.
However, it is essential to check for updates and new datasets, as the field of computer vision and
machine learning is continually evolving. Here are some datasets commonly used for sign language
recognition:
There exist several datasets that have been used for Sign Language Recognition and Interpretation,
as shown in Table 2.

DATASET                              GESTURES
American Sign Language (ASL)         4500
MSRC-12 Kinect Gesture               6244
German Sign Language (DGS)           <300
Chinese Sign Language (CSL)          5000

Table 2: Available Datasets
ASL originated in the early 19th century at the American School for the Deaf (ASD)
in Hartford, Connecticut, from a situation of language contact. Since then, ASL use has been
propagated widely by schools for the deaf and Deaf community organizations. Despite its wide use,
no accurate count of ASL users has been taken. Reliable estimates for American ASL users range
from 250,000 to 500,000 persons, including a number of children of deaf adults and other hearing
individuals. The Microsoft Research Cambridge-12 (MSRC-12) Kinect gesture dataset consists of
sequences of human movements, represented as body-part locations, and the associated gesture to be
recognized by the system. German Sign Language, or Deutsche Gebärdensprache (DGS), is the sign
language of the deaf community in Germany, Luxembourg, and the German-speaking community
of Belgium. It is unclear how many people use German Sign Language as their main language; Gallaudet
University estimated 50,000 as of 1986. The language has evolved through use in deaf communities
over hundreds of years.
Among all these sign languages, we used ASL in our project because it made our project fast, easy,
and accurate.
Dataset
We have used multiple datasets and trained multiple models to achieve good accuracy.
Image pre-processing is the use of algorithms to perform operations on images (see Figure 2.1:
Data Processing). It is important to pre-process the images before sending them for model training.
For example, all the images should have the same size of 200x200 pixels; if not, the model cannot
be trained. The main steps, sketched in code below, are:
➢ Read the images.
➢ Resize or reshape all the images to the same size.
➢ Remove noise.
➢ Normalize all image pixel arrays from the 0-255 range to 0-1 by dividing the image array by 255.
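A minimal sketch of these pre-processing steps follows; the file path and the Gaussian blur kernel
size are illustrative assumptions:

import cv2
import numpy as np

def preprocess(path):
    img = cv2.imread(path)                  # read the image from disk
    img = cv2.resize(img, (200, 200))       # same size for every image
    img = cv2.GaussianBlur(img, (3, 3), 0)  # light noise removal
    return img.astype(np.float32) / 255.0   # normalize pixel values to 0-1

# Example usage (hypothetical file name):
# processed = preprocess("Data/Y/Image_1.jpg")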
[Figure 1.1: Charts. A pie chart titled "SIGN LANGUAGE" with segments of 11%, 18%, and 71%
covering deaf people, hearing people, and sign language users, and a bar chart (numbers in
billions) of deaf people, hearing people, and sign language users across 2018-2023.]
CHAPTER 6
1. Sign Language Recognition:
- Development of algorithms and models for recognizing and interpreting sign language gestures.
This includes recognizing individual signs, understanding sequences of signs to form words or
sentences, and potentially distinguishing between different signers.
2. Gesture-to-Text Translation:
- Translating sign language gestures into written text, enabling communication between sign
language users and those who may not understand sign language.
3. Real-Time Interaction:
- Implementing real-time systems for sign language recognition, enabling instant communication
between signers and non-signers using technology like webcams or depth sensors.
4. Educational Tools:
- Developing educational tools and applications to assist in learning and practicing sign language.
This can include interactive games, tutorials, and feedback systems.
5. Multimodal Approaches:
- Exploring multimodal approaches that combine visual information with other modalities, such as
facial expressions or body language, to improve the accuracy and richness of sign language
interpretation.
6. Inclusivity:
- Ensuring that technology designed for sign language recognition is inclusive, user-friendly, and
respects the cultural and linguistic aspects of the signing community.
CHAPTER 7
SYSTEM ARCHITECTURE
The system architecture for an American Sign Language (ASL) recognition
system involves the coordination of components across the frontend, backend, and machine learning
model deployment. Here's a high-level overview of the system architecture:
❖ Frontend Architecture:
1. User Interface:
- Provides an interactive interface for users to interact with the ASL recognition system.
2. Communication:
- Communicates with the backend through API requests, sending captured video frames for
processing.
❖ Machine Learning Model Deployment:
- Model training can be performed offline, and the trained model is then deployed to the backend;
a minimal sketch of such a backend endpoint follows.
This architecture provides a foundation for building a scalable, maintainable, and efficient ASL
recognition system.
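As an illustration of the frontend-to-backend communication described above, below is a minimal
sketch of a Flask endpoint that accepts a posted video frame and returns a prediction. The endpoint
name, the model and label file paths, and the shortened label list are assumptions for illustration,
not values fixed by this report:

import cv2
import numpy as np
from flask import Flask, request, jsonify
from cvzone.ClassificationModule import Classifier

app = Flask(__name__)
# Assumed paths to a trained Keras model and its label file.
classifier = Classifier("Model/keras_model.h5", "Model/labels.txt")
labels = ["A", "B", "C"]  # shortened, illustrative label list

@app.route("/predict", methods=["POST"])
def predict():
    # The frontend posts one captured frame as raw JPEG bytes.
    frame = cv2.imdecode(np.frombuffer(request.data, np.uint8), cv2.IMREAD_COLOR)
    prediction, index = classifier.getPrediction(frame, draw=False)
    return jsonify({"label": labels[index]})

if __name__ == "__main__":
    app.run(port=5000)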
CHAPTER 8
FEATURES AND FUNCTIONALITIES
❖ Features:
1. Real-Time Recognition:
- Recognizes ASL signs in real time from live video input.
2. User-Friendly Interface:
- Natural and easy-to-use frontend interface for capturing and processing ASL signs.
3. Multi-Language Support:
- Capable of recognizing signs from various sign languages, not limited to a specific region.
4. Educational Tools:
- Provides educational features to assist users in learning and practicing ASL gestures.
❖ Functionalities:
1. Image Processing:
- Utilizes OpenCV for image processing, including hand detection and tracking.
2. Model Deployment:
- Deploys a trained ASL recognition model using frameworks like TensorFlow Serving or Flask.
3. Documentation and Logging:
- Includes documentation for API usage and logs relevant information for monitoring and
debugging.
CHAPTER 9
RESEARCH METHODOLOGY
When conducting research on sign language recognition and interpretation using Python, we
typically follow a structured research methodology. Here is a general outline:
1. Problem Definition:
Clearly define the problem you aim to address, such as improving sign language communication
through automatic recognition and interpretation.
2. Research Objectives:
Clearly outline the specific goals and objectives of your research. This could include improving
accuracy, exploring real-time processing, or addressing specific challenges in sign language
recognition.
3. Data Collection:
Describe how you collect your sign language dataset. This might involve using existing datasets,
capturing your own data, or a combination of both.
4. Preprocessing:
Detail the steps taken to preprocess the data, including resizing images, normalizing pixel values,
and any other necessary transformations. Ensure that your data is suitable for input into your chosen
model.
5. Training:
Discuss the training process, including how you split your dataset into training and testing sets, the
number of periods, and any hyperparameter tuning. Mention the optimization algorithm and loss
function used.
6. Evaluation:
Present the metrics used to evaluate your model's performance. Accuracy, precision, recall, and F1-
score are common metrics; a minimal sketch of computing them is shown below. Compare your
results to existing approaches and discuss any limitations.
7. Results and Discussion:
Present your results, including any visualizations or graphs. Discuss the strengths and weaknesses of
your approach and compare it to other methods in the literature.
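The sketch below shows how these metrics can be computed with scikit-learn; the y_true and
y_pred lists are illustrative stand-ins for the test labels and the model's predictions:

from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

y_true = ["A", "B", "A", "C", "B"]  # illustrative ground-truth labels
y_pred = ["A", "B", "C", "C", "B"]  # illustrative model predictions

print("Accuracy :", accuracy_score(y_true, y_pred))
print("Precision:", precision_score(y_true, y_pred, average="macro"))
print("Recall   :", recall_score(y_true, y_pred, average="macro"))
print("F1-score :", f1_score(y_true, y_pred, average="macro"))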
CHAPTER 10
[Figure 5: A to Z Signs]
[Figure 7: Basic Uses of Signs]
[Figure 8: Uses of Sign Languages in the Modern World]
CHAPTER 11
CONCLUSION
In this project, we developed a system that can understand sign language and translate it into the
corresponding text. Our system still has many limitations: it can detect 0-9 digit and A-Z alphabet
hand gestures, but it does not cover body gestures and other dynamic gestures. We are confident
that it can be extended to do so in the future.
The development of a real-time interface is fundamental, incorporating video frame capture, model
inference, and a user-friendly display. Interpretation logic, mapping model outputs to meaningful
gestures or translated text, enhances the system's usability. Optional features, like integrating text-to-
speech, further extend accessibility. Rigorous testing, ongoing optimization, and user feedback loops
refine the system's accuracy and responsiveness. Once satisfactory performance is achieved, the
system is ready for deployment, contributing to inclusive communication by making sign language
more accessible. This comprehensive approach ensures a robust and effective solution, building a
bridge between sign language users and the broader community. We believe that AI can take this
technology further in the upcoming days, helping all the people suffering from hearing problems.
REFERENCES
1. https://www.wikipedia.org
2. https://www.google.com
3. https://www.dummies.com/article/academics-the-arts/language-language-arts/learning-languages/american-sign-language/signing-for-dummies-cheat-sheet-208315/
4. https://www.ai-media.tv/knowledge-hub/insights/sign-language-alphabets/
5. https://www.mathplanet.com/education/programming
6. https://opencv.org/
7. https://www.tensorflow.org/
8. https://pytorch.org/
9. https://medium.com/@20it105/sign-language-recognition-using-python-74ef7ea43181
10. https://www.who.int/
11. https://www-i6.informatik.rwth-aachen.de/aslr/database-rwth-boston-104.php
12. https://docs.opencv.org/4.x/
13. https://www.nature.com/articles/s41598-023-43852-x
14. https://towardsdatascience.com/sign-language-recognition-with-advanced-computer-vision-7b74f20f3442
15. https://books.google.com.np/books?id=HHetDwAAQBAJ&printsec=frontcover&redir_esc=y#v=onepage&q&f=false
16."Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow" by Aurélien Géron
18."Recent Advances in Deep Learning for Speech and Sign Language Processing: A Review" by
Jiawei Zhang, Lijun Deng, et al. (Published in IEEE Access)
19."Real-Time American Sign Language Recognition from Video Sequences Using Convolutional
Neural Networks" by Jonathan Michaux, Thibault Lefebvre, and Denis Hamad
20."Deep Sign: Deep Learning for Sign Language Recognition" by Mohammed Elsahili, Salah
Brahim, and Mohamed Atri
24."Sign Language Recognition (SLR) using Machine Learning: A Review" by Brijendra Singh,
Babita Pandey
CODING
1.1 Data Collection Code
import cv2
from cvzone.HandTrackingModule import HandDetector
import numpy as np
import math
import time

# Open the default webcam and set up a single-hand detector.
cap = cv2.VideoCapture(0)
detector = HandDetector(maxHands=1)

offset = 20        # border around the detected hand
imgSize = 300      # size of the square canvas for saved images
folder = "Data/Y"  # where captured images of this sign are stored
counter = 0        # number of images saved so far

while True:
    success, img = cap.read()
    hands, img = detector.findHands(img)
    if hands:
        hand = hands[0]
        x, y, w, h = hand['bbox']

        # White square canvas and the cropped hand region (with a border).
        imgWhite = np.ones((imgSize, imgSize, 3), np.uint8) * 255
        imgCrop = img[y - offset:y + h + offset, x - offset:x + w + offset]
        imgCropShape = imgCrop.shape

        aspectRatio = h / w
        if aspectRatio > 1:
            # Taller than wide: scale height to imgSize, center horizontally.
            k = imgSize / h
            wCal = math.ceil(k * w)
            imgResize = cv2.resize(imgCrop, (wCal, imgSize))
            imgResizeShape = imgResize.shape
            wGap = math.ceil((imgSize - wCal) / 2)
            imgWhite[:, wGap:wCal + wGap] = imgResize
        else:
            # Wider than tall: scale width to imgSize, center vertically.
            k = imgSize / w
            hCal = math.ceil(k * h)
            imgResize = cv2.resize(imgCrop, (imgSize, hCal))
            imgResizeShape = imgResize.shape
            hGap = math.ceil((imgSize - hCal) / 2)
            imgWhite[hGap:hCal + hGap, :] = imgResize

        cv2.imshow("ImageCrop", imgCrop)
        cv2.imshow("ImageWhite", imgWhite)

    cv2.imshow("Image", img)
    key = cv2.waitKey(1)
    if key == ord("s"):
        # Press 's' to save the current canvas with a unique timestamped name.
        counter += 1
        cv2.imwrite(f'{folder}/Image_{time.time()}.jpg', imgWhite)
        print(counter)
1.2 Test Code
import cv2
from cvzone.HandTrackingModule import HandDetector
from cvzone.ClassificationModule import Classifier
import numpy as np
import math

cap = cv2.VideoCapture(0)
detector = HandDetector(maxHands=1)
# Assumed paths to the trained Keras model and its label file.
classifier = Classifier("Model/keras_model.h5", "Model/labels.txt")

offset = 20
imgSize = 300
folder = "Data/C"
counter = 0
labels = ["A", "B", "C", "D", "E", "F", "G", "H", "X", "Y"]

while True:
    success, img = cap.read()
    imgOutput = img.copy()
    hands, img = detector.findHands(img)
    if hands:
        hand = hands[0]
        x, y, w, h = hand['bbox']

        # White square canvas and the cropped hand region (with a border).
        imgWhite = np.ones((imgSize, imgSize, 3), np.uint8) * 255
        imgCrop = img[y - offset:y + h + offset, x - offset:x + w + offset]
        imgCropShape = imgCrop.shape

        aspectRatio = h / w
        if aspectRatio > 1:
            k = imgSize / h
            wCal = math.ceil(k * w)
            imgResize = cv2.resize(imgCrop, (wCal, imgSize))
            imgResizeShape = imgResize.shape
            wGap = math.ceil((imgSize - wCal) / 2)
            imgWhite[:, wGap:wCal + wGap] = imgResize
            prediction, index = classifier.getPrediction(imgWhite, draw=False)
            print(prediction, index)
        else:
            k = imgSize / w
            hCal = math.ceil(k * h)
            imgResize = cv2.resize(imgCrop, (imgSize, hCal))
            imgResizeShape = imgResize.shape
            hGap = math.ceil((imgSize - hCal) / 2)
            imgWhite[hGap:hCal + hGap, :] = imgResize
            prediction, index = classifier.getPrediction(imgWhite, draw=False)

        # Draw the recognized label and a bounding box on the output frame.
        cv2.putText(imgOutput, labels[index], (x, y - offset),
                    cv2.FONT_HERSHEY_COMPLEX, 2, (255, 0, 255), 2)
        cv2.rectangle(imgOutput, (x - offset, y - offset),
                      (x + w + offset, y + h + offset), (255, 0, 255), 4)

        cv2.imshow("ImageCrop", imgCrop)
        cv2.imshow("ImageWhite", imgWhite)

    cv2.imshow("Image", imgOutput)
    cv2.waitKey(1)
EXPLANATION OF DATA COLLECTION CODE
The provided code captures hand images from a webcam using the `cvzone` library for hand
tracking. It creates a dataset for sign language gestures by saving the segmented hand images into a
specified folder. Here's an explanation of the Data Collection code:
import cv2
from cvzone.HandTrackingModule import HandDetector
import numpy as np
import math
import time
The necessary libraries are imported, including OpenCV for computer vision, `cvzone` for hand
tracking, NumPy for numerical operations, and other standard libraries.
cap = cv2.VideoCapture(0)
detector = HandDetector(maxHands=1)
The code initializes the webcam using OpenCV and sets up a hand detector to track hands in the
video stream.
offset = 20
imgSize = 300
Variables `offset` and `imgSize` are defined to add a border around the captured hand and set the
size of the final cropped hand image.
folder = "Data/Y"
counter = 0
The `folder` variable specifies the directory where the captured images will be saved. The `counter`
variable is used to keep track of the number of captured images.
while True:
    success, img = cap.read()
    hands, img = detector.findHands(img)
The code continuously captures video frames from the webcam, detects hands in the frames using
the `HandDetector` object, and retrieves the list of detected hands.
if hands:
    hand = hands[0]
    x, y, w, h = hand['bbox']
    imgWhite = np.ones((imgSize, imgSize, 3), np.uint8) * 255
    imgCrop = img[y - offset:y + h + offset, x - offset:x + w + offset]
If hands are detected, the code extracts the bounding box coordinates (`x`, `y`, `w`, `h`) of the
detected hand. It then creates a white canvas (`imgWhite`) and crops the hand region from the
original frame (`imgCrop`).
# Image resizing and centering
# ...
cv2.imshow("ImageCrop", imgCrop)
cv2.imshow("ImageWhite", imgWhite)
The code resizes and centers the cropped hand image to fit into a square canvas (`imgWhite`). It then
displays both the cropped hand image (`imgCrop`) and the resized image (`imgWhite`) in separate
windows.
cv2.imshow("Image", img)
key = cv2.waitKey(1)
if key == ord("s"):
counter += 1
cv2.imwrite(f'{folder}/Image_{time.time()}.jpg', imgWhite)
print(counter)
The original frame with hand tracking annotations is displayed in a separate window. If the 's' key is
pressed, the current hand image (`imgWhite`) is saved as a JPEG file in the specified folder with a
unique timestamped filename.
EXPLANATION OF TEST CODE
This code uses the `cvzone` library to perform hand detection and gesture classification in real-time.
import cv2
from cvzone.HandTrackingModule import HandDetector
from cvzone.ClassificationModule import Classifier
import numpy as np
import math
The necessary libraries are imported, including OpenCV (`cv2`), `cvzone` for hand tracking and
gesture classification, NumPy for numerical operations, and the `math` library.
cap = cv2.VideoCapture(0)
detector = HandDetector(maxHands=1)
classifier = Classifier("Model/keras_model.h5", "Model/labels.txt")
The code initializes the webcam using OpenCV, sets up a hand detector to track hands in the video
stream, and loads a pre-trained classifier model and labels for gesture classification using the
`Classifier` module from `cvzone` (the model and label file paths shown are assumed).
offset = 20
imgSize = 300
folder = "Data/C"
counter = 0
labels = ["A", "B", "C", "D", "E", "F", "G", "H", "X", "Y"]
Variables `offset`, `imgSize`, `folder`, and `counter` are defined. The `offset` is used to add a border
around the captured hand, `imgSize` sets the size of the final cropped hand image, `folder` specifies
the directory where captured images would be saved, and `counter` is used to keep track of the
number of captured images.
while True:
    success, img = cap.read()
    imgOutput = img.copy()
    hands, img = detector.findHands(img)
The code continuously captures video frames from the webcam, keeps a copy of each frame for
output, and detects hands in the frames using the `HandDetector` object.
if hands:
    hand = hands[0]
    x, y, w, h = hand['bbox']
    imgWhite = np.ones((imgSize, imgSize, 3), np.uint8) * 255
    imgCrop = img[y - offset:y + h + offset, x - offset:x + w + offset]
If hands are detected, the code extracts the bounding box coordinates (`x`, `y`, `w`, `h`) of the
detected hand. It then creates a white canvas (`imgWhite`) and crops the hand region from the
original frame (`imgCrop`).
# ...
print(prediction, index)
The code resizes and centers the cropped hand image to fit into a square canvas (`imgWhite`). It then
uses the `getPrediction` method from the `Classifier` object to classify the hand gesture and obtain
the prediction scores along with the index of the recognized label.
# Drawing bounding boxes, labels, and rectangles on the output image
# ...
cv2.imshow("ImageCrop", imgCrop)
cv2.imshow("ImageWhite", imgWhite)
cv2.imshow("Image", imgOutput)
cv2.waitKey(1)
The code then displays the cropped hand image (`imgCrop`), the resized hand image (`imgWhite`),
and the original frame with bounding boxes, labels, and rectangles drawn on it (`imgOutput`). The
call to `cv2.waitKey(1)` keeps the display windows refreshing for each frame.
EXPECTED OUTPUT
CODES IN VISUAL STUDIO CODE