
PROJECT REPORT ON

Sign Language Recognition And


Interpretation Using Python

A Report

Submitted By
Aavash Koirala (304, 12, MG)
Hritik Shrestha (316, 12, MG)
Anurag Gaida (309, 12, MG)
Kiran Adhikari (320, 12, MG)
Email id: [email protected]
[email protected]
[email protected]

Submitted To
Department of Computer Science
Khwopa Secondary School

In Partial Fulfillment of the Elective Subject COMPUTER SCIENCE
under the Management Section offered in
Plus Two Second Year
In the
Faculty of Management
Khwopa Secondary School
Bhaktapur, Nepal

~1~
ACKNOWLEDGEMENT

As part of the partial fulfillment of the Plus Two (Computer Science) programme at Khwopa Secondary School, second-year students are required to prepare a field report, and we chose Sign Language Recognition and Interpretation as our topic. Apart from the efforts of the team, the success of any project depends largely on the encouragement and guidance of many others. We take this opportunity to express our gratitude to the teachers who have been instrumental in the successful completion of this project. Special thanks go to our team members for their hard work and determination.

We are deeply grateful to our teachers UMESH DUWAL and RABIN PHASIKAWA for their willingness to give us valuable advice and direction, under which we executed this project. We are also grateful for the support and guidance provided by the DEPARTMENT OF COMPUTER SCIENCE. These invaluable contributions have played a vital role in bringing the project to this point. Thank you for your commitment and assistance.

Aavash Koirala (304, 12, MG)

Hritik Shrestha (316, 12, MG)

Anurag Gaida (309, 12, MG)

Kiran Adhikari (320, 12, MG)

~i~
DEPARTMENT OF COMPUTER SCIENCE

FACULTY OF MANAGEMENT

KHWOPA SECONDARY SCHOOL

APPROVAL SHEET

We, the undersigned, have examined the field report entitled Sign Language Recognition and

Interpretation presented by AAVASH KOIRALA, HRITIK SHRESTHA, ANURAG GAIDA,

KIRAN ADHIKARI, a candidate for the degree of Plus Two (Computer Science). We hereby

certify that the field report is worthy of acceptance.

(Signature……………….…)
Er. Rabin Phasikawa
Designation: Project Supervisor
Department of Computer Science
Khwopa Secondary School
Dekocha-06, Bhaktapur

(Signature………………….……)
Umesh Duwal
Designation: Head of Department
Department of Computer Science
Khwopa Secondary School,
Dekocha-06, Bhaktapur

~ ii ~
CERTIFICATE OF AUTHORSHIP

We certify that the work in this report has not previously been submitted for a degree, nor has it been submitted as part of the requirements for a degree, except as fully acknowledged within the text. We also certify that this field report has been written by us.

Any help that we have received in our research work and in the preparation of the report itself has been acknowledged. In addition, we certify that all information sources and literature used are indicated in the reference section of the report.

This report has not been submitted to any other organization or institution for the award of any degree.

(Signature………..……...)
AAVASH KOIRALA
(Grade-12, Sec: -MG, Roll no: -304)

(Signature…………..…...)
HRITIK SHRESTHA
(Grade-12, Sec-MG, Roll-316)

(Signature………………...)
ANURAG GAIDA
(Grade-12, Sec: -MG, Roll no: -309)

(Signature………..……...)
KIRAN ADHIKARI
(Grade-12, Sec: -MG, Roll no: -320)

~ iii ~
ABSTRACT

This project focuses on the development of a Sign Language Recognition and Interpretation system using Python. The objective is to bridge communication gaps between the hearing-impaired community and the broader population by creating a robust and efficient solution. Leveraging computer vision techniques and machine learning algorithms, the system interprets sign language gestures captured through a camera and translates them into text. The implementation involves training a model on a comprehensive sign language dataset, employing real-time gesture recognition, and integrating natural language processing for accurate interpretation. The outcomes of this project aim to enhance accessibility, promote inclusivity, and provide a technological approach to effective communication for the hearing-impaired.

In conclusion, the Sign Language Recognition and Interpretation project presented here represents an impactful union of technology and inclusivity. By harnessing the power of Python, computer vision, and deep learning, we have developed a system that recognizes sign language gestures and interprets them in real time, fostering effective communication between individuals with hearing impairments and the broader community. The user-friendly interface and robust model underscore the potential of technology to break down barriers and create a more inclusive society. As we continue to innovate, this project stands as proof of the positive impact that thoughtful, accessible technology solutions can have on the lives of individuals with diverse communication needs.

~ iv ~
TABLE OF CONTENTS
ACKNOWLEDGEMENTS....................................................................................................................i

APPROVAL SHEET…………………………………………………………………………….……ii

CERTIFICATE OF AUTHORSHIP……………………………………………………...………….iii

ABSTRACT.........................................................................................................................................iv

TABLE OF CONTENTS……………………………………………………………………………...v

LIST OF FIGURES ............................................................................................................................vii

CHAPTER 1 .........................................................................................................................................1

INTRODUCTION ........................................................................................................................1

1.1OBJECTIVES.….………………………………………………………………..…………..2

CHAPTER 2 .........................................................................................................................................3

IMPORTANCE...............................................................................................................................3

DISADVANTAGES.......................................................................................................................4

CHAPTER 3 .........................................................................................................................................5

SOFTWARE REQUIREMENT SPECIFICATION…….............................................................5

3.1 PRODUCT PERSPECTIVE………………….……………………………………..5

3.2 PRODUCT FUNCTION………………………..……………………………….…..6

CHAPTER 4 .........................................................................................................................................7

DATASET FLOW DIAGRAM......................................................................................................7

CHAPTER 5 .........................................................................................................................................8

DATASETS….……………..………………………………………..………….……......8

CHAPTER 6………………………………………………………………………….……..………...9

SCOPE OF THE PROJECT………………...…………………………………..……………......9

~v~
CHAPTER 7………………………………………………………………………..……………….10

SYSTEM ARCHITECTURE…………………..………………………………………………10

CHAPTER 8……………………………………………………………………………...………….11

FEATURES AND FUNCTIONALITIES……………….…………………………...…….……11

CHAPTER 9……………………………………………………………………………………..…..12

RESEARCH METHODOLOGY……………………………………………………………….12

CHAPTER 10………………………………………………………………………………………..13

BASIC HAND SIGN USED IN AMERICAN SIGN LANGUAGE (ASL)…………………...13

CHAPTER 11 .....................................................................................................................................19

CONCLUSIONS AND FUTURE WORK..............................................................................19

REFERENCES ...................................................................................................................................20

CODING……………………………………………………………………………………………..22

1.1 DATA COLLECTION CODE………………………………………………………………22

1.2 TEST CODE………………………………………………………………………………..24

EXPLANATION OF DATA COLLECTION CODE……………………………………………….27

EXPLANATION OF TEST CODE…………………………….…………………………...……….30

EXPECTED OUTPUT……………………………………………………………………...……….34

CODES IN VISUAL STUDIO CODE……………………………………………………...……….37

~ vi ~
LIST OF FIGURES

Figure 1.1:-CHARTS……………………………………………………………………….………...1

Figure 1:- Dataset Flow Diagram. ........................................................................................................7

Figure 2:- Available Datasets…………………………………………………………………………9

Figure 2.1:- Data Processing……………………………………………………………………..….10

Figure 3:- SIGN LANGUAGE USES IN THE 2023 WORLD……………………………………..11


Figure 4:- Sign Language Users……………………………………………………………………..11
Figure 5:- A to Z Signs……………………………………………………………………………....16

Figure 6:- 0 to 9 Signs……………………………………………………………………………….16

Figure 7:- Basic Use of Signs………………………………………………………..………………17

Figure 8:- Use of Sign Language in Modern world………………………………………………….18

Figure 9:-Artificial Intelligence(AI)………………..………………………………………………..19

~ vii ~
CHAPTER 1
INTRODUCTION
Sign language, a rich and expressive form of communication used primarily by the Deaf and Hard of Hearing community, serves as a bridge for those who navigate a world without sound. Despite its significance, a communication gap exists between individuals proficient in sign language and those who are not. Recognizing the importance of inclusive communication, this project delves into the development of a Sign Language Recognition and Interpretation system using Python. Through the combination of computer vision and machine learning techniques, we aim to create a powerful tool that not only recognizes diverse sign language gestures but also interprets them, thereby facilitating seamless communication between individuals with hearing impairments and the broader community.

The motivation for this project stems from the deep impact that technology can have on breaking down barriers and fostering inclusivity. With the advancement of deep learning and computer vision, the potential for developing robust and accurate sign language recognition systems has become increasingly promising (see Figure 1.1: Charts). By harnessing these technologies, we seek to contribute to the creation of an environment where communication is accessible to all, regardless of their proficiency in sign language.

In recent years, there has been a growing recognition of the importance of inclusivity in technological solutions. The Deaf and Hard of Hearing community, which constitutes a significant portion of the global population, faces unique challenges in accessing information and communicating effectively in a predominantly hearing world. Traditional methods of bridging this communication gap, such as relying on interpreters or written communication, have limitations in terms of real-time interaction and independence. A technologically advanced solution, capable of recognizing and interpreting sign language gestures, can significantly enhance the communication experience for individuals with hearing impairments, promoting independence and nurturing a sense of belonging.

~1~
Python, with its versatility and extensive libraries, serves as an ideal programming language for this project. The project involves the use of computer vision techniques to capture and process video input, extracting meaningful features from sign language gestures. Additionally, the implementation of machine learning algorithms, particularly deep learning models, plays a critical role in training the system to accurately recognize and interpret these gestures. The inherent flexibility of Python allows for seamless integration of these components, resulting in a robust and efficient system.

The scope of this project extends beyond the technical aspects, highlighting the social impact and broader implications of promoting inclusivity. Effective communication is fundamental to human connection and understanding, and this project aims to empower individuals with hearing impairments by providing them with a tool that enables them to express themselves freely and interact with others without barriers. Moreover, by creating a system that can interpret sign language in real time, we envision a future where communication is not only accessible but also immediate, promoting a more inclusive and understanding society.

As we embark on this journey to develop a Sign Language Recognition and Interpretation system using Python, we are driven by the belief that technology can be a powerful force for positive change. This project stands as a testament to our commitment to leveraging innovative solutions to address real-world challenges, with the ultimate goal of creating a more inclusive and connected world for everyone, regardless of their hearing abilities.

1.1 OBJECTIVES

Here are the main objectives of building a Sign Language Recognition and Interpretation system:

✓ To develop a robust system for recognizing and interpreting sign language gestures
accurately.
✓ To ensure the system can process video input in real-time to provide timely interpretations.
✓ To integrate a user-friendly interface for seamless interaction and display of interpreted sign
language.
✓ To design the system to be accessible to users with varying abilities and to accommodate
different sign languages.

~2~
CHAPTER 2

IMPORTANCE

The Sign Language Recognition and Interpretation project using Python holds immense importance in promoting inclusivity and breaking down communication barriers for individuals with hearing impairments. Here are key points highlighting its significance:

1. Accessibility Enhancement:

- The project plays a key role in making information accessible to the deaf and hard-of-hearing
community, allowing them to engage in real-time conversations without relying solely on traditional
sign language interpreters.

2. Facilitating Education:

- In educational settings, the system can bridge communication gaps between students with hearing impairments and their peers and teachers, facilitating a more inclusive learning environment.

3. Independence and Autonomy:

- The system promotes independence by empowering individuals with hearing impairments to


communicate freely, reducing dependence on interpreters and enhancing their overall quality of life.

4. Technological Innovation in Assistive Technologies:

- This project demonstrates the potential of technology to create innovative solutions for individuals with various communication needs, paving the way for further advancements in assistive technologies that cater to a broad range of disabilities.

5. Community Integration:

- By facilitating communication between individuals with hearing impairments and the broader community, the project promotes understanding, empathy, and integration, fostering a society that embraces diversity.

~3~
DISADVANTAGES

While a Sign Language Recognition and Interpretation system offers numerous


benefits, it is essential to consider potential disadvantages and challenges. Here are some aspects to
be mindful of:

1. Limited Accuracy:

- Achieving high accuracy in recognizing a wide range of sign language gestures can be challenging. Variability in signing styles, lighting conditions, and the complexity of certain gestures may lead to inaccuracies.

2. Limited Gesture Vocabulary:

- Depending on the system's design and complexity, there may be limitations in recognizing a vast
vocabulary of sign language gestures. This could be a challenge for users who rely on a broader
range of expressions.

3. Cultural Sensitivity:

- Sign language can vary across cultures and regions. A system designed for one sign language
may not be applicable or effective for others, necessitating adaptation or customization for different
user groups.

4. Lack of Physical Interaction:

- Traditional sign language involves physical nuances and expressions that may be challenging to
capture accurately in a digital format. The loss of these subtleties can impact the richness of
communication.

~4~
CHAPTER 3

SOFTWARE REQUIREMENT SPECIFICATION

3.1 Product Perspective

The Sign Language Recognition and Interpretation system operates within the
broader context of assistive technologies, aiming to enhance communication for individuals with
hearing impairments. As a component of the assistive technology landscape, its product perspective
is rooted in inclusivity, accessibility, and technological innovation. Integrated seamlessly into users'
daily lives, the system serves as a bridge between the deaf and hearing communities, fostering a
more inclusive society. Its design considers the user's perspective, offering a user-friendly interface
that prioritizes ease of use, real-time interpretation, and customization options. The product aligns
with the evolving landscape of assistive technologies, contributing to a future where technology
plays a vital role in breaking down communication barriers and empowering individuals with diverse
needs.

3.1.1 System Interfaces

❖ User Interfaces
▪ This section follows specific interaction protocols to handle user inputs, gestures, or
commands effectively. This could involve defining how the system interprets different user
actions and responds accordingly.
▪ Real-time interpretation results are displayed prominently, ensuring immediate feedback for
users.
❖ Hardware Interfaces
▪ Laptop/Desktop/PC
▪ The system typically relies on a camera interface to capture video input, either from a
webcam or an external camera. High-quality cameras with good resolution and frame rates
enhance the accuracy of gesture recognition.

▪ The output of the system, such as real-time interpretation, textual feedback, or graphical
elements, is presented through monitors, touchscreens, or other visual output devices.

~5~
3.1.2 System Requirements
❖ Hardware requirement
▪ Minimum 8 GB RAM for efficient handling of datasets and real-time processing
▪ High-resolution display (Full HD or higher) for clear presentation of interpreted
gestures.
▪ Multi-core processor (e.g., Intel Core i5 or equivalent).

▪ Built-in or external microphone for clear audio input (if speech synthesis or recognition is included).
❖ Software requirement
▪ Windows 10, macOS, or Linux.
▪ Python 3 with OpenCV and the supporting libraries used in this project.
▪ Visual Studio Code or Atom.
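
As a convenience, the core Python dependencies used later in this report can be verified with a short script like the one below. This is a minimal sketch: the exact package list (opencv-python, cvzone, numpy, tensorflow) is our assumption based on the code in the Coding chapter, not an official requirements file.

# Hypothetical environment check, assuming the packages were installed with:
#   pip install opencv-python cvzone numpy tensorflow
import importlib

for module in ["cv2", "cvzone", "numpy", "tensorflow"]:
    try:
        mod = importlib.import_module(module)
        print(f"{module}: OK ({getattr(mod, '__version__', 'unknown version')})")
    except ImportError:
        print(f"{module}: MISSING - install it before running the project code")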

~6~
CHAPTER 4

DATASET FLOW

1. Data Collection
2. Data Processing
3. Split into Training & Testing Sets
4. Augmentation
5. Model Training
6. Model Evaluation
7. Model Deployment
8. Real-time Hand Tracking
9. Sign Interpretation

Figure 2: Dataset Flow Diagram

~7~
Creating a dataset flow for sign language recognition involves multiple steps, including data collection, preprocessing, and splitting into training and testing sets. The diagram above summarizes this flow, and a short illustrative code sketch for steps 2-4 follows the breakdown below.

Let's break down each step:

1. Data Collection:

- Collect a dataset of sign language gestures. Ensure that it is labeled with the corresponding signs.

2. Data Preprocessing:

- Preprocess the images to make them suitable for training. This may include resizing,
normalization, and other image preprocessing techniques.

3. Split into Training and Testing Sets:

- Divide the dataset into training and testing sets to evaluate the model's performance accurately.

4. Augmentation:

- Optionally, perform data augmentation to increase the diversity of the training set. This may
include random rotations, flips, and other transformations.

5. Model Training:

- Train the Convolutional Neural Network (CNN) using the preprocessed and augmented dataset.

6. Model Evaluation:

- Evaluate the trained model on the testing set to assess its accuracy and generalization
performance.

7. Model Deployment:

- Deploy the trained model for real-time sign language recognition.

8. Real-time Hand Tracking:

- Implement hand tracking using a library like OpenCV or MediaPipe to detect and track the user's
hand in real-time.

9. Sign Interpretation:

- Based on the detected hand landmarks, map them to the corresponding sign and provide an
interpretation.
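
To make steps 2-4 concrete, here is a minimal sketch of preprocessing, splitting, and augmentation in Python. It is illustrative only: the folder layout dataset/<label>/*.jpg, the 200x200 image size, and the 80/20 split are assumptions, not the exact pipeline used in this project.

import os
import cv2
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.preprocessing.image import ImageDataGenerator

IMG_SIZE = 200          # assumed target size
DATA_DIR = "dataset"    # assumed layout: dataset/<label>/<image>.jpg

images, labels = [], []
for label in sorted(os.listdir(DATA_DIR)):
    for name in os.listdir(os.path.join(DATA_DIR, label)):
        img = cv2.imread(os.path.join(DATA_DIR, label, name))
        if img is None:
            continue
        img = cv2.resize(img, (IMG_SIZE, IMG_SIZE))           # step 2: resize
        images.append(img.astype("float32") / 255.0)          # step 2: normalize to [0, 1]
        labels.append(label)

X = np.array(images)
y = to_categorical(LabelEncoder().fit_transform(labels))      # one-hot labels for a CNN

# Step 3: split into training and testing sets (80/20 assumed)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Step 4: augmentation with random rotations and horizontal flips
augmenter = ImageDataGenerator(rotation_range=15, horizontal_flip=True)
train_batches = augmenter.flow(X_train, y_train, batch_size=32)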

~8~
CHAPTER 5

DATASETS
In this chapter, we present the datasets used in this project, namely the AMERICAN SIGN LANGUAGE (ASL) dataset and a gesture-image dataset. We also discuss some basic sign language datasets that can be used for the interpretation of signs and gestures. Given the increasing number of people with hearing impairments, we believe this project can help them participate in society without barriers.

To our knowledge, there are several publicly available datasets for sign language recognition. However, it is essential to check for updates and new datasets, as the field of computer vision and machine learning is continually evolving. Here are some datasets commonly used for sign language recognition:

3.1 Available Datasets and Limitations

There exist several datasets that have been used for Sign Language Recognition and Interpretation, as shown in the table below (Figure 3).

Dataset                              Number of gestures
American Sign Language (ASL)         4,500
MSRC-12 Kinect Gesture               6,244
German Sign Language (DGS)           <300
Chinese Sign Language (CSL)          5,000

Figure 3: Available Datasets

ASL originated in the early 19th century at the American School for the Deaf (ASD) in Hartford, Connecticut, from a situation of language contact. Since then, ASL use has been spread widely by schools for the deaf and Deaf community organizations. Despite its wide use, no accurate count of ASL users has been taken. Reliable estimates for American ASL users range from 250,000 to 500,000 persons, including a number of children of deaf adults and other hearing individuals. The Microsoft Research Cambridge-12 (MSRC-12) Kinect gesture dataset consists of sequences of human movements, represented as body-part locations, and the associated gesture to be recognized by the system. German Sign Language, or Deutsche Gebärdensprache (DGS), is the sign language of the deaf community in Germany, Luxembourg, and the German-speaking community of Belgium. It is unclear how many people use German Sign Language as their main language; Gallaudet University estimated 50,000 as of 1986. The language has evolved through use in deaf communities over hundreds of years.

Among these sign languages, we used ASL in our project because it made the project fast, easy, and accurate.

~9~
Dataset
We have used multiple datasets and trained multiple models to achieve good accuracy.

3.1.1 ASL Alphabet


The data is a collection of images of the American Sign Language alphabet, separated into 29 folders that represent the various classes. The training dataset consists of 87,000 images of 200x200 pixels. There are 29 classes, of which 26 are the English letters A-Z and the remaining 3 are SPACE, DELETE, and NOTHING. These 3 classes are very important and helpful in real-time applications.

3.1.2 Sign Language Gesture Images Dataset

The dataset consists of 37 different hand sign gestures, which include A-Z alphabet gestures, 0-9 number gestures, and a gesture for space, which deaf and hard-of-hearing people use to represent the space between two letters or two words while communicating. Each gesture has 1,500 images of 50x50 pixels, so altogether the 37 gestures give 55,500 images. A Convolutional Neural Network (CNN) is well suited to this dataset for model training and gesture prediction.

3.2 Data Pre-processing

An image is nothing more than a 2-dimensional array of numbers, or pixels, ranging from 0 to 255. Typically, 0 means black and 255 means white. An image can be described by a mathematical function f(x, y), where 'x' represents the horizontal coordinate and 'y' the vertical coordinate; the value of f(x, y) at any point gives the pixel value at that point of the image.

Image pre-processing is the use of algorithms to perform operations on images (Figure 2.1: Data Processing). It is important to pre-process the images before sending them for model training. For example, all the images should have the same size of 200x200 pixels; otherwise, the model cannot be trained.

The steps we have taken for image pre-processing (illustrated in the short sketch below) are:

➢ Read the images.
➢ Resize or reshape all the images to the same size.
➢ Remove noise.
➢ Scale the pixel values to the range 0 to 1 by dividing each image array by 255.
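
The following minimal sketch shows the four listed steps applied to a single image with OpenCV. The file name and the Gaussian-blur choice for noise removal are illustrative assumptions, not the project's exact settings.

import cv2

img = cv2.imread("sample_sign.jpg")          # 1. read the image (hypothetical file name)
img = cv2.resize(img, (200, 200))            # 2. resize to the common 200x200 size
img = cv2.GaussianBlur(img, (3, 3), 0)       # 3. remove noise with a light Gaussian blur (one possible choice)
img = img.astype("float32") / 255.0          # 4. scale pixel values to the range [0, 1]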

~ 10 ~
Figure 3: SIGN LANGUAGE USE IN THE WORLD, 2023 (pie chart comparing deaf people, hearing people, and sign language users, with shares of 71%, 18%, and 11%).

Figure 4: Sign Language Users (bar chart of deaf people, hearing people, and sign language users from 2018 to 2023, in billions).

~ 11 ~
CHAPTER 6

SCOPE OF THE PROJECT


The scope of ASL in computer vision and machine learning is continually
evolving as researchers and developers work towards creating more accurate, robust, and inclusive
solutions for sign language users. Advances in this field have the potential to significantly improve
communication and accessibility for individuals who rely on sign language as their primary means of
expression. Here are some key aspects of the scope:

1. Sign Language Recognition:

- Development of algorithms and models for recognizing and interpreting sign language gestures.
This includes recognizing individual signs, understanding sequences of signs to form words or
sentences, and potentially distinguishing between different signers.

2. Gesture-to-Text Translation:

- Translating sign language gestures into written text, enabling communication between sign
language users and those who may not understand sign language.

3. Real-Time Interaction:

- Implementing real-time systems for sign language recognition, enabling instant communication
between signers and non-signers using technology like webcams or depth sensors.

4. Educational Tools:

- Developing educational tools and applications to assist in learning and practicing sign language.
This can include interactive games, tutorials, and feedback systems.

5. Multimodal Approaches:

- Exploring multimodal approaches that combine visual information with other modalities, such as
facial expressions or body language, to improve the accuracy and richness of sign language
interpretation.

6. Inclusive Technology Development:

- Ensuring that technology designed for sign language recognition is inclusive, user-friendly, and respects the cultural and linguistic aspects of the signing community.

~ 12 ~
CHAPTER 7

SYSTEM ARCHITECTURE
The system architecture for an American Sign Language (ASL) recognition
system involves the coordination of components across the frontend, backend, and machine learning
model deployment. Here's a high-level overview of the system architecture:

❖ Frontend Architecture:

1. User Interface (UI):

- Provides an interactive interface for users to interact with the ASL recognition system.

2. Communication:

- Communicates with the backend through API requests, sending captured video frames for
processing.

❖ Machine Learning Model:

1. ASL Recognition Model:

- Developed using a deep learning framework like TensorFlow or PyTorch.

- Trained on a dataset containing labeled ASL gestures.

2. Model Training:

- Can be performed offline, and the trained model is then deployed to the backend.

This architecture provides a foundation for building a scalable, maintainable, and efficient ASL
recognition system.
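
As an illustration of the frontend-to-backend communication described above, here is a minimal, hypothetical Flask endpoint that accepts a video frame and returns the predicted label. The model and label paths follow the test code later in this report; the 224x224 input size, the /predict route name, and the port are assumptions, not part of the original design.

import cv2
import numpy as np
from flask import Flask, request, jsonify
from tensorflow.keras.models import load_model

app = Flask(__name__)
model = load_model("Model/keras_model.h5")                      # paths as used in the test code
labels = open("Model/labels.txt").read().splitlines()

@app.route("/predict", methods=["POST"])                        # hypothetical API route
def predict():
    # Decode the frame uploaded by the frontend as multipart form data.
    raw = np.frombuffer(request.files["frame"].read(), np.uint8)
    img = cv2.imdecode(raw, cv2.IMREAD_COLOR)
    img = cv2.resize(img, (224, 224)).astype("float32") / 255.0  # input size is an assumption
    probs = model.predict(img[None, ...])[0]
    return jsonify({"label": labels[int(probs.argmax())],
                    "confidence": float(probs.max())})

if __name__ == "__main__":
    app.run(port=5000)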

~ 13 ~
CHAPTER 8

FEATURES AND FUNCTIONALITIES


Here's a brief overview of key features and functionalities for an American Sign Language (ASL)
recognition system:

❖ Features:

1. Real-Time Recognition:

- Recognizes ASL gestures in real-time, providing instant feedback.

2. User-Friendly Interface:

- Natural and easy-to-use frontend interface for capturing and processing ASL signs.

3. Multiple Sign Language Support:

- Capable of recognizing signs from various sign languages, not limited to a specific region.

4. Educational Tools:

- Provides educational features to assist users in learning and practicing ASL gestures.

❖ Functionalities:

1. Image Processing:

- Utilizes OpenCV for image processing, including hand detection and tracking.

2. Machine Learning Model Deployment:

- Deploys a trained ASL recognition model using frameworks like TensorFlow Serving or Flask.

3. Documentation and Logging:

- Includes documentation for API usage and logs relevant information for monitoring and
debugging.

~ 14 ~
CHAPTER 9

RESEARCH METHODOLOGY
When conducting research on sign language recognition and interpretation using Python, we typically follow a structured research methodology. Here is a general outline:

1. Define the Problem:

Clearly define the problem you aim to address, such as improving sign language communication through automatic recognition and interpretation.

2. Research Objectives:

Clearly outline the specific goals and objectives of your research. This could include improving
accuracy, exploring real-time processing, or addressing specific challenges in sign language
recognition.

3. Data Collection:

Describe how you collect your sign language dataset. This might involve using existing datasets,
capturing your own data, or a combination of both

4. Preprocessing:

Detail the steps taken to preprocess the data, including resizing images, normalizing pixel values,
and any other necessary transformations. Ensure that your data is suitable for input into your chosen
model.

5. Training:

Discuss the training process, including how you split your dataset into training and testing sets, the number of epochs, and any hyperparameter tuning. Mention the optimization algorithm and loss function used (a brief illustrative sketch appears at the end of this chapter).

6. Evaluation:

Present the metrics used to evaluate your model's performance. Accuracy, precision, recall, and F1-
score are common metrics. Compare your results to existing approaches and discuss any limitations.

7. Results and Discussion:

Present your results, including any visualizations or graphs. Discuss the strengths and weaknesses of
your approach and compare it to other methods in the literature.
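
To ground steps 5 and 6, the sketch below shows one way a small CNN could be compiled, trained, and evaluated in Keras. The architecture, epoch count, and the 29-class ASL Alphabet setup are illustrative assumptions, not the exact model used in this project; X_train, y_train, X_test, and y_test are assumed to come from a preprocessing step like the one sketched in Chapter 4.

from tensorflow.keras import layers, models

num_classes = 29  # e.g. the ASL Alphabet dataset described in Chapter 5

model = models.Sequential([
    layers.Input(shape=(200, 200, 3)),
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(num_classes, activation="softmax"),
])

# Step 5: Adam optimizer and categorical cross-entropy loss (one common choice)
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
# model.fit(X_train, y_train, epochs=10, validation_split=0.1)

# Step 6: evaluate accuracy on the held-out test set
# test_loss, test_accuracy = model.evaluate(X_test, y_test)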

~ 15 ~
CHAPTER 10

BASIC HAND SIGN USED IN

AMERICAN SIGN LANGUAGE (ASL)

Figure 5 A to Z Signs

Figure 6:- 0 to 9 signs

~ 16 ~
Figure 7 Basic Uses of Signs

~ 17 ~
Figure 8 Uses of Sign Languages in Modern World

~ 18 ~
CHAPTER 11

CONCLUSIONS AND FUTURE WORK


In conclusion, we were able to develop a practical and meaningful system that can understand sign language and translate it into the corresponding text. The system still has many limitations: it can detect 0-9 digit and A-Z alphabet hand gestures, but it does not cover body gestures and other dynamic gestures. We are confident that it can be improved and extended in the future.

Creating a sign language recognition and interpretation system using Python is a multifaceted process (Figure 9: Artificial Intelligence). Beginning with dataset collection and preprocessing, the journey extends to selecting and training a suitable machine learning model. The development of a real-time interface is fundamental, incorporating video frame capture, model inference, and a user-friendly display. Interpretation logic, mapping model outputs to meaningful gestures or translated text, enhances the system's usability. Optional features, such as integrating text-to-speech, further extend accessibility. Rigorous testing, ongoing optimization, and user feedback loops refine the system's accuracy and responsiveness. Once satisfactory performance is achieved, the system is ready for deployment, contributing to inclusive communication by making sign language more accessible. This comprehensive approach ensures a robust and effective solution, building a bridge between sign language users and the broader community. We believe that AI will carry this technology forward in the coming years, helping people with hearing impairments.
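
For example, the optional text-to-speech extension mentioned above could be sketched with the pyttsx3 library; this is only one possible choice of engine, and the label sequence passed in is hypothetical.

import pyttsx3

def speak(text: str) -> None:
    # Convert the interpreted sign (already mapped to text) into audible speech.
    engine = pyttsx3.init()
    engine.say(text)
    engine.runAndWait()

speak("HELLO")  # e.g. after the classifier outputs the letters H-E-L-L-O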

~ 19 ~
REFERENCES

1. https://www.wikipedia.org

2. https://www.google.com

3. https://www.dummies.com/article/academics-the-arts/language-language-arts/learning-languages/american-sign-language/signing-for-dummies-cheat-sheet-208315/

4. https://www.ai-media.tv/knowledge-hub/insights/sign-language-alphabets/

5. https://www.mathplanet.com/education/programming?gclid=CjwKCAiA75itBhA6EiwAkho9e0fj0J2VzcnJx08uuL4akrZBZQKOsHgUOTENM_YHn1DIqxzVHbJsVBoC-AkQAvD_BwE!/

6. https://opencv.org/

7. https://www.tensorflow.org/

8. https://pytorch.org/

9. https://medium.com/@20it105/sign-language-recognition-using-python-74ef7ea43181

10. https://www.who.int/

11. https://www-i6.informatik.rwth-aachen.de/aslr/database-rwth-boston-104.php

12. https://docs.opencv.org/4.x/

13. https://www.nature.com/articles/s41598-023-43852-x

14. https://towardsdatascience.com/sign-language-recognition-with-advanced-computer-vision-7b74f20f3442

15. https://books.google.com.np/books?id=HHetDwAAQBAJ&printsec=frontcover&redir_esc=yv=onepage&q&f=false

~ 20 ~
16."Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow" by Aurélien Géron

17."Deep Learning" by Ian Goodfellow, Yoshua Bengio, and Aaron Courville

18."Recent Advances in Deep Learning for Speech and Sign Language Processing: A Review" by
Jiawei Zhang, Lijun Deng, et al. (Published in IEEE Access)

19."Real-Time American Sign Language Recognition from Video Sequences Using Convolutional
Neural Networks" by Jonathan Michaux, Thibault Lefebvre, and Denis Hamad

20."Deep Sign: Deep Learning for Sign Language Recognition" by Mohammed Elsahili, Salah
Brahim, and Mohamed Atri

21."Sign Language Recognition with Convolutional Neural Networks: A Thorough Review" by


A. Chaudhary, A. Sahasrabudhe

22."American Sign Language Alphabet Recognition Using Microsoft Kinect" by R. A. Beksi, W.


Ismail

23."Sign Language Recognition: A Comparative Review of Datasets and Methods" by I. S. Cheah,


C. Quek, A. F. M. Hani

24."Sign Language Recognition (SLR) using Machine Learning: A Review" by Brijendra Singh,
Babita Pandey

~ 21 ~
CODING

1.1 Data collection code

import cv2
from cvzone.HandTrackingModule import HandDetector
import numpy as np
import math
import time

cap = cv2.VideoCapture(0, cv2.CAP_DSHOW)  # For webcam
detector = HandDetector(maxHands=1)       # Detect at most one hand

offset = 20        # Padding around the detected hand bounding box
imgSize = 300      # Side length of the square output image

folder = "Data/Y"  # Folder where captured images for this sign are stored
counter = 0

while True:
    success, img = cap.read()
    hands, img = detector.findHands(img)
    if hands:
        hand = hands[0]
        x, y, w, h = hand['bbox']

        # White square canvas on which the cropped hand is centered
        imgWhite = np.ones((imgSize, imgSize, 3), np.uint8) * 255
        imgCrop = img[y - offset: y + h + offset, x - offset:x + w + offset]

        imgCropShape = imgCrop.shape
        aspectRatio = h / w

        if aspectRatio > 1:
            # Tall hand: scale the height to imgSize and center horizontally
            k = imgSize / h
            wCal = math.ceil(k * w)
            imgResize = cv2.resize(imgCrop, (wCal, imgSize))
            imgResizeShape = imgResize.shape
            wGap = math.ceil((imgSize - wCal) / 2)
            imgWhite[:, wGap:wCal + wGap] = imgResize
        else:
            # Wide hand: scale the width to imgSize and center vertically
            k = imgSize / w
            hCal = math.ceil(k * h)
            imgResize = cv2.resize(imgCrop, (imgSize, hCal))
            imgResizeShape = imgResize.shape
            hGap = math.ceil((imgSize - hCal) / 2)
            imgWhite[hGap:hCal + hGap, :] = imgResize

        cv2.imshow("ImageCrop", imgCrop)
        cv2.imshow("ImageWhite", imgWhite)

    cv2.imshow("Image", img)
    key = cv2.waitKey(1)
    if key == ord("s"):
        # Save the normalized hand image with a unique timestamped name
        counter += 1
        cv2.imwrite(f'{folder}/Image_{time.time()}.jpg', imgWhite)
        print(counter)

~ 23 ~
1.2 Test Code

import cv2
from cvzone.HandTrackingModule import HandDetector
from cvzone.ClassificationModule import Classifier
import numpy as np
import math

cap = cv2.VideoCapture(0)
detector = HandDetector(maxHands=1)
classifier = Classifier("Model/keras_model.h5", "Model/labels.txt")

offset = 20
imgSize = 300

folder = "Data/C"
counter = 0

labels = ["A", "B", "C", "D", "E", "F", "G", "H", "X", "Y"]

while True:
    success, img = cap.read()
    imgOutput = img.copy()
    hands, img = detector.findHands(img)
    if hands:
        hand = hands[0]
        x, y, w, h = hand['bbox']

        # White square canvas on which the cropped hand is centered
        imgWhite = np.ones((imgSize, imgSize, 3), np.uint8) * 255
        imgCrop = img[y - offset:y + h + offset, x - offset:x + w + offset]

        imgCropShape = imgCrop.shape
        aspectRatio = h / w

        if aspectRatio > 1:
            # Tall hand: scale the height to imgSize and center horizontally
            k = imgSize / h
            wCal = math.ceil(k * w)
            imgResize = cv2.resize(imgCrop, (wCal, imgSize))
            imgResizeShape = imgResize.shape
            wGap = math.ceil((imgSize - wCal) / 2)
            imgWhite[:, wGap:wCal + wGap] = imgResize
            prediction, index = classifier.getPrediction(imgWhite, draw=False)
            print(prediction, index)
        else:
            # Wide hand: scale the width to imgSize and center vertically
            k = imgSize / w
            hCal = math.ceil(k * h)
            imgResize = cv2.resize(imgCrop, (imgSize, hCal))
            imgResizeShape = imgResize.shape
            hGap = math.ceil((imgSize - hCal) / 2)
            imgWhite[hGap:hCal + hGap, :] = imgResize
            prediction, index = classifier.getPrediction(imgWhite, draw=False)

        # Draw the predicted label and bounding box on the output frame
        cv2.rectangle(imgOutput, (x - offset, y - offset - 50),
                      (x - offset + 90, y - offset - 50 + 50), (255, 0, 255), cv2.FILLED)
        cv2.putText(imgOutput, labels[index], (x, y - 26), cv2.FONT_HERSHEY_COMPLEX, 1.7,
                    (255, 255, 255), 2)
        cv2.rectangle(imgOutput, (x - offset, y - offset),
                      (x + w + offset, y + h + offset), (255, 0, 255), 4)

        cv2.imshow("ImageCrop", imgCrop)
        cv2.imshow("ImageWhite", imgWhite)

    cv2.imshow("Image", imgOutput)
    cv2.waitKey(1)

~ 26 ~
EXPLANATION OF DATA COLLECTION CODE
The provided code captures hand images from a webcam using the `cvzone` library for hand
tracking. It creates a dataset for sign language gestures by saving the segmented hand images into a
specified folder. Here's an explanation of the Data Collection code:

import cv2
from cvzone.HandTrackingModule import HandDetector
import numpy as np
import math
import time

The necessary libraries are imported, including OpenCV for computer vision, `cvzone` for hand
tracking, NumPy for numerical operations, and other standard libraries.

cap = cv2.VideoCapture(0, cv2.CAP_DSHOW) # For webcam


detector = HandDetector(maxHands=1) # Detecting number of hands

The code initializes the webcam using OpenCV and sets up a hand detector to track hands in the
video stream.

offset = 20
imgSize = 300

Variables `offset` and `imgSize` are defined to add a border around the captured hand and set the
size of the final cropped hand image.

~ 27 ~
folder = "Data/Y"
counter = 0

The `folder` variable specifies the directory where the captured images will be saved. The `counter`
variable is used to keep track of the number of captured images.

while True:
    success, img = cap.read()
    hands, img = detector.findHands(img)

The code continuously captures video frames from the webcam, detects hands in the frames using
the `HandDetector` object, and retrieves the list of detected hands.

if hands:
    hand = hands[0]
    x, y, w, h = hand['bbox']

    imgWhite = np.ones((imgSize, imgSize, 3), np.uint8) * 255

    imgCrop = img[y - offset: y + h + offset, x - offset:x + w + offset]

If hands are detected, the code extracts the bounding box coordinates (`x`, `y`, `w`, `h`) of the
detected hand. It then creates a white canvas (`imgWhite`) and crops the hand region from the
original frame (`imgCrop`).

~ 28 ~
# Image resizing and centering
# ...

cv2.imshow("ImageCrop", imgCrop)
cv2.imshow("ImageWhite", imgWhite)

The code resizes and centers the cropped hand image to fit into a square canvas (`imgWhite`). It then
displays both the cropped hand image (`imgCrop`) and the resized image (`imgWhite`) in separate
windows.

cv2.imshow("Image", img)
key = cv2.waitKey(1)
if key == ord("s"):
counter += 1
cv2.imwrite(f'{folder}/Image_{time.time()}.jpg', imgWhite)
print(counter)

The original frame with hand tracking annotations is displayed in a separate window. If the 's' key is
pressed, the current hand image (`imgWhite`) is saved as a JPEG file in the specified folder with a
unique timestamped filename.

~ 29 ~
EXPLANATION OF TEST CODE

This code uses the `cvzone` library to perform hand detection and gesture classification in real-time.

Here's an explanation of the key parts of the code:

import cv2

from cvzone.HandTrackingModule import HandDetector

from cvzone.ClassificationModule import Classifier

import numpy as np

import math

The necessary libraries are imported, including OpenCV (`cv2`), `cvzone` for hand tracking and

gesture classification, NumPy for numerical operations, and the `math` library.

cap = cv2.VideoCapture(0)

detector = HandDetector(maxHands=1)

classifier = Classifier("Model/keras_model.h5", "Model/labels.txt")

The code initializes the webcam using OpenCV, sets up a hand detector to track hands in the video

stream, and loads a pre-trained classifier model and labels for gesture classification using the

`Classifier` class from `cvzone`.

~ 30 ~
offset = 20

imgSize = 300

folder = "Data/C"

counter = 0

labels = ["A", "B", "C", "D", "E", "F", "G", "H", "X", "Y"]

Variables `offset`, `imgSize`, `folder`, and `counter` are defined. The `offset` is used to add a border

around the captured hand, `imgSize` sets the size of the final cropped hand image, `folder` specifies

the directory where the captured images will be saved, and `counter` is used to keep track of the

number of captured images. `labels` is a list of gesture labels.

while True:
    success, img = cap.read()
    imgOutput = img.copy()
    hands, img = detector.findHands(img)

The code continuously captures video frames from the webcam, detects hands in the frames using

the `HandDetector` object, and retrieves the list of detected hands.

~ 31 ~
if hands:
    hand = hands[0]
    x, y, w, h = hand['bbox']

    imgWhite = np.ones((imgSize, imgSize, 3), np.uint8) * 255

    imgCrop = img[y - offset:y + h + offset, x - offset:x + w + offset]

If hands are detected, the code extracts the bounding box coordinates (`x`, `y`, `w`, `h`) of the

detected hand. It then creates a white canvas (`imgWhite`) and crops the hand region from the

original frame (`imgCrop`).

# Image resizing and centering

# ...

prediction, index = classifier.getPrediction(imgWhite, draw=False)

print(prediction, index)

The code resizes and centers the cropped hand image to fit into a square canvas (`imgWhite`). It then

uses the `getPrediction` method from the `Classifier` object to classify the hand gesture and obtain

the prediction result and index.

~ 32 ~
# Drawing bounding boxes, labels, and rectangles on the output image

# ...

cv2.imshow("ImageCrop", imgCrop)

cv2.imshow("ImageWhite", imgWhite)

cv2.imshow("Image", imgOutput)

cv2.waitKey(1)

The code then displays the cropped hand image (`imgCrop`), the resized hand image (`imgWhite`), and the original frame with the bounding box, label, and rectangles drawn on it (`imgOutput`). The loop then repeats, continuously processing new frames; there is no explicit exit key in this code.

~ 33 ~
EXPECTED OUTPUT

~ 34 ~
~ 35 ~
~ 36 ~
CODES IN VISUAL STUDIO CODE

~ 37 ~
