
MAHARASHTRA STATE BOARD OF TECHNICAL EDUCATION, MUMBAI

“ Blind People Helping Hand ”

A Project Report

Submitted by:

Sr. No.  Name of Student          Exam Seat No.
1)       Neve Sujal Rajesh        386320
2)       Khairnar Megha Himmat    386313
3)       Gamane Amrisha Gokul     386303

In partial fulfillment of the requirements for the award of the Diploma in Engineering

in the course Computer Technology at

Department of Computer Technology


K. K. WAGH POLYTECHNIC, NASHIK
Academic Year 2022-23
MAHARASHTRA STATE BOARD OF TECHNICAL EDUCATION, MUMBAI
A
Project Report
on

“ Blind People Helping Hand ”

Submitted by:

Sr. No.  Name of Student          Exam Seat No.
1)       Neve Sujal Rajesh        386320
2)       Khairnar Megha Himmat    386313
3)       Gamane Amrisha Gokul     386303

Under the Guidance of:


Name of Guide: Mr. S. H. Sangale
Designation: Lecturer in Computer Technology

Department of Computer Technology


K. K. WAGH POLYTECHNIC, NASHIK
Academic Year 2022-23
K. K. Wagh Education Society’s

K. K. WAGH POLYTECHNIC
Hirabai Haridas Vidyanagari, Amrutdham, Panchavati, Nashik-422003, Maharashtra

Certificate
This is to certify that:

Name of Student             Class     Enrolment No.   Exam Seat No.
1. Neve Sujal Rajesh        TYCM-II   2000780311      386320
2. Khairnar Megha Himmat    TYCM-II   2000780303      386313
3. Gamane Amisha Gokul      TYCM-II   2000780293      386303

from the institute K. K. Wagh Polytechnic, Nashik have completed the Project (Capstone
Project Planning and Execution (CPE)) for their final year, titled
“Blind People Helping Hand”, during the Academic Year 2022-23 in partial fulfillment
of the Diploma in Computer Technology. The project was completed in a group of three
under the guidance of the Faculty Guide.

Date : 18 / 04 / 2023
Place : Nashik

Prof. S. H. Sangale                          Prof. G. B. Katkade


Internal Faculty Guide HOD- Computer Technology

Seal of
Institute

( Sign of Industry Guide/ Mentor) Prof. P. T. Kadave


Name: Mr. Gauresh Suryawanshi Principal – K. K. Wagh Polytechnic, Nashik
Designation: CEO
Contact: 8600033181
Email ID : [email protected]
Page 3: Sponsor’s Certificate:
Page 4: Achievement / Participation Certificate
ACKNOWLEDGEMENT

With a deep sense of gratitude, we would like to thank all the people who have lit our path with
their kind guidance. We are very grateful to these intellectuals who did their best to help us during
our project work.
It is our proud privilege to express our deep sense of gratitude to Prof. P. T. Kadave,
Principal, K. K. Wagh Polytechnic, Nashik, for his comments and kind permission to complete
this project. We remain indebted to Prof. G. B. Katkade, Head of the Computer Technology
Department, for his timely suggestions and valuable guidance.
Our special gratitude goes to our external guide Mr. Gauresh Suryawanshi, CEO of our industry
sponsor R3 Systems Private Limited, Nashik, and to our internal guide Mr. S. H. Sangale, as well
as the staff members and technical staff of the Computer Technology Department, for their
excellent and precious guidance in the completion of this work. We thank all our classmates
for their appreciable help with our project.

With the help of various industry owners and lab technicians, it has been our endeavour
throughout to cover the entire scope of the project work.

We are also thankful to our parents for their wholehearted support towards the successful
completion of our project. Lastly, we thank all our friends and the people who are directly or
indirectly related to our project work.

Names of Students
1) Neve Sujal Rajesh Class: TYCM-II
2) Khairnar Megha Himmat Class: TYCM-II
3) Gamane Amrisha Gokul Class: TYCM-II
Institute Vision: Strive to empower students with Quality Technical Education.

Institute Mission: Committed to develop students as Competent and Socially Responsible
Diploma Engineers by inculcating learning-to-learn skills, values and ethics, entrepreneurial attitude,
safe and eco-friendly outlook and innovative thinking to fulfill aspirations of all the stakeholders
and contribute in the development of Organization, Society and Nation.

Department Vision: To impart quality technical education for development of technocrats.

Department Mission:

M1- To provide quality in education and facilities for students to help them to achieve higher
academic career growths.
M2- To impart education to meet the requirements of the industry and society by technological
solutions.
M3- Develop technical & soft skill through co–curricular and extra-curricular activities for
improving personality.

Program Educational Objectives: -

PEO1: Provide socially responsible, environment friendly solutions to Computer engineering


related broad-based problems adapting professional ethics.
PEO2: Adapt state-of-the-art Computer engineering broad-based technologies to work in multi-
disciplinary work environments.

PEO3: Solve broad-based problems individually and as a team member communicating effectively
in the world of work.

Program Specific Outcome:-(Version – 1.2)

PSO 1: Computer Software and Hardware Usage: Use state-of-the-art technologies for operation
and application of computer software and hardware.

PSO 2: Computer Engineering Maintenance: Maintain computer engineering related software


and hardware systems.
Program Outcomes:-
PO1. Basic and Discipline specific knowledge: Apply knowledge of basic mathematics, science and
engineering fundamentals and engineering specialization to solve the engineering problems.

PO2. Problem analysis: Identify and analyze well-defined engineering problems using codified
standard methods.

PO3. Design/ development of solutions: Design solutions for well-defined technical problems and
assist with the design of systems components or processes to meet specified needs.

PO4. Engineering Tools, Experimentation and Testing: Apply modern engineering tools and
appropriate technique to conduct standard tests and measurements.

PO5. Engineering practices for society, sustainability and environment: Apply appropriate
technology in context of society, sustainability, environment and ethical practices.

PO6. Project Management: Use engineering management principles individually, as a team member
or a leader to manage projects and effectively communicate about well-defined engineering
activities.

PO7. Life-long learning: Ability to analyse individual needs and engage in updating in the context
of technological changes.
ABSTRACT
Blind people face many problems in their day-to-day life. They find it very difficult to read books
as well as to recognize the objects in front of them. According to WHO surveys, about 30 million people are
permanently blind and 285 million people live with vision impairment. They cannot walk without the help of
others and have to ask for guidance to reach their destination, so they face many struggles in their daily life.
Visually impaired people find it difficult to detect obstacles in front of them while walking in the street,
which makes walking dangerous. Many blind people suffer in their own lives because of their vision loss.
Vision is one of the five important human senses, and people with vision loss live with this disability. Many
countries around the world provide special assistance to improve the quality of life of these people, such as
voice message services, electronic sticks that guide them while moving around, and other specialized
equipment. Blind people often cannot walk without any aid and many times rely on others for help. Several
technologies for the assistance of visually impaired people have been developed. This report proposes a
system for blind people. The proposed system aims to create a wearable visual aid for visually impaired
people in which speech commands are accepted from the user, and its functionality addresses the
identification of objects. This will help the visually impaired person to manage day-to-day activities in his
or her surroundings. To help blind people, the visual world has to be transformed into the audio world with
the potential to inform them about objects. Therefore, we propose to aid the visually impaired by introducing
a system that is feasible, compact, and cost-effective. The system makes use of a Raspberry Pi, which is used
to implement artificial vision in the Python language on the OpenCV platform. The system consists of a camera
interfaced with the Raspberry Pi: the Pi camera detects and finds the type of object with the help of an
ultrasonic sensor, and the ultrasonic sensor detects real-time hurdles while walking on the road. The
ultrasonic sensor used in this project plays a vital role: it detects the object in front of the user, and
when an object is detected an indication sound is given to the user via an earphone, so that the user knows
there is an obstacle ahead. The proposed system detects objects around the user and sends feedback in the
form of speech warning messages via the earphone. This new system may solve some of the major problems blind
persons still face. The aim of all these systems is to help the user detect objects without the help of a
second person. The system is intended to provide a low-cost and efficient obstacle detection aid for the
blind, which gives a sense of artificial vision by providing information about the environmental scenario of
static and dynamic objects around them, so that they can walk independently. The system combines this with
object detection, with the objective of developing a user-friendly application.

Keywords: Ultrasonic Sensor, Raspberry Pi (R-Pi), Object Detection, Pi Camera, Speech Output, Earphone
Table of Contents
Sr.No. Name of topic Page no.
Certificate
Sponsor’s Certificate
Achievements / Participation Certificates
Acknowledgement
Abstract
Table of contents
1 Introduction
1.1 Introduction
1.2 Problem Definition
1.3 Purpose
1.4 Need
1.5 Objectives
1.6 Scope
1.7 System Architecture
2 Literature Survey
2.1 Literature Survey
2.2 Existing System
2.3 Proposed System
3 Project Requirement Analysis and Feasibility
3.1 Requirements Specification
3.1.1 Functional Requirements
3.1.2 Non-Functional Requirements
3.1.3 Hardware Requirements
3.1.4 Software Requirements
4 Project Design and Analysis
4.1 Process Model
4.2 Breakdown Structure
4.3 DFD Diagram
4.4 Component Design Diagram
4.5 ER Diagram
4.6 UML Diagrams
5 Module Analysis
5.1 Module Analysis
5.1.1 Object Detection
5.1.2 Object Recognition
6 Project Cost and Time Estimation
6.1 Cost Estimation
6.2 Time Estimation
7 Future Scope
8 Features of the project
9 Applications of the project
10 Advantages and Disadvantages
10.1 Advantages
10.2 Disadvantages
11 Conclusion
12 References

Index of Figures
Fig.No. Name of figure Page no.
1.1 System Architecture
4.1 Waterfall Model
4.2 Breakdown Structure Diagram
4.3 DFD Level 0 Diagram
4.4 DFD Level 1 Diagram
4.5 DFD Level 2 Diagram
4.6 Component Diagram
4.7 ER Diagram
4.8 Class Diagram
4.9 Activity Diagram
4.10 Usecase Diagram
4.11 Sequence Diagram
4.12 Collaboration Diagram
4.13 State Chart Diagram
4.14 Deployment Diagram

Index of Tables
Tab.No. Title of Table Page no.
6.1 Cost Estimation Table
6.2 Time Estimation Table
1. INTRODUCTION

The World Health Organization (WHO) estimates that 253 million people live with visual impairment:
217 million of them have moderate to severe vision impairment and 37 million are blind. Traditionally, tools
like the stick have been used for decades to avoid obstacles. Keeping pet dogs or guide dogs was another of
the few techniques used by the visually impaired, and some took the help of family and friends for assistance.
However, these techniques and tools have drawbacks: people with trained dogs need money and time to feed and
train them, and in today's technologically advanced world such techniques cannot satisfy the needs of
visually impaired people. According to statistics from the World Health Organization (WHO), at least 285
million people are visually impaired or blind. Blind people generally have to rely on white canes, guide
dogs, screen-reading software, magnifiers, and glasses for navigation and detection of surrounding objects.
Therefore, to help blind people, the visual world has to be transformed into the audio world with the
potential to inform them about objects. In this report, we propose a real-time object detection system to
help visually impaired people in their daily life. The system is built around a Raspberry Pi: we propose
camera-based assistive speech output of object names to help blind persons detect objects. The R-Pi based
object detection and reader is proposed as a solution to the difficulties visually impaired people face in
detecting obstacles and dangers in front of them while walking, and to help them identify the world around
them. Blind people have great difficulty in carrying out their daily routines. Due to low vision or blindness,
they may suffer from an inferiority complex, and it also affects their economic condition because of reduced
efficiency in work and the cost of treatment. The major difficulty faced by blind people is navigating or
travelling in unknown surroundings. The proposed device helps blind people to navigate with speed and
confidence by detecting nearby objects and obstacles using a Pi camera and an ultrasonic sensor, and notifying
them with a speech warning along with a voice alert. A portable, user-friendly device is developed that can
identify obstacles in the path using ultrasonic sensors: if the obstacle is close, the Raspberry Pi triggers
an alert sound and also sends a voice command through the earphones. The proposed system detects obstacles
present outdoors and indoors with the help of a camera; when any object or obstacle comes within range of
the ultrasonic sensor, a speech warning message is delivered via the earphone. Since running the daily life
of a blind person is very difficult, this project helps them to live their life as usual. They can carry this
project as a gadget or a device in their hands which detects obstacles. This project aims to be more efficient
than the existing system while being cheaper and more accurate.
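
As an illustration of the ultrasonic ranging step described above, the following minimal Python sketch shows how an HC-SR04 style sensor could be read on the Raspberry Pi. The sensor model, the GPIO pin numbers (TRIG on 23, ECHO on 24) and the one-metre warning threshold are assumptions for illustration; the report does not specify the exact wiring.

# Minimal sketch: reading an HC-SR04-style ultrasonic sensor on a Raspberry Pi.
# The pin numbers (TRIG=23, ECHO=24) and the 1 m threshold are assumptions.
import time
import RPi.GPIO as GPIO

TRIG, ECHO = 23, 24
GPIO.setmode(GPIO.BCM)
GPIO.setup(TRIG, GPIO.OUT)
GPIO.setup(ECHO, GPIO.IN)

def read_distance_cm():
    """Send one trigger pulse and return the measured distance in centimetres."""
    GPIO.output(TRIG, True)               # 10 microsecond trigger pulse
    time.sleep(0.00001)
    GPIO.output(TRIG, False)

    pulse_start = pulse_end = time.time()
    while GPIO.input(ECHO) == 0:          # wait for the echo pulse to start
        pulse_start = time.time()
    while GPIO.input(ECHO) == 1:          # wait for the echo pulse to end
        pulse_end = time.time()

    # Sound travels ~34300 cm/s; the pulse covers the distance twice (out and back).
    return (pulse_end - pulse_start) * 17150

try:
    while True:
        distance = read_distance_cm()
        if distance < 100:                # obstacle within about one metre
            print(f"Obstacle at about {distance:.0f} cm")   # later spoken via TTS
        time.sleep(0.2)
finally:
    GPIO.cleanup()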
1.1 Literature Survey:
Many researchers have contributed to this field. Various combinations of existing technologies have been
used. Braille systems, screen magnifiers, etc. went through some developments but later faced technical issues.

1. Ayat A. Nada proposed a smart stick based on infrared technology. Existing stick solutions use different
   technologies such as infrared, ultrasonic sensors and lasers, but they still have drawbacks. The study
   introduces a lightweight, low-cost, adaptable, fast-response and low-power smart stick. A combination of
   infrared sensors can reveal staircases and other obstacles present in the user's path within a range of
   two metres. The experimental results show good accuracy, and the stick is able to identify all of the obstacles.

2. S. Innet and N. Ritnoom noted that blind people use a white stick as a tool for directing them when they
   move or walk. Although the white stick is helpful, it cannot give a high assurance of keeping blind people
   away from all kinds of hurdles. Several researchers have therefore worked on electronic devices that protect
   blind people from obstacles with a higher guarantee. This study introduces an obstacle-avoidance alternative
   using an electronic stick that serves as a walking tool for blind people. It exploits an infrared sensor for
   detecting hurdles along the roadway. The infrared stick is able to identify all types of material present in
   the path, such as concrete, wood, metal, glass and human beings. The results also show that the stick detects
   obstacles in a range of 80 cm, which is the same as the length of a white stick. The stick is designed to be
   small and light, so that blind people can carry it comfortably.

3. Ross Girshick proposed a Fast Region-based Convolutional Network method (Fast R-CNN) for object detection.
   Fast R-CNN builds on previous work to efficiently classify object proposals using deep convolutional
   networks. Compared to previous work, Fast R-CNN employs several innovations to improve training and testing
   speed while also increasing detection accuracy. Fast R-CNN trains the very deep VGG16 network 9x faster than
   R-CNN, is 213x faster at test time, and achieves a higher mAP on PASCAL VOC 2012. Compared to SPPnet, Fast
   R-CNN trains VGG16 3x faster, tests 10x faster, and is more accurate. Fast R-CNN is implemented in Python and C++.

4. Multiple Distance Sensors Based Smart Stick for Visually Impaired People: Amit Kumar proposed this system,
   a novel low-cost yet durable and accurate smart stick to assist visually impaired people while they walk in
   indoor/outdoor unstructured environments. There is a large group of people who have difficulties in their
   daily routine work due to loss of eyesight; walking with confidence is one of them, and it can present
   different challenges in different environments and countries. The Indian context has been considered, where
   outdoor environments are often cluttered and noisy. Keeping these challenges in mind, a new smart stick was
   developed which is capable of detecting obstacles of any height in front of or slightly to the side of the
   person. The stick gives a fair idea about the distance and the location of obstacles through vibration in
   the hand and audio in the ear of the person. A wireless connection is set up using Bluetooth between the
   earphone and the stick. Different frequencies of the generated vibration and different audio tracks alert
   the person about the distance of the obstacle. Real-time experiments have been conducted in different
   environments by different people to observe the accuracy of the stick, and the results are quite encouraging.

5. In this system, an electronic aid for visually impaired people is designed which helps them travel to their
   destination like sighted people. The aiding system is built into a walking stick that is carried by the
   visually impaired person and acts as a substitute for vision. Sensors with highly accurate outputs are used
   in this work, and an intelligent algorithm is used in the software to make it more user-friendly. A suitable
   walking stick is designed with all the components built in. Blind people will be able to move from one place
   to another without other help. If such a system is developed, it will act as a basic platform for the
   invention of more such cost-effective devices for blind people in the future. As far as localization is
   concerned, it will be able to provide accurate information on the position of the blind person in case they
   get lost, with help from GPS, which would be a real boon for the blind. The developed prototype gives good
   results in detecting obstacles placed at a distance in front of the user. The work reports the design of a
   multi-sensor blind stick that will be useful for visually impaired people. The stick includes features which
   detect the front and top parts of obstacles as well as stagnant water and manholes on the ground. Due to
   these features it is a good tool for blind and visually impaired people walking on the road. It is a simple,
   cost-effective, configurable and easy-to-handle smart guidance system. The system was planned, implemented,
   tested and verified, and the results indicate that it is effective in identifying the direction and distance
   of the obstacles.

6. Navigation assistance for visually impaired (NAVI) refers to systems that can assist or guide people with
vision loss, ranging from partially sighted to blind, using sound commands. In this paper, a new system for
NAVI is presented based on visual and range information. Instead of using several sensors, we choose one
device, a consumer RGB-D camera, and take advantage of both range and visual information. In particular,
the main contribution is the combination of depth information with image intensities, resulting in the robust
expansion of the range-based floor segmentation. On one hand, depth information, which is reliable but
limited to a short-range, is enhanced with the long-range visual information. On the other hand, the difficult
and prone-to-error image processing is eased and improved with depth information. The proposed system
detects and classifies the main structural elements of the scene providing the user with obstacle-free paths
to navigate safely across unknown scenarios. The proposed system has been tested on a wide variety of
scenarios and data sets, giving successful results and showing that the system is robust and works in
challenging indoor environments.

7. Laviniu Tepelea, Ioan Gavrilut and Alexandru Gacsadi proposed an assistance system in which, to assist
   people with visual impairments, the smartphone proves to be very useful but requires sensor modules external
   to the phone to detect obstacles and find a safe way. The Android application they developed offers not only
   a travel guide but also other daily assistive functions such as reading a poster or article, making phone
   calls, and finding the date, time and battery level. The special interface created for the blind proved its
   efficiency, and communicating the relevant information verbally through TTS to an earphone in one ear leads
   to a correct understanding of the message while leaving the user the opportunity to receive other useful
   information from the environment. External platforms communicate sensor data to the phone via Bluetooth and
   Wi-Fi, where sensor data fusion is performed, and at the appropriate time the relevant information is
   communicated to the user, warning of the existence of an obstacle at a certain level. The acceleration sensor
   can detect when the person with visual impairment is falling, in which case a phone call to a favourite
   number, for example the emergency service, is made, and the light sensor detects the need to move from
   outdoor guidance to indoor guidance and vice versa. Experimental tests with the assistive system have proven
   its usefulness, but they have also revealed that further testing is needed to find the optimum obstacle
   detection distance, both inside buildings and outdoors. In the future, more powerful and cheaper smartphones
   will become available, which will lead to more effective assistance. Finally, this aiding system for the
   visually impaired, based on a smartphone together with small external sensor modules, proves to be a viable,
   portable, low-cost, small-scale solution. More importantly, it does not require many hours of training.

8. Kabalan Chaccour and Georges Badr proposed a novel design for an indoor navigation system for visually
   impaired and blind people. The proposed approach has a simple architecture that allows the subject to be
   fully independent at home or at work. The system provides navigation assistance and obstacle avoidance
   functionality in indoor premises. Unlike other systems in the related work, the subject needs only to hold
   a smartphone during displacement and does not require any particular skills to operate it. The complexity
   of the system resides in the computer vision processing algorithm, which is seamless to the user. Future
   development is planned to expand the system with more functionality. On the application level, the authors
   are working to automatically run the application when motion is sensed from the subject (chair rise, walking,
   etc.); it can also revert to sleep mode in static conditions to minimize battery consumption. The battery
   level can also be communicated aloud through voice messages to avoid critical situations. The application
   may also offer gait analysis for elderly and visually impaired subjects and may prevent the subject from a
   potential fall. On the remote processing level, enhanced image processing algorithms may be implemented to
   detect specific objects in the environment. The time between voice messages must be chosen appropriately to
   avoid flooding the ears and causing disturbance. These issues will be addressed in future research activities.

9. This system presents a smart guiding device for visually impaired users, which can help them move safely
   and efficiently in a complicated indoor environment. The depth image and multi-sensor fusion based
   algorithms solve the problem of avoiding small and transparent obstacles. Three main auditory cues for blind
   users were developed and tested in different scenarios, and the results show that beep-sound-based guiding
   instructions are the most efficient and well adapted. For weak-sighted users, visual enhancement based on
   the AR technique was adopted to integrate the traversable direction into the binocular images, helping users
   to walk more quickly and safely. The computation is fast enough for the detection and display of obstacles.
   Experimental results show that the proposed smart guiding glasses can improve the travelling experience of
   visually impaired people. The sensors used in the system are simple and low cost, making it possible for it
   to be widely used in the consumer market.

10. Zoran Zivkovic et al. propose an improved adaptive Gaussian mixture model for background subtraction.
    Background subtraction is a common computer vision task; they analyze the usual pixel-level approach and
    develop an efficient adaptive algorithm using a Gaussian mixture probability density. Recursive equations
    are used to constantly update the parameters and to simultaneously select the appropriate number of
    components for each pixel.
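
The adaptive Gaussian mixture method summarised in item 10 is available in OpenCV as the MOG2 background subtractor. The short Python sketch below shows how it could be applied to a live camera stream; the camera index 0 and the parameter values are illustrative assumptions, not values taken from the report.

# Minimal sketch: adaptive Gaussian mixture background subtraction (Zivkovic's
# method) using OpenCV's MOG2 implementation on a live camera stream.
import cv2

cap = cv2.VideoCapture(0)                      # camera index 0 is an assumption
subtractor = cv2.createBackgroundSubtractorMOG2(history=500,
                                                varThreshold=16,
                                                detectShadows=True)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    # Moving objects appear white in the mask, shadows grey, background black.
    fg_mask = subtractor.apply(frame)
    cv2.imshow("foreground mask", fg_mask)
    if cv2.waitKey(1) & 0xFF == ord("q"):      # press 'q' to stop
        break

cap.release()
cv2.destroyAllWindows()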
1.2 Existing System
In the existing system, blind people generally use either the typical white cane or a guide dog to travel. The
white cane is a widely used mobility aid that helps blind people navigate their surroundings. Although the
white cane gives a warning a few metres before the obstacle, at a normal walking speed the time to react is
very short. Some designs combine an ultrasonic sensor with both a sound alert for blind users and a vibrating
alert for deaf-blind users: the sensor detects obstacles within its designed range and warns the user through
a distinctive sound, or through a vibration that can be felt by placing a finger on a button at the top of the
device when there is a risk. This approach involves more manual work and does not provide better results. The
existing system does not provide proper navigation and is not very effective. The major drawbacks that the
existing system was unable to resolve were:
• Viewpoint variation: an object viewed from different angles may look completely different.
• Deformation: if many objects are present, the scene can appear deformed and the number of objects cannot be
  correctly identified.
• Intra-class variation: objects of the same class can vary considerably in appearance.

1.3 Proposed System

The proposed system is a customized kit for blind people which acts as a voice assistant. The kit helps
blind people access the most important features of text to speech. The system has custom object detection and
speaks out every object detected; in all, it is a voice assistant for blind people. The proposed system uses a
Raspberry Pi, a small processing device which works as a computer at relatively low cost. Blind and visually
impaired people find it difficult to detect obstacles while walking in the street. The system is intended to
provide artificial vision, object detection and real-time assistance using the Raspberry Pi. It consists of a
Pi camera module, ultrasonic sensors and an earphone to receive instructions through audio; voice output works
through TTS (text to speech). The system detects objects around the user, finds the type of each object and
sends feedback in the form of speech, i.e. warning messages via the earphone. The aim of the overall system is
to provide a low-cost, efficient navigation and obstacle detection aid for the blind which gives a sense of
artificial vision by providing information about the natural scenario of static and dynamic objects around
them, so that they can walk independently. The proposed system helps us deal with the limitations of the
existing system: it can view and detect objects from different angles, it helps identify the correct number of
objects present in front of the Pi camera, and objects of the same class can be detected and recognized
correctly, which helps the visually impaired person to know the accurate type of object.
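
The report states that detection is implemented with Python and OpenCV on the Raspberry Pi but does not name a specific model. As one plausible realisation, the sketch below assumes a pre-trained MobileNet-SSD Caffe model (the file names MobileNetSSD_deploy.prototxt and MobileNetSSD_deploy.caffemodel are hypothetical here) loaded through OpenCV's DNN module; the detected class names would then be passed to the text-to-speech output.

# Minimal sketch of the object-detection step, assuming a pre-trained
# MobileNet-SSD Caffe model; the model files and class list are illustrative.
import cv2

CLASSES = ["background", "aeroplane", "bicycle", "bird", "boat", "bottle",
           "bus", "car", "cat", "chair", "cow", "diningtable", "dog", "horse",
           "motorbike", "person", "pottedplant", "sheep", "sofa", "train",
           "tvmonitor"]

net = cv2.dnn.readNetFromCaffe("MobileNetSSD_deploy.prototxt",
                               "MobileNetSSD_deploy.caffemodel")

def detect_objects(frame, conf_threshold=0.5):
    """Return a list of (label, confidence) pairs detected in one frame."""
    blob = cv2.dnn.blobFromImage(cv2.resize(frame, (300, 300)),
                                 0.007843, (300, 300), 127.5)
    net.setInput(blob)
    detections = net.forward()
    results = []
    for i in range(detections.shape[2]):
        confidence = float(detections[0, 0, i, 2])
        if confidence > conf_threshold:
            class_id = int(detections[0, 0, i, 1])
            results.append((CLASSES[class_id], confidence))
    return results

cap = cv2.VideoCapture(0)          # Pi camera exposed as /dev/video0 is assumed
ok, frame = cap.read()
if ok:
    for label, conf in detect_objects(frame):
        print(f"Detected {label} ({conf:.0%})")   # later announced via TTS
cap.release()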
2. ANALYSIS AND FEASIBILITY

2.1 Analysis

1) Problem Statement: The problem statement for the project is to develop a system that can assist blind
people in detecting and recognizing objects in their environment. The system should be able to capture
images or video from a camera or sensor, process the data, and provide speech feedback to the user about
the objects detected.

2) Technology and Tools: The project requires a range of technologies and tools, including sensor technology,
image and video processing, object detection and recognition algorithms, and user interface design. Some
of the specific technologies and tools that can be used for the project include OpenCV, Raspberry Pi.

3) Hardware Requirements: The project requires hardware components such as cameras, ultrasonic sensor,
Raspberry-Pi , R-Pi camera, and headphones that can be integrated into a compact and portable device.
The device is designed to be lightweight, easy to carry, and durable.

4) Object Detection and Recognition Algorithms: For the proposed system we selected object detection and
   recognition algorithms that can accurately detect and recognize objects in the environment. These include
   pre-trained classifiers available in OpenCV that were trained using machine learning techniques (a minimal
   sketch follows this list).

5) User Interface Design: The project uses a simple and compact user interface that blind people can use to
   interact with the system. The interface is designed with accessibility in mind and provides feedback in a
   way that is easy to understand.

6) Testing and Evaluation: The system is tested and evaluated to ensure that it meets the requirements and is
   accurate and reliable. Testing also involves real-world scenarios and user feedback to identify areas for
   improvement.
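
As referenced in point 4 above, the sketch below illustrates the general idea of using one of OpenCV's bundled pre-trained classifiers (a Haar cascade). The face cascade and the test image name are purely illustrative; the report does not specify which pre-trained classifier is used.

# Minimal sketch of using a pre-trained OpenCV classifier (Haar cascade).
# The face cascade and the input file name are illustrative assumptions.
import cv2

cascade_path = cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
classifier = cv2.CascadeClassifier(cascade_path)

frame = cv2.imread("test.jpg")                     # hypothetical input image
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)     # cascades work on greyscale
boxes = classifier.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

for (x, y, w, h) in boxes:                         # draw a box per detection
    cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
print(f"{len(boxes)} object(s) of the trained class detected")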
2.2 Feasibility Study:

A feasibility study is a high-level capsule version of the entire system analysis and design process. The
study begins by clarifying the problem definition; feasibility is about determining whether the project is
worth doing. Once an acceptable problem definition has been generated, the analyst develops a logical model of
the system, and a search for alternatives is analyzed carefully. There are three parts to the feasibility study.

1) Operational Feasibility

2) Technical Feasibility

3) Economical Feasibility

2.2.1 Operational feasibility:

Operational feasibility for blind person object detection and recognition would involve assessing whether
the system can be implemented effectively and efficiently in the intended operating environment. This
assessment includes evaluating the system's ability to meet the needs of the users and stakeholders and
identifying any potential obstacles that could hinder the system's successful implementation.

The following are some factors that need to be considered when assessing operational feasibility:

• User acceptance: Blind people are the primary users of the system, and their feedback and acceptance of
the system are critical. The system should be designed to meet their needs and preferences, and user testing
should be conducted to ensure that the system meets their expectations.

• System integration: The system must integrate seamlessly with the existing infrastructure and environment
in which it will be used. For example, if the system is to be used in a crowded city environment, it must
be able to detect and recognize objects accurately in such an environment.

• Training and support: Blind people using the system will need adequate training and support to learn how
to use the system effectively. The system is designed to be user-friendly and easy to understand, and
training materials and support should be provided to ensure that users can use the system effectively.

• Maintenance and support: The system will require regular maintenance and support to ensure that it
remains operational and continues to meet the needs of the users. The support infrastructure should be
established to ensure that users can access support when needed.

• Resource availability: The system will require sufficient resources such as connectivity, and computing
resources to operate effectively. The availability of these resources should be assessed to ensure that they
are sufficient to support the system's operation.

• Legal and regulatory compliance: The system complies with relevant legal and regulatory requirements,
such as data privacy and security laws. These requirements should be assessed, and compliance measures
should be incorporated into the system design and operation.
In conclusion, operational feasibility for blind person object detection and recognition requires a
comprehensive assessment of the system's ability to operate effectively and efficiently in the intended
operating environment. The system must meet the needs of the users and stakeholders and comply with relevant
legal and regulatory requirements. Adequate training, support, and maintenance infrastructure must be
established to ensure the system's successful implementation and long-term operation.

2.2.2 Technical Feasibility:


Technical feasibility for blind person object detection and recognition involves evaluating the technical
aspects of the system to determine whether it is possible to develop and implement such a system. This
assessment includes evaluating the available technology, hardware, software, and algorithms needed to build
the system. The following are some factors that need to be considered when assessing technical feasibility:

• Hardware: The system would require a camera or sensor to capture images of the surrounding environment.
The hardware should be capable of capturing high-quality images in different lighting conditions and be
lightweight and portable enough to be carried by a blind person.

• Software: The software used for object detection and recognition should be able to accurately identify
objects in real-time, even in complex environments such as crowded or cluttered spaces.

• Algorithms: The algorithms used for object detection and recognition should be accurate and efficient,
capable of processing large amounts of data quickly, and scalable to different environments and scenarios.

• Data processing and storage: The system would require a processing unit and storage to process the image
data and store the results. The processing unit should be fast enough to process the data in real-time, and
the storage should be reliable and secure.

• User Interface: The user interface of the system should be designed to be accessible to blind people, with
options for audio feedback and tactile interfaces.

• Connectivity: The system should have a reliable and stable connectivity infrastructure to send data and
receive updates.

• Compatibility: The system should be compatible with different platforms, devices, and operating systems
to ensure that it can be easily integrated into existing infrastructure and environments.

2.2.3 Economical Feasibility:


Economic feasibility establishes the cost-effectiveness of the proposed system: if the benefits do not outweigh
the costs, it is not worth going ahead. In today's fast-paced world there is a great need for affordable
assistive technology, and the benefits of this project in the current scenario make it economically feasible.
The purpose of the economic feasibility assessment is to determine the positive economic benefits that the
proposed system will provide to the organization; it includes identification and quantification of all the
expected benefits and typically involves a cost/benefit analysis. The proposed system is an affordable
hardware kit for visually impaired people, and its benefits and features outweigh the cost of the hardware kit.
3 PROJECT REQUIREMENTS

3.1 About Proposed Project:


A proposed project for blind person object detection and recognition would involve developing a system
that can detect and recognize objects in the environment and provide the user with relevant information about
those objects. The proposed system would use advanced computer vision algorithms, sensors, and other
technologies to identify objects and obstacles around the user.

The system would consist of two main modules:

• Object detection module: This module is responsible for detecting objects in the environment using
  ultrasonic sensors and the R-Pi camera. An OpenCV algorithm is used to analyze the data and identify objects
  in real time.

• Object recognition module: This module is responsible for recognizing objects and providing relevant
  information to the user. The system uses machine learning algorithms to classify objects and provide
  information about their size, shape, color, and location.

The proposed system would also involve developing a user-friendly interface that blind people can use to
access the information provided by the system. The interface is in the form of a hardware kit which is compact
sized and can be easily fitted anywhere.

As described earlier, the proposed system is a customized kit for blind people which acts as a voice assistant,
giving them access to the most important features of text to speech. The system has custom object detection
and speaks out every object detected. It uses a Raspberry Pi, a small processing device which works as a
computer at relatively low cost, together with a Pi camera module, ultrasonic sensors and an earphone; voice
output works through TTS (text to speech). The system detects objects around the user, finds the type of each
object and sends feedback in the form of speech warning messages via the earphone. The aim is to provide a
low-cost, efficient navigation and obstacle detection aid for the blind which gives a sense of artificial
vision by providing information about static and dynamic objects around them, so that they can walk
independently. The proposed system addresses the limitations of the existing system: it can view and detect
objects from different angles, identify the correct number of objects present in front of the Pi camera, and
detect and recognize objects of the same class correctly, which helps the visually impaired person to know the
accurate type of object.
3.2 Area of Implementation:
Internet of Things (IoT)

3.3 Software / System Requirement Specification:

3.3.1 User interface:


1. Operating System: Windows XP and later versions
2. Front & Back End: HTML, CSS.
3. Programming Language: Python.
4. Database: MySQL /Firebase.

3.3.2 Hardware interface:


1. Processor – i3
2. Hard Disk – 5 GB
3. Memory – 1GB RAM
4. Earphone
5. Pi Camera
6. R-Pi
7. Ultrasonic Sensor

3.3.3 Software interface:


1. Operating System: Windows XP and later versions
2. Programming Language: Python

3.3.4 Communication interface:


This project supports all types of web browsers and requires an active internet connection.
3.4 Functional Requirements:
• Object Detection:
1. REQ-1: The system should capture the object from the camera and extract it with clear perception.
2. REQ-2: The system should convert text to speech with the help of a Text-to-Speech engine, and the text
   should be read out for the user (a minimal TTS sketch follows this list).
3. REQ-3: It normally converts normal text into speech.
4. REQ-4: Image processing should be performed on the object present in front of the camera.
5. REQ-5: The information should be converted to voice and delivered through the headphone.
6. REQ-6: The mini camera should be connected to the Raspberry Pi.
7. REQ-7: The Raspberry Pi should be programmed using the Python language, which is powerful for processing.

• Object Recognition:
1. REQ-1: The system should process and analyze the camera stream using the Raspberry Pi in real time.
2. REQ-2: The object should be detected and its type recognized in front of the blind person.
3. REQ-3: An alarm system should be designed and built to notify the user about the recognized objects using
   voice messages.
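
As noted in REQ-2 and REQ-5, detected text or object names have to be spoken to the user. The report does not name the TTS engine; the sketch below assumes pyttsx3 (which drives eSpeak on the Raspberry Pi) purely as an illustration, and the announce() helper and its parameters are hypothetical, not part of the report's code.

# Minimal sketch of REQ-2/REQ-5: announcing a detected object name as speech.
# pyttsx3 (driving eSpeak on the Pi) is an assumption for illustration.
import pyttsx3

engine = pyttsx3.init()
engine.setProperty("rate", 150)          # speaking speed in words per minute

def announce(object_name, distance_cm=None):
    """Turn a detection result into a spoken warning through the earphone."""
    if distance_cm is not None:
        message = f"{object_name} ahead, about {int(distance_cm)} centimetres"
    else:
        message = f"{object_name} detected"
    engine.say(message)
    engine.runAndWait()                  # blocks until the sentence is spoken

announce("chair", 80)                    # example: "chair ahead, about 80 centimetres"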

3.5 Non – Functional Requirements:


a) Performance requirements:
1. The system gives advice or alerts the user immediately.
2. The system gives accurate results.
3. Interactive, minimal delays, safe information transmission.

b) Safety requirements:
1. Nobody will be harmed while developing the system.
2. Easy to use.
3. The system is embedded with management procedures and validation procedures.
c) Security requirements:
1. The system keeps all users' information with high security.
2. Identify all users, authenticate/verify user credentials, authorize users/third parties, audit user
   activity, backup, server clustering, system policies.
d) Software quality attributes:

1. Predictability
2. Accuracy
3. Maintainability
4. Usability
5. Modifiability
6. Interoperability
7. Efficiency
3.6 Hardware Requirements:

1. Processor – i3
2. Hard Disk – 5 GB
3. Memory – 1GB RAM
4. Earphone
5. Pi Camera
6. R-Pi
7. Ultrasonic Sensor

3.7 Software Requirements:

1. Operating System: Windows XP and later versions


2. Front & Back End: HTML, CSS.
3. Programming Language: Python.
4. Database: MySQL /Firebase.
4. PROJECT DESIGN AND IMPLEMENTATIONS

4.1 Breakdown Structure:

A Work Breakdown Structure (WBS) is a hierarchical outline of the tasks required to complete a
project. The WBS “breaks down” the structure of a project into manageable deliverables. Each deliverable is
assigned a task, or a series of tasks, that can be further broken down into subtasks to meet the needs of the
project. Breaking work into smaller tasks is a common productivity technique used to make the work more
manageable and approachable. The advantage of using a WBS as part of project lifecycle management is that it
takes large, complex projects and breaks them into smaller, more manageable tasks that can be assigned to
specific people or teams to deliver. The figure below shows the tasks performed by the components of the
proposed system.

Fig 4.1: Breakdown Structure


4.2 Data Flow Diagrams:

Data objects are represented by labeled arrows, and transformations are represented by circles, also called
bubbles. A DFD is presented in a hierarchical fashion: the first data flow model represents the system as a
whole, and subsequent DFDs refine the context diagram (level 0 DFD), providing increasing detail with each
subsequent level. The DFD enables the software engineer to develop models of the information domain and the
functional domain at the same time. As the DFD is refined into greater levels of detail, the analyst performs
an implicit functional decomposition of the system. At the same time, the DFD refinement results in a
corresponding refinement of the data as it moves through the processes that embody the application. In a
context-level DFD for the system, the primary external entities produce information for use by the system and
consume information generated by the system; the labeled arrows represent data objects or object hierarchies.
The context diagram is the most abstract data flow representation of a system: it represents the entire system
as a single bubble, and the various external entities with which the system interacts, together with the data
flows occurring between the system and those entities, are also represented. The name context diagram is well
justified because it represents the context in which the system is to exist, i.e. the external entities (users)
that would interact with the system and the specific data items they would be receiving from the system.

4.2.1 DFD level 0:


Level 0 DFDs, also known as context diagrams, are the most basic data flow diagrams. They provide
a broad view that is easily digestible but offers little detail. Level 0 data flow diagrams show a single process
node and its connections to external entities. For instance, the figure shown below illustrates the Blind People
Helping Hand process with the flow of information between Input and Output.

Fig 4.2: DFD Level 0


4.2.2 DFD level 1:
The Level 0 DFD is broken down into more specific, Level 1 DFD. Level 1 DFD depicts basic modules
in the system and flow of data among various modules. Level 1 DFDs are still a general overview, but they
go into more detail than a context diagram. In level 1 DFD, the single process node from the context diagram
is broken down into sub-processes. As these processes are added, the diagram will need additional data
flows and data stores to link them together. Level 1 DFD also mentions basic processes and sources of
information.
• It provides a more detailed view of the Context Level Diagram.
• Here, the main functions carried out by the system are highlighted as we break into its sub-processes.
In level 1 DFD, the single process node from the context diagram is broken down into sub-processes. As these
processes are added, the diagram will need additional data flows and data stores to link them together. In the
Blind People Helping Hand Application, process with the flow of information between User and System. This
can include Object Detection & generate voice output.

Fig 4.3: DFD Level 1


4.2.3 DFD level 2:

A level 2 DFD simply breaks processes down into more detailed sub-processes; it goes one step deeper into
parts of the level 1 DFD and can be used to plan or record the specific detail of the system's functioning.
The level 2 diagram below expands the Blind People Helping Hand process to include the more granular processes
involved, such as detection and the data flows of the devices connected to the R-Pi: the earphone, the Pi
camera and the ultrasonic sensor.

Fig 4.4: DFD Level 2


4.3 ER Diagram:
The Entity-Relationship (ER) model was originally proposed by Peter Chen in 1976 [Chen76] as a way to unify
the network and relational database views. Simply stated, the ER model is a conceptual data model that views
the real world as entities and relationships. A basic component of the model is the Entity-Relationship
diagram, which is used to visually represent data objects. Since Chen wrote his paper the model has been
extended, and today it is commonly used for database design. For the database designer, the utility of the ER
model is:

• It is simple and easy to understand with a minimum of training. Therefore, the model can be used by the
  database designer to communicate the design to the end user.
• In addition, the model can be used as a design plan by the database developer to implement a data model
  in specific database management software.
As shown in the figure below, Object Detect, People, Ultrasonic Sensor and Preference Model are the entities
of the Blind People Helping Hand application, and the diagram shows the relationships between these entities.

Fig 4.5 ER Diagram


4.4 Component design diagram:
A component diagram, also known as a UML component diagram, describes the organization and wiring of the
physical components in a system. When dealing with documentation of complex systems, component UML diagrams
can help break the system down into smaller components. Sometimes it is hard to depict the architecture of a
system because it might encompass several departments or employ different technologies. Component diagrams are
often drawn to help model implementation details and double-check that every aspect of the system's required
functions is covered by planned development. In the first version of UML, the components included in these
diagrams were physical: documents, database tables, files, and executables, all physical elements with a
location. In UML 2, these components are less physical and more conceptual stand-alone design elements, such
as a business process that provides or requires interfaces to interact with other constructs in the system;
the physical elements described in UML 1, like files and documents, are now referred to as artifacts. A UML 2
component may contain multiple physical artifacts if they naturally belong together. The Blind People Helping
Hand application is an R-Pi based application used by blind people for detecting objects or obstacles on their
route. It is made up of three different layers: Capture Image, Object Detection and Voice Output. The image
shows how a component diagram can give us a simplified top-level view of a more complex system.

Fig 4.6: Component diagram


4.5 UML Diagrams:
A diagram is the graphical presentation of a set of elements, most often rendered as a connected graph of
vertices and arcs. You draw diagrams to visualize a system from different perspectives, so a diagram is a
projection onto a system. For all but the most trivial systems, a diagram represents an elided view of the
elements that make up the system. The same element may appear in all diagrams, in only a few diagrams, or in
no diagram at all. In theory, a diagram may contain any combination of things and relationships. In practice,
however, a small number of common combinations arise, which are consistent with the five most useful views
that comprise the architecture of a software-intensive system. The following UML diagrams are used in this
report:

1. Class Diagram
2. Activity Diagram
3. Use Case Diagram
4. Sequence Diagram
5. Collaboration Diagram
6. State Chart Diagram
7. Deployment Diagram

Advantages of UML Diagrams:

• It’s Flexible & Well Known


• Effective Communication of the Software Architecture is Essential
• UML Is Easy to Understand
• Readability and Re-usability of UML Tools
• Serves as a Visual Representation between Classes and Entities
• Helps to Plan a Program before the Programming Takes Place

Disadvantages of UML Diagrams:

• Diagrams Can Get Overwhelming


• Too Much Emphasis on Design
• Sometimes unclear
• Time consuming
4.5.1 Class Diagram:
A class is a category or group of things that has similar attributes and common behavior. A rectangle is the
icon that represents the class; it is divided into three areas: the uppermost area contains the name, the
middle area contains the attributes, and the lowest area shows the operations. Class diagrams provide the
representation that developers work from, and they help on the analysis side too. A class is a structure that
contains both variables and methods. The class diagram shows a set of classes, interfaces, and collaborations
and their relationships. It is the most common diagram in modeling object-oriented systems and is used to give
a static view of the system; it shows the dependencies between the classes used in our system. The interactions
between the modules or classes of our project are shown below, with each block containing the class name,
variables, and methods. The Blind People application class diagram is a basic class diagram: the ‘Capture’
class and the ‘Image Proc’ class both inherit from the more general class ‘R-Pi Camera’, with the inheritance
shown using the blank-headed arrow. The other class in the diagram is the ‘People’ class. The diagram is quite
self-explanatory and clearly shows the different classes and how they are interrelated.

Fig 4.7 : Class diagram


4.5.2 Activity Diagram:
Activity diagrams are flowcharts showing the flow of control from activity to activity. They represent the
business and operational workflows of a system and describe how activities are coordinated to provide a
service, which can be at different levels of abstraction. Typically, an event needs to be achieved by some
operations, particularly where the operation is intended to achieve a number of different things that require
coordination, or where the events in a single use case relate to one another, in particular in use cases where
activities may overlap and require coordination. It is also suitable for modeling how a collection of use
cases coordinate to represent business workflows. As shown in the diagram below, the flow of the system is:
first capture the image through the R-Pi camera and send it to the system, then detect the object and produce
speech output with the object name.

Fig 4.8 : Activity diagram


4.5.3 Use Case Diagram:
A use case diagram in the Unified Modeling Language (UML) is a type of behavioral diagram defined by and
created from a use-case analysis. Its purpose is to present a graphical overview of the functionality provided
by a system in terms of actors, their goals (represented as use cases), and any dependencies between those use
cases. A use case diagram consists of use cases and actors and shows the interaction between them.

• The purpose is to show the interactions between the use cases and the actors.
• It represents the system requirements from the user's perspective.
• An actor could be the end user of the system or an external system.
As shown in the figure, the system connects to the R-Pi camera, the ultrasonic sensor and the earphone. The
camera captures an image, and the text from the captured image is converted into audio output using the
Raspberry Pi. The quality of the captured image will be high, so recognition is fast and clear thanks to the
high-resolution camera. The distance to the object is also measured using the ultrasonic sensor, and both
outputs are heard as audio. The output device is a headset connected to the Raspberry Pi, which can read the
text out loud. The proposed system detects obstacles present outdoors and indoors with the help of the camera.

Fig 4.9: Usecase diagram


4.5.4 Sequence Diagram:
A sequence diagram is an interaction diagram that emphasizes the time ordering of messages; a collaboration
diagram is an interaction diagram that emphasizes the structural organization of the objects that send and
receive messages. Sequence diagrams and collaboration diagrams are isomorphic, meaning that you can take one
and transform it into the other; together they are called interaction diagrams. An interaction diagram shows
an interaction consisting of a set of objects and their relationships, including the messages that may be
dispatched among them. Graphically, a sequence diagram is a table that shows objects arranged along the X-axis
and messages ordered in increasing time along the Y-axis. The sequence diagram for Blind People Helping Hand
represents the communication flow between the person, the system and the application. The system consists of
a camera interfaced with the Raspberry Pi: the Pi camera detects and finds the type of object with the help of
the ultrasonic sensor, which detects real-time hurdles while walking on the road. The ultrasonic sensor used
in this project plays a vital role: it detects the object in front of the user, and when an object is detected
an indication sound is given to the user via the earphone.

Fig 4.10 : Sequence diagram


4.5.5 Collaboration Diagram:
A collaboration diagram is an interaction diagram that emphasizes the structural organization of the
objects that send and receive messages; graphically, it is a collection of vertices and arcs. A collaboration
diagram, also called a communication diagram, is an illustration of the relationships and interactions among
software objects. The concept is more than a decade old, although it has been refined as modeling paradigms
have evolved. In a collaboration diagram each message carries a sequence number: the top-level message is
numbered 1, and messages sent during the same call share the same decimal prefix with suffixes 1, 2, etc.
according to their order of occurrence. In the proposed system, the Raspberry Pi detects objects around the
user and sends feedback in the form of speech and warning messages via the earphone.

Fig 4.11: Collaboration diagram


4.5.6 State Chart Diagram:
A state chart diagram shows the states of an object and represents the transitions between states as arrows
connecting them. Unlike the activity diagram, which highlights activities drawn as rounded rectangles, the state
chart highlights the states themselves and the transitions among them. The diagram has a starting point
represented by a filled-in circle and an end point represented by a bull’s eye. As shown in the figure, a state
diagram represents the condition of the system, or of part of the system, at finite instances of time; it is a
behavioral diagram and represents behavior using finite state transitions. The state chart diagram here describes
the flow of control from the person to object detection using the R-Pi.

Fig 4.12: State Chart diagram


4.5.7 Deployment Diagram:
A deployment diagram shows the configuration of run-time processing nodes and the components
that live on them. Deployment diagrams address the static deployment view of the architecture; they are related
to component diagrams in that a node typically encloses one or more components. The deployment diagram
shows exactly where each software component is deployed: from the admin server to the application, and from
the application to the people who use it.

Fig 4.13: Deployment diagram


4.6 Module Design for Object Detection:

4.6.1 Module Title/Name: Object Detection

4.6.2 Purpose of Module: The purpose of the module is to capture an object through the camera and
to extract it with clear perception. With the help of a text-to-speech engine the detected label is read
out for the user; the engine converts normal text into speech. Object detection is a computer
technology, related to computer vision and image processing, that deals with detecting the presence
of objects (with a bounding box) and the types or classes of the objects located in digital images and
videos. Using object detection, blind people can understand their surrounding environment without
any challenges and remain independent of others.

4.6.3 Inputs and Outputs for Module:


Input: Object
Output: Object will be detected

4.6.4 Files Used by Module: Object Detection

4.6.5 Algorithm/ Procedural steps / Pseudo code:


1. Start
2. The camera is switched on.
3. An object is placed in front of the camera.
4. The object is detected.
5. The object name is read aloud for the user.
6. The user becomes aware of the surrounding environment.
7. Stop
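
The steps above can be summarised as a minimal sketch, assuming a camera at index 0 and the espeak engine used later in section 4.8; detect_objects() is a hypothetical placeholder for the MobileNet SSD inference shown there:

import os
import cv2

def detect_objects(frame):
    # hypothetical placeholder: return the class labels found in the frame
    return []

cap = cv2.VideoCapture(0)                          # step 2: camera switched on
while True:
    ok, frame = cap.read()                         # step 3: object in front of camera
    if not ok:
        break
    for label in detect_objects(frame):            # step 4: object detected
        os.system('espeak "{}"'.format(label))     # step 5: name read aloud for the user
    cv2.imshow("Frame", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()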

4.6.6 Data flow Diagram/ Sequence Diagram with explanation:

4.6.6.1 DFD level 0:


Level 0 DFDs, also known as context diagrams, are the most basic data flow diagrams. They
provide a broad view that is easily digestible but offers little detail. Level 0 data flow diagrams
show a single process node and its connections to external entities. For instance, the figure shown
below illustrates the Blind People Helping Hand process with the flow of information between
Input and Output.

Fig 4.14: DFD Level 0 for Object Detection


4.6.6.2 DFD level 1:
The Level 0 DFD is broken down into a more specific Level 1 DFD, which depicts the basic modules in the
system and the flow of data among them. Level 1 DFDs are still a general overview, but they go into more
detail than a context diagram: the single process node from the context diagram is broken down into
sub-processes, and as these processes are added the diagram needs additional data flows and data stores to
link them together. A Level 1 DFD also shows the basic processes and sources of information.
• It provides a more detailed view of the context-level diagram.
• The main functions carried out by the system are highlighted as it is broken into its sub-processes.
In the Blind People Helping Hand application, information flows between the User and the System; this
includes object detection and the generation of the voice output.

Fig 4.15: DFD Level 1 for Object Detection


4.6.6.3 DFD level 2:

A Level 2 DFD simply breaks processes down into more detailed sub-processes; it goes one step deeper into
parts of the Level 1 DFD and can be used to plan or record the specific detail of the system’s functioning.
The Level 2 diagram below expands the Blind People Helping Hand process to include the more granular
processes involved, such as detection and the data flows of the devices connected to the R-Pi: the earphone,
the Pi camera and the ultrasonic sensor.

Fig 4.16: DFD Level 2 for Object Detection


4.6.7 Module wise Coding:

• Object Detection

# import the necessary packages


import numpy as np
def non_max_suppression(boxes, probs=None, overlapThresh=0.3):
# if there are no boxes, return an empty list
if len(boxes) == 0:
return []

# if the bounding boxes are integers, convert them to floats -- this


# is important since we'll be doing a bunch of divisions
if boxes.dtype.kind == "i":
boxes = boxes.astype("float")

# initialize the list of picked indexes


pick = []

# grab the coordinates of the bounding boxes


x1 = boxes[:, 0]
y1 = boxes[:, 1]
x2 = boxes[:, 2]
y2 = boxes[:, 3]

# compute the area of the bounding boxes and grab the indexes to sort
# (in the case that no probabilities are provided, simply sort on the
# bottom-left y-coordinate)
area = (x2 - x1 + 1) * (y2 - y1 + 1)
idxs = y2

# if probabilities are provided, sort on them instead


if probs is not None:
idxs = probs

# sort the indexes


idxs = np.argsort(idxs)

# keep looping while some indexes still remain in the indexes list
while len(idxs) > 0:
# grab the last index in the indexes list and add the index value
# to the list of picked indexes
last = len(idxs) - 1
i = idxs[last]
pick.append(i)

# find the largest (x, y) coordinates for the start of the bounding
# box and the smallest (x, y) coordinates for the end of the bounding
# box
xx1 = np.maximum(x1[i], x1[idxs[:last]])
yy1 = np.maximum(y1[i], y1[idxs[:last]])
xx2 = np.minimum(x2[i], x2[idxs[:last]])
yy2 = np.minimum(y2[i], y2[idxs[:last]])
# compute the width and height of the bounding box
w = np.maximum(0, xx2 - xx1 + 1)
h = np.maximum(0, yy2 - yy1 + 1)
# compute the ratio of overlap
overlap = (w * h) / area[idxs[:last]]

# delete all indexes from the index list that have overlap greater
# than the provided overlap threshold
idxs = np.delete(idxs, np.concatenate(([last],
np.where(overlap > overlapThresh)[0])))

# return only the bounding boxes that were picked


return boxes[pick].astype("int")
• Video Stream:

# import the necessary packages


from threading import Thread
import cv2

class WebcamVideoStream:
def __init__(self, src=0, name="WebcamVideoStream"):
# initialize the video camera stream and read the first frame
# from the stream
self.stream = cv2.VideoCapture(src)
(self.grabbed, self.frame) = self.stream.read()

# initialize the thread name


self.name = name

# initialize the variable used to indicate if the thread should


# be stopped
self.stopped = False

def start(self):
# start the thread to read frames from the video stream
t = Thread(target=self.update, name=self.name, args=())
t.daemon = True
t.start()
return self

def update(self):
# keep looping infinitely until the thread is stopped
while True:
# if the thread indicator variable is set, stop the thread
if self.stopped:
return

# otherwise, read the next frame from the stream


(self.grabbed, self.frame) = self.stream.read()

def read(self):
# return the frame most recently read
return self.frame

def stop(self):
# indicate that the thread should be stopped
self.stopped = True
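
A brief usage sketch of the threaded WebcamVideoStream class above, assuming a camera is available at source 0:

import cv2

vs = WebcamVideoStream(src=0).start()
while True:
    frame = vs.read()                      # always the most recently grabbed frame
    if frame is None:
        break
    cv2.imshow("Frame", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
vs.stop()
cv2.destroyAllWindows()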
• Counter for object detection:

# import the necessary packages


import cv2

def sort_contours(cnts, method="left-to-right"):


# initialize the reverse flag and sort index
reverse = False
i=0

# handle if we need to sort in reverse


if method == "right-to-left" or method == "bottom-to-top":
reverse = True
# handle if we are sorting against the y-coordinate rather than
# the x-coordinate of the bounding box
if method == "top-to-bottom" or method == "bottom-to-top":
i=1

# construct the list of bounding boxes and sort them from top to
# bottom
boundingBoxes = [cv2.boundingRect(c) for c in cnts]
(cnts, boundingBoxes) = zip(*sorted(zip(cnts, boundingBoxes),
key=lambda b: b[1][i], reverse=reverse))

# return the list of sorted contours and bounding boxes


return cnts, boundingBoxes

def label_contour(image, c, i, color=(0, 255, 0), thickness=2):


# compute the center of the contour area and draw a circle
# representing the center
M = cv2.moments(c)
cX = int(M["m10"] / M["m00"])
cY = int(M["m01"] / M["m00"])

# draw the contour and label number on the image


cv2.drawContours(image, [c], -1, color, thickness)
ge, "#{}".format(i + 1), (cX - 20, cY), cv2.FONT_HERSHEY_SIMPLEX,
1.0, (255, 255, 255), 2)

# return the image with the contour number drawn on it


return image
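
A brief usage sketch of sort_contours() and label_contour(); the input file name is hypothetical and Otsu thresholding is just one way of obtaining contours:

import cv2

image = cv2.imread("shapes.png")                     # hypothetical input image
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]
cnts = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]        # OpenCV 3 vs 4 return values
(cnts, boxes) = sort_contours(cnts, method="left-to-right")
for (i, c) in enumerate(cnts):
    image = label_contour(image, c, i)
cv2.imwrite("labelled.png", image)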

4.7 Module Design for Object Recognition:


4.7.1 Module Title/Name: Object Recognition

4.7.2 Purpose of Module: The purpose of the module is to recognize the type of object seen by the
camera and to extract it with clear perception. Object recognition is the technology, in the field of
computer vision, for finding and identifying objects in an image or video sequence; it finds the
instances of the object. Humans recognize a multitude of objects in images with little effort, despite
the fact that the image of the objects may vary somewhat in different viewpoints, in many different
sizes and scales, or even when they are translated or rotated. Objects can even be recognized when
they are partially obstructed from view. This task is still a challenge for computer vision systems, and
many approaches to it have been implemented over multiple decades.
4.7.3 Inputs and Outputs for Module:
Input: Object, image or video
Output: Object will be recognized

4.7.4 Files Used by Module: Object_Recognition.py, MobileNetSSD_deploy.prototxt

4.7.5 Algorithm/ Procedural steps / Pseudo code:


1. Start
2. The camera is switched on.
3. An object is placed in front of the camera.
4. The object is detected.
5. The object is recognized.
6. The instance of the object is recognized.
7. The type of the object is recognized.
8. The object name is read aloud for the user.
9. Stop
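
A condensed sketch of steps 4 to 8, assuming the MobileNet SSD Caffe files used in section 4.8 are present in the working directory and that sample.jpg is a hypothetical test image:

import os
import cv2
import numpy as np

CLASSES = ["background", "aeroplane", "bicycle", "bird", "boat", "bottle", "bus",
           "car", "cat", "chair", "cow", "diningtable", "dog", "horse", "motorbike",
           "person", "pottedplant", "sheep", "sofa", "train", "tvmonitor"]

net = cv2.dnn.readNetFromCaffe("MobileNetSSD_deploy.prototxt.txt",
                               "MobileNetSSD_deploy.caffemodel")
frame = cv2.imread("sample.jpg")                     # hypothetical test image
blob = cv2.dnn.blobFromImage(cv2.resize(frame, (300, 300)), 0.007843, (300, 300), 127.5)
net.setInput(blob)
detections = net.forward()

for i in np.arange(0, detections.shape[2]):
    confidence = detections[0, 0, i, 2]
    if confidence > 0.2:                             # steps 4 to 7: detect and recognize
        label = CLASSES[int(detections[0, 0, i, 1])]
        os.system('espeak "{}"'.format(label))       # step 8: read aloud for the user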

4.7.6 Data flow Diagram/ Sequence Diagram with explanation:

4.7.6.1 DFD level 0:

Level 0 DFDs, also known as context diagrams, are the most basic data flow diagrams. They
provide a broad view that is easily digestible but offers little detail. Level 0 data flow diagrams
show a single process node and its connections to external entities. For instance, the figure
shown below illustrates the Blind People Helping Hand process with the flow of information
between Input and Output.

Fig 4.17: DFD Level 0 for Object Recognition

4.7.6.2 DFD level 1:

In a Level 1 DFD, the single process node from the context diagram is broken down into sub-processes.
As these processes are added, the diagram needs additional data flows and data stores to link them
together. In the Blind People Helping Hand application, information flows between the User and the
System; this includes object recognition and the generation of the voice output.
Fig 4.18: DFD Level 1 for Object Recognition


4.7.6.3 DFD level 2:

A Level 2 DFD simply breaks processes down into more detailed sub-processes; it goes one step deeper
into parts of the Level 1 DFD and can be used to plan or record the specific detail of the system’s
functioning. The Level 2 diagram below expands the Blind People Helping Hand process to include the
more granular processes involved, such as recognition and the data flows of the devices connected to
the R-Pi: the earphone, the Pi camera and the ultrasonic sensor.

Fig 4.19 : DFD Level 2 for Object Recognition

4.7.7 Module wise Coding:

• Object Recognition:

# import the necessary packages


import numpy as np
import cv2
import sys

# import any special Python 2.7 packages


if sys.version_info.major == 2:
from urllib import urlopen
# import any special Python 3 packages
elif sys.version_info.major == 3:
from urllib.request import urlopen

def translate(image, x, y):


# define the translation matrix and perform the translation
M = np.float32([[1, 0, x], [0, 1, y]])
shifted = cv2.warpAffine(image, M, (image.shape[1], image.shape[0]))

# return the translated image


return shifted

def rotate(image, angle, center=None, scale=1.0):


# grab the dimensions of the image
(h, w) = image.shape[:2]

# if the center is None, initialize it as the center of


# the image
if center is None:
center = (w // 2, h // 2)

# perform the rotation


M = cv2.getRotationMatrix2D(center, angle, scale)
rotated = cv2.warpAffine(image, M, (w, h))

# return the rotated image


return rotated

def rotate_bound(image, angle):


# grab the dimensions of the image and then determine the
# center
(h, w) = image.shape[:2]
(cX, cY) = (w / 2, h / 2)

# grab the rotation matrix (applying the negative of the


# angle to rotate clockwise), then grab the sine and cosine
# (i.e., the rotation components of the matrix)
M = cv2.getRotationMatrix2D((cX, cY), -angle, 1.0)
cos = np.abs(M[0, 0])
sin = np.abs(M[0, 1])

# compute the new bounding dimensions of the image


nW = int((h * sin) + (w * cos))
nH = int((h * cos) + (w * sin))

# adjust the rotation matrix to take into account translation


M[0, 2] += (nW / 2) - cX
M[1, 2] += (nH / 2) - cY
# perform the actual rotation and return the image
return cv2.warpAffine(image, M, (nW, nH))

def resize(image, width=None, height=None, inter=cv2.INTER_AREA):


# initialize the dimensions of the image to be resized and
# grab the image size
dim = None
(h, w) = image.shape[:2]

# if both the width and height are None, then return the
# original image
if width is None and height is None:
return image

# check to see if the width is None


if width is None:
# calculate the ratio of the height and construct the
# dimensions
r = height / float(h)
dim = (int(w * r), height)

# otherwise, the height is None


else:
# calculate the ratio of the width and construct the
# dimensions
r = width / float(w)
dim = (width, int(h * r))

# resize the image


resized = cv2.resize(image, dim, interpolation=inter)

# return the resized image


return resized

def skeletonize(image, size, structuring=cv2.MORPH_RECT):


# determine the area (i.e. total number of pixels in the image),
# initialize the output skeletonized image, and construct the
# morphological structuring element
area = image.shape[0] * image.shape[1]
skeleton = np.zeros(image.shape, dtype="uint8")
elem = cv2.getStructuringElement(structuring, size)

# keep looping until the erosions remove all pixels from the
# image
while True:
# erode and dilate the image using the structuring element
eroded = cv2.erode(image, elem)
temp = cv2.dilate(eroded, elem)
# subtract the temporary image from the original, eroded
# image, then take the bitwise 'or' between the skeleton
# and the temporary image
temp = cv2.subtract(image, temp)
skeleton = cv2.bitwise_or(skeleton, temp)
image = eroded.copy()

# if there are no more 'white' pixels in the image, then


# break from the loop
if area == area - cv2.countNonZero(image):
break

# return the skeletonized image


return skeleton

def opencv2matplotlib(image):
# OpenCV represents images in BGR order; however, Matplotlib
# expects the image in RGB order, so simply convert from BGR
# to RGB and return
return cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

def url_to_image(url, readFlag=cv2.IMREAD_COLOR):


# download the image, convert it to a NumPy array, and then read
# it into OpenCV format
resp = urlopen(url)
image = np.asarray(bytearray(resp.read()), dtype="uint8")
image = cv2.imdecode(image, readFlag)

# return the image


return image

def auto_canny(image, sigma=0.33):


# compute the median of the single channel pixel intensities
v = np.median(image)

# apply automatic Canny edge detection using the computed median


lower = int(max(0, (1.0 - sigma) * v))
upper = int(min(255, (1.0 + sigma) * v))
edged = cv2.Canny(image, lower, upper)

# return the edged image


return edged

def grab_contours(cnts):
# if the length the contours tuple returned by cv2.findContours
# is '2' then we are using either OpenCV v2.4, v4-beta, or
# v4-official
if len(cnts) == 2:
cnts = cnts[0]

# if the length of the contours tuple is '3' then we are using


# either OpenCV v3, v4-pre, or v4-alpha
elif len(cnts) == 3:
cnts = cnts[1]

# otherwise OpenCV has changed their cv2.findContours return


# signature yet again and I have no idea WTH is going on
else:
raise Exception(("Contours tuple must have length 2 or 3, "
"otherwise OpenCV changed their cv2.findContours return "
"signature yet again. Refer to OpenCV's documentation "
"in that case"))

# return the actual contours array


return cnts

def is_cv2(or_better=False):
# grab the OpenCV major version number
major = get_opencv_major_version()

# check to see if we are using *at least* OpenCV 2


if or_better:
return major >= 2

# otherwise we want to check for *strictly* OpenCV 2


return major == 2

def is_cv3(or_better=False):
# grab the OpenCV major version number
major = get_opencv_major_version()

# check to see if we are using *at least* OpenCV 3


if or_better:
return major >= 3

# otherwise we want to check for *strictly* OpenCV 3


return major == 3

def is_cv4(or_better=False):
# grab the OpenCV major version number
major = get_opencv_major_version()

# check to see if we are using *at least* OpenCV 4


if or_better:
return major >= 4

# otherwise we want to check for *strictly* OpenCV 4


return major == 4

def get_opencv_major_version(lib=None):
# if the supplied library is None, import OpenCV
if lib is None:
import cv2 as lib

# return the major version number


return int(lib.__version__.split(".")[0])

def check_opencv_version(major, lib=None):


# this function may be removed in a future release as we now
# use the get_opencv_major_function to obtain the current OpenCV
# version and then perform the actual version check *within* the
# respective function
import warnings
message = """
The check_opencv_version function is deprecated and may be
removed in a future release. Use at your own risk.
"""
warnings.warn(message, DeprecationWarning, stacklevel=2)

# if the supplied library is None, import OpenCV


if lib is None:
import cv2 as lib

# return whether or not the current OpenCV version matches the


# major version number
return lib.__version__.startswith(major)

def build_montages(image_list, image_shape, montage_shape):


"""
---------------------------------------------------------------------------------------------
author: Kyle Hounslow
---------------------------------------------------------------------------------------------
Converts a list of single images into a list of 'montage' images of specified rows and
columns.
A new montage image is started once rows and columns of montage image is filled.
Empty space of incomplete montage images are filled with black pixels
---------------------------------------------------------------------------------------------
:param image_list: python list of input images
:param image_shape: tuple, size each image will be resized to for display (width,
height)
:param montage_shape: tuple, shape of image montage (width, height)
:return: list of montage images in numpy array format
---------------------------------------------------------------------------------------------

example usage:
# load single image
img = cv2.imread('lena.jpg')
# duplicate image 25 times
num_imgs = 25
img_list = []
for i in xrange(num_imgs):
img_list.append(img)
# convert image list into a montage of 256x256 images tiled in a 5x5 montage
montages = make_montages_of_images(img_list, (256, 256), (5, 5))
# iterate through montages and display
for montage in montages:
cv2.imshow('montage image', montage)
cv2.waitKey(0)

----------------------------------------------------------------------------------------------
"""
if len(image_shape) != 2:
raise Exception('image shape must be list or tuple of length 2 (rows, cols)')
if len(montage_shape) != 2:
raise Exception('montage shape must be list or tuple of length 2 (rows, cols)')
image_montages = []
# start with black canvas to draw images onto
montage_image = np.zeros(shape=(image_shape[1] * (montage_shape[1]),
image_shape[0] * montage_shape[0], 3),
dtype=np.uint8)
cursor_pos = [0, 0]
start_new_img = False
for img in image_list:
if type(img).__module__ != np.__name__:
raise Exception('input of type {} is not a valid numpy array'.format(type(img)))
start_new_img = False
img = cv2.resize(img, image_shape)
# draw image to black canvas
montage_image[cursor_pos[1]:cursor_pos[1] + image_shape[1],
cursor_pos[0]:cursor_pos[0] + image_shape[0]] = img
cursor_pos[0] += image_shape[0] # increment cursor x position
if cursor_pos[0] >= montage_shape[0] * image_shape[0]:
cursor_pos[1] += image_shape[1] # increment cursor y position
cursor_pos[0] = 0
if cursor_pos[1] >= montage_shape[1] * image_shape[1]:
cursor_pos = [0, 0]
image_montages.append(montage_image)
# reset black canvas
montage_image = np.zeros(shape=(image_shape[1] * (montage_shape[1]),
image_shape[0] * montage_shape[0], 3),
dtype=np.uint8)
start_new_img = True
if start_new_img is False:
image_montages.append(montage_image) # add unfinished montage
return image_montages

def adjust_brightness_contrast(image, brightness=0., contrast=0.):


"""
Adjust the brightness and/or contrast of an image

:param image: OpenCV BGR image


:param contrast: Float, contrast adjustment with 0 meaning no change
:param brightness: Float, brightness adjustment with 0 meaning no change
"""
beta = 0
# See the OpenCV docs for more info on the `beta` parameter to addWeighted
# https://docs.opencv.org/3.4.2/d2/de8/group__core__array.html#gafafb2513349db3bcff51f54ee5592a19
return cv2.addWeighted(image,
1 + float(contrast) / 100.,
image,
beta,
float(brightness))
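
A brief usage sketch of a few of the helper functions above on a single image; the file names are hypothetical:

import cv2

image = cv2.imread("sample.jpg")                         # hypothetical input image
small = resize(image, width=400)                         # resize, keeping aspect ratio
rotated = rotate_bound(small, angle=15)                  # rotate without clipping corners
edges = auto_canny(cv2.cvtColor(small, cv2.COLOR_BGR2GRAY))
adjusted = adjust_brightness_contrast(small, brightness=30., contrast=10.)
cv2.imwrite("edges.png", edges)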

• Measuring Distance (Ultrasonic Sensor):

# import the necessary packages


from scipy.spatial import distance as dist
import numpy as np
import cv2

def order_points(pts):
# sort the points based on their x-coordinates
xSorted = pts[np.argsort(pts[:, 0]), :]

# grab the left-most and right-most points from the sorted


# x-roodinate points
leftMost = xSorted[:2, :]
rightMost = xSorted[2:, :]

# now, sort the left-most coordinates according to their


# y-coordinates so we can grab the top-left and bottom-left
# points, respectively
leftMost = leftMost[np.argsort(leftMost[:, 1]), :]
(tl, bl) = leftMost

# now that we have the top-left coordinate, use it as an


# anchor to calculate the Euclidean distance between the
# top-left and right-most points; by the Pythagorean
# theorem, the point with the largest distance will be
# our bottom-right point
D = dist.cdist(tl[np.newaxis], rightMost, "euclidean")[0]
(br, tr) = rightMost[np.argsort(D)[::-1], :]

# return the coordinates in top-left, top-right,


# bottom-right, and bottom-left order
return np.array([tl, tr, br, bl], dtype="float32")

def four_point_transform(image, pts):


# obtain a consistent order of the points and unpack them
# individually
rect = order_points(pts)
(tl, tr, br, bl) = rect

# compute the width of the new image, which will be the


# maximum distance between bottom-right and bottom-left
# x-coordiates or the top-right and top-left x-coordinates
widthA = np.sqrt(((br[0] - bl[0]) ** 2) + ((br[1] - bl[1]) ** 2))
widthB = np.sqrt(((tr[0] - tl[0]) ** 2) + ((tr[1] - tl[1]) ** 2))
maxWidth = max(int(widthA), int(widthB))

# compute the height of the new image, which will be the


# maximum distance between the top-right and bottom-right
# y-coordinates or the top-left and bottom-left y-coordinates
heightA = np.sqrt(((tr[0] - br[0]) ** 2) + ((tr[1] - br[1]) ** 2))
heightB = np.sqrt(((tl[0] - bl[0]) ** 2) + ((tl[1] - bl[1]) ** 2))
maxHeight = max(int(heightA), int(heightB))

# now that we have the dimensions of the new image, construct


# the set of destination points to obtain a "birds eye view",
# (i.e. top-down view) of the image, again specifying points
# in the top-left, top-right, bottom-right, and bottom-left
# order
dst = np.array([
[0, 0],
[maxWidth - 1, 0],
[maxWidth - 1, maxHeight - 1],
[0, maxHeight - 1]], dtype="float32")

# compute the perspective transform matrix and then apply it


M = cv2.getPerspectiveTransform(rect, dst)
warped = cv2.warpPerspective(image, M, (maxWidth, maxHeight))

# return the warped image


return warped
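
A brief usage sketch of four_point_transform(); the four corner points are illustrative only:

import numpy as np
import cv2

image = cv2.imread("document.jpg")                       # hypothetical input image
pts = np.array([[73, 239], [356, 117], [475, 265], [187, 443]], dtype="float32")
warped = four_point_transform(image, pts)                # top-down view of the region
cv2.imwrite("warped.png", warped)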

• Drawing text with line breaks:

import cv2
def put_text(img, text, org, font_face, font_scale, color, thickness=1, line_type=8,
bottom_left_origin=False):
"""Utility for drawing text with line breaks

:param img: Image.


:param text: Text string to be drawn.
:param org: Bottom-left corner of the first line of the text string in the image.
:param font_face: Font type. One of FONT_HERSHEY_SIMPLEX,
FONT_HERSHEY_PLAIN, FONT_HERSHEY_DUPLEX,
FONT_HERSHEY_COMPLEX, FONT_HERSHEY_TRIPLEX,
FONT_HERSHEY_COMPLEX_SMALL,
FONT_HERSHEY_SCRIPT_SIMPLEX, or
FONT_HERSHEY_SCRIPT_COMPLEX, where each of the font ID’s
can be combined with FONT_ITALIC to get the slanted letters.
:param font_scale: Font scale factor that is multiplied by the font-specific base size.
:param color: Text color.
:param thickness: Thickness of the lines used to draw a text.
:param line_type: Line type. See the line for details.
:param bottom_left_origin: When true, the image data origin is at the bottom-left
corner.
Otherwise, it is at the top-left corner.
:return: None; image is modified in place
"""
# Break out drawing coords
x, y = org

# Break text into list of text lines


text_lines = text.split('\n')

# Get height of text lines in pixels (height of all lines is the same)
_, line_height = cv2.getTextSize('', font_face, font_scale, thickness)[0]
# Set distance between lines in pixels
line_gap = line_height // 3

for i, text_line in enumerate(text_lines):


# Find total size of text block before this line
line_y_adjustment = i * (line_gap + line_height)

# Move text down from original line based on line number


if not bottom_left_origin:
line_y = y + line_y_adjustment
else:
line_y = y - line_y_adjustment

# Draw text
cv2.putText(img,
text=text_lines[i],
org=(x, line_y),
fontFace=font_face,
fontScale=font_scale,
color=color,
thickness=thickness,
lineType=line_type,
bottomLeftOrigin=bottom_left_origin)

def put_centered_text(img, text, font_face, font_scale, color, thickness=1, line_type=8):


"""Utility for drawing vertically & horizontally centered text with line breaks
:param img: Image.
:param text: Text string to be drawn.
:param font_face: Font type. One of FONT_HERSHEY_SIMPLEX,
FONT_HERSHEY_PLAIN, FONT_HERSHEY_DUPLEX,
FONT_HERSHEY_COMPLEX, FONT_HERSHEY_TRIPLEX,
FONT_HERSHEY_COMPLEX_SMALL,
FONT_HERSHEY_SCRIPT_SIMPLEX, or
FONT_HERSHEY_SCRIPT_COMPLEX, where each of the font ID’s
can be combined with FONT_ITALIC to get the slanted letters.
:param font_scale: Font scale factor that is multiplied by the font-specific base size.
:param color: Text color.
:param thickness: Thickness of the lines used to draw a text.
:param line_type: Line type. See the line for details.
:return: None; image is modified in place
"""
# Save img dimensions
img_h, img_w = img.shape[:2]

# Break text into list of text lines


text_lines = text.split('\n')

# Get height of text lines in pixels (height of all lines is the same; width differs)
_, line_height = cv2.getTextSize('', font_face, font_scale, thickness)[0]
# Set distance between lines in pixels
line_gap = line_height // 3

# Calculate total text block height for centering


text_block_height = len(text_lines) * (line_height + line_gap)
text_block_height -= line_gap # There's one less gap than lines

for i, text_line in enumerate(text_lines):


# Get width of text line in pixels (height of all lines is the same)
line_width, _ = cv2.getTextSize(text_line, font_face, font_scale, thickness)[0]

# Center line with image dimensions


x = (img_w - line_width) // 2
y = (img_h + line_height) // 2

# Find total size of text block before this line


line_adjustment = i * (line_gap + line_height)

# Adjust line y and re-center relative to total text block height


y += line_adjustment - text_block_height // 2 + line_gap

# Draw text
cv2.putText(img,
text=text_lines[i],
org=(x, y),
fontFace=font_face,
fontScale=font_scale,
color=color,
thickness=thickness,
lineType=line_type)
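
A brief usage sketch of put_text() with a multi-line message drawn on a blank canvas:

import numpy as np
import cv2

canvas = np.zeros((200, 400, 3), dtype="uint8")
put_text(canvas, "Obstacle detected\nat 15 cm", org=(10, 40),
         font_face=cv2.FONT_HERSHEY_SIMPLEX, font_scale=0.8,
         color=(255, 255, 255), thickness=2)
cv2.imwrite("message.png", canvas)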

4.8 Code for Object Detection and Recognition (All libraries imported):

# USAGE
# python real_time_object_detection.py --prototxt MobileNetSSD_deploy.prototxt.txt --model
MobileNetSSD_deploy.caffemodel

# import the necessary packages

from imutils.video import VideoStream


from imutils.video import FPS
import numpy as np
import argparse
import imutils
import time
import cv2

import RPi.GPIO as GPIO


import time
import os

#GPIO Mode (BOARD / BCM)


GPIO.setmode(GPIO.BOARD)
#set GPIO Pins
GPIO_TRIGGER = 18
GPIO_ECHO = 24

#set GPIO direction (IN / OUT)


GPIO.setup(GPIO_TRIGGER, GPIO.OUT)
GPIO.setup(GPIO_ECHO, GPIO.IN)

def distance():
# set Trigger to HIGH
GPIO.output(GPIO_TRIGGER, True)

# set Trigger after 0.01ms to LOW


time.sleep(0.00001)
GPIO.output(GPIO_TRIGGER, False)

StartTime = time.time()
StopTime = time.time()

# save StartTime
while GPIO.input(GPIO_ECHO) == 0:
StartTime = time.time()

# save time of arrival


while GPIO.input(GPIO_ECHO) == 1:
StopTime = time.time()

# time difference between start and arrival


TimeElapsed = StopTime - StartTime
# multiply with the sonic speed (34300 cm/s)
# and divide by 2, because there and back
distance = (TimeElapsed * 34300) / 2

return distance

# construct the argument parse and parse the arguments


ap = argparse.ArgumentParser()
ap.add_argument("--video", help="path to video file. If empty, camera's stream will be used")
ap.add_argument("--prototxt", default="./MobileNetSSD_deploy.prototxt.txt",
help='Path to text network file: '
'MobileNetSSD_deploy.prototxt for Caffe model or '
)
ap.add_argument("--model", default="./MobileNetSSD_deploy.caffemodel",
help='Path to weights: '
'MobileNetSSD_deploy.caffemodel for Caffe model or '
)
ap.add_argument("--confidence", default=0.2, type=float, help="confidence threshold to filter out weak
detections")
args = vars(ap.parse_args())

# initialize the list of class labels MobileNet SSD was trained to


# detect, then generate a set of bounding box colors for each class
CLASSES = ["background", "aeroplane", "bicycle", "bird", "boat",
"bottle", "bus", "car", "cat", "chair", "cow", "diningtable",
"dog", "horse", "motorbike", "person", "pottedplant", "sheep",
"sofa", "train", "tvmonitor"]
COLORS = np.random.uniform(0, 255, size=(len(CLASSES), 3))

# load our serialized model from disk


print("[INFO] loading model...")
net = cv2.dnn.readNetFromCaffe(args["prototxt"], args["model"])
#net = cv2.dnn.readNetFromCaffe(args.prototxt, args.model)

# initialize the video stream, allow the camera sensor to warm up,


# and initialize the FPS counter
print("[INFO] starting video stream...")
vs = VideoStream(src=0).start()
time.sleep(2.0)
fps = FPS().start()

# loop over the frames from the video stream


while True:
# grab the frame from the threaded video stream and resize it
# to have a maximum width of 400 pixels
frame = vs.read()
frame = imutils.resize(frame, width=400)

# grab the frame dimensions and convert it to a blob


(h, w) = frame.shape[:2]
blob = cv2.dnn.blobFromImage(cv2.resize(frame, (300, 300)),
0.007843, (300, 300), 127.5)

# pass the blob through the network and obtain the detections and
# predictions
net.setInput(blob)
detections = net.forward()

# loop over the detections


for i in np.arange(0, detections.shape[2]):
# extract the confidence (i.e., probability) associated with
# the prediction
confidence = detections[0, 0, i, 2]

# filter out weak detections by ensuring the `confidence` is


# greater than the minimum confidence
if confidence > args["confidence"]:
# extract the index of the class label from the
# `detections`, then compute the (x, y)-coordinates of
# the bounding box for the object
idx = int(detections[0, 0, i, 1])
box = detections[0, 0, i, 3:7] * np.array([w, h, w, h])
(startX, startY, endX, endY) = box.astype("int")

# draw the prediction on the frame


label = "{}: {:.2f}%".format(CLASSES[idx],
confidence * 100)
label2 = "{}".format(CLASSES[idx])
cv2.rectangle(frame, (startX, startY), (endX, endY),
COLORS[idx], 2)
y = startY - 15 if startY - 15 > 15 else startY + 15
cv2.putText(frame, label, (startX, y),
cv2.FONT_HERSHEY_SIMPLEX, 0.5, COLORS[idx], 2)
os.system('espeak "'+label2+'"')
print(label2)

dist = distance()
print ("Measured Distance = %.1f cm" % dist)
if dist<20.00:
text="Obstacle detected at "+str('%.2f' % dist)+" cm"
os.system('espeak "'+text+'"')
print ("Obstacle detected..!!")

# show the output frame


cv2.imshow("Frame", frame)
key = cv2.waitKey(1) & 0xFF

# if the `q` key was pressed, break from the loop


if key == ord("q"):
break

# update the FPS counter


fps.update()

# stop the timer and display FPS information


fps.stop()
print("[INFO] elapsed time: {:.2f}".format(fps.elapsed()))
print("[INFO] approx. FPS: {:.2f}".format(fps.fps()))

# do a bit of cleanup
cv2.destroyAllWindows()
vs.stop()

5. RESULT AND OUTPUTS


6. TESTING

6.1 Test Cases:

TC_01: Checking that the Raspberry-Pi works properly
Input Data: Install the OS on the Raspberry-Pi
Expected Result: The OS should be installed and the Raspberry-Pi should work as intended
Actual Result: The OS is installed and the Raspberry-Pi works as intended
Status: Pass

TC_02: Checking that the Pi Camera works properly
Input Data: Connect the Pi Camera to the Raspberry-Pi
Expected Result: The camera should get connected and should turn on when power is supplied to the Raspberry-Pi
Actual Result: The camera gets connected and turns on when power is supplied to the Raspberry-Pi
Status: Pass

TC_03: Installing the libraries needed for the ultrasonic sensor
Input Data: Import the libraries
Expected Result: The libraries needed for the ultrasonic sensor should get installed
Actual Result: The libraries needed for the ultrasonic sensor are installed
Status: Pass

TC_04: Detecting the object
Input Data: Place an object in front of the camera
Expected Result: The object should be detected and a voice output should be given
Actual Result: The object is detected and a voice output is given
Status: Pass

TC_05: Recognizing the type of the object
Input Data: Place an object in front of the camera
Expected Result: The object should be detected, its type should be recognized and a speech output should be given
Actual Result: The object is detected, its type is recognized and a speech output is given
Status: Pass

TC_06: Detecting and recognizing an object at a distant level from the camera
Input Data: Place an object at a distant level
Expected Result: The object at a distant level should get detected and recognized by the Pi Camera
Actual Result: The object at a distant level is detected and recognized by the Pi Camera
Status: Pass

TC_07: Checking the accuracy level of the detected object
Input Data: Place an object in front of the camera
Expected Result: The object placed in front of the camera should get detected accurately
Actual Result: The object placed in front of the camera is detected accurately
Status: Pass

TC_08: Testing the detection of the object by placing it in lighting conditions
Input Data: Place the object in a lighting condition
Expected Result: The object placed in lighting should be detected with a good accuracy level by the camera and a speech output should be given
Actual Result: The object placed in lighting is detected with a good accuracy level by the camera and a speech output is given
Status: Pass

TC_09: Testing the detection of the object by placing it in a different orientation
Input Data: Place the object in a different orientation
Expected Result: The object placed in a different orientation should be detected and recognized by the camera
Actual Result: The object placed in a different orientation is detected and recognized by the camera
Status: Pass

TC_10: Checking the response time of the hardware kit to detect and recognize objects
Input Data: Place multiple objects sequentially in front of the camera
Expected Result: All the objects placed sequentially in front of the camera should be detected and recognized accurately
Actual Result: All the objects placed sequentially in front of the camera are detected and recognized accurately
Status: Pass

7. COSTING OF THE PROJECT

7.1 Project Estimate


• Reconciled Estimate
The Constructive Cost Model (COCOMO) is an algorithmic software cost estimation model developed by
Barry Boehm. The model uses a basic regression formula, with parameters derived from historical project data
and current project characteristics, and is a method for evaluating the cost of a software package. According
to Boehm, software cost estimation should be done through three stages:

1. Basic COCOMO Model


2. Intermediate COCOMO Model
3. Complete/Detailed COCOMO Model

• Basic COCOMO:
Basic COCOMO computes software development effort and cost as a function of program size, expressed
in estimated DSIs (delivered source instructions).
There are three modes within Basic COCOMO:
➢ Organic Mode:
Development projects typically are uncomplicated and involve small experienced teams. The
planned software is not considered innovative and requires a relatively small amount of DSIs
(typically under 50,000).
➢ Semidetached Mode:
Development projects typically are more complicated than in Organic Mode and involve teams
of people with mixed levels of experience. The software requires no more than 300,000 DSIs. The
project has characteristics of both projects for Organic Mode and projects for Embedded Mode.

➢ Embedded Mode:
Development projects must fit into a rigid set of requirements because the software is to be
embedded in a strongly joined complex of hardware, software, regulations and operating
procedures.

The Basic COCOMO estimation model is given by the following expressions (a worked example follows the
list of symbols below):

Effort = a1 x (KLOC)^a2 PM
Tdev = b1 x (Effort)^b2 months

Where,

• KLOC is the estimated size of the software product expressed in Kilo Lines of Code,

• a1, a2, b1, b2 are constants for each category of software products,
• Tdev is the estimated time to develop the software, expressed in months,

• Effort is the total effort required to develop the software product, expressed in person months
(PMs).
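
As a worked illustration of the expressions above, a minimal sketch in Python: the constants a1 = 2.4, a2 = 1.05, b1 = 2.5 and b2 = 0.38 are the standard organic-mode values, and the 2 KLOC figure is assumed purely for illustration.

kloc = 2.0
effort = 2.4 * (kloc ** 1.05)        # roughly 4.97 person-months
tdev = 2.5 * (effort ** 0.38)        # roughly 4.6 months
print("Effort = %.2f PM, Tdev = %.2f months" % (effort, tdev))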

➢ Intermediate COCOMO:
An extension of the Basic model that computes software development effort by adding a set of
"cost drivers" that determine the effort and duration of the project, such as assessments of
personnel and hardware.

➢ Detailed COCOMO:
An extension of the Intermediate model that adds effort multipliers for each phase of the project
to determine the impact of the cost drivers on each step.
Example: A distributed Management Information System (MIS) product for an organization having
offices at several places across the country can have the following sub-components:

• Database part
• Graphical User Interface (GUI) part
• Communication part

7.2 Project Cost Estimation


It is the most frequently used technique for evaluating the effectiveness of the proposed system, more
commonly known as cost analysis. It determines the benefit savings that are expected from the proposed
system and compares them with the costs. The system is economically feasible since it would not entail
additional hardware, thereby saving on costs and the manpower involved. As part of the economic
feasibility study we perform the following calculations.

PHASE | COST/HOUR | HOURS | COST ESTIMATION
Requirement Gathering | 30/- | 20 H | 600
Design | 50/- | 30 H | 1500
Code Development | 50/- | 20 H | 1250
Testing | 40/- | 40 H | 2400
Implementation | 60/- | 10 H | 400

Table 7.1 Cost Estimation Table

COMPONENT | COST
Raspberry Pi 4 (4 GB RAM) | 3870/-
Raspberry Pi Camera V2 | 2179/-
Kirtronic Ultrasonic Sensor HC-SR04 | 394/-
Memory Card (32 GB) | 769/-

Table 7.2 Cost Estimation Table (Components)

7.3 Time Estimation:


PHASE | TIME
Analysis | 20 H
Design | 30 H
Coding | 20 H
Testing | 30 H
Documentation | 20 H
Maintenance | 40 H
Total Time | 160 H

Table 7.3 Time Estimation Table

8. APPLICATIONS OF THE PROJECT


1. The system helps the visually impaired access the most important features of the device, enhancing the
quality of the system by making use of different custom layouts and speech output.
2. It is very affordable.
3. It is a reliable application for visually impaired users, as differences between objects are easily detected.

9. FEATURES OF THE PROJECT


1. The system can be used for real-time object detection.
2. It helps a blind person stay away from dangerous locations.
3. The project provides a cost-effective way of path planning for blind people.
4. Speech output systems can be used to read screen text to computer users who are blind.
5. Low cost.
6. Huge processing power in a compact board.
7. Supports Python, making it easy to build applications.
8. OpenCV (Open Source Computer Vision) is used for object recognition, which provides us with a huge
library of computer-vision functions.

10. ADVANTAGES AND DISADVANTAGES


10.1 Advantages:

1. Access is limited to authorized personnel only.
2. No additional devices need to be carried for object detection.
3. The system performs real-time object detection.
4. It helps a blind person stay away from dangerous locations.
5. The project provides a cost-effective way of path planning for blind people.
6. Reduced bulkiness makes the system portable enough for real-life use.
7. Improved speed, accuracy and reliability of the system.
8. The Raspberry Pi speaks and responds to instructions given by the blind person (e.g. navigation).

10.2 Disadvantages:
1. Slow internet connection.
2. Low light intensity affects the accuracy of the model.

11. FUTURE SCOPE


In the existing system, only capturing the object using the R-Pi was possible. In the present work, the system
successfully detects, recognizes and tracks the object, and it can also be used to keep the user from touching a
hot object. Going further, the project can be carried out with other advanced devices, using a simple coding
language to keep it less complicated; the complexity can be reduced to a tiny gadget that would be even more
useful to people in this electronic world. A future perspective of this project is to extend object recognition so
that it provides an exact distance measurement between the person and the object. However, for an application
that involves many fast-moving objects, faster hardware should be considered.

12. CONCLUSION
In this project we have implemented an image-to-speech conversion technique using a Raspberry Pi. The
system is a very helpful application for blind people and is very easy to use. With text-to-speech (TTS)
technology it makes electronic gadgets far easier for them to use and allows them to interact more efficiently
with electronic systems. It is an economical as well as an efficient device for visually impaired people, and it
is compact and helpful to society. The proposed system is mobile, robust and efficient. It also creates a virtual
environment and provides a sense of assurance, as it voices the name of the object recognized. The use of the
Raspberry Pi has optimized the system and brought the cost down drastically compared to conventional
systems. The proposed system is simple in design, easy to install and reliable to a great extent. A blind person
is able to detect and recognize an object in front of them (within a specific range) through the audio output.
In all, the developed system provides a technical approach for solving a real-life problem in a time- and
cost-effective manner.

