SHALAKA FOUNDATION’S
KEYSTONE SCHOOL OF ENGINEERING
SPEECH-TO-TEXT USING NLP
By:
OM SHINDE B401180185
GAYATRI SHINDE B401180184
RITIKA YADAV B401180199
NANDINI YAMALE B401180200
UNDER THE GUIDANCE OF:
GUIDE:- PROF. TUSHAR SURWADE
CO-GUIDE:- SONAL CHANDERI
25/06/2025 DEPARTMENT OF COMPUTER ENGINEERING
INDEX
• Introduction
• Objectives
• Literature survey
• Problem definition
• System architecture
• Algorithm
• Flowcharts
• Work breakdown structure
• Hardware & Software Requirements
• Performance analysis
• Result
• Conclusion
• Future scope
• References
INTRODUCTION
Traditional database querying requires knowledge of SQL (Structured Query Language), which is a barrier for non-technical users.
To bridge this gap, we have the Speech to SQL Query Generator.
OBJECTIVES
• Simplify Database Interaction
• Speech To Query Conversion
• Real Time Query Execution
• Enhance Accessibility
• Improve Usability
LITERATURE SURVEY
SN | Paper Title | Author(s) | Year | Methodology | Findings of paper
---|---|---|---|---|---
1 | Seq2SQL: Generating Structured Queries from Natural Language | [Link] djahantighi | 2022 | Proposed a deep learning-based approach to generate SQL queries from natural language using sequence-to-sequence models, paving the way for automated query generation from text. | Implemented with text-to-SQL only
2 | Speech-to-Text Systems: An Overview | Hinton et al. | 2019 | Provided an overview of modern speech recognition systems, highlighting Google Speech Recognition's role in real-time transcription and its applications in various fields. | Implemented with speech-to-text only
3 | SPEECH-TO-SQL: Towards Speech-driven SQL Query Generation From Natural Language Question | Yu and Deng | 2018 | Proposed a speech-driven interface for relational databases using cascaded methods, SQLNet | Speech signals are not available from text and detailed inner structure of speech signals.
4 | Speech to SQL Generator - A Voice Based Approach | Zhong et al. | 2017 | Proposed a system that accepts the spoken query as input and gives SQL query output | Due to incorrect grammar or mispronunciation, system fails to give
5 | Speech-to-Text: Automatic Speech Recognition Using Deep Learning | Xu et al. | 2017 | Highlighted the evolution of deep learning in speech-to-text systems, which serve as a basis for modern speech recognition APIs like Google Speech Recognition. | Implemented with speech-to-text only
6 | DBPal: Weak Supervision for Learning a Natural Language Interface to Databases | Manjunath, Shravankumar | 2016 | Introduced a system that learns to map natural language queries to SQL queries using weak supervision, providing insights into handling incomplete or noisy user inputs. | Implemented with speech-to-text only
7 | Using Natural Language Processing in Order to Create SQL Queries | Weir et al. | 2012 | Proposed a system to resolve the ambiguity between identical words with multiple meanings | Implemented with speech-to-text only, and needs an updated model for the same
8 | SQLNet: Generating Structured Queries from Natural Language Without Reinforcement Learning | Yuanfeng, Raymond, Xuefang | 2008 | Proposed a novel approach for generating SQL queries from natural language input without reinforcement learning, improving both efficiency and accuracy in query generation. | Implemented with speech-to-text only
PROBLEM DEFINITION
Non-technical users often face challenges in interacting with
databases due to the complexity of SQL syntax. This limits their
ability to retrieve or manage data effectively. The Speech to SQL
Query Generator addresses this problem by providing a voice-
based interface that converts natural language speech into SQL
queries, allowing users to interact with databases without needing
to understand SQL.
SYSTEM ARCHITECTURE
ALGORITHM
• Start
• User Interaction
  Capture voice input from the user via the User Interface (Web Browser).
• Send Voice Data
  Send the captured voice input to the Flask Server.
• Voice Recognition
  On the Flask Server:
  - Send the voice data to the Google Speech Recognition API.
  - Receive the transcribed text from the API.
• Text Processing
  Send the transcribed text to the NLP & SQL Query Generation Module.
  Generate the SQL query based on the processed text.
• Execute SQL Query
  Send the generated SQL query to the Sample Database.
  Execute the SQL query against the database.
  Retrieve the query result.
• Display Results
  Send the query result back to the Flask Server.
  Display the results to the user via the User Interface.
• End
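The steps above can be sketched end to end in Python. This is a minimal illustration under stated assumptions, not the project's actual module: the `students` table, the `SCHEMA` dictionary, the two supported sentence patterns, and every function name are hypothetical. The Flask route and the Google Speech Recognition call are replaced by a placeholder `transcribe` function, so the sketch runs with the standard library alone.

```python
import re
import sqlite3

# Hypothetical sample schema; the deck does not describe the real tables.
SCHEMA = {"students": {"name", "branch", "marks"}}


def transcribe(voice_data: str) -> str:
    """Placeholder for the speech recognition step.

    Assumes the text is already available and only normalizes it.
    """
    return voice_data.strip().lower()


def generate_sql(text: str):
    """Map two simple sentence patterns to a parameterized SELECT.

    Supported (illustrative only):
      "show all <table>"
      "show <table> where <column> is <value>"
    """
    match = re.fullmatch(r"show (?:all )?(\w+)(?: where (\w+) is (\w+))?", text)
    if not match:
        raise ValueError(f"unsupported request: {text!r}")
    table, column, value = match.groups()
    if table not in SCHEMA:
        raise ValueError(f"unknown table: {table!r}")
    if column is None:
        return f"SELECT * FROM {table}", ()
    if column not in SCHEMA[table]:
        raise ValueError(f"unknown column: {column!r}")
    # Identifiers are checked against SCHEMA; the value is bound as a parameter.
    return f"SELECT * FROM {table} WHERE {column} = ?", (value,)


def run_query(conn, sql, params=()):
    """Execute the generated query and return all rows."""
    return conn.execute(sql, params).fetchall()


def handle_request(conn, voice_data):
    """Voice data -> text -> SQL -> result rows."""
    text = transcribe(voice_data)
    sql, params = generate_sql(text)
    return sql, run_query(conn, sql, params)


def make_sample_db():
    """Build an in-memory database with made-up rows for demonstration."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE students (name TEXT, branch TEXT, marks INTEGER)")
    conn.executemany(
        "INSERT INTO students VALUES (?, ?, ?)",
        [("Asha", "computer", 82), ("Ravi", "mechanical", 74), ("Meera", "computer", 91)],
    )
    return conn


if __name__ == "__main__":
    conn = make_sample_db()
    print(handle_request(conn, "Show students where branch is computer"))
```

Checking table and column names against a known schema and binding the spoken value as a query parameter keeps a mis-transcribed phrase from being executed as arbitrary SQL.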
FLOWCHARTS
WORK BREAKDOWN STRUCTURE
HARDWARE AND SOFTWARE REQUIREMENTS
SOFTWARE REQUIREMENTS
• Operating System : Windows XP/7/Vista onwards
• Coding Language : Python
• IDE : VS Code
• Web Browser : Google Chrome
HARDWARE REQUIREMENTS
• System : Pentium IV 2.4 GHz
• Hard Disk : 256 GB (Min)
• Monitor : 15" VGA Colour
• IO Devices : Keyboard and Mouse
• RAM : 4 GB (Min)
PERFORMANCE ANALYSIS
Accuracy Performance
[Bar chart: accuracy (%) for Speech Recognition, SQL Generation and Intent Recognition]
• Speech Recognition Accuracy: Achieved 92% accuracy in converting speech to text using Google Speech Recognition API.
• SQL Query Conversion Accuracy: Natural language converted to correct SQL syntax with 88% accuracy on average.
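The slides do not say how the accuracy figures were measured. One common way to score such a system, shown here only as an assumption, is word accuracy (100% minus the word error rate) for the transcription step and exact-match rate for the generated SQL. The function names and the sample sentences in the usage lines are hypothetical.

```python
def word_edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance counted in words (insert, delete, substitute)."""
    ref, hyp = reference.split(), hypothesis.split()
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_accuracy(pairs) -> float:
    """Percent word accuracy over (reference, transcript) pairs: 100 * (1 - WER)."""
    errors = sum(word_edit_distance(ref, hyp) for ref, hyp in pairs)
    words = sum(len(ref.split()) for ref, _ in pairs)
    if words == 0:
        raise ValueError("reference transcripts contain no words")
    return 100.0 * (1 - errors / words)


def exact_match_accuracy(pairs) -> float:
    """Percent of (expected_sql, generated_sql) pairs that match after
    collapsing whitespace and ignoring case."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("no query pairs given")

    def normalize(sql: str) -> str:
        return " ".join(sql.split()).lower()

    hits = sum(normalize(exp) == normalize(gen) for exp, gen in pairs)
    return 100.0 * hits / len(pairs)


if __name__ == "__main__":
    transcripts = [("show all students", "show all students"),
                   ("list marks of asha", "list marks of usha")]
    print(round(word_accuracy(transcripts), 2))
```

Exact match is a strict measure: two queries that return the same rows but differ in form count as a miss, so a result-based comparison may be a fairer alternative.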
System Performance Overview
[Bar chart: Pass Rate (%) and Avg. Response Time (sec), horizontal axis 0 to 100]
Average Time from Speech to SQL Output: 1.5 to 2 seconds depending on input length.
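A response-time figure like the one above could be gathered with a small timing harness. The sketch below is an assumption about how such a measurement might be done, not the procedure used in the project; `measure_latency`, `summarize`, and the stand-in pipeline in the usage lines are hypothetical names.

```python
import statistics
import time


def measure_latency(pipeline, inputs):
    """Run `pipeline` once per input and return the wall-clock durations in seconds."""
    durations = []
    for item in inputs:
        start = time.perf_counter()
        pipeline(item)
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Return the mean, fastest, and slowest duration."""
    return {
        "mean": statistics.mean(durations),
        "min": min(durations),
        "max": max(durations),
    }


if __name__ == "__main__":
    # Stand-in for the real speech-to-SQL pipeline (hypothetical).
    fake_pipeline = lambda request: request.upper()
    print(summarize(measure_latency(fake_pipeline, ["show all students"] * 5)))
```

Timing requests of different lengths separately would show how much of the 1.5 to 2 second range comes from input length.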
RESULT
CONCLUSION
The Speech to SQL Query Generator project combines Natural Language Processing
with Database Management to bridge the gap between human communication and machine
understanding. By using voice recognition technology, the application lets users
interact with databases through intuitive voice commands, eliminating the need to
write SQL queries by hand.
FUTURE SCOPE
• Support Multiple Languages
• Contextual Understanding
• Enhanced NLP
• Specific Applications and Use Cases
REFERENCES
[1] Jurafsky, D., & Martin, J. H. (2021). Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. Prentice Hall.
[2] Graves, A., & Jaitly, N. (2014). "Towards End-to-End Speech Recognition with Recurrent Neural Networks." International Conference on Machine Learning (ICML).
[3] Baker, J. K. (1975). "Stochastic Modeling for Automatic Speech Understanding." IEEE Transactions on Audio, Speech, and Signal Processing.
[4] Khan, A., & Hossain, M. S. (2020). "A Survey of Natural Language Processing Techniques for Data Querying." Journal of Computer and Communications.
[5] Chowdhury, S., & Shaikh, M. (2019). "Voice-Based Database Querying System." International Journal of Computer Applications.