Mr. Desuri Venkata Simha Reddy
Guided By: Dr. Rajesh Kumar.T
Analysis of ML Algorithms with Enhanced Accuracy for
Conversion of Non-Audible Murmur to Normal Speech
INTRODUCTION
The aim of this work is to convert Non-Audible Murmur to normal speech using machine learning
techniques.
Non-Audible Murmur (NAM) is a form of communication that allows individuals
to convey messages without speaking aloud.
This technology could significantly change communication in situations where speaking
aloud is difficult or inappropriate, for instance in noisy surroundings or when
communicating discreetly.
NAM technology is improving accessibility for individuals with disabilities,
automating call centre operations, and enabling hands-free control of devices such
as smartphones and smart speakers.
MATERIALS AND METHODS
[Figure: workflow diagram. Blocks: NAM speaker (soft-spoken murmur); collection of NAM speech data; pre-processing (signal noise removal); feature extraction; model training (ML-based speech signal recognition, model optimization); recognized audible voice (speech audio signal).]
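The pre-processing and feature-extraction stages of the workflow can be illustrated with a minimal sketch. The poster does not state which filters or features were used, so the pre-emphasis filter, the frame sizes, and the two features below (short-time energy and zero-crossing rate) are assumptions chosen for illustration, and every function name is hypothetical. The input is a synthetic tone, not NAM data.

```python
import math

def pre_emphasis(signal, coeff=0.97):
    """Assumed pre-processing step: y[n] = x[n] - coeff * x[n-1]."""
    if not signal:
        return []
    return [signal[0]] + [signal[n] - coeff * signal[n - 1]
                          for n in range(1, len(signal))]

def frame_signal(signal, frame_len=400, hop=160):
    """Split into overlapping frames; a trailing partial frame is dropped."""
    last_start = len(signal) - frame_len
    return [signal[s:s + frame_len] for s in range(0, last_start + 1, hop)]

def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def extract_features(signal, frame_len=400, hop=160):
    """Return one (energy, zero-crossing rate) pair per frame."""
    emphasized = pre_emphasis(signal)
    return [(short_time_energy(f), zero_crossing_rate(f))
            for f in frame_signal(emphasized, frame_len, hop)]

# Synthetic stand-in for a recording: one second of a 100 Hz tone at 16 kHz.
sample_rate = 16000
tone = [math.sin(2 * math.pi * 100 * n / sample_rate) for n in range(sample_rate)]
features = extract_features(tone)
print(len(features), "frames, first frame features:", features[0])
```

Each feature pair could then be passed to any of the classifiers compared in the results. With real recordings, richer features would likely be needed.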
RESULTS
Comparison of machine learning algorithms in terms of mean accuracy. The mean accuracy of the Random Forest algorithm is
higher than that of SVM, RNN, KNN, and CNN.
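The comparison can be reproduced from the mean accuracies quoted in this poster's pairwise comparisons. The sketch below ranks the five algorithms and prints each one's gap to the best result. It takes KNN as 93.89%, the value given alongside CNN, and the function name is hypothetical.

```python
# Mean accuracies (%) as quoted in the poster; KNN taken as 93.89 (assumption).
reported_accuracy = {
    "SVM": 90.28,
    "RNN": 92.11,
    "KNN": 93.89,
    "CNN": 95.89,
    "Random Forest": 99.86,
}

def rank_by_accuracy(scores):
    """Return (name, accuracy) pairs sorted from highest to lowest accuracy."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

ranking = rank_by_accuracy(reported_accuracy)
best_name, best_score = ranking[0]

for name, score in ranking:
    gap = best_score - score
    print(f"{name:<14}{score:6.2f}%  (gap to best: {gap:.2f} points)")
```

A ranking of means alone does not show whether the differences are statistically meaningful, which is why the pairwise t-tests are reported separately.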
DISCUSSION AND CONCLUSION
The dataset was collected from the Kaggle and TIMIT open-source websites and contains more than one lakh (100,000) samples
taken from different speakers with glottal infection, throat cancer, and cold sores. Twenty thousand samples were tested using the
Audacity speech tool.
An independent-samples t-test was used to compare the mean accuracy of the following pairs of machine learning
algorithms:
1. Recurrent Neural Network (92.11%) & Support Vector Machine (90.28%)
2. K-Nearest Neighbors (93.89%) & Recurrent Neural Network (92.11%)
3. Convolutional Neural Network (95.89%) & K-Nearest Neighbors (93.89%)
4. Random Forest (99.86%) & Convolutional Neural Network (95.89%)
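For readers unfamiliar with the independent t-test, the following is a minimal sketch of the pooled-variance (Student's) two-sample t-statistic. The poster does not report per-run accuracies or which variant of the test was used, so the equal-variance assumption, the function name, and the per-run values below are all hypothetical and serve only to show the calculation.

```python
import math
from statistics import mean, variance

def independent_t_test(a, b):
    """Student's two-sample t-statistic with pooled variance.

    Assumes both groups have equal variance. Returns (t, degrees of freedom).
    """
    n_a, n_b = len(a), len(b)
    dof = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / dof
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(a) - mean(b)) / std_err, dof

# HYPOTHETICAL per-run accuracies (%), invented for illustration only.
# They are not results from this study.
rf_runs = [99.8, 99.9, 99.7, 99.9, 100.0]
cnn_runs = [95.7, 96.1, 95.9, 95.8, 96.0]

t_stat, dof = independent_t_test(rf_runs, cnn_runs)
print(f"t = {t_stat:.2f}, degrees of freedom = {dof}")
```

The t-statistic is then compared with the critical value of the t-distribution for the given degrees of freedom. In practice a library routine such as `scipy.stats.ttest_ind` also returns the p-value.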
NAM conversion is applicable in military settings and CBI investigations, and can be used by speech-impaired people for communication.
It is concluded that, for converting soft-spoken murmur to normal speech using machine learning techniques, the Random
Forest algorithm provides the best accuracy among the algorithms compared.
BIBLIOGRAPHY
Chen, Chengxin, and Pengyuan Zhang. 2022. “CTA-RNN: Channel and Temporal-Wise Attention RNN Leveraging Pre-Trained
ASR Embeddings for Speech Emotion Recognition.” Interspeech 2022. https://doi.org/10.21437/interspeech.2022-10403.
T, Rajesh Kumar, and Kumar T. Rajesh. 2021. “Enhanced Optimization in DCNN for Conversion of Non Audible Murmur to
Normal Speech Based on Dirichlet Process Mixture Feature.” Revista Gestão Inovação E Tecnologias.
https://doi.org/10.47059/revistageintec.v11i4.2239.
Babani, Denis, Tomoki Toda, Hiroshi Saruwatari, and Kiyohiro Shikano. 2011. “Acoustic Model Training for Soft Spoken
Murmur Recognition Using Transformed Normal Speech Data.” 2011 IEEE International Conference on Acoustics, Speech and
Signal Processing (ICASSP). https://doi.org/10.1109/icassp.2011.5947535.
Heracleous, Panikos, and Norihiro Hagita. 2010. “Non-Audible Murmur Recognition Based on Fusion of Audio and Visual
Streams.” Interspeech 2010. https://doi.org/10.21437/interspeech.2010-717.