
Icdt 2024


Developing a Model for Bird Vocalization Recognition and Population Estimation in Forest Ecosystems
Kartik Yadav, Suyash Dabral, Satvik Vats, Vikrant Sharma, Vinay Kukreja
Graphic Era Hill University, Graphic Era Hill University, Graphic Era Hill University, Graphic Era Hill University, Chitkara University
kartik134yadav@[Link], s.dabral2001@[Link], svats@[Link], vsharma@[Link],
[Link]@[Link]
Saptarsi Pal, Shreyansh Mishra, Ajay Kumar
Research Scholar, Research Scholar, Research Scholar
Dronacharya Group of Institutions, Dronacharya Group of Institutions, Dronacharya
Group of Institutions
Outline

❑ Introduction
❑ Existing Approaches/Related Works
❑ Problems in Existing Approaches
❑ Proposed Methodology
❑ Results and Discussion
❑ Conclusions and Future Work
❑ References
Introduction

1. **Introduction to Automated Bird Population Surveys**:


1. Overview of Shift from Manual to Automated Surveys
2. Importance of Machine Learning and Digital Signal Processing Technologies

2. **Challenges in Bird Audio Detection**:


1. Issues with Weakly Labelled Data (WLD) and Forest Soundscapes Complexity
2. Need for Improved Methods for Accurate Bird Call Identification

3. **Proposed Methodology**:
1. Utilization of Convolutional Neural Networks (CNNs) for Acoustic Bird Audio Classification
2. Creation of Spectrograms from Audio Data for Model Training

4. **Results Compilation and Ecological Impact**:


1. Compilation of Model Results into CSV Files for Quantifying Bird Vocal Presence
2. Significance of Automated Bird Audio Classification in Ecological Research
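The spectrogram step mentioned in the proposed methodology can be sketched as a short-time Fourier transform. This is a minimal NumPy illustration, not the authors' code: the function name `spectrogram` and the `frame_length` and `frame_step` values are assumptions chosen for the example, and only the 16 kHz sample rate comes from the slides.

```python
import numpy as np

def spectrogram(signal, frame_length=320, frame_step=32):
    """Magnitude STFT: one row per time frame, one column per frequency bin.

    frame_length and frame_step are illustrative values, not from the slides.
    """
    signal = np.asarray(signal, dtype=np.float64)
    # Zero-pad very short clips so that at least one full frame exists.
    if len(signal) < frame_length:
        signal = np.pad(signal, (0, frame_length - len(signal)))
    n_frames = 1 + (len(signal) - frame_length) // frame_step
    window = np.hanning(frame_length)  # taper each frame to reduce leakage
    frames = np.stack([
        signal[i * frame_step: i * frame_step + frame_length] * window
        for i in range(n_frames)
    ])
    # rfft keeps only the non-negative frequencies of a real-valued signal.
    return np.abs(np.fft.rfft(frames, axis=1))

# Synthetic example: one second of a 1 kHz tone sampled at 16 kHz.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)
spec = spectrogram(tone)
peak_bin = int(spec.mean(axis=0).argmax())
peak_hz = peak_bin * sample_rate / 320  # bin index converted back to Hz
```

Each row of `spec` is one time frame, so the array can be treated as a single-channel image and passed to a CNN.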
Existing Approaches/Related Works

1. **Joint Detection and Classification**: The model used a detector and a classifier to assign a probability and a type to each frame, then combined the frame-level outputs with a weighted pooling technique.

2. **Convolutional Neural Networks**: The study tailored CNN architectures to the characteristics of spectrogram data and to address class imbalance.

3. **Mixture Invariant Training**: The study used an unsupervised machine learning model for sound separation, improving reverberant speech separation and enhancing speech from noisy mixtures.

4. **CNN architecture using Theano and Lasagne**: The model is a CNN with seven weighted layers, no bottlenecks, and no shortcuts. It was designed for ease of adjustment and built with Theano and Lasagne, supporting grouped convolutions and a final 1x1 convolution before global average pooling.

5. **Pretrained ResNet-50**: The pre-trained ResNet-50 model was adapted to classify 46 bird species by replacing its final 1,000-neuron layer with a 46-neuron fully connected layer and using grayscale spectrogram images as input.
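The weighted pooling idea in the joint detection and classification approach can be illustrated with a small sketch. It assumes one plausible reading: each frame has a detector probability (is the frame informative?) and a classifier probability (is a bird present?), and the clip-level score is the detector-weighted average of the classifier outputs. The function name, the `eps` term, and the sample values are hypothetical, and the exact formulation in the cited work may differ.

```python
def weighted_pooling(detector_probs, classifier_probs, eps=1e-8):
    """Clip-level score: classifier outputs averaged using detector weights."""
    if len(detector_probs) != len(classifier_probs):
        raise ValueError("need one detector and one classifier value per frame")
    total_weight = sum(detector_probs)
    weighted_sum = sum(d * c for d, c in zip(detector_probs, classifier_probs))
    # eps guards against division by zero when the detector rejects all frames.
    return weighted_sum / (total_weight + eps)

# Hypothetical frame-level outputs for a four-frame clip.
detector = [0.9, 0.1, 0.8, 0.05]    # frames 0 and 2 look informative
classifier = [0.95, 0.5, 0.9, 0.5]  # bird probability per frame

clip_score = weighted_pooling(detector, classifier)
plain_mean = sum(classifier) / len(classifier)
```

Frames that the detector considers uninformative contribute little, so `clip_score` is pulled toward the confident frames rather than toward the unweighted mean.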
Problems in Existing Approaches

1. **Difficulty in Distinguishing Sounds**:


- Existing methods struggle to distinguish bird calls from other environmental sounds, which can lead to false positives or false negatives.

2. **Class Imbalance**:
- Some bird species have more recordings than others in the dataset, which can limit the model's ability to generalize to less common species.

3. **Lack of Labelled Data**:


- The method relies on unsupervised learning, which can be less effective than supervised learning at separating specific sounds from mixtures.

4. **Inefficiency in Handling Large Datasets**:


- Computational cost and training time are high, especially with large datasets or complex audio signals.

5. **Changing Dataset**:
- A model pre-trained on ImageNet may not perform as well on the new task if the characteristics of the bird species data differ significantly from those of the ImageNet classes.
Proposed Methodology

1. **Audio Preprocessing**:
1. Description: Conversion and Resampling of Audio Files
2. Method: Utilization of TensorFlow I/O's AudioIOTensor for Dual-Channel Conversion and Resampling to 16 kHz

2. **Batch Processing**:
1. Description: Partitioning and Spectrogram Translation
2. Method: Non-overlapping Batch Partitioning and Translation to Spectrograms for Each Batch

3. **Threshold-Based Classification**:
1. Description: Categorization of Predicted Values
2. Method: Classification Using a Threshold of 0.99 to Label Capuchin Bird Vocalizations as '1' or '0'

4. **Output Storage**:
1. Description: Storing Quantified Capuchin Bird Vocalizations
2. Method: Saving Results in CSV Format for Each Audio Recording
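The batch partitioning, thresholding, and CSV steps above can be sketched in plain Python. This is an illustration under stated assumptions, not the authors' implementation: all function names, the file names, and the sample scores are hypothetical; the strict `>` comparison against 0.99 is one possible reading of the threshold; and counting a run of adjacent positive windows as a single vocalization is an assumption about how calls are quantified.

```python
import csv
import io
from itertools import groupby

def partition(samples, window_size):
    """Split samples into non-overlapping windows; zero-pad the last one."""
    windows = []
    for start in range(0, len(samples), window_size):
        chunk = list(samples[start:start + window_size])
        chunk += [0.0] * (window_size - len(chunk))
        windows.append(chunk)
    return windows

def apply_threshold(scores, cutoff=0.99):
    """Label a window 1 if its model score exceeds the cutoff, else 0."""
    return [1 if score > cutoff else 0 for score in scores]

def count_calls(labels):
    """Count runs of consecutive 1s (assumption: adjacent hits are one call)."""
    return sum(key for key, _run in groupby(labels))

def write_results(counts, handle):
    """Write one row per recording with its estimated call count."""
    writer = csv.writer(handle)
    writer.writerow(["recording", "capuchin_calls"])
    for name, count in sorted(counts.items()):
        writer.writerow([name, count])

# Hypothetical per-window model scores for two recordings.
scores = {
    "recording_00.mp3": [0.10, 0.995, 0.999, 0.30, 0.9999, 0.98],
    "recording_01.mp3": [0.02, 0.50, 0.97],
}
counts = {name: count_calls(apply_threshold(s)) for name, s in scores.items()}

buffer = io.StringIO()  # stands in for an open CSV file
write_results(counts, buffer)
```

In a real run, `partition` would be applied to the resampled waveform, each window would be converted to a spectrogram and scored by the CNN, and `buffer` would be replaced by a file opened with `newline=""`.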
Results and Discussion

1. **Error at the end of training**:


- The loss (error) is recorded after each epoch. The final value of the loss at the end of the 10th epoch is 1.2473e-08.

2. **Precision**:
- Precision measures the fraction of the model's positive predictions that are correct. The final value of precision at the end of the 10th epoch is 1.0000.

3. **Recall**:
- Recall measures the model's ability to detect positive samples. The final value of recall at the end of the 10th epoch is 1.0000.

4. **Accuracy**:
- Accuracy measures the fraction of all predictions that are correct. The final value of accuracy at the end of the 10th epoch is 0.9375.
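For reference, the three metrics above can be computed from binary labels as in the following sketch. The function name and the labels are made up for illustration and are not the experiment's data; the code only shows the standard definitions in terms of true and false positives and negatives.

```python
def binary_metrics(y_true, y_pred):
    """Return (precision, recall, accuracy) for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    # Return 0.0 when a ratio is undefined (no positive predictions or labels).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return precision, recall, accuracy

# Illustrative labels: four positive clips, four negative clips.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0]

precision, recall, accuracy = binary_metrics(y_true, y_pred)
```

In this example every positive prediction is correct, so precision is 1.0, while two missed positives lower recall to 0.5 and accuracy to 0.75.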
Conclusions and Future Work

1. **Utilization of Convolutional Neural Networks (CNNs)**:


1. Pioneering strategy employing CNNs for classifying bird audio amidst complex soundscapes.
2. Focus on capuchin bird vocalizations in forest sound recordings, offering a robust identification and quantification method.

2. **Achieving High Accuracy**:


1. Attainment of 93.75% accuracy after 10 epochs through supervised learning and spectrogram-based feature extraction.
2. Simplifies data collection and enhances understanding of bird behavior across diverse ecological settings.

3. **Versatility in Audio Recognition**:


1. Demonstrated adaptability and versatility of the model hint at broader applications in audio recognition.
2. Holds promise for automating bird population surveys and advancing ecological research and wildlife conservation efforts.

4. **Future Directions**:
1. Expanding Model Capabilities:
1. Future endeavors may involve extending the model to recognize and classify vocalizations from various bird species in similar acoustic
environments.
2. Development of a multi-label classification approach for comprehensive assessments of bird biodiversity in ecological research.
References

1. Nurtantio, Pulung et al. "Bird Voice Classification Based on Combination Feature Extraction and Reduction Dimension with the K-Nearest Neighbor." International Journal of Intelligent Engineering and Systems (2022): n. pag.

2. Jiangjian Xie, Yujie Zhong, Junguo Zhang, Shuo Liu, Changqing Ding, Andreas Triantafyllopoulos, "A review of automatic recognition technology for bird vocalizations in the deep learning era," Ecological Informatics.

3. Stefan Kahl, Thomas Wilhelm-Stein, Hussein Hussein, Holger Klinck, Danny Kowerko, Marc Ritter, and Maximilian Eibl, "Large-Scale Bird Sound Classification using Convolutional Neural Networks."

4. F. Yang, Y. Jiang and Y. Xu, "Design of Bird Sound Recognition Model Based on Lightweight," in IEEE Access, vol. 10, pp. 85189-85198, 2022, doi: 10.1109/ACCESS.2022.3198104.

5. M. Graciarena, M. Delplanche, E. Shriberg and A. Stolcke, "Bird species recognition combining acoustic and sequence modeling," 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Prague, Czech Republic, 2011, pp. 341-344, doi: 10.1109/ICASSP.2011.5946410.

6. Li, Jingtan & Sun, Mengkai & Zhao, Zhonghao & Li, Xingcan & Li, Gaigai & Wu, Chen & Qian, Kun & Hu, Bin & Yamamoto, Yoshiharu & Schuller, Björn. (2023). Battling with the low-resource condition for snore sound recognition: introducing a meta-learning strategy. EURASIP Journal on Audio, Speech, and Music Processing. 2023. 10.1186/s13636-023-00309-3.

7. Dhondt, André & Lambrechts, Marcel. (1992). Individual voice recognition in birds. Trends in Ecology & Evolution. 7. 178-9. 10.1016/0169-5347(92)90068-M.

8. Mohammad, Aas & Tripathi, Manish. (2019). Audio Analysis and Classification: A Review. International Journal of Research in Advent Technology. 7. 103-109. 10.32622/ijrat.

9. R. Wielgat, T. P. Zieliński, T. Potempa, A. Lisowska-Lis and D. Król, "HFCC based recognition of bird species," Signal Processing Algorithms, Architectures, Arrangements, and Applications SPA 2007, Poznan, Poland, 2007, pp. 129-134, doi: 10.1109/SPA.2007.5903313.

10. H. Tyagi, R. M. Hegde, H. A. Murthy and A. Prabhakar, "Automatic identification of bird calls using Spectral Ensemble Average Voice Prints," 2006 14th European Signal Processing Conference, Florence, Italy, 2006, pp. 1-5.

11. Darji, Mittal. (2017). Audio Signal Processing: A Review of Audio Signal Classification Features. International Journal of Scientific Research in Computer Science, Engineering and Information Technology. 2. 227-230.

12. Chen, Tianxiang & Kumar, Avrosh & Nagarsheth, Parav & Sivaraman, Ganesh & Khoury, Elie. (2020). Generalization of Audio Deepfake Detection. 132-137. 10.21437/Odyssey.2020-19.

13. Lewis, Rebecca & Williams, Leah & Gilman, Robert. (2020). The uses and implications of avian vocalizations for conservation planning. Conservation Biology. 35. 10.1111/cobi.

14. Lam Pham, Huy Phan, Truc Nguyen, Ramaswamy Palaniappan, Alfred Mertins and Ian McLoughlin, "Robust Acoustic Scene Classification using a Multi-Spectrogram Encoder-Decoder."

15. Scott Wisdom, Efthymios Tzinis, Hakan Erdogan, Ron J. Weiss, Kevin Wilson and John R. Hershey, "Unsupervised Sound Separation Using Mixture Invariant Training."

16. Qiuqiang Kong, Yong Xu, Mark D. Plumbley, "Joint Detection and Classification Convolutional Neural Network on Weakly Labelled Bird Audio Detection," Center for Vision, Speech and Signal Processing (CVSSP), University of Surrey.
Thank You
