Flood detection PPT

This document presents a study on the application of supervised learning models for flood forecasting and risk mapping, emphasizing the limitations of traditional methods and the potential of machine learning techniques. The research utilizes historical data to develop models like Histogram-based Gradient Boosting Classifier, achieving improved accuracy and efficiency in flood predictions. The findings suggest that integrating advanced data processing and machine learning can significantly enhance real-time flood forecasting and disaster management strategies.

Uploaded by

Rishabh Vyas
Application of Supervised Learning Models for Flood Forecasting and Risk Mapping
METHODOLOGY PRESENTATION
Index
▪ Abstract
▪ Introduction
▪ Literature Review
▪ Objective
▪ Methodology
▪ Result
▪ Conclusion
▪ References
Abstract
Floods are among the most devastating disasters, causing significant damage to life, property, and the environment. In recent years, climate change and urbanisation have heightened the frequency and intensity of floods. Traditional flood prediction techniques often exhibit limitations in accuracy and real-time responsiveness. This thesis studies the use of machine learning methodologies to improve flood detection and prediction systems. Through the analysis of historical rainfall, river levels, and weather data, machine learning models such as Random Forest, Support Vector Machines, and Neural Networks can discern trends and provide timely predictions. The research evaluates several models according to performance metrics such as accuracy and precision. The integration of data-driven methodologies with hydrological systems is a viable way to mitigate flood effects. This study advances the creation of intelligent early warning systems, facilitating quick evacuations and risk management strategies.
Introduction
▪ Floods are highly destructive, causing major loss of life and property, especially in flood-prone regions like India.
▪ Traditional prediction methods are slow and complex, relying on hydrological models.
▪ Machine learning (ML) offers faster, more accurate flood forecasting using data from weather, satellites, and historical records.
▪ India faces the world's highest annual flood risk, with recurring floods in many regions. Notably, Chennai experienced record rainfall in 2015, emphasizing the need for precise urban flood forecasting.
▪ Integration with GIS and IoT enhances real-time analysis and early warning capabilities. [1]
▪ This research proposes an ML-based system for flash flood prediction in urban areas using real-time data and image processing for quick, precise alerts.
Importance of Flood Prediction in India
Minimizing Losses: Flood prediction helps reduce deaths and economic damage by enabling timely evacuations and
preventive actions. It's a cost-effective, non-structural flood control method supported by the Ministry of Jal Shakti. [2]

Better Preparedness: Accurate forecasts allow authorities to mobilize resources, deploy emergency services, and use
real-time tech for improved disaster response. [3]

Agricultural Protection: Early warnings support farmers in planning around floods, protecting crops and livelihoods in
India’s agriculture-driven economy. [4]

Smarter Infrastructure Planning: Flood data guides the design of resilient urban infrastructure like drainage systems
and flood barriers, reducing future risks. [5]

Community Involvement: Local knowledge enhances early warning systems, with organizations like the UNFCCC
recognizing the value of community engagement in flood management. [6][7]
Literature Review
Technological Innovations in Flood Prediction Systems

Technology | Description | Reference
Remote Sensing and Satellite Data | Real-time imagery for tracking rainfall, river overflow, and land changes. | [8]
Geographic Information Systems (GIS) | Integrates spatial and hydrological data to model flood-prone zones and evacuation routes. | [9]
IoT and Sensor Networks | Provides real-time water level and rainfall data, crucial for early warning systems. | [10]
Cloud Computing | Enables big data processing and remote collaboration on flood forecasting models. | [11]
UAVs and Drones | Offers rapid aerial imagery of flood zones for real-time assessment and damage evaluation. | [12]
Mobile-Based Early Warning Systems | Delivers alerts and evacuation instructions via SMS and mobile apps to targeted regions. | [13]
Literature Review
Challenges in Accurate Flood Prediction

Challenge | Description | Reference
Data Scarcity in Remote Areas | Many regions lack sufficient historical or real-time hydrological data, making model training difficult. | [14]
Climate Variability | Unpredictable shifts in weather patterns reduce the reliability of traditional and data-driven models. | [15]
Model Overfitting and Bias | ML models may overfit to training data, leading to poor generalization and unreliable forecasts. | [1]
Integration of Diverse Datasets | Combining satellite, sensor, weather, and historical data requires high compatibility and preprocessing. | [16]
Lack of Real-Time Infrastructure | Limited access to IoT devices and real-time data pipelines hampers immediate response and forecasting. | [17]
Computational Constraints | Advanced ML models demand significant computational resources for training, deployment, and real-time use. | [18]
Research Problem
▪ Floods cause major damage to life, property, agriculture, and infrastructure, especially in India.
▪ Despite advances, traditional flood prediction models lack real-time accuracy due to weather variability and data limitations.
▪ ML offers potential for better forecasting by learning complex, non-linear patterns from large datasets.
▪ Current ML-based systems face challenges like limited data, generalization, and feature selection.
▪ This thesis aims to develop accurate and efficient ML models for real-time flood detection and prediction, integrating traditional and intelligent methods.
Research Objectives
▪ Data Collection & Processing: Acquire and clean hydrological and meteorological data; engineer relevant features.
▪ Model Development: Implement ML algorithms such as Random Forest (RF), Decision Tree (DT), and Support Vector Machine (SVM); optimize and validate performance.
▪ Performance Evaluation: Use metrics (accuracy, F1-score, RMSE) and compare with traditional methods.
▪ Localized Forecasting: Deliver district-level, timely predictions for better disaster mitigation.
▪ Societal Impact: Apply models in real-world systems; assess benefits to public safety, policy, and economy.
Methodology
STEPS OF METHODOLOGY
The methodology outlines the systematic approach and procedures employed in this study to predict floods using machine learning techniques:
• Collect the data
• Pre-process the data
• Model development
• Training
• Evaluation
• Prediction
• Compare our results with existing work
Methodology

[Figure: Proposed Model]
Methodology
Model Overview:

Model Chosen: Histogram-based Gradient Boosting Classifier (HistGradientBoostingClassifier)
▪ Ensemble Technique: Builds an ensemble of decision trees to improve prediction.
▪ Histogram Binning: Speeds up training by converting continuous data into discrete bins.
▪ Efficient & Scalable: Handles large datasets and missing values efficiently.
▪ Early Stopping: Prevents overfitting via validation monitoring.


Methodology
Model Development Process

Data Preparation

• Cleaned and engineered features

• SMOTE for class balance

• PCA applied (95% variance retained)

Model Training:

• Trained using scikit-learn's HistGradientBoostingClassifier

• Parameters tuned: max_iter, max_depth, learning_rate, max_leaf_nodes, min_samples_leaf


Methodology
Model Development Process:

Hyperparameter Tuning

• Used GridSearchCV

• Accuracy score used to favor accurate flood detection


Data Collection
▪ Data Sources: Two open-source meteorological and hydrological datasets from flood-prone South Asian regions, especially Bangladesh (similar climate to Indian flood zones).
▪ Time Period: Historical monthly data from 1948 to 2013.
▪ Coverage: Includes data from multiple weather stations, each with geographic coordinates (X, Y, lat-long, altitude).
▪ Key Features Collected: Max & Min Temperature (°C), Rainfall (cm), Relative Humidity (%), Wind Speed (m/s), Cloud Coverage (okta), Bright Sunshine (hrs/day), Station Info (ID, Name, Location).
▪ Flood Label: Binary (1 = Flood, 0 = No Flood).
Data Preprocessing
1. Data Cleaning: Initial preprocessing involved removing duplicates, handling missing values using mean or median imputation, and eliminating irrelevant features such as non-standard date formats and redundant identifiers. This step produced cleaner data for model training and reduced noise.

2. Feature Engineering: New features were derived to capture important relationships:

• Temperature Range: Difference between max and min temperature.

• Rainfall-Temperature Interaction: Product of rainfall and mean temperature.

• Rainfall Squared: Emphasizes the impact of extreme rainfall on floods.

3. Transformation: Numerical features were transformed using the Yeo-Johnson method, which reduces skewness and supports both zero and negative values, making it well suited to variable weather data.
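Steps 1 to 3 can be sketched with pandas and scikit-learn. The column names and the six sample rows are hypothetical, and taking the mean temperature as the average of max and min temperature is an assumption, since the slides do not define it.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import PowerTransformer

# Tiny hypothetical sample; column names are assumptions, not the dataset's headers.
df = pd.DataFrame({
    "max_temp": [31.0, 33.5, 29.0, 33.5, np.nan, 30.2],
    "min_temp": [22.0, 24.5, 21.0, 24.5, 23.0, 20.8],
    "rainfall": [12.0, 48.0, 3.0, 48.0, 95.0, np.nan],
})

# 1. Cleaning: drop exact duplicate rows, then fill gaps with each column's median.
df = df.drop_duplicates().reset_index(drop=True)
df = df.fillna(df.median(numeric_only=True))

# 2. Feature engineering: the three derived features listed above.
df["temp_range"] = df["max_temp"] - df["min_temp"]
mean_temp = (df["max_temp"] + df["min_temp"]) / 2  # assumed definition of mean temperature
df["rain_temp_interaction"] = df["rainfall"] * mean_temp
df["rainfall_sq"] = df["rainfall"] ** 2

# 3. Yeo-Johnson transform: reduces skew and accepts zero and negative values.
transformer = PowerTransformer(method="yeo-johnson", standardize=False)
transformed = pd.DataFrame(transformer.fit_transform(df), columns=df.columns)
print(transformed.round(3))
```

`standardize=False` keeps the transform separate from the z-score step that the pipeline applies later.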
Data Preprocessing
4. Feature Selection: To reduce dimensionality and improve interpretability, SelectKBest with the ANOVA F-test was used. The top 8 features contributing most to flood prediction were retained.

5. Handling Imbalanced Data: Flood instances were underrepresented. To address this, SMOTE (Synthetic Minority Oversampling Technique) was applied to generate synthetic samples of the minority class, improving model sensitivity and reducing bias.

6. Final Preprocessing: Selected features were standardized using z-score normalization. Then, PCA (Principal Component Analysis) was applied to reduce dimensionality while preserving 95% of the variance, optimizing model performance and training speed.
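Steps 4 to 6 can be sketched as below, on synthetic data. The `smote_oversample` helper is a hypothetical, simplified NumPy version of the SMOTE idea for two classes, written so the example has no extra dependency; in practice a library implementation such as the `SMOTE` class in imbalanced-learn would normally be used instead.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.neighbors import NearestNeighbors
from sklearn.preprocessing import StandardScaler


def smote_oversample(X, y, minority_label=1, k=5, random_state=0):
    """SMOTE-style oversampling for two classes: create synthetic minority rows
    by interpolating between a minority point and one of its k nearest minority
    neighbours, until both classes have the same size."""
    rng = np.random.default_rng(random_state)
    X_min = X[y == minority_label]
    n_new = int((y != minority_label).sum() - len(X_min))
    if n_new <= 0:
        return X, y
    # Ask for k + 1 neighbours because each point is its own nearest neighbour.
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X_min)
    neighbours = nn.kneighbors(X_min, return_distance=False)[:, 1:]
    base = rng.integers(0, len(X_min), size=n_new)
    picked = neighbours[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))  # interpolation factor in [0, 1)
    synthetic = X_min[base] + gap * (X_min[picked] - X_min[base])
    X_out = np.vstack([X, synthetic])
    y_out = np.concatenate([y, np.full(n_new, minority_label)])
    return X_out, y_out


# Synthetic, imbalanced stand-in data (assumption: roughly 10% flood rows).
X, y = make_classification(n_samples=1000, n_features=12, n_informative=6,
                           weights=[0.9, 0.1], random_state=0)

# 4. Keep the 8 features with the highest ANOVA F-scores.
X_sel = SelectKBest(score_func=f_classif, k=8).fit_transform(X, y)

# 5. Balance the classes with synthetic minority samples.
X_bal, y_bal = smote_oversample(X_sel, y)

# 6. z-score standardization, then PCA keeping 95% of the variance.
X_std = StandardScaler().fit_transform(X_bal)
pca = PCA(n_components=0.95)
X_pca = pca.fit_transform(X_std)

print("class counts after balancing:", np.bincount(y_bal))
print("components kept:", pca.n_components_)
```

When evaluating a model, oversampling is normally fitted on the training split only, so that synthetic rows never leak into the test data.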
Result
RESULTS
To provide a thorough comparison, several classification models were trained and evaluated on the preprocessed dataset: SVM, MLPClassifier, AdaBoost, SGDClassifier, Logistic Regression, and Histogram-based Gradient Boosting. Each model was assessed on Accuracy, F1 Score, Recall, and Training Time as the main metrics.
Model | Accuracy | F1 Score | Recall | Training Time (seconds)
Support Vector Machine | 0.920 | 0.920 | 0.923 | 10.144
Multi-layer Perceptron | 0.919 | 0.919 | 0.915 | 13.121
AdaBoost | 0.909 | 0.908 | 0.898 | 1.339
SGDClassifier | 0.915 | 0.917 | 0.925 | 0.037
Histogram-based Gradient Boosting | 0.923 | 0.924 | 0.922 | 0.488
Logistic Regression | 0.916 | 0.917 | 0.929 | 0.024


Result
Model Results
[Figure: Accuracy, F1 Score, and Recall achieved by each model (Support Vector Machine, Multi-layer Perceptron, AdaBoost, SGDClassifier, Histogram-based Gradient Boosting, Logistic Regression)]
Result

Among all models, the Histogram-based Gradient Boosting Classifier had the best overall performance, with the highest Accuracy (0.923) and F1 Score (0.924) and a short training time (0.488 s). This made it the most robust choice for further optimization.
Result
Metric | Value
Model | Histogram-based Gradient Boosting
Accuracy | 93%
Precision | 0.92
Recall | 0.92
F1 Score | 0.92
ROC-AUC Score | 0.98
Training Time (seconds) | 4.67

[Figure: Results of Proposed Model (Accuracy, Precision, Recall, F1 Score, ROC-AUC Score)]


Result

[Figure: Confusion Matrix of HGB Model on test data]
[Figure: ROC Curve of HGB Model on test data]
Comparison
Comparing the proposed method with existing work [19]:

Aspect | Existing Study [19] | Our Study (Histogram-based Gradient Boosting)
Accuracy | 0.86 | 0.93
F1 Score | 0.88 | 0.92
Recall | 0.87 | 0.92
Training Time | 4.67 seconds | 0.49 seconds
Hyperparameter Tuning | No | Yes (GridSearchCV)
Model Type | Logistic regression | Ensemble Learning (HistGradientBoosting)
Predictive Robustness | Moderate | High
Handling Imbalance | Not Specified | SMOTE + PCA

[Figure: Accuracy Comparison (Logistic regression 86%, Histogram Based GB 93%)]
Improvements And Novelty
Higher Accuracy & Recall: Achieved 0.93 accuracy and 0.92 recall, outperforming previous models.

Improved F1 Score: Scored 0.92, showing better balance between precision and recall.

Faster Training: Reduced training time to 0.49 s vs. 4.67 s for the existing study [19].

Smarter Preprocessing: Used SMOTE and PCA for better class balance and dimensionality reduction.

Better Model Design: Applied GridSearchCV and Histogram-based Gradient Boosting for stronger, more reliable predictions.
Conclusion and Future Work
This study presents a robust and efficient machine learning-based flood prediction model that significantly
improves accuracy, recall, and computational efficiency compared to traditional models. By leveraging
advanced preprocessing techniques like SMOTE and PCA, along with optimized ensemble methods such
as Histogram-based Gradient Boosting, the system ensures more reliable and real-time flood forecasting. In
future work, the model can be extended to include real-time sensor data integration, geographic
information systems (GIS), and satellite imagery to enhance spatial resolution and predictive capability.
Additionally, deploying the model as a cloud-based early warning system accessible to disaster
management authorities can amplify its practical impact.
References
[1] A. Mosavi, "Flood Prediction Using Machine Learning Models: Literature Review," https://doi.org/10.3390/w10111536, vol. 10, 2018.

[2] Ministry of Jal Shakti, "Flood Forecasting," Central Water Commission, Government of India, 2021. [Online]. Available: https://jalshakti-dowr.gov.in.

[3] W. Management, "Real-time modeling and simulation for flood forecasting," [Online]. Available: https://3diwatermanagement.com.

[4] E. W. S. Initiative, "Empowering Farmers through Early Warnings," [Online]. Available: https://www.earlywarningsystem.org.

[5] ScienceDirect, "Flood Risk Assessment and Infrastructure Resilience," [Online]. Available: https://www.sciencedirect.com.

[6] UNFCCC, "Community-based approaches to flood management," [Online]. Available: https://unfccc.int.

[7] Schumann and B. G., "Progress in integration of remote sensing–derived flood extent and stage data and hydraulic models," Reviews of Geophysics, 2009.

[8] N. N. Kourgialas and Karatzas, "Flood management and a GIS modelling method to assess flood-hazard areas—A case study," Hydrological Sciences Journal, p. 212–225, 2011.

[9] A. K. R. & S. Sharma, "IoT-enabled smart flood monitoring system," Procedia Computer Science, p. 2081–2090, 2020.

[10] M. M. S. M. & A. Z. Gohar, "Cloud computing and smart grids: A review," Journal of Network and Computer Applications, p. 27–44, 2020.

[11] A. A. & A. O. D. Adelakun, "Application of UAV technology in flood disaster monitoring and management," International Journal of Disaster Risk Reduction, 2021.

[12] K. Beven, Rainfall-runoff modelling: The primer, Wiley-Blackwell, 2012.

[13] Z. W. & S. H. J. Kundzewicz, "Floods in the IPCC TAR perspective," Natural Hazards, https://doi.org/10.1023/B:NHAZ.0000020250.95796.b9, p. 111–128, 2004.

[14] G. Schumann et al., "Progress in integration of remote sensing–derived flood extent and stage data and hydraulic models," Reviews of Geophysics, 2009.

[15] A. K. R. & S. S. Sharma, "IoT-enabled smart flood monitoring system," Procedia Computer Science, p. 2081–2090, 2020.

[16] R. J. S. L. M. & S. D. P. Abrahart, "Practical hydroinformatics," Springer, 2008.

[17] H. Gawas, "Advancing Flood Prediction: Leveraging Machine Learning for Accurate Prediction," International Journal for Research in Applied Science and Engineering Technology, p. 2235–2240, 2023.

[18] R. Byali and P. B. Divya, "Early Flood Detection Based on Iot Using Machine Learning," International Journal of Research Publication and Reviews, p. 3842–3846, 2022.

[19] Syeed, Miah Mohammad Asif, et al. "Flood prediction using machine learning models." 2022 International Congress on Human-Computer Interaction, Optimization and Robotic Applications (HORA). IEEE, 2022.
