Flood Risk Analysis in Sindh, Pakistan: Predicting the Most Affected Tehsil Using Statistical and Machine Learning Models with Comprehensive Data

Authors

  • Amna Aziz, Department of Computer Sciences, IQRA University, Karachi, Pakistan.
  • Muhammad Ali, Department of Artificial Intelligence & Mathematical Sciences, Sindh Madressatul Islam University (SMIU), Karachi, Pakistan.
  • Syed Muhammad Hassan Zaidi, Department of Artificial Intelligence & Mathematical Sciences, Sindh Madressatul Islam University (SMIU), Karachi, Pakistan.
  • Faisal Nawaz, Department of Mathematics, Dawood University of Engineering and Technology (DUET), Pakistan.
  • Faryal Sheikh, Department of Computer Science, Sindh Madressatul Islam University (SMIU), Karachi, Pakistan.

DOI:

https://doi.org/10.62019/abbdm.v4i3.190

Keywords:

Flood Risk Sindh, Extreme Weather, Machine Learning, Statistical Model, Prediction, Climate Change, Disaster Management

Abstract

Extreme weather events pose significant risks to the Sindh region, threatening livestock, human lives, and public infrastructure. Unusual monsoon rains in 2022 transformed the geography of Sindh, causing devastating floods and significant damage. Current flood management strategies are effective, but gaps remain in early warning systems, and several findings suggest a need for refined approaches to improve predictive accuracy and response efficiency. The primary goal of this study is to assess existing flood response measures, identify opportunities for improvement, and strengthen resilience against flood events. To that end, the study provides actionable insights for policymakers and emergency planners to refine flood management strategies, with the ultimate aim of bolstering community resilience and preparedness. The paper presents a novel framework for a hybrid predictive model that combines machine learning and statistical analysis to improve the prediction of flood-induced displacement, support timely warnings, and aid preparation for future flood events. The dataset used in this research is drawn from the 'Sindh Flood Data Analysis and Prediction 2022', which includes both demographic and geographical variables. By integrating machine learning and statistical models, the research evaluates their effectiveness in predicting flood impacts and identifying the most affected Tehsils, using Python and the Seaborn library for analysis. The Gradient Boosting Classifier achieved the highest accuracy at 97.67%, followed by Logistic Regression at 95%. In contrast, Decision Trees and Lasso Regression demonstrated lower accuracies of 25% and 55%, respectively.
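
As a rough illustration of what "identifying the most affected Tehsils" can mean at the data level, the sketch below sums displaced persons per Tehsil and ranks the totals. It is not the authors' method: the record layout, the function name rank_tehsils, the Tehsil labels, and all numbers are hypothetical placeholders and are not taken from the 'Sindh Flood Data Analysis and Prediction 2022' dataset.

```python
from collections import defaultdict

# Hypothetical records: (tehsil, displaced_persons).
# These values are invented for illustration only; they are NOT from the paper's dataset.
records = [
    ("Tehsil A", 1200),
    ("Tehsil B", 450),
    ("Tehsil A", 800),
    ("Tehsil C", 1500),
    ("Tehsil B", 300),
]


def rank_tehsils(rows):
    """Sum displaced persons per tehsil and sort from most to least affected."""
    totals = defaultdict(int)
    for tehsil, displaced in rows:
        totals[tehsil] += displaced
    # Sort by total displaced persons, largest first.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


ranking = rank_tehsils(records)
most_affected, total_displaced = ranking[0]
print(most_affected, total_displaced)
```

A predictive model such as those compared in the abstract would replace the observed counts with predicted ones; the ranking step itself would stay the same.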

Published

2024-08-08

How to Cite

Aziz, A., Ali, M., Zaidi, S. M. H., Nawaz, F., & Sheikh, F. (2024). Flood Risk Analysis in Sindh, Pakistan: Predicting the Most Affected Tehsil Using Statistical and Machine Learning Models with Comprehensive Data. The Asian Bulletin of Big Data Management, 4(3), 53–73. https://doi.org/10.62019/abbdm.v4i3.190