Robust machine learning algorithms for predicting coastal water quality index

Md Galal Uddin, Stephen Nash, Mir Talas Mahammad Diganta, Azizur Rahman, Agnieszka I. Olbert

Research output: Contribution to journal › Article › peer-review


Abstract

Coastal water quality assessment is essential for maintaining “good water quality” status for living organisms in coastal ecosystems. The water quality index (WQI) is a widely used tool for assessing water quality, but the technique has received much criticism because of concerns about model reliability and inconsistency. The present study used a recently developed, improved WQI model to calculate coastal WQIs in Cork Harbour. The aim of the research was to determine the most reliable and robust machine learning (ML) algorithm(s) for predicting WQIs at each monitoring point, instead of repeatedly employing sub-index (SI) and weight values, in order to reduce model uncertainty. We compared eight commonly used algorithms: Random Forest (RF), Decision Tree (DT), K-Nearest Neighbors (KNN), Extreme Gradient Boosting (XGB), Extra Tree (ExT), Support Vector Machine (SVM), Linear Regression (LR), and Gaussian Naïve Bayes (GNB). To develop the prediction models, the dataset was divided into two groups, training (70%) and testing (30%), and the models were validated using 10-fold cross-validation. Model performance was evaluated using the RMSE, MSE, MAE, R² and PREI metrics. The tree-based DT (RMSE = 0.0, MSE = 0.0, MAE = 0.0, R² = 1.0 and PREI = 0.0) and ExT (RMSE = 0.0, MSE = 0.0, MAE = 0.0, R² = 1.0 and PREI = 0.0) models, and the ensemble tree-based XGB (RMSE = 0.0, MSE = 0.0, MAE = 0.0, R² = 1.0 and PREI = +0.16 to −0.17) and RF (RMSE = 2.0, MSE = 3.80, MAE = 1.10, R² = 0.98 and PREI = +3.52 to −25.38) models, outperformed the other models. The model performance and PREI results indicate that the DT, ExT, and XGB models could be effective and robust, and could significantly reduce model uncertainty in predicting WQIs. The findings of this study are also useful for reducing model uncertainty and optimizing the WQM-WQI model architecture for predicting WQI values.
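
The evaluation protocol described in the abstract (a 70/30 train/test split, 10-fold cross-validation, and error metrics) can be illustrated with a minimal sketch in plain Python. This is not the authors' code: the function names, the sample values, the random seed, and the choice to build the folds from the training portion are all assumptions made for illustration. PREI is omitted because the abstract does not give its formula.

```python
import math
import random


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE and R^2 for paired observed/predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total sum of squares
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return {"MSE": mse, "RMSE": math.sqrt(mse), "MAE": mae, "R2": r2}


def train_test_split_indices(n_samples, test_fraction=0.30, seed=0):
    """Shuffle sample indices and split them into (train, test) lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seed is an arbitrary assumption
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]


def k_fold_indices(indices, k=10):
    """Yield (training, validation) index lists for k-fold cross-validation."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield training, folds[i]


# Hypothetical WQI values, invented purely to exercise the functions.
observed = [62.0, 70.0, 55.0, 81.0]
predicted = [60.0, 72.0, 55.0, 80.0]
scores = regression_metrics(observed, predicted)

# 100 hypothetical samples: 70 for training, 30 for testing.
train_idx, test_idx = train_test_split_indices(100)

# Assumption: the 10 folds are drawn from the training portion only.
folds = list(k_fold_indices(train_idx, k=10))
```

In this sketch a perfect fit gives RMSE = MSE = MAE = 0 and R² = 1, the pattern reported above for the DT and ExT models.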

Original language: English
Article number: 115923
Pages (from-to): 1-16
Number of pages: 16
Journal: Journal of Environmental Management
Volume: 321
Issue number: 11
Early online date: 19 Aug 2022
Publication status: E-pub ahead of print - 19 Aug 2022
