TY - JOUR
T1 - Robust machine learning algorithms for predicting coastal water quality index
AU - Uddin, Md Galal
AU - Nash, Stephen
AU - Mahammad Diganta, Mir Talas
AU - Rahman, Azizur
AU - Olbert, Agnieszka I.
N1 - Publisher Copyright: © 2022 The Authors
PY - 2022/11/1
Y1 - 2022/11/1
AB - Coastal water quality assessment is an essential task for maintaining “good water quality” status for living organisms in coastal ecosystems. The water quality index (WQI) is a widely used tool to assess water quality, but the technique has been criticized for the model's limited reliability and its inconsistency. The present study used a recently developed, improved WQI model to calculate coastal WQIs in Cork Harbour. The aim of the research is to determine the most reliable and robust machine learning (ML) algorithm(s) for predicting WQIs at each monitoring point, instead of repeatedly employing SI and weight values, in order to reduce model uncertainty. We compared eight commonly used algorithms: Random Forest (RF), Decision Tree (DT), K-Nearest Neighbors (KNN), Extreme Gradient Boosting (XGB), Extra Tree (ExT), Support Vector Machine (SVM), Linear Regression (LR), and Gaussian Naïve Bayes (GNB). To develop the prediction models, the dataset was divided into two groups, training (70%) and testing (30%), and the models were validated using 10-fold cross-validation. Model performance was evaluated using the RMSE, MSE, MAE, R2, and PREI metrics. The tree-based DT (RMSE = 0.0, MSE = 0.0, MAE = 0.0, R2 = 1.0 and PREI = 0.0) and ExT (RMSE = 0.0, MSE = 0.0, MAE = 0.0, R2 = 1.0 and PREI = 0.0) models and the ensemble tree-based XGB (RMSE = 0.0, MSE = 0.0, MAE = 0.0, R2 = 1.0 and PREI = +0.16 to −0.17) and RF (RMSE = 2.0, MSE = 3.80, MAE = 1.10, R2 = 0.98 and PREI = +3.52 to −25.38) models outperformed the other models. The model performance and PREI results indicate that the DT, ExT, and XGB models could be effective and robust, and could significantly reduce model uncertainty in predicting WQIs. The findings of this study are also useful for reducing model uncertainty and optimizing the WQM-WQI model architecture for predicting WQI values.
KW - Robust machine learning algorithms
KW - Coastal water quality index model
KW - Coastal water quality
KW - Uncertainty
KW - Cork Harbour
KW - Reproducibility of Results
KW - Algorithms
KW - Ecosystem
KW - Bayes Theorem
KW - Machine Learning
KW - Water Quality
UR - http://www.scopus.com/inward/record.url?scp=85136224158&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85136224158&partnerID=8YFLogxK
UR - https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4135697
U2 - 10.1016/j.jenvman.2022.115923
DO - 10.1016/j.jenvman.2022.115923
M3 - Article
C2 - 35988401
AN - SCOPUS:85136224158
SN - 1095-8630
VL - 321
JO - Journal of Environmental Management
JF - Journal of Environmental Management
M1 - 115923
ER -