Development of a novel water quality index model using data science approaches

Md Galal Uddin

Research output: Thesis, Doctoral Thesis


Surface water quality management is an essential task to achieve "good" water quality across all states. A number of tools and techniques are widely used to assess water quality, and the water quality index (WQI) model is one of them. Commonly, the WQI model is used to convert data on various water quality indicators into a single numerical value. Its use has increased considerably because its mathematical operation is simple and its results are easy for both professionals and non-experts to interpret. Most WQI applications to date have focused on freshwaters (lakes, rivers, and groundwaters), with only a small number focused on coastal water quality assessment. Despite their advantages, WQI models have been criticised with respect to the reliability, validity, and consistency of their results. In addition, several studies have revealed that the results of existing WQI models contain considerable uncertainty due to eclipsing and ambiguity problems associated with their architecture.

This research focuses on the development of a novel WQI model for coastal and transitional coastal (TrC) waters. A comprehensive WQI model was developed for assessing TrC waters, with significant improvements over existing WQI models. The model consists of five components: (i) indicator selection technique: a random forest machine learning algorithm was used to select the crucial water quality indicators; (ii) sub-index (SI) functions: three new linear interpolation functions were developed for rescaling the information from various water quality indicators onto a uniform scale; (iii) indicator weighting method: a novel approach was developed using the random forest machine learning (ML) algorithm and the mathematical rank sum weighting technique to estimate weight values based on the relative significance of real-time information on water quality; (iv) aggregation function: the weighted quadratic mean (WQM) function was used to compute the WQI score; and (v) score interpretation scheme: a new classification scheme was proposed, after analysing a range of classification techniques, for assessing the state of water quality.
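Components (ii) to (iv) can be illustrated with a minimal Python sketch. It uses the standard textbook forms of rank sum weighting and the weighted quadratic mean, and the function names, the 0-100 scale, the `worst`/`best` bounds, and the sample ranks are all hypothetical. The thesis's own three sub-index functions and its random-forest-derived ranks are not reproduced here.

```python
import math


def rank_sum_weights(ranks):
    """Rank sum weights; rank 1 marks the most important indicator.

    Each indicator scores (n - rank + 1); weights are the normalised scores.
    """
    n = len(ranks)
    scores = [n - r + 1 for r in ranks]
    total = sum(scores)
    return [s / total for s in scores]


def linear_sub_index(value, worst, best):
    """Linearly rescale a measurement to a 0-100 score (assumed scale).

    `worst` maps to 0 and `best` maps to 100, so the function also covers
    indicators where lower values are better (worst > best).
    """
    score = 100.0 * (value - worst) / (best - worst)
    return max(0.0, min(100.0, score))


def weighted_quadratic_mean(sub_indices, weights):
    """Weighted quadratic mean: sqrt(sum(w_i * s_i^2) / sum(w_i))."""
    weighted_squares = sum(w * s ** 2 for s, w in zip(sub_indices, weights))
    return math.sqrt(weighted_squares / sum(weights))


# Illustrative values only: three indicators ranked 1 (most important) to 3.
weights = rank_sum_weights([1, 2, 3])
sub_indices = [80.0, 60.0, 40.0]
wqi_score = weighted_quadratic_mean(sub_indices, weights)
```

Because the quadratic mean squares each sub-index before averaging, high-scoring indicators contribute more than they would in a weighted arithmetic mean, so the aggregate is never lower than the weighted arithmetic mean of the same sub-indices.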

Model performance was evaluated and validated across four Irish TrC waterbodies using the EPA's water quality monitoring data. The model was tested using state-of-the-art machine learning (ML) and artificial intelligence (AI) techniques in terms of its ability to reduce the eclipsing and ambiguity problems as well as model uncertainty. The key findings from the application results are as follows:
  • The sensitivity results of the model indicate that the model outputs could be explained by more than 95% of the input entities, with less than 2% uncertainty at a 95% confidence interval (p < 0.0001).
  • The model performance validation results for NSE and MEF show that the developed model is superior for computing WQI scores at most monitoring sites across four application domains through the summer and winter.
  • In addition, most statistical performance metrics also indicate that the model is more effective for the prediction of WQI scores in TrC waters.
  • The water quality assessment results showed that the developed model is reliable for reducing the model ambiguity and eclipsing problems.
  • Moreover, the model applications reveal that the suggested indicators, namely dissolved oxygen (DOX), biological oxygen demand (BOD5), pH, water temperature (TEMP), transparency (TRAN), and three nutrient enrichment indicators (total oxidized nitrogen (TON), dissolved inorganic nitrogen (DIN), and molybdate reactive phosphorus (MRP)), might be adequate and reliable for monitoring transitional and coastal waters with the proposed model.
  • Overall, the model's results (applications, sensitivity, eclipsing, ambiguity, and uncertainty) indicate that the proposed model is more robust than other WQI approaches.
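For readers unfamiliar with the NSE metric mentioned in the findings, the abbreviation is commonly expanded as Nash-Sutcliffe efficiency. The following Python sketch shows its standard definition, assuming that is the form intended. The function name and the sample scores are hypothetical, and the thesis's exact validation procedure is not shown in this abstract.

```python
def nash_sutcliffe_efficiency(observed, predicted):
    """NSE = 1 - sum((obs - pred)^2) / sum((obs - mean(obs))^2).

    A value of 1 indicates a perfect match; 0 means the predictions are no
    better than always predicting the mean of the observations.
    """
    if not observed or len(observed) != len(predicted):
        raise ValueError("observed and predicted must be non-empty and equal in length")
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    if variance == 0:
        raise ValueError("NSE is undefined when all observations are identical")
    return 1.0 - residual / variance


# Illustrative WQI scores only, not data from the thesis.
observed_scores = [62.0, 70.0, 75.0, 81.0]
predicted_scores = [60.0, 71.0, 77.0, 80.0]
nse = nash_sutcliffe_efficiency(observed_scores, predicted_scores)
```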

The developed model could be useful for further improving existing WQI systems and for strengthening routine monitoring programmes for managing TrC waters worldwide. Although the present model was developed for transitional and coastal water quality, the approach could also be used to assess water quality in other waterbody types, e.g. rivers or lakes, and in other geographical locations.
Original language: English
Qualification: Doctor of Information Technology
Awarding Institution
  • University of Galway
Supervisors/Advisors
  • Olbert, Agnieszka I., Principal Supervisor, External person
  • Nash, Stephen, Co-Supervisor, External person
  • Rahman, Azizur, Principal Supervisor
Award date: 18 Aug 2023
Place of Publication: Ireland
Publication status: Published - 2023
Externally published: Yes

