Analysis of Data Cleansing Methods for Improving Meteorological Data Quality: A Case Study

Gea Rahman, Md Akram Hossain Khan

Research output: Contribution to journal > Article > peer-review

Abstract

Meteorological data quality is a key concern for many real-world applications, including weather forecasting and the development of irrigation models. The integrity of meteorological data may be compromised for several reasons, including corrupt and missing values introduced by interference and equipment malfunction. A decrease in data quality can significantly affect the efficiency of weather forecasting systems and irrigation models. It is therefore imperative to address corrupt and missing data before the data are used. In this study, we introduce a Data Cleansing Scheme (DCS) for handling the corrupt and missing values in a real meteorological dataset. DCS uses a state-of-the-art corrupt data identification method and a state-of-the-art missing data imputation method to cleanse the meteorological data. The resulting dataset, free from corrupt or missing values, is then used for data mining tasks such as classification and knowledge discovery. Although corrupt and missing values degrade the quality of data analysis results, this study demonstrates that those results improve when corrupt data is identified and missing values are imputed using DCS. Our extensive empirical and statistical analyses indicate the effectiveness of DCS for improving meteorological data quality.
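The abstract does not name the specific identification and imputation methods used in DCS, so the following is only a minimal sketch of the general two-stage idea (flag corrupt values, then fill every gap). It substitutes two simple stand-ins that are assumptions, not the authors' methods: a modified z-score based on the median and the median absolute deviation for flagging, and median imputation for filling. The function names, the threshold of 3.5, and the sample readings are all hypothetical.

```python
from statistics import median
from typing import List, Optional


def flag_corrupt(values: List[Optional[float]], threshold: float = 3.5) -> List[Optional[float]]:
    """Replace suspected corrupt readings with None (stand-in rule, not the DCS method)."""
    present = [v for v in values if v is not None]
    med = median(present)
    # Median absolute deviation: a spread estimate that outliers barely affect.
    mad = median([abs(v - med) for v in present])
    if mad == 0:
        return list(values)  # No spread to measure against; flag nothing.
    flagged = []
    for v in values:
        if v is not None and abs(0.6745 * (v - med) / mad) > threshold:
            flagged.append(None)  # Treat the corrupt reading as missing.
        else:
            flagged.append(v)
    return flagged


def impute_missing(values: List[Optional[float]]) -> List[float]:
    """Fill every None with the median of the remaining readings (stand-in imputer)."""
    fill = median([v for v in values if v is not None])
    return [fill if v is None else v for v in values]


def cleanse(values: List[Optional[float]]) -> List[float]:
    """Two-stage cleansing: flag corrupt values first, then impute all gaps."""
    return impute_missing(flag_corrupt(values))


# Hypothetical temperature readings: one gap (None) and one sensor glitch (999.0).
temps = [21.0, 22.5, None, 23.0, 999.0, 22.0, 21.5]
cleaned = cleanse(temps)
```

Running the flagging step before imputation matters in this sketch: if the glitch were still present when the fill value is computed, it could distort the values used to fill the gaps.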
Original language: English
Journal: Earth Science Informatics
Early online date: 06 Dec 2024
Publication status: Published - 2025
