Missing Value Imputation Using Decision Trees and Decision Forests by Splitting and Merging Records: Two Novel Techniques

Research output: Contribution to journal › Article

58 Citations (Scopus)

Abstract

We present two novel techniques for the imputation of both categorical and numerical missing values. The techniques use decision trees and forests to identify horizontal segments of a data set, where the records belonging to a segment have higher similarity and attribute correlations. Missing values are then imputed using these similarities and correlations. To achieve a higher quality of imputation, some segments are merged together using a novel approach. We use nine publicly available data sets to experimentally compare our techniques with a few existing ones in terms of four commonly used evaluation criteria. The experimental results indicate a clear superiority of our techniques, based on statistical analyses such as confidence interval analysis.
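
The segment-then-impute idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a single one-level tree (one threshold on one fully observed numerical attribute) in place of full decision trees or forests, fills each gap with the segment mean or majority value, and omits the segment-merging step entirely. The function names, the `None` marker for missing values, and the toy records are all hypothetical.

```python
from collections import Counter
from statistics import mean

def is_numeric(values):
    return all(isinstance(v, (int, float)) for v in values)

def fill_value(values):
    """Segment-level estimate: mean for numbers, majority value for categories."""
    if is_numeric(values):
        return mean(values)
    return Counter(values).most_common(1)[0][0]

def impurity(values):
    """Squared error for numbers, count of non-majority records for categories."""
    if not values:
        return 0.0
    if is_numeric(values):
        m = mean(values)
        return sum((v - m) ** 2 for v in values)
    return len(values) - Counter(values).most_common(1)[0][1]

def best_threshold(rows, feature, target):
    """One-level tree: the threshold on `feature` that minimises the summed
    impurity of `target` in the two resulting segments.
    Assumes `feature` takes at least two distinct values in `rows`."""
    points = sorted({r[feature] for r in rows})
    candidates = [(a + b) / 2 for a, b in zip(points, points[1:])]
    return min(
        candidates,
        key=lambda t: impurity([r[target] for r in rows if r[feature] <= t])
        + impurity([r[target] for r in rows if r[feature] > t]),
    )

def impute(records, feature, target):
    """Return copies of `records` in which missing `target` values (None)
    are filled from the segment that each record falls into."""
    complete = [r for r in records if r[target] is not None]
    t = best_threshold(complete, feature, target)
    left = fill_value([r[target] for r in complete if r[feature] <= t])
    right = fill_value([r[target] for r in complete if r[feature] > t])
    out = []
    for r in records:
        r = dict(r)  # copy, so the input records stay untouched
        if r[target] is None:
            r[target] = left if r[feature] <= t else right
        out.append(r)
    return out

# Hypothetical toy data: "age" is fully observed, the other two have gaps.
records = [
    {"age": 22, "income": 30.0, "plan": "basic"},
    {"age": 25, "income": 32.0, "plan": "basic"},
    {"age": 27, "income": None, "plan": "basic"},
    {"age": 51, "income": 80.0, "plan": "premium"},
    {"age": 55, "income": 84.0, "plan": None},
    {"age": 58, "income": 82.0, "plan": "premium"},
]
step1 = impute(records, "age", "income")  # numerical attribute
step2 = impute(step1, "age", "plan")      # categorical attribute
```

Here the split is learned separately for each attribute with missing values, so the numerical and the categorical gap are each filled from records in the same segment rather than from the whole data set.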
Original language: English
Pages (from-to): 51-65
Number of pages: 15
Journal: Knowledge-Based Systems
Volume: 53
Publication status: Published - Nov 2013
