TY - JOUR
T1 - A hybrid feature selection with ensemble classification for imbalanced healthcare data
T2 - A case study for brain tumor diagnosis
AU - Huda, Shamsul
AU - Yearwood, John
AU - Jelinek, Herbert F.
AU - Hassan, Mohammad Mehedi
AU - Fortino, Giancarlo
AU - Buckland, Michael
N1 - Includes bibliographical references.
PY - 2016
Y1 - 2016
N2 - Electronic health records (EHRs) are providing increased access to healthcare data that can be made available for advanced data analysis. Healthcare professionals can use such analysis to make more informed decisions and improve the quality of care. However, because medical data from EHRs are inherently heterogeneous and imbalanced, data analysis faces a major challenge. In this paper, we address the challenges of imbalanced medical data in a brain tumor diagnosis problem. Morphometric analysis of histopathological images is rapidly emerging as a valuable diagnostic tool for neuropathology. Oligodendroglioma is one type of brain tumor that responds well to treatment, provided the tumor subtype is recognized accurately. The genetic variant 1p-/19q- has recently been found to have high chemosensitivity and has morphological attributes that may lend it to automated image analysis and histological processing and diagnosis. This paper aims to achieve a fast, affordable, and objective diagnosis of this genetic variant of oligodendroglioma with a novel data mining approach combining feature selection and ensemble-based classification. In this paper, 63 instances of brain tumor with oligodendroglioma were obtained, owing to the prevalence and incidence of the tumor variant. To minimize the effect of an imbalanced healthcare data set, a global optimization-based hybrid wrapper-filter feature selection with ensemble classification is applied. The experimental results show that the proposed approach outperforms the standard techniques used in the brain tumor classification problem in overcoming the imbalanced characteristics of medical data.
AB - Electronic health records (EHRs) are providing increased access to healthcare data that can be made available for advanced data analysis. Healthcare professionals can use such analysis to make more informed decisions and improve the quality of care. However, because medical data from EHRs are inherently heterogeneous and imbalanced, data analysis faces a major challenge. In this paper, we address the challenges of imbalanced medical data in a brain tumor diagnosis problem. Morphometric analysis of histopathological images is rapidly emerging as a valuable diagnostic tool for neuropathology. Oligodendroglioma is one type of brain tumor that responds well to treatment, provided the tumor subtype is recognized accurately. The genetic variant 1p-/19q- has recently been found to have high chemosensitivity and has morphological attributes that may lend it to automated image analysis and histological processing and diagnosis. This paper aims to achieve a fast, affordable, and objective diagnosis of this genetic variant of oligodendroglioma with a novel data mining approach combining feature selection and ensemble-based classification. In this paper, 63 instances of brain tumor with oligodendroglioma were obtained, owing to the prevalence and incidence of the tumor variant. To minimize the effect of an imbalanced healthcare data set, a global optimization-based hybrid wrapper-filter feature selection with ensemble classification is applied. The experimental results show that the proposed approach outperforms the standard techniques used in the brain tumor classification problem in overcoming the imbalanced characteristics of medical data.
KW - ANNIGMA
KW - Brain tumor
KW - Classification
KW - Feature selection
KW - Morphological features
KW - MRMR
UR - http://www.scopus.com/inward/record.url?scp=85015222079&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85015222079&partnerID=8YFLogxK
U2 - 10.1109/ACCESS.2016.2647238
DO - 10.1109/ACCESS.2016.2647238
M3 - Article
AN - SCOPUS:85015222079
SN - 2169-3536
VL - 4
SP - 9145
EP - 9154
JO - IEEE Access
JF - IEEE Access
M1 - 7809136
ER -