TY - JOUR
T1 - Machine learning model matters its accuracy
T2 - a comparative study of ensemble learning and AutoML using heart disease prediction
AU - Rimal, Yagyanath
AU - Paudel, Siddhartha
AU - Sharma, Navneet
AU - Alsadoon, Abeer
N1 - Publisher Copyright:
© 2023, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.
PY - 2024/4
Y1 - 2024/4
N2 - Ensemble machine learning is the practice of combining multiple weak individual models to gain better performance. Recent research focuses on improving machine learning models for accurate classification and prediction on test data, which highlights the critical issue of overall model quality. Ensembles of weak learners built to form strong models were compared separately on precision, accuracy, and F1 score, and majority-voting aggregation recommended the best model for deployment. The accuracy and performance of the decision tree, logistic regression, support vector machine, random forest, artificial neural network, Gaussian, k-nearest neighbor, and multilayer perceptron models were compared to identify the best-predicting model. Similarly, Auto Machine Learning (AutoML) supports both binary classification and regression problems and can be applied directly, without feature engineering. AutoML builds a list of more robust models in tabular form and then determines which has the best prediction accuracy. This research compares eighteen (18) machine learning models, i.e., eight (8) individually trained models and ten (10) from AutoML, whose accuracy, MSE, and R2 scores were compared on the same open-source heart disease data set. The support vector, logistic regression, and neural network models produced the highest accuracy, 80%, compared with the Gaussian, k-nearest neighbors, and multilayer perceptron algorithms, which scored 76%. Similarly, with AutoML, the generalized linear model (88%), gradient boosting model (87%), distributed random forest model (87%), extra tree model (82%), and accuracy scores (82%), which ultimately mattered for prediction accuracy, were recommended for heart disease classification.
AB - Ensemble machine learning is the practice of combining multiple weak individual models to gain better performance. Recent research focuses on improving machine learning models for accurate classification and prediction on test data, which highlights the critical issue of overall model quality. Ensembles of weak learners built to form strong models were compared separately on precision, accuracy, and F1 score, and majority-voting aggregation recommended the best model for deployment. The accuracy and performance of the decision tree, logistic regression, support vector machine, random forest, artificial neural network, Gaussian, k-nearest neighbor, and multilayer perceptron models were compared to identify the best-predicting model. Similarly, Auto Machine Learning (AutoML) supports both binary classification and regression problems and can be applied directly, without feature engineering. AutoML builds a list of more robust models in tabular form and then determines which has the best prediction accuracy. This research compares eighteen (18) machine learning models, i.e., eight (8) individually trained models and ten (10) from AutoML, whose accuracy, MSE, and R2 scores were compared on the same open-source heart disease data set. The support vector, logistic regression, and neural network models produced the highest accuracy, 80%, compared with the Gaussian, k-nearest neighbors, and multilayer perceptron algorithms, which scored 76%. Similarly, with AutoML, the generalized linear model (88%), gradient boosting model (87%), distributed random forest model (87%), extra tree model (82%), and accuracy scores (82%), which ultimately mattered for prediction accuracy, were recommended for heart disease classification.
KW - Artificial neural network
KW - AutoML
KW - Ensemble learning
KW - K nearest neighbors
KW - Random forest
KW - Support vector machine
UR - http://www.scopus.com/inward/record.url?scp=85172781852&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85172781852&partnerID=8YFLogxK
U2 - 10.1007/s11042-023-16380-z
DO - 10.1007/s11042-023-16380-z
M3 - Article
AN - SCOPUS:85172781852
SN - 1380-7501
VL - 83
SP - 35025
EP - 35042
JO - Multimedia Tools and Applications
JF - Multimedia Tools and Applications
IS - 12
ER -