FastForest: Increasing random forest processing speed while maintaining accuracy

Research output: Contribution to journal › Article › peer-review

23 Citations (Scopus)


Random Forest remains one of data mining’s most enduring ensemble algorithms, achieving well-documented levels of accuracy and processing speed, and continuing to appear regularly in new research. However, with data mining now reaching hardware-constrained devices such as smartphones and Internet of Things (IoT) devices, there is a continued need for research into algorithm efficiency that delivers greater processing speed without sacrificing accuracy. Our proposed FastForest algorithm achieves this through a combination of three optimising components: Subsample Aggregating (‘Subbagging’), Logarithmic Split-Point Sampling and Dynamic Restricted Subspacing. Empirical testing shows FastForest delivers an average 24% increase in model-training speed over Random Forest whilst maintaining (and frequently exceeding) classification accuracy across tests involving 45 datasets on both PC and smartphone platforms. Further tests show FastForest achieves favourable results against a number of ensemble classifiers, including implementations of Bagging and Random Subspace. With growing interest in machine learning on mobile devices, FastForest provides an efficient ensemble classifier that can deliver faster results on hardware-constrained devices such as smartphones.
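To make the three named components concrete, the sketch below illustrates the *kind* of sampling each one performs at the data level. This is an illustrative reading of the component names only, not the paper's published algorithm: the 50% subsample rate, the log2-based candidate count, and the sqrt-sized feature subset are all assumptions chosen for the example.

```python
import numpy as np

# Hypothetical parameter choices throughout -- not taken from the paper.
rng = np.random.default_rng(42)

def subbag_indices(n_samples, rate=0.5):
    """Subbagging: draw a subsample WITHOUT replacement, in contrast to
    bagging's same-size bootstrap drawn WITH replacement."""
    k = max(1, int(n_samples * rate))
    return rng.choice(n_samples, size=k, replace=False)

def log_split_candidates(feature_values):
    """Logarithmic split-point sampling: evaluate only ~log2(n) candidate
    thresholds per feature rather than every unique value."""
    uniq = np.unique(feature_values)
    k = max(1, int(np.ceil(np.log2(len(uniq) + 1))))
    return rng.choice(uniq, size=min(k, len(uniq)), replace=False)

def restricted_subspace(n_features):
    """Restricted subspacing: each node considers only a random subset of
    features (here sqrt(n_features), the common Random Forest default)."""
    k = max(1, int(np.sqrt(n_features)))
    return rng.choice(n_features, size=k, replace=False)

# Example: a dataset with 1000 rows and 64 features.
idx = subbag_indices(1000)        # 500 distinct row indices
x = rng.normal(size=1000)
cands = log_split_candidates(x)   # ~10 thresholds instead of ~1000
feats = restricted_subspace(64)   # 8 features considered at this node
```

The speed gain of each component comes from shrinking the work per tree: fewer rows per tree (subbagging), far fewer candidate thresholds per feature (logarithmic sampling), and fewer features per node (subspacing), with the ensemble vote recovering accuracy.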

Original language: English
Pages (from-to): 130-152
Number of pages: 23
Journal: Information Sciences
Early online date: 29 Dec 2020
Publication status: Published - May 2021

