TY - JOUR
T1 - FastForest
T2 - Increasing random forest processing speed while maintaining accuracy
AU - Yates, Darren
AU - Islam, Md Zahidul
N1 - Publisher Copyright:
© 2021 Elsevier Inc.
PY - 2021/5
Y1 - 2021/5
AB - Random Forest remains one of Data Mining’s
most enduring ensemble algorithms, achieving well-documented levels of
accuracy and processing speed, as well as regularly appearing in new
research. However, with data mining now reaching the domain of
hardware-constrained devices such as smartphones and Internet of Things
(IoT) devices, there is continued need for further research into
algorithm efficiency to deliver greater processing speed without
sacrificing accuracy. Our proposed FastForest algorithm achieves this
result through a combination of three optimising components: Subsample
Aggregating (‘Subbagging’), Logarithmic Split-Point Sampling and Dynamic
Restricted Subspacing. Empirical testing shows FastForest delivers an
average 24% increase in model-training speed compared with Random Forest
whilst maintaining (and frequently exceeding) classification accuracy
over tests involving 45 datasets on both PC and smartphone platforms.
Further tests show FastForest achieves favourable results against a
number of ensemble classifiers including implementations of Bagging and
Random Subspace. With growing interest in machine learning on mobile
devices, FastForest provides an efficient ensemble classifier that can
achieve faster results on hardware-constrained devices, such as
smartphones.
KW - Accuracy
KW - Ensemble classifier
KW - Random forest
KW - Speed
KW - Subbagging
UR - http://www.scopus.com/inward/record.url?scp=85099682034&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85099682034&partnerID=8YFLogxK
U2 - 10.1016/j.ins.2020.12.067
DO - 10.1016/j.ins.2020.12.067
M3 - Article
AN - SCOPUS:85099682034
SN - 0020-0255
VL - 557
SP - 130
EP - 152
JO - Information Sciences
JF - Information Sciences
ER -