Abstract

Random Forest attracts considerable interest from the research community because of its simplicity and excellent performance. At each node of a decision tree in a Random Forest, the splitting attribute is chosen from a randomly selected subset of the full attribute set, where the subset has a predefined size. The size of this subset is one of the most debated aspects of Random Forest and has motivated many contributions. However, little attention has been given to improving Random Forest specifically for records that are hard to classify. In this paper, we propose a novel technique for detecting hard-to-classify records and increasing the weights of those records in a training data set. We then build a Random Forest from the weighted training data set. The experimental results presented in this paper indicate that the ensemble accuracy of Random Forest can be improved when it is applied to weighted training data sets that place more emphasis on hard-to-classify records.
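
The weighting idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how hard-to-classify records are detected or by how much their weights are increased, so everything specific here is an assumption made for illustration only: a record is treated as hard if an initial forest misclassifies it out-of-bag, the weight multiplier `boost` is set to 2.0, and the helper name `hard_record_weights` is hypothetical. The sketch uses scikit-learn and synthetic data; it is not the authors' method.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier


def hard_record_weights(X, y, boost=2.0, n_trees=100, seed=0):
    """Hypothetical helper: weight `boost` for records that an initial
    forest misclassifies out-of-bag, weight 1.0 for all other records."""
    probe = RandomForestClassifier(
        n_estimators=n_trees,
        max_features="sqrt",  # size of the random attribute subset per split
        oob_score=True,       # keep out-of-bag class probabilities
        random_state=seed,
    )
    probe.fit(X, y)
    # One row per record, one column per class, built only from the trees
    # that did not see the record in their bootstrap sample.
    oob_pred = probe.classes_[np.argmax(probe.oob_decision_function_, axis=1)]
    hard = oob_pred != y  # assumption: OOB error marks a hard record
    return np.where(hard, boost, 1.0), hard


# Synthetic data with some label noise so that hard records exist.
X, y = make_classification(
    n_samples=500, n_features=20, n_informative=5, flip_y=0.1, random_state=0
)
weights, hard = hard_record_weights(X, y)

# Build the final forest from the weighted training data set.
forest = RandomForestClassifier(
    n_estimators=100, max_features="sqrt", random_state=0
)
forest.fit(X, y, sample_weight=weights)
print(f"{hard.sum()} of {len(y)} records flagged as hard to classify")
```

Out-of-bag predictions are used in this sketch so that hard records are flagged by trees that never trained on them; other detection criteria are equally possible, and the choice of `boost` would need to be tuned per data set.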
Original language: English
Title of host publication: Proceedings of the 12th International Conference on Advanced Data Mining and Applications, ADMA 2016
Editors: Jinyan Li, Xue Li, Shuliang Wang, Jianxin Li, Quan Z. Sheng
Place of Publication: Switzerland
Publisher: Springer
Pages: 558-566
Number of pages: 9
Volume: 10086
ISBN (Electronic): 9783319495866
ISBN (Print): 9783319495859
Publication status: Published - 2016
Event: Advanced Data Mining and Applications (ADMA) 12th International Conference - Mantra Legends Hotel, Gold Coast, Australia
Duration: 12 Dec 2016 to 15 Dec 2016
https://cs.adelaide.edu.au/~adma2016/
https://cs.adelaide.edu.au/~adma2016/ADMA2016_program.pdf (Conference program)

Conference

Conference: Advanced Data Mining and Applications (ADMA) 12th International Conference
Country/Territory: Australia
City: Gold Coast
Period: 12/12/16 to 15/12/16
Other: The year 2016 marks the 12th anniversary of the International Conference on Advanced Data Mining and Applications (ADMA 2016).
The conference aims to bring together data mining experts from around the world and to provide a leading international forum for the dissemination of original research findings in data mining, spanning applications, algorithms, software and systems, as well as applied disciplines with potential in data mining, such as smartphone and social network mining, bio-medical science and green computing. ADMA 2016 will promote the same close interaction and collaboration among practitioners and researchers. Published papers will go through a full peer review process.

Title: On improving random forest for hard-to-classify records