Random Forest draws much interest from the research community because of its simplicity and excellent performance. The splitting attribute at each node of a decision tree in a Random Forest is chosen from a randomly selected subset, of predefined size, of the entire attribute set. The size of this subset is one of the most controversial points of Random Forest and has encouraged many contributions. However, little attention has been given to improving Random Forest specifically for records that are hard to classify. In this paper, we propose a novel technique for detecting hard-to-classify records and increasing the weights of those records in a training data set. We then build a Random Forest from the weighted training data set. The experimental results presented in this paper indicate that the ensemble accuracy of Random Forest can be improved when it is applied to weighted training data sets that place more emphasis on hard-to-classify records.
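The abstract does not say how hard-to-classify records are detected or how much their weights are increased, so the following is only a minimal sketch of the general idea under stated assumptions, not the paper's method. It assumes that a record's difficulty can be approximated by the share of out-of-bag votes from a preliminary forest that disagree with its true label, and that the weighted training set is realised through weighted bootstrap sampling. The names `hardness_weights`, `weighted_bootstrap` and the `boost` parameter are hypothetical.

```python
import random
from collections import Counter


def hardness_weights(oob_votes, labels, boost=2.0):
    """Return one weight per record: 1.0 plus `boost` times the share of wrong votes.

    Assumption (not from the paper): `oob_votes[i]` holds the class labels
    predicted for record i by the trees that did not train on it.
    """
    weights = []
    for votes, label in zip(oob_votes, labels):
        if not votes:
            # Record was never out-of-bag: no evidence, keep the base weight.
            weights.append(1.0)
            continue
        wrong = sum(1 for vote in votes if vote != label)
        weights.append(1.0 + boost * wrong / len(votes))
    return weights


def weighted_bootstrap(n_records, weights, rng):
    """Draw a bootstrap sample of record indices, favouring heavier records."""
    return rng.choices(range(n_records), weights=weights, k=n_records)


# Toy data: four records with their true labels and hypothetical out-of-bag votes.
labels = ["a", "a", "b", "b"]
oob_votes = [["a", "a", "a"], ["a", "b", "a"], ["a", "a", "b"], []]

weights = hardness_weights(oob_votes, labels)

# Each tree of the second forest would be trained on one such weighted sample.
rng = random.Random(0)
sample = weighted_bootstrap(len(labels), weights, rng)
counts = Counter(sample)  # how often each record index was drawn
```

With these toy votes, the record that is misclassified most often (index 2) receives the largest weight and is therefore the most likely to be drawn into each tree's bootstrap sample. A library implementation that accepts per-record weights during training could be used in place of the resampling step.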
|Title of host publication||Proceedings of the 12th International Conference on Advanced Data Mining and Applications, ADMA 2016|
|Editors||Jinyan Li, Xue Li, Shuliang Wang, Jianxin Li, Quan Z. Sheng|
|Place of Publication||Switzerland|
|Number of pages||9|
|Publication status||Published - 2016|
|Event||Advanced Data Mining and Applications (ADMA) 12th International Conference - Mantra Legends Hotel, Gold Coast, Australia|
Duration: 12 Dec 2016 → 15 Dec 2016
https://cs.adelaide.edu.au/~adma2016/ADMA2016_program.pdf (Conference program)
|Conference||Advanced Data Mining and Applications (ADMA) 12th International Conference|
|Period||12/12/16 → 15/12/16|
|Other||The year 2016 marks the 12th anniversary of the International Conference on Advanced Data Mining and Applications (ADMA 2016).|
The conference aims at bringing together the experts on data mining from around the world, and providing a leading international forum for the dissemination of original research findings in data mining, spanning applications, algorithms, software and systems, as well as different applied disciplines with potential in data mining, such as smartphone and social network mining, bio-medical science and green computing. ADMA 2016 will promote the same close interaction and collaboration among practitioners and researchers. Published papers will go through a full peer review process.