TY - JOUR
T1 - Decision Trees in Federated Learning: Current State and Future Opportunities
AU - Bewong, Michael
AU - Islam, Zahid
AU - Heiyanthuduwage, Sudath
AU - Altas, Irfan
AU - Deho, Oscar
PY - 2024
Y1 - 2024
N2 - Federated learning (FL) is a distributed machine learning technique that enables multiple decentralized clients to develop a model collaboratively without exchanging their local data. Recent strict privacy laws make it increasingly difficult to gather and integrate data in a centralized location for full utilization. Federated learning is compatible with established privacy laws such as the General Data Protection Regulation (GDPR), the California Consumer Privacy Act (CCPA), the Health Insurance Portability and Accountability Act (HIPAA), and China’s Cybersecurity Law. Further, there are very few scenarios in which centralized, properly labeled, and complete data are available. Federated learning provides a way to address this problem. As a result, much research has been conducted in several areas within the emerging field of FL. This review paper focuses on decision tree-based FL systems because of their desirable properties of interpretability, parallelism, and high performance. We take a closer look at the motivations, design considerations, tree-building algorithms, and security mechanisms used for these systems. We also present the datasets used in these systems, their demonstrated application areas, and the evidence of their benefits. The objective of this paper is to give researchers in academia and system architects in industry an informative overview of the characteristics of FL, its privacy and security mechanisms, the available open-source development frameworks for FL, and the decision tree-based systems developed in FL.
AB - Federated learning (FL) is a distributed machine learning technique that enables multiple decentralized clients to develop a model collaboratively without exchanging their local data. Recent strict privacy laws make it increasingly difficult to gather and integrate data in a centralized location for full utilization. Federated learning is compatible with established privacy laws such as the General Data Protection Regulation (GDPR), the California Consumer Privacy Act (CCPA), the Health Insurance Portability and Accountability Act (HIPAA), and China’s Cybersecurity Law. Further, there are very few scenarios in which centralized, properly labeled, and complete data are available. Federated learning provides a way to address this problem. As a result, much research has been conducted in several areas within the emerging field of FL. This review paper focuses on decision tree-based FL systems because of their desirable properties of interpretability, parallelism, and high performance. We take a closer look at the motivations, design considerations, tree-building algorithms, and security mechanisms used for these systems. We also present the datasets used in these systems, their demonstrated application areas, and the evidence of their benefits. The objective of this paper is to give researchers in academia and system architects in industry an informative overview of the characteristics of FL, its privacy and security mechanisms, the available open-source development frameworks for FL, and the decision tree-based systems developed in FL.
U2 - 10.1109/ACCESS.2024.3440998
DO - 10.1109/ACCESS.2024.3440998
M3 - Article
SN - 2169-3536
SP - 1
JO - IEEE Access
JF - IEEE Access
ER -