Decision tree classification with differential privacy: A survey

Sam Fletcher, Zahid Islam

Research output: Contribution to journal › Article

3 Citations (Scopus)


Data mining information about people is becoming increasingly important in the data-driven society of the 21st century. Unfortunately, sometimes there are real-world considerations that conflict with the goals of data mining; sometimes the privacy of the people being data mined needs to be considered. This necessitates that the output of data mining algorithms be modified to preserve privacy while simultaneously not ruining the predictive power of the outputted model. Differential privacy is a strong, enforceable definition of privacy that can be used in data mining algorithms, guaranteeing that nothing will be learned about the people in the data that could not already be discovered without their participation. In this survey, we focus on one particular data mining algorithm – decision trees – and how differential privacy interacts with each of the components that constitute decision tree algorithms. We analyze both greedy and random decision trees, and the conflicts that arise when trying to balance privacy requirements with the accuracy of the model.
Original language: English
Article number: 83
Pages (from-to): 1-35
Number of pages: 35
Journal: ACM Computing Surveys
Issue number: 4
Early online date: Aug 2019
Publication status: Published - Sep 2019
