Abstract

With the ubiquity of data collection in today’s society, protecting each individual’s privacy is a growing concern. Differential Privacy provides an enforceable definition of privacy that allows data owners to promise each individual that their presence in the dataset will be almost undetectable. Data Mining techniques are often used to discover knowledge in data; however, these techniques are not differentially private by default. In this paper, we propose a differentially private decision forest algorithm that takes advantage of a novel theorem for the local sensitivity of the Gini Index. The Gini Index plays an important role in building a decision forest, and the sensitivity of its equation dictates how much noise needs to be added to make the forest differentially private. We prove that the Gini Index can have a substantially lower sensitivity than that used in previous work, leading to superior empirical results. We compare the prediction accuracy of our decision forest not only to previous work but also to the popular Random Forest algorithm, to demonstrate how close our differentially private algorithm can come to a completely non-private forest.
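
The link between sensitivity and noise mentioned in the abstract can be illustrated with a minimal sketch. This is not the algorithm or the theorem from the paper: it assumes the textbook Laplace mechanism, in which noise is drawn with scale sensitivity / epsilon, and the function names (`gini_index`, `laplace_noise`, `noisy_value`) and the sensitivity values used below are hypothetical placeholders chosen for illustration only.

```python
import random
from collections import Counter


def gini_index(labels):
    """Gini impurity of a collection of class labels: 1 - sum over classes of p_c^2."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def laplace_noise(scale, rng=random):
    """Sample from Laplace(0, scale) as the difference of two exponentials.

    Two independent exponential draws with mean `scale` differ by a
    Laplace-distributed amount with that same scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_value(value, sensitivity, epsilon, rng=random):
    """Generic Laplace mechanism (an assumption, not the paper's mechanism):
    the noise scale grows with sensitivity and shrinks with epsilon."""
    return value + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    labels = ["yes", "yes", "no", "no"]
    g = gini_index(labels)
    print("Gini index:", g)
    # Placeholder sensitivities: a lower value yields less noise at the same epsilon.
    for sens in (2.0, 0.5):
        rng = random.Random(0)
        print("sensitivity", sens, "->", noisy_value(g, sens, 1.0, rng))
```

With the same random seed and the same epsilon, the draw made with the smaller placeholder sensitivity is perturbed less, which is the general effect the abstract relies on when it argues for a lower sensitivity bound.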
Original language: English
Title of host publication: Proceedings of the Thirteenth Australasian Data Mining Conference (AusDM 15)
Place of Publication: Australia
Publisher: CRPIT
Pages: 99-108
Number of pages: 10
ISBN (Print): 9781921770180
Publication status: Published - 2015
Event: The 13th Australasian Data Mining Conference: AusDM 2015 - University of Technology, Sydney, Australia
Duration: 08 Aug 2015 to 09 Aug 2015
https://web.archive.org/web/20150820140652/http://ausdm15.ausdm.org/

Conference

Conference: The 13th Australasian Data Mining Conference
Country/Territory: Australia
City: Sydney
Period: 08/08/15 to 09/08/15