Abstract
This work aims to maximise the utility of published data in the partition-based anonymisation of transactional data. We observe that optimising the clustering (i.e., the horizontal partitioning) can significantly improve the utility of the published data without affecting the privacy guarantees. We present a new clustering method with a specially designed distance function that accounts for the effect of sensitive terms on the privacy goal as part of the clustering process. As a result, when the clustering minimises the total intra-cluster distance of the partition, the utility loss is also minimised. We present two algorithms: DocClust, for clustering transactions, and DetK, for determining the best number of clusters.
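The objective described in the abstract, choosing a partition with a small total intra-cluster distance, can be illustrated with a minimal sketch. The paper's specially designed distance function is not given in this record, so Jaccard distance over item sets is used purely as a stand-in. The names `jaccard_distance` and `total_intra_cluster_distance` and the sample transactions are hypothetical, and the sketch is not DocClust or DetK.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Stand-in distance between two transactions (sets of items).

    Assumption: the paper uses its own distance that accounts for
    sensitive terms; Jaccard distance is only an illustration.
    """
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def total_intra_cluster_distance(clusters):
    """Sum of pairwise distances between transactions in the same cluster."""
    return sum(
        jaccard_distance(a, b)
        for cluster in clusters
        for a, b in combinations(cluster, 2)
    )


# Hypothetical transactions, each a set of items.
t0 = frozenset({"bread", "milk"})
t1 = frozenset({"bread", "milk", "eggs"})
t2 = frozenset({"wine", "cheese"})
t3 = frozenset({"wine", "cheese", "olives"})

# Two candidate partitions of the same four transactions.
similar_grouped = [[t0, t1], [t2, t3]]
mixed_grouped = [[t0, t2], [t1, t3]]

cost_similar = total_intra_cluster_distance(similar_grouped)
cost_mixed = total_intra_cluster_distance(mixed_grouped)
```

Under this stand-in distance, the partition that groups similar transactions has the lower total, which is the quantity the abstract says the clustering minimises.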
Original language | English |
---|---|
Title of host publication | Advances in Knowledge Discovery and Data Mining |
Subtitle of host publication | 21st Pacific-Asia Conference, PAKDD 2017, Jeju, South Korea, May 23–26, 2017, Proceedings, Part II |
Editors | Jinho Kim, Kyuseok Shim, Longbing Cao, Jae-Gil Lee, Xuemin Lin, Yang-Sae Moon |
Place of Publication | Cham |
Publisher | Springer International Publishing AG |
Pages | 481-494 |
Number of pages | 14 |
ISBN (Electronic) | 9783319575292 |
ISBN (Print) | 9783319575285 |
Publication status | Published - 2017 |
Event | The 21st Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD 2017), Seogwipo KAL Hotel, Jeju, Korea, Republic of. Duration: 23 May 2017 → 26 May 2017. http://pakdd2017.snu.ac.kr/ |
Conference
Conference | The 21st Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD 2017) |
---|---|
Country/Territory | Korea, Republic of |
City | Jeju |
Period | 23/05/17 → 26/05/17 |