Utility aware clustering for publishing transactional data

Michael Bewong, Jixue Liu, Lin Liu, Jiuyong Li

Research output: Book chapter/Published conference paper > Conference paper

5 Citations (Scopus)

Abstract

This work aims to maximise the utility of published data in the partition-based anonymisation of transactional data. We observe that optimising the clustering (i.e. the horizontal partitioning) can significantly improve the utility of the published data without affecting the privacy guarantees. We present a new clustering method with a specially designed distance function that accounts for the effect of sensitive terms on the privacy goal as part of the clustering process. As a result, when the clustering minimises the total intra-cluster distance of the partition, the utility loss is also minimised. We present two algorithms, DocClust and DetK, for clustering transactions and for determining the best number of clusters, respectively.
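The objective described in the abstract, minimising the total intra-cluster distance under a distance that is aware of sensitive terms, can be illustrated with a minimal Python sketch. This is not the paper's DocClust or DetK algorithm, and the abstract does not give the actual distance function. The `distance` function below (Jaccard distance on non-sensitive terms plus a fixed `penalty` when two transactions share a sensitive term), the toy transactions, and all names are assumptions made purely for illustration.

```python
from itertools import combinations


def distance(t1, t2, sensitive, penalty=0.5):
    """Hypothetical distance (an assumption, not the paper's definition):
    Jaccard distance on non-sensitive terms, plus a fixed penalty when
    the two transactions share at least one sensitive term."""
    a, b = t1 - sensitive, t2 - sensitive
    union = a | b
    jaccard = 1.0 - len(a & b) / len(union) if union else 0.0
    shares_sensitive = bool(t1 & t2 & sensitive)
    return jaccard + (penalty if shares_sensitive else 0.0)


def total_intra_cluster_distance(clusters, sensitive):
    """Sum of pairwise distances between transactions in the same cluster."""
    return sum(
        distance(t1, t2, sensitive)
        for cluster in clusters
        for t1, t2 in combinations(cluster, 2)
    )


# Toy data: each transaction is a set of terms; "medication" is sensitive.
transactions = [
    frozenset({"bread", "milk", "medication"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"beer", "chips", "medication"}),
    frozenset({"beer", "chips", "salsa"}),
]
sensitive = frozenset({"medication"})

# Two candidate horizontal partitions of the same four transactions.
partition_a = [[transactions[0], transactions[1]],
               [transactions[2], transactions[3]]]
partition_b = [[transactions[0], transactions[2]],
               [transactions[1], transactions[3]]]

cost_a = total_intra_cluster_distance(partition_a, sensitive)
cost_b = total_intra_cluster_distance(partition_b, sensitive)
print(f"partition_a: {cost_a:.3f}, partition_b: {cost_b:.3f}")
```

Under these assumptions, `partition_a` groups transactions with similar non-sensitive terms and keeps the shared sensitive term in separate clusters, so it has the lower total intra-cluster distance and would be the preferred partition.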
Original language: English
Title of host publication: Pacific-Asia Conference on Knowledge Discovery and Data Mining
Editors: Jinho Kim, Kyuseok Shim, Longbing Cao, Jae-Gil Lee, Xuemin Lin, Yang-Sae Moon
Place of Publication: Cham
Publisher: Springer International Publishing AG
Pages: 481-494
Number of pages: 14
ISBN (Print): 9783319575292
Publication status: Published - 2017
Event: The Pacific-Asia Conference on Knowledge Discovery and Data Mining 2017: PAKDD 2017 - Seogwipo KAL Hotel, Jeju, Korea, Republic of
Duration: 23 May 2017 to 26 May 2017
http://pakdd2017.snu.ac.kr/

Conference

Conference: The Pacific-Asia Conference on Knowledge Discovery and Data Mining 2017
Country: Korea, Republic of
City: Jeju
Period: 23/05/17 to 26/05/17

Cite this

Bewong, M., Liu, J., Liu, L., & Li, J. (2017). Utility aware clustering for publishing transactional data. In J. Kim, K. Shim, L. Cao, J-G. Lee, X. Lin, & Y-S. Moon (Eds.), Pacific-Asia Conference on Knowledge Discovery and Data Mining (pp. 481-494). Springer International Publishing AG.