Abstract
The continuous release of data, also called serial publication, is critical for data analytics, but it can lead to severe privacy disclosures via composition attacks. A serial publication often consists of several corpora, each an update of the previous one. While each individually published corpus may be privacy preserving, the serial publication as a whole may still be at risk of privacy disclosure when the corpora are considered together. Existing solutions to this problem typically offer the privacy guarantees of k-anonymity and l-diversity, which are prone to attribute disclosure via skewness attacks, and they focus only on relational data. This paper addresses the serial publication problem in the transactional data setting. First, we model the privacy disclosure risks associated with serially published data probabilistically. We then develop a rigorous privacy guarantee and a serial publication method, Sanony, that satisfies this guarantee without excessive utility loss. We evaluate our method on two benchmark datasets, and the results show that our method affords stronger privacy with much lower perturbation rates than existing state-of-the-art techniques.
Original language | English |
---|---|
Pages (from-to) | 53-70 |
Number of pages | 18 |
Journal | Information Systems |
Volume | 82 |
Early online date | 25 Jan 2019 |
Publication status | Published - May 2019 |