An analysis of statistics-based data preprocessing mechanisms for privacy protection in Big Data

Research output: Other contribution to conference › Presentation only › peer-review


Big data analytics is a fast-growing research domain that combines computational (i.e. computer-intensive) and inferential (i.e. statistics-oriented) thinking. Information is increasingly gathered into big data systems by cheap and numerous information-sensing mobile devices, aerial remote sensing, software logs, cameras, microphones, radio-frequency identification readers and wireless sensor networks. As a result, big databases contain large amounts of private and sensitive data, including healthcare, business, financial and criminal records. Such data cannot be shared with everyone, so an analytics system must protect data privacy before data mining and machine learning are applied. Data preprocessing can be a useful method for avoiding privacy leakage, in addition to making the data clean, noise-free and consistent. This paper examines a range of statistics-based data preprocessing approaches, such as data perturbation and normalisation using the z-score, min-max and decimal scaling methods, for privacy protection in a big data environment. The experimental results reveal that statistics-based preprocessing mechanisms are effective in protecting confidential information while maintaining the performance of data mining and machine learning analyses.
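The preprocessing methods named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the choice of population standard deviation for the z-score, and the use of additive Gaussian noise as the perturbation are all assumptions made for illustration, since the abstract does not specify them.

```python
import random
import statistics


def z_score(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [0.0 for _ in values]  # constant column: nothing to scale
    return [(v - mean) / std for v in values]


def min_max(values, new_min=0.0, new_max=1.0):
    """Map values linearly onto the interval [new_min, new_max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [new_min for _ in values]  # constant column
    return [(v - lo) / (hi - lo) * (new_max - new_min) + new_min for v in values]


def decimal_scaling(values):
    """Divide by 10**j, with j the smallest non-negative integer
    such that every scaled absolute value is below 1."""
    max_abs = max(abs(v) for v in values)
    j = 0
    while max_abs / 10 ** j >= 1:
        j += 1
    return [v / 10 ** j for v in values]


def perturb(values, sigma=1.0, seed=None):
    """Additive Gaussian noise (an assumed, illustrative form of perturbation)."""
    rng = random.Random(seed)
    return [v + rng.gauss(0.0, sigma) for v in values]


if __name__ == "__main__":
    salaries = [42000.0, 58000.0, 61000.0, 97000.0]  # made-up sample column
    print(z_score(salaries))
    print(min_max(salaries))
    print(decimal_scaling(salaries))
    print(perturb(salaries, sigma=500.0, seed=0))
```

Each transform hides the raw magnitudes while preserving the relative ordering of values (the noise-based perturbation only approximately), which is the property that lets downstream mining and learning tools keep working on the transformed column.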
Original language: English
Number of pages: 1
Publication status: Published - 2017
Event: The 4th Cyber Security Symposium: CSS 2017 - International Hotel Wagga, Wagga Wagga, Australia
Duration: 08 Jun 2017 to 09 Jun 2017


Conference: The 4th Cyber Security Symposium
City: Wagga Wagga
Other: The 4th Cyber Security Symposium focuses on all aspects of techniques and applications linked to ICT security research. The purpose of CSS’2017 is to provide a forum for presentation and discussion of innovative ideas, research results, applications and experience as well as highlight activities in the related areas.

Abstract submissions on all aspects of ICT security are welcome. Authors will then be asked to extend their abstract into a paper for submission to a special issue of the following journals:

1. Journal of Network and Computer Applications (JCR Q1)
2. Journal of Parallel and Distributed Computing (JCR Q1)
3. IEEE Access

