Research On Privacy-preserving Data Publishing Methods And Their Applications

Posted on: 2019-02-14
Degree: Doctor
Type: Dissertation
Country: China
Candidate: A S M Touhidul Hasan
GTID: 1318330566459280
Subject: Computer application technology
Abstract/Summary:
In the Big Data era, the performance of data analysis depends heavily on the availability of high-quality data. To guarantee the quality of data analysis, well-organized information sharing between organizations becomes essential. Data often contain personally identifiable information, and therefore publishing such data may result in a privacy breach. Privacy-preserving methods and anonymization techniques are necessary to reduce the probability of privacy disclosure. The most straightforward way to prevent privacy disclosure is not to publish data to other parties at all. This approach would be destructive, since it would prevent the data from being analyzed for interesting and useful patterns. This dissertation investigates the methods and applications of privacy-preserving data publishing and analysis.

For the publishing of relational data, we consider the privacy and data utility problems in the published dataset. When a dataset is released, it can reveal user identities because of the identifying nature of relational data. Hence, it is essential to apply anonymization techniques to preserve user privacy in the published dataset. Even after anonymization techniques are applied, the published dataset can still reveal user identities through linking, background knowledge, and composition attacks. To protect against these attacks, the published dataset must satisfy a privacy model. Most such models decrease data utility in order to meet the privacy requirements.

We begin by analyzing the privacy and data utility problems for relational data publishing in a single release of the dataset. To increase data utility and privacy in a single release, we propose a value swapping method to anonymize the published data. The proposed method uses negative association rules to swap invalid records so that the privacy requirement is satisfied. The value swapping method increases privacy while retaining better data utility in a single release of the dataset.

We then examine the problem of privacy preservation in sequential releases of datasets. In a sequential release, different data publishers publish their data without coordinating with each other; this is multiple independent data publishing. An attack on personal privacy that uses multiple independent datasets is called a composition attack. To increase data utility and protect against composition attacks, we propose a merging method. The proposed method applies a cell generalization approach to protect personal privacy from composition attacks while also increasing data utility.

In addition, we study the bike-sharing data publishing problem as an application of our proposed methods. For publishing bike-sharing data, we use the methodology from multiple independent data publishing. Hence, the released bike-sharing dataset is protected from privacy violations and has increased data utility.

In the last few years, privacy preservation has become essential for non-relational datasets such as trajectory data. Trajectory data can disclose users' private information through the places they visit. Reward-based location-based service (LBS) applications collect user trajectories to provide the requested service. We investigate the privacy problems in reward-based LBS applications and propose a client-server privacy architecture with a bounded perturbation technique to anonymize trajectory data. The proposed method introduces a global location set to anonymize users' trajectories. The client-server privacy architecture and bounded perturbation technique preserve user privacy and provide better data utility for the anonymized dataset.

The contributions of this dissertation are summarized as follows:

1. Proposing a value swapping method for a single release of the dataset. The value swapping method uses negative association rules to protect against linking and background knowledge attacks, and it enhances the utility of the published dataset.
2. Proposing a merging method for sequential releases of datasets. The merging method protects published datasets from composition attacks and increases data utility.
3. Analyzing the data publishing problems for bike-sharing datasets and proposing a grouping approach based on the merging method. The grouping approach preserves user privacy and increases data utility in the published datasets.
4. Proposing a client-server privacy architecture and a bounded perturbation technique to anonymize trajectory data. The bounded perturbation method introduces a global location set to anonymize each trajectory. The anonymized trajectory preserves user privacy and provides better data utility.
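
To make the notion of a composition attack more concrete, the following minimal Python sketch shows how an adversary might combine two independently published anonymized datasets. It is only an illustration under stated assumptions and not the dissertation's method: the function name `composition_attack` and the example sensitive values are hypothetical, and it assumes the adversary already knows which anonymized group the victim belongs to in each release.

```python
def composition_attack(candidates_a, candidates_b):
    """Return the sensitive values consistent with both releases.

    Each argument is the set of sensitive values appearing in the
    anonymized group that contains the victim in one release.
    """
    return set(candidates_a) & set(candidates_b)


# Hypothetical groups: each release alone leaves three candidates.
release_a_group = {"flu", "diabetes", "asthma"}
release_b_group = {"diabetes", "hepatitis", "cancer"}

# Intersecting the two groups narrows the candidates for the victim.
remaining = composition_attack(release_a_group, release_b_group)
print(remaining)
```

In this toy case each release on its own hides the victim among three values, but the intersection leaves a single candidate. This is the kind of leakage that arises when publishers release data without coordinating with each other.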
Keywords/Search Tags:Privacy-Preserving, Data Publishing, Anonymization, Bounded Perturbation, Data utility