Bayesian Network-based Data Publishing Method Using Smooth Sensitivity

Posted on: 2020-04-02
Degree: Master
Type: Thesis
Country: China
Candidate: M Z Li
Full Text: PDF
GTID: 2428330620954270
Subject: Computer Science and Technology
Abstract/Summary:
Privacy protection in data publishing is an important research direction in information security, and preventing the disclosure of sensitive information has become a widely studied problem. Because high-dimensional data is large in volume and its attributes are highly correlated, publishing it under differential privacy tends to yield poor data utility. One reason is that most differentially private algorithms calibrate noise to the global sensitivity, which ignores the fact that the noise actually required depends on the data set at hand. In addition, differentially private processing of high-dimensional data is generally time-consuming. How to preserve data utility and improve algorithmic efficiency while protecting privacy has therefore become the central question in differentially private publishing of high-dimensional data. To address these issues, this dissertation proposes two algorithms.

First, it proposes SSPrivBayes (Smooth Sensitivity Privacy Bayes), a Bayesian Network-based data publishing algorithm that improves on PrivBayes. SSPrivBayes introduces smooth sensitivity, which reduces the amount of noise added while still achieving differential privacy and thereby improves the utility of the published data. Experiments on four real data sets show that the proposed algorithm improves the utility of the published data.

Second, it proposes PBCPC (Privacy Bayesian Candidate Parents and Children), an algorithm that reduces the search space of the Bayesian Network. To address the excessively large search space, PBCPC uses a heuristic method to obtain the candidate parent and child sets of each target variable, which shrinks the search space and improves execution efficiency. The experimental results show that PBCPC has no running-time advantage when the number of attributes is small, but that it outperforms PrivBayes in running time as the number of attributes grows.
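
The gap between global and smooth sensitivity that motivates SSPrivBayes can be illustrated with a small sketch. The abstract does not say which query SSPrivBayes applies smooth sensitivity to, so the example below is only an assumption for illustration: it uses the median of bounded numeric data, a standard textbook case, and the function name `median_smooth_sensitivity` and the parameters `beta`, `lower`, and `upper` are hypothetical. It is not the thesis's implementation, and it leaves out the noise-sampling step, whose distribution and choice of `beta` depend on the mechanism used.

```python
import math


def median_smooth_sensitivity(values, beta, lower, upper):
    """Beta-smooth sensitivity of the median for data clipped to [lower, upper].

    Illustrative assumption: the query is the median. For each distance k,
    the largest local sensitivity reachable by changing k records is
    discounted by exp(-k * beta), and the maximum over k is returned.
    """
    x = sorted(min(max(v, lower), upper) for v in values)
    n = len(x)
    m = (n - 1) // 2  # 0-based index of the (lower) median

    def at(i):
        # Out-of-range positions fall back to the domain bounds.
        if i < 0:
            return lower
        if i >= n:
            return upper
        return x[i]

    best = 0.0
    for k in range(n + 1):
        # Largest local sensitivity of the median at distance k from x.
        a_k = max(at(m + t) - at(m + t - k - 1) for t in range(k + 2))
        best = max(best, math.exp(-k * beta) * a_k)
    return best


if __name__ == "__main__":
    data = [50] * 9                      # tightly clustered sample
    global_sensitivity = 100 - 0         # worst case over the whole domain
    smooth = median_smooth_sensitivity(data, beta=0.1, lower=0, upper=100)
    print("global:", global_sensitivity, "smooth:", smooth)
```

On clustered data the smooth sensitivity is smaller than the global sensitivity, which is the reason a mechanism calibrated to it can add less noise. With `beta = 0` the discount disappears and the function returns the global sensitivity `upper - lower`.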
Keywords/Search Tags:smooth sensitivity, Bayesian Network, differential privacy, data publishing, privacy protection