An Edge Correlation Based Differentially Private Bayesian Network Data Release Method

Posted on: 2019-01-14 | Degree: Master | Type: Thesis
Country: China | Candidate: J C Zhang | Full Text: PDF
GTID: 2428330548494885 | Subject: Computer Science and Technology
Abstract/Summary:
Differential privacy (DP) provides a strict, provable privacy guarantee that holds even when the attacker is assumed to have arbitrary background knowledge, which sets it apart from earlier work in the field of privacy protection. In 2016, Apple announced at its Worldwide Developers Conference that it was already using differential privacy to protect the privacy of iOS users, and more and more researchers have since turned their attention to the technology. Data release aims to ensure the security and privacy of published data, and differentially private data release has developed rapidly, in part because differential privacy places no limit on the attacker's background knowledge. Many studies of differential privacy assume that the tuple attributes in a data set are unrelated, or independent. Correlated and uncorrelated data sets, however, differ markedly in privacy budget cost, so differential privacy on correlated data sets leaves considerable room for research.

This thesis studies how a weighted Bayesian network can protect the information in closely correlated statistical databases, and proposes an edge correlation based Bayesian network data release method under differential privacy. First, a Bayesian network is used to obtain the relationships among the main attributes of the high-dimensional data set, which yields the primary attribute set; the TBT algorithm then produces a lower-dimensional attribute set. Next, the distance between attribute nodes is measured on the low-dimensional data set to obtain a PF vector for each edge, and the PF vectors are standardized. A relevance measure is then introduced to define the degree of closeness between edges, from which an edge correlation based sensitivity is obtained. Finally, when noise is added, the NDR algorithm applies this new edge correlation sensitivity of the tuples to the low-dimensional data set to control the distribution and magnitude of the noise, so that as little privacy budget as possible is spent and data availability is maximized.

Statistical query experiments on three data sets, evaluated with MAE and accuracy, show that the proposed method outperforms the earlier Bayesian network baseline algorithm. The HDR algorithm proposed in this thesis guarantees privacy in data release while preserving the availability of the released data, achieving a better balance between the two. The edge correlation approach keeps the privacy cost smaller and the release more secure. The Bayesian network yields a key set of attributes, which reduces the data dimension, and redefining the correlation within correlated data sets yields new privacy cost and sensitivity parameters. The edge correlation based HDR algorithm achieves strong privacy protection for the data and ensures that the released results can be used safely.
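The noise-adding step described in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's NDR or HDR algorithm, whose details the abstract does not give. It only shows the standard Laplace mechanism with a sensitivity that grows with hypothetical normalized edge weights. The function names, the "1 + sum of weights" scaling rule, and the sample values are all assumptions made for illustration.

```python
import random


def edge_scaled_sensitivity(base_sensitivity, edge_weights):
    """Scale a query's base sensitivity by the weights of incident edges.

    Assumption (not from the thesis): each weight is a normalized
    correlation strength in [0, 1], and correlated attributes raise the
    sensitivity additively.
    """
    return base_sensitivity * (1.0 + sum(abs(w) for w in edge_weights))


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    if scale <= 0:
        raise ValueError("scale must be positive")
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def noisy_count(true_count, epsilon, sensitivity, rng):
    """Laplace mechanism: add noise with scale sensitivity / epsilon."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return true_count + laplace_noise(sensitivity / epsilon, rng)


def mean_absolute_error(true_values, noisy_values):
    """MAE between true query answers and their noisy releases."""
    pairs = list(zip(true_values, noisy_values))
    return sum(abs(t - n) for t, n in pairs) / len(pairs)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical attribute with two correlated neighbours.
    sensitivity = edge_scaled_sensitivity(1.0, [0.5, 0.25])
    true_counts = [120, 45, 300]
    released = [noisy_count(c, 1.0, sensitivity, rng) for c in true_counts]
    print("sensitivity:", sensitivity)
    print("MAE:", mean_absolute_error(true_counts, released))
```

In this sketch a smaller epsilon or heavier edge weights both widen the noise, which is the trade-off between privacy budget and data availability that the abstract refers to.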
Keywords/Search Tags: Data release, Differential privacy, Bayesian network, Edge correlation, Edge sensitivity