Association Data Release Based On Local Differential Privacy

Posted on: 2020-04-24
Degree: Master
Type: Thesis
Country: China
Candidate: T Dong
GTID: 2428330575971918
Subject: Computer technology
Abstract/Summary:
Local differential privacy is one of the strongest privacy models in the field of privacy-preserving data publishing. However, at present this model is mainly used for single-attribute data collection, and there are few studies on releasing data with multi-attribute associations. In addition, when attribute values are perturbed independently during data release, excessive information loss may result. Thus, protecting association data release under local differential privacy has become an urgent problem.

Firstly, to address the problem that associations between attributes are ignored in privacy-preserving data publishing, this paper proposes an algorithm for constructing a k-degree private Bayesian network. The algorithm uses a Bayesian network to provide an intuitive model of inter-attribute correlation, mainly combining mutual information with a greedy algorithm. Specifically, mutual information is used to quantify the association between attributes, and the attribute pair with the largest mutual information is greedily selected under the constraint of the value k to construct a low-order Bayesian network model.

Secondly, to address the excessive information loss caused by perturbing attributes independently during data release, a local differentially private method for perturbing associated data is proposed. This method mainly combines attribute grouping with randomized response techniques to perturb the original data set. Specifically, this paper proposes a grouping idea based on the attribute pair with the largest average mutual information. It calculates the mutual information between the parent and child nodes of the k-degree private Bayesian network as the mutual information value of the current attribute group, and then sets a threshold to divide the attributes of the entire Bayesian network into two categories, i.e., a weakly associated attribute set and a strongly associated attribute set. Then, this paper adopts the second-perturbation randomized response technique to construct the perturbation matrix, and adds noise under different privacy budgets for the weakly and strongly associated attribute sets to ensure local differential privacy. Furthermore, this paper uses noisy marginals and the Bayesian network to construct an approximate distribution of the given data set, achieving association data publishing based on local differential privacy.

Finally, this paper uses the UCI open-source dataset Adult to test and evaluate the utility of the proposed method through several metrics, including the average KL divergence, cosine similarity, and average running time on the dataset before and after perturbation. The change in correlation is evaluated by comparing the mutual information between attributes before and after perturbation. Experimental comparison and analysis show that the proposed method largely preserves the association between attributes and incurs less utility loss. Figure [9] table [6] reference [52]...
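
The two building blocks named in the abstract, mutual information between attributes and randomized response, can be illustrated with a minimal sketch. This is not the thesis's algorithm: the abstract does not give its perturbation matrix or grouping threshold, so the code below substitutes standard generalized (k-ary) randomized response, and the function names `mutual_information`, `krr_perturb`, and `krr_estimate` are hypothetical.

```python
import math
import random
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) of two equal-length discrete columns."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    # Sum over observed pairs of p(x,y) * log(p(x,y) / (p(x) * p(y))).
    return sum((c / n) * math.log(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())


def krr_perturb(value, domain, epsilon, rng):
    """Generalized randomized response (assumed stand-in for the thesis's mechanism).

    Reports the true value with probability p = e^eps / (e^eps + k - 1),
    otherwise a uniformly chosen different value from the domain.
    """
    k = len(domain)
    p = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    if rng.random() < p:
        return value
    return rng.choice([v for v in domain if v != value])


def krr_estimate(reports, domain, epsilon):
    """Unbiased frequency estimates recovered from perturbed reports."""
    k, n = len(domain), len(reports)
    e = math.exp(epsilon)
    p, q = e / (e + k - 1), 1 / (e + k - 1)
    counts = Counter(reports)
    # E[observed frequency] = f*p + (1 - f)*q, so invert for f.
    return {v: (counts[v] / n - q) / (p - q) for v in domain}


if __name__ == "__main__":
    rng = random.Random(0)
    domain = ["a", "b", "c"]
    data = ["a"] * 6000 + ["b"] * 3000 + ["c"] * 1000
    reports = [krr_perturb(v, domain, 2.0, rng) for v in data]
    print(krr_estimate(reports, domain, 2.0))
```

Under the grouping idea described above, one would presumably call the perturbation step with a different `epsilon` for the weakly and strongly associated attribute sets; how the budget is split is not stated in the abstract and is left open here.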
Keywords/Search Tags:local differential privacy, privacy preserving data publishing, Bayesian network, association data