Local Differential Privacy Preserving Of High-Dimensional Data

Posted on: 2022-06-02
Degree: Master
Type: Thesis
Country: China
Candidate: R S Wu
Full Text: PDF
GTID: 2518306341455614
Subject: Computer Science and Technology
Abstract/Summary:
Mobile Internet services and intelligent terminal devices generate large volumes of high-dimensional data that contain valuable latent patterns. If such data are published directly without sanitization, however, private information about users or organizations may be disclosed. Existing differential privacy publishing methods for high-dimensional data do not adequately protect users' privacy. On the one hand, centralized differential privacy cannot prevent privacy leakage caused by untrusted servers, and it runs into many problems in crowdsourcing scenarios. On the other hand, differential privacy not only introduces a large amount of noise, which reduces data utility, but also incurs high time and computational complexity. In addition, most methods ignore the correlation between attributes when perturbing the data, which causes further information loss. This thesis proposes a high-dimensional data publishing mechanism based on randomized response (RR) and a Markov network (HDPRM) to address these problems. The main contents are as follows.

First, the randomized response method is applied to each user's local data: a perturbation matrix based on the staircase mechanism is constructed to perturb the data, and an algorithm satisfying ε-LDP is proposed to provide local privacy protection. The utility of this algorithm is proved theoretically.

Second, to reduce the computational cost, mutual information is used to measure the association between attributes. A Markov network is constructed from these associations, and the junction tree algorithm is used to reduce the dimensionality of the data. To improve data utility, the joint distribution of the attributes is reconstructed with the EM algorithm, and the final synthetic data are generated by sampling from the joint and marginal distributions of the attributes.

Finally, the performance of the proposed algorithm is verified experimentally and compared with RRPP and Invariant-PRAM. The results show that HDPRM is better at improving data utility when the amount of data is large or the privacy budget is small. For further analysis, a Bayesian network is used as a reference, and the advantages and disadvantages of the two schemes are compared in detail. Because a Bayesian network is more complex than a Markov network, its computational complexity is higher; the Markov network reduces the computational cost while achieving results close to those of the Bayesian network, so the proposed scheme is more practical. In future work, local differential privacy will be applied to NoSQL and big data computing frameworks so that differential privacy technology can be used more widely.

17 figures, 1 table, 65 references.
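To make the local perturbation step concrete, here is a minimal Python sketch of randomized response over a single categorical attribute. It is an illustration only and not the thesis's algorithm: it uses the standard k-ary (generalized) randomized response rather than the staircase-based perturbation matrix described in the abstract, and the function names `krr_perturb` and `krr_estimate` are hypothetical. Each user keeps the true value with probability p = e^ε / (e^ε + k − 1) and otherwise reports one of the other k − 1 values uniformly at random, so the ratio of any two reporting probabilities is at most e^ε, which is the ε-LDP condition. The collector then inverts the known perturbation probabilities to obtain unbiased frequency estimates.

```python
import math
import random
from collections import Counter


def krr_perturb(value, domain, epsilon, rng):
    """Perturb one user's value with k-ary randomized response.

    Assumption: `domain` lists the k >= 2 possible values of a single
    categorical attribute and `value` is one of them.
    """
    k = len(domain)
    e = math.exp(epsilon)
    p = e / (e + k - 1)  # probability of reporting the true value
    if rng.random() < p:
        return value
    # Otherwise report one of the k - 1 other values uniformly at random.
    others = [v for v in domain if v != value]
    return rng.choice(others)


def krr_estimate(reports, domain, epsilon):
    """Unbiased frequency estimates from perturbed reports.

    A report equals v with probability f_v * p + (1 - f_v) * q, where f_v
    is the true frequency of v; solving for f_v gives the estimator below.
    """
    k = len(domain)
    n = len(reports)
    e = math.exp(epsilon)
    p = e / (e + k - 1)
    q = 1.0 / (e + k - 1)  # probability of reporting one specific other value
    counts = Counter(reports)
    return {v: (counts[v] / n - q) / (p - q) for v in domain}


if __name__ == "__main__":
    rng = random.Random(42)
    domain = ["low", "medium", "high"]  # hypothetical attribute values
    true_data = ["low"] * 6000 + ["medium"] * 3000 + ["high"] * 1000
    reports = [krr_perturb(v, domain, 1.0, rng) for v in true_data]
    print(krr_estimate(reports, domain, 1.0))
```

A smaller privacy budget ε pushes p toward 1/k, so each report carries less information and the estimates become noisier, which is why the amount of data matters for utility. Note that individual estimates can fall slightly below 0 or above 1 on small samples, although they always sum to 1.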
Keywords/Search Tags: local differential privacy, random response, Markov network, mutual information, high-dimensional data