Research On Key Technologies Of Privacy Protection For Data Publishing

Posted on: 2021-02-11
Degree: Master
Type: Thesis
Country: China
Candidate: W J Li
Full Text: PDF
GTID: 2428330632454259
Subject: Computer technology
Abstract/Summary:
With the continuous advancement of the digital and information era, many data collection agencies need to publish the raw data they collect (such as medical and financial data) to facilitate data analysis and mining and to obtain more effective decision support from the released data. However, the raw data contains a large amount of sensitive personal information, and releasing it directly can cause serious leakage of personal privacy. Providing data researchers with as much useful information as possible while using privacy technology to keep the original data private and secure is therefore extremely challenging. To address privacy leakage in data release, this thesis carries out the following research on key privacy protection technologies for data publishing:

(1) The research background of privacy protection for data publishing and the traditional privacy protection models are summarized. The definition of differential privacy and its implementation mechanisms are introduced, and the current techniques and research progress of differential privacy protection for data release are analyzed.

(2) Aiming at the problem of privacy disclosure during data fusion publishing, a hierarchical data fusion publishing mechanism based on differential privacy protection (HDFPM) is proposed for a personalized privacy-protected data publishing environment. It addresses the inability of existing data fusion publishing mechanisms to resist background knowledge attacks. The mechanism grades users according to their access rights and payment. During data fusion, differential privacy protection is combined with the classification tree and its improved algorithm, and the hierarchical differential privacy budget is allocated across the levels so that the fused data receives graded privacy protection. Experimental results show that this mechanism can not only fuse data effectively, but also protect sensitive data.

(3) Aiming at the poor availability of published results caused by "the curse of dimensionality" in high-dimensional data publishing, the PPDP-PCAO (Privacy Preserving Data Publishing with Principal Component Analysis Optimization) method is presented. It targets the low utility of released results that follows from the large amount of noise introduced under the curse of dimensionality. PPDP-PCAO improves the Principal Component Analysis (PCA) algorithm by taking attribute importance into account and reduces the dimension of the data with the improved PCA, which lowers the time and space cost. PPDP-PCAO introduces an evaluation mechanism based on mutual information into data release: it evaluates the data generated with different numbers of principal components to determine the optimal number. High-dimensional data often contains multiple sensitive attributes, and traditional privacy budget allocation methods cannot provide personalized privacy protection for them. PPDP-PCAO therefore introduces sensitivity preference, combines it with optimal matching theory, and designs a hierarchical protection strategy for sensitive attributes. Extensive experiments on different real datasets demonstrate that PPDP-PCAO not only guarantees the privacy of the published dataset, but also significantly improves accuracy and data utility compared with other high-dimensional data publishing methods.
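The abstract refers to differential privacy mechanisms and to a privacy budget distributed by user level without giving details. As a rough illustration only, the sketch below applies the standard Laplace mechanism to a histogram release, with a different budget epsilon per user tier. It is not the thesis's HDFPM: the tier names, the epsilon values in `TIER_EPSILON`, and the function names are all hypothetical, and it assumes each record falls into exactly one bin, so adding or removing one record changes the histogram by at most 1.

```python
import random

# Hypothetical tiers and budgets; the abstract does not state actual values.
# A larger epsilon means less noise (more utility, weaker privacy).
TIER_EPSILON = {"basic": 0.1, "standard": 0.5, "premium": 1.0}


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) noise.

    The difference of two i.i.d. exponential variables with mean `scale`
    follows a Laplace distribution with location 0 and that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(true_count, epsilon, rng, sensitivity=1.0):
    """Laplace mechanism: add noise with scale sensitivity / epsilon."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return true_count + laplace_noise(sensitivity / epsilon, rng)


def release_histogram(counts, tier, rng):
    """Release a noisy histogram using the budget assigned to `tier`.

    Assumes disjoint bins (one record per bin), so Laplace noise of scale
    1 / epsilon on every bin gives epsilon-differential privacy for the
    whole histogram.
    """
    epsilon = TIER_EPSILON[tier]
    return {key: noisy_count(value, epsilon, rng) for key, value in counts.items()}


if __name__ == "__main__":
    rng = random.Random(0)
    counts = {"flu": 120, "diabetes": 45, "asthma": 30}  # made-up example data
    for tier in TIER_EPSILON:
        print(tier, release_histogram(counts, tier, rng))
```

Under these assumptions, a user in a tier with a smaller epsilon sees noisier counts than one in a tier with a larger epsilon, which is one simple way a graded budget could translate into graded protection.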
Keywords/Search Tags: data publishing, privacy protection, privacy classification, differential privacy, evaluation mechanism