
Research And Implementation Of Data Release Technology Based On Differential Privacy

Posted on: 2020-03-22
Degree: Master
Type: Thesis
Country: China
Candidate: H F Ke
GTID: 2438330623464245
Subject: Computer technology
Abstract/Summary:
With the rapid development of Internet technology and big data applications, the way people live and work has changed in recent years, and the value of data has become evident across industries. However, while data publishing creates value, it also poses a serious threat to users' privacy, so protecting user privacy has become a critical security requirement in data processing systems. If published data is not properly processed, adversaries may acquire users' sensitive information. For example, untrusted entities may maliciously obtain a user's identity information, historical locations, and movement trajectory, or infer the user's behavioral habits from background knowledge and context information. To address the privacy problems involved in big data publishing, this thesis makes the following contributions:

(1) To address privacy leakage in data publishing, this thesis proposes AQ-DP, an obfuscation-based differential privacy data publishing scheme built on quasi-identifier classification. The scheme classifies quasi-identifiers according to sensitive attributes and processes the classified quasi-identifiers with random shuffling and data generalization algorithms, which provides privacy protection without destroying the correlations in the original data. Differential privacy is then introduced to provide stronger protection. AQ-DP can be proved to satisfy the definition of differential privacy, so its security is guaranteed theoretically. In addition, the scheme uses KL divergence and mutual information to measure data utility and privacy level, respectively. The experimental results show that AQ-DP performs better while satisfying more security properties, and is therefore better suited to practical scenarios than popular existing differential privacy proposals.

(2) To solve the problem that a single server cannot process massive data in real time, this thesis proposes MCDP, a multi-cluster distributed differential privacy data publishing scheme based on neural networks. The scheme establishes a distributed differential privacy model for data processing, which greatly relieves the central server of data processing pressure. It can be theoretically proved that the proposed distributed model satisfies the definition of differential privacy. Moreover, the central server stores no user data in the MCDP model, so user data will not be leaked even if the central server is attacked. Allowing different privacy parameters across clusters also ensures the flexibility and feasibility of MCDP. The experimental results show that MCDP provides strong security protection and real-time operating efficiency.

(3) Building on the application background of data publishing and differential privacy, this thesis implements a differential-privacy-based data query system using mainstream technologies and frameworks such as Node.js and Electron. Users can query the system to obtain statistical information about the data set for decision-making or data mining, but cannot obtain private information about individuals in the data set. For interactive and non-interactive differential privacy queries, the system applies different differential privacy processing to meet users' query needs. The system offers a choice of differential privacy parameters. Under different privacy parameters, the probability distribution of the noise added by the system differs, and the system displays the current noise probability density plot in real time.
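
The relationship between the privacy parameter and the noise distribution described for the query system can be illustrated with a minimal sketch. It assumes the standard Laplace mechanism applied to a counting query with sensitivity 1; the abstract does not name the mechanism the thesis uses, and the function names (`laplace_pdf`, `sample_laplace`, `noisy_count`) and the sample records are hypothetical.

```python
import random


def laplace_pdf(x: float, scale: float) -> float:
    """Density of a zero-mean Laplace distribution with the given scale."""
    import math
    return math.exp(-abs(x) / scale) / (2.0 * scale)


def sample_laplace(scale: float, rng: random.Random) -> float:
    """Draw one zero-mean Laplace sample.

    The difference of two independent exponential variables with
    rate 1/scale follows a Laplace distribution with that scale.
    """
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def noisy_count(records, predicate, epsilon: float, rng: random.Random) -> float:
    """Answer a counting query under the Laplace mechanism (assumed here).

    Adding or removing one record changes a count by at most 1,
    so the sensitivity is 1 and the noise scale is 1 / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for record in records if predicate(record))
    return true_count + sample_laplace(1.0 / epsilon, rng)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical records: (age, has_condition)
    records = [(34, True), (51, False), (29, True), (62, True), (45, False)]
    for epsilon in (0.1, 1.0, 10.0):
        answer = noisy_count(records, lambda r: r[1], epsilon, rng)
        peak = laplace_pdf(0.0, 1.0 / epsilon)
        print(f"epsilon={epsilon}: noisy count={answer:.2f}, pdf peak={peak:.2f}")
```

Under this assumption, a smaller privacy parameter gives a larger noise scale, so the density curve is flatter and answers are noisier; a larger parameter concentrates the density around zero.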
Keywords/Search Tags: Privacy protection, Differential privacy, Data publishing, Quasi-identifier classification, Multi-clusters, Neural networks