
The Research On Statistical And Spatial Data Publication Algorithms Satisfying Differential Privacy

Posted on: 2019-05-01
Degree: Master
Type: Thesis
Country: China
Candidate: M H Li
GTID: 2348330542989027
Subject: Computer Science and Technology
Abstract/Summary:
The user data collected by corporations and service providers contains a great deal of useful information, so it often needs to be shared. Sharing has led to a growing number of privacy leaks, which has drawn attention to data release methods that protect privacy while preserving accuracy. Differential privacy rests on a rigorous mathematical foundation: it gives a precise definition of privacy protection and a quantitative way to assess it. The theory is therefore widely applied in fields related to data statistics and release. Different kinds of published data usually require different differentially private publishing methods. This paper focuses on two types of data publication and proposes an improvement for each.

For histogram-based publication of statistical data, this paper presents a partitioned histogram algorithm based on the Haar wavelet transform. The algorithm first uses a greedy partitioning algorithm to obtain a good partition structure, then applies the wavelet transform and adds noise, and finally restores the result to the original histogram structure. The algorithm reduces the complexity of the wavelet tree built by the wavelet transform, and it lowers the growth of query noise from linear to polylogarithmic, which improves the accuracy of histogram queries.

For partition-based publication of geospatial data, this paper proposes an adaptive noise-adding algorithm based on grids. The algorithm first partitions the data domain into grid cells of equal size and adds Laplace noise with a uniform scale parameter to each cell. It then recursively optimizes a selected set of cells, adding noise adaptively to reduce the relative error of each cell. Finally, it partitions the optimized cells to obtain a second layer of grids. As a result, the algorithm adds noise adaptively according to the count in each cell, which reduces query error and improves query accuracy.
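As an illustration of the first step of the grid-based approach (equal-size cells, Laplace noise with one uniform scale parameter), here is a minimal Python sketch. It is not the thesis's algorithm: the adaptive optimization and the second grid layer are omitted, and the function names, the `bounds` parameter, and the choice of sensitivity 1 (one record changes one cell count by 1) are assumptions made for this example.

```python
import random


def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_uniform_grid(points, m, epsilon, bounds=(0.0, 1.0, 0.0, 1.0), seed=None):
    """Count points in an m x m grid and add Laplace noise to every cell.

    Assumes all points lie inside `bounds` = (x_min, x_max, y_min, y_max).
    Returns (true_counts, noisy_counts) as lists of rows.
    """
    rng = random.Random(seed)
    x_min, x_max, y_min, y_max = bounds
    counts = [[0] * m for _ in range(m)]
    for x, y in points:
        # Points on the upper boundary are clamped into the last cell.
        col = min(int((x - x_min) / (x_max - x_min) * m), m - 1)
        row = min(int((y - y_min) / (y_max - y_min) * m), m - 1)
        counts[row][col] += 1
    # Assumed sensitivity of 1: adding or removing one point changes one count by 1.
    scale = 1.0 / epsilon
    noisy = [[c + laplace_noise(scale, rng) for c in row] for row in counts]
    return counts, noisy


if __name__ == "__main__":
    pts = [(0.1, 0.1), (0.9, 0.9), (1.0, 1.0), (0.5, 0.2)]
    true_counts, noisy_counts = noisy_uniform_grid(pts, m=2, epsilon=1.0, seed=0)
    print(true_counts)
    print(noisy_counts)
```

With a uniform scale, sparse cells receive the same absolute noise as dense ones and so suffer a larger relative error. This is the kind of error the adaptive step described in the abstract aims to reduce.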
Keywords/Search Tags: Data Publication, Privacy Protection, Differential Privacy, Noise Optimization