Font Size: a A A

The Research On Data Publication Algorithms Satisfying Differential Privacy

Posted on:2018-12-16Degree:MasterType:Thesis
Country:ChinaCandidate:Y Y XuFull Text:PDF
GTID:2348330512977216Subject:Computer Science and Technology
Abstract/Summary:PDF Full Text Request
Nowadays many service providers have accumulated myriad user data.Sharing data avoids data precipitation,but it involves user privacy.Thus,the issue of privacy protection in data release has gradually attracted the attention of academia and industry.Differential privacy protection model has been applied to varied fields due to its excellent performance.This paper studies data publishing algorithms that satisfy this model and proposes different solutions for different data types.Our goal is to improve the accuracy of published data by ensuring that the publishing algorithm meets this model.For the numerical data,this article proposes a data distribution algorithm based on ordered histogram partitioning.The algorithm performs the first transformation on the histogram intervals and adjusts the interval order under the premise of preserving the histogram structure,which eradicates "partition truncation".Meanwhile,we use the greedy partition algorithm and consider the impact of the Laplace noise on the partition scheme.This avoids the prior agreement on the number of partitions and improves the applicability of the algorithm to different datasets.Our algorithm improves the accuracy of the range query on the histogram by ensuring that the published histogram satisfies the differential privacy protection model.For the associated data represented by complex graphs,in this article we propose an algorithm based on hierarchical stochastic graph partitioning.Considering that the user's interest in the data is usually concentrated on a community,we apply differential privacy protection to the edges of a complex graph in different degrees,combined with community detection.Based on divide and conquer,our algorithm constructs a hierarchical random graph of the community,and then merges the sub-level random graph,which eliminates the inefficiency of the original algorithm when dealing with large-scale complex graphs.Our algorithm trades off the different parts of the complex graph,as well as privacy and data availability.Consequently,according to the distribution of node degree and the shortest path distribution,the proposed algorithm keeps the properties of the primitive complex graph better in the experiment.
Keywords/Search Tags:Data Publication, Privacy Protection, Differential Privacy
PDF Full Text Request
Related items