Research On Histogram Publishing Technology Based On Differential Privacy

Posted on: 2022-06-09
Degree: Master
Type: Thesis
Country: China
Candidate: C Q Wang
GTID: 2518306740982589
Subject: Computer Science and Technology
Abstract/Summary:
The individual-level statistical features in a histogram involve personal privacy, so publishing a histogram directly inevitably risks privacy leakage. In recent years, differential privacy has received sustained attention in privacy-preserving histogram publishing because of its weak dependence on background knowledge. Current research on privacy-preserving histogram publishing mostly focuses on two settings: one-time publishing of a dataset histogram and continuous publishing of data stream histograms. Existing one-time publishing methods use grouping to reduce the error of the published histogram, but they cannot effectively balance the grouping approximation error against the Laplace noise error. Continuous publishing methods for data stream histograms have been studied far less, and the existing ones all target scenarios with high real-time requirements; applying them to scenarios with low real-time requirements results in poor histogram utility. To solve these problems, this thesis proposes histogram publishing methods that satisfy differential privacy constraints. The main work is as follows:

(1) Existing one-time publishing methods for dataset histograms cannot effectively balance the grouping approximation error and the Laplace noise error, which reduces histogram utility. To address this, the constrained inference method is used to obtain a sorted histogram while satisfying the differential privacy constraint. Based on the sorted histogram, a dynamic programming grouping method is proposed that generates the groups with the smallest total error on the noisy histogram. On this basis, diff-HP, a high-accuracy histogram publishing method that satisfies differential privacy, is proposed.

(2) Existing continuous publishing methods for data stream histograms mainly focus on scenarios with high real-time requirements. When they are applied to scenarios with low real-time requirements, they have difficulty identifying the stable regions of the bin count stream and they discard grouping information, which reduces histogram utility. Two methods are proposed for such scenarios. The 1-delay HCP method is intended for scenarios that tolerate only a small delay: by comparing the errors of the latest bin count under different publishing methods, it adaptively computes the group of the bin count at the time to be published. The w-delay HCP method is intended for scenarios that tolerate a large delay: using the data cached between the time to be published and the latest time, it applies a global grouping method to the bin count stream to obtain the group of the bin count at the time to be published. Both grouping methods reduce the sensitivity by adding Laplace noise to the group rather than to each bin count, which improves the utility of the published histograms. The delay HCP method is further proposed; it switches automatically between 1-delay HCP and w-delay HCP by detecting the trend of changes in the data stream in real time.

Experimental results on real datasets show that the proposed methods maintain the utility of the published histograms while protecting data privacy.
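The grouping idea in item (1) can be illustrated with a minimal Python sketch. It is not the thesis's diff-HP method. All function names are hypothetical, the constrained inference step is not implemented (the input is assumed to be sorted already), and the cost model is an assumption made only for illustration: each group pays its within-group squared deviation (the approximation error) plus a fixed penalty of 2/ε², the variance of Laplace noise with scale 1/ε. The sketch also assumes a histogram sensitivity of 1, meaning one record changes one bin count by 1.

```python
import random


def add_laplace_noise(counts, epsilon, rng):
    """Add Laplace(0, 1/epsilon) noise to every bin (assumes sensitivity 1)."""
    # The difference of two i.i.d. exponentials with rate epsilon
    # is Laplace-distributed with scale 1/epsilon.
    return [c + rng.expovariate(epsilon) - rng.expovariate(epsilon)
            for c in counts]


def optimal_grouping(counts, epsilon):
    """Partition bins into contiguous groups minimising the assumed total cost.

    Assumed cost per group: within-group squared deviation from the group
    mean, plus 2 / epsilon**2 as a per-group noise penalty.
    Returns (list of (start, end) half-open index pairs, total cost).
    """
    n = len(counts)
    penalty = 2.0 / epsilon ** 2

    # Prefix sums give the squared deviation of any span in O(1).
    prefix, prefix_sq = [0.0], [0.0]
    for c in counts:
        prefix.append(prefix[-1] + c)
        prefix_sq.append(prefix_sq[-1] + c * c)

    def sse(i, j):
        s = prefix[j] - prefix[i]
        return (prefix_sq[j] - prefix_sq[i]) - s * s / (j - i)

    # best[j] = minimal cost of grouping the first j bins;
    # cut[j] = start index of the last group in that optimal solution.
    best = [0.0] + [float("inf")] * n
    cut = [0] * (n + 1)
    for j in range(1, n + 1):
        for i in range(j):
            cost = best[i] + sse(i, j) + penalty
            if cost < best[j]:
                best[j], cut[j] = cost, i

    # Walk the cut points backwards to recover the groups.
    groups, j = [], n
    while j > 0:
        groups.append((cut[j], j))
        j = cut[j]
    groups.reverse()
    return groups, best[n]


def smooth(counts, groups):
    """Replace each bin count with the mean of its group."""
    out = []
    for start, end in groups:
        mean = sum(counts[start:end]) / (end - start)
        out.extend([mean] * (end - start))
    return out
```

A typical use would be to call `add_laplace_noise` on the raw counts, sort the result, pass it to `optimal_grouping`, and publish the output of `smooth`. Sorting and averaging already-noisy counts is post-processing, so it does not consume additional privacy budget. The quadratic-time dynamic program is adequate for small histograms; the error formula actually used in the thesis may differ from the penalty assumed here.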
Keywords/Search Tags:Histogram, Differential privacy, One-time Publishing, Continuous Publishing, Grouping