Research On Sharing And Publishing Technology Of Frequent Itemsets Based On Differential Privacy

Posted on: 2022-08-08
Degree: Master
Type: Thesis
Country: China
Candidate: X H Bao
GTID: 2518306740482544
Subject: Computer Science and Technology
Abstract/Summary:
As privacy protection in data analysis draws increasing attention, privacy-preserving frequent itemset publishing has received sustained interest from researchers, and a series of frequent itemset release methods based on differential privacy have been proposed. According to the accuracy of the information and the form of the transaction data, frequent itemset mining is mainly divided into three settings: static deterministic data, static uncertain data, and deterministic stream data. Research on privacy-preserving frequent itemset publishing for static deterministic data is now mature, whereas methods for static uncertain data and deterministic stream data still have many shortcomings in data privacy security and in the accuracy of the released results. The work of this thesis is as follows:

(1) The accuracy of the existing privacy-preserving top-k frequent itemset publishing method for uncertain data is affected by the value of k, which makes it difficult to balance data availability and privacy security. This thesis integrates the privacy protection mechanism with the mining process and separates the noise addition operation from the top-k itemset screening, so that the accuracy of the algorithm no longer depends on k. A candidate-level information extraction strategy is designed, and the upper threshold property of uncertain data is used to reduce the search space and the privacy budget. On this basis, an algorithm for publishing frequent itemsets of static uncertain data based on differential privacy is proposed to realize the safe publishing of frequent itemsets.

(2) The accuracy of the existing methods for privacy-preserving publishing of frequent itemsets over data streams depends heavily on how well the number of windows is chosen, and the truncation error is large. This thesis designs an adaptive dynamic sliding window protocol so that the publishing accuracy is independent of that number. Frequent itemset information is used for transaction truncation, and the concept of negative items is introduced to retain as many frequent itemsets in each transaction as possible and to reduce truncation errors. On this basis, a privacy-preserving data stream frequent itemset publishing method, DP-DFIM, is proposed to realize the safe release of frequent itemsets over data streams.

Theoretical analysis and experimental results show that the proposed methods preserve the accuracy of the itemsets and their supports while ensuring that the published itemsets satisfy differential privacy.
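
The abstract does not give the algorithms themselves, so the following Python fragment is only a minimal sketch of two generic ideas it mentions: truncating transactions with the help of frequent-item information, and adding noise to supports. It is not the thesis's DP-DFIM method. All names (`truncate`, `noisy_item_supports`, `priority_items`, `max_len`) are hypothetical, the Laplace mechanism is a standard choice assumed here, and `priority_items` is assumed to come from a source that is already safe to use (for example, an earlier private release).

```python
import random
from collections import Counter


def truncate(transaction, priority_items, max_len):
    """Keep at most max_len distinct items, preferring items in priority_items.

    Hypothetical helper: prioritising likely-frequent items is one simple way
    to lose less frequent-itemset information when a transaction is cut short.
    """
    distinct = list(dict.fromkeys(transaction))  # de-duplicate, keep order
    preferred = [x for x in distinct if x in priority_items]
    others = [x for x in distinct if x not in priority_items]
    return (preferred + others)[:max_len]


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_item_supports(transactions, item_universe, priority_items,
                        max_len, epsilon, rng=None):
    """Release noisy single-item supports for every item in item_universe.

    After truncation, adding or removing one transaction changes at most
    max_len counts by 1 each, so the L1 sensitivity is max_len and the
    Laplace scale is max_len / epsilon (assumed neighbouring relation:
    datasets differing in one transaction).
    """
    rng = rng or random.Random()
    counts = Counter()
    for t in transactions:
        counts.update(truncate(t, priority_items, max_len))
    scale = max_len / epsilon
    # Noise is added for every item in the universe, including zero counts.
    return {item: counts[item] + laplace_noise(scale, rng)
            for item in item_universe}
```

A smaller `max_len` lowers the noise scale but discards more items per transaction, which is the truncation-error trade-off the abstract refers to.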
Keywords/Search Tags: Frequent Itemsets Publishing, Uncertain Dataset, Data Stream, Differential Privacy