Research On Frequent Itemset Mining Of Complex Data Based On Local Differential Privacy

Posted on:2021-02-18

Degree:Master

Type:Thesis

Country:China

Candidate:H J Liu

Full Text:PDF

GTID:2428330620476444

Subject:Software engineering

Abstract/Summary:

PDF Full Text Request

Frequent itemset mining is a basic data mining task,and it also plays a key role in association rule algorithms.However,it is very likely that the user's personal information will be leaked during the mining process.In recent years,the application of differential privacy protection model to mining frequent itemsets is a relatively common and reliable protection method.Most of the papers are proposed for central differential privacy and few papers apply local differential privacy to mining frequent Itemsets.The advantage of local differential privacy is that the user first disturbs the original data on the client,and then sends the disturbed data to the aggregator,which can prevent the leakage of user data by aggregator,thereby improving the user data protection.At present,there is no complete framework to apply local differential privacy to frequent itemset mining tasks,and there are problems of high communication cost and low mining accuracy.In order to solve the above problems,this paper proposes corresponding solutions:(1)We propose a complete framework for applying local differential privacy to frequent itemset mining,and it is applicable to the case where user data types are more complex.The framework first maps the user's original data to a binary string using bitmap coding.Aiming at the multi-attribute of users,the Threshold Random Response(TRR)algorithm is proposed to achieve the best perturbation method.The user sends the disturbed data to a third-party aggregator.The aggregator performs unbiased estimation on the collected data,and uses FP-Tree for mining frequent itemsets.Finally,we verified the feasibility of the framework through experiments and the TRR algorithm is superior to the existing perturbation algorithms.The TRR algorithm results are more accurate under the same degree of disturbance.(2)We propose a Hadamard Response(HR)algorithm that satisfies local differential privacy and applies to mining frequent itemsets.HR algorithm uses the Hadamard matrix to map and store user data,appropriately increasing the range of possible values output by the user and sampling the mapped strings uniformly.If the user's threshold is 6),the communication cost is reduced from (6))to (7)2)6))and the accuracy of the result is also improved.Finally,we compare the HR algorithm with the previous frequent itemset mining perturbation algorithms through experiments.It was found that the HR algorithm has the lowest communication cost and high availability of results.

Keywords/Search Tags:

local differential privacy, frequent itemset mining, complex data, TRR, Hadamard random response

PDF Full Text Request

Related items

1	Research On Frequency Estimation And Frequent Itemset Mining For Local Differential Privacy Protection
2	Research On Frequent Pattern Mining Algorithm Under Local Differential Privacy
3	Frequent Sequence Mining For Local Difference Privacy
4	Research On Frequent Itemset Mining Based On Differentially Private Model
5	Research On The Key Technology Of Frequent Pattern Mining Based On Local Differential Privacy
6	Study On The Frequent Itemset Mining Based On Differential Privacy
7	Research On Frequent Itemset Mining Method With Differential Privacy Based On Transaction Truncation
8	Statistical Data Analyses Based On Local Differential Privacy
9	Research On Key Technologies Of Privacy Preserving Data Mining Based On Local Differential Privacy
10	Perturbed Data Publishing With Local Differential Privacy Constraints