Frequent Sequence Mining for Local Differential Privacy

Posted on: 2021-04-20
Degree: Master
Type: Thesis
Country: China
Candidate: C Gong
Full Text: PDF
GTID: 2428330605456906
Subject: Computer Science and Technology
Abstract/Summary:
With the continuous development of science and technology, e-commerce platforms have gradually become part of people's lives. Commodity recommendation is an effective means for e-commerce platforms to provide personalized services; however, the user data that these platforms collect can lead to privacy leakage. To address this problem, researchers have proposed privacy protection techniques for frequent itemsets and for frequent sequences, including frequent pattern protection based on pruning, frequent pattern protection based on the Apriori algorithm, and frequent sequence mining under differential privacy. Current research on frequent sequence protection focuses mainly on traditional algorithms and pruning, while frequent sequence protection based on time series has been explored relatively little, so existing work often ignores problems such as sequence redundancy.

To solve these problems, this thesis considers, based on the attributes of the dataset, how to design mining algorithms under privacy protection. It focuses on how to reduce sequence redundancy and how to improve sequence applicability. Specifically, it studies how to effectively obtain a reduced representation of the optimal frequent sequences and how to reduce the privacy leakage of frequent sequences before and after reconstruction. To this end, the thesis proposes a frequent sequence mining algorithm that satisfies local differential privacy. First, for redundant and non-redundant attributes, the redundant sequence pruning method and the FP-Growth principle are used to prune the transaction dataset so that the result satisfies local differential privacy. Second, the thesis uses a Boolean association rule mining algorithm and randomized response to perform perturbation-based partitioning, and uses the ID numbers of the time series together with the redundant sequence pruning method to find the optimal frequent k-sequence patterns (1 ≤ k ≤ n), which provides a theoretical basis for reducing the frequency of frequent sequences. Third, the change in ratio before and after the frequent sequences are perturbed, together with the proprietary privacy budget, is used to determine the optimal local sensitivity, which solves the problem of double allocation of the privacy budget. Finally, extensive experiments were performed on the kosarak dataset, and the stability and feasibility of the algorithm were verified using metrics such as frequent sequence accuracy and running time. Theoretical analysis and experimental results show that the algorithm provides stronger privacy and better data utility than traditional algorithms.

Figures [21], tables [7], references [81]...
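The abstract names randomized response as the perturbation mechanism but does not give its form. As a rough illustration only, the sketch below shows classic binary randomized response, which satisfies ε-local differential privacy for a single bit (for example, "does this user's record contain a given sequence?"). It is not the thesis's algorithm: the function names `perturb_bit` and `estimate_frequency`, the parameter `epsilon`, and the simulated population are all assumptions made for this example.

```python
import math
import random


def keep_probability(epsilon):
    """Probability of reporting the true bit: e^eps / (e^eps + 1)."""
    return math.exp(epsilon) / (math.exp(epsilon) + 1.0)


def perturb_bit(bit, epsilon, rng):
    """Run on the user's side: keep the true bit with probability p, else flip it."""
    p = keep_probability(epsilon)
    return bit if rng.random() < p else 1 - bit


def estimate_frequency(reports, epsilon):
    """Run on the collector's side: unbiased estimate of the true fraction of 1s.

    The expected observed fraction is f*p + (1 - f)*(1 - p), so solving
    for f gives (observed + p - 1) / (2p - 1).
    """
    p = keep_probability(epsilon)
    observed = sum(reports) / len(reports)
    return (observed + p - 1.0) / (2.0 * p - 1.0)


def simulate(true_fraction, n_users, epsilon, seed):
    """Hypothetical population: each user holds one bit and reports it privately."""
    rng = random.Random(seed)
    true_bits = [1 if rng.random() < true_fraction else 0 for _ in range(n_users)]
    reports = [perturb_bit(b, epsilon, rng) for b in true_bits]
    return estimate_frequency(reports, epsilon)


if __name__ == "__main__":
    print(simulate(true_fraction=0.3, n_users=20000, epsilon=1.0, seed=7))
```

A smaller `epsilon` pushes the keep probability toward 1/2, which strengthens privacy but inflates the variance of the estimate. This is the general privacy versus utility trade-off that the abstract's experiments on accuracy refer to.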
Keywords/Search Tags:local differential privacy, frequent sequences, proprietary privacy budget, local sensitivity, random response