Research On Frequent Itemsets Mining Algorithms Based On Differential Privacy Protection

Posted on:2019-12-30

Degree:Master

Type:Thesis

Country:China

Candidate:P Luo

Full Text:PDF

GTID:2518306512456304

Subject:Computer software and theory

Abstract/Summary:

With the development of Internet information technology, data has become an important resource in demand by governments and institutions. By analyzing these large volumes of data, researchers can learn more about the current world, and data mining technology emerged to meet this need. As one kind of data mining, frequent pattern mining is widely used in applications such as recommendation systems and personalized websites. However, privacy disclosures in recent years have posed serious challenges to data mining technology. How to apply frequent pattern mining to obtain valuable patterns while protecting personal private information has become a research hotspot in this field. The differential privacy protection model offers a new strategy for protecting private data: it is a rigorous model that effectively resists attacks based on background knowledge, and it has therefore attracted the attention of academia. How to improve the efficiency of mining algorithms and obtain a highly usable result set under differential privacy protection has become the focus of research in this field.

This thesis studies the efficiency of frequent itemset mining algorithms under differential privacy. Through in-depth analysis of the factors that restrict the efficiency of differentially private algorithms, improved algorithms are proposed and studied. The main results are as follows:

1) The DP-topkP (Differentially Private top-k Pattern Mining) algorithm takes a long time on databases containing a large number of long transactions when the minimum threshold becomes smaller or the transaction data set keeps growing. We therefore propose an improved, more efficient algorithm, DP-OPtopkP (Differentially Private Optimal top-k Pattern Mining). First, the new algorithm uses a length selection mechanism to preprocess the transaction database. Second, the set of candidate frequent itemsets obtained by the FP-Growth algorithm is reduced in size by using closed frequent itemsets. Experimental results show that DP-OPtopkP improves efficiency and retains good usability.

2) On large-scale data sets, the FP-tree built by DP-OPtopkP may not fit in memory, which causes the overall efficiency of the algorithm to decline rapidly. A parallel version of DP-OPtopkP is therefore proposed. The main idea of the scheme is to process the data in synchronized batches. First, the truncated data set is partitioned according to the established requirements; then the FP-Growth algorithm is run separately on each partition; next, the set of candidate frequent itemsets is partitioned and the closed frequent itemset algorithm is run separately on each partition; finally, the result set is computed. Experimental results on large-scale data sets show that the parallelized DP-OPtopkP algorithm has clear advantages.
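
The pipeline described in result 1) (limit transaction length, enumerate candidate itemsets, keep only closed itemsets, then release a noisy top-k) can be illustrated with a minimal Python sketch. This is not the thesis's DP-OPtopkP implementation: every function name and parameter below is hypothetical, brute-force counting stands in for FP-Growth, truncation simply keeps the first items in sorted order, and the Laplace noise scale is an assumed calibration. Because the candidate set here still depends on the data, the sketch should not be read as carrying a proven privacy guarantee.

```python
import math
import random
from collections import Counter
from itertools import combinations


def truncate(transactions, max_len):
    """Limit each transaction to max_len distinct items.

    Assumption: keep the first items in sorted order; the thesis's
    length selection mechanism is not specified in the abstract.
    """
    return [sorted(set(t))[:max_len] for t in transactions]


def itemset_supports(transactions, max_size):
    """Count the support of every itemset with up to max_size items.

    Brute-force enumeration, used here only as a stand-in for FP-Growth.
    """
    counts = Counter()
    for t in transactions:
        for size in range(1, max_size + 1):
            for combo in combinations(t, size):
                counts[frozenset(combo)] += 1
    return counts


def closed_itemsets(counts):
    """Keep itemsets that have no proper superset with equal support.

    Closure is judged only among the enumerated itemsets.
    """
    return {
        s: c
        for s, c in counts.items()
        if not any(s < other and c == oc for other, oc in counts.items())
    }


def noisy_top_k(transactions, k, max_len, max_size, epsilon, seed=None):
    """Return the k closed itemsets with the largest noisy supports."""
    rng = random.Random(seed)
    closed = closed_itemsets(
        itemset_supports(truncate(transactions, max_len), max_size)
    )
    # Assumed calibration: one truncated transaction contributes to at most
    # this many itemset counts, each by 1.
    sensitivity = sum(math.comb(max_len, i) for i in range(1, max_size + 1))
    scale = sensitivity / epsilon
    # The difference of two i.i.d. exponentials with mean `scale`
    # is Laplace-distributed with that scale.
    noisy = {
        s: c + rng.expovariate(1 / scale) - rng.expovariate(1 / scale)
        for s, c in closed.items()
    }
    return sorted(noisy.items(), key=lambda kv: kv[1], reverse=True)[:k]


if __name__ == "__main__":
    db = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["a"]]
    for itemset, score in noisy_top_k(db, k=2, max_len=2, max_size=2,
                                      epsilon=1.0, seed=0):
        print(sorted(itemset), round(score, 2))
```

Truncation bounds how many itemset counts a single transaction can affect, which keeps the assumed noise scale small, and restricting the candidates to closed itemsets shrinks the set that has to be scored. Both effects mirror the motivation given in the abstract, under the simplifications stated above.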

Keywords/Search Tags:

Differential privacy, Data mining, DP-topkP

Related items

1	Research On Data Publishing And Mining Method Based On Differential Privacy
2	Research On Key Technologies Of Privacy Preserving Data Mining Based On Local Differential Privacy
3	Real-time Data Privacy Protection With Adaptive ω-event Differential Privacy
4	Research And Application Of Association Rule Mining Algorithm Based On Differential Privacy Protection
5	Statistical Data Analyses Based On Local Differential Privacy
6	Research Of Frequent Itemsets Mining Algorithm With Differential Privacy For Large-scale Data
7	Research On Frequency Estimation And Frequent Itemset Mining For Local Differential Privacy Protection
8	Research On Improvement Of K-means Clustering Algorithm Based On Differential Privacy
9	Research On Data Release And Mining Of Social Network Based On Differential Privacy
10	Research On Differential Privacy Protection Based On Classified Data