Research On The Key Technology Of Frequent Pattern Mining Based On Local Differential Privacy

Posted on: 2020-11-15 | Degree: Master | Type: Thesis
Country: China | Candidate: N Fu | Full Text: PDF
GTID: 2438330596471161 | Subject: Computer application technology
Abstract/Summary:
In recent years, with the rapid development of information technology, individuals and enterprises have continuously generated massive amounts of data. To provide users with better personalized services, organizations are increasingly keen to collect and analyze user data. However, such data often contains sensitive personal information, and releasing or analyzing it directly can leak user privacy and threaten user security. Common methods for protecting user privacy include anonymization, data encryption, and differential privacy, but these methods have difficulty defending against attackers with arbitrary background knowledge or against untrusted third-party collectors. The local differential privacy model emerged to address this problem: it ensures that neither the data collector nor other data analysts can accurately obtain each user's true sensitive information.

Frequent item mining based on local differential privacy has received extensive attention from researchers. Most existing mining techniques target simple single-valued data, whereas this thesis mines frequent items from key-value data under local differential privacy. Different methods are used depending on how sensitive the key and the value are. One of these methods uses partial perturbation and truncation to achieve higher accuracy. The experimental results show that this method outperforms competing methods in terms of distribution accuracy.

Frequent sequential pattern mining is the other main mining task in this thesis. For sequential data, the thesis proposes two mining methods, LDPFSM and ILDPFSM. LDPFSM uses a prefix tree to carry out the mining. To address the low accuracy and high total communication cost of LDPFSM, ILDPFSM applies user grouping and sampling. In the experiments, several existing data perturbation methods are used for comparison, and the results show that ILDPFSM outperforms the other methods in the utility of the mining results.
Keywords/Search Tags:local differential privacy, frequent items, frequent sequences, prefix tree, randomized response
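
The abstract does not describe the perturbation mechanisms in detail. As a rough illustration of the local differential privacy idea and of the "randomized response" keyword, the following minimal Python sketch shows classic binary randomized response with an unbiased frequency estimate. The function names, the single-bit encoding, and the privacy parameter `epsilon` are illustrative assumptions. This is not the thesis's key-value, LDPFSM, or ILDPFSM method.

```python
import math
import random


def keep_probability(epsilon):
    """Probability of reporting the true bit under binary randomized response."""
    return math.exp(epsilon) / (math.exp(epsilon) + 1.0)


def perturb_bit(bit, epsilon, rng):
    """Run on the user's side: report the true bit with probability p, else flip it."""
    p = keep_probability(epsilon)
    return bit if rng.random() < p else 1 - bit


def estimate_frequency(reports, epsilon):
    """Run on the collector's side: debias the observed fraction of 1-reports.

    E[observed] = f * p + (1 - f) * (1 - p), so f = (observed - (1 - p)) / (2p - 1).
    """
    p = keep_probability(epsilon)
    observed = sum(reports) / len(reports)
    return (observed - (1.0 - p)) / (2.0 * p - 1.0)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed only to make the demo repeatable
    # Hypothetical population: 30% of users hold the item (bit = 1).
    true_bits = [1] * 3000 + [0] * 7000
    reports = [perturb_bit(b, 1.0, rng) for b in true_bits]
    print("estimated frequency:", estimate_frequency(reports, 1.0))
```

The collector only ever sees the perturbed reports, so no individual's true bit is revealed with certainty, yet the aggregate frequency can still be estimated. A smaller `epsilon` gives stronger privacy at the cost of a noisier estimate.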