Research On Frequent Pattern Mining With Local Differential Privacy

Posted on:2022-02-13

Degree:Master

Type:Thesis

Country:China

Candidate:J L Wang

Full Text:PDF

GTID:2518306542963499

Subject:Computer Science and Technology

Abstract/Summary:

PDF Full Text Request

Over the past 30 years,data has increased on a large scale in various field.Such massive amounts of data may have potential correlations or patterns,which can be mined for more interesting knowledge.However,it can also pose a threat to privacy while analysts mine data without any restrictions,which will inevitably lead to serious privacy problems.Data aggregation with local differential privacy(LDP)guarantees strong and fine-grained privacy for users by providing plausible deniability of sensitive data that has a broad application prospect.This thesis focuses on the privacy-preserving mining frequent patterns task in the context of LDP,and the main work is as follows:(1)This thesis proposes a frequent pattern tree(FP-Tree)based approach to discover frequent itemsets in the set-valued data setting,which has good extensibility and availability.It is the first time to construct one noise FP-Tree in a breadth-first way while satifying LDP,and the designed pruning algorithm of tree nodes can effectively reduce the scale of injected noise.Then,two post-processing optimizations,named imposing consistency and weighted combination,are designed to identify consistent itemsets.The former exploits consistency among nodes so that they will not break constraints.And the latter increases effectiveness of all published results.Finally,the strong privacy and high usability of the proposed approach are demonstrated by the experimental analysis with the leading SVSM approach as well as the theoretical analysis and proof.(2)This thesis proposes an extended FP-Tree based top-k query approach in the context of LDP,which can be applied to multiset data scenarios to collect frequent high-utility items.Unlike the set-valued data scenarios,which implicitly assumes that each item has equal value or utility,multiset data marks the utility attributes of items and assumes that the utility varies between items.The proposed approach constructs the noise FP-Tree by using key-value data collection protocol,i.e.,PCKV,and optimizes it to solve the problem of private storage and query over multiset data.Finally,taking refined O-PCKV as the comparison approach,the performance differences of two approaches are compared in the real dataset experiments.

Keywords/Search Tags:

LDP, FP-Tree, frequent itemsets, multiset data, high-utility items

PDF Full Text Request

Related items

1	Research On Frequent And High Utility Itemsets Miningalgorithm With Multiple Minimum Utility Thresholds
2	The Study Of Mining Algorithm For High Utility And Frequent Itemsets Based On Multiple Minimum Support
3	FP-Tree Based Mining Frequent Itemsets Over Data Streams
4	Research On Count-based Algorithm For Mining Frequent Items Over Data Stream
5	Research On Frequent And High Utility Itemset Mining Algorithms In Association Rule Mining
6	Research On Data Items Frequent Itemsets Mining Algorithm Based On Sliding Window
7	The Research And Implementation Of Mining Frequent Itemsets Algorithm Over Streaming Data
8	Earch On Frequent And High Utility Itemset Mining Algorithms
9	High-utility Association Rule Mining
10	Research On Key Algorithms For Mining Frequent Patterns In Data Streams And Their Application In Simulation System