Font Size: a A A

Research On Differentially Private Frequent Subgraph Mining

Posted on:2018-09-27Degree:MasterType:Thesis
Country:ChinaCandidate:K XiaoFull Text:PDF
GTID:2348330518497012Subject:Computer Science and Technology
Abstract/Summary:PDF Full Text Request
Data mining becomes a powerful tool for researchers to extract knowledge from the big data in these years. Frequent pattern mining is one to mine association rules among transactions in data mining. As a part of it, frequent subgragh mining aims at getting frequent subgraphs from a collection of input graphs, which is important in graph analysis.However, if the input graph contains sensitive information, report or share the frequent subgraph may pose the individual's information unexpectedly. To address this problem, we propose a differentially private subgraph mining algorithm named DFG.DFG utilizes a two-phase model. In the first part of DFG, we get the frequent subgraphs in an increasing order according to the number of edge in a graph. In each step, we privately get the number of frequent subgraph by using binary estimate algorithm, and select the frequent subgraph with conditional exponential mechanism; In the second phase,we build a lattice according to the relationship between each pair of frequent subgraphs. Then based on noisy-aware path construction method,we construct multiple paths which covers the entire lattice. With the count accumulation method, we get the noisy support to each frequent subgraph in each path. We optimized the noisy of each frequent subgraph which is contained in multiple paths using noisy merge mechanism. We prove that our DFG algorithm satisfies ?-differential privacy. Further experiment shows that DFG attains higher data utility than the state-of-the-art algorithm while get frequent subgraphs privately. Moreover, we extend our DFG algorithm to differentially private frequent pattern mining. We proposed DFISL algorithm by applying DFG to frequent itemset mining.Instead of path construction, we propose sub lattice construction method and counting noisy based on multiple sub lattices, which decrease the amount of noisy added on each frequent itemset effectively. The experiment result shows our DFISL algorithm performs better than PFP-growth and DiffFIM algorithm on data utility.
Keywords/Search Tags:data mining, differential privacy, frequent subgraph mining, two-phase, lattice
PDF Full Text Request
Related items