Font Size: a A A

Subgraph Size Related Frequent Approximate Subgraph Mining In Single Graph

Posted on:2020-10-28Degree:MasterType:Thesis
Country:ChinaCandidate:J K DouFull Text:PDF
GTID:2370330596468147Subject:Computer Science and Technology
Abstract/Summary:PDF Full Text Request
Graph data plays an important role in the era of Big Data,and it has a wide range of applications in various scenarios,such as social networks,protein interaction networks,partnership networks,etc.The problem studied by this paper is subgraph mining,and the purpose is to achieve frequent approximate subgraph mining.There were many researches in the area of frequent subgraph mining.However,existing researches either does not consider the approximation between subgraphs and their occurrence,or ignore the effect of subgraph size on similarity.However,according to human perception,graph with different size should have different degree of fault tolerance.Similarly,the size of the subgraphs should have a significant impact on the degree of the approximation between the subgraphs and their occurrences.Therefore,this paper designs a strategy to mine frequent approximate subgraphs and consider the effect of the size of subgraphs,and a fast algorithm to mine frequent approximate subgraphs.The algorithm not only considers the approximate occurrence when calculating the support of the subgraph,but also considers the influence of the size of the subgraph on the degree of similarity between subgraphs and their occurrences.First of all,for the generation of candidate frequent approximate subgraphs,this paper proposes an algorithm which all subgraphs of a given single graph are traversed by a traversal method that does not miss any subgraphs.In order to improve the efficiency of traversal and reduce the number of candidate frequent approximate subgraphs,we estimate the upper bound of the size of frequent approximate subgraphs,and pruning the traversal process to filter out some subgraph as early as possible.Experiments prove that the upper bound can improve the efficiency of the generation process.Secondly,in order to improve the efficiency of the algorithm,it is concluded that the supports of the frequent approximate subgraphs are consistent with the property called “Local Anti-monotonicity”.And a strategy is designed to prune candidate frequent approximate subgraphs before the generation of their approximate subgraphs.Experiments show that this property is very important to improve the efficiency of the algorithm.Thirdly,for the approximate subgraph generation of candidate frequent approximate subgraphs,we design an algorithm based on vertex and edge cut.And it is proved that the support of the candidate frequent approximate subgraph acquired by counting only the occurrences of the subgraphs that generated by this algorithm,or by counting all occurrences of all approximate subgraphs are the same.Next,by comparing with the existing algorithm,the algorithm in this paper has obvious advantages in efficiency.Meanwhile,a simple example shows that this algorithm can find frequent subgraphs that cannot be found by traditional algorithm,which further indicates that it is important to consider approximation and the effect of the size of subgraph during the process of mining frequent subgraphs.Finally,we adapt our algorithm to mine frequent closed approximate subgraphs and frequent maximal approximate subgraphs,this improves the integrity of this paper,extends the application scenario of the algorithm in this paper.Experiment results prove the effectiveness of these two algorithms.
Keywords/Search Tags:Graph Data Mining, Frequent Subgraph Mining, Approximation, Pruning, Closed Approximate Subgraph, Maximal Approximate Subgraph
PDF Full Text Request
Related items