Font Size: a A A

Frequent Subgraph Mining Based On Partially Labeled Graph

Posted on:2010-02-18Degree:MasterType:Thesis
Country:ChinaCandidate:Q YaoFull Text:PDF
GTID:2178360278960106Subject:Computer application technology
Abstract/Summary:PDF Full Text Request
Recently years, data mining has achieved great success both in science research and practical application field. According to the data mining technical development and the requirement of practical application, the data mining object have been expanded from traditional itemset into structural data, such as path, tree and graph etc, as a result, the graph mining becomes an active research topic in data mining nowadays.Similar to the frequent itemset is the groundwork in traditional itemset mining, frequent graph(i.e. the most frequent substructure in graph database) is the foundation of graph mining. But the complexity of graph mining is more huge than itemset mining,the reason is that, the key issue of frequent graph mining is isomorphism test and the graph isomorphism have been proved a NP-complete problem. With in-depth study, lots of efficient and scaleable algorithm about frequent graph mining has been porposed so for, such as gSpan, Gaston etc.But most of recent algorithm only apply to full labeled graph. To partially labeled graph, they are disabled or useable but no specific optimize process. So, in this paper, we proposed an frequent graph algorithm based on partially labeled graph: PLSG (Partially Labeled based Structural Graph Mining). Exact match and inexact match is proposed in PLSG algorithm, and based on the idea, firstly we preprocess the partially labeled graph ,in order to change partially labeled graph into full labeled graph, then,graph mining based on the full labeled graph. To achieve efficiency and scalability ,PLSG uses lots of optimization method, including to use canonical form to avoid subgraph somorphism, and to generate candidate subgraph through edgeâ†'pathâ†'treeâ†'graph to pruning as soon as possiable.we demonstrate with sufficient experiments: (1)frequent graph mining based on partialy labeled graph, no matter exact match or inexact match, PLSG is able to conduct efficient mining based on partialy labeled graph. (2)frequent graph mining based on full labeled graph, PLSG is more efficient than Gaston.
Keywords/Search Tags:Data Mining, Frequent Itemset, Frequent Subgraph, Structure Mining, Partially Labeled Graph
PDF Full Text Request
Related items