Font Size: a A A

An Efficient Algorithm Of Mining Frequent Subgraph Patterns In Uncertain Graph Database

Posted on:2014-07-09Degree:MasterType:Thesis
Country:ChinaCandidate:W L WangFull Text:PDF
GTID:2268330422450459Subject:Computer Science and Technology
Abstract/Summary:PDF Full Text Request
There are more and more application domains that start using graphs which arecalled graph data to represent and store relationships between data objects with therapid development of the information technology. It is a very important researchtopic and has a great value in many applications and domains mining the graph datato get the structural features,forming principles and existential patterns of therelationships between data objects,since there are large amout of userful knowledgeand the information that we are interested in in the graph data。In practical applications of many domains, the data we got are withuncertainties naturally itself due to some reasons,like the imprecision of the dataand the objective limits of data obtaining techniques,etc. The graph data withuncertainties is called uncertain graph data. For example, the data of theprotein-protein interactions network in bioinformatics and the data of sensornetworks,etc.Mining frequent subgraph patterns in graph dataset is a very import kind ofgraph mining with many approaches and great interest which can obtain muchuseful information from graph data as in protein-protein interactions network inbioinformatics and sensor networks. So there is a growing interest in miningfrequent subgraph patterns in graph dataset which has two main difficulties to dealwith. The first one is there are huge numbers of candidate patterns to be examined,and there are huge numbers of subgraph isomorphism tests required to find thegraphs that contain a given pattern. The latter becomes even more challenging,when dealing with uncertain graphs.Focusing on the objectives of minimizing the time and memory cost of thealgorithm of mining frequent subgraph patterns in uncertain graph database,we doa study of mining frequent subgraph patterns in uncertain graph database underexpected semantics with the theories,methodologies and techniques of data mining,probability and algorithms. The main contribution we give is proposeing analgorithm we called MUSIC(Mining Uncertain Subgraph Patterns With Index ofConnectivity and Edge),which uses an index of the uncertain graph database toreduce the number of comparisons needed to get the frequent subgraph patterns weneed. MUSIC relies on the apriori property for enumerating candidate subgraphpatterns efficiently. Furthermore, the index can be used to reduce the number ofcomparisons required for computing the expected support of each candidate pattern. We will also show further increase the efficiency of the algorithm with additionaloptimizations with respect to scheduling and early termination and the evaluation ofour approach on three real datasets as well as on synthetic uncertain graphdatabases to demonstrate the significant cost savings with respect to thestate-of-the-art approach.
Keywords/Search Tags:Uncertain Graph, Frequent Subgraph Pattern, Expected Support, Uncertain Graph Index
PDF Full Text Request
Related items