Algorithms For Mining Frequent Subgraphs And Its Applications In Biological Network

Posted on:2010-08-07

Degree:Master

Type:Thesis

Country:China

Candidate:A G Dong

Full Text:PDF

GTID:2178360278955036

Subject:Traffic Information Engineering & Control

Abstract/Summary:

PDF Full Text Request

The major completion of human genome plan indicate that the age of post-genome is finally coming. The abundant bio-information data people accumulated in recent years provide the basic reference for revealing the mystery of lives, and the focus of research in biology have changed from the partial research of individual genes or proteins in cells to the system research based on the entire genes and proteins as well as the metabolic products. The research on the structure and the detection technique on gene regulatory networks, protein- protein interaction networks and metabolic networks have promoted the molecule-biology age to the system-biology age stage by stage. Genes and proteins can generate superiority functional modules by netty interactions, therefore, using mathematic modeling to design efficient algorithms for mining and analyzing the modules is help to study the functions of the organism itself and the evolvement relations between different species, and could provide warrants for analyzing and understanding the basic disciplines of lives. In this paper, some works are done in the frequent subgraph mining.Some classical graph-based algorithms for mining frequent subgraphs are systematically studied and comprehensively summarized in this paper. based on this, a novel algorithm for mining frequent subgraphs is proposed, which consists of subgraph searching and graph isomorphism testing. For subgraph searching, a definition of ring destribution is introduced, and an algorithm based on ring distribution - ESR(Enumerate Subgraphs based on Ring) is presented. For graph isomorphism, by using degree sequences and eigenvalues of a graph, two algorithms are proposed for directed graphs and undirected ones, respectively. Graph isomorphism algorithm is used to categorize the searched subgraphs, and frequent subgraphs are obtained according to the result of the categorizing. When the size of a network is large, there are a large number of subgraphs, which leads to high computational effort. As a result, we put forward a random categorizing algoritm and a Hamilton subgraph mining algorithm. The former selects at random a certain quantity of subgraphs for isomorphism testing, which is an approximate algorithm. And the latter aims at finding only certain type of subgraphs, such as subgraphs with Hamilton cycle, which reduces the number of searched subgraphs. Experimental evaluation on five real biological networks shows that our algorithm is superior to the algorithms available.

Keywords/Search Tags:

data mining, frequent subgraph, graph isomorphism, motif, algorithm, Hamilton subgraph

PDF Full Text Request

Related items

1	Graph Isomorphism Algorithm Based On Feature Compression And Application In Network Motif Discovery
2	Research On Exact Subgraph Search Technology In Graph Database
3	Frequent Subgraph Mining Algorithm On Historical Graph Data
4	Research On The Algorithm For Mining Structured Data
5	Research On Distributed Subgraph Matching Algorithm For Large Scale Graph Data
6	Research Of Subgraph Isomorphism Search Algorithm On Graph Data
7	Research Of Query Technologies On Graph Data
8	Experimental Study On Extended Subgraph Isomorphism Problem
9	Research And Application Of Mining Graph Structure Algorithm
10	Design And Implementation Of Efficient Subgraph Matching Algorithm On Large