Research On Ensemble Feature Selection And Construction Of Gene Regulatory Networks

Posted on:2017-01-27

Degree:Master

Type:Thesis

Country:China

Candidate:G B Zhou

Full Text:PDF

GTID:2310330488458750

Subject:Computer application technology

Abstract/Summary:

With the rapid development of biological information technology, the emergence of massive genomic data has brought research into the post-genome era. Researchers can now study the complex biological processes that govern a living organism from a systems perspective, rather than being limited to the function of a single gene. Against this background, systems biology has developed rapidly, and the proper elucidation of gene regulatory networks has become one of its major challenges. Gene regulatory networks describe the interactions between genes in graphical form. Reverse engineering these networks helps us understand the molecular mechanisms by which living cells maintain stability under fluctuating environmental conditions. With the development of DNA microarray technology, the substantial accumulation of gene expression data has given rise to many computational methods for constructing gene regulatory networks. In addition, gene sequence data and functional annotation data are constantly emerging. Because different types of data often provide different information, making effective use of the complementary relations between data sources is very important for constructing gene regulatory networks accurately.
Feature selection methods that construct gene regulatory networks from gene expression data share a shortcoming: they only assign an importance score to each potential edge in the network and do not determine a suitable cut-off value for transforming the resulting ranking into an actual network structure. In this work, we present EFI-GA, a method for constructing gene regulatory networks from gene expression data that combines an ensemble feature importance algorithm with a genetic algorithm. In this method, an importance score for each potential regulator is computed with respect to the target gene using the ensemble feature importance algorithm; a high importance score represents confidence that there is a true regulatory link between the two genes. Next, a genetic algorithm selects an optimal subset of regulators with a high degree of confidence. Experimental results on the DREAM4 network inference challenge datasets demonstrate the effectiveness of the method.
When responding to external environmental stimuli or carrying out a life process, transcription factors regulate the expression of target genes involved in the same process, so transcription factors and their target genes often share the same or similar functions. Taking the functional relationship between transcription factors and their target genes into account therefore improves the accuracy of gene regulatory network construction. We propose a method for constructing gene regulatory networks that combines gene expression data, gene sequence data and Gene Ontology data, exploiting the relevant features of each data source to improve accuracy. Feature vectors are built from the multiple data sources, and a support vector machine classification model is trained to predict the interactions between transcription factors and target genes. Experimental results on an Arabidopsis thaliana dataset and a tomato dataset show that the method achieves higher accuracy.
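
The two-stage idea behind EFI-GA (score every candidate regulator of a target gene, then let a genetic algorithm pick a subset instead of applying a hand-chosen cut-off) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the thesis: the two base scorers (absolute Pearson and absolute Spearman correlation), the averaging rule, the fitness function with a fixed per-edge `penalty`, and the function names `ensemble_importance` and `select_regulators`.

```python
import random
from statistics import mean, pstdev


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    sx, sy = pstdev(x), pstdev(y)
    if sx == 0 or sy == 0:
        return 0.0
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) * sx * sy)


def ranks(values):
    """Rank positions 0..n-1; ties are broken by order of appearance (a simplification)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    for rank, i in enumerate(order):
        r[i] = float(rank)
    return r


def ensemble_importance(regulators, target):
    """Assumed ensemble: average of |Pearson| and |Spearman| for each candidate regulator."""
    t_rank = ranks(target)
    scores = []
    for expr in regulators:
        p = abs(pearson(expr, target))
        s = abs(pearson(ranks(expr), t_rank))
        scores.append((p + s) / 2)
    return scores


def fitness(mask, scores, penalty):
    """Assumed fitness: reward confident regulators, charge a fixed cost per selected edge."""
    return sum(s - penalty for s, bit in zip(scores, mask) if bit)


def select_regulators(scores, penalty=0.5, pop_size=30, generations=40, seed=0):
    """Small elitist genetic algorithm over bit masks of candidate regulators."""
    rng = random.Random(seed)
    n = len(scores)
    pop = [[rng.random() < 0.5 for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda m: fitness(m, scores, penalty), reverse=True)
        parents = pop[: pop_size // 2]          # truncation selection keeps the best half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n) if n > 1 else 0
            child = a[:cut] + b[cut:]           # one-point crossover
            j = rng.randrange(n)
            child[j] = not child[j]             # single-bit mutation
            children.append(child)
        pop = parents + children
    best = max(pop, key=lambda m: fitness(m, scores, penalty))
    return [i for i, bit in enumerate(best) if bit]


if __name__ == "__main__":
    # Toy expression profiles: rows are candidate regulators, columns are samples.
    regulators = [[2, 4, 6, 8, 10, 12], [6, 5, 4, 3, 2, 1], [3, 1, 2, 2, 1, 3]]
    target = [1, 2, 3, 4, 5, 6]
    scores = ensemble_importance(regulators, target)
    print(scores, select_regulators(scores))
```

With this separable toy fitness the genetic algorithm simply recovers the regulators whose score exceeds `penalty`; the search becomes meaningful only when the fitness depends on the subset as a whole, for example through the prediction error of a model fitted on the selected regulators.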

Keywords/Search Tags:

Multiple Data Integration, Gene Regulatory Networks, Ensemble Feature Selection, Gene Ontology

Related items

1	Researches On Gene Regulatory Network Reconstruction Based On The Feature Selection And Topology Analysis
2	Construction Of Gene Regulatory Networks Based On Data Integration
3	Research On 2D Spatial Gene Selection Algorithm Based On Unbalanced Gene Data
4	Research On The Construction Method Of Gene Regulatory Network Based On Feature Selection
5	Researches On Optimized Characteristic Gene Selection Based On Neighborhood Mutual Information
6	Inferring gene regulatory networks from expression data using ensemble methods
7	Parallel Feature Selection And Ensemble Classification For Gene Expression Data
8	Constructing Gene Regulatory Network By Integrating Diverse Genomic Data
9	NF-κB Target Genes Spectrum And Study Of Gene Regulatory Networks
10	Merging Multiple Microarray Datasets To Build Gene Regulatory Networks