Research On Ensemble Feature Selection And Construction Of Gene Regulatory Networks

Posted on: 2017-01-27
Degree: Master
Type: Thesis
Country: China
Candidate: G B Zhou
GTID: 2310330488458750
Subject: Computer application technology

Abstract/Summary:
With the rapid development of biological information technology, the emergence of massive genomic data has brought us into the post-genome era. Researchers can now study the complex biological processes that govern a living organism from a systems perspective, rather than being limited to the function of a single gene. Against this background, systems biology has developed rapidly, and the proper elucidation of gene regulatory networks has become one of its major challenges. Gene regulatory networks describe the interactions between genes in the form of a graph. Reverse engineering these networks helps us understand the molecular mechanisms by which living cells maintain stability under fluctuating environmental conditions. With the development of DNA microarray technology, the substantial accumulation of gene expression data has given rise to many computational methods for constructing gene regulatory networks. In addition, gene sequence data and functional annotation data are constantly emerging. Because different types of data often provide different information, making effective use of the complementary relations between data sources is very important for constructing gene regulatory networks accurately.

Constructing gene regulatory networks from gene expression data with feature selection methods has a shortcoming: these methods only give an importance score for each potential edge in the network, and do not determine a suitable cut-off value for transforming the resulting ranking into an actual network structure. In this work, we present EFI-GA, a method for constructing gene regulatory networks from gene expression data that combines an ensemble feature importance algorithm with a genetic algorithm.
In this method, an importance score for each potential regulator is computed with respect to the target gene using the ensemble feature importance algorithm; a high score represents high confidence that there is a true regulatory link between the two genes. Next, a genetic algorithm selects an optimal subset of high-confidence regulators. Experimental results on the DREAM4 network inference challenge datasets demonstrate the effectiveness of the method.

In response to external environmental stimuli, or to complete a biological process, transcription factors regulate the expression of target genes involved in the same process, so transcription factors and their targets often share the same or similar functions. Taking this functional relationship into account can therefore improve the accuracy of network construction. We propose a method that combines gene expression data, gene sequence data and Gene Ontology data, taking advantage of the relevant features of each data source. Feature vectors are built from the multiple data sources, and a support vector machine is trained as a classification model to predict interactions between transcription factors and target genes. Experimental results on an Arabidopsis thaliana dataset and a tomato dataset show that the method achieves higher accuracy.
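
To illustrate the two-stage idea behind the first method (score every candidate regulator, then let a genetic algorithm choose a subset instead of fixing a cut-off by hand), here is a minimal sketch in plain Python. It is not the thesis's implementation: the abstract does not say which importance measures make up the ensemble or how fitness is defined, so the choices below are assumptions made purely for illustration. The function names `ensemble_importance` and `genetic_select`, the use of absolute Pearson and Spearman correlation as the ensemble members, and the penalty-based fitness are all hypothetical.

```python
import random


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5 if vx and vy else 0.0


def ranks(values):
    """Zero-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        for k in range(pos, end + 1):
            result[order[k]] = (pos + end) / 2
        pos = end + 1
    return result


def ensemble_importance(regulators, target):
    """ASSUMED ensemble: mean of |Pearson| and |Spearman| per regulator."""
    target_ranks = ranks(target)
    return [
        (abs(pearson(r, target)) + abs(pearson(ranks(r), target_ranks))) / 2
        for r in regulators
    ]


def genetic_select(scores, penalty=0.3, pop_size=30, generations=40,
                   mutation_rate=0.05, seed=0):
    """Pick a subset of regulators with a simple genetic algorithm.

    ASSUMED fitness: each selected regulator contributes (score - penalty),
    so low-scoring regulators reduce fitness. Returns the selected indices.
    """
    rng = random.Random(seed)
    n = len(scores)

    def fitness(mask):
        return sum(s - penalty for s, bit in zip(scores, mask) if bit)

    # Each individual is a boolean mask over the candidate regulators.
    population = [[rng.random() < 0.5 for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]  # truncation selection keeps the best half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n) if n > 1 else 0  # one-point crossover
            child = a[:cut] + b[cut:]
            # Flip each bit with a small probability (mutation).
            child = [(not bit) if rng.random() < mutation_rate else bit for bit in child]
            children.append(child)
        population = parents + children
    best = max(population, key=fitness)
    return [i for i, bit in enumerate(best) if bit]
```

For one target gene, a caller would pass the expression profiles of the candidate regulators and of the target to `ensemble_importance`, then pass the resulting scores to `genetic_select`. With the additive fitness used here, the optimum is the same as thresholding at `penalty`; a genetic algorithm becomes worthwhile when the fitness is not additive, for example a cross-validated prediction error of the target gene given the selected regulators.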
Keywords/Search Tags: Multiple Data Integration, Gene Regulatory Networks, Ensemble Feature Selection, Gene Ontology