Research on Reconstructing Gene Regulatory Networks Based on a Recurrent Neural Network Model

Posted on: 2011-10-24
Degree: Master
Type: Thesis
Country: China
Candidate: L Y Su
Full Text: PDF
GTID: 2120360305454887
Subject: Computer application technology
Abstract/Summary:
In-depth studies in molecular biology reveal that complex biological phenomena result from regulatory interactions among genes, but the mechanisms of gene regulation are still not fully understood. To understand the internal models and mechanisms of biological systems, building gene regulatory networks from gene expression data is an important part of functional genomics research. However, because current gene expression data are imperfect, successfully constructing a gene regulatory network model from a small amount of expression data is a severe test for any regulatory model.

This paper explains the principles of building gene regulatory networks, and describes and analyzes several models for constructing them. We find that these models fail to uncover biological meaning because they cannot make effective use of time-series gene expression data.

To solve this problem, we propose a recurrent neural network model for constructing gene regulatory networks. Recurrent neural networks have been widely used to model time-series data because of their flexible nonlinear modeling capability. This data-driven modeling approach, which can reproduce gene expression data, is applied here to gene expression data. Its recurrent structure not only makes good use of time-series gene expression data, but also reflects the dynamic nature of gene regulation, which is consistent with the underlying biology. In theory, a recurrent neural network can construct a biologically meaningful gene regulatory network from microarray data.

One of the major problems in genetic network inference is the curse of dimensionality: the data cover thousands of genes but only a limited number of time points. This situation limits the application of many data-driven computational models and makes it very difficult to infer a fully determined large-scale regulatory network.
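As a rough illustration of the kind of recurrent model referred to above, the following Python sketch uses one commonly used discrete-time form, e_i(t+dt) = e_i(t) + (dt/tau_i) * (f(sum_j w_ij * e_j(t) + beta_i) - lambda_i * e_i(t)), where w_ij is the weight of gene j's influence on gene i. The abstract does not give the model equations, so this form, the sigmoid f, the function names and all parameter values are assumptions made for illustration only.

```python
import math

def sigmoid(x):
    """Logistic squashing function (assumed choice of nonlinearity)."""
    return 1.0 / (1.0 + math.exp(-x))

def rnn_step(e, w, beta, tau, lam, dt):
    """Advance the expression levels e by one time step of length dt.

    w[i][j] is the (assumed) regulatory weight of gene j on gene i:
    positive = activation, negative = repression, zero = no interaction.
    """
    n = len(e)
    nxt = []
    for i in range(n):
        drive = sum(w[i][j] * e[j] for j in range(n)) + beta[i]
        nxt.append(e[i] + dt / tau[i] * (sigmoid(drive) - lam[i] * e[i]))
    return nxt

def simulate(e0, w, beta, tau, lam, dt, steps):
    """Return the trajectory [e(0), e(dt), ..., e(steps * dt)]."""
    traj = [list(e0)]
    for _ in range(steps):
        traj.append(rnn_step(traj[-1], w, beta, tau, lam, dt))
    return traj

# Hypothetical 2-gene network: gene 1 represses gene 0, gene 0 activates gene 1.
w = [[0.0, -2.0],
     [3.0, 0.0]]
traj = simulate([0.5, 0.5], w, beta=[0.0, 0.0], tau=[1.0, 1.0],
                lam=[1.0, 1.0], dt=0.1, steps=50)
```

In this form, inferring a network amounts to finding the entries of w (and the other parameters) that make the simulated trajectory match the measured time series, and a zero entry in w means that no regulatory link is assumed between the two genes.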
Theoretically, it is feasible to find the connection weights of the recurrent network from gene expression data. However, it is very difficult to find the exact weights from the small amount of gene expression data currently available. Biological knowledge of genetic regulatory networks suggests that a gene is regulated by only a limited number of genes. In other words, regulatory networks are sparsely connected rather than fully connected, and most weight values are zero. It is therefore reasonable to identify from the data only those weights whose values are nonzero.

We propose a two-step procedure for genetic regulatory network inference. The goal of the first step is to train the structure of the network, that is, to determine which weights are nonsignificant and clamp them to zero. Using the result of the first step, the nonzero weights can then be fine-tuned. This procedure provides a way to identify and understand the regulatory mechanism in a genetic network.

We use the simulated annealing (SA) approach, a heuristic random search process based on iterative Monte Carlo methods, to search the space of network structures. Simulated annealing has been proven theoretically to converge to the global optimum with probability 1, given a suitably slow cooling schedule. In this paper, the entries 1 and 0 of a binary matrix indicate whether or not the corresponding interaction (weight) is present.

We use particle swarm optimization (PSO) to train the parameters of the network. PSO has been widely used because of its fast convergence and simplicity. Because PSO frequently falls into local optima, we improve it using the antibody concentration regulation theory and the vaccine inoculation principle from immunology. Firstly, we select the next generation of particles (antibodies) based on the antibody concentration regulation theory: a particle's selection probability is inversely proportional to its concentration, which introduces diversity and improves the algorithm's global search capability, keeping it from becoming trapped in local solutions.
Secondly, we preserve appropriate structures based on the vaccine inoculation principle to guide the search, which improves optimization performance. Experiments on simulated data demonstrate that the improved particle swarm optimization algorithm is successful and clearly enhances the method's global search capability.

Because of current experimental conditions, microarray data have several defects, such as noise, missing values, long time intervals and a limited number of time points. In addition, it is very difficult to verify the correctness of the results against current biological knowledge. We therefore use artificially simulated data to verify the effectiveness of the algorithm. First, we apply the algorithm to a simplified synthetic genetic network with four genes. The results show that the time-series expression data are fitted well, which indicates that the algorithm is effective. The advantage of the method is also shown by comparison with other construction methods. We then employ IPSO/RNN, i.e. the improved PSO combined with the recurrent neural network model, to analyze the SOS DNA repair network in the bacterium Escherichia coli. DNA damage can induce the release of intracellular viruses, strengthen viral transformation, and enhance chromosomal recombination and the formation of cell plasminogen activator, etc., which may be directly related to oncogene activation and tumor formation. Therefore, the study of the SOS DNA repair system has important biological significance. By comparing the inferred network with the real biological network, we conclude that the proposed method can identify the gene regulatory network to a certain extent.
The method is effective and useful on real biological data and can provide some help to biologists.

The content of this paper includes an introduction to the relevant biological knowledge, data preprocessing, the principles and methods of gene regulatory network construction, recurrent neural network model theory, particle swarm optimization theory and the principles of the simulated annealing algorithm. It focuses on how to reconstruct gene regulatory networks based on the recurrent neural network model. We propose a two-step procedure for genetic regulatory network inference. First, we use the simulated annealing algorithm to search the network structure space and find the meaningful weights that indicate regulatory relations. Second, we adopt an improved particle swarm optimization algorithm based on immune principles to determine the network parameters. Our approach has been applied to both artificial data sets and data sets from the deoxyribonucleic acid (DNA) repair system of Escherichia coli. The results demonstrate that the method is effective for reconstructing gene regulatory networks.
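The two-step procedure summarized above can be sketched in Python on a toy problem. This is only a minimal illustration under stated assumptions, not the thesis implementation: the recurrent model form, the structure cost (fit error of a hypothetical rough weight estimate `w_rough` plus a sparsity penalty), the cooling schedule and all constants are assumed, and step 2 uses a plain global-best PSO without the immune-based improvements described in the abstract.

```python
import math
import random

def simulate(w, e0, steps, dt=0.1):
    """Assumed recurrent model: e_i <- e_i + dt * (sigmoid(sum_j w_ij e_j) - e_i)."""
    traj = [list(e0)]
    for _ in range(steps):
        e = traj[-1]
        drive = [sum(wij * ej for wij, ej in zip(row, e)) for row in w]
        traj.append([ei + dt * (1.0 / (1.0 + math.exp(-d)) - ei)
                     for ei, d in zip(e, drive)])
    return traj

def mse(w, target):
    """Mean squared error between the simulated and the target time series."""
    traj = simulate(w, target[0], len(target) - 1)
    errs = [(a - b) ** 2 for ra, rb in zip(traj, target) for a, b in zip(ra, rb)]
    return sum(errs) / len(errs)

def anneal_structure(cost, n, rng, iters=500, t0=1.0, cooling=0.99):
    """Step 1: simulated annealing over n x n binary masks (1 = link present)."""
    mask = [[rng.randint(0, 1) for _ in range(n)] for _ in range(n)]
    cur = cost(mask)
    best, best_cost, t = [row[:] for row in mask], cur, t0
    for _ in range(iters):
        i, j = rng.randrange(n), rng.randrange(n)
        mask[i][j] ^= 1                       # propose: flip one entry
        new = cost(mask)
        if new <= cur or rng.random() < math.exp((cur - new) / t):
            cur = new                         # accept (Metropolis rule)
            if cur < best_cost:
                best, best_cost = [row[:] for row in mask], cur
        else:
            mask[i][j] ^= 1                   # reject: undo the flip
        t *= cooling                          # geometric cooling (assumed)
    return best, best_cost

def pso(cost, dim, rng, particles=15, iters=100, lo=-5.0, hi=5.0):
    """Step 2: plain global-best PSO over the dim weights kept by step 1."""
    xs = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(particles)]
    vs = [[0.0] * dim for _ in range(particles)]
    pbest, pcost = [x[:] for x in xs], [cost(x) for x in xs]
    g = min(range(particles), key=pcost.__getitem__)
    gbest, gcost = pbest[g][:], pcost[g]
    for _ in range(iters):
        for k in range(particles):
            for d in range(dim):
                vs[k][d] = (0.7 * vs[k][d]
                            + 1.5 * rng.random() * (pbest[k][d] - xs[k][d])
                            + 1.5 * rng.random() * (gbest[d] - xs[k][d]))
                xs[k][d] = min(hi, max(lo, xs[k][d] + vs[k][d]))
            c = cost(xs[k])
            if c < pcost[k]:
                pbest[k], pcost[k] = xs[k][:], c
                if c < gcost:
                    gbest, gcost = xs[k][:], c
    return gbest, gcost

# Toy data: a hypothetical sparse 3-gene network generates the target series.
rng = random.Random(1)
w_true = [[0.0, -3.0, 0.0], [2.5, 0.0, 0.0], [0.0, 4.0, -2.0]]
target = simulate(w_true, [0.2, 0.8, 0.5], steps=20)
# Hypothetical rough dense estimate: true weights plus small spurious links.
w_rough = [[wij + rng.uniform(-0.3, 0.3) for wij in row] for row in w_true]

def structure_cost(mask):
    masked = [[m * wij for m, wij in zip(mr, wr)] for mr, wr in zip(mask, w_rough)]
    return mse(masked, target) + 1e-3 * sum(map(sum, mask))  # fit + sparsity

mask, s_cost = anneal_structure(structure_cost, n=3, rng=rng)
links = [(i, j) for i in range(3) for j in range(3) if mask[i][j]]

def weight_cost(x):
    w = [[0.0] * 3 for _ in range(3)]
    for (i, j), v in zip(links, x):
        w[i][j] = v                           # weights outside the mask stay 0
    return mse(w, target)

weights, w_cost = pso(weight_cost, dim=len(links), rng=rng)
```

Clamping the masked-out weights to zero before step 2 reduces the number of parameters PSO has to search, which is the point of separating structure search from parameter tuning when only a few time points are available.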
Keywords/Search Tags: gene regulatory network, recurrent neural network, simulated annealing, immune system, particle swarm optimization