Research On Computational Intelligence Methods Of Biological Network Construction

Posted on: 2016-05-23  Degree: Doctor  Type: Dissertation
Country: China  Candidate: Y O Zhao
GTID: 1220330461484420  Subject: Signal and Information Processing
Abstract/Summary:
With the completion of the Human Genome Project (HGP), biology has entered the post-genome era. The aim of research has gradually shifted from analyzing the composition of genes to exploring their functions. However, analyzing a single gene in isolation rarely reveals its specific biological function, because the cell is a complex nonlinear system in which any behavior requires the cooperation of several genes, proteins and metabolites. Researchers have therefore turned their attention to the relationships among genes and proteins.

Biological networks are important tools for studying the relationships among macromolecules. They reflect the interactions between genes, proteins, metabolites and the environment, and play an important role in understanding their functions at the molecular level. They also help to reveal the mechanisms of metabolism, signal transduction, cell differentiation, formation and apoptosis, which can provide a solid theoretical basis for the development of new drugs and for the prevention and treatment of diseases.

Constructing biological networks is a very difficult task. Traditionally, networks have been built manually from experimental data, which is time-consuming, expensive and inefficient. With the development of high-throughput technology, more and more data are being generated, and manual methods cannot keep up with this rapid growth. Researchers have therefore begun to investigate automatic methods.

This thesis uses computational intelligence algorithms and reverse engineering principles to build biological networks automatically. The work is composed of two parts: dynamic network construction and static network construction. The dynamic network is the gene regulatory network (GRN), which is constructed by learning from time series data of gene expression.
The static network is the protein-protein interaction (PPI) network, for which the main work is to build prediction models from protein sequence data that determine the network edges (the interactions between proteins). For these two problems, the main contributions and innovations are as follows:

1. A dynamic model based on the law of mass action (MA) is proposed to describe the gene regulatory network.

Many models can be used to describe the gene regulatory network. Ordinary differential equations (ODEs) are the most popular because they can reveal the dynamics of gene regulation. Most ODE models are based on the S-system. However, the parameters of the S-system have no apparent biological meaning, since it is a purely mathematical model that is not derived from the process of gene regulation. Therefore, even once the model is constructed, it is very difficult to interpret. To solve this problem, a mass action model is proposed, based on a classic law of biochemical reactions, the law of mass action. The equations of the model are derived by fully considering the biological mechanisms of gene regulation, so the model can describe the regulation between genes more accurately than traditional models. Moreover, all parameters of the model are interpretable, which facilitates further analysis.

2. An automatic construction algorithm based on the mass action model is proposed for inferring the gene regulatory network.

Although the mass action model has many merits, how to construct it automatically by learning from time series expression data remains a problem. To solve it, a hybrid algorithm is proposed that is based on Population-Based Incremental Learning (PBIL) and Trigonometric Differential Evolution (TDE). The algorithm has two phases. The first phase employs an improved PBIL to infer the interaction between each pair of genes, which takes one of three states, "active", "inhibit" or "unregulated", in order to construct the structure of the GRN.
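The first phase can be pictured with a minimal sketch of PBIL extended from binary to three-state variables. This is not the thesis's improved PBIL, whose details the abstract does not give. The function name `pbil_structure_search`, the elitist update rule, the default settings and the toy fitness function are all assumptions made for illustration; in the real setting the fitness would come from how well the resulting model fits the expression data.

```python
import random

STATES = ("active", "inhibit", "unregulated")


def pbil_structure_search(n_genes, fitness, pop_size=30, generations=60,
                          learning_rate=0.1, seed=0):
    """Search for a three-state regulation structure with a PBIL-style loop.

    `fitness` takes a list of state indices (one per ordered gene pair)
    and returns a score to maximize. All defaults are illustrative.
    """
    rng = random.Random(seed)
    n_edges = n_genes * n_genes
    # One categorical distribution over the three states per candidate edge.
    probs = [[1.0 / len(STATES)] * len(STATES) for _ in range(n_edges)]
    best, best_score = None, float("-inf")

    for _ in range(generations):
        # Sample a population of structures from the probability vectors.
        population = [
            [rng.choices(range(len(STATES)), weights=p)[0] for p in probs]
            for _ in range(pop_size)
        ]
        scored = [(fitness(ind), ind) for ind in population]
        gen_score, gen_best = max(scored, key=lambda pair: pair[0])
        if gen_score > best_score:
            best_score, best = gen_score, gen_best
        # Shift each edge's distribution toward the best structure seen so far.
        for p, state in zip(probs, best):
            for k in range(len(STATES)):
                target_prob = 1.0 if k == state else 0.0
                p[k] = (1 - learning_rate) * p[k] + learning_rate * target_prob

    return [STATES[s] for s in best], best_score


# Toy usage: the fitness counts edges that agree with a made-up target
# structure for a 3-gene network (9 ordered gene pairs).
target = [0, 2, 1, 2, 0, 2, 1, 2, 0]
structure, score = pbil_structure_search(
    3, lambda ind: sum(a == b for a, b in zip(ind, target))
)
```

The update keeps each edge's three probabilities summing to one, so sampling stays valid across generations.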
The second phase applies the TDE algorithm to optimize the parameters of the model so that the inferred data fit the experimental data as closely as possible. Experiments on an artificial synthetic network with various levels of noise, as well as on two well-known real-life genetic networks, show that the approach can successfully automate model construction and parameter identification. Compared with other works, the method also shows a great improvement in performance. Moreover, the parameters in this model have clear biochemical meanings, which benefits further analysis.

3. A time-delayed mass action model (TDMA) is proposed for inferring the gene regulatory network.

Gene regulation is not an instantaneous process. It involves many chemical reactions that take time. However, many traditional models do not consider this time delay. To deal with the problem, time delay is introduced into the mass action model, and delay differential equations are employed in place of the original ordinary differential equations, which depicts the real process of gene regulation more precisely.

4. A parallel automatic construction algorithm based on the time-delayed mass action model is proposed for inferring the gene regulatory network.

The hybrid algorithm based on PBIL and TDE is also employed for TDMA model inference. Because of the added time delay parameters, the computational complexity of the automatic construction algorithm increases, and it takes more time to obtain a satisfactory result. To solve this problem, the Message Passing Interface (MPI) is applied to improve the construction algorithm. MPI is a popular parallel programming model with which a serial algorithm can be turned into a parallel one that uses the multiple cores of the processor more efficiently. Experiments were performed on three well-known GRN network motifs (C1-FFL, I1-FFL and Bi-fan) and on a real-life network, the simplified IRMA synthetic network.
Simulation results show that the proposed method can infer not only the network structure and parameters but also the time delays in gene regulation. Compared with other works, the method also shows a great improvement in performance.

5. A novel ensemble of Probabilistic Neural Networks (PNNs) is proposed to predict protein-protein interactions.

One of the most important problems in constructing a protein-protein interaction network is to find the protein pairs that interact with each other. Commonly used methods employ prediction models built on protein sequences to determine the PPIs. Since a protein sequence is a very simple representation, choosing the right features to represent it is vital. Previous methods tend to select either a single optimal feature or a combination of many features. A single feature cannot fully reflect the characteristics of a protein, while a feature combination, although it can represent multiple characteristics, is computationally intensive, and its component features may interfere with each other. To solve this problem, an ensemble method is proposed. To obtain more comprehensive information, 11 different physicochemical properties and the Auto Covariance (AC) method were used to construct the features. Then 11 different PNNs were employed to learn the 11 feature vectors. Finally, all the PNNs were combined into an ensemble to make the final decision. The key advantage of the algorithm is that it combines a variety of physicochemical property features to construct diverse individual classifiers for prediction. The method not only generates more diverse and robust individual classifiers, but also captures different physicochemical interaction information that represents the structure and function of proteins. Moreover, because the PNN is robust to noise and easy to train, it is suitable for dealing with large-scale noisy PPI data.
Experimental results on the DIP, H. pylori and Human datasets show that the proposed method performs better than other related works.
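The ensemble idea in contribution 5 can be sketched as follows, assuming the commonly used auto covariance definition AC(lag) = (1 / (n - lag)) * sum over i of (x_i - mean)(x_(i+lag) - mean) for one property along the sequence. The abstract does not list the 11 properties or say how the PNN outputs are combined, so the Kyte-Doolittle hydropathy scale, the majority vote, the kernel width `sigma` and all function names here are illustrative assumptions rather than the thesis's implementation.

```python
import math

# Kyte-Doolittle hydropathy index, used only as an example property scale.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def auto_covariance(sequence, scale, max_lag):
    """Return [AC(1), ..., AC(max_lag)] of one property along a sequence."""
    values = [scale[residue] for residue in sequence]
    n = len(values)
    if n <= max_lag:
        raise ValueError("sequence must be longer than max_lag")
    mean = sum(values) / n
    centered = [v - mean for v in values]
    return [
        sum(centered[i] * centered[i + lag] for i in range(n - lag)) / (n - lag)
        for lag in range(1, max_lag + 1)
    ]


def pair_features(seq_a, seq_b, scale, max_lag):
    """Concatenate the AC vectors of both proteins in a candidate pair."""
    return (auto_covariance(seq_a, scale, max_lag)
            + auto_covariance(seq_b, scale, max_lag))


def pnn_predict(train_x, train_y, x, sigma=1.0):
    """Minimal PNN: pick the class with the largest mean Gaussian kernel response."""
    sums, counts = {}, {}
    for features, label in zip(train_x, train_y):
        dist2 = sum((a - b) ** 2 for a, b in zip(features, x))
        sums[label] = sums.get(label, 0.0) + math.exp(-dist2 / (2 * sigma ** 2))
        counts[label] = counts.get(label, 0) + 1
    return max(sums, key=lambda label: sums[label] / counts[label])


def majority_vote(predictions):
    """Combine per-property predictions (assumed rule: simple majority)."""
    return max(set(predictions), key=predictions.count)
```

In this sketch, each physicochemical property would get its own scale, its own `pair_features` vectors and its own `pnn_predict` call, and `majority_vote` would merge the resulting labels. With an odd number of binary classifiers, such as 11, the vote cannot tie.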
Keywords/Search Tags: Gene Regulatory Network, Protein-protein Interaction Network, Law of Mass Action, Computational Intelligence, Ensemble Learning