
Reconstruction Of Gene Regulatory Networks From Gene Expression Data

Posted on: 2019-01-31
Degree: Doctor
Type: Dissertation
Country: China
Candidate: F Liu
Full Text: PDF
GTID: 1360330623453325
Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
Gene regulatory networks (GRNs) make it possible to study life phenomena and physiological activities from the perspective of gene interactions; their study is an interdisciplinary field spanning computer science, mathematics and biology. Constructing and analyzing gene networks can help reveal gene function, characterize gene interactions, and support research on disease pathogenesis and drug design. This thesis uses complex network methods, mathematical statistics and pattern recognition theory to reconstruct gene regulatory networks from gene expression data. The main contributions of the thesis are as follows:

1. Bayesian network (BN) methods cannot handle large-scale gene regulatory networks because of their high computational complexity, and they also suffer from false positives. We present a novel algorithm, the local Bayesian network (LBN), which infers GRNs from gene expression data using a network decomposition strategy and a false-positive edge elimination scheme. Specifically, the LBN algorithm first uses mutual information to construct an initial GRN, which is decomposed into a number of local networks. A BN method is then used to generate a series of local BNs by selecting the k-nearest neighbors of each gene as its candidate regulatory genes, which significantly reduces the exponential search space of all possible GRN structures. The local BNs are integrated into a tentative GRN, and conditional mutual information is applied to remove redundant regulations, which alleviates the false-positive problem. The final GRN is obtained by iteratively applying conditional mutual information and local BN to the tentative network; during the iterations, false or redundant regulations are gradually removed. When tested on the benchmark GRN datasets from the DREAM challenge as well as the SOS DNA repair network in E. coli, our results suggest that LBN significantly outperforms other state-of-the-art methods (ARACNE, GENIE3 and NARROMI), with more accurate and robust performance.

2. Information theory-based methods cannot identify the direction of regulatory interactions and also suffer from false positives and false negatives. Using ordered conditional mutual information and a limited number of parent nodes per gene, we present a novel algorithm (OCMIPN) for fast inference of GRNs from gene expression data. The OCMIPN method first uses ordered conditional mutual information to construct an initial GRN. Then, based on prior knowledge of the topology of gene regulatory networks, a BN method is used to generate the final GRN while limiting the number of parent nodes of each gene, which significantly reduces the computational complexity. Tested on synthetic networks as well as real biological molecular networks of different sizes and topologies, OCMIPN infers GRNs with higher accuracy and lower computation time. OCMIPN outperforms other state-of-the-art methods such as LASSO, ARACNE, ScanBMA and LBN.

3. Methods based on mutual information and conditional mutual information suffer from high false-positive and false-negative rates and cannot identify the direction of regulatory interactions. Using part mutual information and a Bayesian scoring function, we present a novel algorithm (PMIBSF). The PMIBSF method first constructs an initial complete network over the gene nodes. Part mutual information is then used to delete redundant edges from this initial gene correlation network. Finally, a Bayesian network scoring function is used to learn the Bayesian network structure and infer the gene regulatory network quickly. Tested on synthetic networks as well as real biological molecular networks of different sizes and topologies, PMIBSF infers GRNs with higher accuracy. PMIBSF outperforms other state-of-the-art methods such as LP, PC-alg, NARROMI and ARACNE.

4. The time complexity of reconstruction algorithms is too high for large-scale gene regulatory networks (hundreds or even thousands of gene nodes). We present a new algorithm for large-scale gene regulatory networks based on common gene modules (CGMN). First, the CGMN algorithm uses six gene clustering algorithms to cluster genes with similar expression patterns into functional modules. Then, the common modules discovered by all six clustering methods are taken as gene module nodes. Finally, the LBN algorithm is used to construct the regulatory network over these gene modules. Simulation results show that the time complexity of reconstructing a large-scale gene regulatory network can be effectively reduced by searching over the compressed set of nodes given by the common modules, so CGMN can effectively construct large-scale gene regulatory networks.
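The abstract builds an initial network with mutual information and then removes redundant edges with conditional mutual information. It does not say which estimator the thesis uses, so the sketch below is only an illustration under stated assumptions: expression values are treated as jointly Gaussian (so both quantities follow from covariance determinants), only first-order conditioning on a single common neighbor is performed, and the function names and thresholds (`gaussian_mi`, `gaussian_cmi`, `prune_edges`, `0.05`) are hypothetical. It does not include the Bayesian network step and is not the LBN, OCMIPN or PMIBSF algorithm.

```python
import numpy as np


def gaussian_mi(x, y):
    """I(X;Y) under a joint-Gaussian assumption: -0.5 * log(1 - r^2)."""
    r = np.corrcoef(x, y)[0, 1]
    return -0.5 * np.log(1.0 - r ** 2)


def gaussian_cmi(x, y, z):
    """I(X;Y|Z) under a joint-Gaussian assumption, via covariance determinants."""
    def logdet(*cols):
        cov = np.atleast_2d(np.cov(np.vstack(cols)))
        return np.linalg.slogdet(cov)[1]
    # I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(Z) - H(X,Y,Z); the Gaussian constants cancel.
    return 0.5 * (logdet(x, z) + logdet(y, z) - logdet(z) - logdet(x, y, z))


def prune_edges(expr, mi_threshold=0.05, cmi_threshold=0.05):
    """expr has shape (n_genes, n_samples). Returns a set of undirected edges (i, j), i < j.

    Thresholds are illustrative assumptions, not values from the thesis.
    """
    n = expr.shape[0]
    # Step 1: initial network from pairwise mutual information.
    edges = {(i, j) for i in range(n) for j in range(i + 1, n)
             if gaussian_mi(expr[i], expr[j]) > mi_threshold}
    # Step 2: drop an edge if conditioning on any common neighbor explains it away.
    kept = set()
    for i, j in edges:
        common = [k for k in range(n) if k not in (i, j)
                  and (min(i, k), max(i, k)) in edges
                  and (min(j, k), max(j, k)) in edges]
        if all(gaussian_cmi(expr[i], expr[j], expr[k]) > cmi_threshold for k in common):
            kept.add((i, j))
    return kept


# Synthetic chain x -> z -> y: x and y are correlated only through z.
rng = np.random.default_rng(0)
x = rng.normal(size=2000)
z = x + 0.5 * rng.normal(size=2000)
y = z + 0.5 * rng.normal(size=2000)
expr = np.vstack([x, z, y])  # gene indices: 0 = x, 1 = z, 2 = y

edges = prune_edges(expr)
print(sorted(edges))
```

On this toy chain, the pairwise step connects all three genes, and the conditional step is meant to remove the indirect x-y edge while keeping the two direct ones. The result is undirected; assigning directions is the role the abstract gives to the Bayesian network step.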
Keywords/Search Tags:Gene regulatory network, Mutual information, Conditional mutual information, Limited parent nodes, Bayesian network model, Part mutual information, Common gene module