The purpose of biological network inference is to construct reliable mathematical models of biological systems while making efficient use of measurements. Models of biological networks are crucial for understanding the regulatory mechanisms underlying the measured data, as well as for guiding the modular construction of synthetic gene circuits. With the development of omics data, including microarrays, computational modeling of biological networks such as gene networks has become feasible. However, network inference requires high-performance algorithms in addition to prior knowledge. Feature selection in machine learning provides a promising solution to the inference problem. This study applies feature selection and graph-based measures to gene regulatory network (GRN) inference. The major contributions are as follows:

1) For small-scale GRNs, the linear model is still a good choice for reconstruction. Under the linear model assumption, support vector machine regression (SVR) is used for feature selection to reconstruct the whole gene regulatory network. Compared with the singular value decomposition (SVD) method, the SVR-based method obtains higher accuracy. Experimental results on the corresponding sequential data sets of the GRN verify the effectiveness of the algorithm.

2) When nonlinearity in the GRN is considered, tree-based regression approaches generally have advantages in efficiency and accuracy over linear regression approaches. Different tree-based methods capture the dynamics of the same network from different perspectives. This study applies Gradient Boosting to infer the GRN, then integrates the outcomes inferred by several algorithms, including Random Forest, via a weighted voting mechanism. Because the weights for each outcome must be calculated in an unsupervised manner, this study defines a score that evaluates the reliability of each outcome and uses this score to determine the weights of the different tree-based regression approaches. The simulation outcomes validate the effectiveness of the proposed method.

3) Given the inferred topology, this study evaluates the importance score of nodes in a digraph through topological analysis, thereby selecting a subset of key genes. The first level of key gene nodes corresponds to the root strongly connected components (SCCs), which are located upstream of the information flow in a given digraph. To determine the unique set of root SCCs, this study defines a cost function based on a graph-based measure and applies a genetic algorithm (GA) to minimize it. After the root SCCs are obtained, the proposed hierarchical estimation strategy first calculates the regulatory parameters associated with the key genes, then extends to the next level of genes, using the previously estimated parameters as prior knowledge. In this way, the original parameter estimation problem is decomposed into a set of sub-problems with different priority levels. Experimental outcomes indicate that the hierarchical estimation strategy obtains lower mean squared error (MSE) than the traditional all-at-once strategy, and that its computational time is much shorter.
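To make the notion of a root SCC concrete, the following is a minimal sketch that finds the strongly connected components of a small digraph and keeps those with no incoming edge from any other component. It is an illustration only: it uses a direct graph traversal (Kosaraju's algorithm) rather than the GA-based cost minimization described above, and the function names, gene labels, and example edges are hypothetical.

```python
from collections import defaultdict


def strongly_connected_components(nodes, edges):
    """Return the SCCs of a digraph as a list of sets (Kosaraju's algorithm).

    Assumes every endpoint of every edge appears in `nodes`.
    """
    succ = defaultdict(list)  # node -> nodes it regulates
    pred = defaultdict(list)  # node -> nodes that regulate it
    for u, v in edges:
        succ[u].append(v)
        pred[v].append(u)

    # Pass 1: iterative DFS on the original graph, recording finish order.
    visited, order = set(), []
    for start in nodes:
        if start in visited:
            continue
        visited.add(start)
        stack = [(start, iter(succ[start]))]
        while stack:
            node, successors = stack[-1]
            for nxt in successors:
                if nxt not in visited:
                    visited.add(nxt)
                    stack.append((nxt, iter(succ[nxt])))
                    break
            else:
                # All successors explored: the node is finished.
                order.append(node)
                stack.pop()

    # Pass 2: traverse the reversed graph in decreasing finish order.
    assigned, components = set(), []
    for start in reversed(order):
        if start in assigned:
            continue
        assigned.add(start)
        component, todo = set(), [start]
        while todo:
            node = todo.pop()
            component.add(node)
            for p in pred[node]:
                if p not in assigned:
                    assigned.add(p)
                    todo.append(p)
        components.append(component)
    return components


def root_sccs(nodes, edges):
    """Return the SCCs that receive no edge from any other SCC."""
    components = strongly_connected_components(nodes, edges)
    comp_of = {n: i for i, comp in enumerate(components) for n in comp}
    has_incoming = {comp_of[v] for u, v in edges if comp_of[u] != comp_of[v]}
    return [c for i, c in enumerate(components) if i not in has_incoming]


# Hypothetical five-gene network: A and B regulate each other, as do C and D.
example_nodes = ["A", "B", "C", "D", "E"]
example_edges = [("A", "B"), ("B", "A"), ("B", "C"),
                 ("C", "D"), ("D", "C"), ("E", "D")]
roots = root_sccs(example_nodes, example_edges)
```

In this hypothetical network, the components {A, B} and {E} have no upstream regulators outside themselves, so they would form the first level in a hierarchical estimation, while {C, D} would be handled at the next level.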