
Inferring Gene Regulatory Networks Based On Logic Relationships

Posted on: 2019-01-02 | Degree: Master | Type: Thesis
Country: China | Candidate: R J He | Full Text: PDF
GTID: 2370330572952129 | Subject: Computer application technology
Abstract/Summary:
Gene regulatory networks are one of the most important topics in bioinformatics because they provide a network model for analyzing the mutual regulatory relationships among genes. With the rapid development of gene chip technology, it has become possible to infer gene regulatory networks from high-throughput gene expression data. Such networks also contribute to the study of gene function and the prediction of disease-causing genes, which has a significant effect on the diagnosis of complex diseases, customized treatment regimens, and research on targeted drugs.

Many mathematical models exist for inferring gene regulatory networks, and information theory is an important approach among them. By calculating the mutual information between genes, it can measure nonlinear relationships between genes effectively, and it can also handle continuous data. However, this approach has two major weaknesses. First, the mutual information of continuous data is commonly estimated with a kernel density estimator, which is suitable only for data that follow a Gaussian distribution and is therefore inappropriate when the distribution of the data is unknown. Second, mutual information tends to overestimate the relationships between genes, so the accuracy of the inferred network is relatively low because numerous false positives occur.

This thesis addresses these two issues. On the one hand, the k-nearest neighbors estimation method is applied to estimate the entropy and mutual information of continuous data, which solves the problem of continuous data with an unknown distribution. Because this method is relatively sensitive to the choice of parameters, experiments are designed to study that choice. On the other hand, logic relationships are applied to measure the relationships between genes. Although logic relationships were first used with discrete data, they are extended to the continuous case in this thesis. The general principle is to measure the relationships between genes from another angle by calculating uncertainty coefficients, which eliminates part of the false positives created by the overestimation of mutual information. By combining the k-nearest neighbors estimation method with logic relationships, a gene regulatory network can therefore be inferred even when the distribution of the continuous data is unknown.

To verify the method, data from the DREAM 3 Challenge were used to infer networks with 10, 50 and 100 nodes, and the results were compared with those of ARACNE and NARROMI, two classic algorithms that infer gene regulatory networks based on mutual information. For the 10-node network, the small sample size introduces a certain error into the mutual information estimate, so the result is not as good as that of the two classic algorithms. For the 50-node and 100-node networks, the method is superior to both algorithms in terms of false positive rate, MCC and F-score. This also shows that the method can remove false-positive edges that arise from the overestimation of mutual information. It is therefore concluded that the method requires a certain sample size to accurately infer a gene regulatory network from data with an unknown distribution.
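
As an illustration of k-nearest neighbors estimation of mutual information for continuous data, the sketch below implements the widely used Kraskov-Stögbauer-Grassberger (KSG) estimator in pure Python. This is an assumption about the general family of estimator, not the thesis's own implementation: the abstract does not specify which k-nearest neighbors variant or which value of k is used. The function names `digamma_int` and `knn_mutual_information` are hypothetical, and the brute-force neighbor search is only meant for small samples.

```python
import random

EULER_GAMMA = 0.5772156649015329


def digamma_int(n):
    """Digamma function at a positive integer: psi(n) = -gamma + sum_{j=1}^{n-1} 1/j."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, n))


def knn_mutual_information(x, y, k=3):
    """KSG (algorithm 1) estimate of I(X;Y) in nats for two 1-D samples.

    Illustrative sketch only: brute-force O(N^2 log N) neighbor search,
    max-norm distance in the joint (x, y) space. The estimate can be
    slightly negative for independent variables.
    """
    n = len(x)
    if n != len(y):
        raise ValueError("x and y must have the same length")
    if n <= k:
        raise ValueError("need more samples than neighbors k")

    total = 0.0
    for i in range(n):
        # Marginal distances from sample i to every other sample.
        dx = [abs(x[j] - x[i]) for j in range(n) if j != i]
        dy = [abs(y[j] - y[i]) for j in range(n) if j != i]
        # Distance to the k-th nearest neighbor in the joint space (max-norm).
        eps = sorted(max(a, b) for a, b in zip(dx, dy))[k - 1]
        # Count marginal neighbors strictly closer than eps.
        n_x = sum(1 for a in dx if a < eps)
        n_y = sum(1 for b in dy if b < eps)
        total += digamma_int(n_x + 1) + digamma_int(n_y + 1)

    return digamma_int(k) + digamma_int(n) - total / n


if __name__ == "__main__":
    # Synthetic stand-ins for two expression profiles (not DREAM 3 data).
    rng = random.Random(0)
    gene_a = [rng.gauss(0, 1) for _ in range(200)]
    gene_b = [v + 0.1 * rng.gauss(0, 1) for v in gene_a]   # strongly dependent
    gene_c = [rng.gauss(0, 1) for _ in range(200)]         # independent
    print("MI(a, b):", knn_mutual_information(gene_a, gene_b))
    print("MI(a, c):", knn_mutual_information(gene_a, gene_c))
```

Because the estimator relies only on neighbor counts, it makes no Gaussian assumption about the data. The parameter `k` trades bias against variance, which is consistent with the abstract's remark that the method is sensitive to its chosen parameters and needs a sufficient sample size.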
Keywords/Search Tags:Gene regulatory network, mutual information, k-nearest neighbors, logic relationships