Protein Function Prediction Based On Graph Theory And Interaction Network

Posted on: 2016-08-06
Degree: Master
Type: Thesis
Country: China
Candidate: Y Diao
Full Text: PDF
GTID: 2180330461478985
Subject: Computer application technology
Abstract/Summary:
Proteins play an important role in life activities. By integrating information at different levels (the interactions between proteins, the networks those interactions form, and the different functions within those networks), proteomics can provide a comprehensive understanding of disease, cell metabolism, and related processes. High-throughput biological experiments and computational methods now produce large amounts of protein interaction data, but false-positive and false-negative noise seriously degrades the quality of these data, so protein function prediction results are not accurate enough. To address these problems, this thesis takes a graph-theoretic view: it characterizes network topology from comprehensive protein interaction data, introduces the bi-relational graph and related background knowledge, and builds on a multi-label model combined with multi-kernel learning and transductive learning. The advantages of the model and the effectiveness of the method are verified on heterogeneous data sources for Yeast and Mouse proteins.

The first problem addressed is that protein interaction networks built from multiple kernels over heterogeneous data sources contain a huge amount of information, and because of data redundancy the predicted results cannot fully reflect the distribution of the data. The functional category network and the protein interaction networks are therefore combined, and a multi-label learning algorithm is proposed based on the directed bi-relational graph model and multi-kernel learning. First, a directed bi-relational graph is used to capture the relationships between pairs of proteins and between pairs of functions, and an adaptive learning model between proteins and functions is built using a loss function and the Expectation Maximization algorithm.
Next, multiple associative matrices are obtained by using a graph optimization strategy to fuse the functional category network and the protein interaction networks. Finally, the prediction model is built from the associative matrices and the adaptive learning model. Experimental results on multiple heterogeneous data sources for Yeast and Mouse proteins show that the proposed method achieves better classification performance.

The second problem addressed is that fusing multiple kernel data sources requires computing the combination coefficients of the kernel matrices, which is memory-intensive and time-consuming and assumes that large amounts of labeled training data are available in practice. A multi-label learning algorithm is therefore proposed based on directed bi-relational graph theory and transductive learning. It estimates the label sets of unlabeled instances by using information from both labeled and unlabeled data, derives a closed-form solution to the resulting optimization problem, and assigns label sets to the unlabeled instances using the multiple associative matrices. The algorithms are evaluated on protein interaction networks of multiple kernels from heterogeneous data sources. The results show that the proposed approaches perform better than recently proposed protein function prediction methods on composite and multiple kernels, and that their performance is stable.
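To make the idea of a closed-form transductive solution concrete, the sketch below fuses several toy interaction networks and propagates known function labels to unlabeled proteins. It is not the thesis's directed bi-relational graph algorithm, which the abstract does not specify in detail. It swaps in the standard local-and-global-consistency form of label propagation, F = (I - alpha * S)^(-1) Y, on an undirected protein graph only. The uniform kernel weights, the value of `alpha`, the function names, and the four-protein data are all assumptions made for illustration.

```python
import numpy as np


def combine_kernels(kernels, weights=None):
    """Weighted sum of same-shaped affinity matrices.

    Uniform weights are an assumption of this sketch, not the
    combination rule used in the thesis.
    """
    kernels = [np.asarray(k, dtype=float) for k in kernels]
    if weights is None:
        weights = np.full(len(kernels), 1.0 / len(kernels))
    return sum(w * k for w, k in zip(weights, kernels))


def propagate_labels(W, Y, alpha=0.5):
    """Closed-form transductive scores F = (I - alpha * S)^{-1} Y.

    W is a symmetric protein-protein affinity matrix, Y is a
    proteins-by-functions matrix with 1 for known annotations and
    0 elsewhere, and S is the symmetrically normalized W.
    """
    W = np.asarray(W, dtype=float).copy()
    np.fill_diagonal(W, 0.0)  # ignore self-interactions
    degree = W.sum(axis=1)
    inv_sqrt = np.zeros_like(degree)
    connected = degree > 0
    inv_sqrt[connected] = 1.0 / np.sqrt(degree[connected])
    S = W * inv_sqrt[:, None] * inv_sqrt[None, :]
    n = W.shape[0]
    return np.linalg.solve(np.eye(n) - alpha * S, np.asarray(Y, dtype=float))


# Hypothetical toy data: 4 proteins, 2 functions, 2 interaction networks.
W1 = np.array([[0, 1, 0, 0],
               [1, 0, 0.1, 0],
               [0, 0.1, 0, 1],
               [0, 0, 1, 0]], dtype=float)
W2 = np.array([[0, 1, 0, 0],
               [1, 0, 0, 0],
               [0, 0, 0, 1],
               [0, 0, 1, 0]], dtype=float)

# Protein 0 is annotated with function 0, protein 3 with function 1;
# proteins 1 and 2 are unlabeled.
Y = np.array([[1, 0],
              [0, 0],
              [0, 0],
              [0, 1]], dtype=float)

W = combine_kernels([W1, W2])
F = propagate_labels(W, Y, alpha=0.5)
print(F)
```

Each row of `F` holds one protein's score for every function, so a multi-label assignment can be read off by thresholding the rows. In this toy graph the unlabeled protein 1 scores higher on function 0 because it interacts strongly with protein 0, and protein 2 scores higher on function 1 for the same reason with protein 3.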
Keywords/Search Tags: Protein Function Prediction, Directed Bi-Relational Graph, Multi-Kernel Learning, Transductive Learning