Study Of Protein Function Prediction Using Semi-supervised Learning

Posted on: 2012-03-13
Degree: Master
Type: Thesis
Country: China
Candidate: C Wu
GTID: 2178330335482325
Subject: Biological Information Science and Technology
Abstract/Summary:
As a fundamental component of living matter, proteins are closely tied to the growth of living organisms. Studying protein function is one of the research hot spots in proteomics, and it helps uncover the secrets of life. Although biochemical experiments are the most reliable way to analyze protein function, they are time-consuming and expensive, so they cannot keep pace with the rapidly growing volume of protein data. This has encouraged researchers to study protein function prediction using computational approaches.

With the development of high-throughput biological technology, protein function prediction based on protein-protein interaction networks has attracted the attention of many researchers. Because protein-protein interaction networks have a complex structure, they are often analyzed with machine learning methods.

Building on a semi-supervised learning approach called the global optimization model, this thesis studies protein function prediction using protein-protein interaction networks. To address a shortcoming of the global optimization model, namely that local information is not fully exploited, this thesis proposes a global optimization model guided by local heuristic search. Under this model, two methods are proposed: protein function prediction using the ant colony optimization algorithm, and protein function prediction using the shuffled frog-leaping algorithm.

For the simulations, data from commonly used protein-protein interaction network databases and a functional catalogue database were collected and integrated. Because different databases use different protein identifiers, a data collection and integration tool was designed and implemented. Two data sets were used in the simulations. One was taken from the literature; the other was obtained by integrating data from DIP-core and FunCat 2.1 with the tool mentioned above.
Simulation results show that both algorithms perform well on protein function prediction. The fault tolerance of the two algorithms was also tested, and both show good tolerance to false positive and false negative data in the protein-protein interaction networks. In further analysis, the convergence rates of the global optimization model and the local heuristic search guided global optimization model were compared under the same number of energy function evaluations. The results show that the latter converges significantly faster than the former.
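
The idea of guiding a global energy minimization with local information can be illustrated with a small sketch. The abstract does not define the energy function or the search procedure, so everything below is an assumption made for illustration: the energy is taken to be the number of interacting protein pairs assigned different functions, the local heuristic is a majority vote among a protein's neighbours, and the search is a simple stochastic local search rather than the ant colony or shuffled frog-leaping algorithms used in the thesis. The names `energy`, `neighbour_vote`, `predict`, and `p_local` are hypothetical.

```python
import random
from collections import Counter


def energy(edges, labels):
    """Assumed energy: number of interacting pairs whose functions differ."""
    return sum(1 for u, v in edges if labels[u] != labels[v])


def neighbour_vote(adj, labels, node):
    """Assumed local heuristic: most common function among the neighbours."""
    counts = Counter(labels[n] for n in adj[node])
    return counts.most_common(1)[0][0]


def predict(edges, known, functions, steps=200, p_local=0.7, seed=0):
    """Assign functions to unannotated proteins by minimizing the energy.

    known:     dict mapping annotated proteins to their function (kept fixed)
    functions: list of candidate function labels
    p_local:   probability of proposing the neighbour-vote label instead of
               a uniformly random one (hypothetical parameter)
    """
    rng = random.Random(seed)

    # Build an adjacency list from the undirected interaction edges.
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)

    # Start from a random assignment for the unannotated proteins.
    unknown = sorted(n for n in adj if n not in known)
    labels = dict(known)
    for n in unknown:
        labels[n] = rng.choice(functions)

    best = energy(edges, labels)
    for _ in range(steps):
        if not unknown:
            break
        node = rng.choice(unknown)
        old = labels[node]
        if rng.random() < p_local:
            labels[node] = neighbour_vote(adj, labels, node)  # local guidance
        else:
            labels[node] = rng.choice(functions)              # global exploration
        new = energy(edges, labels)
        if new <= best:
            best = new            # keep the move
        else:
            labels[node] = old    # revert a move that raises the energy
    return labels, best


# Toy network: two triangles joined by one edge; "c" and "d" are unannotated.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"), ("c", "d")]
known = {"a": "F1", "b": "F1", "e": "F2", "f": "F2"}
labels, final_energy = predict(edges, known, ["F1", "F2"])
```

Setting `p_local` to 0 reduces the sketch to an unguided search over the same energy, which gives a rough way to compare the two variants under an equal number of energy evaluations, in the spirit of the convergence comparison described above.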
Keywords/Search Tags: Protein Function Prediction, Protein-Protein Interaction Networks, Global Optimization Model, Ant Colony Optimization, Shuffled Frog-Leaping Algorithm