Research on Protein Networks Based on the Random Walk Model

Posted on: 2014-01-01
Degree: Doctor
Type: Dissertation
Country: China
Candidate: W Peng
Full Text: PDF
GTID: 1220330431497898
Subject: Computer Science and Technology
Abstract/Summary:
Most cellular processes are performed not by a single protein but by several proteins acting together. High-throughput techniques generate large volumes of protein interaction data. Constructing protein-protein interaction (PPI) networks helps us understand molecular biological systems and study the functions of molecules and their interactions. It can also provide new insight into biological evolution. This thesis focuses on several active research topics in PPI networks: essential protein identification, protein complex detection, protein function prediction and conserved protein complex detection. It addresses them with random walk models that combine PPI data with other biological data.

A novel method based on the PageRank algorithm is proposed to predict essential proteins. The method identifies essential proteins using not only the connections between proteins but also their orthologous properties and the features of their neighbors. It initializes the ranking scores of proteins according to their orthologous properties. To reflect the modularity of essential proteins, the edges connecting proteins are weighted by the edge-clustering coefficient. Finally, the ranking score, which represents the essentiality of a protein, is a linear combination of its orthologous score and a score derived from its neighbors’ features. Experimental results on S. cerevisiae (yeast) and E. coli show that our method outperforms eight other existing centrality methods.

We propose a new method, WPNCA, that detects protein complexes by considering both the core-attachment structure of protein complexes and global information about PPI networks. First, we design a weighted PageRank-Nibble algorithm that assigns a different probability to each adjacent node. WPNCA then partitions the PPI network into multiple dense clusters using the weighted PageRank-Nibble algorithm.
Then the cores of these clusters are detected, and the remaining proteins in the clusters are selected as attachments to form the final predicted protein complexes. Experiments on yeast data show that WPNCA outperforms existing methods in terms of both accuracy and p-values.

Using known protein functional annotations, we investigate at which level the neighbors of proteins tend to be functionally associated, and at which level the neighbors of GO terms usually co-annotate common proteins. An unbalanced bi-random walk (UBiRW) algorithm, which iteratively walks a different number of steps in each of the two networks, is then adopted to find protein-GO term associations from known associations. Experiments are carried out on S. cerevisiae data. The results show that our method achieves better prediction performance than methods that use only PPI network data, and also than methods that consider the same neighbor level for proteins and for GO terms.

Based on a divide-and-match strategy, we propose a new method, DAMAlign, to detect conserved protein complexes via local network alignment. The method first partitions one of the PPI networks into subnetworks and then maps these subnetworks to the other PPI network to find common connected components. When searching for common connected components, the method adopts a lenient criterion: we locally extend a pair of homologous proteins if there is at least one path of length no greater than 2 connecting to a node of the homologous protein pair in its corresponding network. We perform network alignment between S. cerevisiae and D. melanogaster, and between H. sapiens and D. melanogaster. The experimental results show that DAMAlign outperforms other existing methods in recovering known protein complexes.
Moreover, the conserved protein complexes that DAMAlign detects from different PPI networks are also functionally similar in terms of their GO semantic similarity.
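As an illustration of the essential-protein ranking summarized in the abstract, the following is a minimal Python sketch and not the thesis's actual formulation. It assumes a common definition of the edge-clustering coefficient (triangles through an edge divided by min(deg(u) - 1, deg(v) - 1)) and a personalized-PageRank-style update in which each score is a linear combination of a normalized orthology prior and the weighted scores of neighbors. The function names, the mixing parameter `alpha`, the iteration count and the toy data are all hypothetical.

```python
from collections import defaultdict


def edge_clustering_coefficient(adj, u, v):
    """Triangles through edge (u, v) divided by min(deg(u) - 1, deg(v) - 1)."""
    denom = min(len(adj[u]) - 1, len(adj[v]) - 1)
    if denom <= 0:  # an endpoint of degree 1 cannot be part of a triangle
        return 0.0
    return len(adj[u] & adj[v]) / denom


def rank_proteins(edges, orthology, alpha=0.5, iterations=100):
    """Hypothetical sketch: orthology-seeded, ECC-weighted PageRank-style ranking."""
    # Build an undirected adjacency map from the interaction list.
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    nodes = sorted(adj)

    # Weight every edge by its edge-clustering coefficient (assumed definition).
    weight = {u: {v: edge_clustering_coefficient(adj, u, v) for v in adj[u]}
              for u in nodes}
    out_total = {u: sum(weight[u].values()) for u in nodes}

    # Normalize the orthology scores into a prior; fall back to uniform if all are zero.
    total = sum(orthology.get(n, 0.0) for n in nodes)
    prior = {n: orthology.get(n, 0.0) / total if total else 1.0 / len(nodes)
             for n in nodes}

    # Iterate: score = (1 - alpha) * prior + alpha * weighted neighbor scores.
    score = dict(prior)
    for _ in range(iterations):
        updated = {}
        for n in nodes:
            from_neighbors = sum(
                score[m] * weight[m][n] / out_total[m]
                for m in adj[n] if out_total[m] > 0
            )
            updated[n] = (1 - alpha) * prior[n] + alpha * from_neighbors
        score = updated
    return score


if __name__ == "__main__":
    # Toy network: a triangle A-B-C with a pendant protein D attached to C.
    toy_edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
    toy_orthology = {"A": 1.0, "B": 1.0, "C": 1.0}  # made-up orthology scores
    ranking = sorted(rank_proteins(toy_edges, toy_orthology).items(),
                     key=lambda kv: -kv[1])
    for protein, value in ranking:
        print(protein, round(value, 4))
```

In this sketch a protein connected only through edges that lie in no triangle receives nothing from its neighbors, so its score depends on its orthology prior alone. That is one way to reflect the modularity assumption mentioned above.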
Keywords/Search Tags: Protein-Protein Interactions, Protein-Protein Interaction Network, Random Walk Model, Functional Modules, Edge-Clustering Coefficient, Essential Proteins, Protein Complexes, Protein Function Prediction, Network Alignment, Conserved Protein Complex