Clustering Research And Software Development In Protein-Protein Interaction Network

Posted on: 2007-09-02
Degree: Master
Type: Thesis
Country: China
Candidate: Q Y Shi
Full Text: PDF
GTID: 2178360212973189
Subject: Computer software and theory
Abstract/Summary:
Research on the function of proteins in protein-protein interaction networks has become a very active field. By analyzing these networks, scientists have found that proteins which interact with each other tend to have similar cellular functions. Based on this close relationship between proteins, clustering methods can be used to mine the intrinsic modular structure of a protein-protein interaction network: they divide the proteins in the intricate network into smaller groups, sort the complex information it contains into smaller units, and allow each group to be analyzed separately. Consequently, clustering methods can be used to study the functional modules of proteins and to predict protein function, which makes them a natural choice for research on protein-protein interaction networks.
This thesis focuses on developing a new clustering method that fits the modular structure of protein-protein interaction networks, and on developing software for visualizing such networks.
Building on existing clustering methods for protein-protein interaction networks, we propose a new topological clustering method, the modularized clustering method (MCM). MCM is suited to the modular structure of protein-protein interaction networks, in which there are more interactions within a protein group than between groups. Unlike traditional clustering methods, which are based on the proximity between proteins, MCM is based on the relationships between functional modules. The direct and second-order interactions of modules are used to define the proximity of clusters in the latest high-throughput protein-protein interaction data for yeast, in order to predict the function of the unknown proteins in the modules.
The P-value of the hypergeometric cumulative distribution of the modules and a perturbation analysis of the data (adding, removing, and rewiring interactions) are used to evaluate the prediction quality and the robustness of the method. The results show that MCM achieves high prediction precision and coverage, and that it is robust to data with a high false-positive rate and to missing data. The predictions for unknown proteins that have high precision can be instructive in biological analysis, and the algorithm can be generalized to other networks with similar structure.
Additionally, we developed PINC (Protein Interaction Network Clustering), a software tool for protein-protein interaction network visualization, built with JBuilder. PINC, based on the visualization idea of the ADJW algorithm, integrates matrix representation with a clustering tree and traditional visualization methods to analyze and research the protein-protein interaction...
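The abstract does not give the exact formula used for the hypergeometric P-value, so the following is only a minimal sketch of the standard enrichment test that such an evaluation commonly relies on, not the thesis's own implementation. The function name and the parameters N (annotated proteins in the network), K (proteins carrying a given function), n (module size), and k (module members carrying that function) are assumptions introduced for illustration.

```python
from math import comb


def hypergeom_enrichment_p(N, K, n, k):
    """Upper-tail hypergeometric probability P(X >= k).

    Hypothetical parameter names (not taken from the thesis):
      N: total number of annotated proteins in the network
      K: number of those proteins annotated with the function of interest
      n: number of proteins in the module
      k: number of proteins in the module annotated with that function
    """
    upper = min(n, K)          # X cannot exceed the module size or K
    total = comb(N, n)         # number of ways to draw a module of size n
    # Sum the probability mass for k, k+1, ..., upper annotated members.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


if __name__ == "__main__":
    # Toy numbers for illustration only; they are not results from the thesis.
    p = hypergeom_enrichment_p(N=10, K=5, n=5, k=5)
    print(p)
```

A small P-value suggests that the function is over-represented in the module relative to chance, which is the usual basis for assigning that function to the module's unannotated proteins.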
Keywords/Search Tags: protein-protein interaction network, clustering, visualization of network, prediction of protein function