Font Size: a A A

An Adaptive Clustering Algorithm For Module Detection And Its Application To Yeast Protein Network

Posted on:2009-11-02Degree:DoctorType:Dissertation
Country:ChinaCandidate:Z Q YeFull Text:PDF
GTID:1100360302978547Subject:Bioinformatics
Abstract/Summary:PDF Full Text Request
In a living cell, most biological functions rarely depend on individual proteins to perform particular cellular tasks; quite on the. contrary, they generally arise from interactions among multiple components to form highly-organized modules, where proteins often interact intimately and intensively. Ribosome, for instance, constituted by RNAs and proteins, are parts of protein-synthesis machines or translation modules of cellular organisms. It is widely believed that such biologically modular organization would correspond to a topological structure of interaction networks as a "community structure" or "module conformation" that groups proteins into dense connections. Communities (or modules) are of interest because they often correspond to functional units, such as cell cycles or metabolic pathways, and revealing these modular constituents in networks will undoubtedly bring richer biological information in gaining insights into dynamics of molecular systems on a new landscape. Many questions remain to be addressed in network analysis, in which detection of functional modules from networks is still a primary step required by much research work, and it is also a challenge task facing on in network biology.In this thesis, we first pointed out the drawbacks of traditional clustering algorithms for module detection in networks. Whatever the divisive clustering (GN-algorithm) or the agglomerative clustering (Newman-fast algorithm), one main drawback is that vertices have no chance to swap between modules that established in previous steps, but just to be manipulated within modules for further dividing or combining in next steps. These traditional clustering methods were cumbered with this point that lowers their qualities for further good performance. To overcome the above problem, we introduced a novel adaptive clustering algorithm capable of extracting modules from complex networks with considerable accuracy and robustness. In this approach, each node in a network acts as an autonomous agent demonstrating flocking behavior where vertex always travels toward its preferable neighboring groups. An optimal modular structure emerges from a collection of these active nodes during a self-organization process where vertices constantly regroup. In addition, we also implemented the above algorithm into a Java application as a tool to facilitate network researches.Our algorithm appears advantageous over other competing methods (e.g. the Newman-fast algorithm) whether from the performance or time efficiency. For example, our algorithm can reach~74.0% accuracy on artificial networks during a test procedure, while the Newman-fast algorithm only reaches~40.0%. Validation on two real-world networks also demonstrated that the result obtained by adaptive clustering method is closer to the correct classification in reality.We applied this approach to a yeast protein interaction network, and identified 36 modules that almost all show strong functional coherence supported by the analyses on GO-term distribution. Comparing with the Newman-fast algorithm, our algorithm is faster than it with about 40 folds on this protein network. Furthermore, analysis also indicates that the module structure is more biologically consistent than the partition from the Newman-fast method.
Keywords/Search Tags:protein interaction network, module structure, modularity, adaptive clustering algorithm
PDF Full Text Request
Related items