
Prediction And Functional Analysis Of Escherichia coli O157:H7 Protein-Protein Interaction Modules

Posted on: 2010-04-11
Degree: Master
Type: Thesis
Country: China
Candidate: X Wang
Full Text: PDF
GTID: 2120360275462358
Subject: Genetics
Abstract/Summary:
With the development of experimental techniques and bioinformatics, the amount of available protein-protein interaction (PPI) data is increasing exponentially. To analyze such large and complicated PPI datasets, it is more feasible and efficient to predict functional modules from the PPI network. These functional modules usually represent protein complexes or groups of proteins participating in the same biological process. Complex cellular processes are modular and are accomplished by the concerted action of functional modules, so investigating functional modules leads to a better understanding of cellular organization, processes, and functions.

Modularity analysis requires PPI data. However, experimental PPI data are still limited and are available mostly for a few model organisms, such as Saccharomyces cerevisiae. No modularity analysis has been reported for species lacking experimental PPI data, especially pathogens. Our goal is therefore to develop a method for modularity analysis of pathogens that lack experimental PPI data, and to discuss how modularity analysis can help the study of pathogenicity and cellular processes.

The bacterium Escherichia coli O157:H7 Sakai strain was selected for this study. The pathogen causes diarrhea and hemorrhagic colitis, and can also induce hemolytic uremic syndrome (HUS). The death rate is between 2% and 7%. E. coli O157:H7 has become a worldwide threat to public health, and there is no effective method for curing or preventing the infection. In 2001, the E. coli O157:H7 Sakai strain was sequenced in Japan, which made modularity analysis at the proteomic scale possible.

Firstly, a domain-based PPI prediction method was used to predict the PPIs inside the E. coli O157:H7 cell. A domain interaction matrix was built from 3722 credible protein interactions from the Database of Interacting Proteins (DIP), each validated by two or more experimental methods, and this matrix was then used to predict PPIs in O157:H7. This computation produced the raw PPI data. Two post-processing steps were applied to eliminate directionally repeated interactions (the same pair listed in both orders) and self-interactions. The final dataset includes 12130 PPIs involving 1652 proteins. Topology analysis of the predicted interaction network showed that it has the same topological properties as experimental protein interaction networks, namely the scale-free and small-world properties. This result indicates that the predicted PPIs are reliable.

Secondly, the Markov Cluster algorithm (MCL) was used to predict modules from the predicted PPIs. MCL simulates random flow on the PPI graph using a matrix built from its adjacency matrix, and the graph is then partitioned into high-flow regions that correspond to protein modules. The inflation coefficient (I) is the most important parameter of MCL. We tested five values of I (1.4, 1.8, 2.2, 2.6 and 3.0) and, considering both the MCL evaluation program and the algorithm author's suggestion for optimization, chose I = 1.8 for further research. This computation predicted 172 protein modules. However, modules predicted by MCL have no overlapping components, whereas in real organisms some proteins exist in multiple complexes or participate in several cellular processes at the same time. We therefore identified the proteins shared between modules in a post-processing step and assigned them to the corresponding modules. In the final dataset, the largest module contains 83 proteins and the smallest contains only 2.

Thirdly, the predicted modules were evaluated by GO annotation analysis and by comparison with KEGG pathways and conserved protein complexes. The GO annotation analysis shows that 77.3% of the modules have an enriched GO term.
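The first step of the pipeline, domain-based prediction followed by the two clean-up passes, can be illustrated with a small sketch. The abstract does not give the scoring scheme, so this version makes a simplifying assumption: a domain pair counts as interacting if it co-occurs in at least `min_support` trusted interactions, and two target proteins are predicted to interact if they carry any such domain pair. All function names, protein IDs and domain IDs below are hypothetical.

```python
from collections import Counter
from itertools import product


def build_domain_pairs(trusted_ppis, domains, min_support=1):
    """Collect domain pairs seen in trusted interactions (assumed rule, not the thesis's scoring)."""
    counts = Counter()
    for a, b in trusted_ppis:
        # Every cross-protein domain combination, stored order-independently.
        pairs = {tuple(sorted(p)) for p in product(domains.get(a, ()), domains.get(b, ()))}
        counts.update(pairs)
    return {pair for pair, c in counts.items() if c >= min_support}


def predict_raw_ppis(target_domains, domain_pairs):
    """Raw, directional predictions: (p, q) and (q, p) both appear, as do self-pairs."""
    raw = set()
    for p, q in product(sorted(target_domains), repeat=2):
        combos = product(target_domains[p], target_domains[q])
        if any(tuple(sorted(c)) in domain_pairs for c in combos):
            raw.add((p, q))
    return raw


def postprocess(raw):
    """Drop self-interactions and merge directional repeats into one unordered pair."""
    return {tuple(sorted(pair)) for pair in raw if pair[0] != pair[1]}


if __name__ == "__main__":
    # Hypothetical trusted interactions and domain annotations.
    trusted = [("y1", "y2"), ("y3", "y3")]
    trusted_domains = {"y1": ["PF1"], "y2": ["PF2"], "y3": ["PF4"]}
    # Hypothetical target proteome annotated with the same domain vocabulary.
    target = {"e1": ["PF1"], "e2": ["PF2"], "e3": ["PF3"], "e4": ["PF4"], "e5": ["PF4"]}

    pairs = build_domain_pairs(trusted, trusted_domains)
    raw = predict_raw_ppis(target, pairs)
    print(len(raw), "raw predictions;", sorted(postprocess(raw)))
```

Storing each pair as a sorted tuple is what removes the directional repeats; a real pipeline would more likely score domain pairs probabilistically than apply a yes/no rule.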
The comparison with KEGG pathways shows that 63.8% of the modules are reliable. A search of the literature and of protein complex databases shows that 33.1% of the modules have conserved complexes in other bacteria. These results indicate that the predicted modules are functionally homogeneous and biologically significant, and that our method of modularity analysis is reliable. Based on the evaluation results, 133 highly reliable modules were selected from the 172 predicted modules to guide further research.

Last but not least, we discussed the results obtained by our modularity analysis method. First, by searching known virulence factors of E. coli O157:H7, 6 pathogenicity-related modules were discovered: a novel module probably related to adhesion, and modules related to iron acquisition, Shiga toxin, the type III secretion system, urease, and the stress response to temperature.
The study of these pathogenicity-related modules not only helps to discover new virulence factor candidates but also provides new information on the pathogenicity of O157:H7, which gives clues for further experimental validation. Second, three kinds of information were integrated to investigate relations between predicted modules: direct interactions or overlapping components between modules, GO annotation, and KEGG pathway data. Based on these relations, we took cell division as an example and discussed the modularity of cellular functions. The study of the modularity of cellular functions will facilitate the development of synthetic biology. Cooperative effects among modules were also discovered, and examples of positive and negative cooperative effects are shown. Moreover, by comparison with KEGG pathway data, our modularity analysis of O157:H7 can provide candidates for biological pathway extension and clues for discovering new cross-talk between pathways, which is of great importance in biological pathway research.

In conclusion, this study developed a reliable method for modularity analysis and gave the first modularity analysis of a pathogen, which sheds new light on the study of pathogenicity and cellular processes. This study also provides a novel method for applying modularity analysis to any other sequenced organism without any previous experimental data.
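The clustering step at the core of this modularity analysis, MCL, alternates expansion (squaring the column-stochastic flow matrix) with inflation (raising each entry to the power I and renormalizing the columns) until the matrix stops changing. The following is a minimal pure-Python sketch of that loop for toy graphs, not the MCL program used in the study. The function names, the convergence tolerance and the cluster read-out threshold are assumptions chosen for illustration.

```python
def normalize_columns(m):
    """Scale every column of a square matrix to sum to 1 (in place)."""
    n = len(m)
    for j in range(n):
        s = sum(m[i][j] for i in range(n))
        if s > 0:
            for i in range(n):
                m[i][j] /= s
    return m


def matmul(a, b):
    """Plain O(n^3) product of two square matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mcl(edges, inflation=1.8, max_iter=100, tol=1e-9):
    """Cluster an undirected graph given as (node, node) pairs."""
    nodes = sorted({p for e in edges for p in e})
    idx = {p: i for i, p in enumerate(nodes)}
    n = len(nodes)
    m = [[0.0] * n for _ in range(n)]
    for a, b in edges:
        m[idx[a]][idx[b]] = m[idx[b]][idx[a]] = 1.0
    for i in range(n):
        m[i][i] = 1.0  # self-loops, a common MCL convention
    normalize_columns(m)

    for _ in range(max_iter):
        expanded = matmul(m, m)  # expansion: flow spreads along paths
        inflated = normalize_columns(
            [[x ** inflation for x in row] for row in expanded]
        )  # inflation: strong flows are boosted, weak flows decay
        delta = max(abs(inflated[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = inflated
        if delta < tol:
            break

    # Each row that still carries flow lists the nodes drawn to that attractor.
    clusters = set()
    for i in range(n):
        members = frozenset(nodes[j] for j in range(n) if m[i][j] > 1e-6)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)


if __name__ == "__main__":
    # Hypothetical toy network: two triangles joined by one bridge edge.
    toy = [("A", "B"), ("B", "C"), ("A", "C"),
           ("D", "E"), ("E", "F"), ("D", "F"), ("C", "D")]
    for value in (1.4, 1.8, 2.2, 2.6, 3.0):
        print(value, mcl(toy, inflation=value))
```

Running the loop over the five inflation values listed in the abstract shows how the parameter controls granularity: larger values tend to yield more and smaller modules. For proteome-scale networks, a sparse-matrix implementation would be needed in place of these dense lists.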
Keywords/Search Tags: protein-protein interaction, functional module, pathogenicity, bioinformatics