
Using Proteomics Data To Rank Key Proteins And Pathways In Cancer Based On Biological Network

Posted on: 2020-08-10
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J Ren
Full Text: PDF
GTID: 1360330623964072
Subject: Bioinformatics
Abstract/Summary:
Proteomics is the large-scale study of cellular proteins, including protein expression levels, post-translational modifications, and protein-protein interactions (PPIs). It offers a comprehensive, protein-level view of processes such as disease progression and cellular metabolism. With the development of mass spectrometry and the rapid accumulation of high-quality proteomics data, more and more studies focus on how to make full use of large-scale proteomics data to understand the molecular mechanisms of disease and to find therapeutic protein targets. According to data released by the World Health Organization, cancer incidence and mortality worldwide are still rising rapidly, so using proteomics to study the pathogenesis and development of cancer has become an important area of cancer research. However, the identification and ranking of key proteins and network drivers have not yet been fully and systematically studied and evaluated. In this study, we systematically analyzed, compared, and applied protein and pathway ranking algorithms by integrating biological networks with the large-scale proteomics data published by the Clinical Proteomic Tumor Analysis Consortium (CPTAC).

For cancer protein ranking, we used a random walk algorithm to evaluate the effectiveness of different strategies for integrating the PPI network, prior knowledge of cancer genes/proteins, and large-scale protein expression profiles. The prior knowledge comprised three initial protein sets: known disease proteins (KDPs) collected from the OMIM database; differentially expressed proteins (DEPs) obtained from protein expression profiles; and known disease proteins together with their differentially expressed neighbors on the PPI network (eKDPs). We performed global and local rankings with each of the three initial protein sets on proteomics data from colorectal cancer and breast cancer, and evaluated the resulting six methods using leave-one-out cross-validation. The global ranking method outperformed the local ranking method on cancer proteomic data, and, for the same ranking algorithm, ranking based on eKDPs performed better than ranking based on KDPs or DEPs. We also annotated the top-ranked candidate proteins by literature mining and queried gene knockout results in cancer cells. These checks showed that the optimal method could find cancer-related proteins.

Proteins are the basic functional units of life activities in living organisms, and they exist and act within networks or pathways of complex molecular interactions. Both protein expression and phosphorylation strongly influence pathways. How to make full use of proteomic expression profiles and phosphorylation information to find the key pathways in disease, especially subtype-specific pathways, is an important current challenge. Building on our previous work, we proposed an algorithm that integrates proteomic expression and phosphorylation information for pathway analysis of cancer proteomics data, and we systematically compared, evaluated, and optimized it. Specifically, we chose three kinds of pathway analysis methods: over-representation analysis, functional class scoring, and topology-based pathway analysis, and we proposed a different omics integration method for each. Pathway analysis using integrated protein expression and modification information performed better than analysis using either type of information alone. When using integrated information, target pathways of cancer ranked lowest with the pathway-topology-based method. Furthermore, we applied topology-based pathway analysis with integrated information to four subtypes of breast cancer. Subtype-specific top-ranked pathways were observed, and some results were consistent with previous reports. For example, the p53 pathway ranked lowest in the Basal-like subtype and ranked lower in Luminal A than in Luminal B.

Finally, we built and published comPath, an R package that integrates proteomic and phosphoproteomic data for pathway analysis and evaluates and visualizes the different impacts of proteomic and phosphoproteomic data on a pathway. This study proposed strategies for key protein ranking, pathway analysis, and the integration of protein expression and phosphorylation profiles of cancer at the proteomic level. These strategies may provide novel insight into the mechanisms of cancer and a reference for other omics integration methods, helping to understand the underlying pathogenesis of the disease and providing a theoretical basis for disease diagnosis and treatment...
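
The abstract does not give the exact random walk formulation, so the following is only a minimal sketch of one common variant, random walk with restart, applied to an undirected PPI network with a seed set such as the KDPs, DEPs, or eKDPs. The function names, the restart probability of 0.3, and the toy network with placeholder labels P1 to P5 are all assumptions made for illustration; none of them come from the dissertation.

```python
from collections import defaultdict


def random_walk_with_restart(edges, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Return {node: score}, the steady-state visiting probabilities.

    edges   -- iterable of (protein_a, protein_b) pairs, treated as undirected
    seeds   -- initial protein set; the walker restarts uniformly over it
    restart -- probability of jumping back to a seed at each step (assumed value)
    """
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)

    seeds_in_network = [s for s in set(seeds) if s in neighbors]
    if not seeds_in_network:
        raise ValueError("no seed protein is present in the network")

    nodes = list(neighbors)
    start = {n: 0.0 for n in nodes}
    for s in seeds_in_network:
        start[s] = 1.0 / len(seeds_in_network)

    scores = dict(start)
    for _ in range(max_iter):
        # Restart term: a fraction of the probability mass returns to the seeds.
        updated = {n: restart * start[n] for n in nodes}
        # Walk term: each node spreads the rest evenly over its neighbors.
        for u in nodes:
            share = (1.0 - restart) * scores[u] / len(neighbors[u])
            for v in neighbors[u]:
                updated[v] += share
        change = sum(abs(updated[n] - scores[n]) for n in nodes)
        scores = updated
        if change < tol:
            break
    return scores


def rank_candidates(scores, seeds):
    """Rank non-seed proteins by descending score (ties broken by name)."""
    candidates = [n for n in scores if n not in seeds]
    return sorted(candidates, key=lambda n: (-scores[n], n))


# Toy network with placeholder protein labels (not real data).
toy_edges = [("P1", "P2"), ("P2", "P3"), ("P3", "P4"), ("P2", "P5")]
toy_seeds = {"P1"}
toy_scores = random_walk_with_restart(toy_edges, toy_seeds)
print(rank_candidates(toy_scores, toy_seeds))
```

Under the same assumptions, a leave-one-out check could be built on top of this by removing one known disease protein from the seed set, rerunning the walk, and recording where the held-out protein lands among the candidates.
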
Keywords/Search Tags:proteomics, cancer, protein prioritization, PPI network, phosphoproteomics, integration of omics, pathway analysis