Bioinformatic Studies Of Covalent Modifications And Zinc-binding In Proteins

Posted on: 2014-02-23  Degree: Doctor  Type: Dissertation
Country: China  Candidate: Z X Liu  Full Text: PDF
GTID: 1220330398464472  Subject: Bioinformatics
Abstract/Summary:
By modifying proteins in time and space, protein covalent modifications (PCMs) greatly expand proteome diversity and play critical roles in regulating various biological processes. Although recent developments in high-throughput proteomics have greatly advanced the understanding of PCMs, dissecting their regulatory mechanisms and biological roles remains a great challenge. My work to date has focused on computational studies of PCMs and can be summarized in two aspects: (1) development of algorithms and software for the prediction and analysis of PCMs; and (2) resource construction and systematic studies of PCMs.

Identification of site-specific substrates is fundamental to dissecting the molecular mechanisms and biological functions of PCMs, but it remains difficult under current technical limitations. The accumulation of experimental discoveries now makes it possible to develop computational tools to predict and analyze PCMs, which can guide further experiments. The GPS series of algorithms was first developed by our group to predict kinase-specific phosphorylation sites and was later updated to GPS 2.0. Recently, I improved and refined the algorithm into versions 2.1, 2.2 and 3.0 for the prediction of various PCMs, including phosphorylation, calpain cleavage, S-nitrosylation, nitration, pupylation, palmitoylation and sulfation. The algorithms are summarized in our book chapter, and the software has been reviewed and used by a number of experts to obtain computational analyses that inform their experiments. Furthermore, I extended the GPS algorithm to predict the APC/C recognition motifs KEN-box and D-box, and successfully combined the GPS algorithm with a Gibbs sampling approach to predict I-Ag7 and HLA-DQ8 epitopes.
Moreover, I recently developed the Geometric REstriction (GRE) approach and the predictor GRE4Zn for structure-based prediction of zinc-binding sites in proteins, which achieved superior performance. In addition, I performed an electrostatic free-energy analysis of the RNA recognition motif domains of Hu antigen R during RNA binding and participated in reviewing recent computational studies of phosphorylation.

During the development of these predictors, by manually reading thousands of papers, I constructed the most integrative datasets for these PCMs, which make it possible to systematically analyze the molecular regulatory mechanisms and functions of different PCMs. Statistical analyses of gene ontology (GO) annotations suggested that calpain substrates are enriched in responses to a variety of stimuli, such as drugs, corticosterone and organic nitrogen, and are highly implicated in the regulation of the mitochondrial membrane and apoptosis. For the newly discovered PCM pupylation, statistical analyses suggested that pupylated proteins play important roles in translation, cellular amino acid biosynthetic process, glycolysis, response to stress, sulfate transport and proton transport.

As two nitric oxide-dependent PCMs, cysteine S-nitrosylation and protein tyrosine nitration (PTN) are both highly implicated in a number of biological processes, such as anti-apoptosis and protein folding. However, statistical comparisons showed that PTN substrates were enriched in transcription and translation, while S-nitrosylation substrates were over-represented in ion transport, glycolysis and multicellular organismal development. These results indicated that, in contrast to S-nitrosylation, PTN preferentially targets basic biological processes and functions. Furthermore, as an E3 ubiquitin ligase critical for the cell cycle, the anaphase-promoting complex (APC/C) recognizes substrates through motifs such as the KEN-box and D-box, and a number of KEN-box and D-box proteins are closely related to mitosis.
However, statistical comparisons showed that KEN-box proteins were preferentially involved in mitosis-related processes, while D-box proteins were implicated in a broader spectrum of biological processes. In addition, previous studies suggested that mouse I-Ag7 is equivalent to human HLA-DQ8, and cross-evaluation with our self-developed predictor showed that the binding patterns of I-Ag7 and HLA-DQ8 are highly similar and conserved.

Recently, the rapid development of state-of-the-art techniques, especially genome sequencing and high-throughput mass spectrometry, has generated vast amounts of data and brought biological research into the era of big data. However, owing to the lack of comprehensive annotation resources, analyzing these large-scale data remains a great challenge, which has limited systematic studies of PCMs. Recently, I contributed substantially to the construction of databases including MiCroKit, PhosSNP, CPLA, UUCD and EKPD, and performed a series of systematic studies of the human lysine acetylation network, Plk-mediated phosphoregulation and PCM crosstalk on tyrosine.

MiCroKit is an integrated database of proteins localized at three super-complexes organized during mitosis, namely the midbody, centrosome and kinetochore, which faithfully orchestrate cell division. By manually reading thousands of papers, I constructed the updated dataset and integrated the PCM information. The annotations in the database were used in my recent studies. Furthermore, I participated in systematic studies of the genetic polymorphisms that influence protein phosphorylation (PhosSNP). Since most PCMs are reversibly regulated by enzymes, identifying these regulators is critical for dissecting the molecular regulatory mechanisms. Recently, together with collaborators, I constructed two databases: UUCD, for enzymes of the ubiquitin and ubiquitin-like conjugation systems in 70 eukaryotic species, and EKPD, for kinases and phosphatases in 84 eukaryotic species.
These comprehensive resources will be helpful for further small- or large-scale studies of PCMs.

As a critical component of the "histone code", lysine acetylation was recently discovered to target a broad range of substrates and, in particular, to play an essential role in cellular metabolic regulation. To advance our understanding of protein lysine acetylation, I manually collected all experimentally identified lysine acetylation sites from the literature and integrated the data into the CPLA database. Although numerous acetylated lysines have been discovered, their regulators remain to be identified. I attempted to systematically investigate the enzyme-substrate relationships by introducing protein-protein interaction (PPI) information. By combining experimental and predicted PPIs, I identified a potential human lysine acetylation network (HLAN) among histone acetyltransferases (HATs), substrates and histone deacetylases (HDACs). From the HLAN, a number of potential HAT-substrate-HDAC triplet relations were retrieved, of which at least 13 had previously been experimentally identified. These triplet relations are expected to provide helpful information for further investigation of lysine acetylation, especially its detailed molecular regulatory mechanisms.

Currently, large-scale identification of thousands of phosphorylation sites has become a popular and near-routine assay. However, it is still a great challenge to dissect the detailed regulatory relationships for these sites. Recently, I systematically analyzed Plk-mediated phosphoregulation based on currently available phosphoproteomics data. Statistical analysis of the results suggested that Plk phospho-binding proteins are more closely implicated in mitosis than Plk phosphorylation substrates, and that Plk regulation favors the distributive model.
Additional computational analysis, together with in vitro and in vivo experimental assays, demonstrated that human Mis18B is a novel interacting partner of Plk1, and the phospho-binding sites pT14 and pS48 of Mis18B were identified as we predicted. In addition, functional analysis revealed that this interaction plays a significant role in maintaining the stability of Mis18B and probably promotes the subsequent phosphorylation of Mis18B by Plk1. Recently, a systematic analysis of the in situ crosstalk among tyrosine sulfation, nitration and phosphorylation was performed. The results suggest that both sulfation and nitration preferentially crosstalk in situ with phosphorylation at phosphorylation sites rather than at non-phosphorylatable tyrosines, while sulfation and nitration preferentially crosstalk with phosphorylation in distinct biological processes and functions.

Taken together, although numerous efforts have been made, investigating PCMs remains a great challenge. Since computational tools can conveniently provide helpful information, I anticipate that the combination of computational analyses and experimental studies will advance the understanding of the mechanisms and functions of PCMs.
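
The HAT-substrate-HDAC triplet retrieval mentioned above can be illustrated with a minimal sketch. It is not the dissertation's actual code. It assumes that a triplet is an acetylated substrate with a PPI to at least one HAT and at least one HDAC, which the abstract does not state explicitly. The function name `find_triplets` and all protein identifiers are hypothetical placeholders.

```python
from itertools import product


def find_triplets(hats, hdacs, substrates, ppis):
    """Return sorted (HAT, substrate, HDAC) triplets.

    Assumption (not from the dissertation): a triplet exists when a
    substrate interacts with both a HAT and an HDAC in the PPI data.
    """
    # Build an undirected adjacency map from the PPI pairs.
    neighbors = {}
    for a, b in ppis:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    triplets = []
    for substrate in sorted(substrates):
        partners = neighbors.get(substrate, set())
        linked_hats = sorted(partners & hats)    # HATs interacting with the substrate
        linked_hdacs = sorted(partners & hdacs)  # HDACs interacting with the substrate
        # Every HAT/HDAC pairing around the same substrate is one triplet.
        triplets.extend(
            (hat, substrate, hdac)
            for hat, hdac in product(linked_hats, linked_hdacs)
        )
    return triplets


# Toy input with made-up identifiers, for illustration only.
hats = {"HAT_A", "HAT_B"}
hdacs = {"HDAC_X"}
substrates = {"SUB_1", "SUB_2"}
ppis = [("HAT_A", "SUB_1"), ("SUB_1", "HDAC_X"), ("HAT_B", "SUB_2")]

triplets = find_triplets(hats, hdacs, substrates, ppis)
print(triplets)
```

In this toy input only SUB_1 yields a triplet, because SUB_2 has a HAT partner but no HDAC partner. A real analysis would also need to decide how to weight experimental against predicted PPIs, which this sketch ignores.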
Keywords/Search Tags: protein covalent modification, zinc-binding protein, bioinformatics, computational prediction, systematic analysis, proteomics, predictor, database