
A Bioinformatics Study Of Human Ubiquitin Ligase-substrate Interactions

Posted on: 2018-06-28
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y Li
GTID: 1314330518465213
Subject: Biochemistry and Molecular Biology
Abstract/Summary:
Ubiquitin, an abundant 76-amino-acid polypeptide, can be covalently conjugated to target proteins through an isopeptide bond between its C-terminal carboxyl group and the amino group of a lysine residue. This process is mediated by an enzyme cascade consisting of a ubiquitin-activating enzyme (E1), a ubiquitin-conjugating enzyme (E2) and a ubiquitin ligase (E3). Ubiquitination regulates a wide spectrum of proteolytic and nonproteolytic cellular processes in eukaryotes, including proteasome-mediated protein degradation, inflammatory signaling, the DNA damage response and the regulation of enzymatic activity. Ubiquitination is therefore closely related to the development of many diseases, such as Alzheimer's disease, Parkinson's disease and various cancers.

During ubiquitination, the interaction between the ubiquitin ligase and its substrate determines the specificity of the reaction and the fate of the substrate. Multiple experimental strategies (e.g. global protein stability profiling, protein microarrays, live phage display libraries and mass spectrometry) have been developed to identify E3-substrate interactions (ESIs). However, because of the low levels of E3 substrates and the intrinsically weak interactions between E3s and substrates, these methods are laborious, time-consuming, expensive and inefficient. As a result, although the newest ubiquitination site database, mUbiSiDa, contains more than 30,000 ubiquitination sites on over 5,700 substrates, only ~861 human E3-substrate relationships are recorded in the current database, which indicates that only a small proportion (~15%) of ubiquitinated proteins have a known corresponding ubiquitin ligase. A robust computational strategy is therefore desirable for systematically identifying potential E3-substrate interactions at the proteome scale.

To address this challenge, we set out to construct a prediction model for human E3-substrate interactions. First, we built a workflow to extract E3-substrate interactions from the literature as a gold standard positive dataset.
Abstracts potentially containing E3-substrate interactions were downloaded from both PubMed and Web of Science and further mined with a literature mining tool called E3miner. The candidate E3-substrate interactions were then manually curated and double-checked. Finally, we established a high-confidence dataset containing 1,315 E3-substrate interactions, which is the largest E3-substrate interaction dataset. Network topological analysis showed that the ESI network has the scale-free property.

The ESI dataset was divided, according to the publication date of the source literature, into a gold standard positive (GSP) dataset (before Jan 1st, 2010) and an independent test dataset (after Jan 1st, 2010). Since it is difficult to find an experimentally validated gold standard negative (GSN) dataset, we randomly selected pairs of an E3 and one of its interacting proteins that were not included in the GSP dataset as the GSN dataset.

Then, five heterogeneous features were established for ESI prediction: homologous E3-substrate interactions, enriched domain pairs, enriched GO term pairs, protein interaction network loops and inferred E3 recognition consensus motifs. We found that E3s may interact with substrates by recognizing specific domains or motifs, and that E3s and substrates tend to form three- or four-interaction loops in the protein interaction network. The gold standard datasets were used to assess the predictive ability of each feature. The assessment showed that all of these features can be used to predict E3-substrate interactions. The features are also helpful for discovering potential E3-substrate interaction domains and E3 recognition motifs.
For example, the predicted E3-recognized domain "TP53 DNA-binding domain" was reported to interact with the E3 WWP1 (enrichment ratio: 7.21), and the reported "KEN" motif recognized by the ubiquitin ligase complex APC/C was also found in our motif dataset (motif score: 16.13).

Next, a naive Bayesian classifier was used to integrate the multiple lines of biological evidence. The performance of the integrated naive Bayesian classification model was assessed by five-fold cross-validation and measured by the area under the ROC curve (AUROC). The model achieved an AUROC of approximately 0.827 in cross-validation, indicating satisfactory performance. The AUROC on the independent test dataset was 0.733, indicating that the model is also able to predict novel ESIs.

Based on the constructed E3-substrate interaction prediction model, we performed a proteome-wide E3-substrate interaction prediction and built an online platform, UbiBrowser (http://ubibrowser.ncpsb.org), to present the human ubiquitin ligase-substrate interaction network. UbiBrowser provides the prediction results together with supporting evidence and has three main views: network view, list view and sequence view. The network view shows the predicted E3-substrate interactions, and the list view shows literature-reported E3-substrate interactions. The sequence view shows ubiquitination sites from the literature, predicted E3-substrate interaction domain pairs and potential E3 consensus motifs.

Using UbiBrowser, we predicted interactions between disease promoters and their potential upstream regulatory E3 ligases. These predictions (ITCH-TAB1, CHIP-EGFR and NEDD4-HER3) have been confirmed by recent publications. We also experimentally validated one predicted E3-substrate interaction (Smurf1 and Smad3).
Our results showed that overexpressed Smurf1 mediated the ubiquitination of Smad3, illustrating the utility of our model for identifying potential E3-substrate interactions.

In conclusion, to reveal the proteome-wide E3-substrate interaction network, we carried out a series of studies comprising data collection, construction of a predictive model and development of an online browsing platform. The first human E3-substrate interaction browser, provided in this work, is helpful for detecting E3-substrate pairs efficiently and for understanding the mechanisms of ubiquitination.
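
The evidence-integration step summarized in this abstract can be illustrated with a small sketch. It is not the dissertation's implementation. It assumes that each of the five evidence types has been reduced to a binary present/absent flag for an E3-protein pair, applies add-one smoothing, and evaluates on toy data rather than by five-fold cross-validation. The function names (`fit_likelihood_ratios`, `llr_score`, `auroc`) and all data values are hypothetical.

```python
import math


def fit_likelihood_ratios(rows, labels, n_features, pseudocount=1.0):
    """Estimate, for every feature j and value v in {0, 1}, the likelihood
    ratio P(f_j = v | positive) / P(f_j = v | negative) with smoothing."""
    pos = [r for r, y in zip(rows, labels) if y == 1]
    neg = [r for r, y in zip(rows, labels) if y == 0]
    table = []
    for j in range(n_features):
        ratios = {}
        for v in (0, 1):
            p_pos = (sum(1 for r in pos if r[j] == v) + pseudocount) / (
                len(pos) + 2 * pseudocount
            )
            p_neg = (sum(1 for r in neg if r[j] == v) + pseudocount) / (
                len(neg) + 2 * pseudocount
            )
            ratios[v] = p_pos / p_neg
        table.append(ratios)
    return table


def llr_score(row, table):
    """Naive Bayes integration: sum the per-feature log likelihood ratios,
    which treats the features as conditionally independent."""
    return sum(math.log(table[j][v]) for j, v in enumerate(row))


def auroc(scores, labels):
    """AUROC as the fraction of (positive, negative) pairs in which the
    positive has the higher score; ties count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data. Columns (assumed order): homologous ESI, enriched
# domain pair, enriched GO term pair, network loop, consensus motif.
toy_rows = [
    (1, 1, 0, 1, 0),
    (1, 0, 1, 1, 1),
    (0, 1, 1, 0, 1),
    (0, 0, 0, 1, 0),
    (0, 0, 1, 0, 0),
    (1, 0, 0, 0, 0),
]
toy_labels = [1, 1, 1, 0, 0, 0]  # 1 = GSP-like pair, 0 = GSN-like pair

toy_table = fit_likelihood_ratios(toy_rows, toy_labels, n_features=5)
toy_scores = [llr_score(r, toy_table) for r in toy_rows]
print("AUROC on the toy training data:", auroc(toy_scores, toy_labels))
```

In a setting closer to the one described, the likelihood ratios would be fitted on the GSP and GSN pairs in each training fold, and the AUROC would be computed on the held-out fold and on the post-2010 independent test set.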
Keywords/Search Tags: Bioinformatics, Ubiquitin ligase, E3-substrate interaction, naive Bayesian classifier