Prediction Of Phosphorylation Sites In Human Proteins

Posted on: 2019-03-21
Degree: Master
Type: Thesis
Country: China
Candidate: Y W Zhao
Full Text: PDF
GTID: 2310330563454136
Subject: Biophysics
Abstract/Summary:
Protein phosphorylation is the transfer of a phosphate group from adenosine triphosphate (ATP) or guanosine triphosphate (GTP) to amino acid residues of a specific protein, catalyzed by protein kinases. Studies have shown that phosphorylation is the most common and most important type of protein post-translational modification. It participates in a variety of signal transduction and cellular metabolic pathways and plays an irreplaceable role in regulating the activities of living organisms. With the rapid development of high-throughput mass spectrometry, protein phosphorylation site data are accumulating quickly, and this large body of accurate site data makes it possible to study protein phosphorylation sites systematically. It is therefore important to build an accurate and robust model for predicting protein phosphorylation sites.

First, experimentally validated human protein phosphorylation data were collected from UniProt, and positive and negative sample sets were constructed after removing redundant sequences. We then statistically analyzed the positional conservation, secondary structure, and accessibility of the residues surrounding phosphorylation and non-phosphorylation sites, as well as the distribution of the physicochemical properties of these amino acids. The analysis gave the following results: residues surrounding phosphorylation sites are more conserved than those surrounding non-phosphorylation sites; residues around phosphorylation sites have a higher mean accessibility than those around non-phosphorylation sites and tend to form loop structures; for the nine physicochemical properties, positive samples show larger fluctuations than negative samples; and the distribution of these properties flanking the phosphorylation sites is not symmetrical. These results show that the properties are important for recognizing phosphorylation sites.

Based on these results, we first constructed a phosphorylation site prediction model using an optimal-window strategy applied to each type of feature. Jackknife cross-validation shows that the model performs well. Second, to account for position dependence, we built a model based on position-related information, which achieves higher prediction performance (auROC) on all three sample data sets. Finally, based on this model, we built a web server called PhospSitePred, which can be freely accessed at http://lin-group.cn/server/PhospSitePred/.
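The window-based, position-dependent representation described above can be illustrated with a minimal sketch. It is not the thesis's actual pipeline: the candidate residue set (S/T/Y), the padding symbol `X`, the half-width of 3, the toy sequence, and the function names `extract_windows` and `one_hot` are all assumptions made for illustration. The features analyzed in the thesis (conservation, secondary structure, accessibility, physicochemical properties) are not reproduced here. Only a simple position-specific one-hot encoding is shown.

```python
from typing import List, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
PAD = "X"                   # assumed padding symbol beyond the sequence ends
TARGETS = {"S", "T", "Y"}   # assumed candidate residues (Ser, Thr, Tyr)


def extract_windows(sequence: str, half_width: int) -> List[Tuple[int, str]]:
    """Return (1-based position, window) for every candidate residue."""
    padded = PAD * half_width + sequence + PAD * half_width
    windows = []
    for i, residue in enumerate(sequence):
        if residue in TARGETS:
            # In the padded string the residue sits at index i + half_width,
            # so the slice below is centered on it.
            windows.append((i + 1, padded[i : i + 2 * half_width + 1]))
    return windows


def one_hot(window: str) -> List[int]:
    """Position-specific one-hot encoding; padding maps to all zeros."""
    vector: List[int] = []
    for residue in window:
        vector.extend(1 if residue == aa else 0 for aa in AMINO_ACIDS)
    return vector


if __name__ == "__main__":
    toy_sequence = "MKRSPEEYLT"  # made-up sequence, not thesis data
    for position, window in extract_windows(toy_sequence, half_width=3):
        print(position, window, len(one_hot(window)))
```

Because each window position gets its own block of 20 entries, the encoding preserves where a residue sits relative to the candidate site. In a setup like the one the abstract describes, such vectors would be passed to a classifier such as a support vector machine, and the half-width would be chosen per feature type by cross-validation rather than fixed in advance.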
Keywords/Search Tags: phosphorylation of protein, feature selection, optimal window strategy, location association, support vector machine