
Prediction Research Of Protein Function Based On Sequence

Posted on: 2013-10-04
Degree: Master
Type: Thesis
Country: China
Candidate: S Y Huang
GTID: 2230330374964328
Subject: Applied Chemistry
Abstract/Summary:
With the completion of the Human Genome Project and the development of sequencing technology, protein sequence data are accumulating at an explosive pace. Studying these data in depth to obtain protein functional information is one of the important objectives of biological research. In this thesis, multiple feature extraction algorithms and multiple classifiers are proposed to predict protein function from the protein primary sequence. The main contents are as follows:

(1) A novel method that couples the discrete wavelet transform (DWT) with a support vector machine (SVM) is developed for predicting protein subcellular localization based on amino acid (AA) physicochemical properties. The results indicate that DWT_SVM not only significantly improves the accuracy of subcellular localization prediction, but is also clearly effective at identifying resistant sequences.

(2) A method called PredSulSite, which incorporates protein secondary structure (SS), physicochemical properties of amino acids, and residue sequence-order information, is developed to predict sulfotyrosine sites. On an independent test, PredSulSite significantly improves accuracy compared with existing methods. PredSulSite is available as a community resource at http://bioinfo.ncu.edu.cn/inquiries_PredSulSite.aspx.

(3) A new method is proposed to predict phosphoserine sites of tau proteins based on information entropy and SVM. The impact of each individual feature on the prediction of phosphoserine sites was investigated separately, indicating that AA physicochemical properties, SS, and disorder all contributed to the sulfation site determination. The predictive results indicate that the method can effectively predict phosphoserine sites of tau proteins.
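To make the feature-extraction idea in item (1) concrete, the following is a minimal sketch, not the thesis's actual pipeline. It maps a sequence to a numeric signal with one physicochemical scale, applies a multi-level DWT, and summarizes the coefficients at each level into a fixed-length vector. The abstract does not state which wavelet, which property scales, or which summary statistics were used, so the Haar wavelet, the Kyte-Doolittle hydropathy index, and the max/min/mean/standard-deviation summary are all assumptions; the function names `haar_step`, `summarize`, and `dwt_features` are hypothetical.

```python
import math

# Kyte-Doolittle hydropathy index. ASSUMPTION: used here as one example of an
# amino acid physicochemical property; the thesis may have used other scales.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def haar_step(signal):
    """One level of the Haar DWT; returns (approximation, detail) lists."""
    if len(signal) % 2:
        # Pad odd-length input by repeating the last sample.
        signal = signal + [signal[-1]]
    root2 = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / root2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / root2 for i in range(0, len(signal), 2)]
    return approx, detail


def summarize(coeffs):
    """Max, min, mean, and population standard deviation of a coefficient list."""
    n = len(coeffs)
    mean = sum(coeffs) / n
    var = sum((c - mean) ** 2 for c in coeffs) / n
    return [max(coeffs), min(coeffs), mean, math.sqrt(var)]


def dwt_features(sequence, levels=3):
    """Fixed-length feature vector (4 * (levels + 1) values) for one sequence."""
    # Residues outside the 20 standard amino acids are skipped.
    signal = [KD[aa] for aa in sequence.upper() if aa in KD]
    if len(signal) < 2 ** levels:
        raise ValueError("sequence too short for the requested number of levels")
    features = []
    approx = signal
    for _ in range(levels):
        approx, detail = haar_step(approx)
        features.extend(summarize(detail))   # detail coefficients at this level
    features.extend(summarize(approx))       # final approximation coefficients
    return features


if __name__ == "__main__":
    vec = dwt_features("MKTAYIAKQRQISFVKSHFSRQ", levels=3)
    print(len(vec))
```

Because every sequence yields a vector of the same length regardless of how many residues it has, such vectors could be passed directly to an SVM classifier (for example scikit-learn's `SVC`) for the classification step, which is omitted here.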
Keywords/Search Tags: protein primary sequence, discrete wavelet transform, protein subcellular localization, sulfotyrosine sites, phosphoserine sites, support vector machine, classification prediction