
Predicting the Sub-Golgi Apparatus Location of Proteins Based on Sequence and Structural Composition Information

Posted on: 2019-10-28
Degree: Master
Type: Thesis
Country: China
Candidate: L Zhang
GTID: 2370330563456860
Subject: Biophysics
Abstract/Summary:
The Golgi apparatus is an organelle commonly found in eukaryotic cells, where it plays an indispensable role. Many studies have shown that functional defects of the Golgi apparatus are associated with several neurodegenerative diseases, such as Parkinson's disease and Alzheimer's disease. Correctly identifying the types of Golgi-resident proteins is therefore important for understanding their molecular functions in various biological processes.

In this work, we constructed an objective benchmark dataset from the SwissProt database, comprising 79 cis-Golgi proteins and 247 trans-Golgi proteins. Because the structure of a protein determines its function, we searched and analyzed the structural domains in the dataset. We found that proteins at the two Golgi locations share some common structural domains and also carry location-specific domains that perform different functions at different locations. This characteristic information can be used to predict Golgi-resident protein types.

To find more effective parameters for this prediction, features were extracted from both protein sequence and structural information. The sequence features were amino acid composition, gene ontology (GO) of homologous sequences, stickiness, position-specific scoring matrix (PSSM), protein blocks, and physicochemical properties. The structural features were domain information and the autocovariance of average chemical shifts (acACS) derived from protein secondary structure information. The minimum Redundancy Maximum Relevance (mRMR) feature selection method was used to select the informative parameters, and the sub-Golgi apparatus locations of proteins were predicted with the support vector machine algorithm. The accuracy was 93.87% in the jackknife test and 90.91% in the independent test. These results show that the method is effective for identifying the sub-Golgi apparatus locations of proteins.
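
As a rough illustration of two ingredients named in the abstract, the sketch below computes the amino acid composition feature and runs a jackknife (leave-one-out) test. It is a minimal sketch under stated assumptions, not the thesis implementation: the classifier is a simple nearest-centroid rule swapped in for the support vector machine so that the example needs only the Python standard library, the function names are hypothetical, and the toy sequences are invented rather than taken from the benchmark dataset.

```python
from collections import Counter
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(sequence):
    """Return a 20-dimensional vector with the fraction of each standard residue."""
    counts = Counter(sequence.upper())
    total = sum(counts[a] for a in AMINO_ACIDS)  # non-standard letters are ignored
    if total == 0:
        raise ValueError("sequence contains no standard residues")
    return [counts[a] / total for a in AMINO_ACIDS]


def nearest_centroid_predict(train_x, train_y, query):
    """Stand-in classifier (NOT an SVM): choose the class whose mean vector is closest."""
    centroids = {}
    for label in set(train_y):
        rows = [x for x, y in zip(train_x, train_y) if y == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return min(centroids, key=lambda label: math.dist(centroids[label], query))


def jackknife_accuracy(features, labels, predict=nearest_centroid_predict):
    """Leave-one-out test: each sample is predicted from a model built on all the others."""
    correct = 0
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if predict(train_x, train_y, features[i]) == labels[i]:
            correct += 1
    return correct / len(features)


# Hypothetical toy sequences, invented for illustration only.
toy = [
    ("AAAAGAAAAL", "cis"),
    ("AAGAAAAALA", "cis"),
    ("AAAALAAGAA", "cis"),
    ("KKKKEKKKKD", "trans"),
    ("KKEKKKKDKK", "trans"),
    ("KDKKKKEKKK", "trans"),
]
features = [amino_acid_composition(seq) for seq, _ in toy]
labels = [label for _, label in toy]
print(jackknife_accuracy(features, labels))
```

Any function with the same `(train_x, train_y, query)` signature can be passed as `predict`, so a trained support vector machine could take the place of the stand-in rule without changing the jackknife loop.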
Keywords/Search Tags: subgolgi apparatus location, structure domain, average chemical shifts, gene ontology of homologous sequences, support vector machine