
The Prediction Of Metal Ion-binding Site For Membrane Protein Based On Ensemble Learning

Posted on: 2019-01-02
Degree: Master
Type: Thesis
Country: China
Candidate: J Z Song
Full Text: PDF
GTID: 2370330563453722
Subject: Computer system architecture
Abstract/Summary:
Protein structure prediction is one of the most active fields in bioinformatics. It helps in understanding protein structure, and differences between structures can in turn be used to infer biological function. Proteins bind various ligands, such as metal ions, nucleic acids, and phosphoric acid. This research targets proteins that bind metal ions, known as metalloproteins. More than one third of proteins bind metal ions to carry out their functions, so identifying the binding sites in protein sequences is essential. Traditional methods locate binding sites through biological and chemical experiments, which are time-consuming and expensive and therefore hard to apply at large scale. Computational prediction of binding sites is a more promising alternative and can also support research on protein structure.

Membrane proteins, as the gateway and carrier between the biological membrane and the external environment, are crucial for maintaining membrane structure and for material transport. Predicting metal ion-binding sites in membrane protein sequences gives a deeper understanding of the relationship between membrane protein structure and function. We develop a computational method that predicts six kinds of metal ion-binding sites (Ca²⁺, Cu²⁺, K⁺, Mg²⁺, Na⁺, and Zn²⁺) from sequence information. The dataset contains all membrane protein sequences with clear metal ion-binding records in the Protein Data Bank (PDB). Four types of features are extracted from the sequence information: the topology feature, the evolutionary conservation feature, the evolutionary covariation feature, and the surface accessibility feature. The resulting feature matrix is fed into a classifier built with an ensemble learning algorithm.

Traditional methods use a sliding window to reflect the conservation of amino acids and predict binding sites that are contiguous in the sequence. Our method instead focuses on the continuity of binding sites in the protein's 3D structure: because of protein secondary structure, binding sites that are contiguous in the structure may be discrete in the sequence. Results on the validation set show that our method has better generalization ability and predicts the metal ion-binding sites of membrane protein sequences with comparatively good accuracy. On the test set, our method also outperforms S-SITE, which likewise predicts metal ion-binding sites from sequence information.
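
Two ideas from the abstract can be illustrated with a minimal sketch: the sliding-window encoding used by traditional sequence-based predictors, and the combination of several base classifiers into an ensemble. This is not the thesis's implementation. The function names, the example scores, the window width, and the choice of soft voting (averaging probabilities) are all assumptions made for illustration; the abstract does not state which ensemble rule was used.

```python
from typing import List, Sequence


def sliding_windows(scores: Sequence[float], half_width: int,
                    pad: float = 0.0) -> List[List[float]]:
    """Return one window of per-residue scores centred on each position.

    Positions beyond either end of the sequence are filled with `pad`.
    """
    padded = [pad] * half_width + list(scores) + [pad] * half_width
    size = 2 * half_width + 1
    return [padded[i:i + size] for i in range(len(scores))]


def soft_vote(probabilities: Sequence[Sequence[float]],
              threshold: float = 0.5) -> List[int]:
    """Average per-residue binding probabilities from several base models.

    `probabilities[m][i]` is model m's probability that residue i binds.
    Returns 1 (binding) or 0 (non-binding) for each residue.
    """
    n_models = len(probabilities)
    n_residues = len(probabilities[0])
    averaged = [
        sum(model[i] for model in probabilities) / n_models
        for i in range(n_residues)
    ]
    return [1 if p >= threshold else 0 for p in averaged]


# Hypothetical conservation scores for a five-residue fragment.
conservation = [0.1, 0.9, 0.8, 0.2, 0.7]
windows = sliding_windows(conservation, half_width=1)

# Hypothetical outputs of two base models (for example an SVM and a
# random forest, as named in the keywords).
svm_probs = [0.2, 0.8, 0.6, 0.1, 0.4]
forest_probs = [0.1, 0.9, 0.3, 0.3, 0.7]
labels = soft_vote([svm_probs, forest_probs])
```

In this toy input the two residues labelled as binding are not adjacent in the sequence, which is the situation the abstract describes: a window centred on one of them carries little information about the other, even if the two are close in the folded structure.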
Keywords/Search Tags: metal ion-binding sites, membrane protein, support vector machine, random forest, ensemble learning