Predicting Protein Metal Ion Binding Sites Based On Deep Learning

Posted on: 2021-09-18
Degree: Master
Type: Thesis
Country: China
Candidate: S J Zhang
GTID: 2480306107962549
Subject: Software engineering
Abstract/Summary:
Proteins perform their functions by interacting with ligand molecules. It has been confirmed that more than half of all proteins have binding sites for metal ions. These metal ions not only stabilize the protein structure but also regulate the biological function of the protein. For example, the binding of iron ions (Fe3+) to hemoglobin is critical to its function of carrying and transferring oxygen through the blood, and the binding of Zn2+ ions to nucleases and transcription factors plays a crucial structural role in the formation of zinc domains. Accurate recognition of protein metal ion binding sites is therefore very important for understanding protein functional mechanisms and discovering new drugs.

Predicting the binding sites between proteins and metal ions is essentially a residue-level prediction task, meaning that the result is tied to specific residues. Residues that are neighbors in the 3D structure may be far apart in the sequence, so both local and non-local dependencies are crucial for residue-level attribute prediction. To model these dependencies and improve prediction performance, a hybrid ACNN + GRU deep learning method is used to capture local dependency features and long-range correlation features. The ACNN learns local dependency features of the protein sequence, and the GRU learns its long-range dependency features. The fused features are then classified, which directly yields the binding-site prediction for the protein sequence. The hybrid deep learning model is an end-to-end model that requires no manual feature selection, data pre-processing, or post-processing: the input is a protein sequence and the output is the prediction result for that sequence.

To address the data imbalance between binding sites and non-binding sites, the cross-entropy loss function is modified so that it balances positive and negative samples and focuses on difficult, misclassified samples. The extracted metal ion data set is divided in a ratio of 8:1:1, and performance on this data set is greatly improved compared with other methods. The indicators on the training set are better than those of previous methods. On the test set, the precision (Pre) for most ions is about 20% higher than that of the comparison method, and the accuracy (ACC) and specificity (Sp) are about 10% higher. The model is evaluated using MCC, precision, accuracy, specificity (Sp), and sensitivity (Sn).
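
The loss modification and the evaluation criteria mentioned in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact form of the modified loss. The `focal_loss` function below assumes a focal-loss-style weighting of binary cross-entropy, which matches the stated goals (class balancing plus emphasis on hard samples) but may differ from what the thesis actually uses. The function names, the `alpha` and `gamma` values, and the toy labels are all hypothetical. The metric formulas are the standard confusion-matrix definitions.

```python
import math


def focal_loss(p, y, alpha=0.75, gamma=2.0, eps=1e-7):
    """Focal-style binary cross-entropy for one residue (assumed form).

    p: predicted probability that the residue is a binding site.
    y: true label, 1 for binding site, 0 for non-binding site.
    alpha: weight on the positive class (illustrative value).
    gamma: focusing exponent; larger values down-weight easy samples.
    """
    p = min(max(p, eps), 1.0 - eps)            # avoid log(0)
    p_t = p if y == 1 else 1.0 - p             # probability given to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def residue_metrics(y_true, y_pred):
    """Compute Sn, Sp, Pre, ACC and MCC from per-residue binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, q in pairs if t == 1 and q == 1)
    tn = sum(1 for t, q in pairs if t == 0 and q == 0)
    fp = sum(1 for t, q in pairs if t == 0 and q == 1)
    fn = sum(1 for t, q in pairs if t == 1 and q == 0)

    def ratio(num, den):
        # Return 0.0 when a metric is undefined (empty denominator).
        return num / den if den else 0.0

    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "Sn": ratio(tp, tp + fn),              # sensitivity (recall)
        "Sp": ratio(tn, tn + fp),              # specificity
        "Pre": ratio(tp, tp + fp),             # precision
        "ACC": ratio(tp + tn, len(pairs)),     # accuracy
        "MCC": ratio(tp * tn - fp * fn, mcc_den),
    }


# Toy example: 10 residues, 3 of which are true binding sites.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 1, 0]
y_pred = [1, 0, 0, 0, 1, 0, 0, 0, 1, 0]
metrics = residue_metrics(y_true, y_pred)
```

With `gamma=0` and `alpha=0.5` the loss reduces to half the ordinary cross-entropy, which makes the effect of the two extra terms easy to isolate. MCC is included alongside accuracy because accuracy alone can look high on imbalanced data when most residues are non-binding.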
Keywords: CNN, RNN, metal ions, proteins, binding sites