Research On Protein Function Prediction Based On Deep Learning

Posted on: 2021-04-30
Degree: Master
Type: Thesis
Country: China
Candidate: C Chen
GTID: 2370330611488197
Subject: Statistics
Abstract/Summary:
With the development of biomedical big data, mining potential biological value from proteomics data is of great significance for studying the mechanisms of intermolecular interactions, drug design, and human disease prevention. Wet-lab approaches are time-consuming and resource-intensive, so computational methods for predicting protein function are of great importance. With the vigorous development of deep learning, predicting protein-protein interactions (PPIs) and drug-target interactions (DTIs) through deep learning has become a research hotspot in bioinformatics. This thesis predicts PPIs and DTIs based on deep learning, and the main research contents are as follows:

1. We propose a protein-protein interaction prediction method based on an ensemble residual convolutional neural network, called EResCNN. First, pseudo-amino acid composition (PseAAC), auto-covariance descriptor (AC), pseudo position-specific scoring matrix (PsePSSM), encoding based on grouped weight (EBGW), multivariate mutual information (MMI), and conjoint triad (CT) are fused to extract protein physicochemical property information, evolutionary information, and sequence information. Second, high-level features of PPIs are mined through the layer-by-layer learning ability of the residual convolutional neural network. We then ensemble fully connected networks, LightGBM, and extremely randomized trees to predict PPIs. Five-fold cross-validation shows that the accuracy values on the S. cerevisiae, H. pylori, and Human-Y. pestis datasets are 94.88%, 88.24%, and 97.88%, respectively, which are superior to those of state-of-the-art PPI prediction methods. The accuracy (ACC) values of EResCNN on H. sapiens, M. musculus, C. elegans, and E. coli are 95.25%, 96.49%, 92.08%, and 92.13%, respectively. The PPI network prediction results indicate that EResCNN can be used to explore the topology and biomedical significance of protein-protein interaction networks.

2. We propose a drug-target interaction prediction method based on a deep neural network, called DNN-DTIs. First, pseudo-amino acid composition (PseAAC), pseudo position-specific scoring matrix (PsePSSM), conjoint triad (CT), composition, transition and distribution (CTD), Moreau-Broto autocorrelation, and secondary structure features are used to characterize target information, and the molecular substructure fingerprint from the PubChem database is used to characterize drug information. Second, XGBoost feature selection is employed to eliminate redundant and irrelevant features, and the synthetic minority oversampling technique (SMOTE) is used to balance the sample dataset. Finally, a prediction model of drug-target interactions based on a deep neural network (DNN) is constructed. Five-fold cross-validation shows that the prediction accuracies of DNN-DTIs on the Enzyme, Ion Channel (IC), GPCR, and Nuclear Receptor (NR) datasets are 98.78%, 98.60%, 97.98%, and 98.24%, respectively, which are superior to those of other drug-target interaction prediction methods. To further evaluate the strengths and weaknesses of DNN-DTIs, we predict and plot the drug-target interaction network, which can provide new ideas and methodologies for drug design and the identification of new DTIs.
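As an illustration of one of the sequence descriptors named above, the following is a minimal sketch of conjoint triad (CT) encoding. It is not the thesis's implementation: the seven-class amino acid grouping and the (count - min) / max normalization follow the commonly cited formulation of the descriptor and are assumptions here, and the names `conjoint_triad` and `pair_features` are hypothetical.

```python
# Assumption: the commonly cited seven-class grouping of the 20 amino acids
# (by dipole and side-chain volume); the thesis abstract does not state it.
CT_GROUPS = ["AGV", "ILFP", "YMTS", "HNQW", "RK", "DE", "C"]
AA_TO_CLASS = {aa: idx for idx, group in enumerate(CT_GROUPS) for aa in group}


def conjoint_triad(sequence):
    """Return a 343-dimensional (7 * 7 * 7) normalized triad-frequency vector."""
    # Non-standard residues (e.g. 'X') map to None.
    classes = [AA_TO_CLASS.get(aa) for aa in sequence.upper()]
    counts = [0] * 343
    # Slide a window of three consecutive residues along the sequence.
    for a, b, c in zip(classes, classes[1:], classes[2:]):
        if a is None or b is None or c is None:
            continue  # skip triads that contain an unknown residue
        counts[a * 49 + b * 7 + c] += 1
    lo, hi = min(counts), max(counts)
    if hi == 0:
        # Sequence too short or without any valid triad.
        return [0.0] * 343
    # Assumed normalization: (count - min) / max, to reduce length dependence.
    return [(n - lo) / hi for n in counts]


def pair_features(seq_a, seq_b):
    """Concatenate the CT vectors of two proteins into one 686-dim pair vector."""
    return conjoint_triad(seq_a) + conjoint_triad(seq_b)
```

In this sketch a protein pair is represented by concatenating the two per-protein vectors; how EResCNN and DNN-DTIs actually fuse CT with the other descriptors is not specified in the abstract.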
Keywords/Search Tags: deep learning, protein function, protein-protein interaction, drug-target interactions, residual convolutional neural network, deep neural network