Deep Learning-Based Classification Towards Compound-Protein Interactions

Posted on: 2019-04-23  Degree: Master  Type: Thesis
Country: China  Candidate: Y Yang  Full Text: PDF
GTID: 2334330566964604  Subject: Software engineering
Abstract/Summary:
Identifying interactions between compounds and proteins plays an important role in network pharmacology and drug discovery. Traditional biochemical methods, however, are generally time-consuming and expensive. With the rapid development of computer software, it has become possible to simulate biochemical experiments computationally. Because this approach is faster and cheaper than traditional biochemical experiments, computational methods are becoming popular. These methods have drawbacks of their own: they require researchers to have a strong background in chemistry, and their accuracy is limited. Molecular docking, for example, is a theoretical simulation method for studying the binding patterns and affinities between compounds and proteins, and it requires a thorough understanding of the structures of both.

In recent years, machine learning has become increasingly common in daily life, in applications such as face recognition, machine translation, and driverless cars. Each of these applications uses a technique called deep learning. Deep learning extracts features automatically, so it does not require researchers to have relevant domain knowledge and has a low barrier to entry. It also has strong learning ability and has achieved higher accuracy than traditional machine learning techniques on many tasks. With this fitting ability, deep learning has achieved great success in computer vision, speech recognition, and natural language processing. At the same time, its application in medicine, chemistry, and biology has gradually developed.

In this paper, compound-protein interaction data are extracted from BindingDB, and the SDF (Structure Data File) records of the compounds and the protein sequences are used for structure-based classification. Negative samples, equal in number to the positive samples, are then generated with a random generation algorithm, and a deep neural network is trained on the resulting data. The input of the deep neural network is the structural data of the compound and the protein, and the output is the binding probability of the compound and the protein. The network structure, named MultipleNet, was determined through a large number of hyper-parameter tuning experiments. MultipleNet is divided into a feature extraction network and a classification network. The feature extraction network extracts the features of compounds and proteins separately; it has 3 hidden layers with 2000 neurons per layer. The classification network classifies compound-protein interactions based on those extracted features; it has only 1 hidden layer with 1000 neurons. MultipleNet has 27.2 million parameters and achieves a test accuracy of 96.73%. This work aims to introduce deep learning techniques into the classification of compound-protein interactions. Although it cannot yet be applied directly in practice, it offers guidance for future development.
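
The abstract says only that negative samples are produced by a "random generation algorithm" in the same number as the positives. One common way to do this is sketched below. It is not the author's implementation: the function name `sample_negatives`, the pair representation, and the assumption that any unobserved compound-protein pair can be treated as non-interacting are all hypothetical.

```python
import random


def sample_negatives(positives, seed=0):
    """Draw as many random (compound, protein) pairs as there are positives.

    Assumption (not from the source): a pair that never appears among the
    known interactions is treated as a negative example.
    """
    positive_set = set(positives)
    compounds = sorted({compound for compound, _ in positive_set})
    proteins = sorted({protein for _, protein in positive_set})

    # A balanced set is only possible if enough unobserved pairs exist.
    unobserved = len(compounds) * len(proteins) - len(positive_set)
    if unobserved < len(positive_set):
        raise ValueError("not enough unobserved pairs for a balanced set")

    rng = random.Random(seed)
    negatives = set()
    while len(negatives) < len(positive_set):
        pair = (rng.choice(compounds), rng.choice(proteins))
        if pair not in positive_set:  # skip known interactions
            negatives.add(pair)       # the set also removes duplicates
    return sorted(negatives)
```

A fixed seed keeps the generated negatives reproducible between runs. Note that randomly paired negatives may include true but unrecorded interactions, which is a known limitation of this kind of sampling.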
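
To make the layer sizes above concrete, the following sketch counts the weights and biases of a network with that shape. Several things here are assumptions rather than facts from the abstract: that all layers are fully connected, that compounds and proteins each get their own 3-layer feature extractor, that the two 2000-dimensional feature vectors are concatenated before classification, and that the output is a single unit. The abstract does not state the input dimensions, so `compound_dim` and `protein_dim` are hypothetical parameters and the sketch does not attempt to reproduce the reported 27.2 million figure.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out


def mlp_params(layer_sizes):
    """Total parameters of a stack of fully connected layers."""
    return sum(dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:]))


def multiplenet_params(compound_dim, protein_dim,
                       hidden=2000, n_hidden=3, clf_hidden=1000):
    """Parameter count for the assumed two-tower layout."""
    # Feature extraction: one tower per input, 3 hidden layers of 2000 units.
    compound_tower = mlp_params([compound_dim] + [hidden] * n_hidden)
    protein_tower = mlp_params([protein_dim] + [hidden] * n_hidden)
    # Classification: concatenated features -> 1000 hidden units -> 1 output.
    classifier = mlp_params([2 * hidden, clf_hidden, 1])
    return compound_tower + protein_tower + classifier
```

For example, `multiplenet_params(1000, 1000)` uses illustrative input widths of 1000 each; under these assumptions most of the parameters sit in the 2000-by-2000 hidden layers of the two feature extractors.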
Keywords/Search Tags: deep learning, machine learning, drug discovery, compound-protein interactions