Predictive Research On Protein-compound Binding Based On Deep Learning

Posted on:2021-04-10

Degree:Master

Type:Thesis

Country:China

Candidate:F Hou

GTID:2404330611452013

Subject:Computer Science and Technology

Abstract/Summary:

Modern pharmacological research has accumulated a large amount of data on protein-compound binding. However, many compounds still lack information on their binding to proteins, which limits the further rapid development of pharmacology. Traditional research methods based on pharmacological experiments demand heavy investment and long experimental periods, and they yield relatively little data. Deep learning is a technique that builds a neural network model and trains it on large amounts of historical data collected for a specific problem. The rapid development of computer hardware (CPUs, GPUs, etc.) has made deep learning practical. Therefore, based on the existing massive protein-compound binding data, a deep learning model is constructed and trained to extract protein-compound binding features, and predictions are made based on these features. Model training can be completed in a short time, and the binding of any protein-compound pair can be predicted, which provides new clues for pharmacological research.

This thesis mainly uses the TensorFlow framework developed by Google to build neural network models for training and prediction. All data come from BindingDB, an international bioinformatics database. After the original data are processed, each resulting sample indicates whether a compound can bind to a protein. Positive samples, labeled 1, are protein-compound binding data, while negative samples, labeled 0, indicate that the compound cannot bind to the protein. A total of about 7 million samples are used and divided into three parts: ten thousand samples for validation, ten thousand for testing, and the remainder for training.

This thesis trains two models: a convolutional neural network model, M1, and a fully-connected neural network model, M2. Each model is divided into two parts. In the M1 model, the first part uses three different types of convolution kernels to extract features of the atomic blocks in compounds, the chemical bond blocks, and the proteins represented by amino acid sequences, respectively. The first part of the M2 model extracts features of atomic blocks, chemical bond blocks, and protein blocks through three fully-connected networks. The second part of both models is a fully-connected neural network with several hidden layers, in which the number of nodes decreases layer by layer. The final output layer has two nodes and uses one-hot coding to indicate whether the compound and protein can bind.

The work of this thesis includes the following stages: theoretical preparation, downloading the original data, analyzing the data, processing the data, determining the model, writing the code, and training the model until results are obtained. The longest single experiment lasted nearly 400 hours. The M2 model is found to perform better than the M1 model, with an accuracy of 89% on the test set. It can be seen that deep learning is highly credible for predicting unknown protein-compound bindings and has some reference value for pharmaceutical research and development.
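
The data split and two-node label encoding described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis code: the function names `split_samples` and `one_hot`, the random shuffle with a fixed seed, and the choice of output index 1 for "binds" are all assumptions made here for illustration, and the toy dataset stands in for the roughly 7 million BindingDB-derived samples.

```python
import random


def split_samples(samples, n_val=10_000, n_test=10_000, seed=0):
    """Shuffle samples and split them into (train, validation, test).

    The default sizes follow the abstract (ten thousand validation and
    ten thousand test samples); shuffling with a seed is an assumption.
    """
    if n_val + n_test >= len(samples):
        raise ValueError("not enough samples for the requested split")
    shuffled = list(samples)                 # copy, leave the input untouched
    random.Random(seed).shuffle(shuffled)
    val = shuffled[:n_val]
    test = shuffled[n_val:n_val + n_test]
    train = shuffled[n_val + n_test:]
    return train, val, test


def one_hot(label):
    """Encode a 0/1 binding label as a target for a two-node output layer.

    Assumed convention: index 0 = cannot bind, index 1 = binds.
    """
    if label not in (0, 1):
        raise ValueError("label must be 0 or 1")
    return [1, 0] if label == 0 else [0, 1]


# Toy stand-in data: (compound, protein, label) triples, hypothetical names.
data = [("compound%d" % i, "protein%d" % i, i % 2) for i in range(100)]
train, val, test = split_samples(data, n_val=10, n_test=10)
print(len(train), len(val), len(test))       # 80 10 10
print(one_hot(train[0][2]))
```

With the full dataset, the default arguments would leave all samples other than the twenty thousand held out for validation and testing in the training set, as the abstract describes.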

Keywords/Search Tags:

deep learning, protein-compound, binding, convolutional neural network, fully-connected neural network

Related items

1	Research And System Implementation Of Diabetes Diagnosis Based On Deep Learning
2	Study On Classification Algorithm Of ECG Arrhythmia Based On Deep Learning
3	Research On Intelligent Medical Diagnosis Based On Neural Network
4	Evaluation Of Protein-ligand Binding Effect Based On Deep Learning
5	Research On CT Image Segmentation Of Liver Tumor Based On Fully Convolution Network
6	Liver Lesion Segmentation Based On Convolutional Neural Network
7	The Study Of Full Mammography Images Segmentation Based On Fully Convolutional Network
8	Research On Lung Nodule Detection Based On Deep Convolutional Neural Network In Computed Tomography
9	The Research Of X-ray Image Analysis Based On Deep Convolutional Neural Network
10	Research Of MRI Segmentation Based On U-shaped Deep Network