Research On Drug Property Prediction Algorithm Based On Deep Learning

Posted on: 2020-04-19
Degree: Master
Type: Thesis
Country: China
Candidate: Y S Fan
GTID: 2404330623951395
Subject: Computer technology
Abstract/Summary:
With the emergence of new diseases such as SARS, avian flu, and swine flu, the development of new drugs for these diseases faces increasing challenges. Drug design and development is a costly and inefficient task. When biological research finds that a particular molecule shows therapeutic activity, that molecule often cannot become a potential drug because of toxicity, low activity, or low solubility. In recent years, with the development of pharmacokinetics, the efficiency of traditional drug screening has gradually increased. However, testing a new compound with potential medicinal value requires in vivo testing and experiments, which take a great deal of time and money. There is therefore an urgent need for an accurate, efficient, and low-cost way to quickly identify whether a molecule has side effects on organisms and whether it can be a potential drug.

In the past few years, deep learning has made great progress in computer vision, speech recognition, and natural language processing. Since AlexNet in 2012, convolutional neural networks (CNNs) have become increasingly prominent. In response to the problems above, this paper applies an improved graph convolutional neural network (GCN) to the prediction of drug properties and achieves good results. The specific work of this paper is as follows:

1. This paper uses a graph-based convolution polynomial to improve the graph convolutional neural network model, which learns the characteristics of molecules in SMILES (Simplified Molecular Input Line Entry Specification) format better than the previous graph convolutional network. The improved network can handle molecular structures given in SMILES format and learns complex molecular features from two kinds of graph-based features: atomic features and structural features.

2. This paper proposes a method for extracting the structural features of molecules. The RDKit chemical toolkit is used to traverse all the atoms and bonds in a molecule and generate its adjacency matrix and degree matrix, from which the Laplacian matrix is calculated. The Laplacian matrix serves as the structural feature of the molecule. Alongside the structural features, the features of each atom in the molecule are represented with word vectors, an idea borrowed from natural language processing. Molecular data sets in SMILES format are segmented into tokens according to the SMILES specification and the characteristics of atoms and groups, and Word2vec is then used to train word vectors that characterize each atom in the molecule. Following the prevailing pre-training approach, this paper also trains the word vectors on a large collection of SMILES-format molecules, which improves their representation ability.

3. The improved graph convolutional neural network combined with word vectors is used to predict drug properties (ADMET characteristics: absorption, distribution, metabolism, excretion, and toxicity), and the effectiveness of the method is compared with traditional methods.
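
The feature-extraction steps described in item 2 can be illustrated with a minimal sketch. It is not the thesis's implementation: the thesis obtains atoms and bonds from RDKit, whereas this example hand-codes the bond list so that it runs with the Python standard library alone. The tokenization pattern, the function names `tokenize_smiles` and `laplacian`, and the unnormalized form L = D - A are all assumptions made for illustration; the abstract does not state which segmentation rules or which Laplacian variant were used.

```python
import re

# Assumed tokenization pattern (not taken from the thesis): bracket atoms first,
# then two-letter halogens, two-digit ring closures, single-letter atoms,
# bond/branch symbols, and single ring-closure digits.
SMILES_TOKEN = re.compile(
    r"\[[^\]]+\]|Br|Cl|%\d{2}|[BCNOPSFI]|[bcnops]|[=#\-+\\/()@.:~*$]|\d"
)


def tokenize_smiles(smiles):
    """Split a SMILES string into tokens that could be fed to Word2vec."""
    tokens = SMILES_TOKEN.findall(smiles)
    # Reject strings containing characters the pattern does not cover.
    if "".join(tokens) != smiles:
        raise ValueError("unrecognized characters in SMILES: " + smiles)
    return tokens


def laplacian(num_atoms, bonds):
    """Return the unnormalized graph Laplacian L = D - A as nested lists.

    `bonds` is a list of (i, j) atom-index pairs; bond order is ignored here.
    """
    adjacency = [[0] * num_atoms for _ in range(num_atoms)]
    for i, j in bonds:
        adjacency[i][j] = 1
        adjacency[j][i] = 1
    degree = [sum(row) for row in adjacency]
    return [
        [(degree[i] if i == j else 0) - adjacency[i][j] for j in range(num_atoms)]
        for i in range(num_atoms)
    ]


# Ethanol, SMILES "CCO": heavy atoms C(0)-C(1)-O(2), hydrogens implicit.
ethanol_tokens = tokenize_smiles("CCO")
ethanol_laplacian = laplacian(3, [(0, 1), (1, 2)])
```

In a fuller pipeline, the token lists from many molecules would form the training corpus for Word2vec, and the Laplacian would be passed to the graph convolution layers together with the per-atom vectors.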
Keywords/Search Tags: Graph Convolutional Network, Word Embedding, Word2vec, Drug Properties, SMILES, Laplacian Matrix