Study On Key Problems Of Protein-Ligand Docking Based On Machine Learning

Posted on: 2018-09-08  Degree: Doctor  Type: Dissertation
Country: China  Candidate: H O Li  Full Text: PDF
GTID: 1310330542465267  Subject: Computer Science and Technology
Abstract/Summary:
Proteins are an important component of human cells and tissues and are the main carriers of life activities. They usually perform their regulatory functions only when they bind with other proteins to form stable complexes. Protein-ligand docking studies how to predict the stable structure of a protein-ligand complex using computational techniques. Docking generally faces two difficult problems. First, during docking both the receptor and the ligand change their relative positions and their own conformations, so the flexibility of the receptor and the ligand is a major challenge. Second, the position of the binding site directly determines the size of the docking search space, yet no fully reliable binding site prediction method exists at present. To address these two problems, this dissertation focuses on the following four points.

1. Optimization of local flexibility in protein-ligand docking. We divided the docking interface into multiple loop regions and developed a loop sampling algorithm based on cyclic coordinate descent. The algorithm quickly moves the original loop to its target position within the sampling space. During the closure movement, it keeps the loop close to its N-terminal and C-terminal anchor residues. Finally, we integrated the algorithm into a parallel protein-peptide docking method that we had developed previously, replacing that method's original loop sampling algorithm. The results show that the new loop sampling algorithm achieves better docking results than the old one.

2. Optimization of backbone flexibility in protein-ligand docking. We extracted a variety of commonly used features from protein sequences and, to address the specific characteristics of the protein torsion angle prediction problem, introduced two new features. We encoded the features on a set of independent protein sequences and packaged them into training and testing data for the machine learning methods. As models, we used two kinds of deep recurrent neural networks and two kinds of non-recurrent neural networks. Finally, we built a protein torsion angle prediction framework based on these deep learning models. Results from multiple validation experiments show that the proposed framework predicts protein torsion angles more accurately and provides an effective solution for protein backbone structure prediction.

3. Optimization of full-atom flexibility in protein-ligand docking. We proposed a method to reconstruct the whole structure of a protein. It provides a set of possible conformations of the receptor and ligand within the sampling space for protein ensemble docking. The method stacks multi-layer auto-encoder models in a deep learning framework. The data set comes from homologous template structures of the target protein sequence, and the features consist of the three-dimensional coordinates of those template structures after alignment. The experimental results show that the proposed method effectively avoids two problems of traditional template-based protein structure prediction methods: their sampling algorithms are complex and their scoring functions are inaccurate. The results also show that our method obtains higher-quality protein structures in test cases whose homologous templates have high sequence similarity.

4. Protein-ligand binding site prediction based on recurrent neural networks. We first propose a protein-ligand binding site prediction method based on recurrent neural networks. We used two comprehensive data sets containing unbound and bound complexes. The validation experiments cover the treatment of unbalanced data, the handling of over-fitting, a comparison between recurrent neural network methods and conventional machine learning methods, and an internal comparison among recurrent neural network methods. The experimental results show that the proposed method achieves better predictions than the conventional machine learning methods.

The main contribution of this dissertation is the study of two critical computational problems in protein-ligand docking: the optimization of protein flexibility and the prediction of binding sites. We developed a kinematic loop sampling algorithm and applied a variety of popular deep learning models. The experimental results suggest that these methods can help researchers further develop protein structure and function prediction, and that they offer a useful reference for future deep-learning-based research in bioinformatics.
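To illustrate the cyclic coordinate descent idea behind the loop sampling algorithm in point 1, here is a minimal sketch on a planar toy chain. It is not the dissertation's algorithm: the abstract gives no implementation details, so the 2D simplification, the function names `forward` and `ccd_close`, the fixed link lengths and the tolerance are all assumptions made for illustration. The chain is anchored at the origin (standing in for the N-terminal anchor), and each joint is rotated in turn so that the free end moves toward a target point (standing in for the C-terminal anchor).

```python
import math


def forward(angles, lengths):
    """Return the joint positions of a planar chain anchored at the origin.

    angles[i] is the rotation of link i relative to the previous link.
    """
    x = y = heading = 0.0
    points = [(x, y)]
    for angle, length in zip(angles, lengths):
        heading += angle
        x += length * math.cos(heading)
        y += length * math.sin(heading)
        points.append((x, y))
    return points


def ccd_close(angles, lengths, target, tol=1e-4, max_sweeps=1000):
    """Toy cyclic coordinate descent: adjust one joint angle at a time.

    For each joint (last to first), rotate everything downstream of it so
    that the chain end lies on the ray from that joint to the target.
    """
    angles = list(angles)
    for _ in range(max_sweeps):
        for i in reversed(range(len(angles))):
            points = forward(angles, lengths)
            pivot, end = points[i], points[-1]
            to_end = math.atan2(end[1] - pivot[1], end[0] - pivot[0])
            to_target = math.atan2(target[1] - pivot[1], target[0] - pivot[0])
            # Changing a relative angle rotates all later links about the pivot.
            angles[i] += to_target - to_end
        end = forward(angles, lengths)[-1]
        if math.hypot(end[0] - target[0], end[1] - target[1]) < tol:
            break
    return angles


if __name__ == "__main__":
    lengths = [1.0, 1.0, 1.0, 1.0]
    closed = ccd_close([0.3, 0.3, 0.3, 0.3], lengths, target=(2.0, 1.0))
    print(forward(closed, lengths)[-1])
```

In a real protein loop the adjustable variables would be backbone torsion angles in three dimensions, and closure would be measured on the atoms of the anchor residue rather than on a single end point; the one-variable-at-a-time update is the part this sketch is meant to show.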
Keywords/Search Tags: Protein-ligand Docking, Protein Structure Prediction, Machine Learning, Deep Recurrent Neural Network, Long Short-Term Memory Model