
Study On Sampling Technologies Based On Deep Learning For Protein Structure Prediction

Posted on: 2017-01-06
Degree: Master
Type: Thesis
Country: China
Candidate: S Luo
Full Text: PDF
GTID: 2180330488461927
Subject: Software engineering
Abstract/Summary:
In bioinformatics, protein structure prediction refers to predicting three-dimensional protein structures from amino acid sequences by computational methods. As a supplement to traditional experiments, it helps us understand and exploit the biochemical functions of proteins whose structures have not been experimentally determined. One of the most difficult tasks in predicting protein structures is sampling the conformational space, that is, searching the conformational space for the state with the minimum free energy. In this dissertation, we propose a novel sampling scheme based on deep learning algorithms to help improve the accuracy of protein structure prediction.

The Hybrid Monte Carlo (HMC) method, as used in deep learning algorithms, is introduced to better sample the conformational space of protein structures with 100, 200, or even more residues according to the probability distributions. Traditional sampling methods, which assign a value to each degree of freedom directly, usually succeed on proteins with fewer residues, but they often fail on proteins with more than 100 residues because of the large conformational space. In addition, residue distance constraints are added to the sampling algorithm to optimize up to 75 percent (40 percent on average) of the residue pairs in each structure.

The energy function is the foundation of sampling methods in protein structure prediction schemes. In this dissertation, a novel Convolutional Neural Network (CNN) with an optimized network model is introduced to learn from and predict atom contacts in protein structures. The proposed multi-layer network model achieves excellent regression precision in predicting the GDT-score, with a loss of approximately 0.002 in benchmarks.
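
To make the sampling idea concrete, the following is a minimal sketch of a generic Hybrid Monte Carlo step, not the dissertation's implementation. It assumes a toy quadratic energy in place of a real protein energy function, and treats the vector `x` as a hypothetical stand-in for the degrees of freedom (for example, torsion angles). The names `energy`, `grad_energy`, `hmc_step`, and `sample`, as well as the step size and trajectory length, are illustrative assumptions.

```python
import numpy as np


def energy(x):
    # Toy quadratic energy (assumption): corresponds to a standard Gaussian target.
    return 0.5 * np.sum(x ** 2)


def grad_energy(x):
    # Gradient of the toy energy above.
    return x


def hmc_step(x, rng, step_size=0.1, n_leapfrog=20):
    """One HMC transition: draw a momentum, integrate, then accept or reject."""
    p = rng.standard_normal(x.shape)
    h_old = energy(x) + 0.5 * np.sum(p ** 2)

    # Leapfrog integration of Hamiltonian dynamics.
    x_new = x.copy()
    p_new = p - 0.5 * step_size * grad_energy(x_new)
    for i in range(n_leapfrog):
        x_new = x_new + step_size * p_new
        if i < n_leapfrog - 1:
            p_new = p_new - step_size * grad_energy(x_new)
    p_new = p_new - 0.5 * step_size * grad_energy(x_new)

    # Metropolis test on the change in total energy.
    h_new = energy(x_new) + 0.5 * np.sum(p_new ** 2)
    if np.log(rng.random()) < h_old - h_new:
        return x_new, True
    return x, False


def sample(n_samples, dim, seed=0):
    """Run a chain from the origin; return the samples and the acceptance rate."""
    rng = np.random.default_rng(seed)
    x = np.zeros(dim)
    samples = np.empty((n_samples, dim))
    accepted = 0
    for i in range(n_samples):
        x, ok = hmc_step(x, rng)
        samples[i] = x
        accepted += ok
    return samples, accepted / n_samples
```

Calling `sample(2000, 3)` returns an array of shape `(2000, 3)` and the fraction of accepted proposals. In a structure prediction setting, the toy `energy` and `grad_energy` would be replaced by a differentiable energy function over the conformation, which is where most of the practical difficulty lies.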
Keywords/Search Tags: Protein, Structure Prediction, Sampling, HMC, CNN