
Research On Prediction Of Glycosylation By Deep Neural Network

Posted on: 2017-10-07
Degree: Master
Type: Thesis
Country: China
Candidate: J H Song
Full Text: PDF
GTID: 2370330488979900
Subject: Computer technology
Abstract/Summary:
Proteins are an essential component of living cells, and they change dynamically inside the body. They undergo a variety of post-translational modifications, including glycosylation, phosphorylation, ubiquitination and lipid modification, so studying these modifications is worthwhile. Glycosylation is one of the most important of them and plays a major role in life activities. It occurs in more than half of all proteins and is involved in cell immunity, regulation of protein translation, protein degradation and many other biological processes. Based on how the carbohydrate attaches to the protein, four kinds of glycosylation are defined: N-glycosylation, O-glycosylation, C-glycosylation and glypiation. This thesis focuses on the first three kinds and develops computational methods based on deep learning to predict glycosylation sites. It consists of the following three parts.

(1) Deep learning is used for feature extraction and classification in glycosylation site prediction. The nonlinear network structure of deep learning can approximate complex functions, and unsupervised feature learning can capture the essential features of a dataset, so the classification results are expected to be better than those of shallow classification methods such as the support vector machine (SVM). We propose a new feature-encoding method for glycosylation sequences, centre position information encoding, which takes both the structure and the location of a glycosylation sequence into consideration. Comparing this method with other encoding methods, we found that it gives better classification results in O-linked glycosylation site prediction. We then analysed the new method by comparing it with other deep-learning-based hybrid models built from different modules.

(2) When the number of samples is small, predetermined weights and offsets can trap the model in a local minimum, so it cannot learn the essence of the dataset and the classification becomes unstable. To further improve the classification of glycosylation sites, we propose using a genetic algorithm to optimize the weights and offsets of the deep network. As a global adaptive search algorithm that applies selection, crossover and mutation with given probabilities, it expands the search space, gives better global optimization performance and mitigates the problem described above. We used 5-fold cross-validation in the prediction of O-glycosylation sites and obtained better experimental results than traditional deep learning methods.

(3) To make computational research on glycosylation sites more convenient, we built a glycosylation site prediction system based on a browser/server (B/S) architecture.

This work not only facilitates the computational identification of glycosylation sites on proteins, but also provides some insight into the prediction of other post-translational modifications based on deep learning.
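As a rough illustration of the idea in part (2), the sketch below uses a genetic algorithm to search the weights and offsets of a very small network. It is not the thesis's implementation: the one-hidden-layer network, tournament selection, single-point crossover, Gaussian mutation, elitism, mean-squared-error fitness, the toy XOR data and every function and parameter name (`forward`, `fitness`, `evolve`, `p_cross`, `p_mut`) are assumptions chosen only to keep the example short and self-contained.

```python
import math
import random


def n_params(n_in, n_hidden):
    """Number of weights and offsets in the hypothetical one-hidden-layer network."""
    return n_hidden * (n_in + 1) + n_hidden + 1


def forward(w, x, n_hidden):
    """Network output in (0, 1); `w` is a flat list of weights and offsets."""
    n_in = len(x)
    idx = 0
    hidden = []
    for _ in range(n_hidden):
        s = w[idx + n_in]  # offset (bias) of this hidden unit
        for j in range(n_in):
            s += w[idx + j] * x[j]
        hidden.append(math.tanh(s))
        idx += n_in + 1
    s = w[idx + n_hidden]  # offset of the output unit
    for h in range(n_hidden):
        s += w[idx + h] * hidden[h]
    s = max(-60.0, min(60.0, s))  # clamp to keep exp() in range
    return 1.0 / (1.0 + math.exp(-s))


def fitness(w, data, n_hidden):
    """Negative mean squared error, so larger is better."""
    return -sum((forward(w, x, n_hidden) - y) ** 2 for x, y in data) / len(data)


def evolve(data, n_in, n_hidden=3, pop_size=30, generations=60,
           p_cross=0.8, p_mut=0.1, seed=0):
    """Search the weight vector with selection, crossover and mutation."""
    rng = random.Random(seed)
    n_w = n_params(n_in, n_hidden)
    pop = [[rng.uniform(-1.0, 1.0) for _ in range(n_w)] for _ in range(pop_size)]
    best = max(pop, key=lambda w: fitness(w, data, n_hidden))
    history = [fitness(best, data, n_hidden)]
    for _ in range(generations):
        scored = [(fitness(w, data, n_hidden), w) for w in pop]

        def select():
            # Tournament selection: best of three random individuals.
            return max(rng.sample(scored, 3), key=lambda t: t[0])[1]

        children = [best[:]]  # elitism: carry the best individual over unchanged
        while len(children) < pop_size:
            a, b = select(), select()
            child = a[:]
            if rng.random() < p_cross:
                cut = rng.randrange(1, n_w)  # single-point crossover
                child = a[:cut] + b[cut:]
            # Gaussian mutation applied gene by gene with probability p_mut.
            child = [g + rng.gauss(0.0, 0.3) if rng.random() < p_mut else g
                     for g in child]
            children.append(child)
        pop = children
        best = max(pop, key=lambda w: fitness(w, data, n_hidden))
        history.append(fitness(best, data, n_hidden))
    return best, history


# Toy data (XOR), standing in for encoded sequence windows and site labels.
XOR = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0), ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]
best_weights, fitness_history = evolve(XOR, n_in=2)
print("initial best fitness:", fitness_history[0])
print("final best fitness:  ", fitness_history[-1])
```

Because the best individual is copied into each new generation, the best fitness never decreases. In a setting like the one the abstract describes, the evolved weight vector would plausibly serve as the starting point for ordinary gradient-based training, with fitness measured on held-out folds rather than on the training data.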
Keywords/Search Tags:glycosylation site prediction, deep learning, coding method, feature extraction, genetic algorithm