Study On Computational Modeling Of Protein Mutation Pathogenicity

Posted on: 2023-01-18
Degree: Master
Type: Thesis
Country: China
Candidate: L P Nie
Full Text: PDF
GTID: 2530306629976479
Subject: Software engineering
Abstract/Summary:
Proteins are the basic building blocks of living matter, and protein mutations are an important cause of disease. Distinguishing neutral mutations from disease-associated mutations allows rapid screening of potential disease-causing sites. This thesis investigates computational models of protein mutation pathogenicity, with three contributions.

First, BMBQA, a deep model based on multiscale convolution and bidirectional gated recurrent neural networks, is used to score the quality of predicted protein structures. Test results on a classical dataset show that BMBQA is competitive in scoring accuracy, in ranking predicted structures, and in selecting the best structure. BMBQA helps to improve the accuracy of the prediction models that follow.

Second, pathogenicity scores for frameshift and nonsense mutations are predicted by TransPPMP, a deep Transformer-based model. The method uses an ESM pre-trained model to represent the protein sequence context and uses focal loss to optimize training. It shows strong performance in ten-fold cross-validation and on independent blind test sets. The multi-head attention mechanism of TransPPMP captures functional sites that have a large impact on the mutated amino acids.

Third, pathogenicity scores for missense mutations are predicted by MMPDL, a deep model based on a graph convolutional network and Transformer. Amino acid embeddings are learned from protein structure with a graph convolutional network, while the DNA sequence representation is learned with a self-attention mechanism. Fusing features from both the DNA and protein levels resulted in better performance on the test set.

In summary, this thesis constructs computational models for predicting the pathogenicity of protein mutations based on deep learning techniques. Interpreting the models in terms of biochemical semantics provides useful clues for studying the pathogenic mechanisms of complex human diseases, and the work extends the application of deep learning techniques in bioinformatics.
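
The loss used to train TransPPMP is most plausibly the standard binary focal loss, which down-weights easy examples so that training concentrates on hard ones. The sketch below illustrates that general formula, FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t), in plain Python. It is not taken from the thesis: the function names and the values gamma = 2.0 and alpha = 0.25 are assumptions (commonly used defaults), and the abstract does not state which settings were used.

```python
import math


def focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-12):
    """Binary focal loss for a single example (illustrative sketch).

    p: predicted probability of the positive (pathogenic) class
    y: true label, 1 (pathogenic) or 0 (neutral)
    gamma, alpha: assumed hyperparameters, not values from the thesis
    """
    # p_t is the probability the model assigned to the true class.
    p_t = p if y == 1 else 1.0 - p
    # alpha_t balances the contribution of the two classes.
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # Clamp to avoid log(0).
    p_t = min(max(p_t, eps), 1.0)
    # (1 - p_t)^gamma shrinks the loss of well-classified examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def mean_focal_loss(probs, labels, gamma=2.0, alpha=0.25):
    """Average focal loss over a batch of (probability, label) pairs."""
    losses = [focal_loss(p, y, gamma, alpha) for p, y in zip(probs, labels)]
    return sum(losses) / len(losses)
```

With gamma = 0 and alpha = 0.5 the expression reduces to half the ordinary binary cross-entropy, which is a convenient sanity check when comparing against a baseline loss.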
Keywords/Search Tags: Deep learning, Quality assessment of predicted protein structures, Frameshift mutation, Nonsense mutation, Missense mutation