Study On The Deep Learning Based Prediction Method For Tumor Neoantigens

Posted on: 2022-03-22
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J C Wu
GTID: 1484306506499704
Subject: Pharmacy
Abstract/Summary:
Tumors are among the most serious diseases threatening human health. In recent years, tumor immunotherapies such as immune checkpoint inhibitors, antibody-drug conjugates (ADCs), chimeric antigen receptor T-cell immunotherapy (CAR-T), and individualized tumor vaccines have demonstrated unprecedented anti-tumor efficacy and have become a focus of anti-tumor research and clinical treatment. The key to successful tumor immunotherapy is the selection of the tumor antigen target. Although tumor neoantigens are considered ideal targets for tumor immunotherapy, identifying them correctly remains an open problem. This thesis uses deep learning and other new-generation artificial intelligence algorithms to analyze the key physiological processes in the formation of tumor neoantigens: the interaction between peptides and major histocompatibility complex (MHC) molecules, and the interaction between the peptide-MHC complex (pMHC) and the T cell receptor (TCR). Systematic tumor neoantigen prediction is carried out through the construction of tumor neoantigen prediction software, neoantigen analysis of large-scale tumor samples, and the construction of a tumor neoantigen database. The research consists of three parts.

First, this thesis considers both the likelihood that a mutant peptide is presented and the potential immunogenicity of the pMHC, and proposes a new recurrent neural network (RNN)-based method, DeepHLApan, for high-confidence neoantigen prediction. We show that the binding model performs well on HLA alleles absent from the training data, and that its performance is comparable to that of other recognized tools on the latest IEDB benchmark data set and on an independent mass spectrometry data set. Applying the immunogenicity model to the neoantigen data set collected from Koşaloğlu-Yalçın et al., we show that the predicted immunogenicity score significantly improves the accuracy of neoantigen prediction. Finally, applying DeepHLApan to mutations known to elicit T cell responses shows that, when TPM>2, its prediction performance is comparable to that of EDGE, the best existing model.

Second, this thesis studies the interaction between pMHC and TCR. An analysis of the TCRs bound by the four pMHC complexes that each bind more than 1000 TCRs (A0301_KLGGALQAK, B0801_RAKFKQLL, A1101_AVFDRKSDAK, and A0201_GILGFVFTL) revealed that A0201_GILGFVFTL has the highest specificity for its binding TCRs, and that the pMHC-restricted TCR model built on it achieves better performance. The pMHC-restricted model was subsequently applied to guide the engineering of TCR affinity by predicting the effect of individual mutations on TCR-pMHC affinity. The predictions indicated that six single-point mutations can increase the affinity of the complex, and one of these mutations has been experimentally verified. In addition to the pMHC-restricted model, we further discuss the construction and application of an HLA-A02:01-restricted TCR prediction model. Gene usage analysis of TRAV and TRBV in TCRs that bind the HLA-A02:01 allele shows that their specificity is lower than that of TCRs that bind A0201_GILGFVFTL. However, the HLA-A02:01-restricted model built on these data performs better than the A0201_GILGFVFTL-restricted model on 799 positive data sets derived from the VDJdb database. Application to tumor samples with T-cell single-cell transcriptome data shows that the HLA-A02:01-restricted model, combined with DeepHLApan, can predict a patient's tumor neoantigens and identify the TCRs likely to bind them.

Third, this thesis uses the developed tumor neoantigen prediction algorithm for software construction, analysis of tumor neoantigen distribution, and database development, all based on the genome data of tumor samples. We combine DeepHLApan with a series of tumor somatic mutation identification tools to build TSNAD V2.0, a one-stop tumor neoantigen prediction software that accepts whole-exome or whole-genome data. It is the first one-stop integrated software to provide a complete neoantigen prediction workflow, which should significantly help the wider application of tumor neoantigen prediction. In addition to predicting tumor neoantigens derived from point mutations (SNVs) and small insertions and deletions (INDELs), TSNAD V2.0 can also predict tumor neoantigens derived from gene fusions when transcriptome sequencing data are provided. To make TSNAD V2.0 easier to use, we provide both a locally installable software package and a web server. The practical performance of TSNAD V2.0 was assessed with the clinical validation data provided by the Tumor Neoantigen Selection Alliance (TESLA). Of the 23 peptide-MHC combinations that TSNAD V2.0 predicted as high-confidence tumor neoantigens and that TESLA tested, 5 (21.7%) were verified to be immunogenic. This rate is significantly higher than the 37 immunogenic combinations out of 608 (6.1%) that TESLA obtained after pooling the results of 28 software tools, which reflects the predictive reliability of TSNAD V2.0. We further used TSNAD V2.0 to analyze tumor neoantigens derived from SNVs, INDELs, and fusions in large-scale tumor samples from the TCGA database. Based on these results, we constructed the tumor neoantigen database TSNAdb, which stores the predicted SNV-derived tumor neoantigens for 7748 TCGA samples. Researchers can use the database to study the distribution of SNV-derived tumor neoantigens across tumor types. The database also provides an analysis of shared neoantigens derived from high-frequency mutations (occurring at least 3 times across all samples) and high-frequency HLA alleles.

In conclusion, this thesis starts from the two key steps by which tumor neoantigens induce T cell activation, uses deep learning algorithms to construct tumor neoantigen prediction models, and discusses their clinical application, providing a solid foundation for tumor neoantigen-based immunotherapy.
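
The percentages in the TESLA comparison can be reproduced from the counts stated in the abstract (5 of 23 for TSNAD V2.0, 37 of 608 for the pooled TESLA results). The short Python sketch below only redoes that arithmetic; the function name `immunogenic_rate` is a hypothetical helper introduced for illustration and is not part of TSNAD or DeepHLApan, and the sketch does not reproduce whatever significance test the thesis used.

```python
def immunogenic_rate(validated: int, tested: int) -> float:
    """Fraction of tested peptide-MHC combinations verified as immunogenic."""
    if tested <= 0:
        raise ValueError("tested must be a positive count")
    return validated / tested


# Counts as reported in the abstract.
tsnad_rate = immunogenic_rate(5, 23)     # TSNAD V2.0 high-confidence predictions
tesla_rate = immunogenic_rate(37, 608)   # pooled results of 28 tools (TESLA)

# Ratio of the two rates, shown only as a descriptive summary.
fold_enrichment = tsnad_rate / tesla_rate

print(f"TSNAD V2.0: {tsnad_rate:.1%}")
print(f"TESLA pooled: {tesla_rate:.1%}")
print(f"Fold enrichment: {fold_enrichment:.2f}")
```

The fold enrichment is a descriptive ratio of the two reported rates and is not a figure given in the abstract.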
Keywords/Search Tags: tumor neoantigen, deep learning, T cell receptor (TCR), polypeptide-MHC complex (pMHC), immunotherapy, software, database