Screening And Identification Of Tumor-associated Antigen In Pancreatic Cancer And Construction Of Diagnostic Models

Posted on: 2022-08-08  Degree: Master  Type: Thesis
Country: China  Candidate: H Wang
GTID: 2504306326954409  Subject: Public Health
Abstract/Summary:
Pancreatic cancer is a highly malignant digestive tract tumor with a poor prognosis compared with other common cancers. Its morbidity and mortality are increasing year by year, posing a serious threat to human health. Because early symptoms are occult and current clinical diagnostic markers have low sensitivity, 90% of patients are already at an advanced stage when diagnosed. It is therefore of great significance to find tumor markers with high sensitivity and specificity for the diagnosis of pancreatic cancer. In recent years, with the discovery of immune cells and inflammatory factors in the tumor microenvironment, the role of the immune system in monitoring and killing cancer cells has gradually been revealed. With the rapid rise of cancer immunotherapy, tumor-associated antigens (TAAs) have moved to the forefront of cancer diagnosis and treatment. TAAs are antigen molecules whose expression increases significantly during carcinogenesis, and screening and identifying them is an important route to the immunodiagnosis of cancer. In this study, a preliminary diagnostic model of pancreatic cancer was constructed from the screened TAAs using bioinformatics methods, which has potential clinical value for the diagnosis of pancreatic cancer.

Objective:
1. To construct a cDNA expression library of pancreatic cancer tissue from the Chinese population using SMART technology, providing a platform for screening tumor-associated antigens of pancreatic cancer.
2. To screen and identify tumor-associated antigens of pancreatic cancer using SEREX (serological analysis of recombinant cDNA expression libraries), laying a theoretical foundation for exploring tumor markers of pancreatic cancer.
3. To construct and verify a diagnostic model of pancreatic cancer using bioinformatics and machine learning, providing a new direction for the diagnosis of pancreatic cancer.

Methods:
1. Construction of the cDNA expression library of pancreatic cancer in the Chinese population based on SMART technology. (1) Library construction: total RNA was extracted from tissue of two patients with newly diagnosed pancreatic cancer, and first-strand cDNA was synthesized. Long-distance (LD) PCR was used to amplify the cDNA. The purified cDNA fragments were ligated into the λTriplEx2 phage vector. (2) Quality identification: the titer of the library was calculated by counting the independent clones on the Petri dish, and the insert size and recombination rate were determined by PCR and agarose gel electrophoresis.
2. Screening and identification of tumor-associated antigens in pancreatic cancer. (1) E. coli lysate was prepared and used to preabsorb pooled serum from five pancreatic cancer patients. The cDNA expression library of pancreatic cancer tissue was then screened serologically with this serum, and positive clones were selected after three rounds of immunoscreening. (2) The positive clones were cultured and amplified by PCR. The PCR products were analyzed by agarose gel electrophoresis, and false-positive clones were excluded. (3) The PCR products of the positive clones were sequenced, and the functions of the genes and their encoded proteins were identified using the BLAST and Gene websites.
3. Establishment and evaluation of the diagnostic model of pancreatic cancer using bioinformatics and machine learning. (1) Using the GEPIA website, the differential expression of the tumor-associated antigens screened by SEREX was analyzed at the gene level. One-way ANOVA was used to compare the mRNA expression of the 36 genes encoding the 36 TAAs between pancreatic cancer and healthy controls. (2) Model construction based on the differentially expressed genes: samples from the TCGA, GTEx, and ICGC databases were randomly divided into a training set and a verification set at a ratio of 7:3. The diagnostic models were built on the training set using machine learning with cross-validation, and the optimal model was determined from the accuracy, sensitivity, and specificity of each model in the training and verification sets.
4. Statistical analysis was performed in R; differences were considered statistically significant at P < 0.05.

Results:
1. Construction of the cDNA expression library from pancreatic cancer tissue based on SMART technology: the total RNA extracted from the two pancreatic cancer tissues showed clear 28S, 18S, and 5S bands, and the brightness ratio of 28S to 18S was close to 2:1. After purification, the OD260/280 ratio of the total RNA was 1.97, which falls within the range of 1.7 to 2.1. The purified cDNA showed a continuous smear without specific bands. The titer of the constructed cDNA expression library was 3 × 10^7 pfu/µl, the insert size ranged from about 400 bp to about 2,000 bp, and the recombination rate was nearly 100%.
2. Screening and identification of tumor-associated antigens based on SEREX technology. (1) A total of 96 positive clones were found after three rounds of serological immunoscreening. Agarose gel electrophoresis showed that 24 were false positives (vector without insert); these were discarded, and the remaining 72 positive clones were sequenced. (2) Sequence alignment with BLAST and the Gene website showed that the 72 positive clones represented 43 different genes, of which 36 had known functions and 7 had unknown functions. Among the 36 genes of known function, 4 are related to histone regulation; 11 to cell proliferation, invasion, adhesion, and differentiation; 6 to insulin regulation and lipid metabolism; and 15 to protein metabolism, regulation, modification, and binding.
3. Establishment of the diagnostic models of pancreatic cancer using machine learning and verification of their diagnostic capability. (1) Among the 36 tumor-associated antigens screened by SEREX, the mRNA expression levels of 21 genes differed between the case group and the control group, with higher expression in the cancer group; the differences were statistically significant (P < 0.05). (2) Six model genes (CKS2, ERGIC2, NQO1, SGTB, EIF2AK2, and PAM) were selected from the 21 antigen genes by exhaustive search combined with ten-fold cross-validation for feature selection. Support Vector Machine (SVM), Random Forest (RF), Naive Bayes (NB), and Neural Network (NN) diagnostic models were then constructed. The accuracy of every model in the training set exceeded 97.00%, and the accuracy, sensitivity, and specificity of the RF model were all 100.00%. The accuracy of every model in the verification set exceeded 96.00%; for the RF model, the accuracy was 98.88%, the sensitivity was 97.53%, and the specificity reached 100.00%.

Conclusions:
1. The cDNA expression library of pancreatic cancer tissue from the Chinese population was successfully constructed using SMART technology. The quality of the library meets the requirements for screening tumor-associated antigens by SEREX.
2. Thirty-six tumor-associated antigens of pancreatic cancer were screened by SEREX, and 21 of them were differentially expressed at the mRNA level between pancreatic cancer and healthy controls.
3. Using CKS2, ERGIC2, NQO1, SGTB, EIF2AK2, and PAM as model variables, pancreatic cancer diagnostic models with an accuracy of more than 96% were successfully constructed. Among them, RF showed the best diagnostic ability and has important reference value for the clinical diagnosis of pancreatic cancer.
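
The feature-selection step described in the abstract (exhaustive enumeration of gene subsets, each scored by ten-fold cross-validation) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the abstract reports that the analysis was done in R, whereas this sketch is in Python; a simple nearest-centroid classifier stands in for the SVM, RF, NB, and NN models; the folds are assigned by sample index rather than at random; the 7:3 training/verification split is omitted; and the data and the names GENE_A, GENE_B, GENE_C, `make_toy_data`, `cv_accuracy`, and `exhaustive_select` are all hypothetical.

```python
from itertools import combinations
from statistics import mean


def centroid(rows):
    """Column-wise mean of a non-empty list of equal-length feature vectors."""
    return [mean(col) for col in zip(*rows)]


def predict(train_x, train_y, x):
    """Nearest-centroid stand-in classifier (an assumption, not the thesis's model)."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        c = centroid([r for r, y in zip(train_x, train_y) if y == label])
        dist = sum((a - b) ** 2 for a, b in zip(x, c))
        if dist < best_dist:  # strict '<': ties go to the smaller label
            best_label, best_dist = label, dist
    return best_label


def cv_accuracy(xs, ys, features, k=10):
    """Mean accuracy over k folds, using only the given feature columns."""
    sub = [[row[j] for j in features] for row in xs]
    scores = []
    for fold in range(k):
        # Deterministic folds: sample i belongs to fold i % k.
        test_idx = [i for i in range(len(sub)) if i % k == fold]
        train_idx = [i for i in range(len(sub)) if i % k != fold]
        tx = [sub[i] for i in train_idx]
        ty = [ys[i] for i in train_idx]
        hits = sum(predict(tx, ty, sub[i]) == ys[i] for i in test_idx)
        scores.append(hits / len(test_idx))
    return mean(scores)


def exhaustive_select(xs, ys, names, k=10):
    """Score every non-empty feature subset; smaller subsets win ties."""
    best_subset, best_score = (), -1.0
    for size in range(1, len(names) + 1):
        for combo in combinations(range(len(names)), size):
            score = cv_accuracy(xs, ys, combo, k)
            if score > best_score:
                best_subset, best_score = combo, score
    return [names[j] for j in best_subset], best_score


def make_toy_data(n=40):
    """Hypothetical data: GENE_A separates the classes, GENE_B and GENE_C do not."""
    xs, ys = [], []
    for i in range(n):
        label = 0 if i < n // 2 else 1  # 0 = control, 1 = cancer
        gene_a = (5.0 if label else 1.0) + (i % 3) * 0.1
        gene_b = float(i % 4)
        gene_c = 2.0
        xs.append([gene_a, gene_b, gene_c])
        ys.append(label)
    return xs, ys, ["GENE_A", "GENE_B", "GENE_C"]


xs, ys, names = make_toy_data()
selected, score = exhaustive_select(xs, ys, names)
print(selected, score)  # prints ['GENE_A'] 1.0
```

For scale, an exhaustive search over 21 candidate genes has to score 2^21 - 1 = 2,097,151 non-empty subsets, each with its own ten-fold cross-validation, so the cost grows quickly with the number of candidates.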
Keywords/Search Tags:Pancreatic cancer, cDNA library, SEREX, Tumor-associated antigens, Diagnosis models