Identification Of A Multidimensional Transcriptome Prognostic Signature For Lung Adenocarcinoma

Posted on: 2020-07-11
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J Ye
GTID: 1360330623957935
Subject: Geriatrics
Abstract/Summary:
Purpose: With continuing social and economic development, population aging, and gradual changes in lifestyle, malignant tumors have become the leading threat to the health of residents in China. Lung cancer has the highest incidence and mortality of all malignant tumors, and most cases are non-small cell lung cancer. Lung adenocarcinoma is the most common type of non-small cell lung cancer, accounting for about 50% of all lung cancer patients, and its incidence is rising year by year. Because it is highly malignant and prone to recurrence and metastasis, the vast majority of patients have a poor prognosis, with a survival time of no more than 5 years. Early diagnosis and treatment are the key to prolonging survival and reducing mortality in patients with lung adenocarcinoma. Therefore, in addition to existing clinical and traditional diagnostic factors, there is an urgent need to develop new molecular prognostic signatures for predicting the risk of lung adenocarcinoma recurrence and identifying high-risk patients who may benefit from adjuvant therapies. In recent years, advances in high-throughput sequencing technology and bioinformatics have provided new ideas and methods for studying tumor markers of lung cancer. Tumor formation is a complex, multifactorial biological process involving multiple genes, including a variety of protein-coding genes (PCGs) and long non-coding RNAs (lncRNAs). Existing studies have shown that lncRNAs can regulate the transcriptional expression of protein-coding genes. Combining lncRNAs and PCGs as biomarkers has the advantage of being multidimensional: it can show the process of tumor development in more detail and thus predict patient prognosis more effectively. The purpose of this study was to screen and construct a multidimensional transcriptome PCG-lncRNA signature
model by mining the lung adenocarcinoma data in the GEO database and integrating bioinformatics and statistical methods, and to perform Gene Ontology (GO) and KEGG pathway enrichment analyses in order to understand the predictive value of the PCG-lncRNA model for the prognosis of lung adenocarcinoma and the biological processes that may be involved.

Materials and Methods: In this study, the GEO database was systematically searched, and the gene expression profiles of GSE31210, GSE37745, and GSE30219 were included, together with the clinical information of the corresponding patients. The GPL570 probe sequences were then aligned with the transcript sequences of human protein-coding genes and long non-coding RNAs in the GENCODE database, and PCGs and lncRNAs associated with prognosis were screened using univariate Cox proportional hazards regression analysis. The prognosis-related variables were further reduced using the random survival forest variable hunting algorithm (RSFVH), and finally 9 PCGs and lncRNAs were screened. Univariate and multivariate Cox regression were used to assess the associations of gene expression and model scores with prognosis. The predictive performance of pathological stage and the PCG-lncRNA signature was compared using ROC and time-dependent ROC (timeROC) analysis. Pearson correlation was used to calculate the co-expression relationships between the selected PCGs and lncRNA and other protein-coding genes, and the clusterProfiler package was used to perform GO and KEGG enrichment analysis on the significantly correlated genes.

Results: We established a model that predicts the prognosis of patients with lung adenocarcinoma, comprising three protein-coding genes (NHLRC2, PLIN5, GNAI3) and one long non-coding RNA (AC087521.1). The model divides patients with lung adenocarcinoma in the training dataset into low-risk and high-risk groups with significantly different survival (GSE31210, n = 226, log-rank test P < 0.001). The risk stratification of the model
was then verified in the other two test datasets (GSE37745, n = 106, log-rank test P < 0.001; GSE30219, n = 85, log-rank test P = 0.006). The relationship between gene expression and survival was then examined. In the GSE31210, GSE37745, and GSE30219 datasets, patients with lung adenocarcinoma who had low risk scores or high expression of NHLRC2 and PLIN5 had a better prognosis, while patients who had high risk scores or high expression of AC087521.1 and GNAI3 had shorter survival and a worse prognosis. We further assessed whether the PCG-lncRNA model is an independent prognostic risk factor for lung adenocarcinoma. In the training dataset GSE31210, the PCG-lncRNA model was a risk factor for lung adenocarcinoma independent of clinical features including gender, age, and pathological stage (HR = 20.84, 95% CI: 5.00-86.93, P < 0.001). The clinical prognostic value of the PCG-lncRNA model was likewise validated in the two validation datasets, GSE37745 and GSE30219. Comparing the high-risk group with the low-risk group, HR = 2.27 (95% CI: 1.42-3.63, P < 0.001) in GSE37745 and HR = 2.39 (95% CI: 1.28-4.48, P = 0.01) in GSE30219. We compared the efficiency of pathological stage and the PCG-lncRNA model in predicting survival by ROC curve analysis and found that the prognostic evaluation of the PCG-lncRNA model was significantly better than that of pathological stage. In the GSE31210 dataset, the AUC was 0.652 for pathological stage and 0.762 for the PCG-lncRNA model; in the GSE37745 dataset, the AUC was 0.616 for pathological stage and 0.678 for the PCG-lncRNA model. Similarly, the timeROC analysis found that the PCG-lncRNA model predicted survival at 3, 5, and 8 years better than pathological stage. In the GSE31210 dataset, the predicted AUC of the PCG-lncRNA model for 3-year survival was 0.73, the predicted AUC for 5-year survival was 0.78, and the predicted AUC for the 8-year
survival was 0.84. For pathological stage, the predicted AUC was 0.75 for 3-year survival, 0.64 for 5-year survival, and 0.73 for 8-year survival. We also observed similar results in the GSE37745 dataset, where the predicted AUC of the PCG-lncRNA signature was 0.64 for 3-year survival, 0.63 for 5-year survival, and 0.62 for 8-year survival, while the predicted AUC of pathological stage was 0.58 for 3-year survival, 0.55 for 5-year survival, and 0.57 for 8-year survival. To explore possible biological functions of the PCG-lncRNA model, 2654 protein-coding genes were screened in the GSE31210 and GSE37745 datasets (Pearson correlation coefficient > 0.3 or < -0.3, P < 0.05). GO and KEGG enrichment analysis of these 2654 genes revealed that 38 biological processes were significantly enriched (P < 0.05), including transcription of non-coding RNA, response to insulin, and transcription of small nucleotides by RNA polymerase II.

Conclusion: In this study, a systematic search of the GEO database combined with bioinformatics methods established a model that predicts the prognosis of patients with lung adenocarcinoma. The model is significantly correlated with patient survival, indicating that the PCG-lncRNA model is a risk factor for the prognosis of lung adenocarcinoma. Time-dependent receiver operating characteristic curve (timeROC) analysis showed that the model was closely related to disease progression and that its prognostic predictive ability was superior to that of pathological stage. Cox proportional hazards regression analysis showed that this signature can be used as an independent predictor of clinical outcome in patients with LUAD. The results of GO and KEGG enrichment analysis preliminarily revealed the biological processes that may be involved in the lung adenocarcinoma
PCG-lncRNA model. In summary, using bioinformatics to construct tumor prediction models is feasible. Of course, large-scale additional research must be carried out before the constructed model is applied in clinical settings.
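
The abstract describes a risk score built from four transcripts, a split into low-risk and high-risk groups, and an AUC comparison, but it does not give the coefficients or the cutoff. The Python sketch below is only an illustration under stated assumptions. The coefficient magnitudes in `COEFFS` are hypothetical; only their signs follow the abstract (NHLRC2 and PLIN5 protective, GNAI3 and AC087521.1 adverse). The cutoff passed to `stratify` is likewise an assumption (for example, the training-set median), and `auc` is a plain binary-outcome AUC, not the time-dependent ROC used in the study, which must also handle censoring.

```python
# Hypothetical coefficients: the magnitudes are invented for illustration.
# Only the signs follow the abstract (negative = protective, positive = adverse).
COEFFS = {"NHLRC2": -0.5, "PLIN5": -0.25, "GNAI3": 0.5, "AC087521.1": 0.75}


def risk_score(expr):
    """Linear risk score: sum of coefficient * expression over the four transcripts."""
    return sum(COEFFS[gene] * expr[gene] for gene in COEFFS)


def stratify(scores, cutoff):
    """Label each score 'high' if it exceeds the cutoff, otherwise 'low'.

    The cutoff itself is an assumption here; the abstract does not state it.
    """
    return ["high" if s > cutoff else "low" for s in scores]


def auc(scores, events):
    """Binary AUC: probability that a patient with the event scores higher
    than a patient without it, counting ties as one half."""
    pos = [s for s, e in zip(scores, events) if e]
    neg = [s for s, e in zip(scores, events) if not e]
    if not pos or not neg:
        raise ValueError("need at least one patient with and one without the event")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

With real data, each patient's expression values for the four transcripts would be passed to `risk_score`, the resulting scores to `stratify` with a cutoff chosen on the training set, and the same scores to `auc` alongside an event indicator at a fixed time point.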
Keywords/Search Tags:long non-coding RNA, lung adenocarcinoma, prognostic, protein-coding gene, signature