| The rapid development of high-throughput sequencing technology has led to exponential growth in biomedical big data, laying a solid foundation for the data-driven identification of prognostic markers and for prognostic model research in cancer; prognostic studies of pediatric tumors have advanced accordingly. Neuroblastoma (NB) is the third most common malignant tumor in children aged 0-14 years. Patients in the high-risk group still experience recurrence, progression, and even death despite intensive multimodal therapy. The long-term overall survival rate of patients with advanced-stage disease is less than 50%, and survivors often live with chronic conditions for life. In this thesis, NB is taken as a representative pediatric tumor for data mining research because it shares the common characteristics of insidious onset, high malignancy and heterogeneity, and relative sensitivity to radiotherapy and chemotherapy, while also showing the distinctive feature of spontaneous regression. The main purpose of this study is to construct prognostic models from transcriptome data using computational methods, in order to better stratify patients for diagnosis and treatment and support clinical treatment decisions, and to identify candidate drug targets for biological verification so that new treatments can be developed; these goals are both a focus of researchers' attention and a difficulty in the field. In recent years, machine learning algorithms have been widely explored and applied in the study of NB prognosis, and some progress has been made. However, studies on the discovery of new drug targets, the improvement of stratified diagnosis and treatment, and the development of new treatments remain limited. Many studies have shown that the factors affecting cancer prognosis are diverse and complex; among them, the tumor microenvironment, host immunity, and noncoding RNA not only play major roles in cancer occurrence, development, metastasis, and treatment response but also have a great 
impact on the prognosis of cancer. However, the effects of the tumor microenvironment, immune-related genes, and immune-related lncRNAs on prognosis, and their roles in the development and progression of pediatric tumors, have not been systematically described. To address these problems, this dissertation takes gene expression data as its research object and combines data mining with experimental methods to study the screening of key genes for an NB prognostic model, the construction of immune-related prognostic models, the characterization of the tumor microenvironment, and the biological verification of functional genes. The main work is as follows: 1. Establishing a key gene risk score prognostic model for neuroblastoma. Given the poor prognosis of high-risk NB patients, the lack of effective therapeutic targets, and the inability of these patients to benefit from existing stratified diagnosis and treatment methods, this study proposed a key gene prognostic model that removes redundant genes by mining key candidate genes and also provides multiple potential therapeutic targets for NB. Survival analysis combined with the random forest algorithm was used to construct a risk score prognostic model containing only four genes. In this process, the study integrated features such as survival time and category labels and examined NB prognosis with respect to the number of feature genes, the classifier, and the model parameter settings. The experimental results showed that the key gene prognostic model has good discrimination and calibration and can distinguish patients with different survival outcomes in multiple independent datasets. Clinical subgroup analysis gave consistent results. In addition, the cytoHubba and MCODE methods were used to identify genes co-expressed with ERCC6L, the best-performing gene in the key gene prognostic model, revealing that this gene interacts with multiple genes in the occurrence and development of NB. In 
addition, compared with prognostic models built with other feature selection algorithms, the model proposed here performs comparably while using relatively few features. 2. Proposing immune-related prognostic models for neuroblastoma. Immune-related genes and immune-related lncRNAs affect the occurrence, development, treatment response, and prognosis of cancers. Because the roles of immune-related genes and lncRNAs in NB and their impact on the prognosis of NB patients remain unclear, this study proposed immune-related prognostic models based on machine learning algorithms. Survival-associated immune genes were screened by Cox regression analysis and incorporated into a random forest model to establish a five-gene risk score prognostic model (RS5_G). A prognostic model comprising eleven immune-related lncRNAs, RS_Lnc (risk score lncRNAs), was established by co-expression analysis and the LASSO (Least Absolute Shrinkage and Selection Operator) algorithm, and the strong performance of these models was verified in multiple independent datasets. Finally, performance comparison experiments were conducted on two high-risk NB datasets. The results show that the three prognostic models proposed in this dissertation outperform other prognostic models and can serve as independent prognostic risk factors. 3. Constructing a cell-level prognostic model of neuroblastoma based on the quantified tumor microenvironment. Studies have shown that the tumor microenvironment (TME) affects the malignant biological behavior of NB, and targeting the cellular components of the TME may provide a new option for NB therapy. Because the tumor microenvironment of pediatric tumors has not been fully studied, this study quantified the cellular components of the NB TME from transcriptome data and constructed a 
prognostic model at the cellular level. Using RNA-seq and microarray data, the xCell algorithm was applied to map the xCell fractions of 64 cell types in the NB TME, quantifying the proportions of the various cell types and describing the cellular composition of the TME in detail. Subsequently, ten cell types were selected to build the prognostic model pCRS (prognostic cell risk score) for NB. The results showed that the model could serve as an independent prognostic risk factor for overall survival and event-free survival in external datasets; in high-risk patients in particular, pCRS was the only independent risk factor, and its performance as a prognostic marker was superior to that of MYCN amplification. In clinical subgroups, the model was able to distinguish patients with different survival outcomes. 4. Biological function verification of key genes. To address whether the genes screened by data mining have biological significance, molecular biology experiments were designed to verify the effect of one of the key genes, HMGB3, on the proliferation, migration, and invasion of NB tumor cells. First, a gene loss-of-function model showed that HMGB3 silencing significantly suppressed cell proliferation, migration, and invasion in two NB cell lines; animal experiments also showed that tumor growth was significantly inhibited by HMGB3 knockdown. Subsequently, genes that may interact with HMGB3 were identified through gene co-expression analysis. Changes in the expression of these co-expressed genes were measured in the HMGB3-silenced cell line, and TPX2 was selected. Further, a gene gain-of-function model was used to verify that HMGB3 may play an oncogenic role by mediating TPX2. Finally, survival analysis demonstrated the combined effect of these two genes on survival prediction in NB patients. In conclusion, this study constructed prognostic models for neuroblastoma by integrating gene expression data and computational methods, interpreted the significance of key 
genes, immunity, and the tumor microenvironment for NB prognosis, and verified the oncogenic effect of one of the key genes, HMGB3, in NB through molecular biology experiments. This dissertation establishes prognostic models at the molecular and cellular levels; these models can supplement the existing risk stratification system for diagnosis and treatment and can assist treatment decision-making for NB. This study also offers ideas and references for research on other types of childhood tumors. |
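
The general idea of a risk score prognostic model, as summarized above, can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the models built in this dissertation: the score here is a plain weighted sum rather than a random forest or LASSO model, and the gene names (`GENE_A`, `GENE_B`), weights, expression values, and follow-up times are hypothetical toy values. Patients are split into high- and low-risk groups at the median score, and a Kaplan-Meier estimate is computed for one group.

```python
from statistics import median

def risk_scores(expression, weights):
    """Weighted sum of gene expression values for each patient."""
    return [sum(weights[g] * row[g] for g in weights) for row in expression]

def stratify(scores):
    """Label each patient 'high' or 'low' risk, using the median score as cutoff."""
    cutoff = median(scores)
    return ["high" if s > cutoff else "low" for s in scores]

def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    events[i] is 1 if the event (e.g. death) was observed at times[i],
    and 0 if the patient was censored at that time.
    """
    curve = []
    surv = 1.0
    for t in sorted({t for t, e in zip(times, events) if e}):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e)
        surv *= 1 - deaths / at_risk
        curve.append((t, surv))
    return curve

# Hypothetical toy data: four patients, two made-up genes, made-up weights.
weights = {"GENE_A": 0.8, "GENE_B": 0.5}
expression = [
    {"GENE_A": 2.0, "GENE_B": 1.0},
    {"GENE_A": 0.5, "GENE_B": 0.2},
    {"GENE_A": 3.0, "GENE_B": 2.0},
    {"GENE_A": 0.1, "GENE_B": 0.4},
]
times = [12, 60, 8, 72]   # follow-up in months (made up)
events = [1, 0, 1, 0]     # 1 = event observed, 0 = censored

groups = stratify(risk_scores(expression, weights))

# Kaplan-Meier curve for the high-risk group only.
high_times = [t for t, g in zip(times, groups) if g == "high"]
high_events = [e for e, g in zip(events, groups) if g == "high"]
high_curve = kaplan_meier(high_times, high_events)
```

In practice the weights would come from a fitted model, and the separation between groups would be assessed with a log-rank test and validated on independent datasets, as the abstract describes.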