Analysis Of Lung Cancer Prognosis Data Based On Particle Filtering And Random Survival Forest

Posted on: 2024-09-22
Degree: Master
Type: Thesis
Country: China
Candidate: Q Tan
GTID: 2544307076492164
Subject: Applied statistics
Abstract/Summary:
According to 2022 data from the National Cancer Center, there are about 828,100 new cases of lung cancer and 657,000 deaths every year, and lung cancer ranks first among malignant tumors in China in both new cases and deaths. Despite continuing progress in treatment, the 5-year survival rate of lung cancer remains relatively low, so in addition to improving treatment, more effort is needed in prognosis evaluation. Existing prognostic models often use the Cox proportional hazards model, which cannot accurately calculate the survival rate of an individual patient. Moreover, the predictive performance of such models is difficult to improve as the sample size increases. Few studies analyze model generalizability, and those that do so through simulation usually obtain their simulated data from user-defined distributions or by random sampling from the real data distribution. This study identifies the factors that influence lung cancer prognosis from multiple types of clinical data and builds an interpretable, highly predictive, and robust model to support clinical decision-making for lung cancer and improve patient survival. It is based on 1,288 prognostic records of lung cancer patients and combines simulation analysis with application to real data to develop a prognostic model that predicts the hazard rate of lung cancer patients and the effect of each influencing factor on prognosis.

The main achievements include:

(1) Exploring the individual and interactive effects of the influencing factors on lung cancer prognosis, and extracting the features of lung cancer prognosis data needed to generate simulated data. The interactions between influencing factors were obtained with a random survival forest model. A Weibull regression model, with a particle filter algorithm introduced to optimize its parameters, was used to determine how the influencing factors affect lung cancer prognosis.

(2) Building multiple lung cancer prognosis models on simulated data and comparing their performance across datasets. Fifteen datasets with different sample sizes and censoring rates were generated, and random survival forest, Weibull regression, and particle-filter-optimized Weibull regression models were fitted to them. The AUC values of the random survival forest and Weibull regression models increase with sample size, while the particle-filter-optimized Weibull regression model maintains a higher AUC than the other two models regardless of sample size. When the censoring rate is low, the Weibull model predicts better than the random survival forest model, but this difference shrinks as the sample size increases. The particle-filter-optimized Weibull regression model is more sensitive to changes in the censoring rate, showing irregular fluctuations in AUC, but these fluctuations weaken as the sample size increases.

(3) Evaluating the three models on the real dataset. The AUC is 0.737 for the random survival forest model, 0.774 for the Weibull regression model, and 0.887 for the particle-filter-optimized Weibull regression model. During the simulated patient enrollment process, the AUC of the Weibull regression model increased by 2.842% and the AUC of the random survival forest model decreased by 11.533%, whereas the AUC of the particle-filter-optimized Weibull regression model increased by 12.740%, a significant improvement. The analysis of interaction terms with random survival forests found an interaction between clinical staging (cTNM) and cancer tissue classification.

The innovations of this study include:

(1) Combining simulation research with application analysis to evaluate and verify the predictive performance of the random survival forest, Weibull regression, and particle-filter-optimized Weibull regression models. Unlike existing simulation studies, which rely on random sampling from real data or on arbitrarily defined variable distributions, the simulated data here share the inherent characteristics of lung cancer prognosis data, so the conclusions drawn from the simulation can be applied to lung cancer prognosis.

(2) Introducing both the random survival forest model, a tree-based model, and the Weibull regression model, a parametric model, to analyze the factors influencing lung cancer prognosis. This shows how the influencing factors affect the hazard rate of lung cancer patients and also evaluates the interactions between the factors.

(3) Addressing the fact that the predictive performance of the basic models does not improve as patient information accumulates. The particle filter algorithm is introduced to optimize the parameters of the Weibull regression model, yielding a dynamic model with high predictive ability.
Keywords/Search Tags: Prognostic model, Survival analysis, Weibull regression model, Particle filtering
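
The abstract does not specify how the particle filter updates the Weibull regression parameters, so the following is only a minimal sketch of one plausible design, not the thesis's implementation. It assumes a Weibull proportional-hazards model with a single binary covariate, right-censored follow-up, and a filter that treats the parameters as a slowly drifting state: as each simulated patient is enrolled, the particles are jittered, reweighted by that patient's likelihood, and resampled. All function names, priors, and tuning values (`n_particles`, `jitter`, `censor_time`) are hypothetical.

```python
import math
import random


def weibull_loglik(theta, t, event, x):
    """Log-likelihood of one right-censored observation.

    Assumed model: hazard h(t | x) = lam * k * t**(k - 1) * exp(beta * x),
    with theta = (log lam, log k, beta) and a single covariate x.
    """
    log_lam, log_k, beta = theta
    k = math.exp(log_k)
    log_t = math.log(t)
    log_hazard = log_lam + log_k + (k - 1.0) * log_t + beta * x
    log_cum_hazard = log_lam + k * log_t + beta * x
    # An event contributes log h(t) - H(t); a censored case contributes -H(t).
    return event * log_hazard - math.exp(min(log_cum_hazard, 700.0))


def simulate_cohort(n, true_theta, censor_time=2.0, seed=0):
    """Draw n patients as (time, event, x) with administrative censoring."""
    rng = random.Random(seed)
    log_lam, log_k, beta = true_theta
    k = math.exp(log_k)
    cohort = []
    for _ in range(n):
        x = rng.choice([0.0, 1.0])
        rate = math.exp(log_lam + beta * x)
        u = 1.0 - rng.random()  # uniform on (0, 1]
        # Inverse-transform sampling from S(t) = exp(-rate * t**k).
        t = max((-math.log(u) / rate) ** (1.0 / k), 1e-9)
        event = 1 if t <= censor_time else 0
        cohort.append((min(t, censor_time), event, x))
    return cohort


def particle_filter(data, n_particles=1000, jitter=0.02, seed=0):
    """Sequentially update theta as patients arrive; return the particle mean."""
    rng = random.Random(seed)
    # Hypothetical priors on (log lam, log k, beta).
    particles = [
        [rng.gauss(0.0, 1.0), rng.gauss(0.0, 0.5), rng.gauss(0.0, 1.0)]
        for _ in range(n_particles)
    ]
    for t, event, x in data:
        # Propagate: a small random walk keeps the particle set diverse.
        particles = [[v + rng.gauss(0.0, jitter) for v in p] for p in particles]
        # Weight each particle by the new patient's likelihood (stabilised in log space).
        log_w = [weibull_loglik(p, t, event, x) for p in particles]
        top = max(log_w)
        weights = [math.exp(lw - top) for lw in log_w]
        # Multinomial resampling in proportion to the weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return [sum(p[i] for p in particles) / n_particles for i in range(3)]
```

To try it, generate a cohort with `simulate_cohort(400, (math.log(0.5), math.log(1.5), 1.0))` and pass the result to `particle_filter`; the returned list holds the estimated `(log lam, log k, beta)`. The random-walk jitter makes the filter weight recent patients more heavily, which is one way a model could keep adapting as enrollment continues; a real analysis would need multiple covariates and a principled choice of priors and jitter.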