Research On Hematological Malignancies Screening Model In High-risk Population Based On Blood Routine Indicators

Posted on: 2021-02-18
Degree: Master
Type: Thesis
Country: China
Candidate: X Y Dai
GTID: 2404330605969769
Subject: Epidemiology and Health Statistics
Abstract/Summary:
Background: Hematological malignancies (HM), including Hodgkin's lymphoma, non-Hodgkin's lymphoma, leukemia and myeloma, are a major public health problem that seriously threatens human health. Early diagnosis of HM is crucial for improving patient survival and reducing medical expenses. Achieving this requires an HM screening model that is early, cheap, non-invasive, efficient, and sufficiently sensitive and specific to identify high-risk individuals early. However, almost all existing HM screening models were trained on sample data collected from hospitals, in which HM prevalence is much higher than in the real-world community. For example, a study whose training dataset contains equal numbers of positive and negative samples has a dataset prevalence of 50%, far higher than the real HM prevalence in the community. An HM screening model built on this kind of dataset will show a very low positive predictive value in the community even if it achieves high sensitivity and specificity, which reduces its practical value.

Objective and methods: Based on the "Shandong whole population life course healthcare big data cohort" and the "Shandong Multi-center Health Management Cohort", a modeling dataset was built that conforms to the actual age- and sex-specific prevalence of HM in the population. Three ensemble learning algorithms (XGBoost, LightGBM and Random Forest), all using blood routine indicators, were then compared to select the optimal HM screening model for the real-world high-risk target population, and an online assisted-screening app for HM was developed. This tool gives primary medical units and community populations a low-cost, safe and easy-to-operate way to assist screening. It aims to identify high-risk individuals as early as possible, shorten the time to medical treatment, improve the rate of early diagnosis and treatment, save medical costs, and spare patients and their families unnecessary physical and psychological trauma.

Results:

1. Prevalence of HM
The prevalence of HM was 94.02 per 100,000 population. The prevalence of HM in both men and women gradually increased with age and exceeded 100 per 100,000 population from the age of 50. People older than 50 years were therefore defined as the HM high-risk group and taken as the target population of the HM screening model constructed in this research. The prevalence of HM in this high-risk group was 143.16 per 100,000 population.

2. Dataset construction for the HM screening model
The case group consisted of 3,971 HM patients from the high-risk population (over 50 years old) with complete blood routine indicators. A control group of 2,769,780 people was selected from the non-HM members of the cohort with complete blood routine indicators, following the age and sex proportions and the HM prevalence of the high-risk population (143.16 per 100,000 overall). The result is a simulated modeling population whose prevalence is consistent with a random sample of the real-world high-risk target population, which ensures the practicability of the HM screening model in the real-world community population.

3. Optimal screening model for HM
(1) Based on the modeling dataset constructed above, a comprehensive comparison of the positive predictive value, sensitivity, specificity, negative predictive value and AUC of the Random Forest, LightGBM and XGBoost models showed the following.
① When validated on set T (real-world prevalence, 0.143%), the models built from training sets B (50%), C (30%), D (10%), E (5%) and A (0.143%), listed from high to low prevalence, showed increasing ability to extrapolate and generalize as the gap between HM prevalence in the training set and the validation set narrowed. All three models reached their best positive predictive value when the prevalence in the training set matched that in the validation set; even then, the positive predictive value of the XGBoost model was better than that of the LightGBM and Random Forest models.
② As the gap between HM prevalence in the training set and the validation set narrowed, the sensitivity of the three screening models gradually decreased, and the XGBoost model remained superior to the LightGBM and Random Forest models.
③ As the gap between HM prevalence in the training set and the validation set narrowed, the specificity, negative predictive value and AUC of the three screening models were unchanged and remained at a high level; XGBoost again performed well.
(2) The modeling strategy and model selection criteria took the positive predictive value as the core evaluation index, with sensitivity, specificity, negative predictive value and AUC as auxiliary indexes. The XGBoost HM screening model built with training set A and test set T, both of which match the HM prevalence of the real-world community population, was selected as the optimal screening model. Its positive predictive value was 86.81%, sensitivity was 83.39%, specificity was 99.98%, negative predictive value was 99.98%, and AUC was 0.991.

4. App online tool for HM screening
The online app for HM screening, built with Flutter, provides online identification and early warning of individuals at high risk of HM and offers a suitable tool for the early detection of HM.

Conclusions:
(1) In a real-world community population, the prevalence of HM in the high-risk group (over 50 years old) is 143.16 per 100,000. There is a serious data imbalance between the positive group (HM patients) and the negative group (non-HM patients).
(2) For this kind of data imbalance, the screening model based on the XGBoost algorithm performs well and is the best screening model for individuals at high risk of HM in the real-world community population: its positive predictive value is 86.81%, sensitivity is 83.39%, specificity is 99.98%, negative predictive value is 99.98%, and AUC is 0.991.
(3) The developed app provides a low-cost, safe and easy-to-operate screening method for individuals at high risk of HM in primary medical institutions or community groups. It is a convenient online tool for early screening that can shorten the time to medical treatment, improve the rate of early diagnosis and treatment, save medical costs, and avoid unnecessary physical and mental trauma for patients and their families.
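The abstract's central argument, that positive predictive value depends strongly on prevalence even when sensitivity and specificity are fixed, can be illustrated with Bayes' rule. The following is a minimal sketch and not the thesis's code: the function name is hypothetical, the 95%/95% classifier is an invented example, and only the case and control counts and the reported sensitivity and specificity come from the abstract.

```python
def positive_predictive_value(sensitivity: float, specificity: float,
                              prevalence: float) -> float:
    """Return P(disease | positive result) via Bayes' rule."""
    true_pos = sensitivity * prevalence                   # diseased and flagged
    false_pos = (1.0 - specificity) * (1.0 - prevalence)  # healthy but flagged
    return true_pos / (true_pos + false_pos)


# Counts reported in the abstract for the modeling dataset.
CASES = 3971
CONTROLS = 2_769_780
dataset_prevalence = CASES / (CASES + CONTROLS)

# Hypothetical classifier (assumption, not from the thesis): 95% sensitivity
# and 95% specificity, evaluated at a balanced 50% prevalence and at the
# prevalence of the modeling dataset.
ppv_balanced = positive_predictive_value(0.95, 0.95, 0.50)
ppv_community = positive_predictive_value(0.95, 0.95, dataset_prevalence)

# Sensitivity and specificity as reported (rounded) for the XGBoost model.
ppv_reported_inputs = positive_predictive_value(0.8339, 0.9998, dataset_prevalence)

if __name__ == "__main__":
    print(f"dataset prevalence per 100,000: {dataset_prevalence * 1e5:.2f}")
    print(f"PPV, hypothetical model, 50% prevalence: {ppv_balanced:.4f}")
    print(f"PPV, hypothetical model, dataset prevalence: {ppv_community:.4f}")
    print(f"PPV from reported sensitivity/specificity: {ppv_reported_inputs:.4f}")
```

For the hypothetical classifier, the positive predictive value drops from 95% at balanced prevalence to under 3% at the dataset prevalence. The value computed from the reported sensitivity and specificity falls in the same range as the reported 86.81%; exact agreement should not be expected because the published specificity is rounded to two decimals.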
Keywords/Search Tags: Hematological Malignancies (HM), Early Screening Models, APP online tool for HM screening