Research On Prediction Model Of Gestational Diabetes Mellitus Based On Integrated Learning Algorithm

Posted on: 2020-05-27
Degree: Master
Type: Thesis
Country: China
Candidate: J Y Yu
GTID: 2404330590994747
Subject: Management Science and Engineering
Abstract/Summary:
Gestational diabetes mellitus (GDM) is a risk factor for fetal dysplasia and for maternal complications during pregnancy. Predicting GDM at a very early stage can therefore assist doctors in diagnosis and help improve pregnancy outcomes. This thesis uses ensemble (integrated) learning algorithms to build a GDM prediction model, addresses shortcomings of the cascade classifier approach in disease prediction, provides personalized interpretation of the model, mines important features, and establishes a personalized prevention scheme. To build a better-performing ensemble-based GDM prediction model, I carried out the following work:

(1) The sample data are cleaned and described, and variables are screened by their information value (IV). During screening, discrete values are replaced and continuous variables are split into equal-width bins. Using an IV range of 0.1 to 0.5 as the threshold, nine variables are selected for the model.

(2) To improve model accuracy, the sample data are split into a training set and a test set in a 7:3 ratio, and GDM prediction models are built with three ensemble algorithms: XGBoost, LightGBM and CatBoost. The F1 score is chosen as the evaluation metric, parameters are tuned by grid search with cross-validation, and the models are compared. The cascade principle is then applied to further improve the F1 score. The cascade threshold is set to 70% by studying the relationship between sample coverage and score accuracy, and, of three cascade structures, the one with CatBoost as the main model is selected according to accuracy.

(3) For model interpretation, the SHAP framework is used to explain the CatBoost model at the granularity of individual samples. For each sample, it gives the magnitude and direction of each feature's effect, with the SHAP value as the measure. The flexibility of SHAP thus supports per-sample characterization in addition to evaluating the direction and magnitude of a single feature's effect over the full sample. Feature importance is also ranked by SHAP value aggregated over the whole sample, together with the direction of each feature's influence on the model output.
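
The IV-based screening in step (1) can be illustrated with a minimal sketch. It is not the thesis's own code: the function names, the number of bins, the additive smoothing constant, and the reading of "0.1 to 0.5" as an inclusive range are all assumptions made for illustration. It uses the standard definition of information value, IV = sum over bins of (positive share - negative share) * ln(positive share / negative share), applied after equal-width binning.

```python
import math

def equal_width_bins(values, n_bins):
    """Assign each value to one of n_bins equal-width intervals (hypothetical helper)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 for constant columns
    return [min(int((v - lo) / width), n_bins - 1) for v in values]

def information_value(bins, labels, smoothing=0.5):
    """Standard IV over bins; labels are 0/1.

    The smoothing term is an assumption added here so that bins with
    no positives or no negatives do not produce log(0).
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    distinct = set(bins)
    iv = 0.0
    for b in distinct:
        pos = sum(1 for x, y in zip(bins, labels) if x == b and y == 1)
        neg = sum(1 for x, y in zip(bins, labels) if x == b and y == 0)
        pos_share = (pos + smoothing) / (n_pos + smoothing * len(distinct))
        neg_share = (neg + smoothing) / (n_neg + smoothing * len(distinct))
        iv += (pos_share - neg_share) * math.log(pos_share / neg_share)
    return iv

def select_by_iv(columns, labels, n_bins=5, low=0.1, high=0.5):
    """Keep the columns whose IV lies in [low, high] (range assumed inclusive)."""
    selected = {}
    for name, values in columns.items():
        iv = information_value(equal_width_bins(values, n_bins), labels)
        if low <= iv <= high:
            selected[name] = iv
    return selected
```

A call such as `select_by_iv({"age": ages, "bmi": bmis}, labels)` would return the retained variables with their IV; the column names here are placeholders, not the nine variables chosen in the thesis.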
Keywords/Search Tags: gestational diabetes mellitus, predictive models, integrated learning algorithms, cascading methods, SHAP framework