Application Of Machine Learning Algorithm Based On Parameter Optimization In Diabetes Prediction

Posted on:2021-03-08

Degree:Master

Type:Thesis

Country:China

Candidate:C F Zhang

Full Text:PDF

GTID:2404330602470702

Subject:Electronic and communication engineering

Abstract/Summary:

With rising living standards and an aging population, diabetes and its complications have become one of the main threats to public health and seriously reduce quality of life. Diabetes cannot be cured; only early detection and treatment can reduce its complications and mortality. Building a predictive diagnosis model for diabetes with machine learning, in order to assess risk, carry out comprehensive screening, and intervene on potential influencing factors, is therefore of great significance for saving medical resources and reducing the burden on families. In past research, scholars mainly used statistical methods to build prediction models, such as the logistic regression model and the Cox proportional hazards model. Now, with the explosive growth of medical data, more and more machine learning algorithms can be used for disease prediction, such as decision trees, support vector machines, Xgboost, and neural networks. Because diabetes is a chronic disease with many patients and no cure, applying advanced technology to predict it deserves special attention. Building on previous studies, this thesis uses Python to process diabetes data and build an algorithm framework for diabetes prediction. The main research work is as follows:

1) Exploration and processing of diabetes data. Prediction models for diabetes diagnosis from studies at home and abroad are analyzed to understand the data types and model inputs. The original data are obtained from the UCI and Tianchi platforms and, combined with the corresponding medical knowledge, subjected to exploratory analysis to find correlations between features and to explore the structure and patterns of the data. The data are then preprocessed, including missing value and abnormal value handling, dirty data cleaning, data standardization, sample balancing, and data specification, finally yielding efficient and usable modeling data.

2) Study of disease prediction models and selection of an appropriate model for each problem. For the regression problem, linear regression, decision tree, support vector regression, neural network, and Xgboost models are built and compared, and Xgboost, which performs best on the regression task, is selected. For the classification problem, logistic regression, decision tree, support vector machine, random forest, and Xgboost are used as base classifiers and fused through Stacking; this ensemble learning approach improves prediction accuracy.

3) Because machine learning models have many parameters that are complex to tune, a GA-Xgboost model is proposed to optimize the algorithm. On top of Xgboost model training, a genetic algorithm is introduced that encodes the parameters and can search multiple peaks in parallel, ultimately improving the accuracy of the model.

The experimental results show that in the regression task Xgboost has clear advantages on the three evaluation criteria MAE, MSE, and MAPE, and the GA-Xgboost model further improves prediction accuracy. In the classification task, the Stacking-based fusion model also outperforms the single base classifiers.
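
The genetic-algorithm parameter search described in item 3) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the abstract does not give the encoding, operators, or parameter ranges, so the real-valued encoding, the `SEARCH_SPACE` bounds, and all function names here are hypothetical. To keep the sketch self-contained, the fitness is a stand-in function (`toy_validation_error`); in an actual GA-Xgboost run it would train Xgboost with the candidate parameters and return a validation error such as MSE.

```python
import random

# Hypothetical search space. The names mirror common Xgboost hyperparameters,
# but the ranges are illustrative assumptions, not values from the thesis.
SEARCH_SPACE = {
    "learning_rate": (0.01, 0.3),
    "max_depth": (2.0, 10.0),  # kept real-valued here; round before training
    "subsample": (0.5, 1.0),
}


def random_individual(rng):
    """Encode one candidate as a dict of real-valued genes within bounds."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in SEARCH_SPACE.items()}


def crossover(parent_a, parent_b, rng):
    """Uniform crossover: each gene is copied from one parent at random."""
    return {
        name: parent_a[name] if rng.random() < 0.5 else parent_b[name]
        for name in SEARCH_SPACE
    }


def mutate(individual, rng, rate=0.2):
    """Add Gaussian noise to some genes, clipping back into the bounds."""
    child = dict(individual)
    for name, (lo, hi) in SEARCH_SPACE.items():
        if rng.random() < rate:
            noisy = child[name] + rng.gauss(0.0, 0.1 * (hi - lo))
            child[name] = min(hi, max(lo, noisy))
    return child


def genetic_search(fitness, pop_size=20, generations=30, seed=0):
    """Minimise `fitness` with elitist selection, crossover and mutation."""
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness)  # lower error is better
        elite = ranked[: max(2, pop_size // 4)]   # survivors kept unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            parent_a, parent_b = rng.sample(elite, 2)
            children.append(mutate(crossover(parent_a, parent_b, rng), rng))
        population = elite + children
    return min(population, key=fitness)


def toy_validation_error(params):
    """Stand-in objective (assumption). A real run would train Xgboost with
    `params` and return a validation error such as MSE."""
    target = {"learning_rate": 0.1, "max_depth": 6.0, "subsample": 0.8}
    return sum(
        ((params[name] - target[name]) / (hi - lo)) ** 2
        for name, (lo, hi) in SEARCH_SPACE.items()
    )


best = genetic_search(toy_validation_error)
print({name: round(value, 3) for name, value in best.items()})
```

Because the whole population is evaluated in every generation, several regions of the parameter space are explored at once, which is one way to read the abstract's remark that multiple peaks can be searched in parallel. Keeping the elite unchanged means the best error found never gets worse from one generation to the next.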

Keywords/Search Tags:

Diabetes prediction, Machine learning, Parameter optimization, GA-Xgboost, Ensemble learning

Related items

1	Research On Gestational Diabetes Risk Prediction And Online Calculation Based On Machine Learning Algorithm
2	Research On Classification And Prediction Of Diabetes Based On Ensemble Learning
3	Study On Algorithms Of Upgrading Dimension And Ensemble Learning For Diabetes Prediction
4	Research And Implementation Of Neuropsychological Test Analysis And HIV-associated Dementia Degree Analysis Method Based On Machine Learning
5	Prediction Of Gestational Diabetes Based On Machine Learning
6	Application Research Of Fusion Model Based On Ensemble Learning In Blood Glucose Prediction
7	Research And Implementation Of Ensemble Learning Methods In Cytotoxicity Prediction
8	Ensemble And Machine Learning-based Chemometrics For Metabolomics Data Analysis Associated With Inborn Errors Of Metabolism
9	Application Of Ensemble Learning Algorithm In Diabetes Prediction
10	Research On Diabetes Prediction Model Based On Ensemble Learning