
Research On XGBoost Decision Tree Optimization And Its Application

Posted on: 2023-05-12    Degree: Master    Type: Thesis
Country: China    Candidate: X M Hu    Full Text: PDF
GTID: 2558306914954509    Subject: Engineering
Abstract/Summary:
The eXtreme Gradient Boosting Decision Tree (XGBoost), an outstanding representative of ensemble learning, is efficient, flexible, convenient, and able to run in distributed environments, so it is widely used in many fields such as network attack detection, case analysis, and credit risk assessment. However, the prediction performance of XGBoost without parameter optimization is often unsatisfactory: a poor fit to the dataset leads to weak generalization and adaptability. XGBoost has more than thirty hyperparameters, and its prediction performance depends heavily on how they are tuned, so an effective method for training these hyperparameters is needed. Based on the Moth-flame Optimization algorithm (MFO) and the Cuckoo Search algorithm (CS), this thesis proposes two novel XGBoost training methods. The first method uses MFO to train nine hyperparameters of XGBoost simultaneously, for the first time; the trained model, called MFO-based XGBoost, is applied to breast cancer diagnosis, and simulation experiments are carried out on two real medical datasets. The second method uses CS to optimize the structure of XGBoost; the resulting model, called CS-XGBoost, is applied to employee turnover prediction in human resource management, and a simulation experiment is carried out on a real-world human resources analysis dataset. Finally, the two proposed models, MFO-based XGBoost and CS-XGBoost, are compared with existing XGBoost models trained by Grid Search, Bayesian Optimization, Particle Swarm Optimization, and Genetic Algorithm, as well as four general classifiers: GBDT, RF, SVM, and KNN. The experimental results and corresponding discussions show that on the breast cancer diagnosis datasets, MFO-based XGBoost, with its fast convergence, small error, and high accuracy, outperforms the comparison models on evaluation metrics such as accuracy, precision, and recall; on the HR analysis dataset, CS-XGBoost, with its small error and high accuracy, outperforms the comparison models on metrics such as accuracy, precision, and F1 score.
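To illustrate the general idea of metaheuristic hyperparameter training for XGBoost described above, the following is a minimal sketch. It uses a plain random search as a stand-in for the population-based MFO/CS optimizers (whose update rules are not given in this abstract), and the nine tuned hyperparameters listed here are an assumption based on commonly adjusted XGBoost parameters, not a list confirmed by the thesis. The breast cancer dataset from scikit-learn is used as a public stand-in for the thesis's medical datasets.

```python
# Sketch: search over nine assumed XGBoost hyperparameters, scoring each
# candidate by cross-validated accuracy. MFO/CS would replace the random
# sampling with population-based updates; this loop only shows the objective.
import numpy as np
import xgboost as xgb
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X, y = load_breast_cancer(return_X_y=True)  # public stand-in dataset

def sample_params():
    """Draw one candidate setting of nine commonly tuned XGBoost hyperparameters (assumed set)."""
    return {
        "n_estimators": int(rng.integers(50, 500)),
        "max_depth": int(rng.integers(2, 10)),
        "learning_rate": float(rng.uniform(0.01, 0.3)),
        "subsample": float(rng.uniform(0.5, 1.0)),
        "colsample_bytree": float(rng.uniform(0.5, 1.0)),
        "gamma": float(rng.uniform(0.0, 5.0)),
        "min_child_weight": float(rng.uniform(1.0, 10.0)),
        "reg_alpha": float(rng.uniform(0.0, 1.0)),
        "reg_lambda": float(rng.uniform(0.0, 5.0)),
    }

def fitness(params):
    """Cross-validated accuracy used as the objective the optimizer maximizes."""
    model = xgb.XGBClassifier(eval_metric="logloss", **params)
    return cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()

best_params, best_score = None, -np.inf
for _ in range(30):  # an MFO/CS optimizer would evolve a population here
    candidate = sample_params()
    score = fitness(candidate)
    if score > best_score:
        best_params, best_score = candidate, score

print(f"best CV accuracy: {best_score:.4f}")
print(best_params)
```

The same objective function can be handed to any metaheuristic; only the candidate-generation step changes when MFO or CS is substituted for the random sampling shown here.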
Keywords/Search Tags:Extreme Gradient Boosting Decision Tree, Moth-flame Optimization Algorithm, Cuckoo Search Algorithm, Breast Cancer Diagnosis, Human Resource Management, Employee Turnover Prediction