
Research On WTBagging Algorithm For Medical Insurance Fraud Detection

Posted on: 2023-02-06
Degree: Master
Type: Thesis
Country: China
Candidate: J H Yao
Full Text: PDF
GTID: 2544306848462134
Subject: Computer Science and Technology
Abstract/Summary:
Basic medical insurance is a social insurance system that protects workers from economic losses caused by illness. Medical insurance coverage in China has exceeded 95%, and medical insurance expenditure has reached more than 2 trillion yuan. As the medical insurance industry has grown, medical insurance fraud has also appeared and has increased year by year. Early research detected fraud with probability theory and statistical methods. Later work moved to machine learning algorithms and gradually began to detect medical insurance fraud with ensemble models. This body of research has some shortcomings, such as a lack of feature selection analysis and the lack of a method for choosing the base models of an ensemble. To address these problems, this thesis uses a new ensemble learning algorithm to detect medical insurance fraud and studies feature selection, class imbalance, and model selection.

First, we address feature selection in medical insurance data. Different feature selection methods are applied to the medical insurance dataset, a model is trained on each resulting dataset, and its predictive performance is observed. The feature selection method with the highest F1 value is chosen. Second, we address class imbalance. The experiment applies four sampling methods to part of the data and identifies the sampling method that works best for the model. Third, we address model selection. Multiple candidate models are trained on the same medical insurance dataset, and the models with the highest F1 values are identified. These models are then grouped into different combinations, an ensemble model is built from each combination with the ensemble algorithm, and the combination with the best predictive performance is identified. Finally, we propose WTBagging, an improved algorithm based on the traditional Bagging algorithm. The algorithm assigns a weight to each base model according to its predictive ability, sums the weights of the prediction results, and then checks whether the sum of weights exceeds a threshold to decide whether a record is fraudulent. The experiment applies both the WTBagging algorithm and the Bagging algorithm to the proposed model combination to verify whether WTBagging improves the model's fraud detection performance.

The experiments in this thesis are based on a reimbursement dataset of real medical insurance providers published by the U.S. Centers for Medicare & Medicaid Services. The data are processed with data cleaning, feature selection, and data sampling. Base models are then selected, and ensemble models are built with the WTBagging algorithm and the Bagging algorithm. The F1 value is used as the performance metric, and the experimental results are used to verify the effectiveness of the proposed model.
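The weighted, thresholded vote described above can be sketched as follows. This is only an illustration under stated assumptions, not the thesis's implementation: the abstract does not say how the weights are computed or which threshold is used, so the sketch assumes each base model's weight is its F1 value on a validation set, that the weights of the models voting "fraud" are normalised by the total weight, and that the threshold is 0.5. The names `f1_score` and `weighted_threshold_vote`, and all data values, are hypothetical.

```python
from typing import Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 for the positive (fraud = 1) class; returns 0.0 when undefined."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def weighted_threshold_vote(
    votes: Sequence[int], weights: Sequence[float], threshold: float = 0.5
) -> int:
    """Flag a record as fraud (1) when the normalised weight of the base
    models voting 'fraud' exceeds the threshold (assumed to be 0.5 here)."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one base model needs a positive weight")
    fraud_weight = sum(w for v, w in zip(votes, weights) if v == 1)
    return 1 if fraud_weight / total > threshold else 0


# Hypothetical validation labels and base-model predictions.
y_val = [1, 0, 1, 0, 1, 0]
val_preds = [
    [1, 0, 1, 0, 1, 0],  # model A: F1 = 1.0
    [1, 1, 0, 1, 0, 0],  # model B: F1 = 1/3
    [0, 1, 0, 0, 1, 0],  # model C: F1 = 0.4
]

# Assumption: weight of each base model = its validation F1.
weights = [f1_score(y_val, p) for p in val_preds]

# Only the strongest model says "fraud": its weight alone passes the threshold.
strong_only = weighted_threshold_vote([1, 0, 0], weights)
# The two weaker models say "fraud": together they still fall short.
weak_pair = weighted_threshold_vote([0, 1, 1], weights)
print(strong_only, weak_pair)
```

Under these assumptions the outcome can differ from an unweighted majority vote, which would side with the two weaker models in both cases; whether this matches the behaviour of the thesis's WTBagging depends on its actual weighting scheme and threshold.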
Keywords/Search Tags: Medicare Fraud Detection, Feature Selection, Data Sampling, Model Ensemble, WTBagging Algorithm