| Since the 1990s, China has been committed to establishing a basic social medical insurance system and has now largely achieved universal coverage. However, as medical fund expenditure has grown, cases of medical insurance fraud have continued to emerge, seriously threatening the safety of medical insurance funds. Because the medical industry is highly specialized and medical information is highly asymmetric, medical insurance fraud is well concealed. Assessing fraud risk in social basic medical insurance can save the labor and time spent reviewing claim bills, provide scientific references for audit experts, and support decision-making on fraud risk regulation and the formulation of anti-fraud measures. This paper studies the fraud risk of basic medical insurance participants using data mining technology. It uses large-scale real data from China’s social basic medical insurance system, comprising more than 1.83 million hospital medical records from several regions. The XGBoost algorithm is combined with the Easy Ensemble method to build an ensemble model for assessing the fraud risk of basic medical insurance participants. The model predicts the probability that an insured person will commit fraud and issues early warnings of fraudulent behavior. On this basis, the potential characteristics of fraud actors are identified and quantified to construct a fraud risk assessment indicator system. Finally, the paper compares the characteristics of fraudulent and normal insured persons. The results show that: (1) 95% of the model’s predictions were consistent with the actual results, and 85% of the insured persons were correctly assessed for the possibility of fraud. Among all the insured persons who actually committed fraud, 83% could be correctly 
identified by this model, which can prevent 91.27% of the losses to the medical insurance fund. The F1 score and AUC are 0.89 and 0.92, respectively, indicating that the model is stable and its prediction performance is good. Therefore, the fraud risk assessment indicator system for insured persons in social basic medical insurance, developed on the basis of this model, can be used effectively to identify potential fraud. (2) In the medical records, the cost-related indicators rank highest in importance and are the strongest signal of whether an insured person has committed fraud. The number of treatment items and the number of bills rank second. In the insurance reimbursement records, the declared expense and out-of-pocket payment for each item and the type of payment account are relatively important. (3) The differences between fraudulent and normal insured persons are mainly reflected in the number of drug items, the number of bills, the total cost, the drug cost, the drug cost at the end of each month, and the frequency of visits to hospital No. 180. |
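
The Easy Ensemble idea named in the abstract can be illustrated with a minimal sketch: train several base learners, each on all fraud cases plus an equally sized random sample of normal cases, then average their predicted probabilities. The sketch below is not the paper's implementation. The function names, the `predict_proba_one` interface, and the toy `CentroidScorer` base learner are all hypothetical, and `CentroidScorer` is a dependency-free stand-in for XGBoost, which is what the paper uses for the base learner.

```python
import random
from statistics import mean


def balanced_subsets(X, y, n_subsets=5, seed=0):
    """Yield balanced training sets in the Easy Ensemble style.

    Assumption: label 1 (fraud) is the minority class. Each subset keeps
    every fraud row and adds an equally sized random sample of normal rows.
    """
    rng = random.Random(seed)
    fraud = [i for i, label in enumerate(y) if label == 1]
    normal = [i for i, label in enumerate(y) if label == 0]
    k = min(len(fraud), len(normal))
    for _ in range(n_subsets):
        idx = fraud + rng.sample(normal, k)
        yield [X[i] for i in idx], [y[i] for i in idx]


class CentroidScorer:
    """Hypothetical stand-in base learner (NOT XGBoost).

    Scores a row by its relative distance to the two class centroids.
    """

    def fit(self, X, y):
        self.centroids = {}
        for label in (0, 1):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [mean(col) for col in zip(*rows)]
        return self

    def _dist(self, x, label):
        return sum((a - b) ** 2 for a, b in zip(x, self.centroids[label])) ** 0.5

    def predict_proba_one(self, x):
        d_normal, d_fraud = self._dist(x, 0), self._dist(x, 1)
        if d_normal + d_fraud == 0:
            return 0.5
        # Far from the normal centroid and close to the fraud one -> near 1.
        return d_normal / (d_normal + d_fraud)


def fit_easy_ensemble(X, y, make_learner=CentroidScorer, n_subsets=5, seed=0):
    """Train one base learner per balanced subset."""
    return [
        make_learner().fit(Xs, ys)
        for Xs, ys in balanced_subsets(X, y, n_subsets, seed)
    ]


def fraud_risk(models, x):
    """Average the base learners' fraud probabilities for one insured person."""
    return mean(m.predict_proba_one(x) for m in models)
```

To follow the abstract more closely, `make_learner` could return a thin adapter around a gradient-boosted classifier such as XGBoost's; the averaging step would stay the same. The averaged score is what would drive an early warning, for example by flagging insured persons whose risk exceeds a chosen threshold.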