| As basic medical insurance coverage continues to expand and the number of designated medical service institutions and designated retail pharmacies grows, measures are needed to deal with the increasing illegal use of medical insurance funds, while the means of medical insurance fraud are becoming more concealed. Traditional medical insurance audit methods alone are not sufficient to strengthen the monitoring of medical insurance funds. In recent years, with the continuous improvement of medical insurance informatization and the development of new technologies such as big data and data mining, how to use these technologies to improve the supervision of medical insurance funds and to integrate traditional manual audit with big data technology has become an important topic for discussion and study. As suppliers of medical insurance services, designated medical institutions and designated retail pharmacies are the main places where medicine traffickers obtain medicines for resale. Medicine traffickers frequently purchase medicines at designated medical institutions or designated retail pharmacies using their own or other people's medical insurance cards, and then resell the purchased medicines for high profits. During the 12th Five-Year Plan period, the implementation plan for deepening the reform of the medical and health system explicitly required medical insurance departments at all levels to strengthen the supervision of designated medical institutions and designated retail pharmacies and to increase the punishment for insurance fraud. Therefore, in order to innovate the supervision mode, improve the effectiveness of supervision, and guarantee the rational use of medical insurance funds, this paper, based on the medicine purchase data and insured persons' audit data of designated medical institutions and designated retail pharmacies in Shanghai and combining machine learning with medical insurance supervision experience, uses data
mining technology to identify persons suspected of abnormal medicine purchases under medical insurance from massive data. Abnormal medical insurance behavior is complex and changeable. First, this paper cleans the medicine purchase data and the insured persons' audit data and describes them with basic statistics. With reference to the Shanghai Medical Insurance Outpatient treatment and medical cost supervision measures, it selects the research sample data set for 2018, extracts feature variables describing abnormal medicine purchase behavior based on the characteristics of designated medical institutions and designated retail pharmacies, and then uses hypothesis tests to check whether the selected feature variables differ significantly between abnormal and normal medicine purchasers. In the modeling phase, the training set is used to build models with the logistic regression, AdaBoost and XGBoost algorithms; the test set is then used to generate a confusion matrix for each fitted model, from which accuracy, error rate, precision, specificity, sensitivity and other evaluation indexes are obtained. To evaluate the classifiers more intuitively and effectively, the ROC curve and AUC are also calculated. The evaluation results show that the XGBoost algorithm performs best. Finally, to further improve the accuracy of the model, an association rule algorithm is used to mine associated medicines in the purchase records of designated retail pharmacies, and the result is used as a secondary feature variable for modeling again with the classification algorithm. The experimental results show that the hybrid model proposed in this paper, which combines the Apriori algorithm with the XGBoost algorithm, performs well in the analysis of abnormal
medicine purchase behavior of basic medical insurance. |
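
The evaluation indexes named in the abstract can all be read off a binary confusion matrix, and AUC can be computed from the classifier scores. The following is a minimal sketch in plain Python, not the paper's implementation: the function names, the 0.5 decision threshold, and the toy labels and scores are illustrative assumptions, with label 1 standing for an abnormal purchaser and 0 for a normal one.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn), treating label 1 (abnormal) as the positive class."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == 1 and truth == 1:
            tp += 1
        elif pred == 1 and truth == 0:
            fp += 1
        elif pred == 0 and truth == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def evaluation_indexes(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Derive the usual indexes from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    return {
        "accuracy": accuracy,
        "error_rate": 1.0 - accuracy,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,  # true positive rate
        "specificity": tn / (tn + fp) if tn + fp else 0.0,  # true negative rate
    }


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as one half. This pairwise form equals the area under the ROC curve.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data (assumed for illustration only).
    labels = [1, 1, 1, 0, 0, 0, 0, 0]
    scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.05]
    preds = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5
    print(evaluation_indexes(*confusion_counts(labels, preds)))
    print("AUC:", auc(labels, scores))
```

The pairwise AUC is quadratic in the sample size, so it suits small illustrations; for data at the scale described here, a rank-based computation or a library routine would be the more practical choice.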