
Research On Medical Insurance Anomaly Detection Based On SOFM Neural Network And Random Forest

Posted on: 2021-03-01
Degree: Master
Type: Thesis
Country: China
Candidate: H F Cao
GTID: 2404330614971825
Subject: Information management
Abstract/Summary:
With the continuing development and deepening of China's medical reform, medical insurance coverage is becoming increasingly extensive. At the same time, frequent medical insurance fraud causes the country to lose a large amount of medical insurance funds every year, so establishing an efficient fraud identification mechanism is particularly important. At present, medical insurance fraud and violations are reviewed mainly by manually drawing samples and relying on the experience and knowledge of experts. As the volume of medical insurance data grows, this audit method suffers from a heavy workload and low efficiency. In response, many scholars have introduced data mining techniques such as classification and clustering into medical insurance anomaly detection research. Detection models built on these two types of algorithms have achieved some success, but an in-depth analysis that combines the principles of classification and clustering algorithms with the characteristics of abnormal samples reveals that both types still have shortcomings for anomaly detection. For classification algorithms, the varied forms of fraud and violation cause medical insurance samples to exhibit abnormalities on different indicators, which makes it difficult for a classifier to separate all positive and negative samples with a single dividing hyperplane. For clustering algorithms, concealed fraud and violations often involve the participation of medical personnel; the sample data generated by such behavior resembles normal sample data to some degree, so normal and abnormal samples become mixed together during clustering.

Against this background, this paper combines clustering and classification algorithms to construct a new medical insurance anomaly detection model, and optimizes the core steps of the model to improve its detection of abnormal medical insurance samples. The main research work of this paper is as follows:

(1) A medical insurance anomaly detection model combining Self-Organizing Feature Map (SOFM) clustering with the Random Forest (RF) classification algorithm is proposed. Based on existing medical insurance hospitalization data, the samples are labeled as abnormal or normal according to the medical insurance review rules. The processed samples are first clustered by the SOFM neural network algorithm, so that abnormal samples fall into different clusters according to the type of fraud or violation. The random forest algorithm is then trained on the positive and negative samples within each cluster to generate a classification model. This is equivalent to establishing multiple dividing hyperplanes in the original medical insurance samples, which improves the identification of abnormal samples.

(2) To optimize the clustering process, principal component analysis (PCA) is used to improve the SOFM neural network clustering algorithm. PCA is applied to the medical insurance samples to eliminate redundant information between variables and reduce the sample dimension. The processed samples are then fed into the SOFM neural network algorithm for training, which effectively reduces the convergence time and the number of iterations in the clustering process.

(3) To address the data imbalance problem, a new sampling model is proposed in which SMOTE oversampling is performed on the basis of the clustering results, yielding the combined SOFM-SMOTE algorithm model.

(4) To make better use of the random forest algorithm's classification ability on medical insurance samples, the random forest algorithm is improved with a weighted base classifier method. Each decision tree is assigned a weight based on a performance metric of its classification results, which takes full advantage of high-performing decision trees and reduces the negative effect of low-performing ones.

The paper applies the proposed medical insurance anomaly detection model to actual medical insurance hospitalization data and compares the effectiveness of the models using accuracy, recall, F1 value, and other metrics. The results show that the proposed model is feasible and efficient in practical application.
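
The weighted base classifier idea in item (4) can be illustrated with a small sketch. The abstract does not say which performance metric is used for the weights or how the weighted votes are combined, so the following assumes (hypothetically) that each tree's weight is its F1 score on a held-out validation set and that the final label is the class with the larger total weight. The function names and the toy data are illustrative and are not taken from the thesis.

```python
from typing import List, Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 for the abnormal class (label 1); returns 0.0 when there are no true positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def tree_weights(y_val: Sequence[int],
                 val_preds_per_tree: Sequence[Sequence[int]]) -> List[float]:
    """Assumption: one weight per tree, equal to its validation F1."""
    return [f1_score(y_val, preds) for preds in val_preds_per_tree]


def weighted_vote(weights: Sequence[float], votes: Sequence[int]) -> int:
    """Combine the 0/1 votes of all trees for one sample, weighting each vote."""
    abnormal = sum(w for w, v in zip(weights, votes) if v == 1)
    normal = sum(w for w, v in zip(weights, votes) if v == 0)
    # Ties go to "normal" (0); this tie rule is also an assumption.
    return 1 if abnormal > normal else 0


# Toy validation labels and the predictions of three hypothetical trees.
y_val = [1, 1, 0, 0, 1, 0]
val_preds = [
    [1, 1, 0, 0, 1, 0],  # strong tree
    [0, 0, 1, 1, 0, 1],  # tree that is always wrong
    [1, 0, 0, 1, 0, 0],  # weak tree
]
weights = tree_weights(y_val, val_preds)

# For a new sample, only the strong tree votes "abnormal".
votes = [1, 0, 0]
label = weighted_vote(weights, votes)
```

With unweighted majority voting, this sample would be labeled normal because two of the three trees vote 0. With the weights above, the single vote of the strong tree outweighs the two weaker trees, which is the effect item (4) describes.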
Keywords/Search Tags:medical insurance anomaly, principal component analysis, SOFM neural network, unbalanced data, random forest algorithm