| Medical insurance violations and fraud mainly refer to actions that violate medical insurance management rules and policies, such as fabricating insurance incidents or using other means to defraud the medical insurance fund, directly or indirectly, for economic gain. These illegal actions seriously interfere with the normal operation of the medical insurance system, endanger the safety of the medical insurance fund, and damage the interests of the insured. With the progress of medical insurance informatization, medical institutions at all levels have accumulated a mass of medical service data, including diagnosis information, diagnosis and treatment details, prescription details and digital medical records generated during medical service. These data contain medical service knowledge and rules, but also hide a very small number of fraud records. Medical insurance fraud detection must therefore be carried out on large-scale medical data, distinguishing a very small number of fraud records from the vast majority of normal and reasonable records. Because medical insurance data are large in volume, rapidly generated, high-dimensional, and unevenly distributed, and because illegal behavior is well concealed, medical insurance fraud detection is a challenging task. First, the rules of the treatment process are hidden in the patient's diagnosis information, diagnosis and treatment details, and prescription details. However, because of the vast number of treatment and medication items that represent medical activities, frequent pattern mining is prone to the curse of dimensionality: the number of frequent patterns that can be found declines sharply, and the treatment process cannot be identified effectively. Second, collusive fraud is one of the most common fraud types, occurring in numerous application areas such as medicine, law enforcement, and finance. However, collusive fraud detection is a difficult problem because fraudsters make up only a
very small part of the population and do everything they can to bypass fraud detection constraints. Most existing fraud detection studies focus on finding normal behavior patterns and treat those whose behavior violates these patterns as fraudsters. However, these methods generally have high false-positive rates because normal people may also sometimes deviate from normal behavior patterns. Third, concealed medical malpractice usually involves medical professionals, who try every means to avoid triggering detection rules, so it is difficult to detect with traditional clustering or outlier detection methods. Moreover, the number of medical professionals (such as doctors) is small relative to the volume of patient treatment data, and the information recorded about them in the medical insurance system is not comprehensive, so it is difficult to cluster doctors effectively based on medical data alone. This paper focuses on fraud detection in large-scale medical insurance data. It aims to identify medical service consumption behavior and medical treatment processes that deviate substantially from the norm and to narrow down the set of suspected doctors and patients, so that suspected fraud can be located precisely. The paper systematically studies several challenging practical problems: abnormal medical record detection, abnormal medical process detection, and doctor fraud detection. The main contributions are summarized as follows. This paper proposes an abnormal medical process discovery method based on coarse-grained behavior patterns. It clusters the vast number of medical activities, mines frequent patterns in the coarse-grained activity sequences to obtain medical process rules, and then detects abnormal medical processes that violate these rules, which avoids the curse of dimensionality and improves the efficiency of rule mining. The numerous fine-grained medical activities are converted into a weighted behavior graph and then processed by a
semi-supervised manifold learning method, SSIsomap, that maps the behaviors into Euclidean space; the known category labels of the related behavior data are then used to cluster the medical activities into coarse-grained activity classes. Frequent pattern mining is then applied to the re-encoded coarse-grained behavior sequences to find behavior patterns, that is, the rules of medical processes, which improves the efficiency of rule mining. Finally, each record under examination is assigned a probability of fraud according to its similarity to the behavior patterns. The method effectively avoids the curse of dimensionality, and experiments on a large medical insurance data set demonstrate its effectiveness and efficiency. To address the high false-positive rate of existing collusive fraud detection methods, we propose an abnormal-group-based collusive fraud detection method named AGBCFD. By mining abnormal groups in a person similarity adjacency graph, this method distinguishes suspicious fraudsters from normal persons who merely exhibit unusual behavior, which reduces the false positives caused by non-fraudulent abnormal behavior. Extensive experiments on medical insurance data show that our approach improves the precision of collusive fraud detection by more than 20% compared to conventional methods. This paper also proposes a doctor fraud detection method based on heterogeneous network
community outlier detection. Using the relationships among entities in the medical insurance domain, clusters of similar physicians are obtained by community division; the subtle changes in the sequences of physician behavior features within each cluster are then analyzed, and a feature correlation probability model is constructed to detect outliers in the cluster and thereby discover fraudulent doctors. Specifically, a physician-medicine heterogeneous information network is constructed from doctors, patient prescriptions, drugs, and the relations among them. The paper then proposes a constrained, overlapping community division algorithm based on label propagation, which places doctors who write similar prescriptions in the same community. By analyzing the correlation and variation range of the behavioral features of doctors in the same community, a feature correlation probability model is constructed, and subtle changes in the feature value sequences are used to detect hidden medical malpractice and fraudulent doctors. Because the method searches for outlier records within the communities produced by the division, the complex problem is decomposed into smaller ones, which reduces the complexity of the algorithm and improves both time efficiency and outlier recognition efficiency. |
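
The community-then-outlier idea in the doctor fraud detection method can be illustrated with a minimal sketch. It is not the paper's algorithm: plain, non-overlapping label propagation stands in for the constrained, overlapping community division, and a median-absolute-deviation test on a single feature stands in for the feature correlation probability model. All function names, the toy prescriptions, and the `avg_cost` feature are hypothetical.

```python
from collections import defaultdict
from statistics import median


def doctor_similarity(prescriptions):
    """Project doctor-drug edges onto a doctor-doctor graph weighted by Jaccard overlap."""
    drugs = defaultdict(set)
    for doctor, drug in prescriptions:
        drugs[doctor].add(drug)
    graph = {d: {} for d in drugs}
    doctors = sorted(drugs)
    for i, a in enumerate(doctors):
        for b in doctors[i + 1:]:
            shared = len(drugs[a] & drugs[b])
            if shared:
                weight = shared / len(drugs[a] | drugs[b])
                graph[a][b] = weight
                graph[b][a] = weight
    return graph


def label_propagation(graph, max_rounds=20):
    """Plain asynchronous label propagation; ties go to the smallest label."""
    labels = {node: node for node in graph}
    for _ in range(max_rounds):
        changed = False
        for node in sorted(graph):
            if not graph[node]:
                continue  # isolated doctor keeps its own label
            votes = defaultdict(float)
            for neighbour, weight in graph[node].items():
                votes[labels[neighbour]] += weight
            best = max(sorted(votes), key=votes.get)
            if best != labels[node]:
                labels[node] = best
                changed = True
        if not changed:
            break
    return labels


def community_outliers(labels, feature, k=5.0):
    """Flag doctors whose feature lies more than k MADs from their community median."""
    communities = defaultdict(list)
    for doctor, label in labels.items():
        communities[label].append(doctor)
    flagged = []
    for members in communities.values():
        values = [feature[d] for d in members]
        center = median(values)
        mad = median([abs(v - center) for v in values])
        if mad == 0:
            continue  # no spread in this community, nothing to compare against
        flagged.extend(d for d in members if abs(feature[d] - center) > k * mad)
    return sorted(flagged)


# Toy data (assumed): d1..d5 prescribe the same two drugs, d6..d8 another two.
prescriptions = (
    [(d, drug) for d in ("d1", "d2", "d3", "d4", "d5") for drug in ("drug_a", "drug_b")]
    + [(d, drug) for d in ("d6", "d7", "d8") for drug in ("drug_x", "drug_y")]
)
# Hypothetical behavior feature: average cost per prescription.
avg_cost = {"d1": 100, "d2": 105, "d3": 95, "d4": 102, "d5": 400,
            "d6": 400, "d7": 410, "d8": 390}

labels = label_propagation(doctor_similarity(prescriptions))
suspects = community_outliers(labels, avg_cost)
print(suspects)  # ['d5']
```

In this toy data, an average cost of 400 is flagged for d5 but not for d6, because each doctor is compared only with peers who write similar prescriptions. Restricting the outlier test to one community at a time also keeps each comparison small, which is the decomposition benefit the method relies on.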