With the rapid development of e-commerce, consumers increasingly pay by credit card, which has led more and more merchants to adopt POS clients. Merchant acquiring institutions assign each merchant a Merchant Category Code (MCC) according to its main line of business. "Illegal Use of Merchant Category Code" refers to third-party payment agencies applying Merchant Category Codes illegally in order to enjoy lower fees.

Specifically, this paper makes the following contributions.

1) Establishing standard behavior patterns based on Merchant Category Code
Different industries follow different business rules; we call these the "Industry Behavior Pattern". This paper uses a hierarchical clustering algorithm to obtain N behavior patterns for merchants that share the same Merchant Category Code.

2) Establishing a fraud detection model
Features are extracted from business transactions and merchant information. This paper uses a probabilistic classifier, logistic regression, to solve the detection problem. Experimental results show that the proposed algorithm achieves recall and accuracy both over 80%.

3) Establishing a distributed system for the fraud detection model
In the face of explosive data growth, distributed computing is a clear trend in technology development. This paper therefore designs a distributed version of the fraud model based on the Hadoop platform. We use the HDFS distributed file system to store massive data files and MapReduce to improve detection efficiency. This part is also described in detail in this paper.

To sum up, the fraud model proposed in this paper offers both high accuracy and good time efficiency. The design has practical significance and also provides a useful reference for other big data problems in the field of finance.
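As a rough illustration of how the first two steps could fit together, the sketch below clusters merchants of one MCC into N behavior patterns and then trains a logistic regression on the distance from a merchant to its nearest pattern. It is a minimal single-machine sketch, not the implementation described in this paper: the two features (scaled average ticket size and scaled daily transaction count), the sample values, the centroid linkage, and the use of "distance to nearest pattern" as the classifier input are all assumptions made for the example.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def centroid(points):
    n = len(points)
    return [sum(col) / n for col in zip(*points)]

def agglomerative(points, n_patterns):
    """Bottom-up hierarchical clustering (centroid linkage, an assumption).
    Repeatedly merges the two closest clusters until n_patterns remain,
    then returns one centroid per cluster as the behavior pattern."""
    clusters = [[p] for p in points]
    while len(clusters) > n_patterns:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = euclidean(centroid(clusters[i]), centroid(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [centroid(c) for c in clusters]

def pattern_distance(x, patterns):
    """Distance from a merchant to the closest behavior pattern."""
    return min(euclidean(x, p) for p in patterns)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit p(misuse | x) = sigmoid(w * x + b) on one feature
    with batch gradient descent on the log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        w -= lr * sum(e * x for e, x in zip(errors, xs)) / n
        b -= lr * sum(errors) / n
    return w, b

def is_suspicious(x, patterns, w, b, threshold=0.5):
    return sigmoid(w * pattern_distance(x, patterns) + b) >= threshold

# Hypothetical merchants registered under the same MCC:
# (scaled average ticket size, scaled daily transaction count).
normal = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (5.0, 5.0), (5.1, 4.8), (4.9, 5.2)]
misuse = [(9.0, 1.0), (1.0, 8.0), (10.0, 10.0)]

# Step 1: derive N = 2 behavior patterns from the normal merchants.
patterns = agglomerative(normal, n_patterns=2)

# Step 2: train the classifier on the distance-to-pattern feature.
xs = [pattern_distance(m, patterns) for m in normal + misuse]
ys = [0] * len(normal) + [1] * len(misuse)
w, b = train_logistic(xs, ys)

print(is_suspicious((5.0, 5.1), patterns, w, b))
print(is_suspicious((9.5, 0.5), patterns, w, b))
```

In a Hadoop setting, the per-merchant feature extraction and scoring would be the natural candidates for the map phase, since each merchant can be processed independently once the patterns and model weights are available; that split is likewise an assumption and is not shown here.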