Research On Online Transaction Fraud Detection Based On Imbalanced Data Stream

Posted on: 2020-11-01
Degree: Master
Type: Thesis
Country: China
Candidate: M Xu
GTID: 2428330596476652
Subject: Engineering
Abstract/Summary:
Advances in computer performance and data science in recent years have driven the rapid development of the Internet. The online trading model, represented by third-party payment services, has become one of the mainstream ways of consuming, and at any moment people everywhere are carrying out tens of thousands of transactions through network-based online trading platforms. As the volume of online payment orders expands and the total transaction amount rises substantially, online transaction fraud of various types is becoming more and more common. Online transaction risk management has always been of great concern to the industry, and because risk identification is an important part of risk management, studying it has great practical significance. Online transaction fraud occurs infrequently but causes great harm, and general anti-fraud measures struggle to identify and prevent it effectively. As machine learning and data mining technologies have matured, applying them to online transaction risk identification has become a research trend in recent years.

Research on online transaction risk identification faces two main difficulties. The first is that fraud accounts for only a small portion of all trading activity, so the class distribution of the data is highly imbalanced: the imbalance ratio can reach 1 in 10,000, and some fraudulent transactions even share the same feature information as normal ones. Imbalanced-data classification methods are therefore needed. The second is concept drift in the transaction data stream.

To address these two problems, imbalanced data and concept drift of the data stream, this paper builds on previous research results and proposes M-XGB-SMOTE, an improved algorithm based on XGBoost and SMOTE. The core idea is to combine the strong binary classification ability of the XGBoost algorithm with the strong robustness of SMOTE. Classifiers are trained over multiple rounds of resampling, a subset is selected using the AUROC evaluation score as the criterion, and the classification results of the selected classifiers are integrated into a single prediction model in order to improve the overall predictive performance. On this basis, MS-XGB-SMOTE is proposed, which samples historical data according to a decay law: because the importance of historical samples decreases over time, this reduces the negative impact of concept drift in the data stream on the model. Experiments on transaction risk identification data sets show that the AUC of M-XGB-SMOTE is significantly higher than that of other traditional algorithms, and that the overall predictive ability of MS-XGB-SMOTE is significantly improved compared with traditional imbalanced data stream classification algorithms.
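The selection-and-integration step described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the candidate classifiers are assumed to be plain scoring functions (stand-ins for XGBoost models trained on SMOTE-resampled rounds), the names `auroc`, `select_top_k`, and `ensemble_score` are hypothetical, and averaging the selected scores is an assumed integration rule, since the abstract does not specify one.

```python
from statistics import mean
from typing import Callable, List, Sequence

# Assumption: each candidate classifier is a function mapping a feature
# vector to a fraud score (a stand-in for one trained resampling round).
Scorer = Callable[[Sequence[float]], float]


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Chance that a random fraud sample outscores a random normal one.

    Ties count as half a win. Quadratic in sample count, fine for a sketch.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one sample of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def select_top_k(candidates: List[Scorer],
                 X_val: Sequence[Sequence[float]],
                 y_val: Sequence[int],
                 k: int) -> List[Scorer]:
    """Keep the k candidates with the highest validation AUROC."""
    ranked = sorted(
        candidates,
        key=lambda clf: auroc(y_val, [clf(x) for x in X_val]),
        reverse=True,
    )
    return ranked[:k]


def ensemble_score(selected: List[Scorer], x: Sequence[float]) -> float:
    """Integrate the selected classifiers by averaging (an assumed rule)."""
    return mean(clf(x) for clf in selected)


if __name__ == "__main__":
    # Toy validation set: one feature, label 1 marks fraud.
    X_val = [[0.1], [0.4], [0.35], [0.8]]
    y_val = [0, 0, 1, 1]
    candidates = [lambda x: 1.0 - x[0], lambda x: 0.5, lambda x: x[0]]
    chosen = select_top_k(candidates, X_val, y_val, k=2)
    print(ensemble_score(chosen, [0.9]))
```

In a fuller pipeline each candidate would come from a separate resampling round, and the validation set would keep the original class imbalance so that the AUROC ranking reflects real operating conditions.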
Keywords/Search Tags: Imbalanced Data, Concept Drift, Online Transaction