Research On The Classification Method Of Imbalanced Data For Fraud

Posted on: 2021-04-16
Degree: Master
Type: Thesis
Country: China
Candidate: Q Li
Full Text: PDF
GTID: 2428330614971343
Subject: Computer Science and Technology
Abstract/Summary:
With the development of the Internet, credit card payment has become a popular payment method. However, credit card fraud is on the rise and causes huge losses worldwide. In the field of risk prevention and control, a large number of traditional rule-based risk control systems are still in use, but many researchers have begun to develop systems based on machine learning, and these systems are attracting more and more attention. Datasets of credit card transactions and bill repayments are highly imbalanced: in real data, legitimate transactions far outnumber fraudulent ones, and on-time payments far outnumber overdue bills. This imbalance has a large impact on the detection performance of a risk control system. Existing methods mainly consider how to balance the two classes by sample count. They do not consider the complexity of user behavior in credit card transactions, and they ignore the connections between a user's behaviors and how those behaviors change; that is, every purchase and bill repayment by a user may be influenced by other, similar behaviors. Based on this, the main work of this thesis is as follows:

(1) This thesis defines behavior noise and proposes a behavior-cluster-based imbalanced noise reduction method (CNR) that targets the complexity of user behavior in imbalanced data. Working from the perspective of user behavior, CNR analyzes transactions that share the same label, summarizes them into group behavior patterns, and removes noise samples that do not conform to those patterns. In this way it both resamples the data and keeps the distribution of user behavior reasonable. Compared with leading existing methods for imbalanced data on 18 imbalanced UCI datasets, CNR achieves the best F1 score on 13 datasets and the best AUC on 10 datasets. It also achieves the best F1 score on the credit card fraud and credit card default datasets.

(2) This thesis proposes a graph attention network classification model based on user behavior correlation with noise reduction (GAT-CNR). GAT-CNR constructs the graph structure from the perspective of user behavior. A graph neural network (GNN) is used to learn the connections between nodes in the graph and to enrich the representation of node information. During classification, GAT-CNR uses an attention mechanism to organize the aggregation of node information in the graph and form effective features for prediction. Compared with existing methods for imbalanced data, the noise-reduction-based graph attention network method obtains the best F1 score on both the credit card fraud and credit card default datasets, which demonstrates the effectiveness of the proposed GAT-CNR.
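The abstract does not give the details of CNR, so the following is only a minimal sketch of the general idea in contribution (1): cluster the samples of each label into behavior patterns, then drop samples that sit closer to another label's pattern than to any pattern of their own label. The function names `kmeans_centroids` and `remove_behavior_noise`, the use of plain k-means, the Euclidean distance, and the "closer to another label" noise rule are all assumptions made for illustration; they are not taken from the thesis.

```python
import math
from collections import defaultdict


def kmeans_centroids(points, k, iters=20):
    """Plain Lloyd's k-means; deterministic init from the first k points."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iters):
        # Assignment step: group each point with its nearest centroid.
        groups = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda c: math.dist(p, centroids[c]))
            groups[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        for c, members in enumerate(groups):
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def remove_behavior_noise(X, y, k=1):
    """Return indices of samples kept after the (assumed) noise rule.

    A sample is treated as noise when it lies closer to a cluster centroid
    of a different label than to every centroid of its own label.
    """
    by_label = defaultdict(list)
    for point, label in zip(X, y):
        by_label[label].append(point)

    # One set of "behavior pattern" centroids per label.
    patterns = {label: kmeans_centroids(pts, min(k, len(pts)))
                for label, pts in by_label.items()}

    keep = []
    for i, (point, label) in enumerate(zip(X, y)):
        own = min(math.dist(point, c) for c in patterns[label])
        others = [math.dist(point, c)
                  for lbl, cs in patterns.items() if lbl != label
                  for c in cs]
        if not others or own <= min(others):
            keep.append(i)
    return keep


# Toy data: label 0 = legitimate, label 1 = fraud (hypothetical values).
# Sample 4 is labeled 0 but sits inside the label-1 group.
X = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
y = [0, 0, 0, 0, 0, 1, 1, 1]
print(remove_behavior_noise(X, y, k=1))  # [0, 1, 2, 3, 5, 6, 7]
```

In this toy run the only sample removed is the one labeled legitimate that lies among the fraud samples. A real pipeline would need feature scaling and a principled choice of `k`, both of which are left open here.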
Keywords/Search Tags:Imbalanced data, Behavioral noise, Attention mechanism, Graph neural network