Identification Of Fraud Detection Based On High Dimensional And Unevenly Distributed Online Transaction Data

Posted on: 2021-01-29

Degree: Master

Type: Thesis

Country: China

Candidate: S Gao

Full Text: PDF

GTID: 2428330647459594

Subject: Applied Statistics

Abstract/Summary:

As daily life becomes increasingly digital, the volume of online transaction data is growing rapidly. Financial fraud has risen sharply at the same time, causing large losses for financial institutions. Combining the strengths of supervised and unsupervised learning, this thesis examines three open problems in fraud identification. The first is class imbalance. Building on the idea behind the DBSMOTE algorithm, a GMM-SMOTE algorithm is proposed that oversamples the positive samples by linear interpolation. A comparison experiment was designed, and it shows that DBSMOTE performs better than GMM-SMOTE on this data set. The second is covariate shift. To address it, transfer learning is introduced: we conjecture and verify that probability density ratio estimation (based on a Kullback-Leibler divergence algorithm) handles covariate shift more effectively than the machine learning confirmation method, which improves model performance. The third is the limitation of time. Fraud schemes change over time, and unsupervised and supervised learning have complementary strengths in outlier detection. This thesis combines the CatBoost algorithm with the strengths of Isolation Forest to improve recall in fraud detection, and designs comparison experiments across Isolation Forest, Hybrid Isolation Forest, and Extended Isolation Forest. The results show that the Hybrid Isolation Forest algorithm performs better in fraud detection. For feature selection, in addition to conventional correlation and outlier criteria, the concept of temporal consistency is used to filter features, dropping variables whose predictive performance is poor across time spans. For feature construction, sliding time window features and grouped aggregate features are considered.
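The oversampling step mentioned in the abstract, linear interpolation between positive samples, can be illustrated with a minimal sketch. This is plain SMOTE-style interpolation only, not the thesis's GMM-SMOTE or DBSMOTE: the clustering stage that distinguishes those methods is left out, and the function name `smote_oversample`, the parameter `k`, and the toy `fraud` points are all assumptions made for illustration.

```python
import math
import random


def smote_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between minority samples.

    Plain SMOTE-style sketch (hypothetical helper): the GMM or density-based
    clustering used by GMM-SMOTE / DBSMOTE is deliberately not implemented.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours of the base sample,
        # excluding the base sample itself.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # interpolation weight in [0, 1)
        # New point lies on the segment between the base and its neighbour.
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


# Toy positive (minority) samples, made up for illustration.
fraud = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
new_points = smote_oversample(fraud, n_new=6)
```

Each synthetic point is a convex combination of two existing positive samples, so it stays inside the region those samples span. Methods of the kind named in the abstract would first group the positive samples and then choose interpolation partners within a group.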

Keywords/Search Tags:

Fraud Detection Model, Transfer Learning, Anomaly Detection, Class Imbalance, Covariate Shift

Related items

1	Research On Credit Fraud Detection Based On Machine Learning
2	Design And Implementation Of Credit Card Fraud Detection Based On Vertical Federated Learning
3	An Imbalanced Approach Towards Credit Card Fraud Detection Using Proximity Based Resampling And Classifier Ranking
4	Design And Implementation Of Fraud Detection System For Online Transaction
5	Research On Credit Card Fraud Detection Model Based On Global Anomaly Detection
6	Research On Credit Card Fraud Detection Model Based On KBX Integrated Learning
7	Research On Anomaly Detection Methods Of One-Class Classification Model
8	Research On Transfer-sampling Based Method For Class-imbalance Learning
9	Research On Clustering And Anomaly Detection Based On Soft Computing
10	Research On Network Traffic Anomaly Detection Methods Based On Deep Learning