| Detecting fraudulent credit card transactions has long been an active topic in academic research. The goal is to determine, from a user’s transaction behavior, whether fraud has occurred and to give timely feedback that protects the property of credit card issuers and cardholders. However, because credit card transaction data are highly imbalanced, fraud detection systems sometimes fail to identify and stop fraud in time. With the development of machine learning, many scholars have applied machine learning techniques to fraud detection and proposed numerous methods to improve the accuracy of credit card fraud detection systems, yet data imbalance and detection difficulty remain open problems. To address these problems, this thesis combines resampling with anomaly detection techniques to construct a detection model for fraudulent credit card transactions. The specific research work is as follows. (1) A mixed-sampling method based on the nearest-neighbor geometric space (referred to as K-G-SMOTE) is proposed to construct a balanced sample set of credit card transaction data. To address class imbalance in credit card fraud data sets, the thesis proposes K-G-SMOTE, which combines the advantages of over-sampling and under-sampling. The method clusters the normal transaction samples (majority class samples) by sample distance, constructs a planar geometric space according to the cluster centers, and draws 80% of the samples in that space to complete the under-sampling; then, using a nearest-neighbor sample selection strategy, it constructs a hypersphere geometric space for the fraudulent transaction samples (minority class samples) as the generation space and sets hyperparameters to construct truncation surfaces that progressively reduce the size of the sample 
generation space, and finally synthetic samples are generated in that space to complete the over-sampling. The experimental results show that the proposed K-G-SMOTE method outperforms the other methods selected for the experiments in overall accuracy, achieving an accuracy of 93%, and that it effectively addresses both data imbalance and the fitting problems caused by using a single resampling technique alone. (2) A credit card fraud detection model based on global anomaly detection (G-ADOA) is proposed to improve the accuracy of credit card fraud detection. To address the difficulty of credit card fraud detection, the thesis improves the Anomaly Detection with Partially Observed Anomalies (ADOA) algorithm, chosen for its ability to retain the anomalous features of samples, and proposes the G-ADOA fraud detection model. The model first uses the K-G-SMOTE method to increase the number of fraudulent transaction samples. The thesis then extends ADOA's detection of partially observed anomalies to global anomaly detection, which overcomes the algorithm's limited information acquisition during detection. The model uses the Local Outlier Factor (LOF) and Isolation Forest (IF) algorithms to improve the calculation of each sample's degree of anomaly, forms the fraudulent transaction sample class by clustering, calculates the similarity between each sample and the fraudulent transaction sample class, and finally constructs labels from each sample's degree of anomaly and similarity, sets the confidence of the labels, and builds a multi-class credit card fraud detection model. The experimental results show that the G-ADOA model improves by about 4% on all experimental metrics compared with the unimproved ADOA model, and its precision exceeds 98%, indicating that the model effectively improves the accuracy of fraud detection and helps overcome the difficulty of detecting credit card fraud. |
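
The over-sampling step described in (1) can be illustrated with a minimal sketch. It is not the thesis's K-G-SMOTE implementation; it only shows the general idea of generating a synthetic minority sample inside a hypersphere and then restricting it with a truncation surface. Several details are assumptions made for illustration: the radius is taken as the distance from a fraud sample to its nearest fraud neighbor, the truncation surface is a hyperplane through the center that keeps only the half facing that neighbor, and the function names (`nearest_neighbor`, `sample_in_hypersphere`, `truncate`, `oversample`) are hypothetical.

```python
import math
import random


def nearest_neighbor(center, others):
    """Return the point in `others` closest to `center` (Euclidean distance)."""
    return min(others, key=lambda p: math.dist(center, p))


def sample_in_hypersphere(center, radius, rng):
    """Draw one point uniformly from the ball of `radius` around `center`."""
    dim = len(center)
    # A normalised Gaussian vector gives a uniformly random direction.
    direction = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(d * d for d in direction)) or 1.0
    # Scaling by u**(1/dim) makes the point uniform in volume, not in radius.
    scale = radius * rng.random() ** (1.0 / dim) / norm
    return [c + scale * d for c, d in zip(center, direction)]


def truncate(point, center, neighbor):
    """Assumed truncation rule: mirror `point` into the half-ball facing `neighbor`."""
    axis = [n - c for n, c in zip(neighbor, center)]
    offset = [p - c for p, c in zip(point, center)]
    dot = sum(a * o for a, o in zip(axis, offset))
    if dot >= 0:
        return point  # already on the neighbor's side of the truncation surface
    axis_sq = sum(a * a for a in axis)
    # Reflect across the hyperplane through `center` that is orthogonal to `axis`.
    return [p - 2.0 * dot / axis_sq * a for p, a in zip(point, axis)]


def oversample(minority, n_new, seed=0):
    """Generate `n_new` synthetic minority samples (needs at least two inputs)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        center = rng.choice(minority)
        others = [p for p in minority if p is not center]
        neighbor = nearest_neighbor(center, others)
        radius = math.dist(center, neighbor)
        point = sample_in_hypersphere(center, radius, rng)
        synthetic.append(truncate(point, center, neighbor))
    return synthetic


if __name__ == "__main__":
    fraud = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]  # toy minority-class samples
    for p in oversample(fraud, 3, seed=1):
        print([round(x, 3) for x in p])
```

Because the reflection preserves the distance to the center, every synthetic point stays inside the hypersphere of the fraud sample it was generated from. A stricter truncation hyperparameter would shrink the generation space further, as the abstract describes.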