Research On Anomaly Detection Methods For Financial Data

Posted on: 2020-06-04
Degree: Master
Type: Thesis
Country: China
Candidate: W Liang
Full Text: PDF
GTID: 2428330578455271
Subject: Computer Science and Technology
Abstract/Summary:
To address the low accuracy and high false alarm rate of current anomaly detection algorithms on sequential, unbalanced, massive financial datasets, this thesis studies feature selection and unbalanced classification. It proposes a feature selection algorithm for financial data based on conditional dynamic mutual information, an anomaly detection algorithm for financial data based on isolation forest, and a hybrid anomaly detection algorithm for financial data based on SVM and KNN. The primary research work of the thesis is as follows:

1. Feature selection algorithms that rely on a one-sided evaluation criterion cannot find the optimal feature subset on massive sequential datasets. To address this, a feature selection algorithm for financial data based on conditional dynamic mutual information (CDMIFS) is proposed. Taking the temporal characteristics of the data into account, the algorithm evaluates candidate features from several aspects and measures the correlation between features and anomaly classes by conditional dynamic mutual information on unrecognized samples to obtain the feature subset. The experimental results show that the algorithm can effectively remove irrelevant data from financial datasets and improve classification performance.

2. The randomness of node partitioning in iForest leads to low accuracy and a high false alarm rate. To address this, a node partitioning criterion based on the abnormal cost information gain ratio is proposed, together with an anomaly detection algorithm for financial data based on isolation forest (FA-iForest). The criterion considers the weighted information entropy of attributes and anomaly classes in historical data and sets a cost function that increases the penalty for misjudging anomaly classes. The experimental results show that the algorithm can effectively improve the anomaly detection ability of iForest.

3. SVM and KNN classify poorly on time-series and unbalanced datasets. To address this, a hybrid anomaly detection algorithm for financial data based on SVM and KNN (SVM-KNN) is proposed. The method introduces temporal features, uses a feature-weighted kernel, and sets weights on the penalty factors of different samples. SVM-KNN uses SVM for a preliminary classification and then uses KNN for a secondary classification. The experimental results show that the algorithm can effectively improve the anomaly detection ability of SVM and KNN on financial data.

The research contributions of this thesis are as follows. The contribution of features to classification in historical data is introduced to improve the feature selection algorithm. The weighted information entropy of attributes and anomaly classes in historical data is introduced to optimize the node partitioning criterion of the isolation forest. Temporal features are introduced to build a hybrid anomaly detection algorithm based on SVM and KNN. The experimental results on financial datasets show that the proposed algorithms are effective.
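As a rough illustration of the mutual-information idea behind the feature selection step, the sketch below ranks discrete features by their mutual information with the anomaly label, then by conditional mutual information given the features already chosen. It is not the CDMIFS algorithm from the thesis: the temporal weighting and the re-estimation on unrecognized samples are omitted, and the greedy "minimum conditional mutual information" rule is a well-known generic stand-in. The function names, feature names, and toy data are all hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Return I(X;Y) in bits for two equal-length sequences of discrete values."""
    n = len(xs)
    count_x, count_y, count_xy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log2(c * n / (count_x[x] * count_y[y]))
        for (x, y), c in count_xy.items()
    )


def conditional_mutual_information(xs, ys, zs):
    """Return I(X;Y|Z): I(X;Y) inside each Z group, weighted by group size."""
    n = len(zs)
    total = 0.0
    for z, count in Counter(zs).items():
        idx = [i for i, v in enumerate(zs) if v == z]
        total += (count / n) * mutual_information(
            [xs[i] for i in idx], [ys[i] for i in idx]
        )
    return total


def greedy_select(features, labels, k):
    """Greedily pick up to k feature names (generic rule, not the thesis's).

    The first pick maximizes I(F;C). Each later pick maximizes the smallest
    I(F;C|S) over the already selected features S, so a feature that only
    repeats a selected one scores close to zero.
    """
    selected = []
    candidates = list(features)
    while candidates and len(selected) < k:
        def score(name):
            if not selected:
                return mutual_information(features[name], labels)
            return min(
                conditional_mutual_information(features[name], labels, features[s])
                for s in selected
            )
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Hypothetical toy data: 1 marks an anomalous transaction.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
features = {
    "amount_flag": [0, 0, 0, 1, 1, 1, 1, 1],
    "amount_flag_copy": [0, 0, 0, 1, 1, 1, 1, 1],  # redundant duplicate
    "night_flag": [0, 0, 0, 1, 0, 0, 0, 0],
}
selected = greedy_select(features, labels, 2)
print(selected)
```

On this toy data the duplicate feature carries no extra information once its twin is selected, so the second pick goes to the feature that explains the remaining misclassified sample. Continuous financial attributes would need to be discretized before counts like these can be used.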
Keywords/Search Tags:Anomaly Detection, Financial Data, Feature Selection, Unbalanced Classification, iForest, Support Vector Machine, K-Nearest Neighbor