Research On Anomaly Detection Methods For Financial Data

Posted on:2020-06-04

Degree:Master

Type:Thesis

Country:China

Candidate:W Liang

GTID:2428330578455271

Subject:Computer Science and Technology

Abstract/Summary:

To address the low accuracy and high false alarm rate of current anomaly detection algorithms on massive financial datasets that are sequential and unbalanced, this thesis studies feature selection and unbalanced classification. It proposes a feature selection algorithm for financial data based on conditional dynamic mutual information, an anomaly detection algorithm for financial data based on the isolation forest (iForest), and a hybrid anomaly detection algorithm for financial data based on SVM and KNN. The main research work of the thesis is as follows:

1. Feature selection algorithms that rely on a one-sided evaluation criterion cannot obtain the optimal feature subset on massive sequential datasets. To address this, a feature selection algorithm for financial data based on conditional dynamic mutual information (CDMIFS) is proposed. Taking the temporal characteristics of the data into account, the algorithm evaluates candidate features from several aspects and measures the correlation between features and anomaly classes by conditional dynamic mutual information computed on the unrecognized samples, which yields the feature subset. The experimental results show that the algorithm can effectively remove irrelevant data from financial datasets and improve classification performance.

2. The random node partitioning of iForest leads to low accuracy and a high false alarm rate. To address this, a node partition criterion based on the anomaly-cost information gain ratio is proposed, together with an anomaly detection algorithm for financial data based on the isolation forest (FA-iForest). The criterion considers the weighted information entropy of attributes and anomaly classes in historical data, and sets a cost function that increases the penalty for misjudging anomaly classes. The experimental results show that the algorithm can effectively improve the anomaly detection ability of iForest.

3. SVM and KNN classify poorly on time series and unbalanced datasets. To address this, a hybrid anomaly detection algorithm for financial data based on SVM and KNN (SVM-KNN) is proposed. The method introduces temporal features, uses feature-weighted kernels, and sets weights on the penalty factors of different samples. SVM-KNN first uses SVM for a preliminary classification and then uses KNN for a second classification. The experimental results show that the algorithm can effectively improve the anomaly detection ability of SVM and KNN on financial data.

The research contributions of this thesis are as follows. The contribution of features to classification in historical data is introduced to improve the feature selection algorithm; the weighted information entropy of attributes and anomaly classes in historical data is introduced to optimize the node partition criterion of the isolation forest; temporal features are introduced to build a hybrid anomaly detection algorithm based on SVM and KNN. The experimental results on financial datasets show that the proposed algorithms are effective.
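The idea of scoring features by mutual information on the samples that are still unrecognized can be illustrated with a small sketch. The code below is not the thesis's CDMIFS: it is a plain greedy selection with dynamic mutual information on discrete features, it omits the conditional term and the temporal characteristics, and it assumes that a sample counts as "recognized" once every remaining sample sharing its value of the chosen feature has the same label. The function names and the data layout are hypothetical.

```python
from collections import Counter, defaultdict
from math import log2


def mutual_information(xs, ys):
    """Empirical mutual information I(X; Y) in bits for two discrete sequences."""
    n = len(xs)
    if n == 0:
        return 0.0
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    # Sum of p(x,y) * log2(p(x,y) / (p(x) * p(y))) over observed pairs.
    return sum((c / n) * log2(c * n / (px[x] * py[y])) for (x, y), c in pxy.items())


def dynamic_mi_select(rows, labels, k):
    """Greedily pick up to k feature indices, re-scoring on unrecognized samples.

    rows   : list of tuples of discrete feature values (assumed layout)
    labels : list of class labels, e.g. 0 = normal, 1 = anomaly
    """
    remaining = set(range(len(rows)))        # indices of unrecognized samples
    candidates = set(range(len(rows[0])))    # feature indices not yet chosen
    selected = []
    while candidates and remaining and len(selected) < k:
        idx = sorted(remaining)
        ys = [labels[i] for i in idx]
        scores = {f: mutual_information([rows[i][f] for i in idx], ys)
                  for f in candidates}
        best = max(sorted(candidates), key=lambda f: scores[f])
        if scores[best] <= 1e-12:
            break  # no candidate is informative about the remaining samples
        selected.append(best)
        candidates.remove(best)
        # Assumption: a sample is recognized when its value group is label-pure.
        groups = defaultdict(set)
        for i in idx:
            groups[rows[i][best]].add(labels[i])
        remaining = {i for i in idx if len(groups[rows[i][best]]) > 1}
    return selected


# Toy data: feature 0 separates most samples, feature 1 is constant,
# feature 2 resolves the two samples that feature 0 leaves ambiguous.
rows = [(0, 0, 0), (0, 0, 1), (0, 0, 0), (1, 0, 0),
        (1, 0, 1), (2, 0, 0), (2, 0, 1), (2, 0, 0)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
print(dynamic_mi_select(rows, labels, 3))  # [0, 2]
```

On the toy data, feature 2 has low mutual information with the label over the full dataset, yet it is chosen second because it fully separates the two samples that remain unrecognized after feature 0. Re-scoring on the shrinking sample set is what distinguishes the dynamic criterion from a static ranking.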

Keywords/Search Tags:

Anomaly Detection, Financial Data, Feature Selection, Unbalanced Classification, iForest, Support Vector Machine, K-Nearest Neighbor

Related items

1	Research On Network Intrusion Detection Based On Support Vector Machine Combine With K Nearest Neighbor Method
2	Research On Machine Learning Methods For Intelligent Decision-Making
3	Research On The Joint Classification Based On Support Vector Machine And K-nearest Neighbor
4	Text Sentiment Analysis Based On Text Classification
5	High-dimensional Unbalanced Data Set Classification Algorithm Based On Support Vector Machine And Its Application
6	Anomaly Detection Methods Based On Deep Support Vector Machines And The Corresponding Applications
7	Research On Image Classification Based On Support Vector Machine
8	The Research Of Intrusion Detection Based On Support Vector Machine
9	Automatic Classification Research On Chinese Web Document Orientation
10	Anomaly Detection System For Big Data Of Mobile Printed Circuit Board Industry