Outlier Detection By Using Synthetic Strategy Support Vector Machine

Posted on:2011-08-20

Degree:Master

Type:Thesis

Country:China

Candidate:M L Liu

GTID:2178330332461513

Subject:Control theory and control engineering

Abstract/Summary:

Data mining, an effective tool for discovering useful information and latent knowledge in massive data, is attracting increasing attention. Outlier detection aims to identify data objects that do not conform to the general behavior of a dataset and to extract useful information from them. Because it serves a wide range of applications, such as detecting crime in financial activities, medical diagnosis, image processing, and computer network intrusion detection, it has become an important research area of data mining and has made great progress. However, as the application fields expand, existing methods cannot meet the requirements and run into problems in aspects such as the generalization ability, robustness, and stability of the models, so more effective and stable outlier detection methods are needed. Moreover, in practical applications there are usually plenty of normal data and very few outliers, which are difficult or expensive to obtain. Most outlier detection methods therefore build a model or function from the normal data alone and rarely use the labeled outliers, so the resulting models may not reflect the actual situation in classification.

This thesis studies outlier detection for problems with a large amount of normal data and few labeled outliers, and presents a new method called the synthetic strategy support vector machine, which is based on support vector machine and kernel theory. The basic idea of the algorithm is as follows. The problem is first treated as a binary classification problem with imbalanced data, and a hyperplane that separates the normal data from the outliers with maximum margin is built in the feature space. Because misclassifying an outlier is usually very costly, the hyperplane is at the same time placed as close as possible to the normal data in order to improve the detection accuracy for outliers.
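
For orientation only, the cost-sensitive idea behind treating outlier detection as an imbalanced binary problem can be illustrated with the generic soft-margin SVM that uses different error costs per class. This is a standard textbook formulation, not the thesis's own model, which the abstract does not give; the symbols $w$, $b$, $\xi_i$, $\phi$, $C_{\mathrm{out}}$, and $C_{\mathrm{norm}}$ are assumptions introduced here, with $y_i = +1$ for outliers and $y_i = -1$ for normal data.

```latex
\begin{aligned}
\min_{w,\,b,\,\xi}\quad & \tfrac{1}{2}\lVert w \rVert^{2}
  + C_{\mathrm{out}} \sum_{i:\,y_i=+1} \xi_i
  + C_{\mathrm{norm}} \sum_{i:\,y_i=-1} \xi_i \\
\text{s.t.}\quad & y_i\bigl(w^{\top}\phi(x_i) + b\bigr) \ge 1 - \xi_i,
  \qquad \xi_i \ge 0, \qquad i = 1,\dots,n .
\end{aligned}
```

Choosing $C_{\mathrm{out}} > C_{\mathrm{norm}}$ penalizes a misclassified outlier more heavily than a misclassified normal point, which shifts the separating hyperplane toward the normal data. The additional strategy the thesis uses to place the hyperplane close to the normal data is not captured by this sketch.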
The thesis describes the algorithm in detail, covering its mathematical model, dual problem, and solution. Finally, simulation experiments are carried out on three kinds of datasets, comprising six outlier detection datasets, and three evaluation metrics (g-means, true positive rate, and false positive rate) are used to evaluate the algorithm from different aspects. The experimental results show that, compared with existing methods, the algorithm not only improves outlier detection performance but also classifies the normal data correctly.
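
The three evaluation metrics named above can be computed from a confusion matrix. The following is a minimal Python sketch, assuming the usual definitions with outliers as the positive class: true positive rate TP/(TP+FN), false positive rate FP/(FP+TN), and g-means as the geometric mean of the true positive rate and the true negative rate. The function names and the toy labels are hypothetical and are not taken from the thesis.

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FN, FP, TN, treating `positive` (the outlier label) as the positive class."""
    tp = fn = fp = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    return tp, fn, fp, tn


def outlier_metrics(y_true, y_pred, positive=1):
    """Return true positive rate, false positive rate, and g-means as a dict."""
    tp, fn, fp, tn = confusion_counts(y_true, y_pred, positive)
    # Guard against empty classes so the ratios are always defined.
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    # g-means: geometric mean of sensitivity (TPR) and specificity (1 - FPR).
    g_means = math.sqrt(tpr * (1.0 - fpr))
    return {"tpr": tpr, "fpr": fpr, "g_means": g_means}


# Made-up toy labels for illustration: 1 = outlier, 0 = normal.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 1, 0]
metrics = outlier_metrics(y_true, y_pred)
print(metrics)
```

Unlike plain accuracy, g-means drops to zero when a classifier labels every point as normal, which is why it is often preferred for imbalanced problems with few outliers.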

Keywords/Search Tags:

Outlier Detection, One-Class Classification, Support Vector Machine, Kernel Function

Related items

1	Research On Outlier Detection Based On Support Vector Machines
2	The Research Of Classification Algorithm Based On Support Vector Machine
3	Support Vector Machine Learning Algorithms Based On Within-Class Structure
4	The Research About The Relationship Of Classification And Regression Of The Support Vector Machines
5	Research On Kernel Function And Parameter Selection In Support Vector Machine And Its Application
6	Research On Engineering Applications Of Support Vector Machine
7	Study On Support Vector Machine Based On Classification Noise Detection
8	High-dimensional Anomaly Detection Based On Neural Networks Dimensionality Reduction And Support Vector Machine Classification
9	Research On Support Vector Machine Algorithm For Binary Classification Problem
10	Study On Some Support Vector Machine Algorithms And Their Applications