
A Study On Algorithm For Classification Based On Support Vector Data Description

Posted on: 2010-10-27
Degree: Master
Type: Thesis
Country: China
Candidate: Z Z Yang
Full Text: PDF
GTID: 2178330338475886
Subject: Computer application technology
Abstract/Summary:
Statistical learning theory (SLT), developed by Vapnik et al., is a statistical framework for small samples: it mainly concerns the statistical principles that apply when samples are limited, and especially the properties of the learning procedure in such cases. SLT provides a new framework for the general learning problem, along with a new method called the support vector machine (SVM). SVM can cope with practical problems such as overfitting, high dimensionality, and local minima, which afflict most learning methods. Practical comparisons have shown that SVM is competitive with existing methods such as neural networks and decision trees. As the leading learning theory for small samples, statistical learning theory and SVM are attracting more and more researchers and becoming a hotspot in the fields of artificial intelligence and machine learning. However, because of their relatively late appearance, many aspects of SVM remain immature and incomplete, and further research and improvement are needed. Among the many lines of study, researchers attach particular importance to Support Vector Data Description (SVDD) because of its distinctive properties.

This paper is organized as follows. The first and second chapters introduce the background and current state of SVM research, together with statistical learning theory; we then discuss the main ideas and common formulations of the two-class SVM. In the third chapter, we introduce Support Vector Data Description, discuss several SVDD variants, and conduct experiments to compare them. The basic idea of the SVDD method is to construct a Minimum Enclosing Ball (MEB) that describes a given set of data: the SVDD problem computes the ball of minimum radius enclosing a given set of points. The model can be rewritten in a form comparable to the support vector classifier (SVC), which makes it possible to map the data into a new, high-dimensional feature space without much extra computational cost.
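The minimum-enclosing-ball idea above can be illustrated with a tiny sketch. This is not the thesis's own algorithm, just an assumed toy formulation: the MEB center minimizes the maximum squared distance to the points, a convex minimax problem small enough here to hand to a derivative-free solver.

```python
import numpy as np
from scipy.optimize import minimize

def minimum_enclosing_ball(X):
    """Center minimizing the maximum squared distance to the points.

    min_a max_i ||x_i - a||^2 is convex; for this small illustration we
    solve it with Nelder-Mead rather than a dedicated QP solver.
    """
    f = lambda a: np.max(np.sum((X - a) ** 2, axis=1))
    res = minimize(f, X.mean(axis=0), method="Nelder-Mead",
                   options={"xatol": 1e-8, "fatol": 1e-8})
    center = res.x
    radius = np.sqrt(f(center))
    return center, radius

# Three points whose enclosing circle is the circumcircle of the triangle
X = np.array([[0.0, 0.0], [2.0, 0.0], [1.0, 1.0]])
c, r = minimum_enclosing_ball(X)
print(np.round(c, 3), round(r, 3))  # center near (1, 0), radius near 1
```

In the real SVDD formulation this primal problem is dualized, which is what introduces kernel evaluations and allows the implicit feature-space mapping mentioned above.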
Through this mapping, more flexible descriptions are obtained. Traditional algorithms for finding the SVDD hypersphere do not scale well with the dimensionality d of the points; consequently, recent attention has shifted to the development of approximation algorithms. A recent result shows that a (1+ε)-approximation of the SVDD can be obtained efficiently using core sets. Generally speaking, in an optimization problem a core set is a subset of the input points such that solving the optimization problem directly on the core set yields a good approximation for the original input. A surprising discovery is that the size of the core set can be shown to be independent of both d and the size of the point set.

In the fourth chapter, we introduce the Core Vector Machine and propose a new method. Standard SVM training has O(m^3) time and O(m^2) space complexity, where m is the training set size; it is thus computationally infeasible on very large data sets. The Core Vector Machine (CVM) first showed that the quadratic optimization problem involved in SVM can be formulated as an equivalent hard-margin SVDD problem. Experiments demonstrate that the CVM is as accurate as existing SVM implementations, but is much faster and handles much larger data sets than existing methods. This paper shows that the quadratic optimization problem involved in SVM can also be formulated as an equivalent soft-margin SVDD problem. Through simulation, we compare it with the known method and find the technique effective.

In the fifth chapter, we introduce the properties of the Gaussian kernel function, the effects it has on the SVDD, and a novel parameter-optimizing algorithm. Among the many kernel functions, researchers attach particular importance to the Gaussian kernel because of its distinctive properties. However, many applications demonstrate that the performance of SVDD with a Gaussian kernel is greatly influenced by the kernel parameter, so the optimal parameter should be the one that makes the distribution of the mapped data in the feature space as close as possible to a hypersphere. Experiments on simulated data demonstrate the effectiveness of the method. In the last chapter, we summarize the paper's contents and offer some suggestions for future work.
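The core-set approach described above can be sketched with a Badoiu–Clarkson-style iteration, the classic (1+ε)-approximation for the MEB. This is a generic illustration of the core-set idea, not the specific method proposed in this thesis: each round the farthest point joins the core set and the center takes a shrinking step toward it, and the number of rounds depends only on ε, not on the dimension d or the number of points.

```python
import numpy as np

def meb_core_set(X, eps=0.05):
    """Badoiu-Clarkson style (1+eps)-approximate minimum enclosing ball.

    Runs ceil(1/eps^2) rounds; the core set therefore has size O(1/eps^2),
    independent of both the dimension and the number of input points.
    """
    center = X[0].astype(float).copy()
    core = {0}
    iters = int(np.ceil(1.0 / eps ** 2))
    for t in range(1, iters + 1):
        d2 = np.sum((X - center) ** 2, axis=1)
        far = int(np.argmax(d2))        # farthest point from current center
        core.add(far)
        center += (X[far] - center) / (t + 1)  # 1/(t+1) step toward it
    radius = np.sqrt(np.max(np.sum((X - center) ** 2, axis=1)))
    return center, radius, sorted(core)

rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 50))          # 5000 points in 50 dimensions
c, r, core = meb_core_set(X, eps=0.1)
print(len(core), round(r, 2))            # core set is tiny relative to 5000
```

The same farthest-point loop is what lets the CVM train on very large data sets: only the core-set points ever enter the quadratic program.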
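The influence of the Gaussian kernel parameter can be seen directly from feature-space geometry. The short sketch below is an assumed illustration (not the thesis's parameter-optimizing algorithm): since K(x, x) = 1, every mapped point lies on the unit sphere in feature space, and the pairwise feature-space distance is ||φ(x) − φ(y)||² = 2 − 2K(x, y), so the width σ controls how spread out the mapped data are.

```python
import numpy as np

def gauss_feature_distances(X, sigma):
    """Pairwise squared distances between Gaussian-kernel feature maps.

    K(x, x) = 1 puts every mapped point on the unit sphere, so
    ||phi(x) - phi(y)||^2 = 2 - 2 * K(x, y).
    """
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2 * X @ X.T   # input-space sq. dists
    d2 = np.maximum(d2, 0.0)                       # guard against fp noise
    K = np.exp(-d2 / (2 * sigma ** 2))
    return 2 - 2 * K

X = np.random.default_rng(1).normal(size=(100, 3))
for sigma in (0.1, 1.0, 10.0):
    D = gauss_feature_distances(X, sigma)
    # small sigma: mapped points nearly orthogonal, distances approach 2;
    # large sigma: mapped points cluster together, distances approach 0
    print(sigma, round(D.mean(), 3))
```

A too-small σ thus spreads every point to a nearly orthogonal position (everything looks like an outlier), while a too-large σ collapses the data to a single point; the well-chosen parameter sits between these extremes, consistent with the hypersphere criterion described above.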
Keywords/Search Tags:Support Vector Machine, Support Vector Data Description, Core Sets, Support Vector Clustering