Research On Training Algorithm And Preprocessing Algorithm Of Support Vector Machine

Posted on: 2010-01-30
Degree: Master
Type: Thesis
Country: China
Candidate: Y T He
GTID: 2178360275958664
Subject: Computer software and theory
Abstract/Summary:
The Support Vector Machine (SVM) offers strong generalization performance and a simple form, and it has been widely used in pattern recognition, signal processing and image processing. Training an SVM is equivalent to solving a quadratic programming problem, so it faces poor generalization and performance bottlenecks on imbalanced or large data sets.
For an imbalanced data set (IDS), introducing a preprocessing algorithm eliminates redundant samples and shortens training time. However, existing preprocessing algorithms do not consider the size difference between the classes in the training set, which results in low efficiency.
For a large data set, decomposition algorithms use a working set strategy to reduce the complexity of SVM training. However, the existing working set selection algorithm does not fully use the information in the objective function, which leads to slow convergence.
This thesis investigates the existing preprocessing and learning algorithms and gives solutions to these two problems:
1. It analyzes why generalization is poor on IDS and why the k value is difficult to choose for the preprocessing algorithm.
2. It introduces the distribution of the sample set into the preprocessing algorithm and improves the parameter selection algorithm, which eliminates redundant samples and improves generalization performance on IDS.
3. It compares the working set selection strategies of two decomposition algorithms and analyzes the deficiencies of the working set selection algorithm used in svm-light.
4. By combining the working set selection method of libsvm with that of svm-light, it proposes a new working set selection algorithm based on second-order information. The effectiveness of the algorithm is demonstrated on UCI data sets.
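To make the idea of working set selection with second-order information concrete, the following is a minimal sketch of the two-index selection rule commonly associated with libsvm. It is not the combined algorithm proposed in the thesis, whose details the abstract does not give. The function name `select_working_set`, the constant `TAU`, the tolerance `eps` and the toy data are all assumptions made for illustration. Here `grad` stands for the gradient of the dual objective, `K` for the kernel matrix and `C` for the box constraint.

```python
import math

TAU = 1e-12  # assumed floor that replaces non-positive curvature values


def select_working_set(alpha, grad, y, K, C, eps=1e-3):
    """Pick a pair (i, j) to optimize next, or None if nearly optimal.

    i comes from first-order (gradient) information only; j is chosen
    by the estimated decrease of the objective, which also uses the
    curvature K[i][i] + K[j][j] - 2*K[i][j] (second-order information).
    """
    n = len(alpha)

    def in_up(t):  # alpha[t] can still move in the +y[t] direction
        return (y[t] == 1 and alpha[t] < C) or (y[t] == -1 and alpha[t] > 0)

    def in_low(t):  # alpha[t] can still move in the -y[t] direction
        return (y[t] == -1 and alpha[t] < C) or (y[t] == 1 and alpha[t] > 0)

    # First index: largest value of -y_t * grad_t over the "up" set.
    i, g_max = -1, -math.inf
    for t in range(n):
        v = -y[t] * grad[t]
        if in_up(t) and v > g_max:
            i, g_max = t, v
    if i < 0:
        return None

    # Second index: minimize -b^2 / a over violating "low" candidates.
    j, best, g_min = -1, math.inf, math.inf
    for t in range(n):
        if not in_low(t):
            continue
        v = -y[t] * grad[t]
        g_min = min(g_min, v)
        b = g_max - v  # size of the first-order violation
        if b > 0:
            a = K[i][i] + K[t][t] - 2.0 * K[i][t]  # curvature along (i, t)
            if a <= 0:
                a = TAU
            score = -(b * b) / a
            if score < best:
                j, best = t, score

    # Stop when the maximal violation is below the tolerance.
    if j < 0 or g_max - g_min < eps:
        return None
    return i, j


# Toy data (illustrative only): 1-D points with a linear kernel.
x = [0.0, 1.0, 2.0, 5.0]
K = [[xi * xj for xj in x] for xi in x]
y = [1, 1, -1, -1]
alpha = [0.0, 0.0, 0.0, 0.0]
grad = [-1.0, -0.5, -1.0, -1.2]  # hand-picked values, not a real iterate
pair = select_working_set(alpha, grad, y, K, C=1.0)
```

On this toy input the sketch selects the pair (0, 2). A purely first-order rule that picks the maximal violating pair would select (0, 3) instead, because index 3 has the larger gradient violation but also a much larger curvature term, so the estimated decrease of the objective is smaller.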
Keywords/Search Tags: support vector machine, k-nearest neighbor, imbalanced data set, feasible direction method, second-order information