
Research On Pattern Classifier Based On Large-Scale Datasets

Posted on: 2009-04-20
Degree: Master
Type: Thesis
Country: China
Candidate: Y L Chang
Full Text: PDF
GTID: 2178360245486440
Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
In pattern classification on large-scale datasets, redundant features and large numbers of training samples lead to low classification speed and demand a great deal of computer memory. It is therefore necessary to preprocess the data with feature selection and sample selection before classification: discard the redundant features and keep only the samples that determine the nonlinear separating surface of the classifier. Training the classifier on the simplified dataset then improves classification accuracy and reduces the memory requirement.

Orthogonal design and uniform design are two widely used experimental design methods; both can identify the optimal combination of factors with relatively few experiments. The Support Vector Machine (SVM) handles small-sample problems well, has good generalization ability, and is not restricted by the dimensionality of the data. Combining the advantages of these three theories, this thesis takes SVM as the classifier and proposes two feature selection methods: feature selection based on orthogonal design and feature selection based on uniform design. Training and testing runs are arranged according to the number of features in the dataset and the structure of the orthogonal or uniform table, and experiments are then carried out on the selected feature subsets. The results indicate that the proposed methods not only discard the redundant features but also achieve better classification accuracy than training on the full feature set.

The Reduced Support Vector Machine (RSVM) is a modified SVM algorithm. It uses a very small random subset of the dataset as support vectors, solving an unconstrained optimization problem to construct the nonlinear separating surface. Compared with the constrained nonlinear programming problem of the original SVM, it reduces the computational difficulty, the computation time, and the storage requirement, and its performance can even exceed that of the standard SVM to some extent. However, because the randomly chosen samples are not necessarily representative, its results are unstable. This thesis proposes an effective remedy: first determine the optimal number of clusters for each class using subtractive clustering, then select the samples belonging to each cluster center of each class with the Fuzzy C-Means (FCM) method, and finally apply the extracted samples to the RSVM algorithm. The resulting Modified Reduced Support Vector Machine (MRSVM) improves the classifier's stability. Simulation results show that, on the same datasets, MRSVM runs in less time and attains smaller training and testing errors than the original RSVM.
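To make the two contributions more concrete, the sketches below illustrate the general ideas in Python with NumPy and scikit-learn. They are illustrative only: the toy data (make_classification), the use of the standard L8(2^7) orthogonal table, the main-effect selection rule, the fixed cluster counts, and the replacement of RSVM by scikit-learn's standard SVC are assumptions made for the examples, not the thesis's exact procedure.

A minimal sketch of orthogonal-design-based feature selection: each column of the orthogonal table stands for one feature (level 1 = include, level 2 = exclude), each row prescribes one SVM training/testing run, and the per-feature main effect on accuracy decides which features to keep.

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.model_selection import cross_val_score
    from sklearn.svm import SVC

    # Standard L8(2^7) orthogonal table: 8 runs, 7 two-level factors.
    # Here level 1 means "include the feature" and level 2 means "exclude it".
    L8 = np.array([
        [1, 1, 1, 1, 1, 1, 1],
        [1, 1, 1, 2, 2, 2, 2],
        [1, 2, 2, 1, 1, 2, 2],
        [1, 2, 2, 2, 2, 1, 1],
        [2, 1, 2, 1, 2, 1, 2],
        [2, 1, 2, 2, 1, 2, 1],
        [2, 2, 1, 1, 2, 2, 1],
        [2, 2, 1, 2, 1, 1, 2],
    ])

    # Toy dataset with 7 features, some of them redundant.
    X, y = make_classification(n_samples=300, n_features=7,
                               n_informative=4, n_redundant=3, random_state=0)

    # One SVM cross-validation run per row of the table, using only the
    # features whose factor is set to level 1 in that row.
    acc = np.array([
        cross_val_score(SVC(kernel="rbf"), X[:, row == 1], y, cv=5).mean()
        for row in L8
    ])

    # Main effect of each feature: mean accuracy of the runs that include it
    # minus mean accuracy of the runs that exclude it.
    effects = np.array([
        acc[L8[:, j] == 1].mean() - acc[L8[:, j] == 2].mean()
        for j in range(L8.shape[1])
    ])

    selected = np.where(effects > 0)[0]
    print("main effects:", np.round(effects, 4))
    print("selected feature indices:", selected)

And a minimal sketch of the clustering-based sample selection behind MRSVM: a small Fuzzy C-Means implementation finds cluster centers within each class, the training samples nearest those centers form the reduced set, and a standard SVC (standing in for RSVM) is trained on it. The number of clusters per class is passed in directly here; in the thesis it would be obtained by subtractive clustering.

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.svm import SVC

    def fuzzy_c_means(X, c, m=2.0, n_iter=100, seed=0):
        """Minimal FCM: returns cluster centers and the membership matrix."""
        rng = np.random.default_rng(seed)
        U = rng.random((len(X), c))
        U /= U.sum(axis=1, keepdims=True)      # memberships sum to 1 per sample
        for _ in range(n_iter):
            Um = U ** m
            centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
            dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-10
            U = 1.0 / dist ** (2.0 / (m - 1.0))
            U /= U.sum(axis=1, keepdims=True)
        return centers, U

    def select_representatives(X, y, clusters_per_class):
        """For each class, keep the sample nearest to each FCM cluster center.
        clusters_per_class is assumed given; the thesis would obtain it with
        subtractive clustering."""
        keep = []
        for label, c in clusters_per_class.items():
            idx = np.where(y == label)[0]
            centers, _ = fuzzy_c_means(X[idx], c)
            for center in centers:
                nearest = idx[np.argmin(np.linalg.norm(X[idx] - center, axis=1))]
                keep.append(nearest)
        return np.unique(keep)

    # Toy data; hypothetical cluster counts per class.
    X, y = make_classification(n_samples=1000, n_features=10, random_state=1)
    keep = select_representatives(X, y, clusters_per_class={0: 15, 1: 15})

    # Standard SVC used here as a stand-in for the RSVM training step.
    clf = SVC(kernel="rbf").fit(X[keep], y[keep])
    print("reduced set size:", len(keep))
    print("accuracy on the full set:", clf.score(X, y))

The nearest-to-center selection mirrors the abstract's reasoning: the reduced set is chosen to be representative of each class rather than drawn at random as in RSVM, which is what the thesis credits for the improved stability.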
Keywords/Search Tags: feature selection, sample selection, orthogonal design, support vector machine, fuzzy clustering