
Research On Training Error Minimized Subspace Algorithm For Classification

Posted on: 2009-03-02
Degree: Doctor
Type: Dissertation
Country: China
Candidate: D Y Shen
Full Text: PDF
GTID: 1118360242495812
Subject: Signal and Information Processing
Abstract/Summary:
Subspace methods are a significant research direction that attracts wide attention from scholars in the field of pattern recognition. Fisher Linear Discriminant Analysis (FLD or LDA) and related subspace methods perform strongly in classification problems. However, these methods have shortcomings. The main one is that traditional subspace methods such as LDA do not relate the feature extraction criterion directly to the training error; instead, they relate it to statistical features of the training data distribution (generally assumed to be Gaussian). When those statistics fail to reflect the data distribution properly, these methods are likely to fail, so traditional subspace methods are not competent for problems with complex data distributions. All of the methods proposed in this dissertation address this problem.

In chapter 3, we first point out that in multi-class problems, even when every class follows a homoscedastic Gaussian distribution, LDA may fail in some cases. Then, by analyzing the relationship between the data distribution and the projection directions of LDA, we show that the LDA result depends on the eigenvalues of the inter-class and intra-class scatter matrices. Based on this observation, we propose a modified LDA method based on a Genetic Algorithm: aiming to minimize the training classification error, it adjusts the eigenvalues of the inter-class scatter matrix to find the optimal feature subspace. Experiments on both synthetic and real data show that the proposed method is superior to other linear subspace methods.

The AdaBoost (Adaptive Boosting) algorithm, derived from ensemble learning theory, is a learning method that directly relates training performance to the construction of the classifier.
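The eigenvalue relationship discussed above can be illustrated with a minimal NumPy sketch of classical Fisher LDA. This is not the dissertation's code: the function name, the regularization-free generalized eigenproblem, and the synthetic two-Gaussian data are all illustrative assumptions. The eigenvalues of the matrix it decomposes are exactly the quantities that the GA-based method is described as adjusting.

```python
import numpy as np

def lda_directions(X, y, n_components=1):
    """Classical Fisher LDA sketch: find directions maximizing the ratio of
    inter-class to intra-class scatter by solving the generalized
    eigenproblem S_w^{-1} S_b w = lambda w.  (A production version would
    regularize S_w, which can be singular in high dimensions.)"""
    classes = np.unique(y)
    mean_all = X.mean(axis=0)
    d = X.shape[1]
    S_w = np.zeros((d, d))  # intra-class scatter
    S_b = np.zeros((d, d))  # inter-class scatter
    for c in classes:
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        S_w += (Xc - mc).T @ (Xc - mc)
        diff = (mc - mean_all).reshape(-1, 1)
        S_b += len(Xc) * diff @ diff.T
    # Eigen-decompose S_w^{-1} S_b; the projection uses the top eigenvectors.
    evals, evecs = np.linalg.eig(np.linalg.solve(S_w, S_b))
    order = np.argsort(evals.real)[::-1]
    return evecs.real[:, order[:n_components]]

# Two well-separated Gaussian classes along the first axis: LDA should
# recover a projection direction dominated by that axis.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal([0.0, 0.0], 0.5, (50, 2)),
               rng.normal([4.0, 0.0], 0.5, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
w = lda_directions(X, y)
print(abs(w[0, 0]) > abs(w[1, 0]))  # first-axis component dominates
```

When the Gaussian assumption behind `S_w` and `S_b` misrepresents the data, the ranking of these eigenvalues, and hence the chosen subspace, can be misleading, which is the failure mode the chapter exploits.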
In chapter 4, we propose a feature extraction algorithm based on boosting bootstrap LDA projections, which combines the AdaBoost and LDA algorithms to solve two-class problems. AdaBoost is a learning framework that boosts a number of weak hypotheses into a strong classifier, and it requires the weak hypotheses to be unstable and diverse. Therefore, we first use the bootstrap sampling step of the Bagging (Bootstrap Aggregating) algorithm to randomly resample the original training data into a number of bootstrap training subsets. We then apply LDA together with a Nearest Neighbor (NN) classifier to build the same number of weak hypotheses from these subsets, which AdaBoost combines into the final classifier. This method overcomes the shortcoming of traditional subspace methods described above; it is shown to have good generalization performance and is well suited to classification problems with complex data distributions. Experiments on two-class problems with complex distributions demonstrate the feasibility and superiority of this method.

Research on multi-class problems, such as face recognition, is of great practical value. In chapter 5, we therefore use the AdaBoost.M2 algorithm to generalize the method of chapter 4 to multi-class problems, yielding the method of boosting bootstrap LDA subspaces. In this method, we refine the bootstrap sampling step so that AdaBoost.M2 concentrates more on hard-to-classify samples while preserving the diversity of the weak hypotheses, allowing the LDA-based hypotheses to be boosted and combined more efficiently. In experiments, we compare our algorithm with traditional subspace methods and other ensemble-learning algorithms on handwritten digit image recognition and face image recognition; the results show that our algorithm is superior or comparable to the other methods.
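The two-class pipeline of chapter 4 (bootstrap subsets, an LDA-based weak hypothesis per subset, AdaBoost combination) can be sketched as follows. This is an assumption-laden toy, not the dissertation's implementation: the weak learner classifies by the nearest projected class mean rather than a full NN classifier, the ridge term on the scatter matrix, the function names, and the synthetic data are all illustrative.

```python
import numpy as np

def fit_weak(X, y, D, rng):
    """One weak hypothesis: draw a bootstrap subset according to the current
    AdaBoost weights D, fit an LDA direction on it, and classify by the
    nearest projected class mean (a simple stand-in for LDA + NN)."""
    idx = rng.choice(len(X), len(X), p=D)          # weighted bootstrap resample
    Xb, yb = X[idx], y[idx]
    m0, m1 = Xb[yb == 0].mean(0), Xb[yb == 1].mean(0)
    Sw = np.cov(Xb[yb == 0].T) + np.cov(Xb[yb == 1].T)
    # LDA direction; small ridge keeps Sw invertible on degenerate subsets
    w = np.linalg.solve(Sw + 1e-6 * np.eye(X.shape[1]), m1 - m0)
    t0, t1 = m0 @ w, m1 @ w
    def predict(Z):
        p = Z @ w
        return (np.abs(p - t1) < np.abs(p - t0)).astype(int)
    return predict

def boost_bootstrap_lda(X, y, T=15, seed=0):
    """AdaBoost over bootstrap-LDA weak hypotheses (labels must be 0/1)."""
    n = len(X)
    rng = np.random.default_rng(seed)
    D = np.full(n, 1.0 / n)                        # AdaBoost sample weights
    hyps, alphas = [], []
    for _ in range(T):
        h = fit_weak(X, y, D, rng)
        err = np.clip(D[h(X) != y].sum(), 1e-10, 1 - 1e-10)
        a = 0.5 * np.log((1 - err) / err)          # hypothesis weight
        s = np.where(h(X) == y, 1.0, -1.0)
        D *= np.exp(-a * s)                        # upweight hard samples
        D /= D.sum()
        hyps.append(h)
        alphas.append(a)
    def predict(Z):
        score = sum(a * (2 * h(Z) - 1) for a, h in zip(alphas, hyps))
        return (score > 0).astype(int)
    return predict

# Sanity check on two well-separated Gaussian classes.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal([0.0, 0.0], 0.6, (60, 2)),
               rng.normal([3.0, 1.0], 0.6, (60, 2))])
y = np.array([0] * 60 + [1] * 60)
clf = boost_bootstrap_lda(X, y)
acc = (clf(X) == y).mean()
```

Because each weak hypothesis sees a different weighted bootstrap sample, the ensemble can carve a piecewise-linear decision boundary, which is how the approach copes with distributions where a single LDA projection fails.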
Keywords/Search Tags: dimensionality reduction, subspace, linear discriminant analysis, genetic algorithm, ensemble learning, face recognition