Feature extraction and classifier design are important research problems in pattern recognition and machine learning. In 1995, based on structural risk minimization theory, the support vector machine (SVM) algorithm was proposed by Vapnik et al. It has become one of the most effective algorithms for small sample size classification problems, owing to its solid mathematical foundation and its ability to avoid the curse of dimensionality. Although SVM shows unique advantages in tackling small-sample, high-dimensional problems, it suffers from slow training and inefficiency when the training set is very large. The generalized eigenvalue proximal support vector machine (GEPSVM) and the twin support vector machine (TWSVM) were published in the top international journal of artificial intelligence (IEEE TPAMI) in 2006 and 2007, respectively, marking the shift from parallel-hyperplane to nonparallel-hyperplane support vector machines. In recent years, nonparallel hyperplane support vector machine algorithms have been widely and deeply studied, gradually becoming a new hot topic in pattern recognition. In addition, feature extraction is one of the crucial steps in pattern recognition, and how to extract features effectively remains an active research issue. This thesis therefore studies nonparallel hyperplane support vector machine algorithms and feature extraction methods. The main contributions are as follows.

(1) Several novel twin support vector machine algorithms are proposed. Based on empirical risk minimization theory, the twin support vector machine (TWSVM) algorithm solves two relatively small quadratic programming (QP) problems. However, as the dataset grows, solving these QP problems becomes increasingly time-consuming.
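The twin-hyperplane idea behind TWSVM can be illustrated with a small numerical sketch. To keep it solver-free, the sketch below replaces the two QPs with their least-squares (equality-constrained) relaxation, which reduces each subproblem to a linear system; the toy data, parameter names, and this simplification are purely illustrative and are not the thesis's formulation:

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy binary data (assumed): class +1 around (0, 0), class -1 around (3, 3)
A = rng.normal(0.0, 0.3, size=(40, 2))   # class +1 samples
B = rng.normal(3.0, 0.3, size=(40, 2))   # class -1 samples

def twin_planes(A, B, c1=1.0, c2=1.0):
    """Fit one hyperplane per class, each close to its own class and
    pushed away from the other (least-squares relaxation of TWSVM)."""
    G = np.hstack([A, np.ones((len(A), 1))])   # augmented matrix [A  e]
    H = np.hstack([B, np.ones((len(B), 1))])   # augmented matrix [B  e]
    e1, e2 = np.ones(len(A)), np.ones(len(B))
    # Plane 1: min (1/2)||G z||^2 + (c1/2)||H z + e2||^2  ->  linear system
    z1 = -np.linalg.solve(G.T @ G / c1 + H.T @ H, H.T @ e2)
    # Plane 2: min (1/2)||H z||^2 + (c2/2)||G z - e1||^2  ->  linear system
    z2 = np.linalg.solve(H.T @ H / c2 + G.T @ G, G.T @ e1)
    return z1, z2

def predict(X, z1, z2):
    """Assign each sample to the class of its nearer hyperplane."""
    Xa = np.hstack([X, np.ones((len(X), 1))])
    d1 = np.abs(Xa @ z1) / np.linalg.norm(z1[:-1])
    d2 = np.abs(Xa @ z2) / np.linalg.norm(z2[:-1])
    return np.where(d1 <= d2, 1, -1)

z1, z2 = twin_planes(A, B)
pred = predict(np.vstack([A, B]), z1, z2)
acc = float(np.mean(pred == np.array([1] * 40 + [-1] * 40)))
```

Each subproblem involves only one class in its "fitting" term, which is why the two problems are smaller than the single QP of standard SVM.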
To address this problem, inspired by the idea of fitting, we propose a novel twin support vector machine (NTSVM) algorithm. In addition, a successive overrelaxation (SOR) algorithm is presented to solve the QP problems in NTSVM quickly.

(2) Several novel projection twin support vector machine algorithms are proposed. First, the projection twin support vector machine (PTSVM) algorithm, motivated by linear discriminant analysis, seeks two optimal projection directions to solve binary classification problems. To obtain projection directions that preserve the local geometric structure of the data, we propose a locality preserving projection twin support vector machine (LPPTSVM) algorithm, which introduces the idea of locality preserving projection and a regularization technique into PTSVM. A nonlinear form is also derived via empirical kernel mapping, remedying the lack of a nonlinear form in the original PTSVM. Second, the linear and nonlinear forms of PTSVM are obtained by solving two different optimization problems. To overcome this shortcoming, we propose an improved projection twin support vector machine (IPTSVM) algorithm, which first constructs the linear IPTSVM and then extends it directly to the nonlinear case with the kernel trick, inheriting the essence of traditional SVM. It also removes PTSVM's need to compute an inverse matrix before training.

(3) An improved generalized eigenvalue proximal support vector machine (IGEPSVM) is proposed. GEPSVM is the earliest nonparallel hyperplane classification algorithm, but it suffers from an inconsistency between its training and decision-making processes. To solve this problem, we propose a proximal support vector machine algorithm based on eigenvalue decomposition.
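The contrast between the generalized and the standard eigenvalue routes can be sketched numerically. GEPSVM minimizes a Rayleigh ratio ||Aw + eb||^2 / ||Bw + eb||^2, which leads to a generalized eigenproblem; the difference-based objective below is one plausible reading of an eigendecomposition-based reformulation, not the thesis's exact IGEPSVM, and it needs only a standard symmetric eigensolver. Data and parameter names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(0.0, 0.3, size=(50, 2))   # class +1 (assumed toy data)
B = rng.normal(3.0, 0.3, size=(50, 2))   # class -1 (assumed toy data)

def plane_by_std_eig(A, B, nu=1.0):
    """Fit (w, b) for one proximal plane via a STANDARD symmetric
    eigenproblem: minimize ||A w + e b||^2 - nu * ||B w + e b||^2
    over unit-norm z = (w, b), i.e. take the eigenvector of the
    smallest eigenvalue of the symmetric matrix M below."""
    G = np.hstack([A, np.ones((len(A), 1))])
    H = np.hstack([B, np.ones((len(B), 1))])
    M = G.T @ G - nu * (H.T @ H)           # symmetric, no matrix inverse
    vals, vecs = np.linalg.eigh(M)         # eigenvalues in ascending order
    return vecs[:, 0]                      # smallest-eigenvalue eigenvector

z1 = plane_by_std_eig(A, B)   # plane close to class +1, far from class -1
z2 = plane_by_std_eig(B, A)   # plane close to class -1, far from class +1

def dist(X, z):
    """Point-to-plane distances for augmented plane coefficients z."""
    Xa = np.hstack([X, np.ones((len(X), 1))])
    return np.abs(Xa @ z) / np.linalg.norm(z[:-1])

X = np.vstack([A, B])
pred = np.where(dist(X, z1) <= dist(X, z2), 1, -1)
acc = float(np.mean(pred == np.array([1] * 50 + [-1] * 50)))
```

Replacing the ratio by a difference avoids inverting (or factoring) the denominator matrix, which is the sense in which a standard eigendecomposition lowers the computational burden.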
First, IGEPSVM is formulated for the binary classification problem: its solution is transformed into a standard eigenvalue decomposition, which reduces the computational complexity of IGEPSVM. Then, IGEPSVM is extended to multi-class classification via the "one versus all" strategy, which broadens the applicability of the model.

(4) A multiple birth least squares support vector machine (MBLSSVM) algorithm for multi-class classification is proposed as an improved version of the multiple birth support vector machine (MBSVM). The QP problem in MBSVM is replaced by a system of linear equations in MBLSSVM, which reduces the computational complexity and improves performance. For a K-class problem, following the "one versus all" strategy, we construct a least squares twin support vector machine (LSTSVM) to obtain one classification hyperplane for each class; after all K hyperplanes are obtained, the decision rule assigns a sample to the class whose hyperplane is farthest. This differs from Multi-LSTSVM, which directly extends LSTSVM to the multi-class case with the "one versus all" strategy.

(5) A new pattern classification method based on constructing within-class auxiliary training samples is proposed. First, linear interpolation is used to construct within-class auxiliary samples as additional training samples in each class; features are then extracted from the enlarged training set using PCA or 2DPCA. In addition, two different strategies are given to extend PCA to KPCA. To a certain extent, the proposed method considers both the overall divergence and the within-class divergence of the training set, so it can extract features carrying discriminative information. Experimental results on several datasets illustrate the effectiveness of the proposed method.
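The auxiliary-sample construction in (5) can be sketched as follows. The pairing scheme (interpolating consecutive same-class samples at the midpoint) and all names are illustrative assumptions; the thesis does not pin down these details here, and the PCA step is the textbook eigendecomposition of the covariance matrix:

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(30, 5))   # toy samples of one class (assumed)

def augment_within_class(Xc, alpha=0.5):
    """Create auxiliary samples by linear interpolation between
    consecutive same-class samples (one simple pairing scheme)."""
    mids = alpha * Xc[:-1] + (1.0 - alpha) * Xc[1:]
    return np.vstack([Xc, mids])

def pca(X, k):
    """Plain PCA: top-k eigenvectors of the sample covariance matrix."""
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / len(X)
    vals, vecs = np.linalg.eigh(cov)       # ascending eigenvalues
    return vecs[:, ::-1][:, :k]            # top-k principal directions

# Augment the class, fit PCA on the enlarged set, project the originals.
X_aug = augment_within_class(X)
W = pca(X_aug, k=2)
features = (X - X_aug.mean(axis=0)) @ W    # extracted 2-D features
```

Because every auxiliary sample lies on a segment between two same-class points, the enlarged set thickens each class region without crossing class boundaries, which is what lets the subsequent PCA reflect within-class structure as well as overall divergence.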