
Study on Key Problems in Feature Extraction Based on Local Data Structure Information

Posted on: 2010-05-18
Degree: Doctor
Type: Dissertation
Country: China
Candidate: M Cheng
Full Text: PDF
GTID: 1118360302471843
Subject: Computer software and theory
Abstract/Summary:
Many real-world applications deal with high-dimensional data, especially in pattern recognition, machine learning, computer vision, and data mining. Feature extraction and dimensionality reduction therefore play an important role in these fields, where the goal is to find an optimal low-dimensional representation of the data. Since linear feature extraction methods possess outstanding advantages over other methods, they have been widely regarded as useful tools for this purpose. In recent years, with advances in manifold learning, local data structure has received much attention as a way to exploit the latent information hidden in the data.

Unlike global dimensionality reduction methods, manifold learning aims to find the low-dimensional manifold structure embedded in the input space. Traditional manifold learning algorithms, such as ISOMAP, LLE, and Laplacian Eigenmaps, seek the low-dimensional manifold in an unsupervised way. To exploit the local discriminative manifold structure, local discriminant analysis methods identify the underlying submanifold structures while also employing discriminative information. Mathematically, all of these methods can be unified under a graph embedding framework, differing only in the construction criteria for the intrinsic (similarity) and penalty graphs. Nevertheless, local discriminant methods still face several limitations, such as the curse of dimensionality and the lack of incremental learning schemes.

In addition, nonnegative matrix factorization (NMF) is a useful tool for data mining; it likewise depends mainly on the local structure of the data and can be formulated as an optimization problem with nonnegative bound constraints. In the literature, several NMF extensions with additive cost terms have been developed for machine learning and computer vision applications. To overcome the slow convergence of the popular multiplicative update algorithms, several projected-gradient NMF algorithms have been developed recently. However, due to the nonnegative bound constraints and the additive costs, these methods suffer from a maladjusted learning problem during the iterations.
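For context, a standard formulation of the NMF problem and the widely used multiplicative update rules read as follows; these are textbook formulas supplied for reference, not quoted from the dissertation:

\min_{W \ge 0,\; H \ge 0} \; \|X - WH\|_F^2

H \leftarrow H \odot \frac{W^{\top}X}{W^{\top}WH}, \qquad W \leftarrow W \odot \frac{XH^{\top}}{WHH^{\top}}

where \odot and the fractions denote element-wise multiplication and division. These updates preserve nonnegativity automatically but are known to converge slowly, which is what motivates the projected gradient alternatives mentioned above.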
This dissertation studies several key issues in feature extraction based on local data structure. The main contributions are listed below:

① To address the curse of dimensionality, existing local discriminant analysis methods adopt Principal Component Analysis (PCA) as a preliminary dimensionality reduction step, which may destroy the original manifold structure. Unlike these algorithms, this work treats the discriminant embedding as a kernel approach in the sample space and proposes a kernel-view based discriminant method for embedded feature extraction, in which both the PCA preprocessing and the pruning of data can be avoided. Extensive experiments on high-dimensional data sets show the robustness and outstanding performance of the proposed method.

② Although local discriminant subspace methods are formally developed in accordance with LDA, their theoretical properties are quite different. Taking the local null space method as an example, the computational and theoretical foundations of local discriminant subspace methods are investigated. First, this work derives the local discriminant null space following the null space LDA (NSLDA) approach and reveals that its computational cost depends mainly on the number of connected edges in the graphs, which may still be unacceptable when a large number of samples are involved. To address this limitation, an improved local null space algorithm is proposed that employs the penalty subspace to approximate the whole local discriminant subspace, achieving greater efficiency. A comparative classification study shows that the performance of the approximate algorithm is very close to that of the exact one.

③ Since no updating rule is available for local discriminant analysis when new data arrive, it is difficult to design an incremental learning algorithm. To overcome this limitation, this work devises a supervised learning method, called local discriminant subspace embedding (LDSE), to extract discriminative features. An incremental variant, incremental LDSE (ILDSE), is then proposed to learn the local discriminant subspace from newly inserted data; it extends the batch LDSE algorithm to the incremental setting by employing the idea of the singular value decomposition (SVD) updating algorithm. Experiments on face data sets show that the incremental method achieves recognition results similar to those of the batch method at a lower computational cost.

④ The existing marginal learning method tries to find the local margin by searching over all extra-class data pairs, under the assumption that the discriminative margin has the smallest between-class distances. However, margins learned this way can still fail to provide the most discriminant power, especially when the original data are multimodally distributed. Because only a limited number of data pairs are used to determine the discriminative margin, the obtained margin may not reflect the true distribution of the data, and the projected samples may not be correctly classified in the low-dimensional subspace. Without loss of generality, linear subspace learning algorithms can be interpreted as enhancing the affinity and repulsion of certain data pairs in the low-dimensional space. Based on this viewpoint, a novel linear discriminant method, termed marginal discriminant projections (MDP), is proposed to learn the marginal subspace. Unlike the existing marginal learning method, the inter-class margins are obtained by a hierarchical fuzzy clustering approach, so that the discriminative margin can be found adaptively and iterative objective optimization is avoided.

⑤ To exploit the informative components hidden in nonnegative data, a novel NMF-based learning method grounded in information theoretic learning, namely ITNMF, is presented, and the conjugate gradient method is used to accelerate the iterative learning procedure. Nevertheless, as in other extended gradient NMF algorithms, the maladjusted learning problem is unavoidable during the iterations because of the nonnegative bound constraints. To remedy this limitation, a modified line search criterion is proposed that prevents the null trap under guaranteed conditions while keeping each feasible step descendent. In addition, unlike other gradient descent algorithms, a numerical stopping rule that is effective for nonnegative bound-constrained optimization is preferred over the popular gradient-based one. Experiments on several data sets with varying pose and illumination conditions show that the proposed method outperforms the other methods considered.
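To make the projected-gradient setting in ⑤ concrete, the sketch below shows one generic projected gradient update of the factor H with an Armijo-style backtracking line search. This is a minimal illustration of the class of algorithms the dissertation builds on, not the proposed ITNMF method or its modified line search criterion; the function name and parameters are illustrative.

import numpy as np

def pg_nmf_update_H(X, W, H, alpha0=1.0, beta=0.5, sigma=0.01, max_tries=20):
    # One projected-gradient update of H for: min 0.5*||X - W H||_F^2  s.t. H >= 0.
    # Gradient of the cost with respect to H:
    grad = W.T @ (W @ H) - W.T @ X
    f_old = 0.5 * np.linalg.norm(X - W @ H, 'fro') ** 2
    alpha = alpha0
    for _ in range(max_tries):
        # Take a gradient step, then project back onto the nonnegative orthant.
        H_new = np.maximum(H - alpha * grad, 0.0)
        f_new = 0.5 * np.linalg.norm(X - W @ H_new, 'fro') ** 2
        # Armijo-style sufficient-decrease test on the projected step.
        if f_new - f_old <= sigma * np.sum(grad * (H_new - H)):
            return H_new
        alpha *= beta  # backtrack: shrink the step size and try again
    return H  # no acceptable step found; keep the current iterate

Alternating such updates for H and W (updating W symmetrically via the factorization of the transposed data) yields a standard projected gradient NMF loop; the projection step is what keeps every iterate feasible under the nonnegative bound constraints discussed above.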
Keywords/Search Tags: Feature Extraction, Local Data Structure, Local Discriminant Analysis, Incremental Learning, Nonnegative Matrix Factorization