
Study Of Graph-based Feature Extraction And Feature Selection With Their Applications

Posted on: 2018-05-18    Degree: Doctor    Type: Dissertation
Country: China    Candidate: M D Yuan    Full Text: PDF
GTID: 1368330542473068    Subject: Signal and Information Processing
Abstract/Summary:
High-dimensional data exists widely in practical applications of data mining, computer vision and pattern recognition. It offers many new opportunities, but it also brings numerous challenges. On the one hand, the rich information contained in high-dimensional data broadens people's understanding of objective things; on the other hand, high-dimensional data increases the time and space complexity of data processing and storage, and leads to the potential "curse of dimensionality" and "overfitting" problems. In addition, the large number of redundant, irrelevant and even noisy features in high-dimensional data can seriously deteriorate classification, clustering and visualization performance. Dimensionality reduction is an important and effective means of addressing these problems; its purpose is to obtain compact and effective low-dimensional representations of the data. As two different dimensionality reduction approaches, feature extraction (or feature transformation) and feature selection have attracted widespread attention. Feature extraction transforms the original high-dimensional feature space into a lower-dimensional one and belongs to the process of feature generation; the resulting new features are linear or non-linear combinations of the original features. Feature selection, based on a certain criterion, selects an optimal subset from a large number of high-dimensional features, and the selected features retain the physical meaning of the original features. The theoretical framework of graph embedding unifies most dimensionality reduction methods as graph construction followed by its embedding form, in which graph construction is the most critical step: different ways of constructing the graph reflect different aspects of the data. In this dissertation, we take graph construction and its applications as the main line and high-dimensional, small-sample-size (SSS) data as the research object, and propose several feature extraction and feature selection methods to address problems such as the large reconstruction error and insufficient discriminative power of some existing methods. The main work and contributions of this dissertation are summarized as follows:

(1) Inspired by locally linear discriminant embedding (LLDE), we propose a collaborative representation discriminant embedding (CRDE) method to address the insufficient discriminative power of collaborative representation based projections (CRP), and apply it to image feature extraction. CRDE first constructs the graph by collaborative representation and, based on the resulting graph, builds a cost function that models the collaborative reconstruction relationships of the data; it then takes a modified maximum margin criterion (MMC) as a regularization term to explicitly introduce discriminant information, making it more suitable for classification tasks. Further analysis of CRDE from the graph embedding perspective shows that many popular feature extraction methods, such as locality preserving projections (LPP), neighborhood preserving embedding (NPE), sparsity preserving projections (SPP), CRP and discriminant sparse neighborhood preserving embedding (DSNPE), can all be unified into the CRDE framework. The effectiveness of CRDE is verified from three aspects: recognition rate, parameter analysis, and feature extraction time. A sketch of the collaborative representation graph construction underlying CRDE is given below.
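The following minimal Python/NumPy sketch illustrates collaborative representation based graph construction of the kind described above. The function name collaborative_representation_graph, the default regularization strength lam and the toy data are illustrative assumptions, not part of the dissertation; the coefficients come from the standard ridge-regression closed form used for L2-graph construction.

    import numpy as np

    def collaborative_representation_graph(X, lam=0.01):
        # X: (n_samples, n_features) data matrix, one sample per row.
        # lam: ridge regularization strength (an assumed default).
        # For each sample x_i, all remaining samples act as the dictionary D and
        # the weights are the closed-form ridge solution
        #     w_i = (D^T D + lam * I)^{-1} D^T x_i,
        # i.e. the usual collaborative representation (L2-graph) construction.
        n = X.shape[0]
        W = np.zeros((n, n))
        for i in range(n):
            idx = [j for j in range(n) if j != i]   # leave sample i out
            D = X[idx].T                            # features x (n - 1) dictionary
            coef = np.linalg.solve(D.T @ D + lam * np.eye(n - 1), D.T @ X[i])
            W[i, idx] = coef                        # reconstruction weights of x_i
        return W

    # Toy usage: 20 random 10-dimensional samples.
    rng = np.random.default_rng(0)
    W = collaborative_representation_graph(rng.normal(size=(20, 10)))
    print(W.shape)   # (20, 20)

The resulting weight matrix W can then be plugged into a graph embedding objective; discriminant terms such as MMC would be added on top of this graph in a full method.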
(2) By a deep analysis of the problems of large reconstruction error (or low reconstruction accuracy) and insufficient discriminative power in regularized least square based discriminative projections (RLSDP), we propose an enhanced method, ERLSDP. For each sample, ERLSDP uses all the representation coefficients of the related samples for reconstruction, which solves the large reconstruction error problem of RLSDP. It then builds a diagonal weight matrix (corresponding to the intra-class compact graph) to characterize the within-class geometric structure of the data, and uses it to explicitly minimize the distances between all intra-class samples (more specifically, between each sample and all reconstructed samples sharing its class label), so that the within-class samples become more compact. Finally, the optimal projection matrix of ERLSDP is obtained by maximizing the between-class divergence while simultaneously minimizing the distances between all within-class samples. The effectiveness of ERLSDP is verified by face recognition experiments under both occlusion-free and occlusion conditions.

(3) Traditional linear discriminant analysis (LDA) suffers from two issues: the local information of the data is lost, and the number of available projection vectors is limited. To deal with these issues, we integrate the collaborative representation graph (L2-graph) and put forward a collaborative preserving Fisher discriminant analysis (CPFDA) method. Owing to the local property of the collaborative representation coefficients, CPFDA can be viewed as a new method that fuses the local geometry and the global discriminant information. Its advantage is that it preserves the collaborative reconstruction relationships of the data while inheriting the global discriminant characteristic of LDA, and thus achieves better performance. Theoretical and experimental results show that CPFDA can obtain more meaningful projection vectors than LDA (specifically, the number of available projection vectors of CPFDA is twice that of LDA). Further analysis of CPFDA reveals that both LDA and marginal Fisher analysis (MFA) can be considered special cases of CPFDA. The performance of CPFDA is further improved by applying Gabor features to it.

(4) To overcome the problems that simultaneous orthogonal basis clustering feature selection (SOCFS) fails to exploit the local geometric information of the data and does not incorporate the L2,p norm, we propose a locality preserving orthogonal basis clustering feature selection (LPOCFS) method. LPOCFS is based on SOCFS but possesses several additional characteristics. First, LPOCFS constructs a local affinity graph to explicitly characterize the local geometric structure of the data, so that a more discriminative feature subset can be selected. Second, an L2,p (0 < p ≤ 1) norm constraint is imposed on the feature selection matrix, giving more flexibility in controlling its sparsity. Finally, since it is usually difficult for the cluster indicator matrix (CIM) to satisfy orthogonality and nonnegativity at the same time during optimization, we propose two optimization strategies and denote the corresponding algorithms nLPOCFS and oLPOCFS: oLPOCFS puts more emphasis on the orthogonality of the CIM, while nLPOCFS focuses on its nonnegativity. Experimental results demonstrate the effectiveness of both oLPOCFS and nLPOCFS. A sketch of the row-sparsity measure used by this kind of feature selection is given below.
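As a rough illustration of the L2,p-style row sparsity exploited in (4), the sketch below computes the L2,p norm of a feature selection matrix and ranks features by the L2 norms of its rows, the usual post-processing step in this family of methods. The function names, the choice p = 0.5 and the toy data are illustrative assumptions, not the dissertation's algorithm.

    import numpy as np

    def l2p_norm(W, p=0.5):
        # L_{2,p} norm of a (d features x c) matrix: row-wise L2 norms raised to
        # the power p, summed, then raised to the power 1/p. Smaller p drives
        # whole rows toward zero, i.e. discards the corresponding features.
        row_norms = np.linalg.norm(W, axis=1)
        return np.sum(row_norms ** p) ** (1.0 / p)

    def rank_features(W, k):
        # Keep the k features whose rows of W have the largest L2 norms.
        scores = np.linalg.norm(W, axis=1)
        return np.argsort(scores)[::-1][:k]

    # Toy usage: 50 candidate features, 5 clusters.
    rng = np.random.default_rng(1)
    W = rng.normal(size=(50, 5))
    print(l2p_norm(W, p=0.5))
    print(rank_features(W, k=10))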
Keywords/Search Tags:dimensionality reduction, feature extraction, feature selection, face recognition, graph embedding, collaborative representation, sparse representation