Research on Manifold Embedding Matrix Factorization Algorithms

Posted on: 2018-08-08
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X Li
Full Text: PDF
GTID: 1368330575978865
Subject: Control Science and Engineering

Abstract/Summary:

With the rapid development of information technology and the widespread use of the Internet, the data to be processed are massive, fast-growing, high-dimensional, and nonlinear. How to deal with such complex massive data quickly and effectively, and how to extract the valuable information users need, have long been common concerns in pattern recognition and computer vision. As a nonlinear dimensionality reduction method, matrix factorization, which decomposes a high-dimensional matrix into two or more factor matrices, has become a new focus in machine learning. From a manifold learning perspective, we explore how to effectively extract the valuable information embedded in high-dimensional data. Traditional manifold learning methods suffer from several defects: the simplistic graph model, the complex distribution of samples, the limitations of single-layer decomposition, and the failure to make full use of geometric structure information or of the label information of labeled data. This thesis analyzes these problems and provides solutions. The major contributions are summarized as follows:

1. The Global Data Nonnegative Matrix Factorization (GDNMF) is proposed. Traditional graph methods employ K-Nearest Neighbors to encode the structure information of data, which is not enough to completely capture the geometric structure, owing to the complex distribution of the samples. In addition, the optimal value of K varies across datasets, so selecting an appropriate K has always been a difficult problem in graph learning methods. GDNMF keeps the topological relations between a sample and the other points
constant in the low-dimensional space. Additionally, to learn a better data representation and reduce redundancy, we also require that different bases be as orthogonal as possible. Via GDNMF, the data structure information in the new representation space is therefore consistent with that of the original data. Finally, GDNMF is verified on the ORL, USPS, and OUTEX datasets.

2. The Structured Discriminative Nonnegative Matrix Factorization (SDNMF) is proposed for hyperspectral unmixing. SDNMF preserves the structural information of hyperspectral data by introducing structured discriminative regularization terms that model both the local affinity and the distant repulsion of observed spectral responses. SDNMF thus exploits the local affinity of the data to guarantee that similar raw data have similar abundances, while simultaneously ensuring that dissimilar data have different estimated abundances. In addition, owing to the low resolution of the spectral imager and the complex distribution of hyperspectral data, SDNMF determines endmembers inaccurately when it takes only the distances between pixels as the evaluation criterion, especially for materials distributed at junctions. We therefore further propose the Global centralized and Structured discriminative Nonnegative Matrix Factorization (GSNMF) method for hyperspectral unmixing. In GSNMF, a structured discriminative regularization term and global centralized clustering are imposed on the NMF framework, which helps to discover the underlying geometric data structure and the characteristics of the different categories of signatures, respectively. By maintaining global centralized clustering and local structured discriminative regularization, GSNMF yields a discriminative representation of the spectra, and the obtained fractional abundances coincide well with the real distributions of the constituent materials. Experimental results on a synthetic dataset and on real hyperspectral image datasets (Urban and Washington DC) have demonstrated the
effectiveness of the proposed SDNMF and GSNMF.

3. Traditional Concept Factorization (CF) may yield inferior results because its factorization is performed on a single layer only. To address this issue, we propose Multilayer Concept Factorization (MCF), based on hierarchical data representation. MCF is a cascade subsystem that decomposes the observation matrix iteratively into a number of layers. Through this sequential decomposition process, the feature matrix obtained via MCF forms a cascade system that benefits performance. Inspired by manifold learning, we propose an extension of MCF, namely Graph regularized Multilayer Concept Factorization (GMCF). GMCF further incorporates graph Laplacian regularization in each layer to efficiently preserve the manifold structure of the data. Generally speaking, multilayer matrix factorization methods consistently achieve better performance than their corresponding single-layer methods. Experimental results on the TDT2 corpus and the COIL20 and NJUrobt datasets have demonstrated that the proposed multilayer methods, i.e., MCF and GMCF, can effectively improve clustering accuracy and normalized mutual information.

4. Manifold learning methods based on the simple graph model ignore the high-order relationships between data points. We propose a novel algorithm, called Hypergraph Regularized Concept Factorization (HRCF), to address this shortcoming. HRCF captures the high-order relationships among samples by constructing hyperedges in a hypergraph, each connecting a subset of data points sharing similar attributes. HRCF preserves the high-order relationships of the manifold structure by adding a hypergraph regularization term to the CF framework. To take the label information of the data into account, we further propose Hypergraph regularized Constrained Concept Factorization (HCCF). HCCF not only extracts the multi-geometry information of samples by constructing an undirected weighted hypergraph Laplacian regularization term, but also takes full advantage of the label information of labeled
samples, used as hard constraints, to preserve label consistency in the low-dimensional space. Experimental results on the Reuters corpus and the MNIST and OUTEX datasets demonstrate that the hypergraph-based HRCF and HCCF are superior to the other compared methods in terms of data representation and clustering performance.

Keywords/Search Tags: Manifold embedding, Nonnegative matrix factorization, Concept factorization, Orthogonal, Hyperspectral unmixing, Structured discriminative, Cluster, Multilayer factorization, Graph Laplacian, Hypergraph, Semi-supervised, Hard constraint
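The graph-regularized factorization idea behind contribution 1 can be illustrated with a minimal NumPy sketch. The k-NN affinity graph, the objective ||X - UV^T||_F^2 + lam * tr(V^T L V), and the multiplicative updates below follow the common graph-regularized NMF recipe and are illustrative assumptions, not the dissertation's exact GDNMF formulation (GDNMF additionally enforces near-orthogonal bases and preserves global topological relations, which are omitted here).

```python
import numpy as np

def knn_graph(X, k=5):
    """Binary symmetric k-nearest-neighbour affinity matrix over columns of X (assumed graph)."""
    n = X.shape[1]
    d2 = ((X[:, :, None] - X[:, None, :]) ** 2).sum(axis=0)  # pairwise squared distances
    W = np.zeros((n, n))
    for i in range(n):
        idx = np.argsort(d2[i])[1:k + 1]  # k nearest neighbours, skipping the point itself
        W[i, idx] = 1.0
    return np.maximum(W, W.T)             # symmetrise

def graph_nmf(X, r, lam=0.1, iters=200, seed=0):
    """Minimise ||X - U V^T||_F^2 + lam * tr(V^T L V) with multiplicative updates, L = D - W."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    U = rng.random((m, r)) + 1e-3
    V = rng.random((n, r)) + 1e-3
    W = knn_graph(X)
    D = np.diag(W.sum(axis=1))
    eps = 1e-9                            # avoids division by zero
    for _ in range(iters):
        U *= (X @ V) / (U @ (V.T @ V) + eps)
        V *= (X.T @ U + lam * (W @ V)) / (V @ (U.T @ U) + lam * (D @ V) + eps)
    return U, V
```

Because both numerator and denominator are nonnegative, the updates keep U and V nonnegative throughout, which is the usual motivation for the multiplicative form.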
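The cascade idea of contribution 3 can likewise be sketched. Each layer below uses the standard CF multiplicative updates of Xu and Gong (X ~ X W V^T, computed through the Gram matrix K = X^T X), and the hand-off where each layer factorizes the previous layer's representation V^T is an assumed layering scheme for illustration; the dissertation's exact MCF cascade may differ.

```python
import numpy as np

def cf_layer(X, r, iters=300, seed=0):
    """One concept-factorization layer: X ~ X W V^T with W, V >= 0 (Xu & Gong updates)."""
    rng = np.random.default_rng(seed)
    n = X.shape[1]
    W = rng.random((n, r)) + 1e-3
    V = rng.random((n, r)) + 1e-3
    K = X.T @ X                       # only inner products between samples are needed
    eps = 1e-9
    for _ in range(iters):
        W *= (K @ V) / (K @ W @ (V.T @ V) + eps)
        V *= (K @ W) / (V @ (W.T @ K @ W) + eps)
    return W, V

def multilayer_cf(X, ranks):
    """Cascade: layer l factorizes the previous layer's representation (assumed hand-off)."""
    reps, data = [], X
    for r in ranks:
        W, V = cf_layer(data, r)
        reps.append(V)
        data = V.T                    # next layer decomposes the new feature matrix
    return reps
```

A shrinking rank schedule such as `ranks=(6, 3)` mimics the hierarchical compression the abstract describes: each layer produces a lower-dimensional representation of all n samples.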
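Contribution 4 rests on a hypergraph Laplacian regularizer. A common construction, assumed here since the abstract does not give the exact one, is Zhou-style: one hyperedge per sample containing that sample and its k nearest neighbours, unit hyperedge weights, and the normalized Laplacian L = I - Dv^{-1/2} H W De^{-1} H^T Dv^{-1/2}.

```python
import numpy as np

def hypergraph_laplacian(X, k=4):
    """Normalised hypergraph Laplacian (Zhou et al. form); columns of X are samples.
    Hyperedge e = {sample e and its k nearest neighbours}; unit weights (assumptions)."""
    n = X.shape[1]
    d2 = ((X[:, :, None] - X[:, None, :]) ** 2).sum(axis=0)
    H = np.zeros((n, n))                  # vertex-by-hyperedge incidence matrix
    for e in range(n):
        idx = np.argsort(d2[e])[:k + 1]   # the centroid sample itself plus k neighbours
        H[idx, e] = 1.0
    w = np.ones(n)                         # unit hyperedge weights
    Dv = H @ w                             # vertex degrees
    De = H.sum(axis=0)                     # hyperedge degrees
    inv_sqrt_Dv = np.diag(1.0 / np.sqrt(Dv))
    Theta = inv_sqrt_Dv @ H @ np.diag(w / De) @ H.T @ inv_sqrt_Dv
    return np.eye(n) - Theta
```

This L is symmetric positive semidefinite, so adding tr(V^T L V) to a CF objective penalizes representations that vary within a hyperedge, which is precisely the high-order smoothness the abstract attributes to HRCF.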
 
