
Sparse And Low Rank Theory And Their Applications

Posted on: 2020-11-15 | Degree: Doctor | Type: Dissertation
Country: China | Candidate: L X Yang | Full Text: PDF
GTID: 1368330623458211 | Subject: Communication and Information System
Abstract/Summary:
With the development of science and technology, we have entered the era of big data. Large-scale data are produced every day in many domains, and these massive data are beyond the ability of commonly used tools to capture, manage, and process within a tolerable time. A key challenge in big data applications is to extract meaningful information or knowledge to guide future decisions. Designing effective and efficient big data analysis methods, able to capture the underlying patterns, correlations, and other useful knowledge, therefore becomes more and more important. Due to the inherent redundancy of data, sparsity and low-rankness exist widely in big data analytics, and they provide a new option for dealing with big data applications.

Although most data admit sparse representations, they are usually non-sparse in the original data space and become sparse only after some linear transformation. In many applications, the sparsity level of the data significantly affects the performance of denoising, classification, and clustering, so it is valuable to learn from the data a set of basis vectors under which the data have their sparsest representation. Apart from sparsity, low-rankness occurs in many applications, where it can be used for factor analysis and feature extraction. However, in many real-world applications the data are incomplete due to physical limitations or other reasons, so many traditional data analysis methods cannot be applied directly. It is therefore meaningful to investigate how to impute and analyze data when part of it is missing.

Keeping these problems in mind, this dissertation first investigates the signal sparsity problem. Most existing approaches, i.e., dictionary learning methods, are based on deterministic models that alternately learn the sparse representation and update the dictionary. Their main drawback is that they either need to know the noise level or the sparsity level, or must predefine some regularization parameters. Knowing the noise or sparsity level is impractical, and improper parameters considerably degrade performance, which limits the applicability of these approaches. To address this problem, we model all the parameters as random variables and impose prior distributions on them so that they can be learned from the data. On the other hand, unlike most existing Bayesian dictionary learning methods, the proposed method jointly learns the sparse representation and updates the dictionary, which makes it unlikely to become trapped in undesirable local minima. This method provides a new approach for dictionary learning research.
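To make the role of hand-tuned parameters concrete, the following is a minimal sketch of a conventional deterministic alternating scheme of the kind contrasted above: ISTA sparse coding followed by a least-squares dictionary update. The function names and the regularization parameter lam are illustrative assumptions, not the dissertation's method; lam, which stands in for knowledge of the noise or sparsity level, must be chosen by hand, which is precisely the drawback the Bayesian formulation removes.

```python
import numpy as np

def ista_sparse_code(Y, D, lam, n_iter=50):
    """Sparse coding via ISTA: min_X 0.5*||Y - D@X||_F^2 + lam*||X||_1."""
    L = np.linalg.norm(D, 2) ** 2               # Lipschitz constant of the gradient
    X = np.zeros((D.shape[1], Y.shape[1]))
    for _ in range(n_iter):
        G = X - D.T @ (D @ X - Y) / L           # gradient step on the data-fit term
        X = np.sign(G) * np.maximum(np.abs(G) - lam / L, 0.0)  # soft thresholding
    return X

def dictionary_learning(Y, n_atoms, lam, n_outer=20, seed=0):
    """Deterministic alternating dictionary learning; lam must be hand-tuned."""
    rng = np.random.default_rng(seed)
    D = rng.standard_normal((Y.shape[0], n_atoms))
    D /= np.linalg.norm(D, axis=0)              # unit-norm atoms
    for _ in range(n_outer):
        X = ista_sparse_code(Y, D, lam)         # fix D, solve for sparse codes
        D = Y @ np.linalg.pinv(X)               # fix X, least-squares dictionary update
        D /= np.linalg.norm(D, axis=0) + 1e-12  # renormalize atoms
    return D, X
```

In contrast, the Bayesian formulation described above places prior distributions on the noise precision and the sparsity-controlling hyperparameters, so the quantity playing the role of lam is inferred from the data rather than fixed in advance.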
We next study the low-rank matrix completion problem. Most existing low-rank matrix completion methods can be classified into two classes: deterministic and probabilistic. Compared with deterministic methods, probabilistic approaches usually do not need pre-specified regularization parameters and can achieve superior performance. Many existing Bayesian methods are based on the factorization model, in which the unknown matrix is assumed to decompose into two factor matrices and the low-rankness of the matrix is converted into structural sparsity of the factor matrices. However, this factorization model prevents exploiting additional structure in the signal matrix, and Bayesian methods usually suffer from high computational complexity. To address these problems, we propose a hierarchical Gaussian prior that is imposed directly on the signal matrix and does not rely on the factorization model, which enables the method to capture the additional structure of the signals. Moreover, to reduce the computational complexity, we embed a generalized approximate message passing (GAMP) approach in our method to solve the matrix estimation problem. Compared with other Bayesian methods, the proposed one achieves better completion performance and runs faster; in particular, it outperforms existing methods by a large margin when side information is available.

We then investigate the low-rank tensor factorization problem. Unlike the rank of a matrix, the multilinear rank of a tensor is more complicated, and it is nontrivial to generalize low-rank matrix factorization to tensor decomposition. In this thesis, we propose a new concept called the sub-tensor and formulate the low multilinear rank property of a tensor as structural sparsity of its core tensor. To promote block sparsity of the core tensor, we impose a log-sum penalty on each sub-tensor. Considering that each entry of the core tensor is contained in multiple sub-tensors, we employ a majorization-minimization technique, in which the original loss function is driven downhill by alternately minimizing a surrogate function. This finally yields an iterative reweighted low-rank tensor decomposition method; a simplified sketch of the reweighting idea is given below.

We finally study data clustering. The property that most data congregate around some centers is an important complement to sparsity and low-rankness. In this thesis, we propose a graph-embedding variational Gaussian mixture autoencoder for deep clustering, which is built within the variational autoencoder framework and combines model-based and similarity-based clustering methods. Specifically, we replace the Gaussian prior of the variational autoencoder with a mixture of Gaussians to ensure the method has the properties of model-based approaches. Besides, by pushing the feature distributions of samples that are connected on the graph to be similar, the method is empowered to use similarity information. After some mathematical manipulation, we eventually obtain a novel deep clustering method. The proposed method combines model-based and similarity-based clustering for the first time, which gives it good generalization ability and the capacity to utilize graph information.
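As referenced in the tensor discussion above, the majorization-minimization step can be illustrated on a simplified vector analogue. The sketch below is not the dissertation's tensor algorithm; it applies the same log-sum/reweighting idea to min_x 0.5*||y - A@x||^2 + lam*sum_i log(|x_i| + eps). The concave penalty is majorized at the current iterate by a weighted l1 norm with weights 1/(|x_i| + eps), and the resulting convex surrogate is decreased with a few proximal-gradient steps, giving the familiar iterative reweighting pattern.

```python
import numpy as np

def reweighted_logsum(y, A, lam, eps=1e-2, n_outer=15, n_inner=30):
    """Iterative reweighting for min_x 0.5*||y - A@x||^2 + lam*sum(log(|x| + eps)).

    Each outer iteration majorizes the concave log-sum penalty at the current
    iterate by a weighted l1 norm (weights 1/(|x| + eps)), then decreases the
    convex surrogate with a few proximal-gradient (weighted ISTA) steps.
    """
    L = np.linalg.norm(A, 2) ** 2                # step size from the Lipschitz constant
    x = np.zeros(A.shape[1])
    for _ in range(n_outer):
        w = 1.0 / (np.abs(x) + eps)              # majorization weights at the current iterate
        for _ in range(n_inner):
            g = x - A.T @ (A @ x - y) / L        # gradient step on the data-fit term
            x = np.sign(g) * np.maximum(np.abs(g) - lam * w / L, 0.0)  # weighted soft threshold
    return x
```

The sub-tensor version described in the abstract follows the same majorize-then-minimize pattern, with the reweighting applied per sub-tensor rather than per entry.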
Keywords/Search Tags: Dictionary Learning, Sparse Representation Learning, Low-rank Matrix Completion, Low-rank Tensor Decomposition, Deep Clustering