Clustering Algorithm Based On Robust Non-negative Matrix Factorization

Posted on: 2021-09-08
Degree: Master
Type: Thesis
Country: China
Candidate: X Y Shen
Full Text: PDF
GTID: 2518306548494544
Subject: Computer Science and Technology

Abstract/Summary:
With the worldwide outbreak of COVID-19, large volumes of travel-trajectory, resource-allocation, and epidemic prevention and control data have accumulated, and extracting useful information from them for joint prevention and control is imperative. As a basic data-processing tool, clustering plays an active role in China's precision policy-making process. Non-negative matrix factorization (NMF) is a classical data analysis tool for clustering tasks: because of its non-negativity constraints, it is known to learn an interpretable parts-based representation. However, NMF usually measures the reconstruction error with the squared loss, which makes it sensitive to outliers. To address this problem, this thesis studies the robustness of NMF and proposes two robust models: (1) we revisit the hyperbolic tangent (tanh) function as a robust loss for evaluating the reconstruction error and propose a robust NMF model based on the parameterized hyperbolic tangent function (tanh NMF); (2) we take a further step to consider the robustness of similarity reconstruction and explore a robust similarity-based concept factorization model (RSCF).

The main work of this thesis is as follows:

1) To restrain the effect of outliers on the loss function and enhance the robustness of the model, we propose a robust NMF model based on the parameterized hyperbolic tangent function, called tanh NMF. Moreover, to capture the geometric structure within the data, we devise a locality constraint that regularizes tanh NMF to model data locality. Face clustering experiments on four popular facial datasets, with and without corruptions, show that the proposed method achieves satisfactory performance against several representative baselines, including NMF and its robust counterparts.

2) We also propose an improved concept factorization (CF) model, namely similarity-based CF (SCF). SCF requires the similarity of the samples reconstructed by CF to be close to that of the original samples, which improves clustering performance. In addition, we take a further step to consider the robustness of similarity reconstruction and explore a robust SCF model constrained by the l_? norm (Robust SCF, RSCF). RSCF thus enjoys similarity preservation, robustness to similarity perturbation, and the ability to reconstruct samples. Extensive experiments validate these properties and show that the proposed SCF and RSCF achieve large performance gains over their counterparts.
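
To make the robustness argument concrete, the following is a minimal sketch in Python (NumPy), not the thesis's implementation. It shows standard squared-loss NMF with the classical multiplicative updates, and compares the squared loss with a bounded tanh-based loss on a single residual. The function names and the specific form tanh(c * r^2) with parameter c are assumptions made for illustration; the abstract does not state the exact parameterization used by tanh NMF.

```python
import numpy as np


def nmf_multiplicative(X, rank, n_iter=200, eps=1e-10, seed=0):
    """Squared-loss NMF, X ~ W @ H, using classical multiplicative updates.

    This is the standard baseline, not the robust model from the thesis.
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    # Strictly positive random initialization keeps the factors non-negative.
    W = rng.random((m, rank)) + eps
    H = rng.random((rank, n)) + eps
    for _ in range(n_iter):
        # Each update rescales entries by a non-negative ratio.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ (H @ H.T) + eps)
    return W, H


def squared_loss(r):
    """Unbounded: a single large residual can dominate the objective."""
    return r ** 2


def tanh_loss(r, c=1.0):
    """Hypothetical parameterized tanh loss (assumed form, c > 0).

    Behaves like c * r**2 for small residuals and saturates at 1 for
    large ones, so an outlier contributes at most 1 to the objective.
    """
    return np.tanh(c * r ** 2)


if __name__ == "__main__":
    residuals = np.array([0.1, 1.0, 10.0, 100.0])
    print("squared:", squared_loss(residuals))
    print("tanh   :", tanh_loss(residuals, c=0.5))
```

Under this assumed form, the squared loss grows without bound as the residual grows, while the tanh-based loss never exceeds 1, which is the intuition behind replacing the squared loss to limit the influence of outliers.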
Keywords/Search Tags: Clustering, Non-negative matrix factorization, Parameterized hyperbolic tangent function, Robustness, Locality constraint, Similarity reconstruction, l_? norm