Research And Application On Nonnegative Matrix Factorization Algorithm For Clustering

Posted on:2020-02-12

Degree:Master

Type:Thesis

Country:China

Candidate:B C Fan

Full Text:PDF

GTID:2428330575493579

Subject:Computer Science and Technology

Abstract/Summary:

As an important data analysis method, clustering has been widely applied in many fields. Clustering algorithms assign samples to different clusters according to their features and the similarities between them: samples in the same cluster should be as similar as possible, and samples in different clusters should be as dissimilar as possible. Clustering gives a quick understanding of the structure of the data and reveals the underlying pattern of each cluster, which supports further analysis. So far, various clustering methods have been developed to solve different problems. Among them, Nonnegative Matrix Factorization (NMF) is a popular method that decomposes a nonnegative input matrix into the product of a nonnegative basis matrix and a nonnegative representation matrix. Because only nonnegative elements are involved, the NMF model is more interpretable and is consistent with the requirements of practical applications such as image, gene expression, and spectral data. Meanwhile, the nonnegativity constraint also guarantees that each data sample is represented by a nonnegative linear combination of nonnegative basis vectors, which reflects the relationship between the whole and its parts. However, NMF still has some issues: 1. the basis matrix is generated purely algebraically; 2. the basis vectors may be correlated when only nonnegativity constraints are imposed; 3. NMF neglects the correlations in local information. To overcome these issues, we propose three algorithms based on nonnegative matrix factorization. The main research work and achievements are as follows:

(1) A density peaks based multiple centroids nonnegative matrix factorization algorithm is proposed to capture the manifold information or structure information in the original data, without which the clustering results may not be optimal for complex data. To capture the local structure of the original data, multiple density peaks are first selected from the data points. Then, linear combinations of these density peaks are set as the initial cluster centroids to obtain the relationship between the data points and the centroids. Finally, every data point is assigned to the centroid with which it has the maximum similarity. Experiments were carried out on several synthetic datasets and real datasets. The promising experimental results demonstrate that the method can effectively introduce the local structure and improve the clustering performance.

(2) A Dropout based deep semi-nonnegative matrix factorization model is proposed. NMF factorizes the input matrix into the product of a basis matrix and a representation matrix. The nonnegativity constraints cannot guarantee that the latent features in the basis matrix are orthogonal and non-overlapping. To break the co-occurrences between latent features and decrease the redundancy, a Dropout based semi-nonnegative matrix factorization model is proposed. By incorporating a deep model, a Dropout based deep semi-nonnegative matrix factorization model is further proposed.

(3) A nonnegative matrix tri-factorization model for biclustering is proposed. To find the important features of high-dimensional data while clustering, we cluster the data according to both the correlation between features and the correlation between samples. Following the manifold assumption, a feature neighbor graph and a sample neighbor graph are constructed for the features and the samples, and manifold regularization terms are constructed from them. It is well known that the Frobenius norm is sensitive to noise and outliers. To overcome this weakness, we use the L2,p norm in the cost function. Experimental results on real high-dimensional gene expression data show the superiority of the model.
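
As an illustration of the baseline that the three proposed models extend, the following minimal sketch factorizes a nonnegative matrix X into a basis matrix W and a representation matrix H, then assigns each sample to the basis vector with the largest coefficient. The abstract does not state which update rule the thesis uses; the sketch assumes the standard multiplicative updates for the Frobenius-norm objective, and the names `nmf`, `cluster_labels`, and the toy matrix are hypothetical, chosen only for this example.

```python
import numpy as np

def nmf(X, k, n_iter=500, eps=1e-10, seed=0):
    """Approximate nonnegative X (features x samples) as W @ H with W, H >= 0.

    Assumption: multiplicative updates for the squared Frobenius error;
    the thesis may use a different objective or solver.
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, k)) + eps   # basis matrix, one column per basis vector
    H = rng.random((k, n)) + eps   # representation matrix, one column per sample
    for _ in range(n_iter):
        # Element-wise updates keep W and H nonnegative at every step.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

def cluster_labels(H):
    """Assign each sample (column of H) to its largest coefficient."""
    return np.argmax(H, axis=0)

# Toy data: samples 0-2 use features 0-1, samples 3-5 use features 2-3.
X = np.array([
    [1.0, 1.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 1.0, 1.0],
])
W, H = nmf(X, k=2)
labels = cluster_labels(H)
print(labels)
```

In this sketch each cluster is tied to a single basis vector and no local structure is used, which is the limitation that the multiple-centroid and graph-regularized models described above are meant to address.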

Keywords/Search Tags:

clustering, NMF, density peak, deep model, graph regularization

Related items

1	Research And Application Of Density Peak Clustering Algorithm Based On Density Decay Graph
2	Research On Improved Density Peak Clustering Algorithm
3	Research And Application Of Clustering Algorithm Based On Density Peak
4	Research On Application And Optimization Of Density Peak Clustering
5	Research On Density Peak-based Clustering Algorithm And Its Parallel Implementation
6	Research On Skin Detection Algorithm Based On Deep Learning And Density Peak Clustering
7	Research On Quick Clustering Algorithm Based On Density Subgraph
8	Research On Deep Semi-supervised Clustering Algorithm
9	Research And Application Of Density Peak Clustering Algorithm
10	The Research And Application Of Density Peak Clustering Algorithm