Clustering Ensemble Based On Nonnegative Matrix Factorization

Posted on: 2019-06-10
Degree: Master
Type: Thesis
Country: China
Candidate: W T Ye
GTID: 2348330563454546
Subject: Computer technology
Abstract/Summary:
Cluster analysis is one of the most powerful tools in data mining. Clustering is the task of grouping a set of objects so that objects in the same group are more similar to each other than to those in other groups. A clustering ensemble takes several clustering results as input and produces a single data partition that retains, as far as possible, the clustering information carried by those results. Compared with a single clustering, a clustering ensemble offers robustness, applicability, and stability, and it has inherent advantages in the parallel processing of subsets.

Dark knowledge is knowledge in ensemble learning that is hidden inside the learning machines and benefits ensemble performance. Traditional clustering ensemble methods use only the labels produced by the base learning algorithms to obtain an ensemble result. These base learning algorithms can also yield other information, such as parameters, covariances, or probability data, which can be called dark knowledge. In this thesis, we develop the concept and construction method of dark knowledge and apply it to clustering ensemble. This provides more information about the base clusterings and frees the clustering ensemble model from the limitation of discrete data.

Non-negative matrix factorization (NMF) is a feature extraction method that can map high-dimensional data to a low-dimensional space. Based on dark knowledge, we propose a non-negative matrix factorization clustering ensemble model (NMFCE). First, different base clustering results are obtained with various clustering algorithms; then the dark knowledge of every base clustering algorithm is extracted. NMF is then applied to the dark knowledge to obtain the integrated result. Experimental results show that the method outperforms other clustering ensemble techniques.

In practical clustering tasks, some extra supervised knowledge is often available. Semi-supervised clustering can use this supervision to guide the unsupervised learning process. In this thesis, a semi-supervised clustering ensemble model based on NMF is proposed. First, the model uses the Gaussian kernel function to construct a similarity matrix based on dark knowledge. Second, the model uses constraint techniques to add supervised information. Finally, it combines these with NMF to obtain the clustering results. Experimental results show that the semi-supervised NMFCE model performs better than NMFCE and that it uses the supervised knowledge effectively.
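The unsupervised pipeline described in the abstract (base clusterings, dark knowledge extraction, then NMF) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the thesis's implementation: the abstract does not say how dark knowledge is constructed, so the sketch assumes it takes the form of soft membership probabilities, uses scikit-learn's `KMeans`, `GaussianMixture`, and `NMF`, and the helper names (`soft_kmeans_memberships`, `gmm_memberships`, `nmf_consensus`) and the toy data are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import NMF
from sklearn.mixture import GaussianMixture


def soft_kmeans_memberships(X, k, seed):
    """Assumed dark knowledge for k-means: distances turned into soft memberships."""
    km = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(X)
    d = km.transform(X)  # distance of each sample to each centroid
    # Shift by the row minimum for numerical stability, then normalize rows.
    w = np.exp(-(d - d.min(axis=1, keepdims=True)))
    return w / w.sum(axis=1, keepdims=True)


def gmm_memberships(X, k, seed):
    """Assumed dark knowledge for a Gaussian mixture: posterior probabilities."""
    gmm = GaussianMixture(n_components=k, random_state=seed).fit(X)
    return gmm.predict_proba(X)


def nmf_consensus(memberships, k, seed=0):
    """Stack all soft memberships and factorize them with NMF (M ~ W H)."""
    M = np.hstack(memberships)  # nonnegative, n_samples x total base clusters
    model = NMF(n_components=k, init="nndsvda", max_iter=500, random_state=seed)
    W = model.fit_transform(M)
    # Each sample is assigned to the latent component with the largest weight.
    return W.argmax(axis=1)


# Toy data with three well-separated groups (illustrative only).
X, _ = make_blobs(n_samples=150, centers=[[0, 0], [6, 6], [-6, 6]],
                  cluster_std=0.8, random_state=0)

base = [soft_kmeans_memberships(X, 3, seed) for seed in range(3)]
base += [gmm_memberships(X, 3, seed) for seed in range(2)]
labels = nmf_consensus(base, k=3)
```

Because the stacked matrix holds continuous memberships rather than hard labels, the factorization sees how confident each base learner was about each sample, which is the motivation the abstract gives for moving beyond discrete label data.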
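The three steps of the semi-supervised model (Gaussian kernel similarity on dark knowledge, constraints, NMF) could look something like the sketch below. Several choices here are assumptions that the abstract does not specify: dark knowledge is taken to be Gaussian mixture posteriors, supervision is encoded as must-link and cannot-link pairs that overwrite similarity entries with 1 and 0, and the factorization is a symmetric NMF (S ~ H H^T) with a damped multiplicative update. The function names and the kernel width `sigma` are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.mixture import GaussianMixture


def gaussian_similarity(M, sigma=0.5):
    """Gaussian kernel on rows of M: exp(-||m_i - m_j||^2 / (2 sigma^2))."""
    sq = np.sum(M ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * M @ M.T, 0.0)
    return np.exp(-d2 / (2.0 * sigma ** 2))


def apply_constraints(S, must_link, cannot_link):
    """Assumed constraint encoding: must-link -> 1, cannot-link -> 0."""
    S = S.copy()
    for i, j in must_link:
        S[i, j] = S[j, i] = 1.0
    for i, j in cannot_link:
        S[i, j] = S[j, i] = 0.0
    return S


def symmetric_nmf(S, k, n_iter=200, seed=0, eps=1e-10):
    """Approximate S ~ H H^T with a damped multiplicative update."""
    rng = np.random.default_rng(seed)
    H = rng.random((S.shape[0], k))
    for _ in range(n_iter):
        H *= 0.5 + 0.5 * (S @ H) / (H @ (H.T @ H) + eps)
    return H


# Toy data; only the first 12 samples are treated as labeled (illustrative).
X, y = make_blobs(n_samples=120, centers=[[0, 0], [6, 6], [-6, 6]],
                  cluster_std=0.8, random_state=1)

# Assumed dark knowledge: posteriors from two differently seeded mixtures.
dark = np.hstack([
    GaussianMixture(n_components=3, random_state=seed).fit(X).predict_proba(X)
    for seed in range(2)
])

pairs = [(i, j) for i in range(12) for j in range(i + 1, 12)]
must_link = [(i, j) for i, j in pairs if y[i] == y[j]]
cannot_link = [(i, j) for i, j in pairs if y[i] != y[j]]

S = apply_constraints(gaussian_similarity(dark), must_link, cannot_link)
H = symmetric_nmf(S, k=3)
labels = H.argmax(axis=1)
```

Overwriting similarity entries is only one simple way to inject pairwise supervision; a penalty term in the NMF objective would be another, and the abstract does not indicate which approach the thesis takes.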
Keywords/Search Tags: clustering ensemble, semi-supervised clustering ensemble, non-negative matrix factorization, dark knowledge, similarity matrix