Research On Multi-kernel And Adaptive Similarity Measured Spectral Clustering And Ensemble

Posted on: 2021-01-11
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Augustine Monney
GTID: 1488306128465264
Subject: Computer application technology
Abstract/Summary:
Cluster analysis is an important data analysis tool for revealing category attributes in data: it organizes data into clusters or groups based on similarity. Recently, many multiple kernel clustering methods have been proposed that use the complementary information in multiple views obtained from multiple kernels to improve clustering performance. Although these methods have been somewhat successful, choosing appropriate kernels and determining their importance in a clustering task remains challenging. Additionally, noise handling is still not optimized. Because real-world data is mostly non-linear, kernel clustering, and multiple kernel clustering in particular, has a wide range of applications in data mining and pattern recognition. The objective of this dissertation is to improve clustering performance using multiple kernels, adaptive neighbors, and ensemble learning. The dissertation first outlines the background of the study and the application areas of clustering, and then discusses the research status. In the course of this research, significant progress has been made in data cluster analysis. To improve the performance of spectral clustering, the dissertation proposes three novel methods: a multiple kernel method of measuring adaptive similarity for spectral clustering; a co-regularized discriminative spectral clustering method with an adaptive similarity measure in a dual-kernel space; and a robust discriminative multi-kernel based spectral clustering ensemble method. The main contents of the dissertation are as follows.

(1) A multi-kernel method of measuring adaptive similarity for spectral clustering is proposed. The method learns the similarity of data points based on the adaptive neighborhood in multiple kernel space. Kernels that yield a more accurate adaptive similarity measure on the data automatically obtain larger weights, so an optimal kernel that truly reflects the internal structure of the data points is obtained. The method assigns adaptive and optimal neighbors to each data point based on the local structure, using the combined kernel. The resulting combined similarity measure is sparse and is computed with the combined kernel, which is the weighted sum of the individual kernels. Because treating similarity measurement and data clustering as two separate steps may lead to suboptimal results, the method learns the data similarity matrix and the clustering structure simultaneously. The presented technique is able to uncover the underlying similarity relationships among data points and is robust to complex data. Experimental comparison and analysis against other state-of-the-art methods, using accuracy and normalized mutual information, show that this method has better clustering performance.

(2) A co-regularized discriminative spectral clustering method in dual-kernel space is also proposed. Many past studies based on spectral clustering give little consideration to the global discriminative structure of the dataset. This method retains both global geometric information and global discriminative information to optimize clustering performance. In addition, although previous studies have shown that using more than one kernel can produce more accurate clustering than a single kernel, the benefits of using more than one kernel have not been fully exploited in spectral clustering. At the same time, multi-kernel approaches tend to be more time-consuming than single-kernel methods. To improve both the accuracy and the processing speed of spectral clustering, the method integrates a global discriminative term into the clustering within an adaptive neighbor framework that uses two heterogeneous kernels. The approach looks for a clustering that is consistent across the two kernel views in order to detect the non-linear intrinsic geometrical information of the dataset. Clustering is performed by applying k-means to the indicator matrix obtained from the modified Laplacian. Experimental comparison and analysis against other state-of-the-art methods, using accuracy and normalized mutual information, show that this method improves data clustering performance, has a certain degree of noise resistance, and runs faster.

(3) A robust discriminative multi-kernel based spectral clustering ensemble method is proposed. Real data usually includes corrupted parts, which make the learned graph inaccurate or unreliable. To further improve clustering performance and noise resistance, this method builds on the graph learning scheme and integrates discriminative multi-kernel spectral clustering to learn reliable graphs from noisy real-world data by adaptively removing noise and errors in the raw data. The study incorporates discrimination into the idea of building a similarity graph based on clean data. It explores the non-linear feature space of datasets by projecting them into high-dimensional spaces and adaptively learning the optimal neighbors of each data point in these spaces. At the same time, in the multi-kernel space (RDSC-MK), multiple discriminative kernel spectral clustering methods are mutually constrained and integrated. Extensive experiments on synthetic and real datasets, comparing the method with related methods, show that it achieves higher clustering accuracy and normalized mutual information and has better noise resistance.
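
The pipeline that the three methods build on (combine several kernels, derive a sparse neighbor-based similarity graph, embed the data with the graph Laplacian, then run k-means on the embedding) can be illustrated with a minimal sketch. This is not the dissertation's algorithm: it assumes Gaussian kernels with hand-picked bandwidths, fixed uniform kernel weights, and a plain k-nearest-neighbor graph, whereas the proposed methods learn the kernel weights and the neighbors adaptively and jointly with the clustering structure. All function and parameter names below are hypothetical.

```python
import numpy as np


def rbf_kernel(X, gamma):
    # Gaussian kernel from pairwise squared Euclidean distances.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    return np.exp(-gamma * d2)


def combined_kernel(X, gammas, weights=None):
    # Weighted sum of base kernels. Uniform weights are an assumption of
    # this sketch; the dissertation learns the weights instead.
    kernels = [rbf_kernel(X, g) for g in gammas]
    if weights is None:
        weights = np.full(len(kernels), 1.0 / len(kernels))
    return sum(w * K for w, K in zip(weights, kernels))


def knn_similarity(K, k):
    # Sparse similarity: keep each point's k largest kernel values
    # (excluding itself), then symmetrize. A fixed k stands in for the
    # adaptive neighbor assignment described in the abstract.
    n = K.shape[0]
    K_off = K.copy()
    np.fill_diagonal(K_off, -np.inf)
    idx = np.argsort(-K_off, axis=1)[:, :k]
    rows = np.repeat(np.arange(n), k)
    S = np.zeros_like(K)
    S[rows, idx.ravel()] = K[rows, idx.ravel()]
    return 0.5 * (S + S.T)


def spectral_embedding(S, n_clusters):
    # Eigenvectors of the normalized Laplacian for the smallest eigenvalues,
    # with rows scaled to unit length.
    d = S.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    L = np.eye(len(S)) - d_inv_sqrt[:, None] * S * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)  # eigenvalues come back in ascending order
    F = vecs[:, :n_clusters]
    return F / np.maximum(np.linalg.norm(F, axis=1, keepdims=True), 1e-12)


def kmeans(F, n_clusters, n_iter=100, seed=0):
    # Lloyd's algorithm with farthest-point initialization.
    rng = np.random.default_rng(seed)
    centers = [F[rng.integers(len(F))]]
    while len(centers) < n_clusters:
        d2 = np.min([((F - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(F[np.argmax(d2)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = ((F[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = np.argmin(dist, axis=1)
        new_centers = np.array([
            F[labels == c].mean(axis=0) if np.any(labels == c) else centers[c]
            for c in range(n_clusters)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


def multi_kernel_spectral_clustering(X, n_clusters, gammas=(0.1, 1.0, 10.0), k=10):
    K = combined_kernel(X, gammas)
    S = knn_similarity(K, k)
    return kmeans(spectral_embedding(S, n_clusters), n_clusters)
```

Calling `multi_kernel_spectral_clustering(X, n_clusters=2)` on an `(n_samples, n_features)` array returns one integer label per row. Replacing the uniform `weights` and the fixed `k` with learned quantities is where the methods summarized above depart from this baseline.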
Keywords/Search Tags: Adaptive neighbors, Multi-kernel, Similarity measure, Spectral clustering, Ensemble