Research Of Incorporating Side Information Into Multivariate IB Method For Multi-view Clustering

Posted on:2015-06-24

Degree:Master

Type:Thesis

Country:China

Candidate:R N Liu

Full Text:PDF

GTID:2298330431995526

Subject:Computer software and theory

Abstract/Summary:

Data in most current real-world applications is complex and high-dimensional, and often admits multiple reasonable clusterings. Analyzing the data from different views helps us understand it more comprehensively. However, traditional clustering algorithms focus on learning a single good clustering solution, which makes it difficult to interpret complex data accurately. This issue has recently led to the emerging research area of multi-view clustering, which tries to discover the multiple clustering solutions residing in the data. Existing multi-view clustering algorithms have several problems: they either cannot incorporate known clustering partitions or can incorporate only one, they apply to limited types of data, and they require parameters that are not easy to choose in advance. To solve these problems, this thesis incorporates side information into the multivariate information bottleneck (IB) method and proposes a new objective-function-oriented multi-view clustering algorithm, named SmIB, which iteratively discovers multiple non-redundant, high-quality clustering solutions given one or more existing clustering partitions. The SmIB algorithm takes the known reference clustering partitions as side information and incorporates that information into the multivariate IB method. On the one hand, following the basic idea of the multivariate IB method, it uses two Bayesian networks to specify the trade-off terms (which variables to compress and which information terms to maintain) and preserves as much of the relevant feature information of the data as possible during clustering, which yields high-quality clustering solutions. On the other hand, it takes the known data partitions as side information and integrates them into the Bayesian networks to constrain the target clustering results, so that the target clustering solutions are non-redundant with respect to the existing clustering partitions.
The SmIB algorithm uses mutual information and the nonparametric MeanNN differential entropy estimator to measure the preserved relevant information, which makes it suitable for analyzing both co-occurrence data and Euclidean space data. In addition, the SmIB algorithm can discover both linear and non-linear clustering partitions residing in the data. Experimental results on synthetic, co-occurrence, and Euclidean space datasets demonstrate that the SmIB algorithm effectively discovers multiple reasonable clustering solutions in different types of data. Its performance is superior to that of existing state-of-the-art traditional clustering algorithms and three existing multi-view clustering algorithms.
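
The two relevance measures named in the abstract can be illustrated with a small sketch. This is not the thesis's implementation: the function names `mutual_information` and `meannn_entropy` are hypothetical, the mutual information is computed directly from a co-occurrence count table, and the MeanNN estimate is written in its pairwise log-distance form with the additive constant dropped (an assumption made here, so the value is only meaningful for comparing point sets of the same size and dimension).

```python
import math


def mutual_information(joint):
    """Mutual information I(X;Y) in nats from a co-occurrence count table.

    `joint` is a list of rows; joint[i][j] counts how often x_i and y_j co-occur.
    """
    total = sum(sum(row) for row in joint)
    px = [sum(row) / total for row in joint]
    py = [sum(row[j] for row in joint) / total for j in range(len(joint[0]))]
    mi = 0.0
    for i, row in enumerate(joint):
        for j, count in enumerate(row):
            if count > 0:  # zero cells contribute nothing to the sum
                pxy = count / total
                mi += pxy * math.log(pxy / (px[i] * py[j]))
    return mi


def meannn_entropy(points):
    """MeanNN-style differential entropy estimate, up to an additive constant.

    Computes d / (n (n - 1)) * sum over i != j of log ||x_i - x_j||.
    Assumption: all points are distinct, otherwise log(0) raises ValueError.
    """
    n = len(points)
    d = len(points[0])
    log_sum = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            # Each unordered pair appears twice in the sum over i != j.
            log_sum += 2.0 * math.log(math.dist(points[i], points[j]))
    return d * log_sum / (n * (n - 1))
```

Under these assumptions, a perfectly aligned co-occurrence table such as `[[1, 0], [0, 1]]` yields log 2 nats of mutual information, and a tightly packed point set yields a lower `meannn_entropy` value than a widely spread one of the same size.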

Keywords/Search Tags:

multi-view clustering, side information, multivariate IB method, mutual information, MeanNN differential entropy estimator

Related items

1	The Research On Side Information Generation Technology For Multi-view Distributed Video Coding
2	The Research On Side Information Generation For Distributed Multi-view Video Coding
3	Research On Improved Multi-view Fuzzy C-Means Clustering Based On Information Entropy And Multiple Kernel Learning
4	Research On Mutual Information Hierarchical Clustering Based On Grassberger Entropy Estimator
5	Side Information Fusion And Reconstruction For Distributed Video Coding
6	Research On Side Information Generation And Fusion Algorithm For Multi-view Distributed Video Coding
7	Multi-feature Clustering Based On Multivariate Information Bottleneck Method
8	Research On Multi-information Fusion Of Distributed Sensors In Two-phase Flow Based On Complex Network Theory
9	Research On Multi-view Learning Under Complex Application Situations
10	Study On The Balance System Based On The Multivariate Multiscale Entropy