Research On Topology Relation-based Distance Metric And Clustering Algorithms

Posted on:2018-06-04

Degree:Master

Type:Thesis

Country:China

Candidate:J Y Guang

Full Text:PDF

GTID:2348330536987924

Subject:Computer Science and Technology

Abstract/Summary:

Cluster analysis is an important part of machine learning and has attracted considerable research attention. The distance metric is a key factor in the accuracy of clustering algorithms. Traditional clustering algorithms often use Euclidean distance to measure the similarity between two samples and to partition the sample set. Although Euclidean distance is easy to understand and implement, it assumes that the input space is isotropic, an assumption that is too strict to be guaranteed in practice. In addition, Euclidean distance considers only the two samples being compared and ignores the information carried by all other samples. In this thesis, we propose two new distance metrics that can discover the topological relationships among samples. The new methods do not require the input space to be isotropic; that is, the distances between two samples can be unequal (the distance from one sample to the other need not match the distance in the reverse direction). The main contributions of this thesis are summarized as follows.

First, a new effective distance metric based on sparse reconstruction is proposed. The method evaluates the similarity between two samples using not only the distance between those two samples but also the distances between one specific sample and all other related samples. Sparse reconstruction coefficients are employed to reflect this global relationship among samples. We then develop four effective-distance-based clustering algorithms by applying the effective distance to four classical clustering algorithms: K-means, K-medoids, FCM, and spectral clustering. Experimental results on UCI benchmark datasets demonstrate the efficacy of the proposed methods.

Second, a novel spectral clustering method with mixed Euclidean and Kendall Tau metrics is proposed. The method considers both the similarity between pairs of samples and the similarity between their neighbors in order to learn the underlying structure of the dataset. Specifically, the new similarity metric is produced by a fusion algorithm that outputs an enhanced metric by combining multiple metrics, namely the Euclidean metric and the Kendall Tau metric. Moreover, the method uses a non-linear fusion of the different similarity metrics to examine the dataset from different aspects, and can therefore make effective use of different information in the data structure. Experimental studies on various datasets demonstrate that the proposed approach outperforms conventional methods.

Overall, experiments on various datasets demonstrate that the two proposed distance metrics are effective and can improve the accuracy of clustering algorithms.
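
The two base metrics named in the second contribution can be illustrated with a small sketch. The abstract does not say how the Kendall Tau metric is computed on samples or how the non-linear fusion works, so the code below is only an assumption: it uses the normalized Kendall tau distance over the ordering of feature values, and the function names `euclidean_distance` and `kendall_tau_distance` are hypothetical. It does not implement the fusion step.

```python
import math
from itertools import combinations


def euclidean_distance(x, y):
    """Straight-line distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def kendall_tau_distance(x, y):
    """Fraction of feature pairs that x and y order in opposite directions.

    0.0 means the two samples rank their features identically,
    1.0 means the rankings are fully reversed.
    Assumption of this sketch: tied pairs are not counted as discordant.
    """
    pairs = list(combinations(range(len(x)), 2))
    if not pairs:  # fewer than two features: no ordering to compare
        return 0.0
    discordant = sum(
        1 for i, j in pairs if (x[i] - x[j]) * (y[i] - y[j]) < 0
    )
    return discordant / len(pairs)


# Illustrative samples (made up for this sketch, not taken from the thesis).
a = [1.0, 2.0, 3.0, 4.0]
b = [10.0, 20.0, 30.0, 40.0]  # same feature ordering as a, but far from it
c = [1.5, 1.0, 3.5, 3.0]      # close to a, but two feature pairs are swapped

for name, other in (("b", b), ("c", c)):
    print(
        name,
        round(euclidean_distance(a, other), 3),
        round(kendall_tau_distance(a, other), 3),
    )
```

In this toy data, `b` is far from `a` in Euclidean terms yet has a Kendall tau distance of zero, while `c` is close to `a` in Euclidean terms yet disagrees with it on some feature orderings. This kind of complementary information is a plausible reason for combining the two metrics.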

Keywords/Search Tags:

Clustering analysis, distance metric, effective distance, Kendall Tau distance, similarity fusion

Related items

1	Research On Distance Margin-based Deep Discriminative Clustering Methods
2	Research On Fuzzy Clustering Algorithm Based On Distance Metric
3	Metric Learning And Clustering Algorithm Based On Transitive Distance
4	Implementation Of Hotspot Clustering Method Based On Improved Tangent Space Distance Metric
5	Research On Clustering Algorithm Of High Dimensional Data And Its Distance Metric
6	Semantic Distance Metric Learning And Its Application In Multimedia Content Analysis
7	The Study And Application Of Evolutionary Clustering Algorithm Based On Manifold Distance And Kernel Function
8	Research On Person Re-Identification Across Cameras
9	Target Similarity Measurement Algorithm Based On Wasserstein Distance
10	Research On Distance Metric Learning For Person Re-identification