
Research On Local Density Clustering Algorithm

Posted on: 2018-06-04
Degree: Master
Type: Thesis
Country: China
Candidate: X Z Ye
Full Text: PDF
GTID: 2348330518976635
Subject: Information and Communication Engineering
Abstract/Summary:
With the continuing development of computer and communication technology, the volume of data accumulated across all sectors is growing rapidly. Data mining, which extracts patterns and knowledge from massive amounts of data, has broad application prospects. Cluster analysis is an important data mining method and has drawn considerable attention from researchers worldwide. Because prior knowledge is widely available in practical applications, semi-supervised clustering uses it, in the form of cluster labels or pair-wise constraints, to guide a traditional unsupervised clustering method with a small amount of supervised data and thereby obtain more accurate results.

The Local Density Clustering (LDC) algorithm is a fast and efficient clustering algorithm published in Science in 2014 by Rodriguez and Laio. LDC can detect clusters of arbitrary shape and assigns the points that are not cluster centers in a single step, without further iteration. However, LDC still needs improvement in two respects. First, it is an unsupervised learning algorithm and ignores any prior knowledge that is available. Second, it cannot determine the number of clusters or detect the cluster centers automatically.

To address these two issues, this dissertation carries out exploratory research. Based on LDC, a new Semi-Supervised Local Density Clustering (SLDC) algorithm is proposed. The algorithm first uses a small number of pair-wise constraints to adjust the distance matrix of the points in LDC, and then improves the point assignment process by using same-cluster elimination so that the assignment satisfies the pair-wise constraints, which effectively solves the problem of constraint violation. Then, because manually specified cluster centers can reduce clustering accuracy on some special data sets, a Semi-Supervised Local Density Clustering with Automatic Recognition of Cluster Centers (Auto-SLDC) algorithm is introduced. Auto-SLDC uses a difference expansion method to widen the gap between potential cluster centers and the other points, which allows the cluster centers to be recognized automatically and avoids the errors caused by subjective selection. Finally, simulations verify the validity of Auto-SLDC and indicate that the proposed algorithm effectively solves the problem of constraint violation and significantly improves on the clustering performance of LDC.
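For readers unfamiliar with the baseline, the following is a minimal sketch of the unsupervised LDC (density peaks) procedure of Rodriguez and Laio, not the SLDC or Auto-SLDC algorithms of this dissertation. The function name `density_peaks`, the cutoff parameter `d_c`, the simple cutoff-count density, and the rule of taking the `n_clusters` points with the largest product of density and distance as centers are assumptions made for illustration. Note that `n_clusters` is supplied by hand, which is the limitation the abstract refers to.

```python
import numpy as np

def density_peaks(points, d_c, n_clusters):
    """Illustrative density-peaks clustering; returns (labels, center_indices)."""
    points = np.asarray(points, dtype=float)
    n = len(points)
    # Pairwise Euclidean distance matrix.
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    # Local density rho_i: number of other points closer than the cutoff d_c.
    rho = (dist < d_c).sum(axis=1) - 1
    # Indices from highest to lowest density (stable sort breaks ties by index).
    order = np.argsort(-rho, kind="stable")
    delta = np.zeros(n)                  # distance to the nearest denser point
    nearest_denser = np.full(n, -1)      # index of that point
    for rank, i in enumerate(order):
        if rank == 0:
            # The densest point has no denser neighbor; use its largest distance.
            delta[i] = dist[i].max()
            continue
        denser = order[:rank]
        j = denser[np.argmin(dist[i, denser])]
        delta[i] = dist[i, j]
        nearest_denser[i] = j
    # Assumed center rule: the n_clusters points with the largest rho * delta.
    gamma = rho * delta
    centers = np.argsort(-gamma, kind="stable")[:n_clusters]
    labels = np.full(n, -1)
    labels[centers] = np.arange(n_clusters)
    # One pass in decreasing density: each remaining point takes the label
    # of its nearest denser neighbor, which has already been labeled.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[nearest_denser[i]]
    return labels, centers
```

The single assignment pass at the end corresponds to the "one step without any further iteration" property described above. A semi-supervised variant would additionally modify `dist` and the assignment pass using the pair-wise constraints; those details are not given in the abstract and are not sketched here.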
Keywords/Search Tags: clustering analysis, local density, semi-supervised clustering, pair-wise constraints, automatic recognition