
Research On Key Issues Of IB Clustering Algorithm

Posted on: 2017-01-04
Degree: Master
Type: Thesis
Country: China
Candidate: Y H Wang
Full Text: PDF
GTID: 2308330482487193
Subject: Software engineering
Abstract/Summary:
With the rapid development of networks and multimedia, the volume of collected data keeps growing. The sheer amount and the complex types of data lead people to replace manual processing with machine learning. Clustering is a means of data analysis that is widely used in pattern recognition and data mining. Clustering algorithms draw on contributions from fields including data mining, statistics, machine learning, finance, and marketing, and the invention and improvement of clustering algorithms has become a very active research topic.
Traditional clustering algorithms such as fuzzy c-means (FCM) have several issues: they are sensitive to initial conditions and easily get trapped in local minima. The IB (Information-based) clustering algorithm was proposed to alleviate these defects to a large extent. It formalizes the clustering problem from the viewpoint of information theory and avoids defining class prototypes, so it adapts better to data sets of different shapes. Its design is consistent with the traditional definition of clustering while also treating clustering as data compression in the information-theoretic sense. As a result, the algorithm can automatically find nonlinear relationships in the data without a predefined form for the classes, and it also reduces, to a certain extent, the influence of initialization on the final clustering result.
For a widely used clustering algorithm such as IB clustering, the convergence rate directly affects its applicability, so judging the convergence rate on different data sets is very important. However, a reliable method for judging the convergence of the clustering algorithm is currently lacking.
Based on the linear approximation given by the Jacobian matrix at a fixed point, this thesis proposes a method for judging the convergence and the convergence rate of the IB clustering algorithm, and supports it with a theoretical proof and experimental validation. The method judges convergence at the converged point from the spectral radius of the Jacobian matrix of the algorithm's objective function at that point, which gives a theoretical basis for selecting parameters. The thesis also analyzes the influence of the parameters on the convergence rate of the IB clustering algorithm and gives advice on parameter selection for real data sets.
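The spectral-radius criterion can be illustrated with a small sketch. Near a fixed point x*, an iterative update behaves approximately like its Jacobian J at x*, so the error shrinks when the spectral radius of J is below 1, and that radius approximates the asymptotic linear convergence rate. The code below is only an illustration under stated assumptions: `toy_update` is a hypothetical two-dimensional map with a known fixed point at the origin, not the Iclust update from the thesis, and the helper names `numerical_jacobian`, `spectral_radius`, and `empirical_rate` are invented for this example.

```python
import numpy as np


def numerical_jacobian(update, x, eps=1e-6):
    """Central-difference Jacobian of the map `update` evaluated at `x`."""
    x = np.asarray(x, dtype=float)
    n = x.size
    jac = np.zeros((n, n))
    for j in range(n):
        step = np.zeros(n)
        step[j] = eps
        # Column j holds the sensitivity of every output to input j.
        jac[:, j] = (update(x + step) - update(x - step)) / (2.0 * eps)
    return jac


def spectral_radius(matrix):
    """Largest eigenvalue magnitude of a square matrix."""
    return float(np.max(np.abs(np.linalg.eigvals(matrix))))


# Hypothetical stand-in for one sweep of an iterative clustering update.
# It is NOT the Iclust update; it only has a known fixed point at the origin.
_A = np.array([[0.5, 0.2],
               [0.1, 0.3]])


def toy_update(x):
    return _A @ np.tanh(x)


def empirical_rate(update, x0, x_star, steps=40):
    """Ratio of the last two error norms after iterating `update`."""
    x = np.asarray(x0, dtype=float)
    errors = []
    for _ in range(steps):
        x = update(x)
        errors.append(np.linalg.norm(x - x_star))
    return float(errors[-1] / errors[-2])


x_star = np.zeros(2)  # fixed point of toy_update, since tanh(0) = 0
rho = spectral_radius(numerical_jacobian(toy_update, x_star))
rate = empirical_rate(toy_update, [1.0, 1.0], x_star)
converges_locally = rho < 1.0
```

In this toy setting, `rho` is the predicted rate and `rate` is the observed one; the two should agree closely once the iterates are near the fixed point. For a real clustering run, the fixed point would be the converged assignment and the Jacobian would depend on the algorithm parameters, which is where parameter selection enters.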
Keywords/Search Tags:Clustering analysis, Information-based Clustering, Iclust algorithm, Jacobian matrix, Convergence analysis, Parameters analysis