Research on Fuzzy Clustering Methods Based on Information Granules

Posted on: 2019-09-12
Degree: Doctor
Type: Dissertation
Country: China
Candidate: L Y Zhang
Full Text: PDF
GTID: 1368330542472765
Subject: Control theory and control engineering
Abstract/Summary:
With the advent of the data-intensive era, it has become increasingly important to use data effectively and to release its derivative value. Clustering, as an important method of data analysis, can be used to mine the underlying structure of data in order to gain more detailed insight into the data, generate hypotheses, and discover laws. It can also be used to produce natural classifications and to achieve data compression. Research on data clustering is therefore of great significance. In this PhD thesis, the fuzzy C-means (FCM) algorithm is taken as the representative fuzzy clustering algorithm, and three main problems are discussed: 1) the uncertainty of attribute weighting and weighted fuzzy clustering; 2) the uncertainty of missing-value imputation and fuzzy clustering of incomplete data; 3) the dual relationship between data and cluster prototypes, and the extension of the FCM algorithm for complete data that follows from it. The description of the uncertain factors in these problems is taken as the point of breakthrough, and several clustering models and algorithms are proposed within the conceptual framework of information granules. The main work of this PhD thesis is summarized below.

(1) For the uncertainty of attribute weighting in weighted fuzzy clustering, attribute weights are described as interval information granules, and an interval weighted FCM clustering model that treats attribute weights as variables constrained by intervals is proposed. To solve the clustering model, two kinds of algorithms, a human-computer cooperation algorithm and a genetic-gradient hybrid algorithm, are proposed on the basis of a tri-level alternating iterative structure over cluster prototypes, memberships, and attribute weights. Experimental results reveal that the proposed methods can further tune the traditional constant weights, and that interval constraints on attribute weights help keep the iterative calculation from falling into an inappropriate local minimum.

(2) For the uncertainty of missing-value imputation in incomplete data clustering, missing values are also described as interval information granules. First, missing values are treated as variables with interval constraints and, because of the structural similarity between the interval imputed clustering model and the interval weighted clustering model, an analogous solution framework is established for the two. In addition, the case in which the interval-type imputations of missing values are regarded as constant interval numbers is examined, and an interval kernel-based FCM clustering algorithm for incomplete data is proposed by means of the clustering model for interval data and the kernel method from machine learning. The algorithm yields interval-type prototypes with granulation characteristics and helps improve the partition accuracy for incomplete data.

(3) The uncertainty of missing-value imputation in incomplete data clustering is considered further. Missing values that follow a Gaussian distribution are described as probabilistic information granules by using the nearest neighbors of incomplete data and non-parametric hypothesis testing. On this basis, the probabilistic information granules of missing values are incorporated into the FCM clustering of incomplete data through the maximum likelihood criterion, and a corresponding algorithm with tri-level alternating optimization of cluster prototypes, memberships, and missing values is developed. A dual relationship, in which cluster prototypes and missing values can each be expressed in terms of the other, is also uncovered. Experimental results show that the probabilistic information granules of missing values play an effective guiding role in incomplete data clustering.

(4) To introduce into the FCM clustering of complete data a dual relationship similar to the mutual expression between missing values and cluster prototypes in incomplete data clustering, the concept of reconstructed data supervised by the neighborhood granules of the original data is naturally derived. A new FCM clustering model is proposed that treats the reconstructed data as variables and lets them participate directly in the clustering iterations. The behavior of the algorithm's parameter is studied by combining approximate analysis with experiments based on a reconstruction deviation index. Theoretical and experimental studies show that, for appropriate parameter values, the reconstructed data are usually more compact around their respective cluster prototypes than the original data, which makes the cluster structure easier to capture. In addition, a case study on monitoring data from shield construction is presented. It demonstrates the superiority of the proposed algorithm in terms of the interpretability of the clustering results and the representativeness of the cluster prototypes.
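
For readers unfamiliar with the baseline that the thesis extends, the following is a minimal sketch of the standard FCM iteration, alternating between prototype and membership updates. It is not the thesis's interval weighted model: the function name `fcm`, its parameters, and the optional `weights` argument are assumptions made for illustration, and the attribute weights here are fixed positive constants rather than variables constrained by intervals.

```python
import random


def fcm(data, c, m=2.0, weights=None, iters=100, tol=1e-6, seed=0):
    """Standard fuzzy C-means with optional fixed attribute weights.

    data: list of equal-length lists of floats; c: number of clusters;
    m: fuzzifier (> 1); weights: one positive weight per attribute.
    Returns (prototypes, memberships); memberships[k][i] is the degree
    to which point k belongs to cluster i.
    """
    n, dim = len(data), len(data[0])
    w = weights if weights is not None else [1.0] * dim
    rng = random.Random(seed)
    # Random initial memberships, each row normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        s = sum(row)
        u.append([r / s for r in row])
    protos = []
    for _ in range(iters):
        # Prototype update: mean of the data weighted by u^m.
        protos = []
        for i in range(c):
            um = [u[k][i] ** m for k in range(n)]
            total = sum(um)
            protos.append([sum(um[k] * data[k][j] for k in range(n)) / total
                           for j in range(dim)])
        # Membership update from weighted squared Euclidean distances.
        shift = 0.0
        for k in range(n):
            d2 = [sum(w[j] * (data[k][j] - protos[i][j]) ** 2
                      for j in range(dim)) for i in range(c)]
            zeros = [i for i in range(c) if d2[i] == 0.0]
            if zeros:
                # Point coincides with a prototype: share membership there.
                new = [1.0 / len(zeros) if i in zeros else 0.0
                       for i in range(c)]
            else:
                inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
                s = sum(inv)
                new = [v / s for v in inv]
            shift = max(shift, max(abs(a - b) for a, b in zip(new, u[k])))
            u[k] = new
        # Stop once no membership changed by more than tol.
        if shift < tol:
            break
    return protos, u
```

Calling `fcm(points, 2, weights=[1.0, 0.5])` on a list of two-attribute points would down-weight the second attribute in the distance. In the thesis, by contrast, the weights are themselves optimized within interval bounds as a third level of the alternating iteration.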
Keywords/Search Tags: Fuzzy Clustering, Attribute Weighting, Incomplete Data, Reconstructed Data, Information Granules