
Rough Clustering And Granulation Analysis In Uncertain Information And Its Application

Posted on: 2019-07-19
Degree: Master
Type: Thesis
Country: China
Candidate: R Q Lu
GTID: 2428330572455294
Subject: Software engineering
Abstract/Summary:
As an unsupervised classification method, clustering can partition data into clusters without prior knowledge and reveal common features and useful information in the data. However, real data usually contain a large amount of uncertain information, and the characteristics of many data objects are unclear, so analyzing such data with traditional hard clustering algorithms may cause many classification errors. Clustering algorithms that deal with uncertain information therefore deserve in-depth study. Moreover, when data with uncertain information are analyzed, the information granules generated by existing information granulation algorithms often overlap one another. This obscures the semantics of the information granules and affects the solution of follow-up problems. Rough clustering algorithms assign data objects of uncertain membership to boundary regions, which gives them a clear advantage over traditional hard clustering algorithms in analyzing data with uncertain information. How to describe uncertain information according to the characteristics of data objects, and how to develop effective rough clustering algorithms and information granulation algorithms based on rough clustering, remain active research topics. Our research proceeds in the following order: first, efficient rough k-means clustering algorithms; second, information granulation algorithms based on rough clustering; third, the application of the information granulation algorithm based on rough clustering in industrial data analysis. Efficient rough clustering algorithms for uncertain information and rough clustering based granulation algorithms are studied in depth, and the application of rough clustering based granulation algorithms to the ethylbenzene production process is then explored. The research work can be divided into the following four aspects:

(1) Improved πRKM clustering algorithm based on local fuzzy enhancement of the boundary region. The central concern of the rough k-means (RKM) algorithm and its derivatives is how to measure and process the data objects in the boundary regions. The traditional RKM algorithm is sensitive to the choice of the weight coefficients of the upper and lower approximations. Meanwhile, the impact of non-competitive objects in the boundary regions on the partitioning results grows as the number of overlapping clusters increases. The πRKM algorithm, which introduces Laplace's principle of indifference to measure the objects in boundary regions, solves these problems well. However, it does not consider the degree of overlap in the boundary regions or the spatial distribution of different boundary objects. To better describe data objects in boundary regions, a local fuzzy measurement is introduced, and an improved πRKM clustering algorithm based on local fuzzy enhancement of the boundary region is developed.

(2) Interval type-2 fuzzy measure based rough k-means clustering. The existing RKM algorithm and its derivatives focus on describing data objects in uncertain boundary regions but ignore the effect of imbalanced cluster sizes on the clustering result. Smaller clusters are minority classes with fewer samples in the data set. Their means are easily affected by boundary regions that overlap with larger clusters, so the traditional RKM algorithm is limited in handling data sets with imbalanced classes. An interval type-2 fuzzy measure is introduced in this work to measure the boundary objects, and on this basis an improved rough k-means clustering algorithm is developed. First, the membership degree interval of each boundary object is calculated according to the data distribution of the clusters, in order to describe the spatial distribution of the clusters. Then the sample size of each cluster is taken into account to adaptively adjust the influence coefficient of boundary objects on overlapping clusters. This mitigates the adverse impact of boundary objects on the iterative computation of the means of small clusters and improves clustering accuracy.

(3) Boundary fuzzified rough k-means based information granulation algorithm under the principle of justifiable granularity. When the boundaries of clusters in a data set overlap heavily, the information granules that existing clustering-granulation algorithms form from these clusters also overlap severely. To translate data with uncertain information into information granules that are clearly separated from each other, a local fuzzy measurement is introduced into rough k-means clustering and a refined parametric version of the principle of justifiable granularity is proposed. On this basis, a boundary fuzzified rough k-means based information granulation algorithm under the principle of justifiable granularity is proposed.

(4) Application of the rough clustering based information granulation algorithm to the ethylbenzene production process. The boundary fuzzified rough k-means based information granulation algorithm under the principle of justifiable granularity is applied to the data analysis of the ethylbenzene production process. A simulation of ethylbenzene production is built with the Aspen Plus software, and the data generated by the simulation are analyzed with the proposed algorithm. Some potential knowledge is discovered about the relationship between the yield of ethylbenzene, the temperature of the transalkylation reaction, the temperature of the alkylation reaction, and the purity of ethylbenzene. Some suggestions are then given for the ethylbenzene production process...
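
To make the roles of the lower approximation, the boundary region, and the weight coefficients concrete, the following is a minimal sketch of a classic rough k-means iteration with a relative distance threshold. It is not the improved algorithms developed in the thesis: no fuzzy or interval type-2 measure is applied to the boundary objects. The function names and the parameters `w_lower`, `threshold`, `n_iter`, and `seed` are illustrative assumptions, not names from the thesis.

```python
import numpy as np


def approximations(X, centers, threshold):
    """Return boolean (n, k) masks for lower approximations and boundary regions."""
    # Distance of every object to every center, shape (n, k).
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    d_min = d.min(axis=1, keepdims=True)
    # An object joins the upper approximation of every cluster whose center
    # is at most `threshold` times as far away as its nearest center.
    upper = d <= threshold * np.maximum(d_min, 1e-12)
    # Objects in exactly one upper approximation are certain members (lower
    # approximation); all others lie in the boundary of several clusters.
    single = upper.sum(axis=1) == 1
    lower = upper & single[:, None]
    boundary = upper & ~single[:, None]
    return lower, boundary


def rough_kmeans(X, k, w_lower=0.7, threshold=1.3, n_iter=50, seed=0):
    """Classic rough k-means sketch (hypothetical parameter names and defaults)."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    w_upper = 1.0 - w_lower  # weight given to the boundary region
    for _ in range(n_iter):
        lower, boundary = approximations(X, centers, threshold)
        new_centers = centers.copy()
        for j in range(k):
            lo, bd = X[lower[:, j]], X[boundary[:, j]]
            if len(lo) and len(bd):
                # Weighted mix of the certain members and the boundary objects.
                new_centers[j] = w_lower * lo.mean(axis=0) + w_upper * bd.mean(axis=0)
            elif len(lo):
                new_centers[j] = lo.mean(axis=0)
            elif len(bd):
                new_centers[j] = bd.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # Recompute the masks so they match the returned centers.
    lower, boundary = approximations(X, centers, threshold)
    return centers, lower, boundary
```

In this sketch every boundary object contributes equally to each cluster it overlaps, and the result depends on the fixed weights `w_lower` and `1 - w_lower`. These are the two points the abstract identifies as weaknesses of the traditional algorithm.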
Keywords/Search Tags:Uncertain information, Rough sets, K-means clustering, Fuzzy measurement, Information granulation