Research And Application Of Rough Clustering Methods Of Mixed Attribute Data With Self-adaptive Cluster Adjustment

Posted on: 2021-02-11
Degree: Master
Type: Thesis
Country: China
Candidate: X T Zhang
Full Text: PDF
GTID: 2428330614461611
Subject: Computer software and theory
Abstract/Summary:
As an unsupervised learning method, cluster analysis is an effective tool for data analysis: it groups unlabeled data objects and uncovers the latent structure of the data. Most data encountered in real applications are mixed attribute data, containing both numerical and categorical attributes. In addition, such data often carry considerable uncertainty. Many data objects in the overlapping regions between clusters have no clear-cut characteristics and cannot simply be assigned to a single cluster. When traditional clustering algorithms are applied to such data, the results contain many misassigned objects. It is therefore of great significance to study clustering methods for mixed attribute data in combination with theories that can handle uncertain information.

In practical applications, however, most existing clustering algorithms require the number of clusters to be given in advance, and an unreasonable subjective choice reduces clustering accuracy and degrades the performance of the algorithm. Furthermore, random selection of the initial cluster centers reduces the stability and efficiency of the algorithm. Determining the optimal number of clusters and adjusting the initial cluster centers in a reasonable, efficient, and self-adaptive way are therefore open problems in current clustering research. Moreover, most current research on clustering algorithms for mixed data focuses only on compactness within clusters and ignores the importance of separation between clusters. How to ensure that clustering results have both high within-cluster density and good between-cluster separation is also a research hotspot for mixed attribute data clustering.

The research proceeds in the following order: research on a soft clustering algorithm with self-adaptive cluster adjustment, then research on a rough clustering method for mixed attributes based on between-cluster information, then application of mixed attribute rough clustering to data analysis in actual grain and oil processing. Clustering methods for mixed attribute data with self-adaptive cluster adjustment are studied in depth, and the application of the proposed clustering algorithm to the production process of undecylenic acid methyl ester is then explored. The main work of this thesis is as follows:

(1) Rough Fuzzy K-Means Clustering Algorithm Based on Mixed Metrics and Cluster Adaptive Adjustment
The rough K-Means (RKM) algorithm and its derivatives require the number of clusters ahead of time and select the initial cluster centers randomly, which results in low accuracy when partitioning data in the overlapping regions of clusters. To solve these problems, a rough fuzzy K-Means clustering algorithm with adaptive adjustment of clusters is proposed. When calculating the membership degree of data objects in the intersection area of cluster boundaries with respect to different clusters, the algorithm takes into account a mixed metric of local density and distance. Meanwhile, to obtain the optimal number of clusters, a robust learning-based strategy is adopted to adjust the number of clusters adaptively. To make the selection of initial cluster centers more reasonable, the midpoint of the two closest samples in the dense area of the data is selected as an initial cluster center, objects whose local density is higher than the average density are assigned to that cluster, and the remaining initial cluster centers are then selected in the same way. Comparative experiments on synthetic datasets and UCI datasets demonstrate the advantages of the algorithm in adaptability and clustering accuracy when dealing with spherical clusters with blurred boundaries.

(2) Rough K-prototypes Clustering Algorithm Based on OTC Similarity and Between-Cluster Information for Mixed Data
In the iterative updating of cluster centers and the partition matrix, most existing mixed data clustering algorithms consider only the information within each cluster and ignore the information between clusters, resulting in low cluster separation. In view of this, a rough mixed data clustering algorithm based on between-cluster frequency division information is proposed. When measuring the similarity between data objects and clusters, a unified OTC similarity for mixed attribute data is used, which avoids the conversion between categorical and numerical attributes and the parameter tuning required by traditional mixed attribute clustering algorithms. The between-cluster frequency division information is added to the iteration process to ensure both the intra-cluster density and the inter-cluster separation of the clustering results. The validity of the algorithm is verified by comparative analysis over several groups of experiments.

(3) Application of Rough Clustering Analysis of Mixed Attribute Data in the Production of Undecylenic Acid Methyl Ester
The method for determining the initial cluster centers is combined with the rough K-prototypes clustering algorithm based on OTC similarity and between-cluster information, and the combined method is applied to the data analysis of undecylenic acid methyl ester products. A simulation of undecylenic acid methyl ester production is built with Aspen Plus. The potential relationships between raw material purity, preheating temperature, cracking temperature, material flow rate, antioxidant type, and product yield are explored, and recommendations for the undecylenic acid methyl ester production process are given on this basis.
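
The initial-center selection outlined in item (1) can be illustrated with a minimal Python sketch. It is not the thesis's implementation, and several details are assumptions made only for illustration: local density is taken as the number of other points within a fixed `radius`, the "dense area" is taken as the points whose density exceeds the average over the points not yet assigned, and a dense point joins the new cluster only if it lies within `radius` of the new center. The function names and the toy data are hypothetical, and the adaptive adjustment of the number of clusters is not covered.

```python
import math
from itertools import combinations


def local_density(points, radius):
    """Assumed density definition: count of other points within `radius`."""
    return [
        sum(1 for j, q in enumerate(points) if j != i and math.dist(p, q) <= radius)
        for i, p in enumerate(points)
    ]


def select_initial_centers(points, radius):
    """Pick initial centers as midpoints of the closest pair of dense points."""
    density = local_density(points, radius)
    remaining = set(range(len(points)))
    centers = []
    while len(remaining) >= 2:
        # Dense area: unassigned points with above-average local density.
        avg = sum(density[i] for i in remaining) / len(remaining)
        dense = [i for i in sorted(remaining) if density[i] > avg]
        if len(dense) < 2:
            break
        # Midpoint of the two closest dense points becomes a new center.
        i, j = min(
            combinations(dense, 2),
            key=lambda pair: math.dist(points[pair[0]], points[pair[1]]),
        )
        center = tuple((a + b) / 2 for a, b in zip(points[i], points[j]))
        centers.append(center)
        # Assumption: dense points within `radius` of the center join its cluster.
        absorbed = {k for k in dense if math.dist(points[k], center) <= radius}
        remaining -= absorbed | {i, j}
    return centers


# Toy data: two tight groups of four points and one isolated point.
points = [
    (0, 0), (0, 1), (1, 0), (1, 1),
    (10, 10), (10, 11), (11, 10), (11, 11),
    (5, 20),
]
centers = select_initial_centers(points, radius=1.5)
print(centers)
```

On this toy data the sketch returns one center per tight group and leaves the isolated point unassigned, since its local density never exceeds the average. In this sketch the number of centers found depends on the choice of `radius`.
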
Keywords/Search Tags: Rough Fuzzy Clustering, Mixed Metrics, Cluster Adaptive Adjustment, Mixed Data, Between-cluster Information