
Three-way Clustering Analysis Based On Mathematical Morphology

Posted on: 2020-08-11
Degree: Master
Type: Thesis
Country: China
Candidate: Q Liu
Full Text: PDF
GTID: 2428330590451021
Subject: Computer Science and Technology
Abstract/Summary:
In recent years, with the rapid development of information technologies such as the Internet, the Internet of Things, and cloud computing, people's lives have become more convenient, and communication among people and between people and their work has become more frequent. At the same time, the amount of data generated by human society is exploding. In the era of big data, data is characterized by high velocity, large volume, great variety, and low value density, which poses huge challenges to traditional data processing technology. Data mining technology emerged to extract the valuable information hidden in big data. As an important branch of data mining, cluster analysis can discover the internal structure of data sets and plays an important role in mining valuable information from data. However, traditional hard clustering algorithms use only one set to represent a cluster, which greatly limits how completely the internal structure of a data set can be represented. To relax this limitation and reveal a better structure of the data, many soft clustering methods have been proposed for different application backgrounds. As a special kind of soft clustering, three-way clustering incorporates the idea of three-way decision. In three-way clustering, a cluster is represented by a pair of sets called the core and the fringe of the cluster. Elements whose membership is certain are assigned to the core region, and uncertain elements are assigned to the fringe region, in order to reduce decision risk. In this thesis, by combining the ideas of erosion and dilation from mathematical morphology with the principles of three-way decision, we propose a framework for three-way clustering called CE3. The main contents of this thesis are organized in three parts:

(1) A CE3 framework based on mathematical morphology is proposed to transform two-way clustering results into three-way clustering results. The essential idea of CE3 is to contract and expand each cluster obtained from a hard clustering method by using a contraction operation and an expansion operation. The contraction operation shrinks a cluster so that a stronger relationship holds between objects in the contracted cluster. The expansion operation enlarges a cluster so that a weaker relationship holds between objects in the expanded cluster. The difference between the expanded and contracted clusters is regarded as the fringe region. Under the CE3 framework, different three-way clustering algorithms can be generated by selecting different structural operators.

(2) Under the CE3 framework, the q-neighborhood of data objects is used as the structural operator, and a three-way clustering algorithm is proposed. A traditional hard clustering algorithm can only assign an element to a single cluster and cannot distinguish points located on the boundary between two clusters. Three-way clustering offers a solution to this problem through its richer cluster structure. The algorithm uses the q-neighborhood of each data element as the structural operator and analyzes the relationship between that q-neighborhood and the data elements in adjacent clusters. The expanded region is obtained by expanding the two-way clustering results, and the core region is obtained by contracting them, so that the clustering result has better structural features. The experimental results show that the method improves both the structure and the accuracy of the clustering results.

(3) Under the CE3 framework, the neighborhood density of data objects is used as the structural operator, and a second three-way clustering algorithm is proposed. The algorithm uses the neighborhood density of each element to contract and expand the two-way clustering results and thereby obtain the three-way clustering results. It can identify not only the boundary points located between two clusters but also the points within a single cluster that are far from the cluster center. Experiments show that, compared with classical clustering algorithms, the method yields a smaller Davies-Bouldin index (DBI), a larger average silhouette coefficient, and higher accuracy.
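The contract-and-expand idea in part (2) can be illustrated with a minimal Python sketch. It is not the thesis's algorithm: the abstract does not give the exact contraction and expansion rules, so the rules below are assumptions chosen for illustration. Here a member stays in the core only if all of its q nearest neighbors share its hard label, and a point joins the expanded cluster if at least one of its q nearest neighbors carries that label. The names `q_neighbors` and `ce3` are hypothetical.

```python
from math import dist  # Euclidean distance, Python 3.8+


def q_neighbors(points, i, q):
    """Return the indices of the q points closest to points[i] (excluding i)."""
    others = (j for j in range(len(points)) if j != i)
    return sorted(others, key=lambda j: dist(points[i], points[j]))[:q]


def ce3(points, labels, q):
    """Turn hard labels into {cluster: (core, fringe)} sets of point indices.

    Assumed rules (illustrative only, not taken from the thesis):
      contraction: keep a member if all q neighbors share its label;
      expansion:   add any point with at least one q neighbor in the cluster.
    """
    neigh = [q_neighbors(points, i, q) for i in range(len(points))]
    result = {}
    for c in sorted(set(labels)):
        members = {i for i, lab in enumerate(labels) if lab == c}
        # Contraction (erosion-like): a stronger relationship must hold.
        core = {i for i in members if all(labels[j] == c for j in neigh[i])}
        # Expansion (dilation-like): a weaker relationship is enough.
        touching = {i for i in range(len(points))
                    if any(labels[j] == c for j in neigh[i])}
        expanded = members | touching
        # Fringe = expanded cluster minus contracted cluster.
        result[c] = (core, expanded - core)
    return result


# Six points on a line, split into two hard clusters of three points each.
points = [(0.0,), (1.0,), (2.0,), (3.0,), (4.0,), (5.0,)]
labels = [0, 0, 0, 1, 1, 1]
for cluster, (core, fringe) in ce3(points, labels, q=2).items():
    print(cluster, sorted(core), sorted(fringe))
```

In this toy input, the two points adjacent to the cluster boundary (indices 2 and 3) fall into the fringe of both clusters, while the remaining points form the cores. Swapping the neighbor test for a density test would give a variant in the spirit of part (3), again under assumed rules.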
Keywords/Search Tags:three-way decision, clustering, mathematical morphology, expansion, contraction