Study on Three-Way Decision Clustering Algorithms Based on Dynamic Random Projection for High-Dimensional Data

Posted on: 2018-05-14
Degree: Master
Type: Thesis
Country: China
Candidate: H B Zhang
GTID: 2348330569486443
Subject: Computer Science and Technology
Abstract/Summary:
In the era of big data, quickly and correctly mining the value of the information contained in data resources has become a central concern. Cluster analysis is widely used to identify structural information in massive high-dimensional data, so clustering methods for high-dimensional data have become a hot and difficult topic in clustering research. Besides the "curse of dimensionality" caused by the growing number of attributes, complex high-dimensional data in practical applications such as social networks, biological information processing and electronic commerce carry uncertain information, so there are three possible relationships between a data object and a cluster: the object definitely belongs to the cluster, may belong to the cluster, or does not belong to the cluster. To improve clustering accuracy and retain these uncertain relationships, a high-dimensional clustering algorithm should handle uncertain data, treat boundary points between clusters in an uncertain way, and describe the degree of membership within a cluster in detail. This thesis therefore proposes three-way decision clustering methods for high-dimensional data and its uncertainty.

First, a new three-way decision clustering model based on dynamic random projection for high-dimensional data is proposed. The model projects the original data into subspaces of different dimensions, taken in ascending order, and a three-way decision clustering algorithm produces a three-way clustering result in each subspace. The results of adjacent subspaces are compared and the better one is kept, and the objective function is evaluated until its value satisfies the stopping condition. When the algorithm stops, the clustering result represents a trade-off between clustering quality and computational cost.

Second, to verify the feasibility and effectiveness of the model, a three-way k-medoids dynamic clustering method based on random projection is proposed. In this method, a three-way decision clustering algorithm based on k-medoids is proposed and the objective function of the model is defined. A new method for setting the thresholds α and β is applied in the k-medoids-based algorithm: only one pair of parameters is set, from which the decision thresholds α and β of each cluster are calculated automatically, so that different clusters obtain different threshold pairs. This is more reasonable than setting a single pair of global decision thresholds when assigning a data object to the positive, boundary or negative region of a cluster.

Finally, an improved dynamic random projection three-way decision clustering method is proposed. It improves on the three-way k-medoids dynamic clustering method based on random projection: a three-way decision clustering algorithm is proposed, the objective function is redefined, and the problem that the dimension of the dynamic random projection cannot be adjusted automatically according to the clustering result is addressed. This method further verifies the effectiveness and feasibility of the model.

Experiments show that the two methods proposed in this thesis are effective. Compared with some traditional two-way decision clustering algorithms, the two three-way decision clustering methods can significantly improve clustering accuracy.
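
The two building blocks named in the abstract, random projection and three-way assignment to positive, boundary and negative regions, can be illustrated with a minimal Python sketch. This is not the thesis's algorithm: the function names, the Gaussian projection matrix, the inverse-distance membership score and the fixed global thresholds `alpha` and `beta` are all assumptions made for illustration. The abstract does not state the objective function or the per-cluster threshold rule, so neither is reproduced here.

```python
import numpy as np

def random_projection(X, d, seed=0):
    """Project the rows of X onto d dimensions with a Gaussian random matrix.

    Hypothetical helper: the thesis does not specify its projection matrix.
    """
    rng = np.random.default_rng(seed)
    R = rng.normal(size=(X.shape[1], d)) / np.sqrt(d)
    return X @ R

def three_way_assign(X, medoids, alpha=0.7, beta=0.3):
    """Return (positive, boundary) boolean masks of shape (n_points, n_clusters).

    The membership score is an assumed inverse-distance weight, normalized so
    each point's scores sum to 1. A point is in the positive region of a
    cluster if score >= alpha, in the boundary region if beta <= score < alpha,
    and in the negative region otherwise.
    """
    dist = np.linalg.norm(X[:, None, :] - medoids[None, :, :], axis=2)
    inv = 1.0 / (dist + 1e-12)          # small constant avoids division by zero
    score = inv / inv.sum(axis=1, keepdims=True)
    positive = score >= alpha
    boundary = (score >= beta) & (score < alpha)
    return positive, boundary

# Toy data: two points sitting on the medoids and one point halfway between.
X = np.array([[0.0, 0.0], [10.0, 0.0], [5.0, 0.0]])
medoids = np.array([[0.0, 0.0], [10.0, 0.0]])
positive, boundary = three_way_assign(X, medoids)
```

In this toy example the first two points fall in the positive region of their own cluster, while the midpoint falls in the boundary region of both clusters. A dynamic variant in the spirit of the abstract would call `random_projection` for an ascending sequence of `d` values and cluster each projected data set, keeping the better of adjacent results.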
Keywords/Search Tags: Three-way decision clustering, Three-way decision, Dynamic, Random projection, High-dimensional data