Research On Clustering Method Of Complex Data Information

Posted on: 2021-04-12
Degree: Master
Type: Thesis
Country: China
Candidate: F Wu
Full Text: PDF
GTID: 2428330605472937
Subject: Computer Science and Technology
Abstract/Summary:
With the advent of the big data era, the complexity and scale of data keep increasing, application scenarios are becoming more diverse, and the requirements placed on clustering methods are rising. This thesis therefore studies clustering algorithms for complex data. One kind of complex data is high-dimensional data, which traditional algorithms cannot handle effectively because of the "curse of dimensionality". The other kind is data in obstacle space, where the presence of obstacles causes traditional clustering algorithms to fail, so handling it is equally important.

First, to address the clustering of high-dimensional data, this thesis builds on principal component analysis (PCA) and proposes a new method to counter the loss of accuracy that subsequent clustering algorithms suffer after dimension reduction. Based on the concept of feature space, a new dimension reduction criterion is constructed by combining feature space with information entropy, and a dimension reduction algorithm better suited to clustering high-dimensional data (entropy-PCA, EN-PCA) is proposed. To address the poor interpretability caused by the principal components being linear combinations of the original features, as well as insufficient input flexibility, a sparse principal component algorithm based on ridge regression (ESPCA) is proposed. Finally, working on the dimension-reduced data, the initialization, selection, crossover and mutation steps of the genetic algorithm are improved to overcome the slow convergence of genetic algorithm clustering, and a new clustering algorithm (genetic K-means algorithm ++, GKA++) is proposed.

Second, for data clustering in obstacle space, the primary goals are to improve the accuracy of obstacle space clustering algorithms and to handle clustering under dynamically changing obstacles, a problem that few researchers have addressed. The traditional algorithm is improved into a static obstacle clustering algorithm (STA-PI-OBGRID), which contains a series of definitions and rules that increase clustering accuracy. Clustering algorithms are then proposed for the cases where obstacles are added (DYN-OBGRID-ADD), removed (DYN-OBGRID-DE), and moved (DYN-OBGRID-MV). The static obstacle clustering algorithm improves the accuracy of the clustering results, and the dynamic obstacle algorithms make the treatment of this problem more comprehensive. For all of the above algorithms, experiments were run on data with stationary obstacles and with obstacles whose number or location changes. The experiments confirm that the algorithms perform well in both accuracy and efficiency.
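
To illustrate why obstacles matter for grid-density clustering, here is a minimal Python sketch. It is not any of the thesis's algorithms, whose definitions and rules are not given in this abstract. The function name `grid_cluster`, the parameters `cell_size` and `min_pts`, and the representation of obstacles as `walls` (blocked borders between adjacent grid cells) are all assumptions made for illustration. Dense cells are merged into one cluster only when no wall separates them.

```python
from collections import defaultdict, deque


def grid_cluster(points, cell_size, min_pts, walls=frozenset()):
    """Label 2-D points by connected components of dense grid cells.

    walls is a set of frozenset({cell_a, cell_b}) pairs (hypothetical
    obstacle model): two adjacent dense cells are NOT merged when the
    border between them is blocked. Points in sparse cells get -1.
    """
    # Map every point index to the grid cell that contains it.
    cells = defaultdict(list)
    for idx, (x, y) in enumerate(points):
        cells[(int(x // cell_size), int(y // cell_size))].append(idx)

    # A cell is dense when it holds at least min_pts points.
    dense = {c for c, members in cells.items() if len(members) >= min_pts}

    cell_label = {}
    next_label = 0
    for start in sorted(dense):
        if start in cell_label:
            continue
        # Breadth-first search over 4-connected dense cells.
        cell_label[start] = next_label
        queue = deque([start])
        while queue:
            cx, cy = queue.popleft()
            for nb in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                blocked = frozenset({(cx, cy), nb}) in walls
                if nb in dense and nb not in cell_label and not blocked:
                    cell_label[nb] = next_label
                    queue.append(nb)
        next_label += 1

    # Propagate cell labels back to the points; -1 marks noise.
    labels = [-1] * len(points)
    for cell, label in cell_label.items():
        for idx in cells[cell]:
            labels[idx] = label
    return labels
```

Calling the function once without walls and once with a wall between two neighbouring dense cells shows how adding an obstacle splits one cluster into two. This sketch simply recomputes the clustering from scratch after the obstacle set changes; it does not attempt the incremental handling of added, removed, or moved obstacles that the abstract describes.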
Keywords/Search Tags:cluster, high dimensional data, obstacle space, grid-density