
Research And Application Of The Clustering Algorithm For High Dimensional Data

Posted on: 2022-04-10
Degree: Master
Type: Thesis
Country: China
Candidate: K T Zhao
GTID: 2518306332995279
Subject: Mathematics
Abstract/Summary:
With the rapid development of information technology, massive amounts of data have emerged. Identifying high-dimensional data within this mass of data and quickly mining useful information from it is one of the most important problems to be solved. Traditional clustering methods have limitations when processing high-dimensional data. To address the problem of high-dimensional data clustering, this study introduces an anti-noise structure constraint into the low-rank representation subspace clustering algorithm, and combines fuzzy heterogeneous clustering with the artificial bee colony algorithm to achieve optimal clustering. The main results of this thesis are as follows:

(1) To address the sensitivity of clustering to the selection of initial centers, an algorithm that combines LLS-ACO dynamic multi-objective optimization with fuzzy heterogeneous clustering is proposed. This method combines the advantages of the two and uses a large amount of data to cluster the index parameters. The study adopts different methods depending on whether the objects to be classified in the high-dimensional data have strict attribute definitions. When there is fuzziness, the fuzzy heterogeneous clustering algorithm is used. Because traditional fuzzy clustering is limited by the selection of the initial center points, an artificial bee colony algorithm with linear local search is used to optimize it. A regularization parameter, adjusted dynamically and linearly with the number of iterations, tunes the local optimization effect, and the algorithm is recast in dynamic multi-objective form, finally yielding LLS-ACO dynamic multi-objective optimization. The proposed algorithm is applied to wireless sensor networks (WSNs), where it is verified and compared with existing algorithms. Experimental results show that the proposed algorithm effectively addresses lifetime extension and hot-spot elimination in WSNs.

(2) To exploit the obvious difference between data on the one hand and noise and outliers on the other, a low-rank representation algorithm with an anti-noise structure constraint is proposed. The algorithm performs clustering by constructing an affinity graph and applying spectral clustering. First, the original data are decomposed by matrix factorization and the data dictionary is reconstructed. At the start of the algorithm, SVD is used to reduce the noise in the high-dimensional data and screen out part of it while retaining the original signal. Second, the representation coefficients are obtained from the low-rank representation among the data, and the affinity matrix, which reflects the characteristics shared among the data, is constructed using the Lagrange multiplier method and the alternating direction method. Finally, the low-rank representation in the subspace clustering problem is combined with the coefficient matrix to construct the affinity graph, so that the two benefit each other during the solution process and achieve overall optimization. Combining the iteratively optimized low-rank representation with the anti-noise structure constraint yields the RLRSI method, whose clustering performance is then verified. The proposed algorithm is tested on handwritten data sets and typical data sets, and the results show the effectiveness of the algorithm.
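
The pipeline described in (2), SVD denoising followed by a low-rank representation, an affinity graph and spectral clustering, can be illustrated with a minimal sketch. This is not the RLRSI method: it omits the anti-noise structure constraint and replaces the Lagrange multiplier / alternating direction solver with the closed-form solution of the noise-free problem min ||Z||_* subject to X = XZ, which is Z = V V^T for the skinny SVD X = U S V^T. All function names, the `rank` parameter and the small k-means helper are assumptions made for illustration.

```python
import numpy as np

def lrr_affinity(X, rank):
    """Affinity from a noise-free low-rank representation (illustrative).

    X holds one sample per column. Truncating the SVD to `rank` acts as a
    simple denoising step; Z = V V^T is then the closed-form minimizer of
    the nuclear norm subject to X_r = X_r Z for the truncated data X_r.
    """
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    Vr = Vt[:rank].T                      # samples x rank
    Z = Vr @ Vr.T                         # representation coefficients
    return (np.abs(Z) + np.abs(Z.T)) / 2  # symmetric, non-negative affinity

def kmeans(points, k, iters=50):
    """Plain Lloyd iterations with deterministic farthest-first seeding."""
    centers = [points[0]]
    for _ in range(1, k):
        d = np.min([np.sum((points - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(iters):
        dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    return labels

def spectral_labels(W, k):
    """Normalized spectral clustering on an affinity matrix W."""
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    L = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)           # eigenvalues in ascending order
    emb = vecs[:, :k]                     # k smallest eigenvectors
    emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    return kmeans(emb, k)

def lrr_subspace_clustering(X, n_clusters, rank):
    """Full sketch: SVD truncation -> LRR affinity -> spectral clustering."""
    return spectral_labels(lrr_affinity(X, rank), n_clusters)
```

For data drawn from independent subspaces, `rank` would typically be set to the sum of the subspace dimensions; for example, `lrr_subspace_clustering(X, n_clusters=2, rank=4)` for two 2-dimensional subspaces. With heavier noise or outliers the closed form no longer applies, which is the case the thesis's constrained, iteratively solved formulation targets.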
Keywords/Search Tags:High dimensional data, Fuzzy clustering, Subspace clustering, Low rank representation