
Research And Application Of Clustering Method Based On Two Curve Chebyshev Approximation

Posted on: 2022-05-17
Degree: Master
Type: Thesis
Country: China
Candidate: M Yu
Full Text: PDF
GTID: 2518306521996799
Subject: Computer Science and Technology
Abstract/Summary:
Data mining has made remarkable progress and is now widely used. As research has deepened, the mining of sequence data has attracted growing attention from researchers at home and abroad. Cluster analysis, an unsupervised or semi-supervised learning method in data mining, is also widely used. When the classic K-means algorithm is applied to sequence data, however, it suffers from the random selection of initial cluster centers and from weak robustness in the iterative calculation of cluster centers. To address these problems, this thesis studies clustering methods for sequence data. The main results are as follows:

(1) To address the random selection of initial cluster centers, this thesis proposes the TC-means algorithm, which constructs the initial cluster centers using the two curve Chebyshev approximation. First, each class of data is segmented with a sliding window. Second, the central function of each data segment is obtained with the two curve Chebyshev approximation method, and its value in each dimension is calculated. Finally, the segments are joined at the point shared by each pair of adjacent segments, which yields a relatively ideal initial cluster center. Experimental results show that initial cluster centers constructed in this way overcome the sensitivity of K-means to the initial cluster centers and give better clustering performance than the other clustering algorithms compared.

(2) To reduce the time cost of TC-means and improve its ability to process large-scale sequence data, a new sequence data clustering algorithm, FTC-Prototype, is proposed by changing how the cluster center of each sub-cluster is updated during iteration. The algorithm uses the two curve Chebyshev approximation to update the cluster centers in each iteration, and introduces an iteration termination condition for the two curve Chebyshev approximation to reduce the number of iterations. Experimental results show that this method improves the time efficiency of clustering.

(3) Building on the above work, a prototype system for cluster analysis of stellar spectral sequence data was designed and implemented. Because stellar spectra are strictly ordered, they are treated as sequence data. The modules and functions of the system are described. Running the system shows that the prototype is effective and provides a new approach to cluster analysis of stellar spectral data.
Keywords/Search Tags:Sequence data, Clustering, Two curve Chebyshev approximation, Initial clustering center, Stellar spectrum
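
The three steps of item (1) (sliding-window segmentation, a per-segment center, joining at shared points) can be illustrated with a minimal sketch. It is not the thesis's TC-means implementation. The abstract does not define the two curves or the class of approximating functions, so the sketch assumes that the two curves are the pointwise upper and lower envelopes of a group of sequences and that the center is unconstrained, in which case the minimax (Chebyshev) center at each point is the midpoint of the envelopes. The names `minimax_center` and `initial_center` are hypothetical.

```python
def minimax_center(segment):
    """Pointwise minimax center of equal-length sequences.

    ASSUMPTION: the "two curves" are the upper and lower envelopes of the
    group; the value minimizing the largest deviation from both at a given
    point is their midpoint.
    """
    return [(max(column) + min(column)) / 2.0 for column in zip(*segment)]


def initial_center(group, window):
    """Build one initial center from a group of equal-length sequences.

    Adjacent windows overlap by exactly one point, so consecutive pieces
    are joined at their common point (an assumed reading of the abstract).
    """
    if window < 2:
        raise ValueError("window must cover at least two points")
    n_points = len(group[0])
    center = [0.0] * n_points
    start = 0
    while True:
        stop = min(start + window, n_points)
        # Center of the current window, computed over all sequences.
        center[start:stop] = minimax_center([seq[start:stop] for seq in group])
        if stop == n_points:
            break
        start = stop - 1  # next window starts at the shared point
    return center


# Illustrative data only: three sequences of five points each.
group = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [3.0, 2.0, 5.0, 4.0, 9.0],
    [2.0, 6.0, 4.0, 0.0, 7.0],
]
center = initial_center(group, window=3)
print(center)  # [2.0, 4.0, 4.0, 2.0, 7.0]
```

Under this unconstrained simplification the windows do not change the resulting values; they only mirror the structure described above. Segmentation starts to matter once each segment is fitted with a restricted function class, such as a low-degree polynomial, which the abstract does not specify.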