Font Size: a A A

A Study Of Nystr(?)m Extensions Based On Cyclic Matrix Projections

Posted on:2022-07-10Degree:MasterType:Thesis
Country:ChinaCandidate:J S LiuFull Text:PDF
GTID:2518306542479454Subject:Data Science and Technology
Abstract/Summary:PDF Full Text Request
As a flexible and popular clustering algorithm,spectral clustering is superior to traditional clustering algorithms such as k-means when dealing with small-scale data sets.But it has limited applicability of large-scale problems due to its computational complexity of O(n~3)and space complexity of O(n~2),with n the number of data points.First,the idea of approximating random Fourier features from the distribution independent of the sample set was proposed to further improve the speed of the cluster-ing algorithm for processing large-scale data sets.In our designed two-stage preference clustering framework,the preference matrix is first mapped to the feature matrix ac-cording to the random Fourier feature,and then the traditional clustering method is used for clustering in the transformed feature space.Experimental results on a movie dataset containing 100,000 ratings show that this method is more effective than Nys-tr(?)m and k-means methods in terms of clustering accuracy,and has better clustering performance.Secondly,the Nystr(?)m extension method based on random circulant matrix pro-jection was proposed to reduce the space complexity and time complexity of the kernel matrix approximation method for processing large-scale data sets are reduced.In our designed Nystr(?)m extended model,we first use random sampling to obtain an approx-imate contour of the initial matrix,and then construct a cyclic embedding matrix,using the cyclic matrix as a projection matrix,so as to embed the initial contour of the input data space into a low-dimensional feature subspace,and then perform the SVD in this feature subspace.Compared with other typical matrix approximation methods,the designed circulant matrix projection method has the advantages of low time complexity and high reconstruction accuracy.Finally,a fast spectral clustering algorithm without feature decomposition was proposed,which reduces time overhead through the multiplication and iteration.It solves the problem that traditional spectral clustering algorithms consume too much time to perform feature decomposition when the number of samples is large.By using the Nystr(?)m method to conduct random sampling to establish the relationship between the sampling matrix and the original matrix,we implement the iterative update of the matrix indicator matrix based on the multiplicative update principle.The experiments show that the designed algorithm can guarantee the clustering accuracy and improve the efficiency of traditional spectral clustering methods the convergence and time of class accuracy at the same time,which makes up for the time consumption defect of spectral clustering that needs to complete the eigen-decomposition of the Laplacian matrix when processing large sample data sets.In summary,the goal of this article is to use our designed clustering method to accelerate the running speed of the spectral clustering algorithm,which could solve the problems in the traditional spectral clustering method and make it more effective for large-scale data.
Keywords/Search Tags:Spectral clustering, kernel, Nystr(?)m approximation, circulent matrix, random Fourier transform, eigen-decomposition, multiplication update iteration
PDF Full Text Request
Related items