
Improvement Of Kernel K-means Clustering And Its Application In Course Selection System

Posted on: 2021-11-10
Degree: Master
Type: Thesis
Country: China
Candidate: Y Chen
Full Text: PDF
GTID: 2518306107982919
Subject: Engineering
Abstract/Summary:
As an unsupervised learning method, cluster analysis is one of the most important tools of data mining. Because of its importance and its overlap with other research directions, it is widely used in fields such as data mining and machine learning. The purpose of cluster analysis is to find the hidden structure in the data: the data are grouped into clusters according to certain constraints and judgment criteria, so that the similarity between data points in the same cluster is as large as possible and the similarity between different clusters is as small as possible.

The K-means clustering algorithm is a classic algorithm for clustering problems. Due to its simplicity, high efficiency, and strong adaptability, it is widely used in the field of machine learning. However, it is not very effective on high-dimensional, nonlinearly separable data. Since real-world data is generally complex, high-dimensional, and nonlinear, the optimization of clustering algorithms for high-dimensional nonlinear data has become an important research direction and a very challenging task. The kernel K-means algorithm is an improved clustering algorithm based on K-means. It introduces a kernel function that maps nonlinearly separable data into a high-dimensional space, which improves clustering performance on high-dimensional data.

In view of the above problems, the research work in this thesis improves the performance of the kernel K-means algorithm and applies the improved algorithm in the course selection processing module of a course selection prototype system. The specific work is as follows:

(1) Based on the properties of the indicator matrix, we propose a non-convex relaxed model of the kernel K-means clustering model to solve image and document clustering problems. We also analyze the relationship between the proposed model, the orthogonal non-negative matrix factorization model, and non-negative spectral clustering.

(2) To better solve the proposed non-convex optimization model, we design a simple but robust numerical algorithm. First, we split the original matrix variable into two variables that satisfy orthogonality and non-negativity respectively. We then use an alternating iteration algorithm that projects the solution onto the Stiefel manifold and the non-negative set to find the optimal solution.

(3) A large number of experiments demonstrate that the proposed algorithm, compared with existing methods, clusters high-dimensional nonlinear data efficiently and achieves higher clustering accuracy and better stability on both synthetic and real-world data.

(4) Based on the clustering method proposed in the thesis, the improved kernel K-means algorithm is applied in the course selection module of the course selection system to assist students in course selection.
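For readers unfamiliar with the baseline, the following is a minimal sketch of standard kernel K-means, not the improved non-convex model proposed in the thesis. It assumes an RBF kernel, and the function names, the `gamma` parameter, and the toy data are hypothetical choices made for illustration. The key point is that the distance from a point to a cluster mean in feature space can be computed from the kernel (Gram) matrix alone, as K_ii - (2/|C|) Σ_{j∈C} K_ij + (1/|C|²) Σ_{j,l∈C} K_jl, so the high-dimensional mapping is never formed explicitly.

```python
import math


def rbf_kernel_matrix(points, gamma=1.0):
    """Gram matrix with K[i][j] = exp(-gamma * ||x_i - x_j||^2) (assumed kernel)."""
    n = len(points)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            K[i][j] = math.exp(-gamma * sq_dist)
    return K


def kernel_kmeans(K, init_labels, n_clusters, max_iter=100):
    """Standard kernel K-means: reassign points until the labels stop changing."""
    n = len(K)
    labels = list(init_labels)
    for _ in range(max_iter):
        members = [[i for i in range(n) if labels[i] == c]
                   for c in range(n_clusters)]
        # Squared norm of each cluster mean in feature space.
        within = [
            sum(K[j][l] for j in idx for l in idx) / len(idx) ** 2 if idx else None
            for idx in members
        ]
        new_labels = []
        for i in range(n):
            best_c, best_d = labels[i], float("inf")
            for c, idx in enumerate(members):
                if not idx:
                    continue  # skip empty clusters
                cross = sum(K[i][j] for j in idx) / len(idx)
                # Squared feature-space distance from point i to mean of cluster c.
                dist = K[i][i] - 2.0 * cross + within[c]
                if dist < best_d:
                    best_c, best_d = c, dist
            new_labels.append(best_c)
        if new_labels == labels:
            break
        labels = new_labels
    return labels


# Toy data: two well-separated groups, started from a partly wrong labeling.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
K = rbf_kernel_matrix(points, gamma=0.1)
labels = kernel_kmeans(K, init_labels=[0, 0, 1, 1, 1, 0], n_clusters=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

Like ordinary K-means, this procedure depends on the initial labeling and can stop at a local optimum, which is part of the motivation for relaxed formulations such as the one the thesis describes.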
Keywords/Search Tags: Clustering, Kernel K-means, Indicator Matrix, Course Selection System