Kernel methods, owing to their solid theoretical foundations and strong non-linear modeling capabilities, have been widely applied to numerous machine learning tasks with satisfactory performance. In practical applications, however, kernel methods still face two key issues: the difficulty of selecting an appropriate kernel function, and the high computational complexity of kernel models. Specifically, the choice of kernel function directly affects model performance, while high complexity limits the use of kernel methods in large-scale scenarios. To alleviate these two problems, this dissertation focuses on learning and approximating the kernel matrix that a kernel function induces on a given dataset. In particular, it studies kernel matrix learning in graph-based kernel clustering models, aiming to equip these models with more flexible kernel selection mechanisms and thereby improve their clustering performance. For kernel matrix approximation, it focuses on the widely recognized Nyström method, seeking to reduce the computational complexity of the associated kernel models and extending the Nyström method both theoretically and practically. The research content of this dissertation can be summarized as follows.

Firstly, to improve the quality of the kernel matrix in graph-based kernel models, this dissertation proposes a self-paced smooth multiple kernel learning method from the perspective of parametric kernel learning. The method targets two shortcomings of existing multiple kernel parameter learning approaches: insufficient discriminability of the kernel feature representation and unreasonable weight allocation. Specifically, it introduces a feature smoothing regularizer that promotes smoothness of the kernel feature representation over the affinity graph, improving the separability of the data in the kernel feature space; data processed by the kernel mapping thus become more amenable to the clustering objective. At the same time, the method learns the multiple kernel weights with a self-paced learning strategy, which fully exploits the fitting ability of each base kernel on the given data and allocates weights among the base kernels more reasonably. Integrating these two improvements into a unified optimization objective yields a self-paced smooth multiple kernel clustering model that learns the kernel matrix and the graph matrix used for clustering simultaneously; the two reinforce each other during learning and jointly improve the clustering performance of the model.
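To make the two ingredients concrete, the following is a minimal, illustrative Python sketch, not the dissertation's actual algorithm: the function names, the soft self-paced weighting rule, and the use of tr(KL) as the feature smoothing regularizer are assumptions based on common formulations.

```python
import numpy as np

def combine_kernels(kernels, weights):
    # Parametric combined kernel: K = sum_p w_p * K_p over the base kernels.
    return sum(w * K for w, K in zip(weights, kernels))

def smoothness_penalty(K, L):
    # A common feature-smoothing regularizer: tr(K L). Since
    # tr(Phi L Phi^T) = tr(L K), this is small when the kernel feature
    # representation varies slowly over the affinity graph with Laplacian L.
    return np.trace(K @ L)

def self_paced_weights(losses, age):
    # Soft self-paced rule (an assumption, not the dissertation's update):
    # base kernels whose fitting loss is below the current "age" threshold
    # are kept, with weight growing as the kernel fits the data better;
    # harder kernels are suppressed until the threshold increases.
    losses = np.asarray(losses, dtype=float)
    w = np.maximum(1.0 - losses / age, 0.0)
    return w / w.sum() if w.sum() > 0 else np.full(len(w), 1.0 / len(w))

# Toy usage: three RBF base kernels with different bandwidths.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
kernels = [np.exp(-sq / (2.0 * s * s)) for s in (0.5, 1.0, 2.0)]
K = combine_kernels(kernels, self_paced_weights([0.2, 0.5, 1.4], age=1.0))
```

In an alternating scheme of this kind, the combined kernel, the self-paced weights, and the clustering graph would be updated in turn until the unified objective converges.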
Secondly, to grant greater flexibility to kernel matrix learning in graph-based kernel clustering, this dissertation further explores non-parametric kernel learning and proposes a contrastive kernel learning method. Unlike parametric kernel learning, which requires a pre-specified parametric form of the kernel function, non-parametric kernel learning can learn a suitable kernel matrix directly from the data and therefore offers higher flexibility. Most non-parametric kernel learning methods, however, are confined to searching for the optimal kernel near a predefined prior kernel, which narrows the search range. To overcome this limitation, this dissertation proposes a simple and effective strategy for selecting positive and negative samples and, on top of it, constructs a contrastive kernel learning method. By performing contrastive learning in the kernel feature space, the proposed method adaptively learns the kernel matrix from the data without relying on a prior kernel. More importantly, the learned kernel matrix preserves the neighborhood structure of the data well, which makes the method especially applicable to clustering tasks.

Thirdly, to alleviate the computational and storage burden of graph-based kernel clustering models, this dissertation explores applying a kernel matrix approximation technique, the Nyström method, to these models. By introducing the RLS-Nyström method, it provides a more efficient approximation mechanism for the kernel feature representation, effectively reducing the associated computational and storage overhead. Inspired by the idea of anchor graph learning, the dissertation further proposes two kernel clustering models with linear time complexity; these models streamline the computation of the kernel feature representation and the low-dimensional graph representation, achieving more efficient and accurate approximations. Compared with graph-based kernel clustering models that use no approximation, the proposed methods not only reduce the demands on time and storage but also improve the clustering performance of the original models.

Fourthly, to evaluate more comprehensively the effect of the Nyström approximation method in downstream learning tasks, this dissertation further applies it to the sparse kernel ridge regression model, aiming both to reduce the computational complexity of that model and to provide a theoretical guarantee on the error between the resulting approximate solution and the exact solution. According to the derived error upper bound, as long as the error of the approximate matrix produced by the Nyström method is kept small, the error of the model's approximate solution is effectively controlled. On this basis, the dissertation also examines how the Nyström method performs in the sparse kernel ridge regression model under two non-uniform sampling strategies, namely Determinantal Point Process (DPP) Nyström and Ridge Leverage Score (RLS) Nyström. Numerical experiments show that, compared with the traditional uniform Nyström method, the non-uniform variants achieve better predictive performance in the sparse kernel ridge regression model.
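As an illustration of the approximation pipeline in the fourth part, the following is a minimal Python sketch of uniform Nyström approximation plugged into kernel ridge regression. It is a sketch under stated assumptions, not the dissertation's model: the RBF kernel, the regularization value, and the function names are hypothetical, and the sparse-regression and non-uniform (DPP/RLS) sampling components are omitted.

```python
import numpy as np

def rbf(X, Y, gamma=0.5):
    # RBF kernel matrix: k(x, y) = exp(-gamma * ||x - y||^2).
    sq = (X**2).sum(1)[:, None] + (Y**2).sum(1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-gamma * np.clip(sq, 0.0, None))

def nystrom_features(X, m, gamma=0.5, seed=0):
    # Uniform Nystrom: sample m landmark points and build Z such that
    # Z Z^T = C W^+ C^T ~ K, where C = K(X, landmarks) and
    # W = K(landmarks, landmarks).
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), size=m, replace=False)
    C = rbf(X, X[idx], gamma)              # n x m cross-kernel block
    W = rbf(X[idx], X[idx], gamma)         # m x m landmark block
    vals, vecs = np.linalg.eigh(W)         # W = V diag(vals) V^T
    keep = vals > 1e-10                    # drop near-null directions
    Z = C @ vecs[:, keep] / np.sqrt(vals[keep])   # Z = C W^{-1/2}
    return Z

def krr_fit(Z, y, lam=1e-2):
    # Ridge regression in the Nystrom feature space: solving
    # (Z^T Z + lam I) w = Z^T y costs O(n m^2), versus the O(n^3)
    # needed by exact kernel ridge regression.
    d = Z.shape[1]
    return np.linalg.solve(Z.T @ Z + lam * np.eye(d), Z.T @ y)

# Toy usage: fit on n = 500 points with m = 40 landmarks.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 3))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=500)
Z = nystrom_features(X, m=40)
w = krr_fit(Z, y)
train_pred = Z @ w
```

Test points would be mapped with the same landmarks and projection before applying w. The error-bound result summarized above corresponds to this pipeline: if the Nyström approximation error of the kernel matrix is small, the gap between the approximate and exact solutions stays controlled, and replacing uniform sampling with DPP or RLS sampling changes only how the landmark indices are drawn.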