Font Size: a A A

The Method Of Online Learning The Gaussian Kernel Function Based On The Structure Of KTA

Posted on:2016-03-02Degree:MasterType:Thesis
Country:ChinaCandidate:Y L XiaoFull Text:PDF
GTID:2348330512475218Subject:Computer software and theory
Abstract/Summary:PDF Full Text Request
Kernel-based methods,such as support vector machines,kernel principal component analysis,etc,have become the mainstream methods in the field of pattern recognition and statistical learning.However,kernel-based methods greatly depend on the choice of kernel function.Since the structure of high dimensional feature space is determined by kernel function,whether choosing an appropriate kernel function will significantly influence the performance of kernel-based methods.Gaussian kernel function has been widely used because of its excellent properties.Howerer,research shows that learning the Gaussian kernel is a quite challenging problem.The traditional Gaussian kernel learning methods,such as cross-validation method,batch learning methods,have the defect that they must consume an enormous amount of computing time and memory consumption,especially for large-scale datasets.The kernel target alignment(KTA)has the advantages of simplicity,efficiency andtheoretical guarantee,thus,to overcome the drawback of the traditional methods,we based on the 'difference of series' structure of KTA,first prove that the formulated KTA criterion is a local convex function,then we proposed three on-line Gaussian kernel learning methods.Experimental results show that compared with the stat-of-the-art kernel learning methods,our methods have a greate improvement on time efficiency and classification performance.The special contents include the following four points:(1)Analyzing the structure of the KTA and prove its local convexity.In this paper,based on having an insight into the "difference of series" struction of KTA criterion,we use a bipartite graph model to decompose the formulated criterion.In addition,we proved that the formulated KTA criterion function is locally convex in the vicinity of a determined meaningful global minimum.Then,we use ten datasets to validate above analysis.(2)Using the classical stochastic gradient descent(SGD)to learning the Gaussian kernel function.Currently,most of the kernel learning methods based on KTA adapt the batch learning method,which has high computational cost.Thus,based on(1),we using the SGD to learning the Gaussian kernel.This method has the advantages of simple,fast,and often makes few statistical assumptions etc.In addition,to guarantee the convergence of the learning process,we use the annealing learning rate to learning the Gaussian kernel function.(3)Using the adaptive online learning algorithm to learn the Gaussian kernel function.An appropriate learning rate is quite import for online learning algorithm.Although(2)provides an optimal convergence rate in estimation,it cannot follow the changes fast enough since ?t= 1/t is too small.So we proposed to use the adaptive online learn the Gaussian kernel function.Our method can guarantee that when the Guassian parameter ? is far from the optimal value,it takes an appropriately large ?;When the Guassian parameter ? is close to the optimal value,it takes an appropriately small ?.(4)Using the predictive variance reduction online learning algorithm to learn the Gaussian kernel function.The inherent variance of online learning is the key factor that influences the performance of online learning.By designing an efficient variance reduction algorithm,we can use a larger learning rate,which will significantly accelerating the optimization process.Thus,based on(1),we use the predicitive reduction online learning algorithm to learn the Gaussian kernel function.Finally,we analyze the proposed three methods,and illustrate the advantages and disadvantages of them.
Keywords/Search Tags:Ker words, Online kernel function learning, Structure of KTA, Function decomposition, SGD, Adaptive, Variance reduction
PDF Full Text Request
Related items