
Deep Spectral Kernel Learning And Its Theoretical Essence And Extended Research

Posted on: 2022-06-13
Degree: Master
Type: Thesis
Country: China
Candidate: Z F Wu
Full Text: PDF
GTID: 2518306740982849
Subject: Software engineering
Abstract/Summary:
As a class of effective statistical learning techniques in artificial intelligence, kernel methods are key to improving the ability of learners to solve nonlinear problems. However, with the rapid development of machine learning in recent years, most classic kernels are no longer adequate for increasingly complex tasks that demand better fitting and generalization. Theoretical and experimental studies have shown that the core bottleneck lies in the locality limitation and in the inefficient use of computational elements. Accordingly, the emerging spectral kernels and deep kernels try to overcome the locality limitation and the computational inefficiency by improving the mapping property and the compositional architecture, respectively. They offer valuable insight for the further development of kernel methods. However, the research and application of spectral kernels and deep kernels are still at an early stage. On the one hand, most existing spectral kernels avoid only the monotonicity constraint and remain limited by the stationarity condition of shift-invariant functions, and thus cannot reveal important information manifolds in feature space. On the other hand, most existing deep kernels suffer from closed structures, poor optimization, and difficulty of extension, and therefore can hardly exploit the exponential advantage of deep architectures in computational efficiency. Considering what the two approaches share and where they differ, it is of great significance to propose deep spectral kernel learning with better mappings and architecture, so as to address both locality and inefficiency. In this thesis, our main contributions are four-fold:

1) We propose a general and scalable deep spectral kernel learning framework. To address the poor representation ability of spectral kernels and the closed architecture of deep kernels, we construct the deep spectral kernel network. First, non-stationary and non-monotonic spectral mappings are derived to break through the locality limitation; these mappings can approximate any real-valued bounded continuous positive semi-definite kernel. They are then embedded in a deep network with a directed acyclic graph architecture. Intuitive analysis and experimental verification are also conducted.

2) We propose a Bayesian random kernel mapping network with more efficient optimization. To address the optimization dilemma of the deep spectral kernel network, we construct a new Bayesian random kernel mapping network that improves the learning process by escaping from some poor and dense local minima with proper probability. First, a prior-posterior bridge with copulas is derived to introduce uncertainty, and a Bayesian learning paradigm with stochastic variational inference is presented to optimize the network efficiently. Then, the mechanism of the prior-posterior bridge and the effectiveness of the Bayesian learning paradigm are discussed and verified.

3) We propose a novel soft transfer approach that helps the deep spectral kernel network extend to deeper architectures. The deep spectral kernel network is not compatible with deeper architectures because of its bumpy solution space, and this transfer scheme is designed specifically for that bottleneck. First, a rectified concrete gate is constructed to automatically characterize the monotonicity or non-monotonicity of each computational element, and a loss function based on variational Bayesian inference and a spike-and-slab prior is proposed to dynamically balance the empirical risk and the intensity of transfer. Moreover, the feasibility of each component is verified through intuitive analysis and systematic experiments, in which the proposed transfer approach successfully helps the deep spectral kernel network scale to a much deeper 110-layer residual architecture.

4) We propose a non-trivial theoretical framework to characterize the representation ability of deep spectral kernel learning. In addition to the above learning framework and its extensions, we present a numerically tight theoretical framework to analyze their representational properties. Our main results are as follows. First, the non-stationary and non-monotonic spectral mappings in kernels can achieve the theoretically optimal representation ability, which is O(m?/d^2) times better than that of the piecewise linear mappings in neural networks, for m computational elements, d input dimensions, and a Lipschitz constant ?. Second, deeper architectures can significantly improve the efficiency of computational elements by reducing exponential approximation errors to polynomial ones, and thus help learners represent complex concepts compactly.

In summary, we propose deep spectral kernel learning and study its extended variants in depth. We hope that the novel framework, with its excellent representation ability, better optimization process, and complete theoretical basis, can advance the development, foundational theory, and industrial application of machine learning.
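The idea of stacking non-stationary spectral mappings into a deep network can be sketched as follows. This is only an illustration under stated assumptions, not the thesis's actual implementation. It uses a common random-feature construction in which each layer sums two cosine maps with different frequency matrices and a shared phase. In expectation over that phase, the resulting kernel contains cross terms of the form cos(ω·x − ω'·x'), which depend on x and x' individually and not only on x − x'. The names `init_layer`, `spectral_layer`, and `deep_spectral_features`, the Gaussian frequency draws, the layer widths, and the simple sequential chain (the simplest directed acyclic graph) are all hypothetical choices made for this sketch.

```python
import numpy as np


def init_layer(rng, d_in, n_features):
    """Draw two frequency matrices and one shared phase vector (illustrative choice)."""
    return {
        "omega1": rng.normal(size=(d_in, n_features)),
        "omega2": rng.normal(size=(d_in, n_features)),
        # Sharing the phase keeps the cross terms cos(w1.x - w2.x') in expectation.
        "b": rng.uniform(0.0, 2.0 * np.pi, size=n_features),
    }


def spectral_layer(X, layer):
    """Map X of shape (n, d_in) to cosine features of shape (n, n_features)."""
    n_features = layer["omega1"].shape[1]
    z = np.cos(X @ layer["omega1"] + layer["b"]) + np.cos(X @ layer["omega2"] + layer["b"])
    return z / np.sqrt(2.0 * n_features)


def deep_spectral_features(X, layers):
    """Apply the layers one after another (a sequential chain of mappings)."""
    H = X
    for layer in layers:
        H = spectral_layer(H, layer)
    return H


rng = np.random.default_rng(0)
d, widths = 3, [64, 32]  # hypothetical input dimension and layer widths

layers, d_in = [], d
for width in widths:
    layers.append(init_layer(rng, d_in, width))
    d_in = width

X = rng.normal(size=(5, d))
Phi = deep_spectral_features(X, layers)  # explicit feature map, shape (5, 32)
K = Phi @ Phi.T                          # Gram matrix of the induced kernel
```

Because the kernel is defined as an inner product of explicit features, the Gram matrix `K` is positive semi-definite by construction. In a learning setting the frequencies and phases would be trained rather than fixed random draws as they are here.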
Keywords/Search Tags:Deep Spectral Kernel Learning, Deep Kernel, Spectral Kernel, Kernel Method, Neural Network