
Research On Multiview Kernel Methods With Variational Approximation

Posted on: 2022-01-26
Degree: Doctor
Type: Dissertation
Country: China
Candidate: L Mao
Full Text: PDF
GTID: 1488306482986989
Subject: Computer application technology
Abstract/Summary:
Many real-world data sets can be represented by multiple distinct views. Multiview learning, a branch of machine learning, is concerned with modelling such data using knowledge about the different views. Bayes' law provides both a theoretical workflow and a computational tool for iterating between modelling a phenomenon and updating one's knowledge. Knowledge about multiview data can usually be decomposed into two parts: a within-view part describing how each view is generated and structured, and a between-view part describing how the views relate to one another. The latter exhibits some commonality across tasks and has been distilled into several multiview model assumptions. Among these, the view consistency assumption is the one most commonly used in multiview supervised learning. Performing Bayesian multiview learning with the view consistency assumption treated as a form of knowledge remains an important research problem.

In the standard Bayesian framework, knowledge about the data is typically used to build a model or to select a prior over the model space. However, the view consistency assumption can hardly be applied directly to either task. On the one hand, the assumption concerns the prediction functions: in Bayesian learning, the prediction function is related to the expectation of the model under the posterior, so it is difficult to enforce agreement between the prediction functions of different views through the choice of model alone. On the other hand, the view consistency assumption can hardly be represented as a distribution over the model space in a data-independent fashion, while reusing training data to choose the prior violates the likelihood principle and makes the model vulnerable to over-fitting, and setting aside separate data for this purpose is inefficient. A further problem in Bayesian multiview learning is that the Bayesian posterior is usually computationally intractable and requires approximations to
obtain.

In this dissertation, we propose to incorporate the view consistency assumption into the approximation of the posterior. Because the posterior depends on the data, we can make full use of the training data to measure the agreement between views. By choosing the variational objective and the variational family, we impose view consistency constraints on the predictive functions of the different views. Moreover, although data-dependent priors are not suitable for Bayesian learning itself, they can be used to analyze the generalization performance of learning algorithms under the PAC-Bayesian framework. We adopt these ideas to design and analyze multiview kernel methods. The contributions of this dissertation are:

Soft Margin Consistency Multiview Maximum Entropy Discrimination (SMVMED): Maximum entropy discrimination (MED) is an effective approach to learning a discriminative classifier while accounting for uncertainty over the model space; it combines Bayesian learning with the large margin principle. SMVMED models each view with an MED and modifies the variational objective of MED to propagate the posteriors between the views. We give an instantiation of SMVMED using Gaussian processes, and propose a sequential minimal optimization algorithm to accelerate its training.

Multiview Variational Sparse Gaussian Process (MVSGP): Variational sparse Gaussian process (VSGP) methods approximate Gaussian process models by summarizing the posterior process with a set of inducing points. MVSGP models each view with a VSGP and augments each of them with a set of shared inducing points. By forcing the mean functions of the approximate posteriors to be identical at the locations of these shared inducing points, MVSGP makes the prediction functions of the different views consistent. We further propose to use an additional VSGP to learn these locations. Experiments on toy and real-world datasets show that MVSGP can model multiview data with a small set of shared inducing points.

Multiview PAC-Bayesian Theory: Statistical learning theory provides a framework for analyzing the generalization performance of machine learning algorithms. PAC-Bayesian theory provides tight bounds on the generalization error by encoding knowledge about the data distribution in a prior distribution over the concept class and constructing a posterior from the classifier produced by the learning algorithm under analysis. To address the lack of PAC-Bayesian theory in multiview learning, we propose to use the view consistency assumption to design data-dependent priors over the concept class and to perform PAC-Bayesian analysis of multiview kernel methods.
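For context, the classical single-view MED program that SMVMED builds on can be stated as follows; this is the standard formulation from the MED literature, not the dissertation's soft-margin multiview variant:

$$\min_{q(\Theta,\gamma)} \mathrm{KL}\big(q(\Theta,\gamma)\,\|\,p(\Theta,\gamma)\big) \quad \text{s.t.} \quad \mathbb{E}_{q(\Theta,\gamma)}\big[\,y_t F(x_t;\Theta) - \gamma_t\,\big] \ge 0, \quad t = 1,\dots,T,$$

where $F(x;\Theta)$ is the discriminant function, $y_t \in \{-1,+1\}$ are labels, and the margin variables $\gamma_t$ carry a prior favoring large margins. The large-margin constraints act on expectations under $q$, which is how MED combines Bayesian averaging with the large margin principle.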
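As a reference point for the analysis, a standard McAllester-style PAC-Bayes bound (the generic single-view form, not the dissertation's multiview bound) reads: for a prior $P$ over the concept class chosen before seeing the sample, any posterior $Q$, sample size $n$, and confidence level $\delta \in (0,1)$, with probability at least $1-\delta$,

$$R(Q) \;\le\; \hat{R}(Q) + \sqrt{\frac{\mathrm{KL}(Q\,\|\,P) + \ln\frac{2\sqrt{n}}{\delta}}{2n}},$$

where $R(Q)$ and $\hat{R}(Q)$ are the true and empirical risks of the Gibbs classifier drawn from $Q$. A prior that concentrates near the classifiers the algorithm actually returns shrinks the $\mathrm{KL}$ term and hence tightens the bound, which is why data-dependent priors built from the view consistency assumption are a natural tool for the multiview analysis described above.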
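As a toy illustration of the quantity that MVSGP's shared inducing points are meant to control, the sketch below computes the gap between two views' posterior means at a set of shared locations. It is a minimal sketch, not the dissertation's algorithm: it uses exact GP regression means rather than a variational sparse approximation, and the two "views" are simulated as the same inputs seen through kernels with different length-scales. All names, kernels, and hyperparameters here are illustrative assumptions.

```python
import numpy as np

def rbf(a, b, ls):
    """Squared-exponential kernel matrix between 1-D point sets a and b."""
    d2 = (a[:, None] - b[None, :]) ** 2
    return np.exp(-0.5 * d2 / ls ** 2)

def gp_posterior_mean(x_train, y_train, x_query, ls, noise=1e-2):
    """Exact GP-regression predictive mean (O(n^3); fine for a toy example)."""
    K = rbf(x_train, x_train, ls) + noise * np.eye(len(x_train))
    return rbf(x_query, x_train, ls) @ np.linalg.solve(K, y_train)

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 40)
y = np.sin(2 * np.pi * x) + 0.05 * rng.standard_normal(40)

# Hypothetical shared locations playing the role of MVSGP's shared inducing points.
z = np.linspace(0.0, 1.0, 5)

# Two "views": the same data modelled with different kernel length-scales.
m1 = gp_posterior_mean(x, y, z, ls=0.1)
m2 = gp_posterior_mean(x, y, z, ls=0.3)

# The gap between the views' posterior means at the shared locations --
# the quantity a view-consistency constraint drives to zero.
consistency_gap = float(np.mean((m1 - m2) ** 2))
print(consistency_gap)
```

In MVSGP this gap is not merely penalized but eliminated, by constraining the approximate posteriors' mean functions to coincide at the shared inducing locations.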
Keywords/Search Tags:Multiview Learning, Kernel Methods, Gaussian Process, PAC-Bayesian Analysis, Bayesian Learning, Variational Inference