
Combining Sparse Bayesian Learning And Gaussian Mixtures For Active Learning

Posted on: 2019-10-19 | Degree: Master | Type: Thesis
Country: China | Candidate: M Tong
GTID: 2428330596966425 | Subject: Software engineering
Abstract/Summary:
In practice, large quantities of data are easy to obtain, but most of the samples lack labels. Traditional supervised learning algorithms train the model on only a relatively small number of labeled data points, and it is hard to achieve strong performance because the training set is small and its information is incomplete. Manually annotating all unlabeled samples consumes a great deal of time and effort, and in some special cases the task may even fail. Traditional supervised learning algorithms cannot provide an accurate and efficient solution to this practical problem, but active learning can address it. The expert annotation mechanism of active learning selects the most informative samples for manual labeling in order to expand the training set, so that a better prediction model is eventually obtained.

The relevance vector machine (RVM) is a typical sparse learning model: it is highly sparse, allows a more flexible choice of kernel functions, and provides probabilistic outputs, while achieving results comparable to those of other machine learning methods. This thesis therefore studies active learning built on the RVM model. It uses the Gaussian mixture model (GMM) to explore the sample distribution, constructs a GMM kernel function that incorporates distribution characteristics based on the Mahalanobis distance, modifies the traditional RVM model, and proposes a transductive RVM algorithm based on the GMM kernel. The proposed method is then applied within the active learning framework to define a new active learning approach. The main work is as follows:

(1) To take the sample distribution fully into account during learning, the Gaussian mixture model is employed to explore the properties of the sample distribution. A GMM distance is constructed from the Mahalanobis distance and used as the kernel distance, yielding a GMM kernel that incorporates distribution characteristics. The performance of the GMM kernel is then evaluated by means of Kernel Target Alignment (KTA).

(2) The transductive RVM is studied. By means of kernel matrix expansion, the unlabeled samples are added to the training process, and a transductive RVM based on kernel matrix expansion is proposed so that the information of all samples is taken into account. The GMM kernel is then applied to the transductive RVM to develop the transductive RVM algorithm based on the GMM kernel. The performance of the new method is verified in experiments.

(3) An active learning algorithm based on the RVM and the GMM is studied. The transductive RVM based on the GMM kernel is applied within the active learning framework to build classifiers, improving prediction accuracy and accelerating the convergence of the training process. Further, strategies for the initial selection and the iterative filtering of samples are proposed to define a novel active learning algorithm. Finally, the proposed algorithm is applied to a text categorization problem, where it shows good accuracy and practicality.
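
The abstract does not give the exact form of the GMM kernel, so the following Python sketch is only an illustration of the two ingredients named in item (1), not the thesis's method. It assumes a Gaussian-type kernel whose squared distance is a mixture-weighted sum of per-component Mahalanobis distances, and it computes Kernel Target Alignment in its standard form, the normalized Frobenius inner product between the kernel matrix and the ideal kernel yy^T. The function names, the `gamma` parameter, and the weighting scheme are assumptions made for this example.

```python
import numpy as np


def gmm_mahalanobis_kernel(X, weights, covariances, gamma=1.0):
    """Hypothetical GMM-informed kernel (an assumption, not the thesis's formula).

    Squared distance: d2(x, x') = sum_k w_k * (x - x')^T inv(Sigma_k) (x - x'),
    where w_k and Sigma_k are the mixture weights and covariances of a GMM
    that is assumed to have been fitted beforehand.
    Kernel: K(x, x') = exp(-gamma * d2(x, x')).
    """
    # A weighted sum of precision matrices gives one combined quadratic form.
    precision = sum(w * np.linalg.inv(c) for w, c in zip(weights, covariances))
    diff = X[:, None, :] - X[None, :, :]           # pairwise differences, (n, n, d)
    d2 = np.einsum("ijk,kl,ijl->ij", diff, precision, diff)
    return np.exp(-gamma * d2)


def kernel_target_alignment(K, y):
    """KTA for labels y in {-1, +1}: <K, yy^T>_F / (||K||_F * ||yy^T||_F)."""
    Y = np.outer(y, y)                             # ideal target kernel
    return np.sum(K * Y) / (np.linalg.norm(K) * np.linalg.norm(Y))
```

A kernel with a higher alignment value agrees more closely with the label structure, so under these assumptions a GMM-informed kernel could be compared against a plain Gaussian kernel by evaluating `kernel_target_alignment` for both on the same labeled samples.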
Keywords/Search Tags:active learning, sparse Bayesian learning, relevance vector machine, Gaussian mixtures