Research And Application Of Machine Learning Algorithms Based On Gaussian Process Model

Posted on:2013-02-03

Degree:Doctor

Type:Dissertation

Country:China

Candidate:J J He

Full Text:PDF

GTID:1118330371496680

Subject:Control theory and control engineering

Abstract/Summary:

Machine learning is one of the research focuses of information science, and it is widely applied in fields such as control engineering, machine vision, information security, bioinformatics, and medical diagnosis. The Gaussian process model is a kernel method proposed recently in machine learning. Besides the advantages of traditional kernel methods, it has further merits: it is easy to implement, it is expressed fully through Bayes' formula, and it chooses its hyper-parameters adaptively. This dissertation focuses on machine learning algorithms based on the Gaussian process model and on their applications. The main research contents are as follows:

1. Research on a multi-instance learning algorithm. A novel multi-instance learning algorithm is proposed using the Gaussian process model. By employing different likelihood functions, the algorithm can handle the relationships between instances and sample labels assumed under various multi-instance assumptions, so it can be used to solve problems that obey different multi-instance assumptions. Simulation results on multiple benchmark data sets and on the detection of regions of interest in images indicate that the algorithm achieves superior performance.

2. Research on a multi-instance multi-label learning algorithm. Based on the Gaussian process model, an effective multi-instance multi-label learning algorithm is proposed that simultaneously describes the relationship between instances and labels and the relationship among labels. The basic idea is to define a set of latent variable functions with a Gaussian process prior, so that the correlations among labels are represented by the covariance matrix of the latent variables, while the relationship between instances and labels is described by defining new likelihood functions or multi-instance kernels. Simulation results on visual mobile robot navigation and on several benchmark data sets show that the proposed algorithm performs better than existing methods.

3. Research on the twin Gaussian processes model. Because the computational complexity is very high when traditional Gaussian process models are used for classification, this dissertation proposes a new model with low computational complexity. The basic idea is to define a latent variable function for each class and to keep the value of a sample's latent variable function as small as possible if the sample belongs to the corresponding class, and as large as possible otherwise. The model can therefore use a Gaussian function as the likelihood function and obtain an explicit expression for the posterior distribution analytically, which greatly reduces its computational complexity. Simulation results indicate that the model not only has lower computational complexity but also matches or exceeds the prediction accuracy of existing models.

4. Research on an algorithm that meets the requirements of high accuracy and real-time processing for the forward kinematics problem of the Stewart platform. This dissertation proposes a new solving strategy that reduces the original multivariable problem to several single-variable or bivariate problems by introducing intermediate variables. Following this strategy, the intermediate variables are found via independent component analysis and via Gaussian process multi-task learning, respectively, and two corresponding numerical algorithms are proposed. Simulation results on multiple Stewart platforms show that the numerical algorithm based on Gaussian process multi-task learning is superior to existing approaches and can basically satisfy the requirements of real-time processing and high precision.

5. Research on a subcellular localization prediction algorithm for dual-targeted proteins. Based on the Gaussian process model, this dissertation presents a prediction algorithm that simultaneously addresses three problems: integrating multiple kinds of feature information, describing the correlations among locations, and handling imbalanced data. The algorithm describes the correlations among locations using the covariance matrix of the latent variables, optimally integrates multiple kinds of feature information by defining a novel likelihood function, and handles data imbalance by assigning different index weight coefficients to the likelihood function of each type of sample in the joint likelihood function. Experimental results on human protein data sets indicate that the proposed algorithm achieves high prediction accuracy.
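
The point made in item 3 of the abstract, that a Gaussian likelihood gives the posterior in closed form, can be illustrated with a minimal sketch. This is not the dissertation's twin Gaussian processes model, whose details the abstract does not give. It is ordinary Gaussian process regression on per-class targets, under several assumptions made here for illustration only: an RBF kernel, a zero-mean prior, a target of -1 when a sample belongs to a class and +1 otherwise, and prediction by the smallest posterior mean. The names `rbf_kernel`, `fit_predict`, `noise_var` and `length_scale` are hypothetical.

```python
import numpy as np


def rbf_kernel(A, B, length_scale=1.0):
    """Squared-exponential kernel matrix between the rows of A and B."""
    sq = (np.sum(A**2, axis=1)[:, None]
          + np.sum(B**2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-0.5 * np.maximum(sq, 0.0) / length_scale**2)


def fit_predict(X_train, y_train, X_test, n_classes,
                noise_var=0.1, length_scale=1.0):
    """Closed-form GP posterior mean with one latent function per class.

    Assumption (not from the source): the target for class c is -1 if the
    sample belongs to c and +1 otherwise, so "small" means "in class".
    """
    n = X_train.shape[0]
    targets = np.ones((n, n_classes))
    targets[np.arange(n), y_train] = -1.0

    # A Gaussian likelihood makes the posterior Gaussian, so no iterative
    # approximation is needed: mean = K_* (K + noise_var * I)^{-1} targets.
    K = rbf_kernel(X_train, X_train, length_scale)
    L = np.linalg.cholesky(K + noise_var * np.eye(n))
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, targets))

    K_star = rbf_kernel(X_test, X_train, length_scale)
    mean = K_star @ alpha                 # shape (n_test, n_classes)
    return mean, np.argmin(mean, axis=1)  # smallest latent value wins


# Toy data: two well-separated clusters, one per class.
X_train = np.array([[0.0, 0.0], [0.5, 0.0], [0.0, 0.5],
                    [5.0, 5.0], [5.5, 5.0], [5.0, 5.5]])
y_train = np.array([0, 0, 0, 1, 1, 1])
X_test = np.array([[0.2, 0.2], [5.2, 5.2]])

mean, pred = fit_predict(X_train, y_train, X_test, n_classes=2)
print(pred)
```

A single Cholesky factorization is shared by all classes here, because the kernel matrix does not depend on the labels. That is one way a Gaussian likelihood can be cheaper than the iterative approximations that non-Gaussian classification likelihoods require.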

Keywords/Search Tags:

Gaussian Process Model, Multi-instance Multi-label Learning, Forward Kinematics Problem, Subcellular Localization Prediction

Related items

1	Multi-label Prediction Model Based On Ontology Database And Data Mining In Bio-medicine
2	Relationship Mining In Heterogeneous Information Networks Based On Multi-label And Multi-instance Learning
3	Research And Application Of Multi Instance Multi Label Active Learning Method
4	Multi-Instance Multi-Label Learning Based On Neighborhood Consensus
5	Study On Multi-label Prediction For Several Types Of Protein Classification
6	Research On Key Technologies For Multi-instance Multi-label Web Page Categorization
7	Web-page Classification Method Based On Multi-instance Multi-label
8	Research On Multi-instance Multi-label Learning Based On Feature Learning
9	Study Of Classification Problems Based On Sparse Representation And Ensemble Learning
10	Research On Label Relationship Exploitation In Multi-Label Learning