Research On Semi-supervised Classification Model Based On Dirichlet Process Mixture

Posted on: 2014-12-07
Degree: Master
Type: Thesis
Country: China
Candidate: Z F Liang
GTID: 2268330392962846
Subject: Software engineering
Abstract/Summary:
Most machine learning algorithms assume that the data samples are independent and identically distributed. In practice this assumption often fails, and a single classification model used to predict the labels of unknown samples then tends to generalize poorly. Nonparametric Bayesian methods adaptively adjust the model structure according to the distribution of the data samples, so they can accommodate data that are not independent and identically distributed and avoid the heavy reliance of traditional Bayesian models on prior assumptions. However, few research results apply nonparametric Bayesian models to semi-supervised learning.

In view of this, we propose SDPMC, a generic model framework for semi-supervised classification based on the Dirichlet process mixture. By integrating the Dirichlet process mixture with a classification model, the classifier is divided into a number of sub-classification models according to the characteristics of the data distribution. The characteristics of the algorithm are as follows:
(1) The framework is a generative model built on the Bayesian framework, so it can reflect more of the information contained in the data.
(2) The integration of the Dirichlet process mixture and the classifier is not a simple linear superposition. Instead, the Dirichlet process mixture and the local classification models are learned jointly, with maximization of the joint probability p(x, y) as the training objective.
(3) The framework extends naturally to semi-supervised scenarios, unifying supervised and semi-supervised learning in a single semi-supervised setting. It trains on both labeled and unlabeled samples, which further improves the generalization capability of the classifier.
SDPMC is a generic model framework.
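The joint objective in item (2) and the semi-supervised extension in item (3) can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes a finite truncation to K components, isotropic unit-variance Gaussian densities for p(x | k), and one softmax classifier per component, and every function name is hypothetical. Labeled samples contribute log p(x, y), unlabeled samples contribute log p(x).

```python
import math


def log_sum_exp(values):
    # Numerically stable log(sum(exp(v) for v in values)).
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_gaussian(x, mean):
    # Assumption: isotropic unit-variance Gaussian for p(x | component).
    sq = sum((xi - mi) ** 2 for xi, mi in zip(x, mean))
    return -0.5 * sq - 0.5 * len(x) * math.log(2.0 * math.pi)


def component_log_p_y_given_x(x, class_weights):
    # Assumption: a local softmax classifier, one weight vector per class.
    scores = [sum(w * xi for w, xi in zip(wc, x)) for wc in class_weights]
    z = log_sum_exp(scores)
    return [s - z for s in scores]


def log_joint(x, y, log_pi, means, weights):
    # Returns log p(x, y) for a labeled sample, or log p(x) when y is None.
    terms = []
    for k in range(len(log_pi)):
        t = log_pi[k] + log_gaussian(x, means[k])
        if y is not None:
            t += component_log_p_y_given_x(x, weights[k])[y]
        terms.append(t)
    return log_sum_exp(terms)


def semi_supervised_objective(labeled, unlabeled, log_pi, means, weights):
    # Sum of log p(x, y) over labeled pairs plus log p(x) over unlabeled points.
    total = sum(log_joint(x, y, log_pi, means, weights) for x, y in labeled)
    total += sum(log_joint(x, None, log_pi, means, weights) for x in unlabeled)
    return total
```

Because each local classifier is normalized over the classes, summing p(x, y) over y recovers p(x), which is why labeled and unlabeled samples can share one objective.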

To verify its performance, the thesis selects the multinomial logistic regression model as the classifier and integrates it with the Dirichlet process mixture in a semi-supervised learning scenario, giving an instance model of SDPMC. We then complete the derivation of the entire model using the posterior inference methods commonly used for graphical models, and sample the model's hidden variables with common Markov chain Monte Carlo algorithms, namely Gibbs sampling and Hamiltonian sampling. In experiments under both supervised and semi-supervised settings, the model shows a certain degree of superiority over other classification algorithms when dealing with complex data.
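As a rough illustration of how Gibbs sampling can update a hidden component assignment in a Dirichlet process mixture, the sketch below uses the standard Chinese restaurant process prior: an existing component is weighted by its current size and a new component by the concentration parameter alpha. The abstract does not give the thesis's actual sampler, so the function names, the alpha parameter, and the way likelihoods are passed in are all assumptions.

```python
import math
import random


def crp_prior_weights(counts, alpha):
    # counts: sizes of the existing components, excluding the point being resampled.
    # Returns prior probabilities for each existing component plus one new component.
    total = sum(counts) + alpha
    return [c / total for c in counts] + [alpha / total]


def gibbs_assignment_probs(counts, alpha, log_likelihoods):
    # log_likelihoods: len(counts) + 1 entries, the last one for a new component.
    # Assumes every count is positive (empty components have been removed).
    prior = crp_prior_weights(counts, alpha)
    log_post = [math.log(p) + ll for p, ll in zip(prior, log_likelihoods)]
    m = max(log_post)
    unnorm = [math.exp(v - m) for v in log_post]
    z = sum(unnorm)
    return [u / z for u in unnorm]


def sample_assignment(probs, rng):
    # Draw one component index according to the conditional probabilities.
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    probs = gibbs_assignment_probs([3, 1], alpha=1.0, log_likelihoods=[-1.0, -2.5, -3.0])
    print(probs, sample_assignment(probs, rng))
```

In a full sampler this step would be repeated for every sample, with the log-likelihood of a labeled sample including the local classifier's term for its label and that of an unlabeled sample using the feature density only.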
Keywords/Search Tags: Dirichlet process mixture classifier, generalized linear models, semi-supervised