A Study On Supervised Topic Model And Its Application

Posted on: 2012-07-28
Degree: Master
Type: Thesis
Country: China
Candidate: J Yang
GTID: 2218330362957520
Subject: Software engineering
Abstract/Summary:
Enabling computers to process large amounts of text, images and multimedia data has always been one of the great challenges for scientists, and understanding natural language is undoubtedly the key to intelligent information processing. In contrast to the traditional approach of building sets of logic rules, some researchers believe this goal can be reached by training computers with methods based on probability and statistics. In line with this idea, the topic model arose from the combination of Bayesian theory and graphical models.

As a relatively new model, the topic model has been applied to text clustering, information retrieval, speech recognition and other tasks. By introducing latent semantic variables, the topic model performs dimensionality reduction. However, it is an unsupervised model and cannot be applied directly to supervised learning, so the focus of this work is how to incorporate labeled data into the model. A generalized linear model is used to model the relationship between the latent variables and the labels, and this relationship is built into the generative process of the topic model, yielding a supervised method. A variational method based on mean-field theory, together with the EM algorithm, is then used to derive the update formulas for the parameters and the formulas for prediction, so that predictions can be made after training.

The classification experiment shows that this supervised topic model is feasible and effective. Compared with the support vector machine, a traditional machine learning algorithm that often achieves the highest accuracy, the supervised topic model takes less time for prediction but achieves lower accuracy. In addition, the supervised model achieves clearly higher accuracy than the combination of a topic model with a support vector machine, although it takes more time than the latter.
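The supervised construction summarized in the abstract, a topic model whose latent topic assignments feed a generalized linear model that produces the label, can be illustrated with a minimal generative sketch. This is not the thesis's implementation: the softmax response, the function names, and the parameters `alpha`, `beta` and `eta` are assumptions chosen for illustration, and the mean-field variational EM inference is not shown.

```python
import math
import random


def sample_dirichlet(alpha, rng):
    """Draw from Dirichlet(alpha) by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def sample_categorical(probs, rng):
    """Return an index drawn according to the probabilities in probs."""
    r = rng.random()
    cumulative = 0.0
    for i, p in enumerate(probs):
        cumulative += p
        if r < cumulative:
            return i
    return len(probs) - 1  # guard against floating-point round-off


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def generate_document(alpha, beta, eta, n_words, rng):
    """Generate one labeled document (all parameters are hypothetical).

    alpha: K Dirichlet hyperparameters for the topic proportions
    beta:  K x V topic-word probabilities
    eta:   C x K coefficients of the assumed softmax (GLM) response
    """
    k = len(alpha)
    theta = sample_dirichlet(alpha, rng)                       # topic proportions
    z = [sample_categorical(theta, rng) for _ in range(n_words)]  # topic per word
    words = [sample_categorical(beta[t], rng) for t in z]      # observed words
    z_bar = [z.count(t) / n_words for t in range(k)]           # mean topic assignment
    scores = [sum(e * zb for e, zb in zip(row, z_bar)) for row in eta]
    label_probs = softmax(scores)                              # GLM response on z_bar
    label = sample_categorical(label_probs, rng)
    return words, z_bar, label_probs, label


if __name__ == "__main__":
    rng = random.Random(0)
    alpha = [0.5, 0.5]                       # K = 2 topics
    beta = [[0.7, 0.2, 0.1], [0.1, 0.2, 0.7]]  # V = 3 word types
    eta = [[2.0, -2.0], [-2.0, 2.0]]         # C = 2 classes
    print(generate_document(alpha, beta, eta, n_words=20, rng=rng))
```

Because the label depends on the document only through the low-dimensional vector `z_bar`, prediction needs just the inferred topic assignments and one linear map.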
Keywords/Search Tags: Supervised Learning, Machine Learning, Dimensionality Reduction, Topic Model