
Supervised Topic Models With Deep Learning And Its Application

Posted on: 2020-01-16  Degree: Master  Type: Thesis
Country: China  Candidate: D D Yuan  Full Text: PDF
GTID: 2428330626451359  Subject: Computer application technology
Abstract/Summary:
Electronic documents have multiplied rapidly in the Internet age. How to effectively mine the implicit semantics of these documents has become a hot issue in natural language processing. The LDA model posits that each document is an admixture of latent topics, where each topic is represented as its own distribution over a given vocabulary. The document-specific admixture proportion vector θ can be regarded as a low-dimensional representation of the document in a topical space. A supervised topic model can use side information, such as ratings or labels associated with documents, to discover more predictive low-dimensional topical representations. This thesis conducts an in-depth study of the topic model and its applications. The main contributions are as follows:

(1) This thesis presents a supervised topic model with deep learning (DL-sLDA). A deep neural network establishes the mapping between the topics of a document and its label. DL-sLDA can be used for both classification and regression tasks by changing the structure of the deep network. Building on the unsupervised topic model, DL-sLDA adds steps to the generative process that describe the relationship between the topic information and the label, so that it models document words and labels at the same time. Both variational EM and deep learning methods are used to learn the parameters of DL-sLDA. The experimental results demonstrate that DL-sLDA not only maintains the ability to extract topics but also gains better predictive ability. Compared with popular non-topic models (such as DAN), DL-sLDA achieves competitive results. The advantage of DL-sLDA is its ability to discover implicit topic information, which makes it interpretable.

(2) This thesis presents a text feature extraction method that combines LDA and word2vec. LDA defines a hierarchical data generation process in which documents and words are linked through topics. LDA can extract high-level topic features of text, and word2vec can extract low-level word features of text. A method that combines the high-level topic features with the low-level word features is proposed to construct a better document representation. The experimental results show that the proposed method outperforms methods that use only LDA features or only word2vec features.
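The combination of topic-level and word-level features in contribution (2) can be illustrated with a minimal sketch. The abstract does not say how the two feature sets are merged, so the sketch assumes the simplest option: concatenating a document's topic proportion vector with the average of its word vectors. The function names, the toy vectors, and the averaging step are all hypothetical and are not taken from the thesis. In practice the topic proportions would come from a trained LDA model and the word vectors from a trained word2vec model.

```python
from typing import Dict, List, Sequence


def average_word_vectors(
    tokens: Sequence[str],
    word_vectors: Dict[str, List[float]],
    dim: int,
) -> List[float]:
    """Average the vectors of in-vocabulary tokens; all zeros if none are known."""
    known = [word_vectors[t] for t in tokens if t in word_vectors]
    if not known:
        return [0.0] * dim
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


def combine_features(
    topic_proportions: Sequence[float],
    tokens: Sequence[str],
    word_vectors: Dict[str, List[float]],
    dim: int,
) -> List[float]:
    """Concatenate topic-level and word-level features (an assumed combination rule)."""
    return list(topic_proportions) + average_word_vectors(tokens, word_vectors, dim)


# Toy stand-ins: real values would come from trained LDA and word2vec models.
word_vectors = {
    "topic": [1.0, 0.0, 0.0],
    "model": [0.0, 1.0, 0.0],
}
theta = [0.7, 0.3]  # hypothetical topic proportions for one document (2 topics)

features = combine_features(theta, ["topic", "model", "unseen"], word_vectors, dim=3)
print(features)
```

The resulting vector has one entry per topic followed by one entry per word-vector dimension, and it can be passed to any standard classifier. Out-of-vocabulary tokens such as "unseen" are skipped when averaging.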
Keywords/Search Tags:Topic Model, Deep Learning, variational EM algorithm