To extract semantic structure from a text collection, a variety of unsupervised approaches have been proposed. Under the common "bag of words" assumption, documents are represented as vectors of term counts. On top of this representation, topic models have built a sophisticated statistical framework, following a line of work that has continuously refined the model structure. Statistical topic models are attractive because they allow rapid analysis and understanding of new collections of text. However, this framework alone cannot provide sufficient information for the problem of learning a topic hierarchy from data. It has recently been shown that data-driven learning approaches, combined with structural and prior knowledge, can offer a satisfactory solution.

In this paper, we review a new probabilistic framework that incorporates hierarchical document-frequency information into topics in order to uncover richer semantic structure. The hierarchical topics created by the DF topic model exhibit natural relationships beyond a tree structure. We illustrate our approach on 20 Newsgroups to show the performance of our model in extracting a hierarchy of topics.

From a cognitive science perspective, background knowledge is an important supplementary means of obtaining hierarchical topics, and much previous work has added side information when analyzing text data. We follow this idea in a different way: document frequency is derived from the data itself, so our method remains unsupervised. Finally, by combining document frequency with the statistical learning process, we aim to make this human-interpretable decomposition of the texts more semantic.
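The two ingredients named above, the bag-of-words representation and the document frequency (DF) statistic, can be sketched in a few lines of Python. This is a minimal illustration on a toy corpus, not the authors' implementation; all names here are our own.

```python
from collections import Counter

# Toy corpus; the paper's experiments use 20 Newsgroups instead.
docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "cats and dogs",
]

# Bag-of-words: each document becomes a term -> count mapping,
# i.e. a sparse count vector over the vocabulary.
bow = [Counter(doc.split()) for doc in docs]

# Document frequency: for each term, the number of documents
# that contain it at least once.
df = Counter()
for counts in bow:
    df.update(counts.keys())

print(bow[0]["the"])  # term count of "the" in the first document
print(df["sat"])      # number of documents containing "sat"
```

Because DF is computed directly from the corpus, using it as side information keeps the overall approach unsupervised, as the text argues.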