Text Semantic Mining Based On Topic Model

Posted on: 2016-04-14  Degree: Master  Type: Thesis
Country: China  Candidate: R Y Yang  Full Text: PDF
GTID: 2348330518499019  Subject: Information Science
Abstract/Summary:
In today's networked environment, open data sharing and dissemination have accumulated huge data resources. How to retrieve the data that users actually want from this mass has become an urgent and important problem. Text, as one of the most basic and widely used data types, has long been a focus of researchers. The topic model, an efficient feature-extraction method, has become a main approach to text analysis. It models the generative process of documents and extracts latent information, called topics, from the text, so that each text can be represented as a lower-dimensional vector over a group of topics. The objective of this thesis is to present an improved topic model that integrates external features, to use it to mine topics in scientific literature, and thereby to study topic evolution and authors' interests. Its significance lies in providing an efficient model and implementation process for text semantic mining. The main work is as follows:

(1) Bibliometric analysis of topic models. We searched for and downloaded literature related to topic models from the Web of Science. By drawing country and institution clustering maps, co-cited literature clustering maps, and keyword clustering maps, we present the research status of topic models visually. The results suggest that frontier research on topic models is developing in diversified directions: it has made new breakthroughs in traditional text mining and semantic analysis, and new progress in new application environments such as social media and big data.

(2) An improved topic model. After reviewing the basic theoretical models and the state of application research in the development of topic models, we combine the advantages of the dynamic topic model and the author topic model, taking both author features and time features into account, and propose a new model called the Dynamic Author Topic (DAT) model. We then describe the model in detail in terms of its introduction, basic hypotheses, graphical representation, and parameter estimation. Finally, the DAT model is compared with other topic models, revealing its advantages in application scenarios and complexity.

(3) Using the DAT model to study topic evolution and changes in authors' interests. By extracting topics from texts with the DAT model, we obtain the topic-term and author-topic probability distributions. Based on the topic-term distribution, we study how the content and strength of topics in scientific literature change. Based on how an author's attention to topics changes over time, we study changes in the author's interests. The experiment shows that the DAT model reflects topic evolution and changes in authors' interests accurately.

The contributions of this thesis are as follows. We revealed the research status and frontier hot spots of topic models through bibliometric analysis. We proposed an improved topic model integrating internal and external features. Through the topic-extraction experiment, we showed that the DAT model can be used in studies such as topic evolution in texts and changes in authors' interests.
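
The abstract does not give the DAT model's sampling equations, so the following is only a minimal sketch of the baseline it builds on: collapsed Gibbs sampling for plain LDA, which yields a document-topic distribution and a topic-term distribution. It is not the DAT model and has no author or time features. The function name `lda_gibbs`, the hyperparameter values, and the toy corpus are all assumptions made for illustration.

```python
import random


def lda_gibbs(docs, num_topics, vocab_size, alpha=0.1, beta=0.01,
              iters=200, seed=0):
    """Collapsed Gibbs sampler for plain LDA (illustrative, not DAT).

    docs: list of documents, each a list of integer word ids.
    Returns (theta, phi): document-topic and topic-term distributions.
    """
    rng = random.Random(seed)
    doc_topic = [[0] * num_topics for _ in docs]               # n(d, k)
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # n(k, w)
    topic_total = [0] * num_topics                              # n(k)
    assignments = []

    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(z_d)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Unnormalised conditional p(z = t | all other assignments).
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    # Smoothed point estimates from the final counts.
    theta = [
        [(doc_topic[d][t] + alpha) / (len(doc) + num_topics * alpha)
         for t in range(num_topics)]
        for d, doc in enumerate(docs)
    ]
    phi = [
        [(topic_word[t][w] + beta) / (topic_total[t] + vocab_size * beta)
         for w in range(vocab_size)]
        for t in range(num_topics)
    ]
    return theta, phi


# Toy corpus (hypothetical): word ids 0-1 and 2-3 tend to co-occur.
docs = [[0, 1, 0, 1], [2, 3, 2, 3], [0, 1, 1, 0], [3, 2, 3, 2]]
theta, phi = lda_gibbs(docs, num_topics=2, vocab_size=4)
```

Each row of `theta` is the lower-dimensional topic vector for one document, and each row of `phi` is one topic's distribution over terms. A model such as DAT would additionally condition these counts on authors and time slices, which this sketch does not attempt.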
Keywords/Search Tags: Topic Model, LDA Model, Gibbs Sampling, Text Semantic Mining, Topic Evolution