Research On Ranking Topic Models And Their Applications

Posted on: 2015-07-24
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Z B Xiao
Full Text: PDF
GTID: 1228330467950835
Subject: Computer application technology
Abstract/Summary:
Topic models are generative probabilistic models that extract hidden semantic topics from large-scale discrete data. Since their proposal in 2003, topic models have become an important research area in machine learning, natural language processing, and computer vision, and have been widely applied in text mining, opinion mining, social network analysis, video scene understanding, protein structure analysis, financial data analysis, and other fields. However, as a corpus grows, the number of topics discovered by a topic model grows with it, and the results become harder and harder to use in other tasks. This dissertation conducts a detailed and systematic review of topic models, focusing on their development and evolution and on the characteristics of the various improved models. Building on this review and combining it with other techniques from machine learning, information retrieval, and natural language processing, it proposes query-based and query-independent ranking topic models and applies them to recommender systems and multi-document automatic summarization. The innovative results can be summarized as follows:

1) After a systematic review of topic models and a comparison of their unique characteristics with those of other machine learning paradigms, the research question "How to highlight the salient topics and ignore other non-significant topics" is raised. To answer this question, ranking topic models are proposed.

2) A correlation-based ranking topic model framework is proposed, which uses various relationships to re-order topic distributions without query words from the user. Without requiring any other prior information from the user, the algorithm addresses the problem of too many topics as the corpus grows by ordering topics by their importance in the corpus. Correlation-based ranking topic models are applied to multi-document automatic summarization. Compared with classic summarization algorithms and with summarization algorithms based on topic models, the experimental results show that correlation-based ranking topic models can indeed highlight important topic features and improve the performance of automatic summarization.

3) Query-based ranking topic models are proposed to re-order topic distributions under the supervision of query words provided by users. Topics are re-organized according to the users' intent and ordered by the proposed topic relevance, which increases the utility of the ranked topics. The proposed ranking topic models are applied to an academic paper recommender system to increase its serendipity. Query-based ranking topic models not only find topics similar to the users' query words but also discover semantically relevant topics, which raises both the serendipity of the system and the accuracy of its recommendations. Comparative experiments show that a recommender system based on the proposed algorithm has higher serendipity and robustness.

4) Behavioral experiments and functional magnetic resonance imaging (fMRI), techniques from brain and cognitive science, are introduced to study topic models. With these two techniques, cognitive behavior and brain activation areas are investigated while human subjects induce abstract topics. The preliminary, promising results are helpful for further research on topic models.
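
The difference between query-based and query-independent topic ranking can be illustrated with a minimal sketch. The abstract does not specify the dissertation's scoring functions, so this is not the author's method: cosine similarity is swapped in as a stand-in for both the "topic relevance" score and the correlation-based importance score, and the toy topics, the names `TOPICS`, `cosine`, `rank_by_query`, and `rank_by_correlation` are all hypothetical.

```python
import math

# Hypothetical toy data: each topic is a word -> probability mapping,
# standing in for the topic-word distributions a topic model would learn.
TOPICS = {
    "retrieval": {"query": 0.4, "ranking": 0.3, "document": 0.3},
    "vision": {"image": 0.5, "scene": 0.3, "video": 0.2},
    "summarization": {"document": 0.4, "summary": 0.4, "ranking": 0.2},
}


def cosine(a, b):
    """Cosine similarity between two sparse word -> weight vectors."""
    dot = sum(w * b.get(word, 0.0) for word, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_by_query(topics, query_words):
    """Query-based ranking (assumed scoring): most query-similar topic first."""
    query = {word: 1.0 for word in query_words}
    scores = {name: cosine(dist, query) for name, dist in topics.items()}
    return sorted(scores, key=scores.get, reverse=True)


def rank_by_correlation(topics):
    """Query-independent ranking (assumed scoring): a topic scores higher
    the more similar it is, in total, to the other topics in the corpus."""
    scores = {
        name: sum(
            cosine(dist, other_dist)
            for other, other_dist in topics.items()
            if other != name
        )
        for name, dist in topics.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    print(rank_by_query(TOPICS, ["ranking", "document"]))
    print(rank_by_correlation(TOPICS))
```

In this sketch the query-based ranker needs user input and places topics that share words with the query first, while the query-independent ranker uses only relationships among the topics themselves, so an isolated topic falls to the bottom.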
Keywords/Search Tags: Machine Learning, Topic Models, Ranking Topic Models, Multi-Document Automatic Summarization, Recommender System, fMRI