Joint-NMF Based Topic Detection And Evolution Analysis

Posted on: 2018-01-12
Degree: Master
Type: Thesis
Country: China
Candidate: M W Chen
GTID: 2348330512987248
Subject: Computer application technology
Abstract/Summary:
Internet technology has advanced rapidly in recent years, and the number of online news outlets has grown with it. Online news has gradually become an important source of information for the public. However, online news is characterized by temporality and dispersion: the same event may be reported by different media over a period of time, and the topics of the event shift as time passes. This makes it hard for readers to identify the hot topics and to understand the whole event quickly and accurately. Given the massive volume of reports, detecting the hot topics of news events so that people can understand the whole event is an urgent problem, and it is the primary concern of topic detection and evolution analysis. The basic task of this field is to find and track the latent topics contained in a large-scale dataset and then to analyze how the hot topics change. Much existing research considers only the structural information of news reports and ignores their continuity in the time dimension. As a result, the hot topics discovered for the same event can differ widely, which makes the whole event harder to understand.

To address this problem, we develop a Joint-NMF Based Topic Detection and Evolution Analysis (ToD) approach. The main contributions of this thesis are: 1) We propose a novel joint non-negative matrix factorization (NJNMF) algorithm within the ToD approach that uses time information to capture temporal discriminant topics from a dataset. Because it builds on a Joint-NMF algorithm, NJNMF can be used to analyze the trend of hot topic evolution. 2) We add a penalty function and a newly defined iteration rule to the NJNMF algorithm. Given the temporality and dispersion of news, the penalty function and iteration rule serve the purpose of discriminant topic detection. 3) We adopt the concept of entropy in the ToD approach to eliminate noise topics. To ensure the reliability of the data in the topics, we propose a method for selecting high-quality topics, which prevents noise from influencing the results of the topic evolution analysis.

To verify the validity of the ToD approach, we conduct experiments on three real datasets: 20Newsgroups, LTN2011 and LTN2014 (Mexican illegal immigration news reports). First, we carry out a comparative experiment on the 20Newsgroups dataset; compared with existing methods, the ToD approach provides a better solution to the topic detection problem. Then, we analyze the Mexican immigration issue along the time dimension in the LTN2011 dataset. Last, we analyze the issue across the news of different media. Experimental results show that the ToD approach outperforms the state-of-the-art approaches on topic detection and is well suited to analyzing topic evolution.
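
The abstract names two building blocks, non-negative matrix factorization and entropy-based removal of noise topics, without giving their formulas. The sketch below illustrates both under stated assumptions and is not the thesis's NJNMF algorithm: it uses plain NMF with the standard Lee-Seung multiplicative updates (no joint factorization, penalty function, or time information), it assumes a documents-by-terms matrix `V`, and it assumes that a topic whose term distribution has low Shannon entropy is "high quality". The function names and the `max_entropy` threshold are hypothetical.

```python
import math
import random


def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def frobenius_error(V, W, H):
    """Frobenius norm of V - W H, the quantity plain NMF minimizes."""
    WH = matmul(W, H)
    return math.sqrt(sum((v - r) ** 2 for rv, rr in zip(V, WH) for v, r in zip(rv, rr)))


def nmf(V, k, iters=200, seed=0, eps=1e-9):
    """Plain NMF, V (docs x terms) ~ W (docs x topics) H (topics x terms).

    Uses the standard multiplicative updates; this is NOT the NJNMF
    iteration rule from the thesis, which the abstract does not specify.
    """
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + eps for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + eps for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[H[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)] for i in range(k)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)] for i in range(n)]
    return W, H


def topic_entropy(weights):
    """Shannon entropy (in nats) of a topic's normalized term weights."""
    total = sum(weights)
    if total <= 0:
        return math.inf  # assumption: an all-zero topic is treated as noise
    probs = [w / total for w in weights]
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_topics(H, max_entropy):
    """Indices of topics whose term distribution is focused enough.

    Assumption: low entropy means a focused, high-quality topic; the
    threshold is hypothetical and would need tuning per dataset.
    """
    return [i for i, row in enumerate(H) if topic_entropy(row) <= max_entropy]


if __name__ == "__main__":
    # Toy corpus: two documents about terms 0-1, two about terms 2-3.
    V = [[3.0, 2.0, 0.0, 0.0],
         [2.0, 3.0, 0.0, 0.0],
         [0.0, 0.0, 4.0, 1.0],
         [0.0, 0.0, 1.0, 4.0]]
    W, H = nmf(V, k=2)
    print("reconstruction error:", frobenius_error(V, W, H))
    print("kept topics:", select_topics(H, max_entropy=1.0))
```

In a setting like the one the abstract describes, the rows of `H` would be the candidate topics and the entropy filter would run before any evolution analysis, so that diffuse topics do not distort the trend.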
Keywords/Search Tags: Joint-NMF, Topic model, Temporal discriminant topic, High quality topic, Topic detection and evolution analysis