Content Organization And Topic Monitoring On The News Platform

Posted on: 2015-12-24
Degree: Master
Type: Thesis
Country: China
Candidate: Y Zhang
GTID: 2298330467462378
Subject: Signal and Information Processing
Abstract/Summary:
Taking Internet news websites as its platform, this thesis studies content organization and topic monitoring using NLP and machine learning algorithms, with the aim of giving users a new browsing experience in which "interesting information" is easy to find. Through this text processing system, users can get real-time news, customize the news they prefer, and find news that might interest them by category. They can also discover real-time hot topics and track them.

The thesis first uses traditional text analysis methods to implement news organization, topic detection and tracking, and user news customization. For example, news of interest to a user is extracted automatically by a text classifier, and hot topics are detected in historical news with text processing algorithms such as Single-pass, NMF and LDA. The thesis then proposes a series of new solutions for the news platform, including news organization based on the HFTC algorithm and topic detection based on WBN-FTC. HFTC builds a hierarchical clustering structure over the news, so that users can find news by category, with each category carrying a semantic description. WBN-FTC overcomes a shortcoming of the FTC algorithm, namely that the support threshold is difficult to choose. It not only finds topics effectively, as LDA does, but also avoids the limitations of the VSM, so it performs better on massive data. In addition, topic size can be set by adjusting its parameters. On the engineering side, the thesis uses a search engine to implement the text mining algorithms, which both improves system efficiency and reduces programming cost.

The thesis also proposes two topic tracking schemes, based on query expansion and on a combined classifier respectively, and introduces the idea of using time series features for topic prediction and pattern recognition. These methods lay a foundation for new applications in the topic monitoring field.
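
As an illustration of the Single-pass approach to topic detection named in the abstract, the following is a minimal sketch and not the thesis's own implementation. It assumes whitespace tokenization, raw term-count vectors, cosine similarity, and a fixed similarity threshold; the names `tokenize`, `cosine`, `single_pass` and the default `threshold=0.3` are hypothetical choices made for this example.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (a stand-in for real NLP preprocessing)."""
    return text.lower().split()


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def single_pass(docs, threshold=0.3):
    """Assign each document, in arrival order, to its most similar topic,
    or open a new topic when no similarity reaches the threshold."""
    centroids = []  # one summed term-count vector per topic
    labels = []     # topic index assigned to each document
    for doc in docs:
        vec = Counter(tokenize(doc))
        best_idx, best_sim = -1, 0.0
        for i, centroid in enumerate(centroids):
            sim = cosine(vec, centroid)
            if sim > best_sim:
                best_idx, best_sim = i, sim
        if best_idx >= 0 and best_sim >= threshold:
            centroids[best_idx].update(vec)  # fold the document into the topic
            labels.append(best_idx)
        else:
            centroids.append(vec)            # start a new topic
            labels.append(len(centroids) - 1)
    return labels


news = [
    "stock market falls sharply",
    "stock market rebounds after falls",
    "football team wins final",
    "football final won by home team",
]
print(single_pass(news))  # [0, 0, 1, 1]
```

Because each document is compared only against the existing topic centroids and is never revisited, the result depends on arrival order and on the threshold, which suits a stream of incoming news but makes the threshold a sensitive parameter.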
Keywords/Search Tags: news organization, topic detection and tracking, HFTC, WBN-FTC, topic dynamic