Study On Topic Detection And Tracking Based On Tolerance Rough Set

Posted on: 2010-12-10
Degree: Master
Type: Thesis
Country: China
Candidate: F Wu
Full Text: PDF
GTID: 2178360272982452
Subject: Cryptography
Abstract/Summary:
With the rapid development of the Internet, web news publication has become one of its major applications and a vital factor influencing public opinion. Properly monitoring the content of web news is therefore essential, and using computer technology to collect and analyze web news is a crucial measure for maintaining network content security. This thesis first surveys the characteristics of web news corpora and the recent document representation models developed by other research institutions. Current document representation models are poorly suited to web news corpora for two reasons: document terms are sparse, and the keywords of a topic change over the course of topic tracking. Consequently, based on theoretical analysis and experimental verification, this thesis combines the vector space model with a tolerance rough set model to represent documents. In this model, term co-occurrence features are used to define the tolerance classes of terms. The experimental results show that the tolerance rough set model can improve performance. Finally, a prototype system built on the improved model is presented to realize public opinion monitoring.
Keywords/Search Tags: Information Content Security, Public Opinion Analysis, Topic Detection and Tracking, Rough Set, Tolerance Rough Set
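
The abstract says that term co-occurrence is used to define the tolerance classes of terms, but gives no formulas. As a rough illustration only, the sketch below follows the commonly described tolerance rough set formulation: two terms are tolerant of each other when they co-occur in at least `theta` documents, and a document's upper approximation is every term whose tolerance class overlaps the document. The function names, the threshold `theta`, and the toy corpus are assumptions made for this example and are not taken from the thesis.

```python
from collections import defaultdict
from itertools import combinations


def tolerance_classes(docs, theta):
    """Map each term to the terms that co-occur with it in >= theta documents.

    Assumption: co-occurrence is counted once per document, and every term
    belongs to its own tolerance class.
    """
    cooc = defaultdict(int)
    vocab = set()
    for doc in docs:
        terms = set(doc)
        vocab |= terms
        # Count each unordered pair of distinct terms once per document.
        for a, b in combinations(sorted(terms), 2):
            cooc[(a, b)] += 1

    classes = {t: {t} for t in vocab}
    for (a, b), n in cooc.items():
        if n >= theta:
            classes[a].add(b)
            classes[b].add(a)
    return classes


def upper_approximation(doc, classes):
    """Return every term whose tolerance class shares a term with the document."""
    terms = set(doc)
    return {t for t, cls in classes.items() if cls & terms}


# Made-up toy corpus, for illustration only.
docs = [
    ["earthquake", "rescue", "sichuan"],
    ["earthquake", "rescue", "aid"],
    ["earthquake", "aid", "donation"],
    ["election", "vote"],
]
classes = tolerance_classes(docs, theta=2)
expanded = upper_approximation(["rescue"], classes)
```

In this toy run, a document containing only "rescue" is expanded to also include "earthquake", which is how such a representation can ease term sparseness. In a vector space setting, the added terms would then typically receive a reduced weight; that weighting scheme is left out here.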