
Research And Application Of Network Public Opinion Topic Detection And Tracking Technology

Posted on: 2014-01-01
Degree: Master
Type: Thesis
Country: China
Candidate: B Yi
GTID: 2248330398957599
Subject: Computer application technology
Abstract/Summary:
As online public opinion has grown in importance to governments and enterprises, more and more network public opinion monitoring systems have been developed to help them cope with public opinion pressure or group events that break out online. On the Internet, the threshold for using and publishing information is low, and information spreads quickly and has great influence. In this environment, a real-time monitoring system collects relevant information, analyzes its content intelligently, and detects public opinion crises in a timely manner. Such automated monitoring and processing of network public opinion helps users deal with crises promptly.

This thesis first surveys the current status and development trends of network public opinion monitoring systems, the acquisition of massive public opinion information, and topic tracking research, focusing on web crawler design and on topic detection algorithms and models. The basic ideas, advantages, and disadvantages of the various clustering-based topic detection algorithms are analyzed and summarized.

It then presents the overall design of the Internet public opinion monitoring system and elaborates on the acquisition of massive network public opinion information, describing in detail the design of the web crawler and the microblog crawler. News, forum, and blog content is collected with the open-source web crawler Larbin, which is improved on its original basis so that it meets the requirements of the system.

The text pre-processing module for public opinion information is described briefly, covering text vectorization, feature extraction and feature weight calculation, web page purification, web page deduplication, and automatic web page summarization.

Topic detection and topic tracking are then described in detail, with a design tailored to the characteristics of massive public opinion information. Texts are represented with the vector space model, including the setting of feature item weights and the feature dimension. The similarity algorithm is improved in order to raise the efficiency of the clustering algorithm. After comparing various clustering algorithms, the thesis proposes a hybrid clustering algorithm, SHDC. For topic tracking, a tracking model based on multi-dimensional features is designed, and it is shown that this model can effectively distinguish similar events from the same event and correctly track topics on the Internet.

Although topic detection and hot-spot discovery technologies are relatively mature, many problems arise when they are put into practice, and these seriously hamper their effectiveness in public opinion monitoring systems. Establishing a hot topic discovery model therefore has high research significance and value. Finally, the feasibility and effectiveness of the model are validated by running examples and comparative analysis.
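The abstract names the vector space model, feature weights, text similarity, and clustering-based topic detection, but does not give their formulas. As a rough illustration only, the sketch below assumes TF-IDF as the feature weight, cosine similarity as the text similarity, and plain single-pass incremental clustering as the detection step. It is not the thesis's SHDC algorithm or its improved similarity measure, and all function names (`tokenize`, `tfidf_vectors`, `cosine`, `single_pass_cluster`) and the threshold value are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Naive whitespace tokenizer (an assumption; Chinese text needs a word segmenter)."""
    return text.lower().split()


def tfidf_vectors(docs):
    """Represent each document as a sparse {term: weight} vector with TF-IDF weights."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))  # count each term once per document
    vectors = []
    for tokens in tokenized:
        vec = {}
        for term, count in Counter(tokens).items():
            tf = count / len(tokens)
            idf = math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0  # smoothed IDF
            vec[term] = tf * idf
        vectors.append(vec)
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def single_pass_cluster(vectors, threshold=0.3):
    """Assign each document to the most similar existing topic, or open a new one."""
    centroids = []  # summed member vectors; cosine similarity ignores their scale
    clusters = []   # each cluster is a list of document indices
    for idx, vec in enumerate(vectors):
        best, best_sim = None, threshold
        for c, centroid in enumerate(centroids):
            sim = cosine(vec, centroid)
            if sim >= best_sim:
                best, best_sim = c, sim
        if best is None:
            centroids.append(dict(vec))
            clusters.append([idx])
        else:
            for term, w in vec.items():
                centroids[best][term] = centroids[best].get(term, 0.0) + w
            clusters[best].append(idx)
    return clusters


docs = [
    "earthquake hits city rescue teams arrive",
    "rescue teams search city after earthquake",
    "stock market falls on bank fears",
    "bank fears push stock market lower",
]
print(single_pass_cluster(tfidf_vectors(docs)))
```

In this sketch the threshold controls topic granularity: a higher value opens new topics more readily, while a lower value merges more documents into existing ones. Single-pass clustering is also order-dependent, which is one reason hybrid schemes are often considered.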
Keywords/Search Tags: Network public opinion, Web crawler, Text similarity, Text clustering, Topic tracking