Design And Realization Of An Internet Public Opinion Monitoring System

Posted on: 2014-01-24
Degree: Master
Type: Thesis
Country: China
Candidate: Y Tang
Full Text: PDF
GTID: 2248330398470906
Subject: Computer technology
Abstract/Summary:
With the continued development of information technology in recent years, the growing number of Internet users in China plays an increasingly important role in shaping public opinion. The online public opinion events of the last five years show that the Internet has become the main medium through which Chinese people freely express their opinions, ideas, and views. Because the network is open, inclusive, and without boundaries, it is hard to find an efficient way to supervise and monitor public opinion across the whole Internet. In the absence of effective monitoring tools, public opinion events have occurred frequently over the past several years, involving a great many government agencies, NGOs, corporations, and other organizations. As a result, there is an urgent need for an efficient and practical way to monitor opinion trends across the Internet.
In response to these problems, this thesis analyzes several key aspects of Internet public opinion monitoring and presents a complete and detailed requirements analysis, design proposal, verification proposal, and concrete implementation.
First, the thesis reviews the techniques most relevant to Internet public opinion monitoring, including general web crawlers, focused web crawlers, and several practical search strategies. It then introduces an implementation of Chinese word segmentation, based on ICTCLAS4J, under Nutch, the open source search engine framework. It also covers the key points of text feature extraction and several typical feature selection models, including information gain, the chi-square statistic, and mutual information, with particular attention to the TF-IDF algorithm. An overview of three text classifier models, kNN, naive Bayes, and Rocchio, is also given.
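As a rough illustration of the TF-IDF weighting mentioned above, the following Python sketch computes term weights for a few already-tokenized documents. It is not taken from the thesis, which works with Nutch and ICTCLAS4J; the function name `tf_idf`, the sample documents, and the choice of an unsmoothed natural-log IDF are all assumptions made for this example.

```python
import math
from collections import Counter

def tf_idf(docs):
    """docs: list of token lists. Returns one {term: weight} dict per document."""
    n_docs = len(docs)
    # Document frequency: the number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        # Weight = relative term frequency * log(N / document frequency).
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

# Hypothetical, pre-segmented documents (segmentation itself is out of scope here).
docs = [
    ["public", "opinion", "monitoring"],
    ["public", "crawler", "crawler"],
    ["opinion", "classifier"],
]
weights = tf_idf(docs)
```

In this sketch a term that is frequent in one document but rare in the collection, such as "crawler" in the second document, receives a higher weight than a term shared by several documents, such as "public".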
Next, the thesis compares the two main system architectures, B/S (browser/server) and C/S (client/server). It then analyzes the requirements of a mature, practical, efficient, and accurate Internet public opinion monitoring system, including the system's detection scope, functional requirements, business requirements, performance requirements, front-end user interface requirements, security requirements, and data acquisition and processing requirements. It also analyzes the characteristics of the different collection targets, including BBS forums, news comments, blogs, microblogs, and Baidu Tieba. Based on these requirements, the thesis presents the overall and detailed design of the system, covering the system framework, the back-end architecture, the system usage life cycle, and the design schemes of several key subsystems: the information collection subsystem, the information processing subsystem, the public opinion alarm subsystem, and the public opinion guidance subsystem. It then gives the design details of the directional crawler module and the automatic irritation module.
In the last part, based on the designs above, the thesis presents a sample implementation of the Internet public opinion monitoring system, including the development platform, the chosen database, and the primary development languages and tools. Finally, it shows the implemented results of several front-end pages.
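The idea behind a focused (directional) crawler can be sketched as a best-first traversal that only follows links from pages judged relevant to the topic. The Python example below is a minimal sketch under stated assumptions, not the thesis's crawler module: the in-memory `pages` dictionary stands in for real fetching, and the names `relevance` and `focused_crawl`, the keyword-overlap score, and the 0.5 threshold are hypothetical.

```python
import heapq

def relevance(text, topic_terms):
    """Fraction of topic terms that appear in the text (a deliberately crude score)."""
    words = set(text.lower().split())
    return sum(1 for t in topic_terms if t in words) / len(topic_terms)

def focused_crawl(seed, pages, topic_terms, threshold=0.5, limit=10):
    """Best-first traversal of an in-memory link graph.

    pages maps a URL to (page_text, [outgoing URLs]); it stands in for real fetching.
    """
    frontier = [(-1.0, seed)]  # heapq is a min-heap, so priorities are negated
    seen = {seed}
    relevant = []
    while frontier and len(relevant) < limit:
        _, url = heapq.heappop(frontier)
        text, links = pages.get(url, ("", []))
        score = relevance(text, topic_terms)
        if score < threshold:
            continue  # off-topic page: do not follow its links
        relevant.append(url)
        for link in links:
            if link not in seen:
                seen.add(link)
                # Unvisited links inherit the parent page's score as their priority.
                heapq.heappush(frontier, (-score, link))
    return relevant

# Hypothetical link graph used only for illustration.
pages = {
    "seed": ("public opinion news portal", ["a", "b"]),
    "a": ("forum thread on public opinion events", ["c"]),
    "b": ("cooking recipes", ["d"]),
    "c": ("blog post about opinion monitoring", []),
    "d": ("more recipes", []),
}
result = focused_crawl("seed", pages, ["public", "opinion"])
```

Here the off-topic page `b` is fetched but discarded, so its outgoing link `d` is never visited. A general crawler would follow every link regardless of topic.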
Keywords/Search Tags: Focused Crawler, Public Opinion Monitoring, Text Classification, Internet Public Sentiment