Design And Implementation Of Focused Crawler In Internet Public Opinion Supervisory System

Posted on: 2012-05-09
Degree: Master
Type: Thesis
Country: China
Candidate: X Wang
Full Text: PDF
GTID: 2178330335960428
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid worldwide development of the Internet, it has become one of the most important channels through which people obtain information and express public opinion. Because the Internet is open and virtual, Internet public opinion reaches more people, spreads faster, and erupts more suddenly than traditional public opinion. Supervising and managing Internet public opinion effectively is therefore essential.

This thesis reviews research on focused crawlers in public opinion supervisory systems and analyzes the key technologies, such as the selection strategy, web page evaluation algorithms, and avoidance of topic isolated islands. It then designs and implements a focused crawler for an Internet public opinion supervisory system. The main work is as follows:

(1) Analyzed and summarized the characteristics and difficulties of Internet public opinion supervisory systems, and identified the function and goal of the focused crawler within such a system.

(2) Analyzed the main functions and design features of focused crawlers, and designed the focused crawler for the Internet public opinion supervisory system, including the crawler framework, the rule module, the control module, and the user configuration module.

(3) Implemented the crawler, including web page crawling and analysis, the crawling strategy, elimination of similar (near-duplicate) web pages, and multi-crawler scheduling; improved and implemented the I-Match similar web page elimination algorithm and the consistent hashing scheduling algorithm; then ran experiments on the implemented algorithms and tested the focused crawler thoroughly.

The focused crawler meets the daily needs of the relevant staff at AQSIQ, and it outperforms a common crawler, and even a classic focused crawler, in performance, accuracy, and recall. It makes full use of information technology and better supports leadership decision-making and rapid response to Internet emergencies.
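The abstract names two algorithms without showing them: I-Match for eliminating similar pages and consistent hashing for scheduling multiple crawlers. The sketch below illustrates the basic, textbook form of each in Python; it is not the thesis's improved versions, whose details the abstract does not give. All names here (`imatch_signature`, `HashRing`, the `lexicon` argument, the `replicas` count, the crawler labels) are hypothetical, and the lexicon is assumed to be a precomputed set of informative terms (in I-Match it is typically chosen by IDF).

```python
import bisect
import hashlib
import re


def imatch_signature(text, lexicon):
    """Basic I-Match style fingerprint (illustrative, not the thesis's variant).

    Keeps only the unique tokens that appear in `lexicon`, sorts them,
    and hashes the result. Pages with equal signatures are treated as
    near-duplicates.
    """
    tokens = set(re.findall(r"[a-z0-9]+", text.lower()))
    kept = sorted(tokens & lexicon)
    if not kept:
        return None  # no lexicon terms: too little signal to fingerprint
    return hashlib.sha1(" ".join(kept).encode("utf-8")).hexdigest()


class HashRing:
    """Minimal consistent-hash ring that maps keys (e.g. hosts) to crawlers."""

    def __init__(self, nodes, replicas=100):
        self._replicas = replicas  # virtual points per node (assumed value)
        self._keys = []            # sorted ring positions
        self._owner = {}           # ring position -> node name
        for node in nodes:
            self.add(node)

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)

    def add(self, node):
        for i in range(self._replicas):
            pos = self._hash(f"{node}#{i}")
            if pos in self._owner:
                continue  # extremely unlikely collision: keep first owner
            bisect.insort(self._keys, pos)
            self._owner[pos] = node

    def remove(self, node):
        for i in range(self._replicas):
            pos = self._hash(f"{node}#{i}")
            if self._owner.get(pos) == node:
                del self._owner[pos]
                self._keys.remove(pos)

    def lookup(self, key):
        """Return the node owning the first ring position after hash(key)."""
        if not self._keys:
            raise LookupError("ring is empty")
        idx = bisect.bisect(self._keys, self._hash(key)) % len(self._keys)
        return self._owner[self._keys[idx]]
```

In a multi-crawler setup, a scheduler could call `lookup` on a URL's host so that each site is always fetched by the same crawler; when a crawler is removed from the ring, only the hosts it owned are reassigned, while all other assignments stay unchanged.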
Keywords/Search Tags: Internet public opinion, public opinion supervisory system, information collection, focused crawler