
Design And Implementation Of Public Opinion Oriented Education Theme Network Crawler

Posted on: 2016-10-17; Degree: Master; Type: Thesis
Country: China; Candidate: K Wang; Full Text: PDF
GTID: 2308330464962794; Subject: Computer technology
Abstract/Summary:
With the rapid development of the Internet, it has gradually become the main channel through which the public accesses all kinds of information. At the same time, the classification of Internet information has become more refined, and the thematic nature of online information is increasingly evident. Because current general-purpose search engines return too many results with weak topical relevance, this thesis proposes an education-topic web crawler, which is also an important component of a public opinion monitoring system. Topic web crawlers have become a research hotspot, but research on them in the education domain remains far from sufficient, so work on an education-topic web crawler is of real significance.

This thesis designs an education-topic web crawler in light of the current state of such work in China, by analyzing current search engine technology, the working principles of topic web crawler systems, search algorithms, topic information recognition technology, and related areas. The crawler can efficiently acquire and recognize education-related information on the Internet.

The search algorithm is one of the key technologies of a topic web crawler, and this thesis focuses on analyzing and improving it. It puts forward a topic-estimate search algorithm based on a cloud computing platform. The algorithm mainly comprises a search task scheduling algorithm on the cloud platform, an in-site search algorithm running on cloud nodes, a topic recognition algorithm based on the vector space model, and a de-duplication method based on a Bloom filter. By exploiting the advantages of cloud computing, such as high speed and good stability, the algorithm differs greatly from traditional topic web crawlers, which focus on stand-alone operation. Testing and analysis show that both crawling efficiency and the topical relevance of the acquired information improve significantly.
Building on the above research, the main direction for future work is to deploy this topic web crawler fully on the cloud computing platform and to solve the remaining key technical problems, so as to achieve a true education-topic web crawler on the cloud platform.
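
The abstract names two building blocks without showing them: topic recognition with a vector space model and a Bloom filter. The Python sketch below illustrates one common way these pieces fit together in a focused crawler; it is not the thesis's implementation. It assumes raw term-frequency vectors (a real system would more likely use TF-IDF weights), assumes the Bloom filter is used to skip already-seen URLs, and uses hypothetical names (`cosine_relevance`, `BloomFilter`, `filter_frontier`) and a hypothetical relevance threshold of 0.2.

```python
import hashlib
import math
from collections import Counter


def cosine_relevance(page_terms, topic_terms):
    """Cosine similarity between raw term-frequency vectors (assumed weighting)."""
    page_vec = Counter(page_terms)
    topic_vec = Counter(topic_terms)
    shared = page_vec.keys() & topic_vec.keys()
    dot = sum(page_vec[t] * topic_vec[t] for t in shared)
    norm = (math.sqrt(sum(v * v for v in page_vec.values()))
            * math.sqrt(sum(v * v for v in topic_vec.values())))
    return dot / norm if norm else 0.0


class BloomFilter:
    """Fixed-size Bloom filter; sizes here are illustrative, not tuned."""

    def __init__(self, num_bits=1 << 20, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive several bit positions by salting one hash with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # May return a false positive, never a false negative.
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


def filter_frontier(candidates, topic_terms, seen, threshold=0.2):
    """Keep URLs that are unseen and whose terms are relevant to the topic."""
    kept = []
    for url, terms in candidates:
        if url in seen:
            continue  # probably crawled already; skip it
        seen.add(url)
        if cosine_relevance(terms, topic_terms) >= threshold:
            kept.append(url)
    return kept
```

A Bloom filter trades a small false-positive rate (an unseen URL occasionally being skipped) for constant memory, which is why it suits URL sets too large to hold exactly on each node.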
Keywords/Search Tags: theme network crawler, education public opinion, information acquisition, C/S, ant colony algorithm