
The Design And Implementation Of Internet Web Content Acquisition System

Posted on: 2017-03-31
Degree: Master
Type: Thesis
Country: China
Candidate: M Jia
Full Text: PDF
GTID: 2348330518495764
Subject: Computer technology
Abstract/Summary:
With the rapid development of the Internet, new websites keep emerging and the volume of webpage data is growing exponentially. Analyzing webpages quickly, extracting the valuable information, and providing it to those who need it is therefore highly worthwhile. Moreover, monitoring and managing the entire collection process reduces management cost and improves collection efficiency. Against this background, this thesis presents the design and implementation of an Internet Web content acquisition system.

The system is divided into two parts: the web crawler back end and the crawler monitoring system. The crawler back end crawls Web data and extracts useful content according to a user-customized crawling strategy. The monitoring system manages related information such as crawlers and users. The system provides better underlying data support for the laboratory's related projects and an effective platform for crawler management.

The thesis implements the system based on Akka, Play, Thrift and other technologies. First, an overall design is developed from the requirements analysis, proposing the system's logical architecture and physical deployment. The design divides the system into four parts: the web crawler back end, the monitoring system, interactive communication, and data storage, and specifies the functions of each part. Then each part is implemented according to the overall design, and its working principles and implementation details are described. After implementation, functional and performance tests are carried out, the results are analyzed, and conclusions are drawn. Finally, the thesis summarizes the work and proposes an improvement plan for the system's shortcomings.
Keywords/Search Tags: web crawler, monitoring system, Akka, Play