
Design And Implementation Of Monitoring And Exception Analysis Module In Content Convergent Subsystem

Posted on: 2019-03-23
Degree: Master
Type: Thesis
Country: China
Candidate: L Wei
GTID: 2348330545955580
Subject: Computer Science and Technology
Abstract/Summary:
With the explosive growth of Internet information, more and more enterprises acquire the data they need through web crawlers and then integrate it. The content convergent subsystem crawls data from multiple websites through customizable crawlers to provide data services for the China Broadcasting Cloud platform. However, customizable crawler tasks all run as long-lived scripts. Administrators cannot see the real-time running status of the different crawlers, which makes it difficult to manage them uniformly. In addition, when a crawler task raises an exception, the administrator has to inspect the logs manually, so it is hard to locate the exception and its cause quickly, and maintenance costs are high.

To overcome these shortcomings in management and exception analysis, this paper presents the monitoring and exception analysis module used in the content convergent subsystem. The module provides management, monitoring, control, exception warning and exception analysis of crawler tasks. Users can follow the real-time running status of different crawler tasks and manage and control them uniformly. When a large number of exceptions occur while a task is running, the user receives an exception warning email and can handle the problem promptly. After a crawler task finishes, the user can view a report to understand what happened and to quickly locate the cause and location of each exception.

To achieve these functions, the monitoring and exception analysis module is divided into five submodules: crawler task management, crawler task real-time status visualization, crawler task control, crawler exception warning and crawler exception analysis. The database collection structure is designed first, to support the storage of crawler-related data. Based on this collection design, the crawler task management submodule implements the task management function. The real-time status visualization submodule creates duplex communication links based on WebSocket and displays the current running status of each crawler task in real time. The crawler task control submodule implements the start and stop functions of crawler tasks. The crawler exception warning submodule issues a warning when a crawler task produces significant exceptions. Based on a set of exception reason classification rules, the crawler exception analysis submodule classifies and matches the exception logs generated by a crawler task, analyzes the cause and location of each exception, and finally generates a report for the user.

The submodules are designed and implemented after requirements analysis and research into the key issues. The paper designs and executes test cases for the submodules, and the results show that the submodules meet the requirements. Finally, the paper closes with a summary of the whole work.
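As an illustration only, the rule-based exception classification and the warning check described above might look like the following Python sketch. The abstract does not list the actual classification rules, log format or warning threshold, so the rule names, the regular expressions, the `classify`, `build_report` and `should_warn` helpers, and the threshold of 3 are all assumptions made for this example, not the thesis implementation.

```python
import re

# Hypothetical classification rules: (category, pattern).
# The thesis does not publish its rules; these are placeholders.
RULES = [
    ("network_timeout", re.compile(r"timed? ?out", re.IGNORECASE)),
    ("http_error", re.compile(r"\bHTTP (4\d\d|5\d\d)\b")),
    ("parse_error", re.compile(r"(XPath|selector|parse) .*(failed|error)", re.IGNORECASE)),
]


def classify(line):
    """Return the category of the first rule whose pattern matches the log line."""
    for category, pattern in RULES:
        if pattern.search(line):
            return category
    return "unknown"


def build_report(log_lines):
    """Map each exception category to the 1-based log line numbers where it occurred."""
    report = {}
    for number, line in enumerate(log_lines, start=1):
        report.setdefault(classify(line), []).append(number)
    return report


def should_warn(exception_count, threshold=3):
    """Decide whether a warning should be sent (threshold is an assumed value)."""
    return exception_count >= threshold
```

A report built this way gives both the cause (the category) and the location (the log line numbers) for each exception, which is the kind of information the abstract says the generated report should expose.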
Keywords/Search Tags:Web Crawler, Monitoring, Exception Analysis, Web Real-Time Communication, Data Management