Design And Implementation Of The Water Conservancy Information Aggregation System Based On Web Crawler

Posted on: 2020-05-22  Degree: Master  Type: Thesis
Country: China  Candidate: N Yan
GTID: 2370330599458687  Subject: Hydraulic engineering
Abstract/Summary:
With the development of water conservancy informatization, a huge amount of water-related information has accumulated on the Internet. This information is large in volume, scattered across many sources, and complex in structure, so collecting it with traditional manual methods is inefficient. Web crawlers can quickly and easily gather large amounts of highly relevant water conservancy information from the Internet. The system builds an information-gathering platform on top of a focused website crawler so that water conservancy information can be obtained and used. The main research contents of the thesis are as follows:

(1) Design the functional modules of the focused website crawler. Select the crawler framework, filter the set of seed websites, define the thesaurus that describes the topic, select the crawling strategy, design the link extraction scheme, and analyze and improve the topic relevance analysis algorithm.

(2) Design a scheme to obtain water conservancy information in GIS format. Information in GIS format is a feature that distinguishes water conservancy from other industries. Most geographic information about water conservancy on the Internet is provided through an interface, and users cannot directly obtain the original data. Based on the tile pyramid model, a map stitching and scaling algorithm is designed to acquire this information.

(3) Design schemes to obtain water conservancy information in multiple formats. The focused website crawler customizes its crawling scheme according to the characteristics of each format, so that it can comprehensively obtain water conservancy information as text, data, pictures, video, and maps.

(4) Design a module to standardize the water conservancy information. To address the non-uniform and non-standard nature of water conservancy information, the standardization module converts the information into a common format according to its type and applies corresponding processing algorithms to irregular information to standardize it.

(5) Build a water conservancy information aggregation platform. The platform is built on the focused website crawler, aggregates water conservancy information, and provides users with a variety of services such as displaying and searching the information.

The water conservancy information aggregation system uses the focused website crawler to crawl information and builds an information aggregation platform that provides users with a range of specialized information services. The system makes it more convenient for users to collect and use water conservancy information.
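
The abstract does not say which tiling scheme the map stitching in item (2) relies on. As a rough illustration only, the sketch below assumes the widely used Web Mercator XYZ tile pyramid with 256-pixel tiles. The function names `lonlat_to_tile`, `mosaic_layout`, and `parent_tile` are hypothetical and are not taken from the thesis. The code computes which tiles cover a bounding box at a given zoom level and where each tile would be pasted in the stitched image; downloading and pasting the tile images is left out.

```python
import math

TILE_SIZE = 256  # assumed tile edge length in pixels


def lonlat_to_tile(lon, lat, zoom):
    """Return the (x, y) index of the XYZ tile containing a lon/lat point."""
    n = 2 ** zoom  # number of tiles along each axis at this zoom level
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so points on the eastern or southern edge stay inside the grid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def mosaic_layout(west, south, east, north, zoom):
    """Map each covering tile to its pixel offset in the stitched image.

    Returns (layout, (width, height)), where layout maps (x, y) tile
    indices to the top-left pixel position of that tile in the mosaic.
    """
    x_min, y_min = lonlat_to_tile(west, north, zoom)  # top-left tile
    x_max, y_max = lonlat_to_tile(east, south, zoom)  # bottom-right tile
    layout = {}
    for x in range(x_min, x_max + 1):
        for y in range(y_min, y_max + 1):
            layout[(x, y)] = ((x - x_min) * TILE_SIZE, (y - y_min) * TILE_SIZE)
    width = (x_max - x_min + 1) * TILE_SIZE
    height = (y_max - y_min + 1) * TILE_SIZE
    return layout, (width, height)


def parent_tile(x, y):
    """Return the tile one level up the pyramid that contains tile (x, y)."""
    return x // 2, y // 2
```

In this scheme each zoom level doubles the number of tiles per axis, so scaling amounts to moving up or down the pyramid, and stitching reduces to pasting each tile at the offset returned by `mosaic_layout`.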
Keywords/Search Tags: water conservancy, focused website crawler, relevance analysis, GIS, information aggregation
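
The abstract mentions a topic thesaurus and a relevance analysis algorithm but gives no details of either. The sketch below is therefore not the thesis's algorithm. It shows one common approach in focused crawling, cosine similarity between a weighted thesaurus vector and a page's term-frequency vector. The thesaurus terms, their weights, the threshold, and the function names are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical topic thesaurus: term -> weight (not from the thesis).
THESAURUS = {"reservoir": 1.0, "flood": 0.9, "irrigation": 0.8, "hydrology": 0.7}


def tokenize(text):
    """Split text into lowercase alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def relevance(text, thesaurus=THESAURUS):
    """Cosine similarity between the page's term counts and the thesaurus weights."""
    counts = Counter(tokenize(text))
    if not counts:
        return 0.0
    dot = sum(weight * counts[term] for term, weight in thesaurus.items())
    norm_page = math.sqrt(sum(c * c for c in counts.values()))
    norm_topic = math.sqrt(sum(w * w for w in thesaurus.values()))
    return dot / (norm_page * norm_topic)


def is_relevant(text, threshold=0.2):
    """Decide whether a page is worth keeping; the threshold is an assumed value."""
    return relevance(text) >= threshold
```

A crawler could apply `is_relevant` to each fetched page and follow links only from pages that pass, which is the basic idea behind topic-focused crawling.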