Design And Implementation Of Distributed Agricultural Network Data Acquisition Platform Based On Scrapy-Redis

Posted on:2020-01-07

Degree:Master

Type:Thesis

Country:China

Candidate:L L Wang

Full Text:PDF

GTID:2493305981453034

Subject:Master of Engineering

Abstract/Summary:


After years of informatization work, the agricultural field has achieved relatively good results, and agricultural information resources are flourishing on major websites. However, because agricultural informatization has been carried out under the old institutional framework, its foundation is relatively weak, and agricultural information resources are both massive and scattered. The result is a contradiction between increasingly rich online information resources and the lack of a personalized data acquisition platform: agricultural information resources go to waste while no platform exists to collect them. Under these circumstances, how to innovate data collection methods has become an important application topic for modern information technology in the agricultural field. Web crawlers, as data collection tools, are considered key to solving this problem. To address the above problems, this research works in a Python and Scrapy environment and takes meteorological websites and agricultural product prices as crawling targets. It exploratorily designs a topic content recognition algorithm based on the BERT model, which is used to evaluate the relevance between Web links and topic content. Finally, a distributed agricultural network data acquisition platform based on Scrapy-Redis is implemented. The work is divided into five parts:

(1) To address the lack of domain specialization in the results returned by traditional search engines, this research designs an XPath topic content extraction algorithm based on Python and an agricultural topic content recognition algorithm based on the BERT model. It focuses on how to evaluate the relevance between Web links and topic content with the BERT-based recognition algorithm. The algorithm is applied in a project that collects agricultural product prices. The research shows that it achieves relatively good recognition performance on natural-language text in the agricultural field.

(2) To determine whether web crawler technology can be applied in agriculture, this research chooses the Scrapy framework, which is simple to operate and functionally complete, and designs an experiment that collects data from an agrometeorological website with Scrapy. The experiment verifies the applicability of the Scrapy framework to agricultural topics and lays a foundation for the subsequent use of web crawlers to collect agricultural network data.

(3) To address the slow collection speed of general web crawlers, a distributed crawler framework based on Scrapy-Redis is designed and applied to the collection of agricultural product prices. The Scheduler component and the Item Pipeline component of the stand-alone Scrapy framework are redeveloped for agricultural projects so that they can perform distributed acquisition tasks. The distributed module consists of one Master host and four Slave hosts. The research shows that, compared with a single-machine crawler, the distributed crawler achieves a multiple-fold improvement in data acquisition speed.

(4) To address the countermeasures that some websites take against crawler programs, a crawler protection mechanism is designed, with preset strategies for dealing with anti-crawler measures, such as setting the User-Agent header to get past anti-crawler checks and adjusting the access frequency. This effectively reduces the risk of being blocked, strengthens the robustness of the crawler system, and improves the stability of the agricultural network data platform.

(5) A network data acquisition platform for agriculture is designed. The interface of each acquisition module is built with Qt and other program frameworks.

Based on the above work, this research implements a distributed agricultural network data acquisition platform based on Scrapy-Redis that combines the topic content recognition algorithm with web crawler technology.
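
Parts (3) and (4) of the abstract can be illustrated with a minimal sketch of the kind of configuration a Scrapy-Redis crawler typically uses. This is not the thesis's actual configuration. The setting names are standard Scrapy and scrapy-redis settings, but every value is an assumption made for illustration: the Redis address `master-host`, the delay of 2 seconds, the User-Agent strings, and the helper `request_headers` are all hypothetical.

```python
import random

# Hypothetical address of the Redis instance on the Master host
# (the abstract does not give one).
REDIS_URL = "redis://master-host:6379/0"

# Scrapy / scrapy-redis setting names; all values are illustrative assumptions.
SETTINGS = {
    # Part (3): move the request queue and duplicate filter into Redis so
    # that every Slave host pulls requests from the same shared queue.
    "SCHEDULER": "scrapy_redis.scheduler.Scheduler",
    "DUPEFILTER_CLASS": "scrapy_redis.dupefilter.RFPDupeFilter",
    "SCHEDULER_PERSIST": True,  # keep the queue in Redis between runs
    # Part (3): push scraped items into Redis instead of local storage.
    "ITEM_PIPELINES": {"scrapy_redis.pipelines.RedisPipeline": 300},
    "REDIS_URL": REDIS_URL,
    # Part (4): lower the access frequency.
    "DOWNLOAD_DELAY": 2.0,             # seconds between requests (assumed)
    "RANDOMIZE_DOWNLOAD_DELAY": True,  # jitter the delay around that value
    "AUTOTHROTTLE_ENABLED": True,      # adapt the delay to server latency
}

# Part (4): a small pool of User-Agent strings (example values only).
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
    "Mozilla/5.0 (X11; Linux x86_64) Gecko/20100101 Firefox/115.0",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15",
]


def request_headers(rng=random):
    """Return headers with a User-Agent picked at random from the pool."""
    return {"User-Agent": rng.choice(USER_AGENTS)}


if __name__ == "__main__":
    print(SETTINGS["SCHEDULER"])
    print(request_headers())
```

In a real project the `SETTINGS` entries would live in the project's `settings.py`, and the headers would normally be applied in a downloader middleware. Both are collapsed into one file here only to keep the sketch self-contained.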

Keywords/Search Tags:

Agricultural Big Data, Web Crawler, Distributed Acquisition, Scrapy-Redis, BERT Model


Related items

1	The Design And Implementation Of The Mass Sensor Data Acquisition System For Smart Agriculture
2	Research And Implement On The Distributed Storage System For The Intelligent Agriculture
3	Construction Of The Plantingbase Data Acquisition System Based On The Facility Of ModbusRTU
4	Research On The Application Of Big Data In Agricultural Internet Of Things
5	Research And Implementation Of Pig Breeding Report System Based On High-Performance Parallel Computing
6	Research Topic Identification And Evolution Analysis In The Field Of Agricultural Engineering Based On LDA-BERT-K-means Mode
7	Design And Implementation Of The Agricultural Microclimate Data Acquisition Station
8	Research And Development Of Agricultural Greenhouse Environment Intelligent Acquisition System Based On Wireless Transmission Network
9	Agricultural Data Acquisition System Based On Radio Frequency Networking
10	Design And Implementation Of Data Storage Management System For Agricultural Internet Of Things