
Design And Implementation Of Distributed Wheat Pest And Disease Theme Search System

Posted on: 2020-02-09
Degree: Master
Type: Thesis
Country: China
Candidate: Y Y Liu
GTID: 2393330578466862
Subject: Agricultural informatization
Abstract/Summary:
Wheat is one of China's most important food crops, and its supply must be kept secure, but pests and diseases seriously reduce its yield and quality. A search system dedicated to crop pests and diseases, offering real-time, accurate full-text search and management of relevant data on the web, is therefore of great significance and practical value: it can improve the dissemination of pest and disease knowledge, raise the efficiency of agricultural technicians, and promote pest and disease prevention and control. Focusing on wheat pests and diseases, this thesis uses vertical search technology to design and implement a theme search system built on distributed information collection and data storage. The system is divided into three modules, and the main work is as follows:

(1) Design and implementation of the theme data acquisition module. The thesis analyzes the operating principle of the single-machine crawler framework Scrapy and builds a distributed crawler system by combining the Redis database with customized development of Scrapy's core modules. The distributed crawler uses the bandwidth and processors of multiple machines to download network resources in parallel, achieving fast, stable, and scalable crawling. A Bloom filter is introduced for URL deduplication during crawling, which improves memory utilization on the Redis host. The vector space model (VSM) algorithm is improved by using TF-IDF values weighted by web page tags as the feature-term weights, which improves topic crawling effectiveness by about 10%.

(2) Design and implementation of the system index module. The Elasticsearch distributed search engine is introduced and optimized to achieve highly available, highly scalable distributed storage of large-scale data. The IK tokenizer is used in the analyzer with a hot-updatable lexicon, which improves Chinese word segmentation, and an inverted index library is designed and built to improve data retrieval efficiency.

(3) Design and implementation of the system search module. A search data cache layer is designed to reduce frequent queries against the index library during user searches, greatly improving the response speed of the search system. A prototype system is developed on the Django framework and connected to the Elasticsearch search server to provide search services to users. Alongside the search function, popular-search and recent-search-history features are implemented to improve the user experience.

A comparison with the results of a general-purpose search engine shows that, on the theme of wheat pests and diseases, the precision of the system is higher than that of the general-purpose engine. The system therefore has practical value and can provide a wheat pest and disease knowledge search service for wheat industry technicians and new agricultural operators.
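The Bloom filter deduplication mentioned for the crawler can be illustrated with a minimal sketch. It is not the thesis's implementation: the class name `BloomFilter`, the helper `should_crawl`, the bit-array size, and the number of hash functions are all assumptions chosen for illustration, and the bits are held in local memory here, whereas a distributed crawler would keep them in a shared store such as a Redis bitmap.

```python
import hashlib


class BloomFilter:
    """Illustrative in-memory Bloom filter (hypothetical sizes, not from the thesis)."""

    def __init__(self, size_bits=1 << 20, num_hashes=5):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8 + 1)

    def _positions(self, url):
        # Derive num_hashes bit positions by salting a SHA-256 digest with the index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{url}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, url):
        for pos in self._positions(url):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, url):
        # True means "probably seen"; False means "definitely not seen".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(url)
        )


def should_crawl(seen, url):
    """Return True the first time a URL is offered and record it; False afterwards."""
    if url in seen:
        return False
    seen.add(url)
    return True
```

The memory saving comes from storing a fixed number of bits per filter rather than every URL string. The cost is a small false-positive rate, so an occasional unseen URL may be skipped.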
Keywords/Search Tags:wheat pests and diseases, subject search system, distributed crawler, distributed storage, topic crawling
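
For topic crawling, the abstract describes a vector space model whose feature weights are TF-IDF values weighted by web page tags. The sketch below shows one way such a weighting could work; the tag weights in `TAG_WEIGHTS`, the smoothed IDF formula, and all function names are assumptions made for illustration, since the abstract does not give the actual values or formulas.

```python
import math
from collections import Counter

# Hypothetical tag weights: terms in the title count more than terms in the body.
TAG_WEIGHTS = {"title": 3.0, "h1": 2.0, "body": 1.0}


def weighted_tf(page):
    """page maps a tag name to its token list; returns tag-weighted term counts."""
    tf = Counter()
    for tag, tokens in page.items():
        weight = TAG_WEIGHTS.get(tag, 1.0)
        for token in tokens:
            tf[token] += weight
    return tf


def idf_table(pages):
    """Smoothed inverse document frequency over a small corpus (an assumed variant)."""
    n = len(pages)
    df = Counter()
    for page in pages:
        df.update({t for tokens in page.values() for t in tokens})
    return {t: math.log((1 + n) / (1 + d)) + 1.0 for t, d in df.items()}


def tfidf_vector(page, idf):
    """Combine tag-weighted term frequency with IDF into a sparse vector."""
    return {t: f * idf.get(t, 1.0) for t, f in weighted_tf(page).items()}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)
```

A crawler could compare each page vector against a topic vector and follow links only from pages whose similarity exceeds a chosen threshold; the threshold itself would need tuning on real data.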