
Research On Crawler Algorithm Based On Memristive Neural Network

Posted on: 2019-03-13
Degree: Master
Type: Thesis
Country: China
Candidate: H Zhang
GTID: 2348330569995780
Subject: Engineering
Abstract/Summary:
In the era of big data, the sheer volume of network data has made the shortcomings of traditional focused crawler technology increasingly prominent. Faced with ever-growing user demands for information search, focused crawler technology urgently needs improvement and optimization. In recent years, advances in artificial intelligence have provided new ideas for focused crawling, and applying artificial intelligence to focused crawler algorithms has become a hot topic in the web crawler field. In this context, this thesis builds on the memristive neural network model and studies a memristive neural network crawler algorithm and a Scrapy-based memristive neural network crawler system. The specific work is summarized as follows:

1) Research on a crawler algorithm based on the memristive neural network
This thesis proposes a crawler algorithm based on the memristive neural network, elaborates the activation propagation process of the neural network, and combines the breadth-first and best-first search strategies to design a search algorithm based on the memristive neural network. It also proposes two topic-relevance similarity algorithms: one based on the memristor model and one based on information entropy.

2) Design and implementation of a Scrapy-based memristive neural network crawler system
A web page blocking algorithm based on visual information and a clustering algorithm based on DBSCAN (Density-Based Spatial Clustering of Applications with Noise) are combined to design a block clustering algorithm for parsing web pages. URL filtering in the Scrapy framework is improved by analyzing the scheduling relationships among the crawler modules and by using a Bloom filter. On top of the open-source Scrapy crawler framework, a focused crawler system based on the memristive neural network is designed.

3) Algorithm application and experimental analysis
The crawler algorithm based on the memristive neural network and the focused crawler system are applied in a real project to collect and showcase the development achievements of Tibet. The experimental results show that the precision of the proposed crawler algorithm exceeds 50% after crawling a large number of web pages. Compared with the classic crawler algorithms based on breadth-first search and best-first search, and with a neural network crawler algorithm (the Hopfield net spider), the precision rate is increased by more than 10%. Introducing the block clustering algorithm to analyze web pages raises the precision rate of the focused crawler from 40% to 60%. The crawler algorithm and crawler system proposed in this thesis are therefore effective and feasible.
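The abstract mentions improving URL filtering with a Bloom filter but gives no implementation details. The following is a minimal sketch of that general idea, not the thesis's code: the class `BloomFilter`, the helper `filter_new_urls`, the 1% default error rate, and the SHA-256 double-hashing scheme are all assumptions chosen for illustration.

```python
import hashlib
import math


class BloomFilter:
    """Fixed-size Bloom filter for URL de-duplication (illustrative only)."""

    def __init__(self, capacity: int, error_rate: float = 0.01) -> None:
        # Standard sizing formulas: m = -n ln(p) / (ln 2)^2 and k = (m / n) ln 2.
        self.num_bits = max(1, math.ceil(-capacity * math.log(error_rate) / math.log(2) ** 2))
        self.num_hashes = max(1, round(self.num_bits / capacity * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, url: str):
        # Double hashing: derive k bit positions from two halves of one SHA-256 digest.
        digest = hashlib.sha256(url.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force odd so the step is never zero
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, url: str) -> None:
        for pos in self._positions(url):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, url: str) -> bool:
        # May return a false positive, never a false negative.
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(url))


def filter_new_urls(urls, seen: BloomFilter):
    """Return the URLs not yet recorded in `seen`, recording each as it is emitted."""
    fresh = []
    for url in urls:
        if url not in seen:
            seen.add(url)
            fresh.append(url)
    return fresh


if __name__ == "__main__":
    seen = BloomFilter(capacity=1000)
    batch = ["http://a.example/1", "http://a.example/2", "http://a.example/1"]
    print(filter_new_urls(batch, seen))
```

The trade-off is memory versus accuracy: the filter uses a fixed bit array regardless of URL length, at the cost of occasionally skipping a page that was never crawled (a false positive). In Scrapy, request de-duplication is pluggable through the `DUPEFILTER_CLASS` setting, so a filter of this kind would presumably be wired in there; how the thesis integrates it is not stated in the abstract.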
Keywords/Search Tags: focused crawler, memristive neural network, Scrapy, block clustering