| Information on the Internet is now rich and diverse, and search engine technology has emerged to meet the need to extract target information, or to locate key information, from massive amounts of data. Within a search engine, the component that acquires and analyzes data is the web crawler. Many types of web crawler exist, differing in function and characteristics, and crawlers can also be applied in the field of hacking. The most common and most widely used type is the crawler that supplies data to a search engine for retrieval; to give users the latest and most comprehensive results, it runs continuously. This paper studies web crawlers through the data-crawling mechanism of search engines. Building on an analysis of existing crawler types and their characteristics, and of the working principle of search engines, it analyzes the operating mechanism, operating principle, and characteristics of web crawlers, with particular attention to the incremental crawler mechanism, and designs and implements a search engine system based on that mechanism. The main research contents are as follows. First, the web crawler and its operating principle are studied. Next, the search engine system is studied, including its indexing mechanism and its data update mechanism. A basic search engine system is then implemented on the Linux platform with the help of JavaEE design patterns, and a general-purpose crawler and an incremental crawler are each deployed in it. Experiments on data acquisition and data updating are carried out under the two crawler mechanisms. By visualizing the experimental results and data, the general-purpose and incremental crawlers are analyzed and compared, the advantages of the incremental crawler within the search engine system are summarized, and on this basis a fully functional search engine system built on the incremental crawler is designed and implemented. |