With the advent of the Internet and the explosive growth of rapidly changing information, the need to obtain more accurate, more detailed, and more in-depth professional resources places higher demands on search engine technology. Vertical search engines built around specific themes have emerged in response. At the same time, digital products have enriched people's lives, and buying them through online group purchase has become a popular mode of consumption. Against this combined market and technical background, this work applies several vertical search techniques, takes the product information published on well-known domestic digital product websites as the collected resource and the object of full-text retrieval, and builds on the open-source Lucene toolkit to study and implement a specialized search engine for digital product information.

In addition to detailing the working principle of vertical search engines and the core technology of the Lucene platform, this paper introduces three key technologies closely related to building the search engine system. First, it covers the working principle of focused crawlers and the Heritrix crawler; based on an analysis of traditional crawling strategies, a crawling strategy algorithm is proposed and studied, which introduces several weighting parameters, such as link popularity and importance, together with a shortest-path search. Second, it introduces methods for extracting information from web pages and the common ways of classifying them, and then proposes an extraction method based on the design rules of digital product pages: the structure of general web design rules is analyzed, and content extraction schemes are presented for several of these rules. Third, it introduces natural language processing, which handles the user's query before retrieval; on grounds of practical efficiency, the system adopts the maximum-probability word segmentation algorithm, and key techniques for semantic search are studied. The implementation covers the design of the system's functional structure, the function of each module, the data flow diagrams, the database, and the coding.

Research and practice show that the vertical search engine for digital products studied and designed in this paper is feasible as a project, and that the system meets the expected design philosophy and objectives.
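As an illustration of the maximum-probability segmentation idea mentioned above, the following is a minimal sketch, not the system's actual implementation. It assumes a unigram model: each candidate word has an independent probability, and the best segmentation is the one whose word probabilities have the largest product, found by dynamic programming over log-probabilities. The function name `segment`, the parameters `max_word_len` and `unknown_prob`, and the toy probabilities are all hypothetical. The example uses unspaced English text for readability; the same procedure applies character by character to Chinese text.

```python
import math


def segment(text, word_probs, max_word_len=8, unknown_prob=1e-8):
    """Split text into the word sequence with the highest unigram probability.

    word_probs maps known words to (assumed) probabilities. Single characters
    missing from the dictionary get unknown_prob, so every text can be split.
    """
    n = len(text)
    best = [float("-inf")] * (n + 1)  # best[i]: best log-probability of text[:i]
    back = [0] * (n + 1)              # back[i]: start index of the last word in text[:i]
    best[0] = 0.0

    for end in range(1, n + 1):
        for start in range(max(0, end - max_word_len), end):
            word = text[start:end]
            if word in word_probs:
                prob = word_probs[word]
            elif len(word) == 1:
                prob = unknown_prob   # fallback for unknown single characters
            else:
                continue              # unknown multi-character strings are not words
            score = best[start] + math.log(prob)
            if score > best[end]:
                best[end] = score
                back[end] = start

    # Walk the back-pointers from the end of the text to recover the words.
    words = []
    end = n
    while end > 0:
        start = back[end]
        words.append(text[start:end])
        end = start
    words.reverse()
    return words


# Toy probabilities, invented for illustration only.
toy_probs = {"no": 0.004, "now": 0.002, "here": 0.001,
             "where": 0.001, "nowhere": 0.000001}
print(segment("nowhere", toy_probs))
```

With these toy values, the split "no" + "where" has probability 4e-6, which beats "now" + "here" (2e-6) and the single word "nowhere" (1e-6), so it is the one selected. In a real system the probabilities would be estimated from word frequencies in a corpus.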