Design And Development Of Mobile Product Information Vertical Search Engine

Posted on: 2012-09-07
Degree: Master
Type: Thesis
Country: China
Candidate: D N Hua
Full Text: PDF
GTID: 2178330335452653
Subject: Computer Science and Technology
Abstract/Summary:
The Internet is growing rapidly, and the number of web pages is increasing geometrically, so search engines are used more and more widely. General-purpose search engines solve part of the information retrieval problem, but as the number of pages they return rises sharply, it becomes difficult for users to find satisfactory information. Each user has different needs, yet general-purpose search engines do not differentiate their results accordingly, which causes users considerable inconvenience.

The vertical search engine has emerged as one direction in the development of general-purpose search engines. As industries become more specialized and the division of labor grows finer, building a search engine for a single industry has become a new direction. A vertical search engine, also known as a professional or specialized search engine, is a query tool designed for retrieving information on one theme. It collects only information about one field, industry, or theme, and it is more effective than a general search engine at answering some practical queries.

In effect, a vertical search engine integrates one specialized category of information from the web: it extracts data from targeted sites and returns it to the user in a particular form. The biggest difference between a vertical search engine and an ordinary web search engine is the extraction of structured information from web pages, that is, pulling specific structured data out of each page. Where the smallest unit of web search is a page, the smallest unit of vertical search is a structured data record. The vertical search engine stores these records in a database for further processing, classification, word segmentation, and indexing, and finally answers the user's information needs with structured data.
Over the whole process, unstructured data extracted from web pages is transformed into structured data; after in-depth processing, the data is returned to the user in either unstructured or structured form.

This thesis introduces a full-text retrieval system named Lucene and analyzes its structure and main working principles. It also examines the open-source crawler Heritrix in depth, analyzing each of the crawler's core components in detail. Building on this study, we designed and implemented a search engine system and demonstrated its functions. The thesis describes the design and implementation of the system and presents a theme-oriented web collection algorithm based on an improved genetic algorithm, an extraction algorithm for heterogeneous web pages, and other algorithms. Experiments show that the system is feasible and practical, and it has reference value for building vertical search engine systems.
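The pipeline the abstract describes (extract structured records from pages, segment the text, index it, then answer queries with structured data) can be illustrated with a minimal sketch. This is not the thesis's implementation: it uses plain regular expressions instead of the heterogeneous-page extraction algorithm and a small in-memory inverted index instead of Lucene. The page snippets, URLs, product names, prices, and field names are all made up for illustration.

```python
import re
from collections import defaultdict

# Hypothetical page snippets; in the thesis's setting a crawler such as
# Heritrix would supply the fetched pages. All values here are invented.
PAGES = {
    "http://example.com/p/1": "<h1>Alpha One</h1><span class='price'>2999</span>"
                              "<p>12 megapixel camera phone</p>",
    "http://example.com/p/2": "<h1>Beta Two</h1><span class='price'>3200</span>"
                              "<p>Android touch screen phone</p>",
}

def extract_record(url, html):
    """Turn one unstructured page into a structured record (assumed fields)."""
    name = re.search(r"<h1>(.*?)</h1>", html)
    price = re.search(r"class='price'>(\d+)<", html)
    desc = re.search(r"<p>(.*?)</p>", html)
    return {
        "url": url,
        "name": name.group(1) if name else "",
        "price": int(price.group(1)) if price else None,
        "description": desc.group(1) if desc else "",
    }

def tokenize(text):
    """Naive word segmentation: lowercase runs of letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(records):
    """Inverted index: token -> set of record positions containing it."""
    index = defaultdict(set)
    for i, rec in enumerate(records):
        for tok in tokenize(rec["name"] + " " + rec["description"]):
            index[tok].add(i)
    return index

def search(index, records, query, max_price=None):
    """Return records matching every query token, optionally filtered by price."""
    toks = tokenize(query)
    if not toks:
        return []
    ids = set.intersection(*(index.get(t, set()) for t in toks))
    hits = [records[i] for i in sorted(ids)]
    if max_price is not None:
        hits = [r for r in hits
                if r["price"] is not None and r["price"] <= max_price]
    return hits

records = [extract_record(url, html) for url, html in PAGES.items()]
index = build_index(records)
```

Because each hit is a record rather than a page, a query such as `search(index, records, "phone", max_price=3000)` can combine keyword matching with a filter on a structured field, which is the kind of query a page-level search engine cannot answer directly.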
Keywords/Search Tags: theme, vertical, search engine, Lucene, Heritrix, genetic algorithm