Research On Individuated Vertical Search Engine

Posted on: 2008-02-05
Degree: Master
Type: Thesis
Country: China
Candidate: W Z Li
Full Text: PDF
GTID: 2178360215472442
Subject: Applied Mathematics
Abstract/Summary:
At present, the major search engine providers on the Internet are Yahoo, Baidu, Google and others, which help users find large volumes of information across all subject areas (horizontal search). As the Internet keeps growing and evolving, finding the data one needs has become, for an ordinary user, like looking for a needle in a haystack. Sheer volume of information is no longer the main driver of further development; accuracy and timeliness are the real motive forces. The key problem for the Internet's development is not delivering information to users quickly and in bulk, but letting users obtain the information they expect, at the expected time and place, in the expected form and at the expected cost. General-purpose search engines can satisfy broad, horizontal searches over large amounts of information, but they find it very difficult to also deliver accurate and relevant results. The value of a general-purpose search engine lies in navigating a large amount of information; it lacks focus for industry users, whose information needs are relatively concentrated and require finer-grained classification. Solving this problem is an opportunity for the development of search engines, and it is becoming a focus of competing research institutions. The vertical search engine, a new search model, emerged against this background.

This dissertation builds a prototype vertical search engine through theoretical analysis and concrete design. The work is presented in five parts.

Chapter one, the introduction, reviews the development history of search engines in detail and identifies the problems that general-purpose (comprehensive) search engines currently face and the route to solving them, which is the direction this dissertation studies: the vertical search engine.
By comparing vertical search engines with general-purpose (comprehensive) search engines in terms of information services and key technologies, chapter one shows that the vertical search engine has great advantages and room for development. Finally, it surveys the state of vertical search engine development in China and abroad and sets out the problems this dissertation aims to solve.

Chapter two covers the overall framework analysis and design. It gives the overall design and workflow of the vertical search engine and analyzes its characteristics. It also presents the information collection models commonly used in gathering strategies, and analyzes the core idea and the shortcomings of the common collection algorithm: similarity matching based on the vector space model. Finally, by introducing ontology, it proposes a way to implement an intelligent information-gathering strategy based on an ontology repository, which addresses the problems of polysemy (one word with several meanings) and synonymy (one meaning expressed by several words) during information collection.

Chapter three studies Lucene, analyzing this classic open-source full-text retrieval framework in detail. It introduces full-text retrieval techniques, the origin of the project, the architecture of the framework, and the inverted index and scoring mechanism that underlie the indexing and search functions Lucene provides, and it shows the core code for building an index and performing a search. Finally, it introduces Chinese word segmentation technology and the principle of its implementation in Lucene.

Chapter four describes how to realize the individualized vertical search engine using the open-source crawler Heritrix and the Lucene framework, and builds a prototype vertical search engine for mobile phone product information.
The prototype is implemented in three parts. Part one implements information gathering on the Heritrix framework and designs the procedure for collecting structured information. Part two designs a word segmentation tool for mobile phone product information and uses the Lucene framework to index the structured text. Part three designs a query interface based on the MVC framework and implements the search function of the prototype. The prototype thus offers a useful technical reference and guidance for vertical search engines.

Chapter five, the summary and outlook, briefly summarizes the work of this dissertation and puts forward the development trend of the vertical search engine and several directions for further study.

There is a well-known saying in the search field: "Users cannot describe what they are looking for until they see it." A technologist at Microsoft Research has said: "Almost 75% of content cannot be found through general-purpose search engines."

As one branch of search engine technology, the vertical search engine is the inevitable result of a shift in what Internet users want from search: from simply hoping for comprehensive coverage to wanting comprehensive coverage together with better accuracy and timeliness. It will provide services that are not only plentiful but also more specialized and individualized, and it is smarter than traditional search. The vertical search engine market therefore has the conditions it needs to exist and broad prospects for development. As a new technology at an early stage, however, it still needs improvement and breakthroughs in many places; this dissertation's study of vertical search engine technology offers practical guidance for the development of vertical search.
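
The abstract names two techniques without showing them: the inverted index that underlies Lucene's indexing and search, and similarity matching based on the vector space model. The Python sketch below illustrates both ideas in miniature. It is not the dissertation's code and does not use Lucene or Heritrix; the sample documents, the whitespace tokenizer (a stand-in for a real word segmenter), and all function names are assumptions made for illustration, and raw term counts are used as vector weights rather than Lucene's actual scoring formula.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase whitespace tokenizer; a placeholder for a real word segmenter."""
    return text.lower().split()


def build_inverted_index(docs):
    """Map each term to a dict of {doc_id: term frequency in that document}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, freq in Counter(tokenize(text)).items():
            index[term][doc_id] = freq
    return index


def cosine(vec_a, vec_b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(weight * vec_b.get(term, 0) for term, weight in vec_a.items())
    norm_a = math.sqrt(sum(w * w for w in vec_a.values()))
    norm_b = math.sqrt(sum(w * w for w in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(index, docs, query):
    """Use the inverted index to find candidate documents, then rank them
    by cosine similarity to the query (highest first, ties by doc id)."""
    query_vec = Counter(tokenize(query))
    candidates = set()
    for term in query_vec:
        candidates.update(index.get(term, {}))
    scored = [
        (doc_id, cosine(query_vec, Counter(tokenize(docs[doc_id]))))
        for doc_id in candidates
    ]
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical product descriptions, invented for this example.
DOCS = {
    "d1": "nokia phone with camera",
    "d2": "samsung phone",
    "d3": "laptop with camera",
}
INDEX = build_inverted_index(DOCS)

if __name__ == "__main__":
    for doc_id, score in search(INDEX, DOCS, "camera phone"):
        print(doc_id, round(score, 3))
```

The index lets the search touch only documents that share at least one term with the query, while the cosine score orders those candidates. Because the match is purely on surface terms, a query using a synonym of an indexed word finds nothing, which is the kind of shortcoming the abstract says the ontology-based gathering strategy is meant to address.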
Keywords/Search Tags: vertical search engine, ontology, Lucene, indexing, information extraction, MVC