
The Research And Realization Of Index Technology In Chemical Search Engine

Posted on: 2009-04-01
Degree: Master
Type: Thesis
Country: China
Candidate: M Tan
Full Text: PDF
GTID: 2178360245474713
Subject: Computer application technology
Abstract/Summary:
With the rapid growth of the Internet and the World Wide Web, the volume of online resources keeps increasing, and many kinds of Internet search engines have developed quickly as a result. General-purpose search engines such as Google and Baidu perform very well and can retrieve information in almost every field, but they are poorly suited to specialized professional domains. Searching effectively in a specific field therefore calls for a professional search engine built for that field.

A chemical search engine is a professional search engine for information in the field of chemistry. In this paper, building on a thorough study of search engine indexing technology, we analyze the source code of Lucene, the Apache full-text search system, and examine its system architecture, basic data types, in-memory index structure, and index file structure. We also study the indexing process and indexing methods, along with the methods for controlling index weights and optimizing the index.

Based on this research, we design a multi-indexer scheme to reduce the time needed to build an index, improve the efficiency of chemical term searches by optimizing how terms in the index lexicon file are sorted, and attach different types of parameter values to indexed documents to improve the accuracy of chemical term searches. The resulting index database is better suited to searching chemical information. The indexer, which builds an inverted index database for a chemical document database, fits not only a chemical search engine system but also a chemical document search system based on full-text search technology. The work can also serve as a reference for building index databases in other professional search engines.
Keywords/Search Tags:Lucene, index database, multi-indexer, sorting algorithm, boost values
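
The three ideas named in the abstract and keywords (several indexers working in parallel, a sorted term lexicon, and per-document boost values) can be illustrated with a small, self-contained sketch. This is not Lucene's API and not the thesis's implementation: the function names, the round-robin sharding, and the scoring rule (term frequency multiplied by a document boost) are all assumptions chosen for illustration.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def build_partial_index(docs):
    """Index one shard of (doc_id, text, boost) tuples.

    Returns term -> {doc_id: score}, where score is the term frequency
    multiplied by the document's boost (an assumed scoring rule).
    """
    index = defaultdict(dict)
    for doc_id, text, boost in docs:
        for term in text.lower().split():
            index[term][doc_id] = index[term].get(doc_id, 0.0) + boost
    return index


def merge_indexes(partials):
    """Merge shard indexes; return (sorted lexicon, postings)."""
    merged = defaultdict(dict)
    for partial in partials:
        for term, postings in partial.items():
            # Doc ids are unique across shards, so update() never clobbers.
            merged[term].update(postings)
    lexicon = sorted(merged)  # sorted term list, as in a lexicon file
    return lexicon, dict(merged)


def build_index(docs, workers=2):
    """Hypothetical multi-indexer: one worker per round-robin shard."""
    shards = [docs[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(build_partial_index, shards))
    return merge_indexes(partials)


def search(postings, term):
    """Return doc ids for a term, highest boosted score first."""
    hits = postings.get(term.lower(), {})
    return sorted(hits, key=lambda d: (-hits[d], d))


docs = [
    (1, "benzene ring structure", 1.0),
    (2, "benzene benzene toluene", 2.0),  # boosted document
    (3, "toluene solvent", 1.0),
]
lexicon, postings = build_index(docs)
```

In this sketch, `search(postings, "benzene")` ranks document 2 ahead of document 1 because its boost and term frequency are both higher. A sorted lexicon also allows binary search over terms, which is one plausible reason sorting matters for lookup efficiency.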