
Search Engine Design Analysis And Query Improvements

Posted on: 2008-05-28
Degree: Master
Type: Thesis
Country: China
Candidate: X H Wang
Full Text: PDF
GTID: 2208360212999789
Subject: Computer software and theory
Abstract/Summary:
More and more people now turn to the Internet for information as it grows dramatically, and search engine technology has emerged against this background. Because search engines cover billions of pieces of information, respond quickly to users' requests, and rank websites sensibly, they have quickly become popular with all kinds of people. As requirements have become more specialized, many kinds of search engines have been designed: image and news search engines serve professionals in the image and news businesses, while Windows-based and Linux-based search engines serve professionals who work across the two platforms.

The main task of this paper is to improve a Linux-based search engine. From the beginning, this search engine has aimed to build its index quickly and efficiently and to answer searches quickly. Its most distinctive characteristic is that the web crawler is separated from the indexer, which supports diverse web collections and, most importantly, greatly reduces the time needed to build the index. The search engine also provides a flexible text-based user interface on Linux, so the user can control both the amount of output and the display style. This paper adds a new phrase search function on top of the existing functions.

The paper first briefly introduces the related search engine technologies, including search engine architecture, web crawlers, document preprocessing, indexing, searching, and ranking. It then discusses the index subsystem and the search subsystem of the search engine, based on a detailed analysis of the source code. Finally, it presents the improvement: the phrase format is discussed first, followed by phrase parsing and phrase search. Many modifications to the original code are required, and the paper tries to avoid affecting the original functions of the search engine. The paper closes with a test of the added function and a conclusion.
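
The abstract does not describe how the thesis implements phrase search, so the following Python sketch is only an illustration of the general idea, not the author's code. It assumes a phrase is written in double quotes and that the index records word positions; the names build_positional_index, parse_query, and phrase_search are hypothetical.

```python
import re
from collections import defaultdict


def build_positional_index(docs):
    """Map each term to {doc_id: [positions]} (assumed index layout)."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, term in enumerate(re.findall(r"\w+", text.lower())):
            index[term].setdefault(doc_id, []).append(pos)
    return index


def parse_query(query):
    """Split a query into double-quoted phrases and loose terms.

    The double-quote phrase format is an assumption for this sketch.
    """
    quoted = re.findall(r'"([^"]*)"', query)
    phrases = [re.findall(r"\w+", p.lower()) for p in quoted]
    rest = re.sub(r'"[^"]*"', " ", query)
    terms = re.findall(r"\w+", rest.lower())
    return [p for p in phrases if p], terms


def phrase_search(index, phrase):
    """Return sorted doc ids in which the phrase terms occur consecutively."""
    if not phrase or any(term not in index for term in phrase):
        return []
    # Only documents containing every term can contain the phrase.
    candidates = set(index[phrase[0]])
    for term in phrase[1:]:
        candidates &= set(index[term])
    hits = []
    for doc_id in candidates:
        later = [set(index[term][doc_id]) for term in phrase[1:]]
        # The k-th following term must sit exactly k + 1 positions after the start.
        for start in index[phrase[0]][doc_id]:
            if all(start + k + 1 in positions for k, positions in enumerate(later)):
                hits.append(doc_id)
                break
    return sorted(hits)
```

A caller would first run parse_query on the user's input, then pass each parsed phrase to phrase_search and combine the results with the ordinary term search. Checking positions after intersecting the candidate documents keeps the existing term lookup untouched, which matches the abstract's stated goal of not affecting the original functions.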
Keywords/Search Tags: search engine, index, search, phrase search