Research On Data Store Of Search Engine

Posted on:2006-01-31

Degree:Master

Type:Thesis

Country:China

Candidate:H He

Full Text:PDF

GTID:2168360152971158

Subject:Computer software and theory

Abstract/Summary:

With the emergence and spread of the Internet, people can access far more information than before. The way people obtain information has changed, and the Internet has become a main source of information. With the exponential growth of information on the Web, how to retrieve information of interest rapidly has become an attractive research area. Search engines were introduced to solve this problem.

A search engine combines traditional information retrieval (IR) with the Web. Traditional information retrieval obtains information from a document repository; its core technique is indexing and searching text. Traditional directory search and full-text search are used in the search process. This meets users' needs when the volume of information is not large. When facing distributed, volatile, and large-volume data, however, traditional information retrieval cannot find exact information rapidly.

A search engine is an extension of traditional IR techniques. Its key techniques include data collection, Chinese word segmentation, inverted indexing, retrieval of hidden data, distributed architectures, storage of massive data, and analysis of human behavior. A search engine consists of information collection, indexing, and query. First, the search engine collects web pages from the Internet using a crawler. Then the indexer analyzes the web page data and creates indexes. The searcher accepts user query requests and finds relevant results through the indexes. Finally, the results are sorted and sent to the user.

The data processed in a search engine mainly include web page data, index data, and URL data. These differ in characteristics such as capacity and update period. How to manage these data efficiently is one of the key techniques of a search engine and the main subject of this thesis.

This thesis first introduces the basic concepts and the current state of research on Web search engines, and describes the architecture and key techniques of a search engine. It then analyzes the types and characteristics of the data stored in a search engine, proposes several data store designs, and discusses the data store implementations of other search engines. Finally, it describes in detail the implementation of a data store system named WDB, which is used to support the data store of the crawler.
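
The index and query stages described in the abstract can be illustrated with a minimal inverted-index sketch. This is not the thesis's WDB system or its indexer; the function names (`tokenize`, `build_index`, `search`), the sample URLs, and the sample page text are all hypothetical. It assumes the pages have already been collected by a crawler, uses a toy tokenizer that would not handle Chinese word segmentation, and sorts results alphabetically as a stand-in for relevance ranking.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split on non-alphanumeric characters.

    A toy tokenizer; a real engine needs proper word segmentation.
    """
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]


def build_index(pages):
    """Build an inverted index: each term maps to the set of URLs containing it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for term in tokenize(text):
            index[term].add(url)
    return index


def search(index, query):
    """Return the URLs that contain every query term, in alphabetical order."""
    terms = tokenize(query)
    if not terms:
        return []
    # Intersect the posting sets of all query terms (unknown terms match nothing).
    matches = set.intersection(*(index.get(t, set()) for t in terms))
    return sorted(matches)


# Hypothetical page data standing in for crawler output.
pages = {
    "http://example.com/a": "Search engines build an inverted index",
    "http://example.com/b": "A crawler collects web pages",
    "http://example.com/c": "The index maps terms to web pages",
}
index = build_index(pages)
print(search(index, "web pages"))
```

In this sketch the page text, the index, and the URL keys correspond loosely to the three kinds of data named in the abstract (web page data, index data, and URL data); a real system would store each separately because they differ in size and update frequency.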

Keywords/Search Tags:

Internet, Web, crawler, data store, search engine, information retrieval, inverted index

Related items

1	The Design And Implementation Of Vertical Search Engine For Position Query
2	Architecture And Optimization Of Parallel Crawler
3	Application And Research On Vertical Search Technology In Exploration Portal
4	Based On Research And Optimization Lucene Inverted Index Performance
5	Design And Implementation Of News Search Engine Based On MySQL
6	Research On Techniques Of Real Estate Information Vertical Search Engine
7	Research On Several Key Issues Of Crawler In Search Engine
8	A Study On Internet Information Retrieval And Developing Trend
9	Research And Implementation Of Data Analysis On Vertical Search Engine
10	Uyghur Khazak Kirgiz Multilingual Search Engine Inverted Index Module And Its Implementation