Research On Information Retrieval Model Based On Deep Web

Posted on:2009-08-27

Degree:Master

Type:Thesis

Country:China

Candidate:B H Wu

Full Text:PDF

GTID:2178360245954996

Subject:Computer application technology

Abstract/Summary:

With the rapid development and spread of the Internet, people's understanding of and requirements for information access have changed. They need faster and more accurate access to the substantial information on the Web. More and more traditional resources are being moved to the Internet, which has led to rapid growth in the number of online resources. However, traditional resource retrieval methods cannot meet these requirements. Because it is powerful and easy to use, the Web search engine is the most frequently used tool for information organization and retrieval. However, conventional Web search engines cannot find all the information on the Internet, because some resources, known as the Deep Web, are hidden from them. Therefore, it is worthwhile to take the resources hidden in the Deep Web as a starting point and study how to make full use of the information on the Web.
Starting from the current state of information resources on the Internet, the paper presents a systematic, in-depth analysis of the distribution and structure of the Deep Web. To address the low information coverage of conventional search engines, the paper designs and implements a Deep Web crawler that can discover and download more pages from the Internet. The paper also proposes an information retrieval framework based on this crawler.
The work can be summarized as follows:
(1) Analyze the shortcomings of conventional search engines and show that defects in their Web crawlers lead to low information coverage.
(2) Study the characteristics of the resources hidden in the Deep Web.
(3) Propose an information retrieval framework based on the Deep Web and define its purposes and features.
(4) Design and build a Web crawler for the Deep Web according to the page collection mechanism of the Deep Web.
(5) Improve the crawler so that it collects more pages with fewer resources.
(6) Propose an improved algorithm for Chinese word segmentation and build a prototype system on top of an existing full-text indexing library.
Experimental results show that the system is effective.
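
As an illustration of how a Deep Web crawler differs from a conventional link-following crawler, the sketch below finds the search forms on a page and turns candidate keywords into query URLs that a crawler could then fetch. It is not the crawler described in the thesis, whose design is not given in this abstract. The names `FormFinder` and `build_query_urls`, the keyword list, and the rule that the first field without a default value is the free-text box are all assumptions made for this example. Only the Python standard library is used.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode, urljoin


class FormFinder(HTMLParser):
    """Collect <form> elements and their named fields from one HTML page."""

    def __init__(self):
        super().__init__()
        self.forms = []       # forms that have been fully parsed
        self._current = None  # form currently being parsed, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._current = {
                "action": attrs.get("action") or "",
                "method": (attrs.get("method") or "get").lower(),
                "fields": {},
            }
        elif tag in ("input", "select", "textarea") and self._current is not None:
            name = attrs.get("name")
            if name:  # unnamed controls (e.g. most submit buttons) are not sent
                self._current["fields"][name] = attrs.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "form" and self._current is not None:
            self.forms.append(self._current)
            self._current = None


def build_query_urls(page_url, html, keywords):
    """Return one query URL per (GET form, keyword) pair found in `html`."""
    parser = FormFinder()
    parser.feed(html)
    parser.close()

    urls = []
    for form in parser.forms:
        if form["method"] != "get":
            continue  # POST forms need a request body; out of scope here
        # Assumption: the first field with no default value is the search box.
        text_fields = [n for n, v in form["fields"].items() if v == ""]
        if not text_fields:
            continue
        # Assumption: the form action carries no query string of its own.
        target = urljoin(page_url, form["action"])
        for keyword in keywords:
            params = dict(form["fields"])  # keep hidden/default values
            params[text_fields[0]] = keyword
            urls.append(target + "?" + urlencode(params))
    return urls
```

A crawler built this way would add the returned URLs to its download queue alongside ordinary hyperlinks. Choosing which keywords to submit, and deciding when a form has been covered well enough, is the hard part in practice and is not addressed by this sketch.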

Keywords/Search Tags:

information retrieval, Deep Web, retrieval model, search engine

Related items

1	Research On Algorithm Of Deep Convolution Network And Feature Fusion For Cross Modal Commodity Retrieval
2	Research And Implementation Of Full-text Retrieval Combining Word Matching And Context Interaction
3	Current Status Research And Improved Design Of Meta Search Engine
4	Design And Implementation Of Based On Vector Space Model Of Local Search Engine
5	Research And Implementation Of Intelligent Information Retrieval Technology In American Health Care System
6	Research On Retrieval Technology In Search Engine
7	Research On Information Retrieval Technology
8	Personalized Web Information Retrieval System, Design And Implementation
9	A Study On Internet Information Retrieval And Developing Trend
10	The Research On Web Based Information Organization And Retrieval System Model