A Parallel System Of Incremental Web Information Retrieval

Posted on:2006-08-21

Degree:Master

Type:Thesis

Country:China

Candidate:Y Zhou

Full Text:PDF

GTID:2168360155970790

Subject:Computer application technology

Abstract/Summary:


With the rapid growth of information on the Web, Web services have multiplied accordingly. As a foundation and important component of these services, Web crawling is applied in many fields, such as search engines, site structure analysis, and the study of Web graph evolution. However, as users' requirements become more demanding and more varied, traditional scalable Web crawling technology no longer meets them well: it cannot gather data adequately or in a timely manner. We therefore study how to crawl information effectively in selected sections of the Web, an approach also called parallel Web crawling. Building on long-term experience in Web crawling and on current developments in parallel Web crawling, this thesis proposes a structural design model for a parallel incremental Web crawler. To download Web pages in parallel, we use multiple threads and draw on the latest features of the Java language. We adopt a suitable URL dispatching method to ensure that the threads work in parallel, and we extract URLs through page analysis for the threads to download. To reduce redundancy, we choose a footprint algorithm. Finally, we present test results which, as expected, show that the system can effectively improve information-gathering performance.
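The pipeline outlined in the abstract (multithreaded download, URL dispatching, link extraction through page analysis, and redundancy reduction) can be illustrated with a minimal Java sketch. It is not the thesis's implementation, and everything in it is an assumption made for illustration: the class name `CrawlSketch`, the in-memory map standing in for real HTTP downloads, a fixed thread pool standing in for the URL dispatcher, a regular expression standing in for page analysis, and an MD5 digest of the page body standing in for the footprint algorithm, whose actual definition the abstract does not give.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CrawlSketch {
    // Assumption: links are found with a simple href pattern, not a full HTML parser.
    private static final Pattern HREF = Pattern.compile("href=\"([^\"]+)\"");

    // Hypothetical footprint: hex-encoded MD5 digest of the page body.
    static String footprint(String body) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest(body.getBytes(StandardCharsets.UTF_8))) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Page analysis: collect every href target in document order.
    static List<String> extractLinks(String html) {
        List<String> links = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            links.add(m.group(1));
        }
        return links;
    }

    // Crawls an in-memory "web" (URL -> HTML) round by round and returns
    // the URLs whose content was kept after duplicate removal.
    static Set<String> crawl(Map<String, String> web, String seed, int threads) {
        Set<String> seenUrls = ConcurrentHashMap.newKeySet();
        Set<String> seenFootprints = ConcurrentHashMap.newKeySet();
        Set<String> stored = ConcurrentHashMap.newKeySet();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<String> frontier = new ArrayList<>();
            frontier.add(seed);
            seenUrls.add(seed);
            while (!frontier.isEmpty()) {
                Queue<String> next = new ConcurrentLinkedQueue<>();
                List<Callable<Void>> tasks = new ArrayList<>();
                for (String url : frontier) {
                    tasks.add(() -> {
                        String body = web.get(url); // stands in for an HTTP download
                        if (body == null) return null; // unreachable page
                        if (!seenFootprints.add(footprint(body))) return null; // duplicate content
                        stored.add(url);
                        for (String link : extractLinks(body)) {
                            if (seenUrls.add(link)) next.add(link); // each URL is queued once
                        }
                        return null;
                    });
                }
                pool.invokeAll(tasks); // the pool hands URLs to idle threads and waits for the round
                frontier = new ArrayList<>(next);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
        return stored;
    }
}
```

Calling `CrawlSketch.crawl(web, "a", 4)` on a small map of pages visits every reachable URL once and keeps only one copy of pages with identical bodies. When two identical pages are fetched in the same round, which of the two URLs is kept depends on thread timing, so only the number of stored pages is deterministic.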

Keywords/Search Tags:

Web, Information Crawling, Information Gathering, Search Engine, Parallel


Related items

1	Focused Web Crawling Technology
2	Research And Implementation Of The Strategy-Extensible Search Engine
3	Alert Based On Search Engine Technology Research And Implementation Of Information - Gathering System
4	Design Of A Parallel Web Crawling System
5	Design & Practice Of Topic-Specific Search Engine System
6	Vertical Search Engine For Crawling The Web Page Design And Implementation
7	Crawling and searching the hidden Web
8	Research And Implementation Of Vertical Search Engine
9	Research And Implementation Of Vertical Search Engine Based On Distribution
10	Research On Web Crawling Technology In Image Search Engine