An efficient scheme to remove crawler traffic from the Internet

Posted on: 2003-09-16
Degree: M.Sc
Type: Thesis
University: University of Alberta (Canada)
Candidate: Yuan, Xiaoqin
Full Text: PDF
GTID: 2468390011486544
Subject: Computer Science
Abstract/Summary:
One of the first things that any Internet 'neophyte' learns is how to search for information using search engines. Search engines tackle the daunting task of categorizing the myriad documents on the Web by using web crawlers ('search agents'). This method, however, has been shown to place a significant load on Web servers and to tax the underlying network infrastructure. We address this problem by introducing an efficient indexing system based on active networks. Our approach employs strategically placed active routers that constantly monitor passing Internet traffic, analyze it, and then transmit the index data to a dedicated back-end repository. Our proposal thus obviates the need for web crawlers and effectively eliminates their adverse effect on Web servers and network resources. Our simulations show that the active indexing system is up to 30% more efficient than current crawler-based techniques. They also show that, given limited network bandwidth, our system achieves better throughput for traffic generated by human clients, and that those clients receive responses more quickly because more bandwidth is available for their requests.
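To make the data flow in the abstract concrete, the following is a minimal sketch of passive indexing at a router, not the thesis's implementation. The class names `ActiveRouter` and `Repository`, the `forward` and `receive` methods, and the simple word-splitting tokenizer are all assumptions introduced for illustration; the abstract does not specify how traffic is analyzed or how index data is encoded.

```python
import re
from collections import defaultdict


class Repository:
    """Hypothetical back-end store: an inverted index from term to URLs."""

    def __init__(self):
        self.index = defaultdict(set)

    def receive(self, url, terms):
        # Record that each term occurs in the page at this URL.
        for term in terms:
            self.index[term].add(url)

    def search(self, term):
        # Return matching URLs in sorted order; unknown terms give an empty list.
        return sorted(self.index.get(term.lower(), set()))


class ActiveRouter:
    """Hypothetical router that indexes pages it is forwarding anyway."""

    def __init__(self, repository):
        self.repository = repository
        self.seen = set()  # URLs already indexed, so index data is sent once

    def forward(self, url, body):
        if url not in self.seen:
            self.seen.add(url)
            # Assumed analysis step: lowercase and split into alphanumeric words.
            terms = set(re.findall(r"[a-z0-9]+", body.lower()))
            self.repository.receive(url, terms)
        return body  # the response passes through to the client unchanged


repo = Repository()
router = ActiveRouter(repo)
router.forward("http://example.org/a", "Active networks index the Web")
router.forward("http://example.org/b", "Crawlers load Web servers")
print(repo.search("web"))
```

In this sketch no extra request is ever sent to the Web server: the index is built only from responses that human clients already triggered, which is the property the abstract credits for the bandwidth savings.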
Keywords/Search Tags: Efficient, Search