
Web-based Network Search Technology Research

Posted on: 2008-09-12
Degree: Master
Type: Thesis
Country: China
Candidate: C J Guo
Full Text: PDF
GTID: 2208360212479091
Subject: Computer Science and Technology
Abstract/Summary:
With the exponential growth of web pages, users often struggle to find exactly the information they need. Search engines provide useful methods for indexing and searching web resources and have become one of the most important tools for obtaining information. Intelligent web search technology, which is aimed at different users, combines various characteristics of search engines. Furthermore, it summarizes users' browsing behavior and provides a personalized web search service, which has bright prospects.

The paper first introduces the history and present state of search engines, and then lists some technical details and further developments in this area. The paper is organized around the three stages of a search engine: web collection, web analysis, and web indexing. For each stage, the paper first introduces the basic techniques involved, and then describes in detail the techniques that focus on personalized search.

For the web collection stage, the paper first describes the web collection methods used by search engines and discusses the spiders used in full-text search engines. It then concentrates on focused crawlers, including the architecture of FIA (Focused Intelligent Agent), prediction for unfetched URLs, analysis of downloaded web pages, discovery of topic hub pages, and improving the quality of collected web pages.

The web page analysis stage summarizes a series of techniques for identifying web content and abstracting semantic web information, which is a typical application of data mining on the web. This section covers using regular expressions to identify content in web page resources, web preprocessing, abstracting and quantifying web concepts, building inverted files, and dealing with near-replicas of documents on the web.

For the web indexing stage, the paper first introduces the definition of information indexing and classical methods of ranking web pages, such as PageRank and HITS. It then describes the kernel objects and the model-building process of our intelligent RSS online reader. Finally, it discusses indexing and ranking in the system.
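
As an illustration of the classical ranking methods named above, the following is a minimal sketch of PageRank computed by power iteration. It is not taken from the thesis: the function name `pagerank`, the damping factor of 0.85, the fixed iteration count, and the handling of pages without outgoing links are all assumptions made for this example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank by power iteration (illustrative sketch).

    links: dict mapping a page name to the list of pages it links to.
    Returns a dict mapping each page to its rank; ranks sum to 1.
    """
    # Collect every page that appears as a source or as a link target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    # Start from a uniform distribution over all pages.
    rank = {p: 1.0 / n for p in pages}

    for _ in range(iterations):
        # Every page receives the "random jump" share first.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's damped rank evenly among its out-links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumed treatment of a page with no out-links:
                # spread its damped rank evenly over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page link graph, used only for demonstration.
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(graph)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(score, 4))
```

In this toy graph, page `c` is linked from three other pages and page `d` from none, so `c` ends up with the highest rank and `d` with the lowest. A production ranking component would typically test for convergence rather than run a fixed number of iterations.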
Keywords/Search Tags: Search Engine, User Personalization, Web Mining, Information Retrieval