Research on a Search Engine Based on Automatic Web Page Classification

Posted on: 2005-07-13
Degree: Master
Type: Thesis
Country: China
Candidate: S H Yu
Full Text: PDF
GTID: 2178360185964134
Subject: Computer software and theory
Abstract/Summary:
This thesis discusses the principles and structure of search engines. Because existing search engines have low precision and cannot support queries at varying granularity, a design based on automatic web page classification is presented. The thesis also covers automatic web page classification, the web page ranking system, and how to accommodate varying user query granularity.

A web page is a kind of hypertext document that contains both text information and structural information such as hypertext tags. This thesis designs a classifier that makes combined use of the text and structural information. The classifier is a module of the search engine and achieves better accuracy and operating efficiency.

The ranking algorithm is an important aspect of search engine research, since it places the most valuable information at the front of the query results. An improved PageRank algorithm is presented to adapt to varying user query granularity. First, a framework is established that classifies all web pages. Next, for each web page, a PageRank score biased toward the category to which the page belongs is computed offline. At query time, the user decides which category to query and at what granularity, and the ranking system sorts the web pages in that category and returns the query results.

Finally, the thesis suggests that duplicate content in the query results must be eliminated: the search engine should find and delete completely identical web pages by comparing their content.
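
The category-biased ranking step described above can be illustrated with a minimal sketch. It is not the thesis's actual algorithm, whose details are not given in the abstract. It assumes a flat set of categories (no hierarchy for granularity), a standard PageRank iteration whose random jump lands only on pages of the chosen category, and a damping factor of 0.85. The names `biased_pagerank`, `precompute_scores`, and `rank_results` are hypothetical.

```python
from collections import defaultdict


def biased_pagerank(links, category_pages, damping=0.85, iterations=50):
    """PageRank whose random jump lands only on pages in `category_pages`.

    links: dict mapping each page to the list of pages it links to.
    category_pages: non-empty set of pages assigned to one category.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    members = [p for p in pages if p in category_pages]
    if not members:
        raise ValueError("category has no pages in the link graph")

    # Jump vector: uniform over the category's pages, zero elsewhere.
    jump = {p: (1.0 / len(members) if p in category_pages else 0.0)
            for p in pages}
    rank = dict(jump)

    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) * jump[p] for p in pages}
        dangling = 0.0
        for p in pages:
            outs = links.get(p, [])
            if outs:
                share = damping * rank[p] / len(outs)
                for q in outs:
                    new_rank[q] += share
            else:
                # Pages without outlinks hand their mass to the jump vector.
                dangling += damping * rank[p]
        for p in pages:
            new_rank[p] += dangling * jump[p]
        rank = new_rank
    return rank


def precompute_scores(links, page_category):
    """Offline step: one biased score table per category."""
    by_category = defaultdict(set)
    for page, category in page_category.items():
        by_category[category].add(page)
    return {category: biased_pagerank(links, members)
            for category, members in by_category.items()}


def rank_results(matching_pages, category, page_category, scores):
    """Query-time step: keep matches in the chosen category, best first."""
    in_category = [p for p in matching_pages
                   if page_category.get(p) == category]
    return sorted(in_category, key=lambda p: scores[category][p], reverse=True)


# Toy link graph and category assignment, invented for illustration only.
links = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
page_category = {"a": "sports", "b": "sports", "c": "news", "d": "news"}
scores = precompute_scores(links, page_category)
result = rank_results(["a", "b", "c"], "sports", page_category, scores)
```

Computing the score tables offline keeps the query-time work to a filter and a sort, which matches the offline/online split described in the abstract. Supporting several granularities would require score tables for each level of a category hierarchy, which this sketch does not model.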
Keywords/Search Tags: Search engine, Automatic web page classification, Ranking algorithm, PageRank