A Study Of Page Sorting Algorithm Based On User’s Habit

Posted on: 2014-08-09
Degree: Master
Type: Thesis
Country: China
Candidate: N Yang
Full Text: PDF
GTID: 2268330425483274
Subject: Computer application technology
Abstract/Summary:
In today’s information age, the internet has become the most important way for people to obtain information, and the search engine, now in wide use, is the most important tool for information retrieval. According to surveys of user habits, users usually visit only the few websites that appear at the top of the search results. For this reason, it matters whether a search engine ranks its results well and whether those results meet the user’s needs.
The famous PageRank algorithm helped Google become the world’s largest commercial search engine company. It contributes greatly to ranking accuracy through its link model and iterative calculation method. In the research field, the HITS algorithm is another well-regarded page sorting algorithm alongside PageRank, and page sorting algorithms based on term position and weighting are also commonly used. Although these well-known page sorting algorithms have contributed greatly to the development of search engines, they also have some inadequacies.
This thesis proposes an original algorithm, the BUHP algorithm (a topic-sensitive PageRank algorithm based on user habit). It is an improvement on PageRank that effectively addresses PageRank’s topic drift problem. At the same time, it can return different page rankings for different users according to each user’s habits, which improves user satisfaction and the quality of the search engine.
The BUHP algorithm provides a complete solution for extracting user habit information and converting it into numerical data, and it also gives a formula for the BUHP algorithm. This thesis builds an experimental search engine platform using the open source projects Lucene and Nutch, crawls sample web pages on this platform, and ranks them with the BUHP algorithm. By comparing the BUHP results on the sample pages with the PageRank results, and by analyzing both the theory of the BUHP algorithm and the rankings obtained in the experiment, this thesis concludes that the BUHP algorithm improves on PageRank and gives users greater satisfaction.
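The abstract does not state the BUHP formula, so the following is only a minimal sketch of the general idea it builds on: standard topic-sensitive (personalized) PageRank, in which the random jump is biased toward pages a user favors instead of being uniform. The function name `personalized_pagerank`, the `preference` weights standing in for numeric user habit data, the damping factor of 0.85, and the three-page toy graph are all assumptions made for illustration and are not taken from the thesis.

```python
def personalized_pagerank(links, preference, damping=0.85, iterations=100):
    """Power iteration for PageRank with a non-uniform teleport vector.

    links: dict mapping each page to the list of pages it links to.
    preference: dict of non-negative weights per page (a hypothetical
        stand-in for numeric user habit data); normalized internally.
    """
    # Collect every page that appears as a source or as a link target.
    pages = sorted(set(links) | {q for targets in links.values() for q in targets})
    total = sum(preference.get(p, 0.0) for p in pages)
    if total <= 0:
        raise ValueError("preference must give positive weight to some page")
    teleport = {p: preference.get(p, 0.0) / total for p in pages}

    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iterations):
        # Rank held by pages without outlinks is redistributed via teleport.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {
            p: (1.0 - damping + damping * dangling) * teleport[p] for p in pages
        }
        for p in pages:
            targets = links.get(p, [])
            if targets:
                share = damping * rank[p] / len(targets)
                for q in targets:
                    new_rank[q] += share
        rank = new_rank
    return rank


# Toy link graph (illustrative only).
links = {
    "news": ["sports", "tech"],
    "sports": ["news"],
    "tech": ["news", "sports"],
}

# Uniform preference reproduces ordinary PageRank.
uniform = personalized_pagerank(links, {p: 1.0 for p in links})

# A user whose habits favor the "tech" page gets a ranking biased toward it.
tech_user = personalized_pagerank(links, {"tech": 1.0})
```

With a uniform `preference` the function reduces to ordinary PageRank, while concentrating the weight on one page raises that page's score for that user. This is how a single link graph can yield different rankings for different users.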
Keywords/Search Tags: search engine, page sorting, PageRank, topic sensitive, user habit