A Random Walk Based Win/Loss Graph Aggregation Algorithm For News Metasearch Engine

Posted on: 2014-03-28
Degree: Master
Type: Thesis
Country: China
Candidate: F Y Zhao
GTID: 2268330401482923
Subject: Computer technology
Abstract/Summary:
The explosive growth of information on the web makes it increasingly hard for people to retrieve what they need with a single search engine. For a metasearch engine, ranking is the key technology: the quality of the ranking algorithm directly determines how well the engine performs. In this thesis we propose a novel result aggregation and ranking algorithm for metasearch engines, called the random walk based Win/Loss graph aggregation algorithm, and apply it to a news metasearch engine. We develop this method from the Win/Loss graph concept proposed by Lin Li et al. [1] for single search engines. It can sort the query results very well. Because the competition mechanism of the Win/Loss graph method proposed by Lin Li et al. is simple and has some defects, we also improve the competition mechanism among the nodes of the Win/Loss graph.

For the competition among nodes in the graph, we propose the concept of a node's energy value and a novel competition model. In this model each node is treated as a contestant and is assigned an energy value, so that the process of competition becomes a process of energy transfer. A node with a higher energy value has more influence, which is to say that the web page represented by the node is of higher quality. Our method has two steps. First, we aggregate the results returned by the member search engines and construct the Win/Loss graph, computing the energy value of each node from the nodes' Win/Loss relationships. Second, we apply a random walk on the Win/Loss graph and iteratively compute the final ranking values of the web pages until all values are stable; in this process the energy values of the nodes are used to distribute the weights of the edges in the Win/Loss graph. When applying the method to a news metasearch engine, we take the timeliness of news into account: we divide the web pages into three sets according to their publication date (intraday, two to seven days ago, and more than seven days ago), apply our method to each set separately, and then rank the results set by set.

In the experiments, we select ten query topics and send them to the member search engines to obtain the query results. During construction of the Win/Loss graph, we calculate the node energy values under different competition orders and compare and analyze the final ranking results, concluding that the competition order does not affect the final ranking quality. Finally, we introduce several ranking evaluation measures and select DCG [2,3] as the most effective one for the experimental evaluation. We compare the proposed algorithm with four other metasearch engine ranking algorithms and conclude that our method is more accurate in ranking than the other four, but is more time-consuming.
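
The second step of the method, a random walk whose edge weights are distributed according to node energy, might be sketched as follows. This is only an illustrative sketch and not the thesis's actual algorithm: the abstract does not give the update rule, so the edge direction (from the losing page to the winning page), the damping factor, the uniform handling of pages that never lose, and the names `random_walk_rank`, `edges`, and `energy` are all assumptions. The energy values are taken as given inputs here, since the abstract does not specify how the competition model computes them.

```python
def random_walk_rank(edges, energy, damping=0.85, tol=1e-10, max_iter=1000):
    """PageRank-style random walk on a win/loss graph (illustrative sketch).

    edges  : list of (loser, winner) pairs (assumed edge direction)
    energy : dict mapping every node to a positive energy value (assumed input)
    """
    nodes = sorted(energy)
    n = len(nodes)

    # Each loser points to the pages that beat it.
    out = {u: [] for u in nodes}
    for loser, winner in edges:
        out[loser].append(winner)

    rank = {u: 1.0 / n for u in nodes}
    for _ in range(max_iter):
        # Teleport term, as in PageRank (an assumption of this sketch).
        new = {u: (1.0 - damping) / n for u in nodes}
        for u in nodes:
            winners = out[u]
            if winners:
                # Split u's score among its winners in proportion to energy.
                total = sum(energy[v] for v in winners)
                for v in winners:
                    new[v] += damping * rank[u] * energy[v] / total
            else:
                # A page that never lost has no outgoing edge:
                # spread its score uniformly over all pages.
                share = damping * rank[u] / n
                for v in nodes:
                    new[v] += share
        delta = sum(abs(new[u] - rank[u]) for u in nodes)
        rank = new
        if delta < tol:  # values are stable
            break
    return rank


# Hypothetical example: page "a" beats "b" and "c"; "b" beats "c".
scores = random_walk_rank(
    edges=[("b", "a"), ("c", "a"), ("c", "b")],
    energy={"a": 3.0, "b": 2.0, "c": 1.0},
)
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this toy example the page that wins every comparison ends up first in `ranking` and the page that loses every comparison ends up last. For the news setting described above, the same function could be applied to each of the three date sets separately.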
Keywords/Search Tags: Metasearch Engine, Results Aggregating Rank, Competition Model, Win/Loss Graph, Random Walk