
Incorporating quality metrics in agent-based centralized/distributed information retrieval on the World Wide Web

Posted on: 2000-04-17
Degree: Ph.D.
Type: Dissertation
University: University of Kansas
Candidate: Zhu, Xiaolan
Full Text: PDF
GTID: 1468390014460643
Subject: Computer Science
Abstract/Summary:
With the development of the World Wide Web, almost anyone can publish documents that reach a global audience. As a result, the quality of available information varies widely. However, most information retrieval systems developed to help people search the World Wide Web rely primarily on relevance ranking algorithms based solely on term frequency statistics. Information quality is usually ignored, so low-quality documents are retrieved. In this research, I present an alternative approach that combines traditional relevance ranking based on term statistics with quality ranking to retrieve information in centralized and distributed environments. The results of a series of experiments are presented to examine how information quality affects system effectiveness.
Six quality metrics were investigated: currency, availability, information-to-noise ratio, authority, popularity, and cohesiveness. Generally, all of these metrics, with the exception of site cohesiveness, were found to improve both centralized and distributed search. In centralized search, incorporating the currency, availability, information-to-noise ratio, and page cohesiveness metrics significantly improved search effectiveness. Incorporating the availability, information-to-noise ratio, and popularity metrics significantly improved site selection, and incorporating the popularity metric significantly improved information fusion. That site cohesiveness was not found to have an effect on either site selection or information fusion may be due to inaccuracies in the method used to calculate it. A further investigation of the effect of site cohesiveness was conducted using a modified formula. The results show that, with the modified formula, site cohesiveness improved search effectiveness when incorporated in site selection.
The authority metric was not found to have a significant impact in either centralized or distributed search. This may indicate that the authority ratings are not reliable. In summary, however, the results show that incorporating quality metrics can significantly improve search effectiveness in both centralized and distributed search.
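The abstract does not state how the relevance and quality rankings are combined. As a rough illustration only, the sketch below blends a term-statistics relevance score with a weighted average of per-document quality scores. The linear blend, the `alpha` parameter, the metric keys, and every name in the code are assumptions made for this example and are not taken from the dissertation.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """Hypothetical record; all scores are assumed to be normalized to [0, 1]."""
    doc_id: str
    relevance: float  # relevance score from term statistics
    quality: dict = field(default_factory=dict)  # metric name -> quality score


def combined_score(doc, weights, alpha=0.5):
    """Blend relevance with a weighted average of quality metrics.

    The linear blend is an assumption for illustration, not the
    dissertation's formula. Metrics missing from doc.quality count as 0.
    """
    total = sum(weights.values())
    if total == 0:
        # No quality metrics selected: fall back to relevance alone.
        return doc.relevance
    quality = sum(w * doc.quality.get(m, 0.0) for m, w in weights.items()) / total
    return alpha * doc.relevance + (1 - alpha) * quality


def rank(docs, weights, alpha=0.5):
    """Return documents ordered by combined score, best first."""
    return sorted(docs, key=lambda d: combined_score(d, weights, alpha), reverse=True)


docs = [
    Document("a", 0.9, {"currency": 0.1, "availability": 0.2}),
    Document("b", 0.8, {"currency": 0.9, "availability": 1.0}),
]
weights = {"currency": 1.0, "availability": 1.0}

print([d.doc_id for d in rank(docs, weights)])             # quality-aware order
print([d.doc_id for d in rank(docs, weights, alpha=1.0)])  # relevance-only order
```

With `alpha=1.0` the ranking reduces to relevance alone, which gives a baseline to compare against when judging whether a given quality metric helps.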
Keywords/Search Tags:Quality metrics, World wide, Incorporating, Centralized, Information, Distributed, Search, Site cohesiveness