
Combinatoric models of information retrieval ranking methods and performance measures for weakly-ordered document collections

Posted on: 2011-08-12
Degree: Ph.D.
Type: Dissertation
University: The University of North Carolina at Chapel Hill
Candidate: Church, Lewis
Full Text: PDF
GTID: 1448390002964588
Subject: Information Science
Abstract/Summary:
This dissertation answers three research questions: (1) What are the characteristics of a combinatoric measure, based on the Average Search Length (ASL), that performs the same as a probabilistic version of the ASL? (2) Does the combinatoric ASL measure produce the same performance result as the one obtained by ranking a collection of documents and calculating the ASL empirically? (3) When do the ASL and one of the Expected Search Length, MZ-based E, or Mean Reciprocal Rank measures both imply that one document ranking is better than another?

Concepts and techniques from enumerative combinatorics and other branches of mathematics were used to develop combinatoric models and equations for several information retrieval ranking methods and performance measures. These models and equations were validated by empirical, statistical, and simulation means.

The document cut-off performance measure equation variants developed in this dissertation can be used for performance prediction and to help study any vector V of ranked documents, at arbitrary document cut-off points, provided that (1) relevance is binary and (2) the following information can be determined from the ranked output: the document equivalence classes and their relative sequence, the number of documents in each equivalence class, and the number of relevant documents that each class contains. The performance measure equations yielded correct values for both strongly- and weakly-ordered document collections.
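To make the inputs listed above concrete, the following is a minimal sketch of how an expected ASL could be computed for a weakly-ordered ranking from the equivalence classes alone. It is not the dissertation's own equation. It assumes that ASL means the average rank position of the relevant documents and that documents tied within an equivalence class are ordered uniformly at random, so a relevant document in a class of n documents preceded by p documents has expected position p + (n + 1)/2. The function names and the `(n_docs, n_relevant)` input layout are hypothetical. A brute-force enumeration over all tie-breaking orders is included as a check.

```python
from fractions import Fraction
from itertools import permutations, product


def expected_asl(classes):
    """Expected ASL for equivalence classes given in ranked order.

    classes: list of (n_docs, n_relevant) pairs (hypothetical layout).
    Assumes binary relevance and uniformly random order within each class.
    """
    total_relevant = sum(r for _, r in classes)
    if total_relevant == 0:
        raise ValueError("at least one relevant document is required")
    preceding = 0            # documents ranked ahead of the current class
    weighted = Fraction(0)   # sum of expected positions of relevant documents
    for n, r in classes:
        if not 0 <= r <= n:
            raise ValueError("n_relevant must lie between 0 and n_docs")
        weighted += r * (preceding + Fraction(n + 1, 2))
        preceding += n
    return weighted / total_relevant


def brute_force_asl(classes):
    """Average ASL over every distinct tie-breaking order (small inputs only)."""
    per_class = [
        sorted(set(permutations([1] * r + [0] * (n - r)))) for n, r in classes
    ]
    total, count = Fraction(0), 0
    for combo in product(*per_class):
        labels = [label for block in combo for label in block]
        positions = [i + 1 for i, label in enumerate(labels) if label]
        total += Fraction(sum(positions), len(positions))
        count += 1
    return total / count


weak = [(2, 1), (3, 2)]            # two tied classes: weakly ordered
strong = [(1, 1), (1, 0), (1, 1)]  # singleton classes: strongly ordered
print(expected_asl(weak), brute_force_asl(weak))
print(expected_asl(strong))
```

With singleton classes the calculation reduces to the ordinary average position of the relevant documents, which is why the same sketch covers both the strongly- and weakly-ordered cases.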
Keywords/Search Tags:Measure, Document, Performance, Combinatoric, Ranking, ASL, Models, Information