Font Size: a A A

Anytime top-k queries on exact and fuzzy data

Posted on:2007-01-15Degree:M.SType:Thesis
University:The University of Texas at ArlingtonCandidate:Chaudhari, BhushanFull Text:PDF
GTID:2448390005971436Subject:Computer Science
Abstract/Summary:
Top-k queries on large multi-attribute data sets are fundamental operations in information retrieval and ranking applications. In this thesis, we initiate research on the anytime behavior of top-k algorithms on exact and fuzzy data. In particular given specific top-k algorithms we are interested in studying their progress towards identification of the correct result at any point of the algorithms' execution. We adopt a probabilistic approach where we seek to report at any point the scores of the top-k results the algorithm has identified, as well as associate a confidence with this prediction. Such functionality can be a valuable asset when one is interested to reduce the runtime cost of top-k computations. We show analytically that such probability and confidence are monotone in expectation. We present a thorough experimental evaluation to validate our techniques using both synthetic and real data sets.
Keywords/Search Tags:Data, Top-k
Related items