Font Size: a A A

An Experimental Evaluation Of TA-like Algorithms For Processing Top-N Queries

Posted on:2016-07-10Degree:MasterType:Thesis
Country:ChinaCandidate:M Q YangFull Text:PDF
GTID:2308330479478390Subject:Software engineering
Abstract/Summary:PDF Full Text Request
The processing of top-N queries with monotone aggregation function is one of the important issues in the fields of Database and Information Retrieval. Generally, top-N queries are more effective than the traditional database queries, and overcome their limitations. A top-N query can search the results that match the query conditions completely as well as find the close results; meanwhile, the results of a top-N query are sorted by the aggregation function.Fagin et al. proposed various TA-like algorithms such as the FA, TA, TAz, NRA and CA algorithms for different access methods as well as a variety of data resources, but they did not report the experimental results of the TA-like algorithms in their seminal papers. Since then, some of the original TA-like algorithms have been implemented, improved or adapted for different situations and/or applications; however, the original algorithms have not been thoroughly compared and analyzed under the same experimental framework. To address this problem, in this paper, we carry out extensive experiments over eleven datasets to measure the performance of the eight TA-like algorithms and one naive algorithm, and then we provide comprehensive surveys and insights on the natures of the TA-like algorithms from some perspectives based on our experimental results.Firstly, in order to evaluate top-N queries over Web-accessible databases and pipeline techniques, by adding cache methods in TA, TAz and NRA algorithms, we present three algorithms TA1, TAz1 and NRA1 which are modified from TA, TAz, and NRA. The original TA or TAz algorithm caches Top-N tuples by a bounded buffer; the modified algorithm TA1 or TAz1 caches all the seen tuples. The results of NRA may not have the total score, and then the results cannot be sorted; but NRA1 can get all the scores of tuples, which can be sorted.Secondly, we set up a uniform experimental framework based on eleven datasets from 2 to 104 dimensions, and then we measure the elapsed times of algorithms, and the numbers of sorted access, random access and seen tuples. We also analyze the efficiency of the algorithms with different result size N and normalized datasets. By analyzing, comparing and studying the experimental results, we obtain the natures of the algorithms with respect to their environments and applications.Thirdly, we study the cost model of FA algorithm under the uniform experimental framework. By using the theory of limit, we discuss the cost model of FA, and then compare and analyze the theoretical and practical values of FA.
Keywords/Search Tags:Top-N Query, TA-like Algorithm, Information Retrieval, Web database, Aggregation Function
PDF Full Text Request
Related items