An Experimental Evaluation Of TA-like Algorithms For Processing Top-N Queries

Posted on:2016-07-10

Degree:Master

Type:Thesis

Country:China

Candidate:M Q Yang

Full Text:PDF

GTID:2308330479478390

Subject:Software engineering

Abstract/Summary:

PDF Full Text Request

The processing of top-N queries with monotone aggregation function is one of the important issues in the fields of Database and Information Retrieval. Generally, top-N queries are more effective than the traditional database queries, and overcome their limitations. A top-N query can search the results that match the query conditions completely as well as find the close results; meanwhile, the results of a top-N query are sorted by the aggregation function.Fagin et al. proposed various TA-like algorithms such as the FA, TA, TAz, NRA and CA algorithms for different access methods as well as a variety of data resources, but they did not report the experimental results of the TA-like algorithms in their seminal papers. Since then, some of the original TA-like algorithms have been implemented, improved or adapted for different situations and/or applications; however, the original algorithms have not been thoroughly compared and analyzed under the same experimental framework. To address this problem, in this paper, we carry out extensive experiments over eleven datasets to measure the performance of the eight TA-like algorithms and one naive algorithm, and then we provide comprehensive surveys and insights on the natures of the TA-like algorithms from some perspectives based on our experimental results.Firstly, in order to evaluate top-N queries over Web-accessible databases and pipeline techniques, by adding cache methods in TA, TAz and NRA algorithms, we present three algorithms TA1, TAz1 and NRA1 which are modified from TA, TAz, and NRA. The original TA or TAz algorithm caches Top-N tuples by a bounded buffer; the modified algorithm TA1 or TAz1 caches all the seen tuples. The results of NRA may not have the total score, and then the results cannot be sorted; but NRA1 can get all the scores of tuples, which can be sorted.Secondly, we set up a uniform experimental framework based on eleven datasets from 2 to 104 dimensions, and then we measure the elapsed times of algorithms, and the numbers of sorted access, random access and seen tuples. We also analyze the efficiency of the algorithms with different result size N and normalized datasets. By analyzing, comparing and studying the experimental results, we obtain the natures of the algorithms with respect to their environments and applications.Thirdly, we study the cost model of FA algorithm under the uniform experimental framework. By using the theory of limit, we discuss the cost model of FA, and then compare and analyze the theoretical and practical values of FA.

Keywords/Search Tags:

Top-N Query, TA-like Algorithm, Information Retrieval, Web database, Aggregation Function

PDF Full Text Request

Related items

1	Research And Implementation Of Parallel Query Optimization Based On Table Cache Techniques
2	Research On Algorithms About Temporal Aggregation Query
3	Research On Probabilistic Databases And Efficient Query Technology
4	Based On The Keyword Query Database Search Engine System Design And Implementation
5	Based On Historical Query Relational Database Query Optimization Keyword Research Questions
6	Research On Information Retrieval System Based On Keyword
7	Research Of Relevant Document Retrieval Technology For Question Answering System
8	Information Retrieval System Based On Document Query
9	Effect of different database structure representations, query languages, and user characteristics on information retrieval
10	Research On Web Data Query