Research On Top-k Query In Uncertain Database

Posted on:2013-07-18

Degree:Master

Type:Thesis

Country:China

Candidate:X J Li

Full Text:PDF

GTID:2248330371972082

Subject:Computer software and theory

Abstract/Summary:

Applications such as data mining, sensor networks, and data retrieval generate large amounts of uncertain data, which are widely used in finance, the military, and other fields. Uncertain data provide imprecise information to users. In some cases the imprecision can be eliminated completely, but doing so loses important information, so uncertain data must be managed and stored effectively. Uncertain databases are used to manage such data. Users are usually most interested in the most important (top-k) answers within a potentially huge answer space. Top-k queries are widely applied in traditional databases, where both their semantics and their results are clear because the data are precise. Uncertain data, by contrast, are unreliable and imprecise. An uncertain tuple has two pillars: confidence and generation rules, so a top-k query over an uncertain database must consider both the tuples' scores and their uncertainty. The interplay between score and uncertainty makes traditional techniques inapplicable. Researchers have proposed many top-k query algorithms over uncertain databases, but these algorithms use different query semantics and do not integrate the tuples' scores and probability values well, so their results do not fully satisfy users' needs. Top-k queries over uncertain databases therefore need further study.

Firstly, this paper studies and analyzes uncertain data and uncertain databases. On the basis of a model of uncertain data, the author defines a novel, unambiguous top-k query semantics for uncertain databases. The novel top-k query returns k tuples. To decide which tuple is placed at rank i, it compares the tuple most probable at rank i with the tuple second most probable at rank i-1, and returns the better of the two as the final result at rank i. In this way the novel top-k query better balances the tuples' scores and uncertainty.
In addition, users can define a threshold according to their needs, so that every tuple in the query result has a probability greater than the threshold. The novel semantics thus balances the uncertain tuples' scores and probabilities well.

Secondly, this paper implements the novel top-k query. Modeling the data makes the possible-world space grow exponentially, so materializing all possible worlds costs a great deal of query time. Two optimization methods are therefore applied to the algorithm; they cut the running time and reduce the scan depth over tuples, making the algorithm more efficient.

Finally, this paper reports experiments, which show that the algorithm is effective on different data sets.
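
To make the possible-world blow-up concrete, the following is a minimal sketch of the naive baseline that the abstract says is too expensive. It is not the thesis's algorithm or its novel semantics. It assumes tuples are independent (generation rules are ignored), enumerates every possible world, and then picks the most probable tuple at each rank, a selection close to what the literature calls U-kRanks. The function names, the (name, score, confidence) layout, and the sample data are all hypothetical.

```python
from itertools import product


def rank_probabilities(tuples):
    """Return {name: [P(tuple is at rank 1), P(rank 2), ...]}.

    `tuples` is a list of (name, score, confidence).
    Assumption: tuples exist independently of each other.
    """
    n = len(tuples)
    probs = {name: [0.0] * n for name, _, _ in tuples}
    # One possible world per present/absent choice for every tuple: 2**n worlds.
    for world in product([True, False], repeat=n):
        p_world = 1.0
        for present, (_, _, conf) in zip(world, tuples):
            p_world *= conf if present else 1.0 - conf
        # Rank the tuples that exist in this world by descending score.
        alive = sorted(
            (t for present, t in zip(world, tuples) if present),
            key=lambda t: t[1],
            reverse=True,
        )
        for rank, (name, _, _) in enumerate(alive):
            probs[name][rank] += p_world
    return probs


def most_probable_at_each_rank(tuples, k, threshold=0.0):
    """For ranks 1..k, keep the most probable tuple if it beats `threshold`."""
    probs = rank_probabilities(tuples)
    result = []
    for i in range(min(k, len(tuples))):
        name, p = max(
            ((nm, pr[i]) for nm, pr in probs.items()), key=lambda x: x[1]
        )
        if p > threshold:
            result.append((i + 1, name, p))
    return result


# Hypothetical sample data: (name, score, confidence).
sample = [("t1", 100, 0.5), ("t2", 90, 0.8), ("t3", 80, 1.0)]
top2 = most_probable_at_each_rank(sample, k=2)
```

With the sample data, t1 is the most probable tuple at rank 1 (probability about 0.5) and t3 is the most probable at rank 2 (about 0.5), even though t2 has the higher score. The loop visits 2**n worlds, which is the exponential cost that motivates optimizations such as limiting the scan depth.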

Keywords/Search Tags:

uncertain data, Top-k query, uncertain database

Related items

1	Study On Skyline Query Processing Techniques On Uncertain Data
2	Uncertain Multimedia Data Personalized Query System Design And Implementation
3	Two New Top-k Queries Based On Uncertain Database
4	Research On Uncertain Data Stream Database System
5	Research On Key Techniques For Top-k Query Processing Over Uncertain Data
6	Research Of Top-k Query Processing On Uncertain Data
7	Research On Probabilistic Aggregate Nearest Neighbor Query Method Over Uncertain Data
8	The Research Of Key Processing Techniques Of Uncertain Skyline Query
9	Research On Network Public Opinion Communication Modeling Under Uncertain Environment
10	Study On Indexing And Range Query Processing Techniques For Uncertain Data