Font Size: a A A

Research On Top-K Join Algorithm Based On The Star Schema

Posted on:2012-08-27Degree:MasterType:Thesis
Country:ChinaCandidate:L X CaoFull Text:PDF
GTID:2218330362950446Subject:Computer Science and Technology
Abstract/Summary:PDF Full Text Request
Top-k join query returns k join results that users are most interested in. Top-k join has become one of the main research issues recently, and it's dominant in many applications, for example, data mining, web databases and information retrieval. Top-k join query also exists in data warehouse based on the star schema in practical application. For example, sometimes just the top-k join results that the decision maker is most interested in are desirable. However, the current existing algorithms aren't suitable for the data warehouse based on the star schema. In order to efficiently support top-k join query on star schema, we propose two kinds of indices and a multiple top-k join algorithm that is suitable for star schema based on these indices. By using a tighter upper bound than current existing algorithms and a pruning strategy, the algorithm is more efficient than the current existing algorithms. Furthermore, the experiment also shows that our algorithm is more efficient than the current existing algorithm.MTJS is the exact top-k join algorithm. However, obtaining the exact top-k join query results leads to an expensive cost, and sometimes the decision maker is willing to sacrifice the accuracy of the top-k join query results to reduce the execution time of the query execution and resources consumption. More importantly, we find that the current existing approximate top-k join algorithm is not suitable for star schema. In order to efficiently support approximate top-k join query on star schema, we propose an approximate top-k join algorithm MTJS-εthat is suitable for star schema.The algorithm is a variant of MSTJ, it returns the approximate top-k join results by introducing a parameterε. Because MTJS-εuses MTJS's upperbound and pruning strategy etc, its performance is more efficent than the current existing approximate top-k join algorithm. The experiment also shows that MTJS-εis more efficient than the current existing algorithm. In addition, we also find that the result that MTJS-εreturned is much more accurate than the accuracy thatεdefines.
Keywords/Search Tags:data warehouse, star schema, multiple top-k join algorithm
PDF Full Text Request
Related items