Research On Key Technologies Of Skyline Query Processing On Massive Data

Posted on: 2019-07-08
Degree: Master
Type: Thesis
Country: China
Candidate: X Li
Full Text: PDF
GTID: 2428330599977708
Subject: Computer Science and Technology
Abstract/Summary:
Skyline is an important operation that returns the set of interesting tuples not dominated by any other tuple in a potentially huge data space. Skyline processing originates from the maximum vector problem; Borzsonyi et al. introduced the skyline operator into database systems in 2001, where it is mainly used for preference queries. Because it is widely used in multi-objective decision-making, data mining, database visualization and other areas, and because data management is now routinely carried out at big-data scale, Skyline query processing has attracted growing attention from researchers in recent years. Against this background, this paper studies the key technologies of Skyline queries on massive data: Range-Skyline, and Skyline on incomplete data.

For Range-Skyline, this paper first proposes a baseline algorithm, BA, which scans the data set sequentially, first determines whether each tuple satisfies the range criteria, and then computes the skyline over the set of tuples that satisfy them. To improve on the poor efficiency of BA, this paper proposes the PSR algorithm, which is based on a preordered table. PSR generates the preordered table in a preprocessing stage and returns the query results through a sequential scan of that table. In addition, PSR analyzes the data characteristics to derive an early termination condition; the correct query results are obtained by ending the scan as soon as this condition is met. To further improve the query efficiency of PSR, this paper proposes the dominate-prune and range-prune strategies, which use bitmap pruning during the execution of PSR to discard most of the tuples in the data set. The results show that PSR can compute Range-Skyline results efficiently on massive data.

For Skyline on incomplete data, this paper first proposes the BNI algorithm, whose main idea is nested loops, as the baseline. BNI can compute the Skyline in various situations, but it is inefficient and its execution time is too long. This paper therefore proposes a novel two-stage algorithm, TSI, which scans the whole data set twice to obtain the result. In phase 1, TSI scans the data set sequentially and maintains a set of candidate tuples; in phase 2, TSI returns the result by discarding every tuple in the candidate set that is dominated by another tuple. TSI also includes a pruning rule that combines the dominate-bitmap and the incomplete-bitmap to reduce I/O in phase 1. Extensive experimental results show that TSI can compute the skyline on incomplete data efficiently.
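The two baselines described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function names, the convention that smaller values are better, the use of one (low, high) range per attribute, the use of `None` for a missing value, and the rule that incomplete tuples are compared only on attributes both of them have are all assumptions made for illustration. The preordered table of PSR, the bitmaps, and the two-phase structure of TSI are not modeled here.

```python
from typing import List, Optional, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b everywhere and strictly better somewhere.

    Assumption: smaller values are preferred in every dimension.
    """
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def range_skyline_baseline(
    data: Sequence[Sequence[float]],
    ranges: Sequence[Tuple[float, float]],
) -> List[Sequence[float]]:
    """BA-style baseline: one sequential scan, range check first, then skyline."""
    skyline: List[Sequence[float]] = []
    for t in data:
        # Step 1: skip tuples that fall outside the query range.
        if not all(lo <= v <= hi for v, (lo, hi) in zip(t, ranges)):
            continue
        # Step 2: skip tuples dominated by a current skyline tuple.
        if any(dominates(s, t) for s in skyline):
            continue
        # Step 3: drop skyline tuples that the new tuple dominates, then add it.
        skyline = [s for s in skyline if not dominates(t, s)]
        skyline.append(t)
    return skyline


def dominates_incomplete(
    a: Sequence[Optional[float]], b: Sequence[Optional[float]]
) -> bool:
    """Dominance restricted to dimensions where both values are present.

    Assumption: None marks a missing value; this comparison rule is a common
    convention and is not stated in the abstract.
    """
    common = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    return all(x <= y for x, y in common) and any(x < y for x, y in common)


def skyline_incomplete_nested_loop(
    data: Sequence[Sequence[Optional[float]]],
) -> List[Sequence[Optional[float]]]:
    """Nested-loop baseline: keep a tuple only if no other tuple dominates it.

    Every tuple is compared against every other tuple because, under the
    assumed rule, dominance on incomplete data need not be transitive.
    """
    return [
        t
        for i, t in enumerate(data)
        if not any(dominates_incomplete(o, t) for j, o in enumerate(data) if j != i)
    ]


if __name__ == "__main__":
    points = [(1, 9), (2, 2), (3, 3), (5, 1), (9, 9)]
    print(range_skyline_baseline(points, [(2, 9), (0, 9)]))

    partial = [(1, None, 5), (2, 3, None), (None, 1, 4), (3, 4, 6)]
    print(skyline_incomplete_nested_loop(partial))
```

The nested-loop version costs a quadratic number of comparisons, which is consistent with the abstract's remark that the nested-loop baseline is slow and motivates candidate pruning.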
Keywords/Search Tags: Range-Skyline, incomplete data, prune