Font Size: a A A

Research On Skyline Query Processing Techniques In Distributed Systems

Posted on:2014-05-29Degree:DoctorType:Dissertation
Country:ChinaCandidate:B YinFull Text:PDF
GTID:1368330488499838Subject:Computer application technology
Abstract/Summary:PDF Full Text Request
Skyline query processing has recently received a lot of attention in database community,as a result of its importance in many applications,such as multi-criteria decision making,data mining and database visualization,and user preference search.Given a data set,the skyline query returns a set of points that are not dominated by any other point in the data set.Users only need to choose interesting objects in the skyline set,without having to consider the dominated points.With the wide application of distributed networks and the development of cloud computing,distributed skyline query processing has witnessed a grow interest.Skyline queries in distributed environments pose inherent challenges and require non-traditional techniques,due to the lack of global knowledge and special demands of algorithms in different distributed environments.This paper studies the distributed skyline query problem.The major contributions are summarized as following:(1)Skyline monitoring in wireless sensor networks is studied,and a skyline query algorithm based on predicting and data representation is proposed.The sink collects the prediction errors from all sensors,and represents each sensor reading with a hyper-square that centers at the prediction point with side length of two times the prediction error.A sensor node reports its sensing reading only if the sink explicitly asks for the current attribute values.In order to maximize the benefit of making predictions,a piecewise linear prediction model is proposed.Experimental results show that,the proposed algorithm can efficiently reduce the communication cost for skyline queries in the network.(2)The filtering technique in anti-correlated and clustered datasets in wireless sensor networks is studied.A personalized filtering algorithm based on data cluster representation is proposed.The existing approaches usually select the tuples(or related values)with the strongest dominating capability as the filter.The filtering effect is greatly affected by data distribution.The proposed algorithm allocates different filters for different nodes in the network,therefore adapting to the anti-correlated(and clustered)databases.In order to reduce the filtering cost and maximize the pruning power of selected filters,a novel data cluster representation scheme and a sampling technique based on history query results are proposed.The experimental results show that,the algorithm can output the right skyline results and significantly reduces the filtering cost and the overall query communication cost in the network.(3)We further study the continuous reverse skyline query problem in sensor networks,and extend the proposed algorithm for continuous skyline query to continuous reverse skyline query.As the reverse skyline operator is not decomposable(unlike skyline operator),even if a point is identified as a non-reverse skyline points,it could not be discarded because it may affect the reverse skyline judgment of points in other nodes.Thus,we propose the concepts of extended semi-dominance and extended full-dominance.The skyline nodes,non-skyline nodes,and the non-skyline nodes that can be immediately pruned can be then identified based on their representatives.In order to reduce the number of the nodes that need to report its reading,sensor nodes are queried in phase.Experiment results show that,the proposed algorithm can output the right skyline results,and it is energy efficient.(4)The skyline query processing in the client-server architecture is studied.A data partitioning-based distributed skyline computation algorithm is proposed.Data points on each server are divided into partitions based on the dependency relationships among data of different servers,such that in-comparable partitions can be queried in parallel.The in-comparable partitions may come from different servers or the same server.Partitions are then enforced the skyline partial order according to dependencies,which outputs skyline results progressively and helps in effective pruning.We have proved that the height of the query plan is bounded in theory.
Keywords/Search Tags:Skyline query processing, Dominance relationships, Distributed networks, Wireless sensor networks, Data partitioning
PDF Full Text Request
Related items