Font Size: a A A

Research On Skyline Computation In Multiple Environments

Posted on:2016-04-21Degree:DoctorType:Dissertation
Country:ChinaCandidate:B H HuangFull Text:PDF
GTID:1108330482953137Subject:Computer application technology
Abstract/Summary:PDF Full Text Request
Skyline computation, also known as skyline query, is an essentially multi-objective optimization problem, which aims to find the information of interest in a given dataset. As a basic operator, skyline computation is widely applied in many fields, such as multi-objective decision-making system, navigation system, environmental monitoring system, data mining, etc. Therefore, since proposed in 2001, skyline computation has been attracting the attention of many researchers.There are three main factors which should be considered in the design of skyline algorithms:dataset character, running environment and design objective. Dataset character includes data generation method, storage structure, spatial dimension and scale, etc. Running environment refers to the computation model and network environment which carry out algorithms. It can be classified as centralized and distributed system. The design objective covers efficiency, progressivity, fairness, user-friendly, and so on. All the factors intertwined leads to the complex diversity of skyline computation environments.Nowadays, there exist a variety of skyline algorithms for different environments, including centralized and distributed systems, fixed and mobile objects, static and dynamic datasets, low and high dimensional data space, etc. However, with the rapid development of new technology, such as cloud computing, sensor network, big data and mobile computing, new environments and requirements have continued to arise. Faced with this situation, it is hard for the existing algorithms to keep up with the new environments.In this thesis, several issues of skyline computation were explored, which mainly include:the efficiency of algorithms in centralized static environment; continuous skyline query in the mobile environment based on LBS; skyline query on distributed networks; reverse skyline query on centralized static datasets. The main contributions are as follows:(1) In a centralized static dataset environment, to improve the efficiency of skyline computation on enormous and high-dimensional datasets, two algorithms were proposed. The first one sorts the dataset by any dimension. Then the pre-sorted dataset is divided into several parts and executed concurrently by multi-core processors. The second one sorts the dataset by any dimension firstly. Then the dataset space is divided into several disjoint regions via a carefully selecting of pivot point. Using the dominance relations among regions, domination test-time between data points can be reduced, which further improve the efficiency. Both of the two algorithms have the characteristic of simplicity, good progressiveness, user-friendly and scalability. The experiment result shows, on enormous and high-dimensional datasets, both of the two algorithms can achieve great efficiency improving.(2) In a mobile environment, where the query point is moving fast toward unpredictable directions, a novel algorithm based on LBS was proposed by combining of data stream technology for the issue of continuous skyline query. R-tree is used to update the dataset quickly in search areas. Then a "passive" data stream is constructed by overlap of two adjacent search areas. This results in the forming of "new" and "fail" sub-datasets, which are dealt with separately next. Due to the full use of the existing results, the amount of calculation has dropped significantly. The experimental result shows that this algorithm is particularly suitable for the high frequency calculation and compared with the one using grid index, its efficiency is great improving with the increasing size of dataset.(3) In a distributed environment, we study the problem of skyline computation on dynamic data stream based on hierarchical P2P network. An algorithm using hierarchical aggregation idea from button to top was proposed. For the groups in low-level network, a routing tree is established to ensure that data points can be effectively filtered at each hop. For the chord in up-level network, to promote the progressivity and efficiency of our algorithm, multi-dimensional data points are mapped into one-dimensional space and calculated by the order of node identifier. The experimental result shows the computation and communication cost are reduced.(4) For meeting the changes in application requirements, the problem of reserve skyline computation on static data sets is also studied. Base on the classical algorithm BBRS, an improving algorithm was proposed. Established the R-tree index for the dataset in advance firstly, then divide the data space into several areas by the query point and finally use a simply window-query method and multi-core parallel technology to promote the efficiency. Experimental result shows that compared with the traditional algorithm, the performance of our algorithm has a certain increasing.
Keywords/Search Tags:skyline query, multi-core parallel, LBS, hierarchical P2P network, reserve skyline
PDF Full Text Request
Related items