The Research Of Key Processing Techniques Of Uncertain Skyline Query

Posted on: 2017-03-20
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X Zhou
Full Text: PDF
GTID: 1368330488977074
Subject: Computer Science and Technology
Abstract/Summary:
With the progress of human society and the development of network technology, data has become a significant strategic resource alongside matter and energy. In this era of information explosion, it is urgent to extract key information from big datasets and provide decision support for customers. The skyline query is one of the important data management operators. It plays a significant role in multi-criteria optimization, decision support, environmental monitoring, data analysis, GPS navigation, and so on. In addition, uncertain data exists extensively in many areas such as sensor networks, location-based services, radio frequency identification, and web services, owing to the limitations of data collection equipment, the requirements of privacy protection, network transmission delay, and so on. Therefore, research on data management over uncertain data has great theoretical value and practical significance. This dissertation studies the key techniques of uncertain skyline query processing. The main work and innovations are as follows:

(1) Research on the uncertain skyline query in distributed environments. The existing algorithm for the uncertain skyline query in distributed environments has three limitations: it does not take the total query time into account; its definition of the approximate global skyline probability is inappropriate when the local skyline query results have dominance relationships among them; and it returns at most one final result after each iteration. Motivated by these limitations, this dissertation proposes a novel distributed uncertain skyline query framework to improve the effectiveness, universality, and progressiveness of the existing algorithm. The framework introduces a node-routing phase to prune the nodes that make no contribution to computing the final query result. Based on this framework, an adaptive distributed uncertain skyline query algorithm is designed. The algorithm uses a more general definition of the approximate global skyline probability and a new strategy for selecting the local representative objects. The results of extensive experiments verify the better overall performance of our algorithm.

(2) Research on the static skyline query over uncertain data. The P-skyline query is a powerful tool for managing uncertain datasets. It reports the data objects whose skyline probabilities are larger than a probabilistic threshold. However, the P-skyline query usually returns numerous results, which are not always desirable. To address this, we extend the traditional dominance operator, propose a modified P-skyline (MPS) query that returns higher-quality results, and develop several algorithms for processing MPS queries. Furthermore, the MPS query may also return a large number of results, especially over large or high-dimensional datasets. We therefore investigate the MPS query under size constraints and formulate a most representative MPS (MMPS) query based on a novel ranking criterion. For each skyline object s, this criterion considers both the objects that dominate s and the objects that s dominates. The approaches for the MPS query are each extended to handle the MMPS query. Extensive experiments verify that the proposed algorithms are efficient and scalable, and that the MPS query consistently returns more desirable results than the P-skyline query, with significant reductions in CPU cost, I/O cost, and memory cost.

(3) Investigation of the dynamic skyline query over uncertain data. With economic development and social progress, we have entered an era of product explosion, characterized mainly by an unprecedentedly prosperous consumer market. As a result, product distributors get lost in the proliferation of products. On the other hand, with the popularization of social networks, it has become more feasible to retrieve information about customer preferences, which plays an increasingly significant role in sorting out commercially valuable information. This preference information can play an important role in personalized service and personalized recommendation. The dynamic skyline query is a powerful tool for customers to select the products that meet their requirements. At present, research on the dynamic skyline query is still at an early stage. Besides, the existing uncertain dynamic skyline query reports different results for different probabilistic thresholds and always retrieves many non-ideal results. To address this, this dissertation formulates and tackles an uncertain dynamic skyline (UDS) query and proposes effective pruning strategies to reduce the search space of UDS query processing. Moreover, effective algorithms are presented that integrate the proposed pruning strategies. The experiments demonstrate the effectiveness and scalability of the proposed algorithms.

(4) Investigation of the uncertain dynamic skyline query with a size constraint. The dynamic skyline query over uncertain data always reports too many results to offer useful guidance to customers. Therefore, the uncertain dynamic skyline query with a size constraint is also studied in this dissertation. A novel query type, namely the top-k favorite probabilistic products (TFPP) query, is formulated. The TFPP query selects the k products that best meet the needs of a set of customers. To tackle the TFPP query, we propose a TFPP algorithm and an efficient parallelization of it. Extensive experiments under a variety of experimental settings illustrate the efficiency and effectiveness of the proposed algorithms.
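
The P-skyline query described in item (2) can be illustrated with a minimal sketch. It is not the dissertation's algorithm. It assumes a simple tuple-level uncertainty model in which each object exists independently with a given probability, and smaller attribute values are better. Under that assumption, an object's skyline probability is its own existence probability multiplied by the probability that none of its dominators exists. The names `UncertainObject`, `skyline_probability`, `p_skyline`, and the sample data `hotels` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from math import prod
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class UncertainObject:
    name: str
    values: Tuple[float, ...]  # smaller is better on every dimension (assumption)
    prob: float                # independent existence probability (assumed model)


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """a dominates b: no worse on every dimension, strictly better on at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def skyline_probability(obj: UncertainObject, dataset: Sequence[UncertainObject]) -> float:
    """Probability that obj exists and every object dominating it is absent."""
    absent = prod(
        1.0 - other.prob
        for other in dataset
        if other is not obj and dominates(other.values, obj.values)
    )
    return obj.prob * absent


def p_skyline(dataset: Sequence[UncertainObject], threshold: float) -> List[UncertainObject]:
    """Objects whose skyline probability is larger than the threshold (naive O(n^2) scan)."""
    return [o for o in dataset if skyline_probability(o, dataset) > threshold]


# Hypothetical data: (price, distance) pairs with existence probabilities.
hotels = [
    UncertainObject("A", (1.0, 4.0), 0.9),
    UncertainObject("B", (2.0, 5.0), 0.8),  # dominated by A
    UncertainObject("C", (3.0, 1.0), 0.5),
    UncertainObject("D", (4.0, 2.0), 0.7),  # dominated by C
]

if __name__ == "__main__":
    for h in hotels:
        print(h.name, round(skyline_probability(h, hotels), 3))
    print([h.name for h in p_skyline(hotels, 0.3)])
```

With this data, raising the threshold from 0.3 to 0.4 drops D from the result, which shows how the answer set depends on the chosen probabilistic threshold.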
Keywords/Search Tags: data management, skyline query, dynamic query, top-k query, uncertain data