Font Size: a A A

K Representative Skyline Queries Based On User's Regret Ratio Via Subspace

Posted on:2020-06-28Degree:MasterType:Thesis
Country:ChinaCandidate:S D HanFull Text:PDF
GTID:2428330590472673Subject:Computer Science and Technology
Abstract/Summary:PDF Full Text Request
In the era of big data,returning points that users may be interested in is one of the most important functions for multi-criteria decision-making.The k-regret query uses maximum regret ratio as a metric and returns k data points that minimize user's maximum regret ratio.However,existing algorithms for the k-regret query suffer from a heavy burden by taking the skyline points as candidate points,which is not scalable with the dimensionality.Motivated by this,this thesis devotes to finding a small size of candidate set from the entire skyline points so that the k-regret algorithms can be applied efficiently on the smaller candidate set to improve their efficiency.The main contributions of this thesis are listed as follows.?1?The candidate set determination for the k-regret query via skyline frequency is proposed.First,this thesis proposes the concept of frequent skyline points based on skyline frequency which can be regarded as the candidate points for the k-regret query.Then,an algorithm called SFSKYCUBE based on accurate counting is proposed to determine frequent skyline points.To avoid expensive calculation of SFSKYCUBE,an effective approximate counting method called SFAPPROX which extends Monte-Carlo counting algorithm is presented.Besides,two techniques,pre-sorting and dynamic updating threshold,are adopted to further optimize the performance of SFAPPROX.Finally,experiments on synthetic and real datasets show the efficiency and effectiveness of our proposed method.?2?The candidate set determination for the k-regret query via skyline priority is given.First,the concept of candidate points called superior skyline points is defined based on skyline priority to be the candidate points for the k-regret query.Then,a bottom-up method using results-sharing and sorting-sharing strategies called SPBUS is proposed to determine the candidate points.Furthermore,a filtering algorithm is adopted to optimize SPBUS.Based on strategies of partition-and-merging-sharing and par-ent-sharing,a top-down algorithm is presented which reduces redundant computation and achieves op-timal computation cost.Finally,experiments on synthetic and real datasets are conducted to compare the performance of all algorithms and the quality of the candidate sets mentioned above.The results show that the two kinds of candidate sets can both largely reduce the running time of the k-regret algo-rithms meanwhile maintaining a small regret ratio.
Keywords/Search Tags:k-regret Query, Skyline Frequency, Skyline Priority, Candidate Set Determination
PDF Full Text Request
Related items