
Answering Skyline Queries Over Incomplete Data With Crowdsourcing

Posted on: 2020-05-30
Degree: Master
Type: Thesis
Country: China
Candidate: S Guo
GTID: 2428330575959713
Subject: Computer Science and Technology
Abstract/Summary:
Over the past decade, queries over incomplete data have attracted wide attention. Current approaches focus mainly on data preprocessing, using imputation methods based on machine learning, so query quality depends on what machines can infer. For some problems that are inherently difficult for machines, however, humans perform better. We therefore aim to improve query quality through crowdsourcing. In this thesis, we study the problem of skyline queries over incomplete data and propose a novel query framework, termed BayesCrowd. The framework consists of an incomplete data modeling phase and a crowdsourcing phase. In the modeling phase, we capture data correlation using a Bayesian network, leverage the c-table model to represent incomplete objects, and develop an efficient modeling algorithm. In the crowdsourcing phase, we determine the dominance relationships between objects through crowdsourcing to obtain the skyline query results. Because of budget and latency constraints, only the most beneficial tasks should be crowdsourced. We therefore design a utility function to measure the benefit of crowdsourcing a task, and present three effective task selection strategies to meet different needs. Extensive experiments on both real and synthetic data sets confirm the superiority of BayesCrowd in terms of execution time, monetary cost, and latency.
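
To make the notions of skyline and dominance concrete, the following is a minimal sketch and not the BayesCrowd algorithm itself. It assumes that smaller values are better in every dimension and that a missing value is written as None; the function names and the three-valued result ("yes", "no", "unknown") are illustrative choices that do not come from the thesis.

```python
from typing import List, Optional, Sequence

# Assumption: an object is a tuple of numbers; None marks a missing value.
Obj = Sequence[Optional[float]]


def dominance_status(a: Obj, b: Obj) -> str:
    """Report whether a dominates b: "yes", "no", or "unknown".

    Assumption: smaller is better. a dominates b if a is no worse in
    every dimension and strictly better in at least one.
    """
    has_missing = False
    strictly_better = False
    for x, y in zip(a, b):
        if x is None or y is None:
            has_missing = True       # cannot compare this dimension
        elif x > y:
            return "no"              # a is worse somewhere: never dominates
        elif x < y:
            strictly_better = True
    if has_missing:
        return "unknown"             # the outcome depends on missing values
    return "yes" if strictly_better else "no"


def skyline(objects: List[Obj]) -> List[Obj]:
    """Skyline of complete objects: those not dominated by any other."""
    return [
        p for p in objects
        if not any(dominance_status(q, p) == "yes"
                   for q in objects if q is not p)
    ]


if __name__ == "__main__":
    complete = [(1, 5), (2, 2), (3, 3), (5, 1)]
    print(skyline(complete))
    print(dominance_status((2, 2), (3, None)))
```

In this sketch, the pairs that come back as "unknown" are the ones whose dominance cannot be settled from the stored data alone. Comparisons of that kind are what a crowdsourcing phase would have to resolve, subject to the budget and latency constraints described above.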
Keywords/Search Tags:incomplete data, skyline query, crowdsourcing, query optimization