ER-Topk Query Processing on Uncertain Streams

Posted on: 2018-09-01
Degree: Master
Type: Thesis
Country: China
Candidate: X Liu
Full Text: PDF
GTID: 2348330512487159
Subject: Software engineering
Abstract/Summary:
Uncertain data is found in many areas of the information society, including the economy, the military, location-based services, medical science, and meteorology. With the rapid spread of the mobile internet and the advent of new data acquisition technologies, the volume of uncertain data has grown so quickly that techniques for managing it have attracted attention from researchers in both academia and industry. Uncertainty appears in relational data, semi-structured data, data streams, and multidimensional data.

This dissertation studies how to process Top-k queries on uncertain streams. An uncertain stream is a large, continuous sequence of tuples, and the main difficulties in handling it are: (1) the data arrive so fast that real-time processing is required; (2) the data are too large to be loaded into memory; (3) because the data have multiple dimensions, optimized algorithms are needed to reduce the computational cost. Although academia has produced many research results, existing methods still have limitations in specific scenarios, and new techniques for managing uncertain streams are urgently needed. In addition, to improve both throughput and query response time, we design a general framework for processing queries on uncertain data streams. The main work of this dissertation covers the following aspects:

An approximate query algorithm for massive data streams. This algorithm addresses the problem that processing ER-Topk and TTk queries on uncertain streams can consume too much storage. It filters the incoming uncertain stream effectively, reduces the data processing load, and improves system throughput while keeping the precision of the results under control.

A framework for real-time uncertain stream processing. Based on our algorithm, we propose a framework for processing ER-Topk and TTk queries on uncertain streams. The framework uses parallel processing to achieve fast data processing.

An error detection algorithm for uncertain streams. Erroneous data often occurs in uncertain streams because of various influencing factors. To keep such data from seriously distorting query results, this dissertation provides an error detection method that judges the credibility of each tuple from the characteristics of the data.

Validation of the algorithm and framework. The goal of this dissertation is to process ER-Topk and TTk queries on uncertain streams. To verify the performance of our algorithm and framework in terms of throughput, reliability, and response speed, we design several experimental strategies and provide dedicated test data consisting of simulated data and formal data.
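The abstract does not define ER-Topk. As a rough illustration only, the sketch below assumes it refers to expected-rank Top-k under a simple tuple-level uncertainty model, where each tuple has a score and an independent existence probability, and an absent tuple is ranked after every present one. The function names `expected_ranks` and `er_topk`, and the model itself, are assumptions for illustration and are not taken from the dissertation; the sketch ranks one in-memory window and does not attempt the filtering, parallelism, or error detection described above.

```python
from typing import List, Tuple

# Hypothetical representation: (score, existence_probability) per tuple.
UncertainTuple = Tuple[float, float]


def expected_ranks(window: List[UncertainTuple]) -> List[float]:
    """Expected rank of each tuple, assuming independent existence.

    If the tuple exists, its rank is the number of existing tuples with a
    strictly higher score. If it does not exist, it is ranked after all
    existing tuples. Lower expected rank is better.
    """
    total_prob = sum(p for _, p in window)
    ranks = []
    for i, (score_i, p_i) in enumerate(window):
        # Expected number of higher-scored tuples that actually exist.
        higher = sum(
            p for j, (s, p) in enumerate(window) if j != i and s > score_i
        )
        # Expected number of other tuples that exist at all.
        others = total_prob - p_i
        ranks.append(p_i * higher + (1.0 - p_i) * others)
    return ranks


def er_topk(window: List[UncertainTuple], k: int) -> List[int]:
    """Indices of the k tuples with the smallest expected rank."""
    ranks = expected_ranks(window)
    order = sorted(range(len(window)), key=lambda i: ranks[i])
    return order[:k]


if __name__ == "__main__":
    # A high score with low probability versus a lower but certain score.
    window = [(10.0, 0.2), (5.0, 1.0)]
    print(expected_ranks(window))
    print(er_topk(window, 1))
```

Under this assumed model, a tuple with a high score but a low existence probability can rank below a lower-scored tuple that is certain to exist, which is the kind of trade-off a Top-k query on uncertain data has to resolve.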
Keywords/Search Tags:Uncertain Data, Data Stream Query, Top-k Query, Data Mining