Complex Rank Query Over Data Streams: Research And Implementation

Posted on: 2011-06-03  Degree: Master  Type: Thesis
Country: China  Candidate: D C He  Full Text: PDF
GTID: 2178360302474670  Subject: Computer Science and Technology
Abstract/Summary:
Data streams differ from conventional stored data in three main characteristics: they are produced continuously, arrive online, and are potentially unbounded in size. Data streams arise in many applications, such as network monitoring, financial services, web site visit tracking, stock markets, and sensor network monitoring. Because a data stream is unbounded in size, it cannot easily be archived, so the data can be read only a few times, or even only once, before query results must be reported. Conventional algorithms developed for archived data, such as K-means and K-centers for clustering, or Apriori and FP-tree for frequent pattern mining, cannot be used in data stream environments without modification. In this dissertation, we focus on an active research topic in data streams, the rank query. First (Chapter 1), we define the rank query over data streams and describe some application contexts, together with the shortcomings of previous research. Second (Chapter 2), we give an overview of the current state of this research area; in particular, two earlier algorithms that have proved mature are discussed in detail. Third (Chapters 3 and 4), we propose a considerably more complex rank query over multidimensional data streams, which we call the cluster-based rank query. For this new rank query, we propose a novel algorithm that adopts a multi-level bucket framework. Theoretical analysis establishes a worst-case bound on the algorithm's space requirement. Extensive experiments indicate that the proposed approach is efficient in both time and space when processing data streams and answering cluster-based rank queries, and that it also achieves good clustering quality with a guaranteed error bound. Finally (Chapter 5), we conclude and outline future directions in this research area.
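To make the notion of a rank query concrete, the following is a minimal Python sketch of an exact rank (quantile) query over a sliding window of a one-dimensional stream. It is not the algorithm proposed in the dissertation: it is a naive baseline that stores every item in the window, which is the space cost that approximate methods aim to avoid. The class name `SlidingWindowRank`, its methods, and the convention that rank 1 is the smallest item are all assumptions made for illustration.

```python
from bisect import bisect_left, insort
from collections import deque
import math


class SlidingWindowRank:
    """Exact rank queries over the most recent `window_size` stream items.

    Hypothetical baseline for illustration only: it keeps the whole window
    in memory, so its space cost grows linearly with the window size.
    """

    def __init__(self, window_size):
        if window_size <= 0:
            raise ValueError("window_size must be positive")
        self.window_size = window_size
        self._arrival = deque()  # items in arrival order, oldest first
        self._sorted = []        # the same items, kept in sorted order

    def insert(self, value):
        """Add a new stream item, evicting the oldest one if the window is full."""
        if len(self._arrival) == self.window_size:
            oldest = self._arrival.popleft()
            del self._sorted[bisect_left(self._sorted, oldest)]
        self._arrival.append(value)
        insort(self._sorted, value)

    def rank_query(self, r):
        """Return the item of rank r in the current window (1 = smallest)."""
        if not 1 <= r <= len(self._sorted):
            raise IndexError("rank out of range")
        return self._sorted[r - 1]

    def quantile_query(self, phi):
        """Return the item of rank ceil(phi * n), for 0 < phi <= 1."""
        n = len(self._sorted)
        return self.rank_query(max(1, math.ceil(phi * n)))


if __name__ == "__main__":
    window = SlidingWindowRank(window_size=4)
    for item in [5, 1, 9, 3, 7, 2]:
        window.insert(item)
    # The window now holds the last four items: 9, 3, 7, 2.
    print(window.rank_query(1), window.quantile_query(0.5))
```

Each insertion costs time linear in the window size because of the sorted-list update, and memory is proportional to the window. An approximate approach would instead trade a bounded rank error for a much smaller summary.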
Keywords/Search Tags: Data Stream, Multidimensional Data Stream, Rank Query, Quantile, Clustering, Approximate Query, Sliding Window