Quality-Aware Scheduling For Key-Value Stores

Posted on:2015-10-14

Degree:Doctor

Type:Dissertation

Country:China

Candidate:C Xu

Full Text:PDF

GTID:1228330467965485

Subject:Computer application technology

Abstract/Summary:

PDF Full Text Request

Key-value stores widely served as data platform for various Web applications, ranging from Facebook social networking based on Cassandra to Amazon online shopping hosted by Dynamo, provide a distributed solution to cloud computing and big data management. In order to answer Web queries quickly, key-value stores generally employ weak consistency model. This model relaxes the data consistency to improve system performance on query response. However, it incurs negative impact that data accessed by users might be stale. Hence, there is an intrinsic trade-off between query latency and data consistency, which has become a key factor in the design of large-scale data management systems.In particular, data consistency expresses as the freshness of data accessed by queries at a local node. Clearly, the latency/consistency trade-off at node level further turns on the one between query latency (i.e., Quality of Service (QoS)) and data freshness (i.e., Quality of Data (QoD)). In real application, different Web queries or users would have different expectations in terms of QoS and QoD. In this work, the core issue is to optimize QoS and QoD at node level and the contributions are listed as follows:· The general asynchronous update model in distributed key-value stores as well as the metrics of QoS and QoD are analyzed, so that to formally define the quality-aware scheduling problem. With the help of analysis on data structure model, data repli-cation as well as consistency, user query and system update in key-value stores, the components strongly related to quality-aware scheduling are abstracted and modeled in this work. Meanwhile, service level agreement and freshness level agreement are adopted to measure QoS and QoD respectively. Based on that, the quality-aware scheduling problem is formally defined in distributed key-value stores.· Freshness/Tardiness, Adaptive Freshness/Tardiness mechanisms as well as popularity-aware versions are designed, which efficiently supports quality-aware scheduling for state-transfer updates. Towards the key-value stores with state-transfer updates, Freshness/Tardiness (FIT) is employed to consider the requirement on both QoS and QoD by selectively installing updates, i.e., applying or skipping updates. In order to explore deadlines to further balance the trade-off between QoS and QoD, the variant of FIT, i.e., Adaptive Freshness/Tardiness (AFIT), is adopted in this work. In addition, popularity is leveraged to address skewed data access pattern under popularity-aware mechanism. This mechanism groups the queries access the same data item into a query group as the scheduling unit. Hence, a suite of approaches efficiently support quality-aware scheduling for state-transfer updates. · The approximate algorithms for Freshness/Tardiness mechanism and its popularity-aware version are proposed, which reduces the computational complexity of quality-aware scheduling for operation-transfer updates. Freshness/Tardiness mechanism is extended towards the key-value stores with operation-transfer updates. The decision in FIT for operation-transfer updates goes beyond a simple binary decision of in-stalling or skipping updates to a more general case of deciding how many updates to install, although the principles are the same with the one for state-transfer updates. Similarly, the decision in popularity-aware FIT for operation-transfer updates goes beyond the binary decision to the more general case. Hence, in order to reduce the cost on computing the number of updates under frequent updates, we propose ap-proximate algorithms which reduces the computational complexity of quality-aware scheduling for operation-transfer updates.· The prototype of quality-aware scheduler as well as its benchmark are implemented, on which an online application supported by quality-aware scheduling are demonstrated. By exploring the technique details of implementation on distributed key-value stores, the prototype framework of a quality-aware scheduler is depicted and a prototype system based on Cassandra, namely AQUAS, is implemented in this work. In addition, a benchmark for quality-aware scheduling in distributed key-value stores, as well as the implementation of QCSB benchmark based on YCSB, are illustrated here. As an example, a timeline query on microblog is demonstrated on AQUAS in order to show how to support online applications. Hence, both the prototype framework and the benchmark for this quality-aware scheduler as well as the high-level online applications are provided as a complete and valid solution.In summary, this study focuses on quality-aware scheduling in key-value stores, and designs a prototype framework of quality-aware scheduler, and illustrates the scheduling supports online applications. This research with its coherence and sustainablity forms a relatively complete component itself. This work is based on the comprehensive survey and analysis on state-of-art theories and techniques. The theory analysis as well as ex-tensive experiments show that the proposed solutions for quality-aware scheduling achieve significant efficiency and effectiveness.

Keywords/Search Tags:

Key-Value Stores, Quality of Service, Quality of Data, Data Con-sistency, Scheduling, Microblogging Application

PDF Full Text Request

Related items

1	Quality of service scheduling for heterogeneous data services on wireless systems
2	The Research And Application Of Data Quality In Data Warehouse Environment For Telecom Industry
3	Data Quality Assessment Model And Quality Propagation For Relational Database
4	Research On Scheduling Algorithms For Diverse Communication Requests On Wireless Networks
5	Research On Quality Evaluation Methods For Big Data Application Software
6	A Three Layers Architecture Of Date Warehouse Driving By Data Quality
7	Data Quality Analysis And Optimization In Public Security Intelligence Based On ETL
8	The Research And Application Of Social Media Data Management System Based On Crawler Technology
9	Research O Abnormal Data Analysis Based On Service Industry Data Quality
10	Research On Oilfield Data Quality Supervision And Control Model