
Design issues for large-scale distributed systems: Data and resource managements

Posted on: 2006-01-18
Degree: Ph.D
Type: Dissertation
University: University of Minnesota
Candidate: Wu, Keqiang
Full Text: PDF
GTID: 1458390008450540
Subject: Engineering
Abstract/Summary:
Current cache consistency approaches in data-shipping DBMS architectures rely entirely on a centralized server or servers to provide concurrency control, which limits the scalability and performance of these systems. In addition, traditional asynchronous and deferred protocols are "blindly" optimistic about cached data and do not exploit data-sharing information. Moreover, the increasing complexity of generic large-scale distributed systems makes Quality of Service (QoS) design challenging. This dissertation proposes two protocols and a framework to address these issues of data and resource management, respectively.

First, this dissertation designs a protocol, Active Data-aware Cache Consistency (ADCC), for data-shipping DBMS. Compared with Callback Locking (CBL), ADCC uses peer-to-peer (P2P) communication to reduce the latency of discovering data conflicts by 50%, while increasing message overhead by only about 8%. In addition, ADCC improves scalability by partially offloading the concurrency control function from the server to the clients.

Second, this dissertation designs a protocol, Self-tuning Active Data-aware Cache Consistency (SADCC), for data-shipping DBMS. By statistically quantifying the speculation cost, clients can self-tune between optimistic and pessimistic consistency control. Compared with Asynchronous Avoidance-based Cache Consistency (AACC) and ADCC, SADCC significantly reduces the speculation cost in high-contention environments. In a non-contention environment with a high-speed network, both SADCC and ADCC show a slight reduction in performance (2.3% on average) compared to AACC. With high contention, however, SADCC achieves on average 14% higher throughput than AACC and 6% higher throughput than ADCC.

Finally, this dissertation demonstrates the limitations of adaptive control theory for QoS design in generic distributed systems, and proposes an adaptive dual control framework to mitigate those limitations. By incorporating the existing uncertainty of the on-line prediction into the control strategy and accelerating the parameter estimation process, the adaptive dual control framework optimizes the tradeoff between the control goal and the uncertainty, and displays robust and cautious behavior. In particular, under medium uncertainty, the average hit-rate ratios provided by the adaptive dual control system and the conventional adaptive control system deviate from the desired hit-rate ratio by about 13% and 40%, respectively.
Keywords/Search Tags: Data, Distributed systems, Cache consistency, Adaptive control, ADCC