
Research And Implementation Of Cloud Storage System Based On Hot Data In IaaS

Posted on: 2016-12-08; Degree: Master; Type: Thesis
Country: China; Candidate: M Zhang
GTID: 2308330479991055; Subject: Computer technology
Abstract/Summary:
With the arrival of the mobile Internet and big data, the volume of unstructured data is growing explosively, and handling and storing it has become a pressing challenge. Cloud storage makes it possible to store big data: it integrates and manages widely distributed storage devices to provide high-performance, low-cost storage services on demand. However, existing storage systems do not consider how hot the data is, even though a study by Rabinovich shows that the access probability of data in a storage system follows a Zipf distribution [1]. Accounting for hot data is therefore key to improving storage system performance.

This thesis studies the architecture and implementation model of the popular OpenStack Swift cloud storage system and finds that, in small-scale deployments, proxy nodes are underutilized in the traditional Swift architecture. It therefore proposes enhanced nodes, which combine the roles of storage node and proxy node, to improve system performance. Because proxy nodes have limited storage space, a novel method is proposed that dynamically migrates hot partitions to proxy nodes.

First, a statistical model for hot data prediction is proposed. Future data traffic is predicted by analyzing data access trends, and a multi-cycle heat statistics algorithm describes the degree of heat quantitatively. The algorithm calculates the download frequency of data over different periods, which effectively dampens the fluctuations in heat values caused by surges in hot data access. The dynamic migration policy depends not only on the heat value but also on partition size; using a hill-climbing algorithm, the hungry-value-based strategy keeps the system load balanced.

Second, a reverse proxy cache policy is designed based on the predicted heat value, which mitigates the performance bottleneck caused by highly concurrent access to hot data. Hot data migrated into memory enables memory-level response and in-memory computation, effectively relieving the pressure that highly concurrent hot data access places on storage nodes.

Finally, the source code of the COSBench performance evaluation tool is heavily modified to simulate three hot data access patterns: equal-probability heat, periodic heat, and permanent heat. The performance of the proposed optimized storage system is then evaluated in these three scenarios.
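The abstract does not give the formula behind the multi-cycle heat statistics algorithm, so the following is only a minimal sketch of the general idea, assuming the heat value is a weighted sum of download frequencies measured over several look-back periods. The function name `heat_value`, the period lengths, and the weights are all hypothetical and are not taken from the thesis.

```python
from bisect import bisect_left, bisect_right

# Hypothetical look-back periods (seconds) and weights; the abstract
# does not specify either, so these are illustrative assumptions only.
PERIODS = (3600, 86400, 604800)   # one hour, one day, one week
WEIGHTS = (0.5, 0.3, 0.2)


def heat_value(download_times, now, periods=PERIODS, weights=WEIGHTS):
    """Return a weighted sum of download frequencies over several periods.

    download_times: download timestamps in seconds, sorted ascending.
    now: timestamp at which the heat value is evaluated.
    Each frequency is downloads per second within [now - period, now].
    """
    heat = 0.0
    for period, weight in zip(periods, weights):
        # Binary search for the slice of timestamps inside the window.
        start = bisect_left(download_times, now - period)
        end = bisect_right(download_times, now)
        heat += weight * (end - start) / period
    return heat
```

Because the longer periods contribute to the score, a short burst of downloads raises the heat value less sharply than a single short window would, which is the damping effect the abstract describes. How the thesis actually combines the periods may differ.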
Keywords/Search Tags: OpenStack Swift, hot data, dynamic migration, forecast, reverse proxy cache