Read Performance Optimization Of Cassandra Based On Hybrid RAM And SSD Cache System

Posted on: 2016-11-17    Degree: Master    Type: Thesis
Country: China    Candidate: X H Lv    Full Text: PDF
GTID: 2308330461490642    Subject: Computer technology
Abstract/Summary:
With the advent of the big data era, traditional methods find it increasingly difficult to meet the demands of storing, querying, and analyzing massive data. As a result, a new storage concept, the NoSQL database, has emerged. NoSQL databases are non-relational, distributed, open source, and horizontally scalable, which lets them satisfy the needs of big data storage and processing.

Thanks to its excellent performance and unique architecture, Cassandra, a representative NoSQL database, is widely used in major IT companies and handles big data storage and processing tasks very well. However, as a relatively new distributed database system, Cassandra can still be improved in various respects, especially performance.

In practice, we found that the read performance of Cassandra is very poor. Through analysis of the system architecture and validation by testing, we found that the bottleneck is disk I/O. A hit in Cassandra's RowCache, however, greatly reduces the number of I/O operations and yields a considerable improvement. Unfortunately, the cost of RAM increases dramatically beyond 64 GB per server. A Solid State Disk (SSD) offers larger capacity, lower cost, and lower power consumption than RAM, while delivering much better performance than a hard disk. We can therefore use SSD as an extension of RAM. Using SSD as the operating system swap area is a common way to do this, but that solution has many limitations.

We design and implement a hybrid RAM and SSD system that provides high-capacity, high-speed caching services for Cassandra in order to optimize its read performance. Using SSD as an extension of RAM, the hybrid system has five regions across two layers: the RAM layer is divided into three regions and the SSD layer into two. This specially designed architecture makes full use of the hybrid RAM and SSD storage and reduces SSD erase operations during reads and writes. The data exchange algorithm in each region can identify hot and cold data and retain hot data in memory, which has the higher I/O speed.

The system can be integrated into Cassandra seamlessly. In addition, we provide two user interfaces, one static and one dynamic, that are easy to use and configure. Compared with the swap area solution, our system provides much better performance: experimental data show that the new method achieves a 1.4x speedup over swap.
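The general idea of keeping hot data in RAM and demoting colder data to SSD can be illustrated with a minimal sketch. This is not the thesis's design: the abstract does not specify the five-region layout or the exchange algorithm, so the code below assumes a plain two-tier LRU policy, simulates the SSD tier with an in-memory dictionary, and uses hypothetical names (`TwoTierCache`, `ram_capacity`, `ssd_capacity`) throughout.

```python
from collections import OrderedDict


class TwoTierCache:
    """Toy two-tier LRU cache (hypothetical, not the thesis's algorithm).

    The 'ssd' tier is an in-memory OrderedDict standing in for storage
    on an SSD; a real implementation would write to a file or device.
    """

    def __init__(self, ram_capacity, ssd_capacity):
        self.ram = OrderedDict()  # hot entries, most recently used last
        self.ssd = OrderedDict()  # colder entries demoted from RAM
        self.ram_capacity = ram_capacity
        self.ssd_capacity = ssd_capacity

    def get(self, key):
        if key in self.ram:
            # RAM hit: refresh recency.
            self.ram.move_to_end(key)
            return self.ram[key]
        if key in self.ssd:
            # SSD hit: the entry is hot again, so promote it to RAM.
            value = self.ssd.pop(key)
            self._put_ram(key, value)
            return value
        # Miss in both tiers: the caller would fall back to disk.
        return None

    def put(self, key, value):
        # Drop any stale copy in the SSD tier before inserting into RAM.
        self.ssd.pop(key, None)
        self._put_ram(key, value)

    def _put_ram(self, key, value):
        self.ram[key] = value
        self.ram.move_to_end(key)
        # Demote least recently used entries when RAM is over capacity.
        while len(self.ram) > self.ram_capacity:
            cold_key, cold_value = self.ram.popitem(last=False)
            self._put_ssd(cold_key, cold_value)

    def _put_ssd(self, key, value):
        self.ssd[key] = value
        self.ssd.move_to_end(key)
        # Entries evicted from the SSD tier leave the cache entirely.
        while len(self.ssd) > self.ssd_capacity:
            self.ssd.popitem(last=False)
```

In this sketch a read that hits the SSD tier promotes the entry back to RAM, and each entry is written to the SSD tier only when it is demoted; a design aimed at reducing SSD erase operations, as the abstract describes, would likely batch or filter those writes further.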
Keywords/Search Tags: Hybrid Cache, Cassandra, Solid State Disk