
Research And Implementation Of Cassandra Database Index And Caching In Cloud Computing

Posted on: 2019-01-07
Degree: Master
Type: Thesis
Country: China
Candidate: X Lu
GTID: 2348330563453986
Subject: Computer application technology
Abstract/Summary:
As the business systems behind Internet applications grow more complex and the volume of user data increases, the storage and read/write pressure on data systems keeps rising. Non-relational databases can relieve this pressure to some extent. Cassandra, a commonly used non-relational database, offers good horizontal scalability and strong write performance, but in practice its read speed is unsatisfactory and its concurrent performance shows a bottleneck. Caching and indexing can effectively improve the read speed of a data system. This thesis therefore presents the design and implementation of a caching and indexing data system built on Cassandra, and optimizes the concurrent read and write performance of Cassandra while preserving its distributed character.

First, a read cache reduces direct access to the disk. Because memory prices have been high over the past two years, a solid-state disk (SSD) is used as a backup cache behind memory in order to increase cache capacity and improve the cache hit rate. A generational algorithm processes the cached data in memory and on the SSD separately, which not only extends the memory cache but also improves the fault tolerance of the caching system and makes the cache persistent. Second, an optimized Bloom filter serves as the indexing system and is combined with the LSM (log-structured merge) storage approach that Cassandra itself adopts, so that faster storage addressing improves the read performance of the whole system. In addition, a data buffer layer added to the LSM storage path ensures that data is flushed to disk in batches, which improves write performance to some extent. Finally, using the Node.js server-side runtime and object-relational mapping, both popular in recent years, the thesis implements a data server system that combines asynchronous I/O with multi-process concurrency. To improve maintainability, a data visualization method provides real-time feedback on the operating status of these modules, helping to keep each module running stably.

With these optimizations, the resulting data system can integrate well with existing cloud computing platforms. At the end of the thesis, the implemented Cassandra data system is deployed and tested against both its initial functional targets and its performance targets. The test results show that the read performance of the improved Cassandra data system is greatly improved, while its concurrent performance and write performance also increase slightly.
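
The read path described in the abstract (memory cache, then SSD cache, then an index check before touching the disk) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not taken from the thesis: the class names `BloomFilter`, `TwoTierCache` and `Store` are hypothetical, the SSD tier and the on-disk data are simulated with in-memory dictionaries, a simple LRU demotion stands in for the generational algorithm, and a plain Bloom filter stands in for the optimized variant the thesis uses.

```python
import hashlib
from collections import OrderedDict


class BloomFilter:
    """Plain Bloom filter: never a false negative, occasionally a false positive."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive several bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


class TwoTierCache:
    """Young generation in memory; entries evicted from it are demoted to an
    older generation that stands in for the SSD tier (a dict in this sketch)."""

    def __init__(self, memory_capacity=2, ssd_capacity=4):
        self.memory = OrderedDict()
        self.ssd = OrderedDict()
        self.memory_capacity = memory_capacity
        self.ssd_capacity = ssd_capacity

    def get(self, key):
        if key in self.memory:
            self.memory.move_to_end(key)   # mark as most recently used
            return self.memory[key]
        if key in self.ssd:
            value = self.ssd.pop(key)
            self.put(key, value)           # promote back into memory
            return value
        return None                        # miss (None values are not supported)

    def put(self, key, value):
        self.ssd.pop(key, None)            # drop any stale copy in the old tier
        self.memory[key] = value
        self.memory.move_to_end(key)
        if len(self.memory) > self.memory_capacity:
            old_key, old_value = self.memory.popitem(last=False)
            self.ssd[old_key] = old_value  # demote instead of discarding
            if len(self.ssd) > self.ssd_capacity:
                self.ssd.popitem(last=False)


class Store:
    """Read path: memory cache -> SSD cache -> Bloom filter -> disk."""

    def __init__(self, memory_capacity=2, ssd_capacity=4):
        self.disk = {}                     # stand-in for on-disk data files
        self.index = BloomFilter()
        self.cache = TwoTierCache(memory_capacity, ssd_capacity)
        self.disk_reads = 0

    def write(self, key, value):
        self.disk[key] = value
        self.index.add(key)
        self.cache.put(key, value)

    def read(self, key):
        cached = self.cache.get(key)
        if cached is not None:
            return cached
        if not self.index.might_contain(key):
            return None                    # definitely absent: skip the disk
        self.disk_reads += 1
        value = self.disk.get(key)
        if value is not None:
            self.cache.put(key, value)
        return value
```

In this sketch, a key that has fallen out of both cache tiers costs one simulated disk read and is then served from memory on the next access, while a key the filter reports as absent never reaches the disk at all. The `disk_reads` counter exists only to make that behaviour observable.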
Keywords/Search Tags:cloud computing, Cassandra, caching technology, indexing technology, Node.js