
Research On Efficient Distributed Cache Method Based On Consistent Hashing

Posted on: 2023-06-30
Degree: Master
Type: Thesis
Country: China
Candidate: Y N Li
Full Text: PDF
GTID: 2558306845491014
Subject: Computer technology
Abstract/Summary:
As big data and cloud computing technologies mature, users increasingly demand high-performance systems with fast response times. Caching is essential to the smooth operation of computer systems, but with the exponential growth of hot data on cloud platforms, a stand-alone caching system can hardly bear the huge traffic on the Internet. To improve the availability, reliability, and scalability of cache systems, researchers have paid growing attention to distributed caching. Memcache is a representative distributed cache, and many websites use it to improve access speed and user experience. This thesis studies an efficient distributed cache method based on the Consistent Hashing algorithm. The main work is as follows:

(1) To address the insufficient performance of the Consistent Hashing algorithm as used in distributed caches, a new cache data partitioning method based on a multi-level, equally divided hash ring is proposed. The method reads and writes cached data using primary and secondary partitions together with single-hash and multi-hash models. It caches the full meta information on the client to improve the availability of that meta information, and it supports multi-point writes of specific data to improve the concurrent write performance of the cache. Comparative experiments are then carried out between the proposed method and Memcache based on Consistent Hashing. The results show that, in the same experimental environment, the read and write performance of the proposed method for random data is 6.2% to 8.5% higher than that of Memcache, and its concurrent write performance for specific data is 61.7% higher. The rebalancing delay after a change in cluster size is also reduced by 9.83% to 12.1%. Finally, the proposed method is applied to a real engineering project. System tests show that the throughput of concurrent queries for hot data with the improved method is 3.2 times higher than that of the original system.

(2) To address the unstable performance and insufficient scalability of the centralized cache mechanism used by HDFS, an HDFS distributed cache mechanism based on Consistent Hashing is proposed, and a new type of cache node, named CacheNode, is designed to join the HDFS cluster. CacheNode adopts the distributed cache method proposed above. By deploying cache nodes independently, the mechanism isolates the memory consumed by node storage, caching, and computation, which resolves conflicts over memory resources. By providing a globally unified file query method, it makes query performance more stable across different nodes. CacheNode also supports flexible scaling, so file read performance can be improved through horizontal expansion. Finally, the effectiveness of the mechanism is verified by experiments. The results show that the latency of reading large files is lower and that read performance across different nodes is more stable in most cases. The average read latency from different nodes is reduced by 20.1%, and the variance is only 0.33.
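For readers unfamiliar with the baseline the abstract compares against, the following is a minimal sketch of a classic consistent hash ring with virtual nodes. It is not the thesis's multi-level equally divided hash ring, whose details the abstract does not give. The class name `ConsistentHashRing`, the `replicas` parameter, the node names, and the choice of MD5 are all assumptions made for illustration.

```python
import bisect
import hashlib


def ring_hash(key: str) -> int:
    """Map a string to a position on a 2**32 ring (MD5 is an arbitrary choice)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


class ConsistentHashRing:
    """Hypothetical baseline ring; each node owns `replicas` virtual points."""

    def __init__(self, nodes=(), replicas=100):
        self.replicas = replicas
        self._points = []  # sorted ring positions
        self._owner = {}   # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        for i in range(self.replicas):
            point = ring_hash(f"{node}#{i}")
            if point in self._owner:
                continue  # rare collision: keep the existing owner
            bisect.insort(self._points, point)
            self._owner[point] = node

    def remove_node(self, node: str) -> None:
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def get_node(self, key: str) -> str:
        if not self._points:
            raise LookupError("ring is empty")
        # First virtual point at or after the key's position, wrapping around.
        idx = bisect.bisect_left(self._points, ring_hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


if __name__ == "__main__":
    ring = ConsistentHashRing(["cache-a", "cache-b", "cache-c"])
    keys = [f"key-{i}" for i in range(1000)]
    before = {k: ring.get_node(k) for k in keys}
    ring.add_node("cache-d")
    moved = sum(ring.get_node(k) != before[k] for k in keys)
    print(f"{moved} of {len(keys)} keys moved after adding one node")
```

The property worth noting is that adding a node only moves keys onto the new node; every other key keeps its previous owner. That limits the data that must migrate when the cluster size changes, which is the cost the rebalancing delay figures in the abstract relate to.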
Keywords/Search Tags: Consistent Hashing, Data Partition, Distributed Cache, NoSQL, HDFS Caching Mechanism