
Design and Implementation of a Distributed Storage System Based on the Pick-KX Algorithm

Posted on: 2015-01-03  Degree: Master  Type: Thesis
Country: China  Candidate: W Fang  Full Text: PDF
GTID: 2308330473953078  Subject: Computer application technology
Abstract/Summary:
With the development and spread of computer network technology, large portal and e-commerce sites such as Sina and Taobao have become increasingly common, and these sites store large amounts of data. Because of client browser limitations, a user cannot download all of the data resources from a single server, so even when the server has high bandwidth, the user's access speed suffers considerably. In addition, because the data is stored on physical hard disks, accessing it requires frequent I/O operations; as the number of concurrent users grows, I/O becomes the performance bottleneck of the whole system.

To address these problems, this thesis proposes a distributed storage solution based on the Pick-KX algorithm and builds an extensible distributed storage system. The system offers good cost-performance and fault tolerance, supports a large number of concurrent client accesses, and avoids the situation in which a client's request receives no response.

The work and contributions of this thesis are summarized as follows.

1. The working principle of HBase, its read and write processes, and its fault-tolerance mechanism are studied. After analyzing the overall architecture and working mechanism of HBase, the system uses HBase for distributed storage and backup of files.
2. The working principle of Redis, its data structures, its persistence, and its read/write separation mechanism are studied by reading and analyzing the Redis source code. The system uses Redis to optimize read operations: frequently accessed data is placed in Redis, and data that has not been accessed for a long time is removed through Redis's expiration mechanism, which improves memory utilization.
3. The working principle of ZooKeeper is studied. The system uses ZooKeeper to automate configuration and cluster management, easing the workload of programmers and maintainers. ZooKeeper also lets the master node learn in real time when worker nodes come online or go offline, so that the master node can adjust its task allocation strategy promptly and improve the reliability of the system.
4. The basic idea of the Pick-KX algorithm is analyzed in detail. The system uses the Pick-KX algorithm to balance load across processing nodes, avoiding the situation in which some nodes sit idle for long periods while others are overloaded. Experiments show that the algorithm effectively balances load across the processing nodes.
5. The text difference detection algorithm is analyzed in detail. When a user modifies a file, the system uses the algorithm to synchronize the differences to the server, which prevents modifications from being lost and preserves the integrity of the system. In addition, the system uses a Bloom filter to avoid uploading the same file repeatedly.
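The Bloom filter check mentioned in the last item can be illustrated with a minimal sketch. This is not the thesis's implementation: the class `BloomFilter`, the helper `should_upload`, the bit-array size, the number of hash functions, and the use of SHA-256 with double hashing are all assumptions chosen for illustration. The idea is that a file whose content is reported as "possibly seen" is skipped, while content that is definitely new is recorded and uploaded.

```python
import hashlib


class BloomFilter:
    """Hypothetical in-memory Bloom filter (sizes are illustrative)."""

    def __init__(self, num_bits=1 << 16, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, data: bytes):
        # Derive num_hashes bit positions from one SHA-256 digest
        # using double hashing: pos_i = h1 + i * h2 (mod num_bits).
        digest = hashlib.sha256(data).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force odd step
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, data: bytes) -> None:
        for pos in self._positions(data):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, data: bytes) -> bool:
        # False means "definitely not added"; True means "possibly added".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(data)
        )


def should_upload(bloom: BloomFilter, content: bytes) -> bool:
    """Return True if the content is definitely new, and record it."""
    if bloom.might_contain(content):
        # Possibly a duplicate. Because Bloom filters can give false
        # positives, a real system would confirm against the store.
        return False
    bloom.add(content)
    return True
```

A Bloom filter never reports a stored item as absent, but it can occasionally report a new item as present, so a production system would likely confirm a "possibly seen" answer against the actual store before discarding an upload.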
Keywords/Search Tags: Distributed Storage, HBase, Redis, Text Difference Detection