Research On Adaptive Load Balancing Strategy For MongoDB Based On Hotspot And Performance Difference Among Shardings

Posted on: 2018-03-02
Degree: Master
Type: Thesis
Country: China
Candidate: G Q Yan
GTID: 2348330518475037
Subject: Computer Science and Technology
Abstract/Summary:
In recent years, the Internet has generated enormous volumes of data as the number of networked devices has grown exponentially. Traditional centralized storage can no longer satisfy the growing demand for reliability and performance, so managing big-data storage has become a pressing challenge. Many distributed databases have emerged in response, and NoSQL systems are among the most widely used because of their performance. MongoDB, one of the newer NoSQL databases, has attracted a great deal of international attention for its horizontal scaling, document-oriented data model, schema-free design, and automatic failover. Its built-in automatic load balancing strategy ensures that stored data is distributed evenly across all nodes, but it does not take into account performance differences among nodes or data access hotspots. As a result, even when chunks are evenly distributed, load imbalance caused by node performance differences and hotspots still appears in real deployments. To solve this problem, this thesis improves on existing solutions by optimizing the MongoDB load balancing strategy. Its contributions are as follows.

(1) To address the deficiencies of the existing load balancing strategy, a new architecture for MongoDB is proposed. It consists of an access control module, which controls client access to the system; a resource management module, which makes a preliminary load forecast for each shard and adjusts the forecasting cycle accordingly; a load forecast module, which estimates the load of each shard and chunk; and a data transfer module, which controls data migration.

(2) A Markov model is introduced in the load forecast module and its feasibility is analyzed. By building a Markov chain for the data operations on each shard, the transition probability matrix and the steady-state vector of the operations are obtained. Combining the data volume on each shard, the performance differences among servers, the weights of the data operations, and the data access hotspots yields a quantitative load prediction for each shard and chunk. Based on the predicted load, the source server, the target server, and the chunks to be migrated are chosen. To rank the load of each shard and chunk, an ordered binary tree is built before this selection. A data migration strategy is then developed to decide when to start migrating the chosen chunks and when load balance has been reached.

(3) Finally, the load factors of the various operations are determined experimentally, and a series of simulation experiments compares the load balancing algorithm proposed in this thesis with the one built into MongoDB. The results verify that the proposed algorithm not only balances the load among servers well but also reduces system response time.
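The forecasting step described in (2) can be illustrated with a minimal Python sketch. The abstract gives neither the formulas nor the measured values, so everything concrete here is an assumption: the three operation types, the transition matrix `P`, the operation weights `WEIGHTS`, the load formula in `predicted_load`, and the shard figures are all hypothetical. The sketch also ranks shards with a plain maximum and minimum instead of the ordered binary tree used in the thesis.

```python
def steady_state(P, tol=1e-12, max_iter=10_000):
    """Approximate the steady-state vector pi (pi = pi * P) of a
    row-stochastic matrix P by power iteration."""
    n = len(P)
    pi = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(max_iter):
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi


# Hypothetical transition probabilities between operation types
# (read, insert, update) observed on one shard; each row sums to 1.
P = [
    [0.7, 0.2, 0.1],
    [0.5, 0.3, 0.2],
    [0.6, 0.1, 0.3],
]

# Hypothetical relative cost of one read, insert, and update.
WEIGHTS = [1.0, 2.0, 3.0]


def predicted_load(pi, weights, request_rate, capacity):
    """Assumed load score: expected cost per operation times the request
    rate, divided by a relative server capacity so that a faster server
    scores lower for the same traffic."""
    cost_per_op = sum(p * w for p, w in zip(pi, weights))
    return request_rate * cost_per_op / capacity


def pick_migration(loads):
    """Return (source, target): the most and least loaded shard names."""
    source = max(loads, key=loads.get)
    target = min(loads, key=loads.get)
    return source, target


if __name__ == "__main__":
    pi = steady_state(P)
    # Hypothetical shards: name -> (requests per second, relative capacity).
    shards = {"shard-a": (900, 1.0), "shard-b": (400, 1.0), "shard-c": (900, 2.0)}
    loads = {
        name: predicted_load(pi, WEIGHTS, rate, cap)
        for name, (rate, cap) in shards.items()
    }
    print(pick_migration(loads))
```

In this sketch, performance differences enter only through the `capacity` divisor and hotspots only through the per-shard request rate. The thesis combines these factors with experimentally determined weights, which the abstract does not state.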
Keywords: Load Balancing, Sharding, Load Prediction, Data Migration