The Analysis And Optimization Of Load Balancing Algorithm For Big Data Platform Based On HBase

Posted on:2019-11-11

Degree:Master

Type:Thesis

Country:China

Candidate:F Shao

Full Text:PDF

GTID:2428330593950511

Subject:Computer technology

Abstract/Summary:


The storage of big data has become a key problem in the rapid development of the Internet: large volumes of data need to be stored and analyzed, and relational databases can no longer keep up with this rapid growth. In this scenario, non-relational databases have been widely adopted. They show a great advantage in handling massive data storage and highly concurrent access, and research related to non-relational databases has become an active field. As a non-relational database, HBase is used as a back-end data store because of its scalability and high availability. As a distributed, open-source storage system, HBase can build a highly reliable storage cluster from inexpensive computers. HBase is a column-oriented database system that uses Hadoop's HDFS as its underlying file storage, and it is an important member of the Hadoop ecosystem.

As living standards rise, more attention should be paid to the health indicators of the body. It is of great significance to build a comprehensive health big data platform to monitor various diseases. The health big data platform studied here uses HBase as its back-end database, so HBase load balancing is crucial to its overall performance.

This thesis first analyzes the original HBase load balancing algorithm, whose basic idea is to keep the number of Regions on each Region Server the same. In a real application scenario, however, data is not accessed at a uniform frequency, and some data is accessed often enough to become hot data. Because Regions are not accessed equally, the load can become unbalanced and request response times suffer: some Regions become hot spots, and the Region Servers hosting them become overloaded. For load balancing in a distributed database, it is therefore very important to take the hotness of the data into account. To this end, a load balancing algorithm is designed that uses a prediction method based on each Region Server's historical request counts, and the predicted data hotness is used as the load of the Region Server. At the same time, a cluster Cost scoring function is constructed that takes five factors into consideration: read request score, write request score, memstore size score, StoreFile size score, and locality score.

During the construction of the experimental platform, a data table model was extracted from the "basic data set of urban and rural residents' health records" approved by the Ministry of Health of the People's Republic of China. The Row Key of the table was designed, and pre-partitioning was used to improve system performance. The experiments used HBase version hbase-1.1.12 and Hadoop version hadoop-2.5.1. The experiments verify the optimization described above and show that the optimized solution can improve the performance of the HBase health big data platform.
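The abstract does not state which prediction method or which scoring formula the thesis uses, so the following Python sketch is only an illustration of the general idea, not the author's algorithm. It assumes exponential smoothing as the predictor, a normalized mean-absolute-deviation measure of imbalance, and equal weights for the five factors. All names (`predict_next_requests`, `ServerMetrics`, `imbalance`, `cluster_cost`, `DEFAULT_WEIGHTS`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Sequence


def predict_next_requests(history: Sequence[float], alpha: float = 0.5) -> float:
    """Forecast the next request count with exponential smoothing.

    Assumption: the thesis's actual prediction method is not given in the
    abstract; exponential smoothing is used here purely for illustration.
    """
    if not history:
        return 0.0
    forecast = float(history[0])
    for observed in history[1:]:
        forecast = alpha * observed + (1 - alpha) * forecast
    return forecast


@dataclass
class ServerMetrics:
    """Per-Region-Server inputs to the cost function (hypothetical layout)."""
    read_requests: float   # predicted read requests (the "hotness" load)
    write_requests: float  # predicted write requests
    memstore_mb: float     # total memstore size
    storefile_mb: float    # total StoreFile size
    locality: float        # fraction of data stored locally, 0.0 to 1.0


def imbalance(values: Sequence[float]) -> float:
    """Return 0.0 for an even spread and 1.0 when one server holds everything."""
    n = len(values)
    total = float(sum(values))
    if n < 2 or total == 0.0:
        return 0.0
    mean = total / n
    deviation = sum(abs(v - mean) for v in values)
    worst_case = 2.0 * total * (n - 1) / n  # deviation when one server has it all
    return deviation / worst_case


# Assumed equal weights; the abstract gives no weights for the five factors.
DEFAULT_WEIGHTS: Dict[str, float] = {
    "read": 1.0, "write": 1.0, "memstore": 1.0, "storefile": 1.0, "locality": 1.0,
}


def cluster_cost(servers: Sequence[ServerMetrics],
                 weights: Dict[str, float] = DEFAULT_WEIGHTS) -> float:
    """Weighted average of five scores in [0, 1]; lower means better balanced."""
    if not servers:
        return 0.0
    scores = {
        "read": imbalance([s.read_requests for s in servers]),
        "write": imbalance([s.write_requests for s in servers]),
        "memstore": imbalance([s.memstore_mb for s in servers]),
        "storefile": imbalance([s.storefile_mb for s in servers]),
        # Low average locality is penalized, so the score is 1 - mean locality.
        "locality": 1.0 - sum(s.locality for s in servers) / len(servers),
    }
    total_weight = sum(weights.values())
    return sum(weights[k] * scores[k] for k in scores) / total_weight
```

In this sketch, a balancer would feed each Region Server's request history through `predict_next_requests`, build one `ServerMetrics` per server, and prefer the Region assignment with the lowest `cluster_cost`. Under the assumed equal weights, a cluster where one server carries all requests and data scores far higher than an evenly loaded one even if both servers hold the same number of Regions, which is the case the count-based policy cannot distinguish.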

Keywords/Search Tags:

HBase, Big Data, Load Balance, Hot Data, Prediction


Related items

1	Research On The HBase Region Split And Load Balance Strategy
2	Research On Load Balance In Data Center And Traffic Prediction Based On Deep Learning
3	Research And Optimization On Load Balance Of Data Deduplication
4	Research On Technology Of HBase-based Data Balanced Write And Efficient Read
5	Research And Implementation Of Massive Sensor Information Storage Extension Based On Cloud Computing
6	The Research And Application Of Linux Cluster System's Load Balance Mechanism For Data Retrieval
7	Research And Implementation On Data Placement And Load Balance Strategy For Multicloud Storage System
8	Task Scheduling Optimization Based On Time And Load Balance Under The Hadoop Platform
9	Research On Data Processing Technology Based On HBase
10	Research On Data Compression Technology Based On HBase