
Research on a Distributed Index for the HBase Database Based on LSM-Tree

Posted on: 2018-08-20
Degree: Master
Type: Thesis
Country: China
Candidate: Y F Li
Full Text: PDF
GTID: 2348330515998098
Subject: Software engineering
Abstract/Summary:
With the rapid growth of big data, finding effective ways to store and manage it, and to provide real-time or near-real-time query and analysis, has become an important research topic. NoSQL databases emerged to meet the need for high-speed reads and writes on big data, and HBase is a typical example. HBase provides an efficient platform for storing and querying big data, but it natively supports efficient queries only on the primary key. In many cases, however, queries on non-primary-key columns must also return in real time or near real time, so improving HBase's non-primary-key query performance is a problem worth studying.

This thesis studies non-primary-key indexing methods for HBase in depth and designs a distributed index scheme based on the LSM-Tree. The scheme separates reads from writes and applies updates in batches: the index data for the baseline data is stored persistently, index updates are maintained in memory, and hot index data is cached in memory. A cache replacement strategy based on heat accumulation is also proposed; it achieves a higher cache hit rate than LRU and further improves HBase's non-primary-key query performance.

The thesis addresses three main problems: indexing massive data under the LSM-Tree architecture; designing index maintenance and query strategies for that architecture; and verifying the feasibility and correctness of the index on the HBase database. The results show that the time cost of the LSM-Tree-based distributed index is acceptable and that its space cost is about 50% of the user table, a worthwhile trade for a query performance improvement of tens to hundreds of times. Compared with non-primary-key queries on standard HBase, the proposed index structure greatly improves query efficiency and scales well.
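The read/write separation and batch update described in the abstract can be illustrated with a small sketch. This is not the thesis's implementation: the class name `LsmSecondaryIndex`, the `flush_threshold` parameter, and the use of plain Python dictionaries in place of persisted HBase index tables are all assumptions made for illustration. The sketch keeps a read-only baseline index, buffers updates in an in-memory delta, merges the delta into the baseline in one batch, and answers queries by combining both.

```python
class LsmSecondaryIndex:
    """Toy LSM-style secondary index mapping a column value to row keys.

    Hypothetical illustration only; `baseline` stands in for index data
    that a real system would persist, and `delta` for in-memory updates.
    """

    def __init__(self, flush_threshold=4):
        self.baseline = {}   # value -> set of row keys (treated as read-only)
        self.delta = {}      # (value, row_key) -> True for put, False for delete
        self.flush_threshold = flush_threshold

    def put(self, row_key, value):
        """Record that row_key now has this column value."""
        self.delta[(value, row_key)] = True
        self._maybe_flush()

    def delete(self, row_key, value):
        """Record that row_key no longer has this column value."""
        self.delta[(value, row_key)] = False
        self._maybe_flush()

    def _maybe_flush(self):
        if len(self.delta) >= self.flush_threshold:
            self.flush()

    def flush(self):
        """Batch update: merge the delta into a new baseline, then clear it."""
        merged = {value: set(rows) for value, rows in self.baseline.items()}
        for (value, row_key), is_put in self.delta.items():
            rows = merged.setdefault(value, set())
            if is_put:
                rows.add(row_key)
            else:
                rows.discard(row_key)
            if not rows:
                del merged[value]
        self.baseline = merged
        self.delta = {}

    def query(self, value):
        """Read path: baseline result overlaid with pending in-memory updates."""
        rows = set(self.baseline.get(value, ()))
        for (indexed_value, row_key), is_put in self.delta.items():
            if indexed_value == value:
                if is_put:
                    rows.add(row_key)
                else:
                    rows.discard(row_key)
        return sorted(rows)
```

Writes only touch the in-memory delta, so they stay cheap, while the baseline changes only during a batch merge. A query has to consult both structures, which is the usual cost of an LSM-style design. The hotspot cache and the heat-accumulation replacement strategy are not sketched here because the abstract does not describe their details.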
Keywords/Search Tags:LSM-Tree, HBase, Distributed, Index