The Study Of Multi-dimensional Index And Maintenance Method Based On HBase

Posted on: 2021-03-13
Degree: Master
Type: Thesis
Country: China
Candidate: G P Zhou
GTID: 2428330614963627
Subject: Computer system architecture
Abstract/Summary:
With the rapid development of computer technology and the popularity of Internet applications, the scale of data continues to expand, and we have entered the era of big data. Although traditional relational database management systems (RDBMS) provide mature data storage and processing solutions, their ability to analyze and process growing data volumes has run into serious scalability bottlenecks, and NoSQL databases emerged in response. NoSQL databases use flexible data models to store big data and scale out easily, so distributed storage systems such as HBase play an important role in data services. To process and analyze big data more effectively, many distributed storage systems use one-dimensional secondary indexes to access data. However, simply aggregating multiple secondary indexes cannot effectively support multi-dimensional range queries. This paper therefore proposes an adaptive multi-dimensional indexing strategy based on HBase that performs multi-dimensional range queries effectively. The strategy combines a B+-tree and a hash table. First, a B+-tree secondary index is built on the queried attribute. Then the row key sets returned by the index queries are mapped to a hash table to obtain the final matching result. The experimental results show that the indexing strategy achieves lower response times: for eight-dimensional queries, the response time of the hybrid index is about 15.75% lower than that of the iterative index and about 39.82% lower than that of the MD-HBase method.

Updating indexes in HBase forces a trade-off between consistency and performance. Existing update schemes concentrate on improving performance and neglect the consistency between the index structure and the data table. This paper therefore designs a hot and cold data partitioning strategy based on LRFU and proposes an adaptive multi-dimensional index maintenance strategy for hot and cold data based on HBase. Cold data and its indexes are updated asynchronously with a query verification mechanism, while hot data and its indexes are updated synchronously, which yields an adaptive balance between consistency and performance. The experimental results show that the Adaptive-maintenance strategy provides stronger consistency than the Async-simple strategy, while the performance of the two is close.
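The hybrid query path described in the abstract can be illustrated with a small in-memory sketch. It is not the thesis implementation: a sorted list searched with `bisect` stands in for each B+-tree secondary index, a plain dictionary stands in for the hash table, and the class and function names (`SortedIndex`, `multi_dim_range_query`) are hypothetical. The sketch also assumes that the hash table counts how many per-attribute range queries returned each row key, which is one plausible reading of "mapped to a hash table".

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict


class SortedIndex:
    """Stand-in for a B+-tree secondary index on a single attribute."""

    def __init__(self, rows, attr):
        # Keep (value, row_key) pairs ordered by attribute value.
        self.entries = sorted((row[attr], key) for key, row in rows.items())
        self.values = [value for value, _ in self.entries]

    def range_query(self, low, high):
        """Return the row keys whose attribute value lies in [low, high]."""
        start = bisect_left(self.values, low)
        end = bisect_right(self.values, high)
        return [key for _, key in self.entries[start:end]]


def multi_dim_range_query(indexes, ranges):
    """Intersect per-attribute range results through a hash table.

    Assumption: a row key matches when every queried dimension returned it,
    so the hash table stores a hit count per row key.
    """
    hits = defaultdict(int)
    for attr, (low, high) in ranges.items():
        for key in indexes[attr].range_query(low, high):
            hits[key] += 1
    return sorted(key for key, count in hits.items() if count == len(ranges))


# Toy data table: row key -> attribute values (illustrative only).
rows = {
    "r1": {"age": 25, "score": 80},
    "r2": {"age": 35, "score": 90},
    "r3": {"age": 30, "score": 60},
    "r4": {"age": 45, "score": 85},
}
indexes = {attr: SortedIndex(rows, attr) for attr in ("age", "score")}
result = multi_dim_range_query(indexes, {"age": (20, 40), "score": (70, 100)})
print(result)  # ['r1', 'r2']
```

Each dimension is answered by one ordered index scan, and the hash table turns the intersection of the row key sets into a single counting pass over the returned keys, so no pairwise set comparison between dimensions is needed.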
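The hot and cold maintenance idea can be sketched in the same spirit. LRFU (Least Recently/Frequently Used) scores each item with a combined recency and frequency value that decays over time; the sketch below uses the standard weighting function F(x) = (1/2)^(λx). Everything else is an assumption made for illustration: the threshold that separates hot from cold rows, the parameter values, the class names (`LRFUTracker`, `AdaptiveIndexMaintainer`), and the dictionaries that stand in for the HBase data table and index table.

```python
from collections import deque


class LRFUTracker:
    """Tracks an LRFU score (combined recency and frequency) per row key."""

    def __init__(self, lam=0.5, hot_threshold=1.5):
        self.lam = lam                      # decay rate (assumed value)
        self.hot_threshold = hot_threshold  # hot/cold cutoff (assumed value)
        self.state = {}                     # row_key -> (score, last_access_time)

    def _decay(self, dt):
        return 0.5 ** (self.lam * dt)

    def access(self, key, now):
        score, last = self.state.get(key, (0.0, now))
        # New score = 1 for this access + decayed contribution of older ones.
        self.state[key] = (1.0 + score * self._decay(now - last), now)

    def score(self, key, now):
        if key not in self.state:
            return 0.0
        score, last = self.state[key]
        return score * self._decay(now - last)

    def is_hot(self, key, now):
        return self.score(key, now) >= self.hot_threshold


class AdaptiveIndexMaintainer:
    """Synchronous index updates for hot rows, deferred updates for cold rows."""

    def __init__(self, tracker):
        self.tracker = tracker
        self.table = {}         # data table stand-in: row_key -> value
        self.index = {}         # index stand-in: row_key -> indexed value
        self.pending = deque()  # cold row keys whose index entry is stale

    def put(self, key, value, now):
        self.tracker.access(key, now)
        self.table[key] = value
        if self.tracker.is_hot(key, now):
            self.index[key] = value    # hot: update the index immediately
        else:
            self.pending.append(key)   # cold: defer the index update

    def flush(self):
        """Apply deferred updates, always using the latest table value."""
        while self.pending:
            key = self.pending.popleft()
            self.index[key] = self.table[key]

    def query(self, value):
        """Query verification: drop index hits that the data table contradicts."""
        return sorted(k for k, v in self.index.items()
                      if v == value and self.table.get(k) == value)


tracker = LRFUTracker(lam=0.5, hot_threshold=1.5)
maintainer = AdaptiveIndexMaintainer(tracker)
maintainer.put("r1", "red", now=0)   # first access: score 1.0, cold, deferred
maintainer.put("r1", "blue", now=1)  # score about 1.71, hot, indexed at once
maintainer.put("r2", "blue", now=1)  # score 1.0, cold, deferred
before_flush = maintainer.query("blue")
maintainer.flush()
after_flush = maintainer.query("blue")
print(before_flush, after_flush)  # ['r1'] ['r1', 'r2']
```

In this sketch, verification removes stale index hits but cannot surface rows whose index entries have not been written yet, which is why the cold row `r2` only appears after `flush()`. How the thesis handles that window is not stated in the abstract.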
Keywords/Search Tags: HBase, multi-dimensional range query, hybrid index, consistency, hot and cold data