Research And Application Of The Storage Of HBase

Posted on:2015-12-16

Degree:Master

Type:Thesis

Country:China

Candidate:X P Feng

Full Text:PDF

GTID:2298330467963847

Subject:Computer Science and Technology

Abstract/Summary:


The Internet has grown tremendously, and vast amounts of data are now uploaded to it every day. As data volumes increase, many business scenarios have begun to consider scaling data storage horizontally, so that storage nodes can be added and removed as needed. Traditional relational databases, however, are designed mainly around a single machine. Storing massive amounts of data becomes a bottleneck, because a single machine cannot hold that much data. HBase is a top-level Apache open source project that grew out of Hadoop. Because it provides most of the features of Google’s BigTable system and is implemented in Java, it has become very popular in the era of big data. Unlike traditional relational databases, HBase uses a column-oriented storage mechanism. It is a NoSQL database with significant advantages in storing and querying massive data. Studying the storage mechanism and query features of HBase therefore has great practical significance.

Starting from the storage mechanism of HBase, this thesis studies its data loading and query characteristics in depth. The main work is as follows:

First, this thesis studies the storage mechanism of HBase. It analyzes the mechanism in detail and, by examining the data as it is actually stored, analyzes its storage characteristics. HBase completely changes the data storage format: it uses more storage space in exchange for fast queries.

Second, this thesis studies how data is loaded into HBase. Before HBase can be used, massive amounts of data must first be loaded into it. HBase offers several loading methods, each with its own characteristics, and provides an interface that lets users implement custom loading. Using the MapReduce parallel computing framework, this thesis implements a parallel custom loading method that achieves good efficiency.

Third, this thesis studies and analyzes the query efficiency of HBase. HBase excels at fast random queries over massive data, but because it does not support SQL queries, it struggles to meet the demands of complex business processes. It does, however, provide an integration interface with Hive, which allows data to be stored in HBase and queried through Hive. This thesis analyzes the query features of HBase and implements the integration of HBase and Hive.

Finally, this thesis studies and analyzes the integration of HBase and MapReduce. Compared with other non-relational databases, HBase’s biggest advantage is its natural integration with Hadoop, one of the most popular cloud computing technologies. Using the interfaces of HBase, this thesis implements the integration of HBase and MapReduce, which allows algorithms to read and write HBase data directly.
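The storage trade-off described in the abstract (more space in exchange for fast queries) can be illustrated with a small sketch. It is not the thesis's implementation and does not use the HBase API: it is a simplified pure-Python model that assumes each cell is stored with its own copy of the row key, column family, qualifier, and an 8-byte timestamp, and that cells are kept sorted by key. The sample row, the function names, and the byte-counting rule are all hypothetical and chosen only for illustration.

```python
from bisect import bisect_left

# Hypothetical sample row; names and values are illustrative only.
ROW_KEY = "user0001"
FAMILY = "info"
RECORD = {"name": "Alice", "city": "Beijing", "age": "30"}


def to_cells(row_key, family, record, timestamp):
    """Flatten one logical row into sorted (key, value) cells.

    Every cell repeats the row key, column family, and qualifier, which is
    the (simplified) source of the extra storage space.
    """
    cells = [((row_key, family, qualifier, timestamp), value)
             for qualifier, value in record.items()]
    return sorted(cells)


def cell_bytes(cells):
    """Approximate size: key strings + 8-byte timestamp + value, per cell."""
    total = 0
    for (row, family, qualifier, _ts), value in cells:
        total += len(row) + len(family) + len(qualifier) + 8 + len(value)
    return total


def row_bytes(row_key, record):
    """Size of the same data if the row key were stored once with the values."""
    return len(row_key) + sum(len(v) for v in record.values())


def get(cells, row_key, family, qualifier):
    """Point lookup by binary search over the sorted cell keys."""
    keys = [key[:3] for key, _ in cells]
    target = (row_key, family, qualifier)
    i = bisect_left(keys, target)
    if i < len(keys) and keys[i] == target:
        return cells[i][1]
    return None


cells = to_cells(ROW_KEY, FAMILY, RECORD, timestamp=1)
print(cell_bytes(cells), "bytes as cells vs", row_bytes(ROW_KEY, RECORD), "bytes as one row")
print(get(cells, ROW_KEY, FAMILY, "city"))
```

In this toy model the cell layout is several times larger than the single-row layout, while the sorted keys allow a lookup by binary search instead of a full scan. Real HBase storage adds further per-cell fields and block-level indexing that the sketch leaves out.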

Keywords/Search Tags:

HBase, Hadoop, database, storage, loading data, query


Related items

1	Implementation Of The Massive Telecom Data Distributed Storage And Query System Based On Hadoop
2	Research Of Big Data Store Query Technology Based On HBase
3	The Describing Of Sensing Device Platform Based On Hadoop Distributed Data Storage
4	The Research And Implementation Of Indexing And Query Techniques Based On HBase And In-memory Database
5	Application And Research On Data Storage Of Rail Transit Maintenance Support System Based On Hadoop
6	Design And Implementation Of Query And Load In OCNoSQL Project
7	Research On RDF Data Storage And Query Based On HBase
8	A Research Of Distributed Storage And Parallel Query Of Spatial Data Based On Hadoop Platform
9	Research On HBase Data Storage Based On Hadoop Platform In Express Industry
10	Research And Application Of Big Data Migration And Query Based On Hadoop Platform