
Design And Implementation Of Query And Load In OCNoSQL Project

Posted on: 2015-05-28  Degree: Master  Type: Thesis
Country: China  Candidate: J Yang  Full Text: PDF
GTID: 2298330434950534  Subject: Software engineering
Abstract/Summary:
OCNoSQL is a data processing system for big data storage and query. It was developed by AsiaInfo-Linkage and serves China Mobile, China Unicom, and China Telecom. Traditional file systems cannot store such massive volumes of data, and their query speed is too low; OCNoSQL was designed to solve these problems. It uses Hadoop, MapReduce, and HBase as its base platform and is optimized for specific business workloads to achieve high import and query performance. For example, the Beijing Mobile traffic query system generates about one hundred million traffic records per hour, accounting for about 100 GB of storage, and the total data volume is around 10 TB per month. With OCNoSQL, importing one hour of data takes 14 minutes on average, and a query against the monthly report returns in just hundreds of milliseconds.

This thesis mainly describes how OCNoSQL implements import and query. OCNoSQL's storage is based on HBase, and the system acts as middleware on top of HBase, so some HBase functions must be examined first. HBase offers two import methods, "put" and "bulkload". Put mainly targets small data volumes and writes data through a client connected directly to the RegionServer. Bulkload uses the MapReduce framework to write data in bulk. HBase queries are divided into "get" and "scan": get retrieves a specific row by its rowkey, while scan performs a range query between a startRowkey and a stopRowkey. Scan can also perform conditional queries by adding filters.

The OCNoSQL interface extends the original HBase interface and optimizes query and import in the following three ways. First, it adds a distributed cache to the query interface by integrating Redis.
Because Redis is effectively an in-memory database, which is much faster than disk I/O, OCNoSQL places the fake table data and page data in Redis and reads them from memory without having to interact with HBase. Second, it makes the rowkey configurable by optimizing the generation rule in the import interface; the rule must produce rowkeys that are unique and of a reasonable length, to save storage space. Third, it adds SQL support to OCNoSQL by integrating Phoenix, which gives HBase SQL support and a JDBC driver and makes the system more convenient for development and service engineers.
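
The query path described above (get by rowkey, scan between a start and stop rowkey with an optional filter, and a cache in front of the store) can be illustrated with a small sketch. This is not the OCNoSQL implementation and it does not call HBase or Redis: `SortedTable`, `CachedQuery`, and `make_rowkey` are hypothetical names, a plain dictionary stands in for Redis, and the phone-number-plus-timestamp rowkey layout is only an assumption chosen for illustration.

```python
import bisect


class SortedTable:
    """Toy stand-in for an HBase table: rows are kept in rowkey order."""

    def __init__(self):
        self._keys = []   # rowkeys, always sorted lexicographically
        self._rows = {}   # rowkey -> value

    def put(self, rowkey, value):
        # Insert the key in sorted position the first time it is seen.
        if rowkey not in self._rows:
            bisect.insort(self._keys, rowkey)
        self._rows[rowkey] = value

    def get(self, rowkey):
        # Point lookup of one row by its exact rowkey.
        return self._rows.get(rowkey)

    def scan(self, start_rowkey, stop_rowkey, row_filter=None):
        # Range query: start is inclusive, stop is exclusive.
        lo = bisect.bisect_left(self._keys, start_rowkey)
        hi = bisect.bisect_left(self._keys, stop_rowkey)
        for key in self._keys[lo:hi]:
            value = self._rows[key]
            if row_filter is None or row_filter(key, value):
                yield key, value


def make_rowkey(phone, timestamp):
    # Hypothetical rule: phone number, then a fixed-width timestamp, so that
    # one subscriber's records are adjacent and ordered by time.
    return f"{phone}|{timestamp}"


class CachedQuery:
    """Cache-aside wrapper: serve repeated range queries from memory."""

    def __init__(self, table):
        self.table = table
        self.cache = {}        # stands in for Redis in this sketch
        self.table_reads = 0   # counts how often the table is actually scanned

    def query_range(self, start_rowkey, stop_rowkey):
        cache_key = (start_rowkey, stop_rowkey)
        if cache_key in self.cache:
            return self.cache[cache_key]
        self.table_reads += 1
        result = list(self.table.scan(start_rowkey, stop_rowkey))
        self.cache[cache_key] = result
        return result


table = SortedTable()
table.put(make_rowkey("13800000001", "20150501100000"), 120)
table.put(make_rowkey("13800000001", "20150501110000"), 300)
table.put(make_rowkey("13800000002", "20150501100000"), 50)

query = CachedQuery(table)
first = query.query_range("13800000001|20150501", "13800000001|20150502")
second = query.query_range("13800000001|20150501", "13800000001|20150502")
```

In this sketch the second call is answered from the in-memory dictionary, so the table is scanned only once. A real deployment would also need cache expiry or invalidation when new data is imported, which is omitted here.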
Keywords/Search Tags:OCNoSQL, HBase, Hadoop, MapReduce, Query, Loading