Research on Multidimensional Data Query Based on the Hilbert Space-Filling Curve

Posted on: 2017-01-22 | Degree: Master | Type: Thesis
Country: China | Candidate: J W He | Full Text: PDF
GTID: 2278330482497684 | Subject: Computer technology
Abstract/Summary:
Recent years have witnessed explosive growth in data volume, ushering in the big data era. Cloud-based data management systems are widely used to manage big data because of their large storage capacity and elastic computing power. A typical example is the HBase/Hadoop system. HBase/Hadoop is a key/value-based data management system in which all data access must go through the key (or rowkey). A query on non-key fields, e.g., a multi-field range query, therefore triggers a full scan of the data set, which incurs large I/O overhead and degrades query performance.

In this paper, a novel rowkey design method based on the space-filling curve technique is proposed to facilitate multi-field range queries in the HBase/Hadoop system. The rowkey is not only a unique data identifier but also encodes the data field information. With the new design, data is partitioned by content, so that records with similar field values are stored close together. The rowkey thus preserves the good clustering property of the multi-dimensional data space, which greatly reduces the scan scope for range queries. A prototype system with the proposed rowkey design has been implemented on HBase/Hadoop, and the experimental results show that it supports efficient multi-field range queries on massive data sets.
Keywords/Search Tags:Space Filling Curve, Big Data, Multi-Field Query
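
The abstract does not spell out the rowkey layout, so the following is only a minimal sketch of the general idea, under stated assumptions: two fields already quantized to integer grid coordinates, an n-by-n grid with n a power of two, and a hypothetical rowkey made of an 8-byte big-endian Hilbert index followed by a record id. The names `hilbert_index`, `make_rowkey`, and `scan_ranges` are illustrative and do not come from the thesis. The index computation is the standard iterative 2-D Hilbert mapping; the range decomposition is a brute-force enumeration meant to show the effect, not an efficient algorithm.

```python
from typing import List, Tuple


def hilbert_index(n: int, x: int, y: int) -> int:
    """Map cell (x, y) of an n-by-n grid (n a power of two) to its
    position along the 2-D Hilbert curve (standard iterative mapping)."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the next level is in canonical orientation.
        if ry == 0:
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s //= 2
    return d


def make_rowkey(n: int, x: int, y: int, record_id: str) -> bytes:
    """Hypothetical rowkey layout (an assumption, not the thesis's format):
    8-byte big-endian Hilbert index, then the record id for uniqueness.
    Big-endian fixed width keeps byte order equal to numeric order."""
    return hilbert_index(n, x, y).to_bytes(8, "big") + record_id.encode("utf-8")


def scan_ranges(n: int, x_lo: int, x_hi: int,
                y_lo: int, y_hi: int) -> List[Tuple[int, int]]:
    """Brute-force illustration: list the Hilbert indices of every cell in
    the inclusive query rectangle and merge consecutive ones into ranges.
    Each (start, end) pair would correspond to one contiguous key scan."""
    indices = sorted(
        hilbert_index(n, x, y)
        for x in range(x_lo, x_hi + 1)
        for y in range(y_lo, y_hi + 1)
    )
    ranges: List[Tuple[int, int]] = []
    for d in indices:
        if ranges and d == ranges[-1][1] + 1:
            ranges[-1] = (ranges[-1][0], d)  # extend the current range
        else:
            ranges.append((d, d))            # start a new range
    return ranges


if __name__ == "__main__":
    # Query window x in [1, 2], y in [1, 2] on a 4-by-4 grid.
    print(scan_ranges(4, 1, 2, 1, 2))
```

In this sketch, a two-field range query becomes a handful of contiguous index ranges instead of a scan over all 16 cells of the grid; for the window above, the four matching cells fall into three ranges. A real system would need a quantization step for field values and a smarter range decomposition than cell enumeration, neither of which is described in the abstract.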