Cloud Storage Technology For Ocean Dynamic Environment Data Based On HBase

Posted on: 2020-10-21
Degree: Master
Type: Thesis
Country: China
Candidate: W Xia
Full Text: PDF
GTID: 2480306305494564
Subject: Surveying and Mapping project
Abstract/Summary:
With the continuous development of ocean monitoring and forecasting technology, ocean data are accumulating rapidly. The data are characterized by diverse types, rich dimensions, and explosive growth in volume, which has pushed ocean research into the era of big data. At the same time, massive ocean data pose great challenges for storage management, mining analysis, and statistical application. In the current technological environment, the non-relational distributed database HBase and the in-memory parallel computing framework Spark have emerged to meet these needs. Owing to their strong performance in massive data processing, they have been widely used in transportation, finance, e-commerce, and social networks. This thesis introduces HBase and Spark cloud storage technology to store and manage ocean data in NetCDF format. Taking 45 years of ocean dynamic environment data for the China Sea, from 1972 to 2016, as an example, an ocean dynamic environment data storage and query system based on HBase is designed, which realizes distributed storage and efficient analysis of the data.

Firstly, a data storage scheme for ocean dynamic environment data based on HBase is proposed. To improve the storage efficiency and access speed of HBase, the scheme is designed around ocean business requirements and includes dimensionality reduction of the raw data, load balancing of HBase, and construction of HBase tables with specific row keys and data columns. Several compression modes currently supported by HBase are analyzed and compared, and Snappy is selected through experiments as the optimal mode for managing the data. The server cache configuration is also designed and optimized to make full use of memory resources and improve data reading speed.

Secondly, a data processing scheme and query system based on cloud storage are designed and implemented. By combining HBase with Spark, the tables in the HBase database are transformed into resilient distributed datasets (RDDs) and cached in Spark cluster memory for parallel computation, which further improves the efficiency of data query and analysis. Based on the Lift framework, the system's website is designed to communicate with the Hadoop cloud platform.

Finally, the efficiency of the cloud storage scheme in the analysis and application of ocean dynamic environment data is verified. In this thesis, the ocean dynamic environment data at characteristic points of the East China Sea are analyzed with scatter charts, and extreme-value samples of historical wave heights are extracted to calculate the design wave heights and period expectations for different return periods. By comparing the query efficiency of centralized file storage and HBase cloud storage in specific applications, the feasibility and efficiency of the cloud storage scheme for ocean dynamic environment data management are verified.
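
The abstract does not give the actual row-key layout, so the following Python sketch only illustrates one common way that the dimensionality reduction, load balancing, and row-key construction described in the storage scheme could fit together. Everything in it is an assumption for illustration rather than the thesis's design: the key layout `salt|lat+lon|time`, the bucket count `NUM_SALT_BUCKETS`, and the helper names `make_row_key` and `flatten_grid`.

```python
import hashlib

# Assumed number of salt buckets (e.g. one per pre-split region); hypothetical value.
NUM_SALT_BUCKETS = 8


def make_row_key(lat, lon, timestamp, buckets=NUM_SALT_BUCKETS):
    """Build a salted composite row key of the form salt|latlon|time.

    lat/lon are in degrees; timestamp is a fixed-width string such as
    'YYYYMMDDHH'. The layout is an illustrative assumption.
    """
    # Shift to non-negative values and zero-pad so that lexicographic
    # order of the key matches numeric order of the coordinates.
    lat_part = f"{round((lat + 90.0) * 100):05d}"
    lon_part = f"{round((lon + 180.0) * 100):05d}"
    point = lat_part + lon_part
    # Salt is derived from the grid point only, so all time steps of one
    # point share a prefix while different points spread across buckets.
    salt = int(hashlib.md5(point.encode("ascii")).hexdigest(), 16) % buckets
    return f"{salt:02d}|{point}|{timestamp}"


def flatten_grid(times, lats, lons, values):
    """Reduce a 3-D [time][lat][lon] grid to (row_key, value) pairs."""
    for t, ts in enumerate(times):
        for i, lat in enumerate(lats):
            for j, lon in enumerate(lons):
                yield make_row_key(lat, lon, ts), values[t][i][j]


if __name__ == "__main__":
    times = ["1972010100", "1972010106"]
    lats = [30.5, 30.75]
    lons = [122.25]
    wave_height = [[[1.2], [1.4]], [[1.3], [1.5]]]  # made-up sample values
    for key, value in flatten_grid(times, lats, lons, wave_height):
        print(key, value)
```

In this sketch the salt depends on the grid point rather than the time, so the full time series of one point stays contiguous and could be read with a single prefix scan, while different points are distributed across buckets. Whether the thesis makes the same trade-off is not stated in the abstract.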
Keywords/Search Tags:ocean dynamic environment data, NetCDF, cloud storage, HBase, Spark