Storage And Processing System Of Marine Data Based On Hadoop

Posted on: 2016-03-09
Degree: Master
Type: Thesis
Country: China
Candidate: J F Liu
GTID: 2180330473956534
Subject: Software engineering
Abstract/Summary:
The ocean is a vast reservoir of resources, and a growing number of countries are paying attention to it. As marine resources are developed and exploited, marine data are accumulating rapidly and in very large volumes. These data are massive, diverse, complex, and heterogeneous, so they need dedicated storage and processing strategies. At present, however, marine scientific data lack uniform standards and specifications for collection and storage. How to manage and store these data so that they can be used efficiently has therefore become a key problem for marine scientific research.
Traditionally, large-scale data have mostly been processed with parallel computing, distributed computing, and grid computing. These approaches require expensive computing resources, and partitioning massive data effectively and allocating computing tasks sensibly takes tedious programming. Hadoop, a representative cloud computing data technology, offers an effective way to address these problems.
This thesis studies Hadoop distributed storage and processing technology in the context of a marine exploration system developed in practice. It examines how to manage and store marine data with Hadoop, designs the system architecture, and studies the system in depth. The thesis analyzes the requirements for building a marine scientific data processing system and summarizes the key cloud computing technologies involved: Hadoop, HBase, and HDFS. It also analyzes the characteristics of marine data and summarizes the requirements for processing them; the system architecture is designed from these requirements together with user needs.
The system stores data on HDFS and uses a data deduplication technique to optimize data uploads. It uses MapReduce <key, value> pairs to read data from NetCDF files and convert them into .txt format. The system also stores marine data in the HBase database, using a column-oriented table design. Finally, the system provides a range of operations for front-end users, with different operations available depending on the logged-in user. Two tests were carried out, a data stress test and a data retrieval test, and they show that storing the data in Hadoop is feasible.
A storage and processing system for marine data based on Hadoop provides a viable solution for processing marine data and has practical application value.
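The abstract does not say which deduplication technique the system uses. As a minimal sketch, assuming file-level deduplication by content hash, an uploader could fingerprint each file with SHA-256 and skip any payload whose fingerprint is already recorded. The class `DedupUploader`, the `store` callable (a stand-in for an HDFS write), and the file paths are hypothetical names introduced for illustration only.

```python
import hashlib


def file_digest(data: bytes) -> str:
    """Return the SHA-256 hex digest used as the content fingerprint."""
    return hashlib.sha256(data).hexdigest()


class DedupUploader:
    """Hypothetical uploader that skips payloads whose content is already stored."""

    def __init__(self, store):
        # store(path, data) is an assumed stand-in for writing a file to HDFS.
        self.store = store
        # Maps each content digest to the first path stored with that content.
        self.index = {}

    def upload(self, path: str, data: bytes) -> bool:
        """Store data unless identical content exists; return True if stored."""
        digest = file_digest(data)
        if digest in self.index:
            return False  # duplicate content, nothing is written
        self.store(path, data)
        self.index[digest] = path
        return True


# Usage with an in-memory dictionary standing in for the file system.
stored = {}
uploader = DedupUploader(lambda path, data: stored.__setitem__(path, data))
first = uploader.upload("/marine/sst_2015.nc", b"temperature grid")
second = uploader.upload("/marine/sst_2015_copy.nc", b"temperature grid")
third = uploader.upload("/marine/salinity_2015.nc", b"salinity grid")
```

In this sketch the second upload is skipped because its bytes match the first file, so only two files reach the store. A real deployment would also need to persist the digest index, which is kept in memory here.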
Keywords/Search Tags:Hadoop, HDFS, HBase, MapReduce, Marine Data