
The Design And Implementation Of Maritime Big Data Query Service Platform

Posted on: 2016-04-17
Degree: Master
Type: Thesis
Country: China
Candidate: T Shen
Full Text: PDF
GTID: 2308330473958209
Subject: Software engineering
Abstract/Summary:
In recent years, data assets have grown explosively both in emerging Internet services and in traditional industries such as telecommunications and transportation. The national maritime system has accumulated massive, timely, and accurate information resources, including AIS, VTS, and ship data. Because these resources are scattered across various systems and have not been integrated or reused, the Maritime Safety Administration put forward an urgent requirement to put its massive and authoritative AIS information resources to active use. The goal is to turn data into services that support two main needs, leadership decision-making and shipping travel, thereby improving maritime public services and strengthening maritime service support. This thesis designs and implements a maritime big data query service platform based on the requirements proposed by the Maritime Safety Administration.

First, the thesis reviews the development of maritime data query services and of HBase secondary indexes, and summarizes their shortcomings. It then focuses on the distributed storage system HBase used in this work and introduces the Hadoop Distributed File System (HDFS) on which HBase relies. Next, it introduces the full-text retrieval server Apache Solr, which is used in this work, and analyzes its internal indexing and retrieval principles in detail. To meet the spatial search requirement, the geohash algorithm and the principle of spatial search in Solr are analyzed in detail. The thesis also studies the principle of SolrCloud, Solr's distributed mode, in order to build a distributed index for maritime big data.

In this thesis, HBase serves as the data storage layer, which solves the problem of storing maritime big data, and Solr serves as the data indexing layer.
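
The geohash encoding mentioned above can be illustrated with a minimal sketch of the standard public algorithm. It is not taken from the thesis or from Solr's source code; the function name `geohash_encode` and the default precision are assumptions made for illustration. The encoder repeatedly halves the longitude and latitude ranges, interleaves the resulting bits, and emits one base-32 character for every five bits.

```python
# Standard geohash base-32 alphabet (digits plus letters, without a, i, l, o).
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat, lon, precision=9):
    """Encode a latitude/longitude pair as a geohash string.

    Illustrative helper only; the name and default precision are assumptions.
    """
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    chars = []
    bits = 0          # bits collected for the current character
    bit_count = 0     # number of bits collected so far (0..5)
    use_lon = True    # bits alternate, starting with longitude

    while len(chars) < precision:
        rng, value = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if value >= mid:
            bits = (bits << 1) | 1   # upper half: emit 1, raise lower bound
            rng[0] = mid
        else:
            bits = bits << 1         # lower half: emit 0, lower upper bound
            rng[1] = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:           # five bits make one base-32 character
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0

    return "".join(chars)


print(geohash_encode(42.6, -5.6, precision=5))  # ezs42
```

Because a shorter geohash is always a prefix of a longer one for the same point, nearby positions tend to share a common prefix, which is what makes the encoding usable for prefix-based spatial filtering in an index.
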
The indexing layer compensates for HBase's limitation that data can be scanned along only one dimension, the rowkey, and enables multi-dimensional queries such as Boolean queries, fuzzy queries, and spatial search. This design separates the index from the actual data, so the server-side hooks provided by the HBase Coprocessor framework are used to hook into the process of inserting data into HBase and index the data at the same time. This keeps the index in Solr consistent with the actual data in HBase without modifying the HBase code. Based on this core solution, the thesis builds a maritime big data query service platform divided into four modules: data processing, data storage, data indexing, and data querying. Each module is designed and implemented separately. The system is then tuned for better performance, and finally a query interface is provided for users.

At the end of the thesis, a series of tests verifies the correctness of the data insertion process and measures the average data insertion speed. Further tests measure response time under different conditions, system performance under multi-user concurrent queries, and system stability under multi-user concurrent queries lasting 30 minutes. The test results have been evaluated by the Maritime Safety Administration, and a maritime data public service website based on this platform has been developed and is running online.
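
The separation of index and data described above can be sketched with plain Python stand-ins. This is a conceptual illustration only: it does not use the HBase Coprocessor or Solr APIs, and the class names `RowStore` and `FieldIndex`, the hook list, and the sample MMSI values are all hypothetical. The store is addressable only by rowkey, and a hook that runs after every put keeps a separate field index in step with the stored rows.

```python
from collections import defaultdict


class RowStore:
    """Stand-in for the storage layer: rows are reachable only by rowkey."""

    def __init__(self):
        self.rows = {}
        self.post_put_hooks = []  # callbacks run after each put (hypothetical)

    def put(self, rowkey, columns):
        self.rows[rowkey] = dict(columns)
        for hook in self.post_put_hooks:
            hook(rowkey, columns)

    def get(self, rowkey):
        return self.rows[rowkey]


class FieldIndex:
    """Stand-in for the indexing layer: maps (field, value) to rowkeys."""

    def __init__(self):
        self.postings = defaultdict(set)
        self.indexed = {}  # rowkey -> columns currently indexed

    def on_put(self, rowkey, columns):
        # Drop entries for the previous version so the index never goes stale.
        for field, value in self.indexed.get(rowkey, {}).items():
            self.postings[(field, value)].discard(rowkey)
        for field, value in columns.items():
            self.postings[(field, value)].add(rowkey)
        self.indexed[rowkey] = dict(columns)

    def search(self, **criteria):
        """Return rowkeys matching all field=value criteria (Boolean AND)."""
        if not criteria:
            return set()
        matches = [self.postings.get(item, set()) for item in criteria.items()]
        return set.intersection(*matches)


store = RowStore()
index = FieldIndex()
store.post_put_hooks.append(index.on_put)

# Sample records with made-up MMSI values.
store.put("row-1", {"mmsi": "412000001", "ship_type": "cargo"})
store.put("row-2", {"mmsi": "412000002", "ship_type": "cargo"})

# Query the index on a non-rowkey field, then fetch full rows by rowkey.
cargo_rows = sorted(index.search(ship_type="cargo"))
cargo_ships = [store.get(rowkey) for rowkey in cargo_rows]
```

A query therefore takes two steps: the index resolves field criteria to rowkeys, and the store returns the full rows for those rowkeys. Updating the index inside the put path is what keeps the two layers consistent in this sketch.
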
Keywords/Search Tags:HBase secondary index, Solr, distributed index, maritime big data