Marine Environmental Data Storage Optimization And Distributed Management Based On The Database Cluster Technology

Posted on: 2009-08-02 | Degree: Doctor | Type: Dissertation
Country: China | Candidate: Y Y Liu
GTID: 1488302750452374 | Subject: Detection and processing of marine information
Abstract/Summary:
With the rapid development of ocean probing technologies such as satellite remote sensing, the volume of marine environmental data being collected has grown to hundreds of GB and sometimes to the TB level. Marine data is characterized by multiple sources, multiple formats and huge volume. Speed, efficiency and availability are required when the data is accessed over the internet or an intranet. This dissertation studies storage optimization for massive data and distributed data management based on database cluster technology. Experimental results show that both the overall performance of marine environmental data management and the reliability of the system are improved. The main research work and results are as follows:

1. Research on marine environmental data storage optimization

Because marine environmental data is query oriented, query efficiency and storage occupancy are the most important factors to consider when designing ocean environmental databases. To meet these practical requirements, storage optimization is studied from three aspects: reorganization of the traditional relational schema, data fragmentation, and the corresponding data manipulation methods.

A new grid-structured relational schema, Grid_R, is proposed to manage marine environmental data. Its structure resembles the real geographical longitude and latitude grid: in addition to the time and latitude columns, the schema contains a column for every longitude. The data fragmentation method and the corresponding data manipulation methods are based on this Grid_R storage model. With this storage and organization optimization, storage space is reduced to about one third of its original size and single-table query performance improves fourfold.

2. Research on distributed management of massive data based on database cluster technology

A traditional centralized database system has difficulty supporting the functionality of a Web Marine Information System effectively. As the number of database queries increases and the queries become more complex, the workload of the database server grows and the response to a single query becomes slower and slower. Massive data also challenges the storage capacity of a single server. In this dissertation, database cluster technology is applied to marine environmental data management. Data is distributed across different database nodes, and a cluster middleware system is responsible for the cooperation and parallel processing of those nodes. The system thereby achieves good performance, availability and extensibility by breaking through the performance bottleneck of a traditional DBMS running on a centralized database server.

3. Research on critical marine environmental database cluster technology

Based on the optimized storage structure and the distributed management strategy, research is carried out on data distribution, load balancing and parallel query for the database cluster:

A new data distribution algorithm, the Two Phase Distribute Algorithm (TPDA), is proposed. TPDA first allocates fragments uniformly and then allocates replicas according to the different weights of the cluster nodes.

A replication-based load balancing algorithm is proposed to improve system performance. The load balancer addresses the hot node problem and improves the reliability of the system.

A parallel query algorithm is proposed to realize transparent data access. The user's global query is parsed and converted into local queries according to the metadata. The load balancer is called to select the best query node, and the local queries are executed in parallel. Finally, the results are reconstructed and returned to the user.

Using the above research results and building on the MAGIS (Marine and Atmospheric Geographical Information System) platform, a multi-node marine environmental database cluster is constructed and the cluster middleware, Distributed Ocean Environmental Data Manager, is developed. Marine data storage optimization and distributed management are realized, and the overall performance and reliability of the system are improved.
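
The two-phase distribution idea described in the abstract (uniform fragment allocation followed by weight-based replica allocation) can be illustrated with a small sketch. This is not the dissertation's TPDA implementation, which the abstract does not specify. The function name, the round-robin choice for phase one, the "least replicas per unit of weight" rule for phase two, and the rule that a replica never shares a node with its own fragment are all assumptions made for illustration.

```python
def two_phase_distribute(fragments, node_weights, replicas=1):
    """Hypothetical two-phase placement sketch (not the thesis's TPDA).

    fragments    -- list of fragment identifiers
    node_weights -- dict mapping node name to a positive weight
    replicas     -- number of extra copies to place per fragment
    Returns {node: {"primary": [...], "replica": [...]}}.
    """
    nodes = list(node_weights)
    placement = {n: {"primary": [], "replica": []} for n in nodes}

    # Phase 1 (assumed round-robin): spread primary fragments uniformly.
    for i, frag in enumerate(fragments):
        placement[nodes[i % len(nodes)]]["primary"].append(frag)

    # Phase 2 (assumed rule): put each replica on the eligible node that
    # currently holds the fewest replicas relative to its weight.
    replica_count = {n: 0 for n in nodes}
    for frag in fragments:
        holders = {n for n in nodes if frag in placement[n]["primary"]}
        for _ in range(replicas):
            candidates = [n for n in nodes if n not in holders]
            if not candidates:
                break  # fewer nodes than requested copies
            target = min(
                candidates,
                key=lambda n: (replica_count[n] / node_weights[n], n),
            )
            placement[target]["replica"].append(frag)
            replica_count[target] += 1
            holders.add(target)
    return placement


if __name__ == "__main__":
    layout = two_phase_distribute(
        [f"f{i}" for i in range(6)], {"a": 1, "b": 1, "c": 2}
    )
    for node, held in layout.items():
        print(node, held)
```

Keeping a replica off the node that already stores the same fragment means a single node failure cannot remove every copy of a fragment, which is one plausible reading of how replication supports the reliability claim in the abstract.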
Keywords/Search Tags: Massive Marine Environmental Data, Relational Schema, Database Cluster Middleware, Marine Geographic Information System, Parallel Query