Application Research On Marine Environment Data Warehouse And Data Mining

Posted on: 2012-05-02
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J Sun
Full Text: PDF
GTID: 1228330377453245
Subject: Marine Geology
Abstract/Summary:
"Digital Ocean" is a comprehensive marine information system: it monitors extensive, multi-resolution, multi-temporal, multi-type marine data and analyzes those data with a set of algorithms and models. As part of the marine data platform of "Digital Ocean", the marine environment data warehouse integrates extensive multi-source, heterogeneous, distributed marine environment data in order to serve marine research, marine management and sustainable marine development. This research project offers a solid data framework and theoretical basis.

This paper studies in depth the planning of the marine environment data system, data modeling, data warehouse construction techniques, and OLAP and data mining technology for the integrated data warehouse. It systematically proposes, for the first time, a complete framework covering the integration, application and analysis of marine environment data.

To meet the requirements of marine environment data integration and application, this paper designs a four-layer integration and application framework for the marine environment data warehouse, consisting of the data resource, data loading, data warehouse and application layers. To satisfy the different needs of different users, such as the need for original data and the need for decision support, the data warehouse is divided into three layers: the basic marine environment data warehouse, the marine environment data warehouse, and the data marts. The framework also includes data warehouse management tools, access rights management for the entire data warehouse system, security management and so on.

After summarizing the types and characteristics of marine environment data and analyzing the construction method and key models of the marine environment data warehouse, this paper proposes the marine environment data warehouse architecture together with its detailed structure design, theme design and multi-dimensional model design.
The basic marine environment data warehouse stores original data. It fully integrates data from previous special ocean surveys, conventional marine surveys, operational data, marine environmental monitoring data and information from international cooperation, to meet end-user demand for raw data. After processing by isomorphism, transformation, integration but statistics, the data of the basic data warehouse are loaded into the marine environment data warehouse, which is organized by a multi-dimensional model. Data marts are created according to user demand for data storage; they filter, aggregate and interpolate the data in the marine environment data warehouse to build data cubes, which support OLAP and data mining applications.

This paper studies performance optimization methods for the marine environment data warehouse and proposes a detailed optimization strategy; concurrent data access based on index optimization and fragmentation greatly enhances the performance of the massive marine environment data warehouse.

ETL is the key to building a data warehouse. Based on the characteristics of marine environment data, this paper designs and develops a prototype ETL system for the marine environment data warehouse after studying the data cleansing, transformation and integration rules. The prototype provides a variety of data access interfaces. Rigorous data cleansing is needed to ensure data quality: data that do not meet the requirements are filtered out or amended according to the cleansing rules, which safeguards the accuracy of future marine environment data analysis and decision-making.
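The "filter out or amend" cleansing step can be illustrated with a minimal Python sketch. It is not the prototype system described in the abstract: the record layout (`Observation`), the missing-value sentinel `-999.0`, the temperature range and the choice to keep records whose temperature is missing are all hypothetical assumptions made for illustration.

```python
from dataclasses import dataclass, replace
from typing import List, Optional

# Hypothetical conventions, not taken from the dissertation.
MISSING_SENTINEL = -999.0          # assumed marker for "no measurement"
TEMP_RANGE_C = (-2.5, 40.0)        # assumed plausible sea temperature range


@dataclass(frozen=True)
class Observation:
    station: str
    lat: float
    lon: float
    temperature_c: Optional[float]


def amend(obs: Observation) -> Observation:
    """Amend rule: turn the sentinel into an explicit missing value."""
    if obs.temperature_c == MISSING_SENTINEL:
        return replace(obs, temperature_c=None)
    return obs


def is_valid(obs: Observation) -> bool:
    """Filter rule: coordinates on the globe, temperature in range or missing."""
    if not (-90.0 <= obs.lat <= 90.0 and -180.0 <= obs.lon <= 180.0):
        return False
    t = obs.temperature_c
    return t is None or TEMP_RANGE_C[0] <= t <= TEMP_RANGE_C[1]


def cleanse(records: List[Observation]) -> List[Observation]:
    """Apply amend rules first, then drop records that still fail validation."""
    amended = (amend(r) for r in records)
    return [r for r in amended if is_valid(r)]


if __name__ == "__main__":
    raw = [
        Observation("S1", 31.2, 122.0, 18.4),    # kept unchanged
        Observation("S2", 95.0, 122.0, 18.0),    # dropped: invalid latitude
        Observation("S3", 31.0, 122.5, -999.0),  # amended: sentinel -> None
        Observation("S4", 30.9, 122.3, 75.0),    # dropped: out-of-range value
    ]
    for record in cleanse(raw):
        print(record)
```

Amending before filtering means a sentinel value is never mistaken for an out-of-range measurement; a real pipeline would presumably also log the rejected records.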
Because the marine environment data warehouse is massive, holds a large amount of historical data and is updated infrequently, this research presents a complete incremental update mechanism for the marine environment data warehouse, which greatly improves its efficiency.

To observe and analyze marine environment data from multiple angles, this paper studies the construction of the marine environment spatio-temporal data cube. Using spatial and spatio-temporal interpolation algorithms, the irregular marine environment data are gridded to construct data cubes of various granularities with time and space dimensions. Various measures can be calculated by built-in or custom statistical analysis functions to build data cubes with different measures or with multiple measures. Using examples of data analysis in marine hydrology and meteorology, the paper examines how OLAP operations can realize the various application requirements of the marine environment data warehouse.

To find useful patterns or rules hidden in the massive marine environment data, the paper examines data mining methods for the marine environment data warehouse and carries out a preliminary application. Regression analysis can be used to predict future data trends by establishing a regression equation. This paper uses regression analysis to establish a prediction model of the concentration and turbidity of suspended matter. Using a clustering method, the elements in the surface sediment of the Yangtze estuary are classified geochemically; they can be divided into four types. The causes are also analyzed in light of the geological background.
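The gridding of irregular observations into a spatio-temporal data cube, followed by an OLAP roll-up, can be sketched as follows. The abstract does not name the interpolation algorithm it uses, so inverse distance weighting (IDW) is substituted here purely as an assumption; the function names, the tuple layout `(time, x, y, value)` and the use of planar distance on the coordinates are likewise hypothetical.

```python
from collections import defaultdict
from math import hypot


def idw(points, x, y, power=2.0):
    """Inverse-distance-weighted estimate at (x, y) from (px, py, value) points."""
    num = den = 0.0
    for px, py, value in points:
        d = hypot(px - x, py - y)   # planar distance, an assumed simplification
        if d == 0.0:
            return value            # grid node coincides with an observation
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den


def build_cube(observations, xs, ys):
    """Grid irregular (time, x, y, value) records into {(time, x, y): value}."""
    by_time = defaultdict(list)
    for t, x, y, v in observations:
        by_time[t].append((x, y, v))
    return {
        (t, gx, gy): idw(pts, gx, gy)
        for t, pts in by_time.items()
        for gx in xs
        for gy in ys
    }


def roll_up_time(cube):
    """OLAP roll-up: average the measure over the time dimension per grid cell."""
    sums = defaultdict(lambda: [0.0, 0])
    for (_t, gx, gy), v in cube.items():
        sums[(gx, gy)][0] += v
        sums[(gx, gy)][1] += 1
    return {cell: total / count for cell, (total, count) in sums.items()}


if __name__ == "__main__":
    obs = [
        ("2010-01", 0.0, 0.0, 10.0),
        ("2010-01", 2.0, 0.0, 20.0),
        ("2010-02", 0.0, 0.0, 12.0),
        ("2010-02", 2.0, 0.0, 22.0),
    ]
    cube = build_cube(obs, xs=[0.0, 1.0, 2.0], ys=[0.0])
    print(roll_up_time(cube))
```

Interpolating each time step separately keeps the sketch purely spatial; a spatio-temporal variant would also weight observations from neighbouring time steps.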
Keywords/Search Tags:Digital Ocean, Marine Environment, Data Warehouse, ETL, OLAP, Data Mining