
Design And Implementation Of Data Warehouse Management Module For Mobile Reading Platform

Posted on: 2017-08-06
Degree: Master
Type: Thesis
Country: China
Candidate: H Zhou
Full Text: PDF
GTID: 2348330518995520
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of information technology, enterprises face increasingly rich and complex data, and the volume generated by computer systems is far larger than before. The maturity of data warehouse technology provides an effective solution for enterprise data management. In building and using a data warehouse, however, the complexity of collecting, storing, computing, managing and accessing large volumes of data has become an increasingly prominent issue. Users work with many heterogeneous data sources whose indicators and definitions differ, which results in statistical inconsistencies. At the same time, business understanding and actual development are not synchronized. Business and technical personnel have to deal with multiple systems, the definitions of business terms are not consistent with the systems as developed, and there is no standard platform for carrying this information. How to build a robust warehouse management platform is therefore a key problem in a complex data environment.

The Hive Hadoop data warehouse platform of China Mobile's mobile reading base has been operating stably for nearly a year, and development work has mainly focused on satisfying business application needs at the application level of the warehouse. The management and maintenance of the data warehouse itself, however, has not yet formed a complete set of functions, and users can only obtain the information they need manually. In order to manage and maintain the whole warehouse platform systematically, this project designs and implements a data warehouse management module suited to the platform's own characteristics. A well-designed warehouse management module not only makes it convenient for IT, technical and maintenance personnel to manage and use data warehouse resources, but also helps ordinary business personnel make fuller use of the vast amounts of data the warehouse provides.

This project designs and develops a warehouse management platform that conforms to the characteristics of the Hive Hadoop data warehouse of the mobile reading base, providing metadata management, task scheduling monitoring, and data lineage analysis functions. Metadata management allows users to efficiently identify the data they are concerned with, and it is also the basis for task scheduling monitoring and data analysis. Task scheduling monitoring allows users to obtain the running status of Hive jobs in real time, together with the information and running status of upstream and downstream nodes. Data lineage analysis provides users with a data map, so that they can easily understand the source and destination of the data. It also provides reliable data support for subsequent optimization of the warehouse structure.

The thesis is organized as follows. The first chapter briefly describes the research background, content and significance of the project. The second chapter describes the related background and technology, including the mobile reading BI (Business Intelligence) platform, data warehouses, metadata, Oozie, and the current state of data warehouse applications. The third chapter describes the requirements analysis and the overall design of the warehouse management module for the mobile reading platform. The fourth chapter describes the detailed design and implementation of the warehouse management module, including the implementation of the key functional modules of the system. The fifth chapter covers testing and analysis of the system, including functional tests and a detailed comparative analysis of the results before and after the system was applied. The sixth chapter summarizes the work and proposes directions for further research.
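
The data lineage analysis described above, in which users trace the source and destination of data, can be illustrated with a minimal sketch. This is not the thesis's implementation: the `LineageGraph` class, its method names, and the table names (`ods_orders`, `dw_orders`, `rpt_daily_sales`) are all hypothetical, and the sketch assumes that table-to-table dependencies have already been extracted from metadata.

```python
from collections import defaultdict, deque


class LineageGraph:
    """Hypothetical in-memory lineage graph; not the thesis's actual design."""

    def __init__(self):
        # source table -> tables derived directly from it
        self._downstream = defaultdict(set)
        # target table -> tables it is directly derived from
        self._upstream = defaultdict(set)

    def add_dependency(self, source, target):
        """Record that `target` is computed from `source`."""
        self._downstream[source].add(target)
        self._upstream[target].add(source)

    @staticmethod
    def _walk(start, edges):
        """Breadth-first traversal returning every node reachable from `start`."""
        seen = set()
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in edges.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        seen.discard(start)  # exclude the start node if a cycle leads back to it
        return seen

    def upstream(self, table):
        """All direct and indirect sources of `table`."""
        return self._walk(table, self._upstream)

    def downstream(self, table):
        """All tables that directly or indirectly depend on `table`."""
        return self._walk(table, self._downstream)


if __name__ == "__main__":
    graph = LineageGraph()
    # Hypothetical table names, for illustration only.
    graph.add_dependency("ods_orders", "dw_orders")
    graph.add_dependency("dw_orders", "rpt_daily_sales")
    print(sorted(graph.upstream("rpt_daily_sales")))  # ['dw_orders', 'ods_orders']
    print(sorted(graph.downstream("ods_orders")))     # ['dw_orders', 'rpt_daily_sales']
```

The same upstream and downstream traversal could, under the same assumptions, serve the task scheduling monitor when it looks up the nodes before and after a given job.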
Keywords/Search Tags: Data warehouse management, Metadata, Scheduling monitoring, Lineage analysis