An Application Study For ETL Techniques In Realization Of Data Analysis System

Posted on: 2012-05-29; Degree: Master; Type: Thesis
Country: China; Candidate: J Xue
GTID: 2178330332985812; Subject: Computer software and theory
Abstract/Summary:
The data warehouse has become an important solution for processing, aggregating, and analyzing large volumes of financial business data, and the ETL (Extract-Transform-Load) process plays a key role in data warehouse applications. The ETL process extracts data from isolated, heterogeneous sources, transforms it, and loads it into the data warehouse. Its main functions are the cleansing, standardization, and aggregation of various types of business data, which provides high-quality data for warehouse-based decision making and analysis.

The author's research work and innovations can be summarized as follows:
1) Studied the technologies and methods used for data extraction, data conversion, and data loading in the ETL process.
2) Implemented an ETL process that supports parallel processing and multiple data sources, allows tasks to be configured flexibly, and makes the ETL functions easy to extend.
3) Given the large data volumes typical of the financial sector, proposed using three parallel processing technologies, namely clustering, load balancing, and Oracle RAC (Real Application Clusters), to improve system performance in the ETL process.
4) Based on the requirements of a securities risk control system, presented the system architecture of ETL in a cluster environment and the design of the main modules for data extraction, data conversion, and task management.
5) Described in detail the implementation of load scheduling and of task management and scheduling, and presented the main data model and key classes.

The ETL system has gone online successfully and runs stably. It is more efficient than current mainstream ETL tools. This shows that combining parallel processing with cluster load balancing is a reliable and effective way to improve ETL performance.
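The extract, transform, and load stages described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the source names, record fields, and function names below are hypothetical, and a local thread pool stands in for the cluster nodes and load balancer that the thesis uses. Each source is handled as an independent task, so sources with different record layouts can be extracted and cleaned in parallel before being aggregated into one target table.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical heterogeneous sources: each holds raw records in its own layout.
SOURCES = {
    "branch_a": [{"acct": " A1 ", "amt": "100.50"}, {"acct": "a2", "amt": "20"}],
    "branch_b": [{"account": "A1", "amount": 4.5}, {"account": None, "amount": 9}],
}


def extract(name):
    """Return the raw rows of one source (stands in for a database or file read)."""
    return SOURCES[name]


def transform(row):
    """Clean and standardize one row; return None if the row fails validation."""
    account = row.get("acct") or row.get("account")
    amount = row.get("amt", row.get("amount"))
    if account is None or amount is None:
        return None
    return {"account": account.strip().upper(), "amount": float(amount)}


def run_task(name):
    """One ETL task: extract a single source and transform its rows."""
    return [r for r in map(transform, extract(name)) if r is not None]


def load(warehouse, rows):
    """Aggregate cleaned rows into the target table (total amount per account)."""
    for r in rows:
        warehouse[r["account"]] += r["amount"]


def run_etl(source_names, workers=2):
    """Run extract and transform in parallel, then load results in the caller's thread."""
    warehouse = defaultdict(float)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for rows in pool.map(run_task, source_names):
            load(warehouse, rows)
    return dict(warehouse)


if __name__ == "__main__":
    print(run_etl(list(SOURCES)))
```

Loading happens in the calling thread, so the shared target table needs no locking in this sketch. A clustered deployment like the one the abstract describes would instead distribute tasks across nodes and rely on the database for concurrent writes.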
Keywords/Search Tags: Data Warehouse, ETL, Parallel Processing, Load Balancing