
The Design And Implementation Of Data Automation Generation And Configuration Platform

Posted on: 2016-10-15
Degree: Master
Type: Thesis
Country: China
Candidate: Y X Sha
GTID: 2308330476452773
Subject: Software engineering
Abstract/Summary:
One of the responsibilities of eBay Shanghai is to maintain and develop the company-wide data warehouse. Every day, data is extracted from the live production databases and pushed to the data warehouse. Along the way the data is reorganized and regenerated so that it is better suited to business analysts. Because the data warehouse has strict access control, it is difficult for business users to obtain data from it directly. Normally, the data warehouse team writes ETL code to extract the data and sends the resulting data files to the analysts, but this takes too much time. The solution proposed here is to build software that provides this service.

This paper describes the techniques involved, including those used internally at eBay, together with some background information. We design and implement a system that lets users submit requests, automatically generates the corresponding code scripts, automatically deploys them to production, and then runs them automatically every day. The main functions are:

(1) Automatic generation of configuration files and scripts. The system takes what the user submitted on the page and automatically generates a series of configuration files and scripts according to that request. By running the generated scripts and configuration files, users can extract the data they need from the data warehouse.
(2) Automatic generation of scheduled-task scripts. The system generates the scheduled-task script that runs each user's data extraction according to that user's needs.
(3) Automatic submission of configuration files and scripts. The system submits the configuration files and the scheduled-task script files to the code base and deploys them to production, where they run every day.
(4) User request management. A standardized process handles user requests, and while a request is being processed the user can check its current status.
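Functions (1) and (2) can be illustrated with a minimal sketch. This is not the system described in the thesis: the `FeedRequest` fields, the SQL template, the cron-style schedule format, and the script path are all hypothetical, chosen only to show how a submitted request might be turned into an extraction script and a daily schedule entry.

```python
from dataclasses import dataclass
from string import Template
from typing import List


@dataclass
class FeedRequest:
    """Hypothetical user request, as it might arrive from the submission page."""
    request_id: int
    table: str          # warehouse table to extract from (assumed field)
    columns: List[str]  # columns the user asked for (assumed field)
    run_hour: int       # hour of day (0-23) for the daily run (assumed field)


# Hypothetical template for the generated extraction script.
SQL_TEMPLATE = Template("SELECT $columns FROM $table;")


def render_extract_sql(req: FeedRequest) -> str:
    """Fill the template with the requested table and columns."""
    return SQL_TEMPLATE.substitute(
        columns=", ".join(req.columns),
        table=req.table,
    )


def render_cron_line(req: FeedRequest, script_path: str) -> str:
    """Build a cron-style entry that runs the given script once a day."""
    if not 0 <= req.run_hour <= 23:
        raise ValueError("run_hour must be between 0 and 23")
    return f"0 {req.run_hour} * * * {script_path}"
```

For a request for columns `item_id` and `amount` of a table named `sales_daily` at hour 3, `render_extract_sql` yields one SELECT statement and `render_cron_line` yields one daily schedule line. Committing both outputs to a code base and deploying them would correspond to function (3), which is outside this sketch.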
Keywords/Search Tags:ETL, data warehouse, data feed, automation generation