Design And Implementation Of Job Scheduling System For Electronic Commerce Data Warehouse

Posted on: 2018-09-19
Degree: Master
Type: Thesis
Country: China
Candidate: L Hu
Full Text: PDF
GTID: 2348330512486410
Subject: Software engineering
Abstract/Summary:
After several years of explosive growth, e-commerce has entered a stable stage of development. Effective data management has become a core competitive advantage: an enterprise that can manage these huge volumes of data and mine valuable information from them will hold the strategic high ground and be able to seize market opportunities. As the business logic of the e-commerce data warehouse grows more complex and data volumes increase, the demands on the data warehouse scheduling system rise as well: it must ensure the efficiency and accuracy of job execution and also preserve the logical relationships between the data. These requirements pose new challenges for building a job scheduling system. At present, there are few openly available data warehouse job-scheduling systems in China, only a few open-source job-scheduling frameworks such as Timer and Quartz. These are time-based scheduling frameworks; although they perform very well on timed tasks, they have difficulty handling jobs that are triggered by dependencies. After summarizing the scheduling requirements encountered in our daily work, we designed a customized job scheduling engine that manages both time-triggered and dependency-triggered jobs. As data volumes keep growing, traditional relational databases have hit bottlenecks in performance and scalability. Hive emerged to meet the needs of the big data era. This system uses Hive as the big data platform for the data warehouse architecture and takes full advantage of the stability and high scalability of the Hadoop cluster, so that the distributed cluster can meet an e-commerce enterprise's data warehouse requirements for stability, high performance, and economy.
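The difference between a time-triggered job and a dependency-triggered job can be illustrated with a small simulation. The following is only a minimal sketch under stated assumptions, not the engine described in the thesis: the names `Job`, `start_at`, `depends_on`, and `run_schedule` are hypothetical, time is modelled as integer ticks, and every job is assumed to finish instantly and successfully.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Job:
    """Hypothetical job description; not taken from the thesis."""
    name: str
    # Earliest tick at which the job may start. None means the job has no
    # time trigger and waits only on its dependencies.
    start_at: Optional[int] = None
    # Names of upstream jobs that must finish before this job may run.
    depends_on: set = field(default_factory=set)


def run_schedule(jobs, ticks):
    """Simulate `ticks` clock ticks and return a list of (tick, job name) runs."""
    done = set()
    log = []
    for now in range(ticks):
        progressed = True
        # Rescan within the same tick so that a job that has just finished
        # can release its dependents immediately (the dependency trigger).
        while progressed:
            progressed = False
            for job in jobs:
                if job.name in done:
                    continue
                time_ok = job.start_at is None or now >= job.start_at
                deps_ok = job.depends_on <= done
                if time_ok and deps_ok:
                    log.append((now, job.name))
                    done.add(job.name)
                    progressed = True
    return log
```

In this sketch, a job with only `start_at` behaves like a plain timer task, a job with only `depends_on` runs as soon as its upstream jobs are done, and a job with both waits for whichever condition is satisfied last. A real engine would also need job failure handling, retries, and persistence, which are omitted here.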
Keywords/Search Tags:Data warehouse, job-scheduling, ETL, HIVE