| In recent years, e-commerce built around “flash sales” has sprung up as a new management mode for e-commerce enterprises. This mode, which originated with the French website Vente Privée, has gradually been adopted by well-known domestic e-commerce companies. Its marketing approach of limited time, limited quantity and limited price has set off a buying craze in the e-commerce market. The number of flash sales e-commerce businesses keeps growing, and the volume of data in their enterprise systems is rising sharply along with it. In this context, flash sales e-commerce companies have more urgent requirements for big data storage and data warehouse ETL technology. On a big data platform, large amounts of unstructured data make the management analysis of flash sales e-commerce more effective and profound, but they also make it harder for enterprises to use these data efficiently. In particular, because many different database types are involved, it is very difficult for e-commerce companies to schedule applications across these databases in a unified and coordinated way. How to identify the relationships among these jobs in a complex system and schedule them efficiently has therefore become a problem that big data platform applications in flash sales e-commerce urgently need to solve. This thesis proposes an ETL job scheduling scheme suited to the big data platform of flash sales e-commerce and designs the corresponding system on the basis of that scheme. First, based on a requirements analysis of the ETL job scheduling system for the flash sales e-commerce big data platform, the thesis sets out the system's business requirements in metadata management, task management, extraction and loading management, job scheduling management, monitoring management and related areas. It also sets out the system's performance requirements in terms of reliability, ease of use, security, maintainability, scalability and high cluster utilization. On this basis, the thesis designs the ETL job scheduling system, proposes a logical architecture consisting of a front-end WebApp and a back-end compute cluster, and designs the technical architecture on top of the Hadoop distributed service framework. Second, the thesis designs the key modules and business processes of the system, including the core content of metadata management, the metadata management structure model, the task management function, job definition design, job dependencies, job operation design, the extraction and loading process, the job scheduling strategy and its selection, and scheduling exception handling. Finally, the system was tested and put into use. It performed well in both the functional and the performance tests and was therefore judged ready to go online. After it entered formal operation, the system was found to clearly improve data storage, data mining and job scheduling on the big data platform, and its results in application were very good. |