| As the enterprise's business lines continue to expand, data sharing among business units and product lines has become a problem: data is fragmented and redundant, and data interconnection and cooperation between departments are particularly difficult. As the business has grown, the user data held by each business unit's products has become very large, which creates a strong desire to integrate and consolidate the data resources originally stored in each product and, through data analysis, to provide users with better product services and experiences. To address the wasted resources and low resource utilization caused by repeated research and development across the enterprise's business divisions, the company's big data department has developed a one-stop big data intelligent cloud R&D platform that integrates data import, data management, data development, and task scheduling. This paper introduces the current state of data development platforms, the significance of a one-stop data development platform for the enterprise, the technologies used in developing the platform, the overall architecture design of the platform, and the implementation details of each sub-module. At the base of the data integration module, the platform uses the open-source data synchronization framework DataX to synchronize data across heterogeneous data sources, making data synchronization tasks simple and controllable. The data development module uses the Airflow scheduling framework to schedule the different kinds of data development tasks. The data query module and the editor function of the data development module are built on the open-source Monaco Editor library and support writing SQL, Python, and Hive script tasks. In addition, the data development module uses Joint.js to link upstream and downstream script tasks in the form of a DAG, making development tasks more intuitive. I am mainly involved in the front-end development of the data integration, data development, data query, and data management functions, and in the back-end development of the data integration module of the one-stop big data intelligent cloud R&D platform. The platform is a one-stop big data development solution that addresses data synchronization and integration between heterogeneous data sources, scheduling of data development tasks, Web-based editor functions, and task association in DAG form. Version 1.0 of the one-stop big data intelligent cloud R&D platform has been released and meets the basic needs of the company's big data department. However, to provide better data development capabilities and to truly solve the data-sharing problems among the company's business units and product lines, the platform continues to be improved iteratively through agile development. |