Design And Implementation Of Real-time Data Integration Platform Of Logistics System

Posted on: 2022-08-30
Degree: Master
Type: Thesis
Country: China
Candidate: P Zhou
Full Text: PDF
GTID: 2518306524490204
Subject: Master of Engineering
With the rapid development of Internet technology, its application in traditional industrial fields has become common, and the importance of industrial data is self-evident. However, enterprises in traditional industries lack data management capabilities and data management tools. Coupled with the diversity and high heterogeneity of data sources, this leaves a large amount of industrial data of low quality or isolated in information islands between different business systems. As a consequence, the information in the data cannot be used effectively, and much of the value it contains is wasted. In view of these scenarios, efficiently integrating and cleaning multi-source heterogeneous data, that is, extracting data from different sources and structures and storing it in a unified structure, is a problem that industrial data management needs to solve.
This thesis studies and designs a real-time data integration platform for logistics systems. The platform is developed on the open-source ETL tool Kettle and adopts a B/S architecture, which improves the platform's flexibility and saves users the time spent downloading and installing executable applications. A real-time extraction function that combines change data capture technology with a metadata-driven design makes up for Kettle's shortcomings in real-time extraction. The platform supports integration and cleaning of data from multiple relational databases, non-relational databases, and file-based storage. In addition, it provides a convenient user interface for designing cleaning and integration jobs, a complete scheduling function for data integration jobs, and complete authorization and authentication capabilities, which improve system and data security. Finally, this thesis also implements a client data analysis system. Its data source is the business data of a logistics company, integrated using the platform, and its front end is developed with the Echarts chart library, which makes data analysis convenient.
Following the software engineering research and development process, this thesis first divides the platform into sub-modules and subsystems such as a real-time extraction module, a task management module, an authorization and authentication module, and a client data analysis system. It conducts requirement analysis and introduces the functions of each module, and designs each module's specific functions, attributes, and logic. It then designs the overall system architecture, including the software architecture and module structure. After that, it describes the implementation process and function flow of each functional module and the principles behind some core functions, and reimplements Kettle in a B/S structure, discarding its original C/S structure to improve flexibility. Finally, performance and functional tests of the platform and the client data analysis system are carried out to verify the platform's usability and concurrency.
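The abstract does not say which change data capture mechanism the platform uses, so the following is only a minimal sketch of one common variant, trigger-based capture driven by table metadata. SQLite stands in for the source and target databases, and every name here (`TABLE_META`, `change_log`, `install_capture`, `extract_changes`, the `shipment` table) is a hypothetical illustration, not part of Kettle or of the thesis platform.

```python
import sqlite3

# Hypothetical metadata for one source table; a metadata-driven platform would
# load such descriptions from a repository instead of hard-coding them.
TABLE_META = {"table": "shipment", "key": "id", "columns": ["id", "status"]}


def install_capture(conn, meta):
    """Create a change-log table and capture triggers generated from metadata."""
    t, k = meta["table"], meta["key"]
    conn.execute(
        "CREATE TABLE IF NOT EXISTS change_log ("
        "seq INTEGER PRIMARY KEY AUTOINCREMENT, tbl TEXT, op TEXT, row_key)"
    )
    for op, ref in (("INSERT", "NEW"), ("UPDATE", "NEW"), ("DELETE", "OLD")):
        conn.execute(
            f"CREATE TRIGGER IF NOT EXISTS cdc_{t}_{op.lower()} AFTER {op} ON {t} "
            f"BEGIN INSERT INTO change_log (tbl, op, row_key) "
            f"VALUES ('{t}', '{op}', {ref}.{k}); END"
        )


def extract_changes(source, target, meta, last_seq):
    """Apply changes logged after last_seq to the target; return the new offset."""
    t, k = meta["table"], meta["key"]
    cols = ", ".join(meta["columns"])
    marks = ", ".join("?" for _ in meta["columns"])
    changes = source.execute(
        "SELECT seq, op, row_key FROM change_log "
        "WHERE tbl = ? AND seq > ? ORDER BY seq",
        (t, last_seq),
    ).fetchall()
    for seq, op, row_key in changes:
        if op == "DELETE":
            target.execute(f"DELETE FROM {t} WHERE {k} = ?", (row_key,))
        else:
            row = source.execute(
                f"SELECT {cols} FROM {t} WHERE {k} = ?", (row_key,)
            ).fetchone()
            if row is not None:  # the row may have been deleted by a later change
                target.execute(
                    f"INSERT OR REPLACE INTO {t} ({cols}) VALUES ({marks})", row
                )
        last_seq = seq
    target.commit()
    return last_seq


def demo():
    """Run two incremental extraction rounds between in-memory databases."""
    source = sqlite3.connect(":memory:")
    target = sqlite3.connect(":memory:")
    ddl = "CREATE TABLE shipment (id INTEGER PRIMARY KEY, status TEXT)"
    source.execute(ddl)
    target.execute(ddl)
    install_capture(source, TABLE_META)

    source.execute("INSERT INTO shipment VALUES (1, 'created')")
    source.execute("INSERT INTO shipment VALUES (2, 'created')")
    offset = extract_changes(source, target, TABLE_META, 0)

    source.execute("UPDATE shipment SET status = 'delivered' WHERE id = 1")
    source.execute("DELETE FROM shipment WHERE id = 2")
    offset = extract_changes(source, target, TABLE_META, offset)

    rows = target.execute("SELECT id, status FROM shipment ORDER BY id").fetchall()
    return rows, offset
```

Calling `demo()` copies only the rows that changed since the previous offset, so each round touches the change log rather than rescanning the whole source table. Log-based capture, which reads the database's own transaction log, is another common choice and avoids the write overhead of triggers.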
Keywords/Search Tags: Multi-source heterogeneous data, Kettle, Data integration, Data extraction
