
The Design And Implementation Of The Enterprise Data Exchange Platform Based On ETL Tool

Posted on: 2017-01-26
Degree: Master
Type: Thesis
Country: China
Candidate: C L Zhang
GTID: 2308330509457571
Subject: Software engineering
Abstract/Summary:
As society develops, the modernization and digitization of enterprises has become an inevitable trend, and each department of an enterprise typically runs its own independent business system. Because of historical, human, and geographical factors, data cannot circulate between these systems, and the resulting “isolated islands of information” seriously hinder enterprise modernization. This project therefore starts from the demand for cross-regional data exchange and proposes a data exchange platform as the way to solve this problem.

ETL (Extract-Transform-Load) has been one of the most popular data exchange technologies in recent years. Based on the open source ETL tool Kettle, this paper analyzes and studies the development of a data exchange platform and modifies Kettle accordingly. On the technical side, the ETL process and the timed start of jobs are improved; for later use and management, the platform also provides real-time information about the data exchange. Kettle is written in pure Java, so Java is used as the basic development language. The platform consists of data exchange nodes deployed in different geographical locations, and the exchange process is divided into local data exchange and remote data exchange. In local network data exchange, database import and export can be carried out through the interfaces provided by Kettle or by other means.

The main content of the paper is as follows:

1) In remote network data exchange, Kettle handles the problem of heterogeneous data, and a new socket function is used instead of binding to an FTP server or a VPN connection. The socket function is one of the results of the secondary development of Kettle. Once the data exchange nodes are deployed, data can be exchanged between different locations.

2) To improve the efficiency, stability, and security of data transmission over the Internet, the project adds nested file scanning, breakpoint resume, and route synchronization to the socket sending process, and uses a genetic algorithm to solve the multi-objective path selection problem.

3) In remote data exchange, data is transmitted in the form of files. During transmission, a temporary file records the position of the breakpoint in the exchange process, and the data can also be encrypted so that it cannot be stolen.

Testing showed that the platform implements the basic data exchange functions and supports both local and remote data exchange. Data sent from the starting node reaches the terminal node stably over long periods. To date, the platform has proved simple to deploy, runs well, and meets the basic needs of its managers.
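
The breakpoint resume mechanism described in item 3) can be illustrated with a minimal Java sketch. It is not the thesis implementation: the class name `ResumableTransfer`, the method names, and the progress-file format (a single decimal byte offset) are assumptions made for illustration, and a local file copy stands in for the socket stream so that the example runs on its own. The `maxBytes` parameter exists only to simulate an interrupted transfer.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch of breakpoint resume; names and file format are assumptions.
final class ResumableTransfer {

    /** Reads the saved byte offset, or 0 when no progress file exists yet. */
    static long readOffset(Path progressFile) throws IOException {
        if (!Files.exists(progressFile)) {
            return 0L;
        }
        String text = new String(Files.readAllBytes(progressFile), StandardCharsets.UTF_8).trim();
        return text.isEmpty() ? 0L : Long.parseLong(text);
    }

    /**
     * Copies at most maxBytes from source to target, starting at the offset
     * recorded in progressFile. The offset is rewritten after every chunk, so
     * a later call continues where this one stopped. Returns the new offset.
     */
    static long transfer(Path source, Path target, Path progressFile,
                         int chunkSize, long maxBytes) throws IOException {
        long offset = readOffset(progressFile);
        try (RandomAccessFile in = new RandomAccessFile(source.toFile(), "r");
             RandomAccessFile out = new RandomAccessFile(target.toFile(), "rw")) {
            in.seek(offset);   // skip what was already sent
            out.seek(offset);  // continue writing at the same position
            byte[] buffer = new byte[chunkSize];
            long sent = 0;
            while (sent < maxBytes) {
                int toRead = (int) Math.min(chunkSize, maxBytes - sent);
                int read = in.read(buffer, 0, toRead);
                if (read < 0) {
                    break; // end of source file
                }
                out.write(buffer, 0, read);
                offset += read;
                sent += read;
                // Record the breakpoint after each chunk.
                Files.write(progressFile, Long.toString(offset).getBytes(StandardCharsets.UTF_8));
            }
        }
        if (offset >= Files.size(source)) {
            Files.deleteIfExists(progressFile); // transfer complete, breakpoint no longer needed
        }
        return offset;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("resume-demo");
        Path source = dir.resolve("source.bin");
        Path target = dir.resolve("target.bin");
        Path progress = dir.resolve("target.bin.progress");

        byte[] data = new byte[10_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) i;
        }
        Files.write(source, data);

        // First attempt is cut off after 4096 bytes to simulate a dropped connection.
        long first = transfer(source, target, progress, 1024, 4096);
        // Second attempt resumes from the recorded breakpoint and finishes the file.
        long second = transfer(source, target, progress, 1024, Long.MAX_VALUE);

        System.out.println("offset after interruption: " + first);
        System.out.println("offset after resume: " + second);
    }
}
```

In a real node the offset would also have to be agreed with the receiving side before sending resumes; how the platform does this is not stated in the abstract, so the sketch leaves it out.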
Keywords/Search Tags:ETL, Data exchange, Data Monitor, Genetic Algorithm, Performance optimization