
Research And Implementation Of The Data Exchange Module For The Massive Data Analysis Platform

Posted on: 2016-09-24
Degree: Master
Type: Thesis
Country: China
Candidate: W T Lv
Full Text: PDF
GTID: 2298330467993198
Subject: Computer technology
Abstract/Summary:
With economic development and technological progress, the demand for data keeps growing, and research on massive data has become one of the hotspots in computer science. Big data can be processed and analyzed on a Hadoop distributed system, which parallelizes tasks and greatly improves efficiency. The rate at which massive data is transferred between the user and the distributed system affects the efficiency of the whole task.

The massive data analysis platform is an application platform based on cloud computing and data mining. Through a Web interface, users can analyze, process, mine, and store big data, with Hadoop serving as the underlying task execution environment. This thesis implements a common data exchange module for the massive data analysis platform. The module focuses on data transmission between HDFS (Hadoop Distributed File System) and the local file system, relational databases, and URLs. The main work is as follows:

1. The background and current status of data exchange technology are analyzed. The relevant technologies are introduced, mainly Hadoop, HDFS (distributed file system), MapReduce (parallel computing framework), and JDBC (database programming interface). Data exchange tools are discussed, mainly the data exchange interface provided by Hadoop, Sqoop, and the FTP service, and the data transmission performance of the different tools is compared.
2. The requirements of the common data exchange module are analyzed, and data exchange between HDFS and different data sources is designed and implemented. This includes data exchange between HDFS and the local file system based on the FTP service, data exchange between HDFS and URLs, and data exchange between HDFS and relational databases based on the DB interface, Sqoop, and the FTP service.
3. The data exchange module is integrated into the massive data analysis platform, and its functionality is verified.
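
As a rough illustration of the kinds of transfers described in the abstract, the following Python sketch assembles command lines for two standard tools: a Sqoop import from a relational database into HDFS, and an `hdfs dfs -put` upload from the local file system. It is not the thesis's implementation. The function names, the JDBC URL, the user name, the table, and all paths are hypothetical placeholders, and the commands are only printed, not executed.

```python
import shlex


def sqoop_import_args(jdbc_url, username, table, target_dir, num_mappers=4):
    """Build the argument list for importing one table into HDFS with Sqoop.

    All values are supplied by the caller; nothing here is taken from the thesis.
    """
    return [
        "sqoop", "import",
        "--connect", jdbc_url,          # JDBC connection string of the source database
        "--username", username,
        "--table", table,               # source table to import
        "--target-dir", target_dir,     # destination directory in HDFS
        "--num-mappers", str(num_mappers),  # number of parallel map tasks
    ]


def hdfs_put_args(local_path, hdfs_path):
    """Build the argument list for copying a local file into HDFS."""
    return ["hdfs", "dfs", "-put", local_path, hdfs_path]


if __name__ == "__main__":
    # Hypothetical example values, for illustration only.
    db_cmd = sqoop_import_args(
        "jdbc:mysql://db.example.com/sales", "analyst", "orders", "/data/orders", 2
    )
    file_cmd = hdfs_put_args("/tmp/report.csv", "/data/reports/report.csv")
    print(shlex.join(db_cmd))
    print(shlex.join(file_cmd))
```

Building the commands as argument lists rather than as single strings keeps values with spaces or shell metacharacters from being misinterpreted if the lists are later passed to `subprocess.run`.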
Keywords/Search Tags: data exchange, transmission, HDFS, Sqoop, FTP