Design And Implementation Of Metadata Management System For Big Data Processing Of Transportation Logistics

Posted on:2019-01-09

Degree:Master

Type:Thesis

Country:China

Candidate:R Hu

GTID:2428330596996675

Subject:Computer technology

Abstract/Summary:

Big data processing technology has been studied in depth and applied widely across many industries. This thesis takes the construction of a logistics platform for Huai'an City as its research object, with MapReduce as the basic processing framework and Hadoop as the supporting technology. The platform provides services for transportation and government departments and carries out big data processing and related work on the basis of SaaS technology.

Application development for transportation logistics big data processing spans many fields, so one key problem of this thesis is how to integrate multiple data sources and multiple objectives. A second key problem concerns workflows in which MapReduce is the execution engine and Oozie assembles the processing steps: how to avoid the low execution efficiency that Oozie shows because of the data dependencies between its nodes. To address both problems, the thesis builds on MapReduce, Hadoop, Oozie, and related technologies, and lets the MapReduce jobs of upstream and downstream nodes run in parallel to optimize the way transportation logistics big data is processed. Final testing and practical application indicate that the system is effective and feasible.

Compared with other systems, the work has the following features.

First, to address the low resource utilization of MapReduce workflows, it uses the characteristics of Reduce tasks inside MapReduce to implement a partial execution mode. A downstream node can begin part of its work before the upstream node has finished, so the upstream and downstream nodes can run at the same time. This can greatly improve the efficiency of the system.

Second, on top of the original Hadoop, it adopts a MapReduce job structure that supports partial execution and appended inputs. Input data can be appended to a MapReduce job that is already running, which lets the upstream and downstream modules operate in parallel.

Third, on top of the original Oozie, it implements a workflow mode that supports parallel operation of upstream and downstream MapReduce jobs. This module has a dual-executor structure: it identifies the upstream and downstream MapReduce jobs contained in a workflow and executes them accordingly. The test results show that when the number of Reduce tasks is greater than the number of Reduce slots that the cluster can run concurrently, the partial execution mode improves efficiency by about 20%.

Finally, the thesis studies big data processing for the multi-decision module, focusing on an integrated development environment for developers of big data processing modules. Building on the Hadoop Eclipse plugin contained in this integrated environment, it proposes a test sandbox module in which developers can run specific tests, carry out actual deployments, and so on.
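The condition stated in the abstract (a gain when Reduce tasks outnumber the concurrent Reduce slots) can be illustrated with a simple wave model. The sketch below is an illustration only and is not taken from the thesis: it assumes that Reduce tasks run in equal-length waves, that the downstream node handles one wave of upstream output at a time, and that downstream work does not compete with the upstream Reduce slots. The function names and the timings are hypothetical, and the model is not meant to reproduce the reported figure of about 20%.

```python
import math


def wave_count(reduce_tasks: int, reduce_slots: int) -> int:
    """Number of waves needed to run all Reduce tasks on the available slots."""
    return math.ceil(reduce_tasks / reduce_slots)


def serial_makespan(waves: int, reduce_time: float, downstream_time: float) -> float:
    """Downstream starts only after every upstream Reduce wave has finished."""
    return waves * reduce_time + waves * downstream_time


def partial_makespan(waves: int, reduce_time: float, downstream_time: float) -> float:
    """Downstream consumes each wave's output as soon as that wave finishes.

    Assumption (not from the thesis): downstream processes one wave of
    output at a time and never blocks the upstream Reduce slots.
    """
    downstream_free_at = 0.0
    for wave in range(1, waves + 1):
        upstream_done_at = wave * reduce_time
        start = max(upstream_done_at, downstream_free_at)
        downstream_free_at = start + downstream_time
    return downstream_free_at


if __name__ == "__main__":
    # Hypothetical cluster: 10 Reduce slots, 10 time units per Reduce wave,
    # 4 time units of downstream work per wave of upstream output.
    for tasks in (8, 30):
        waves = wave_count(tasks, 10)
        print(tasks, waves,
              serial_makespan(waves, 10.0, 4.0),
              partial_makespan(waves, 10.0, 4.0))
```

In this model the two modes take the same time when all Reduce tasks fit in a single wave, because there is no earlier wave whose output the downstream node could start on. Once the tasks exceed the slots and several waves are needed, downstream work overlaps with the later upstream waves and the total time drops.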

Keywords/Search Tags:

big data processing, MapReduce, Hadoop, Oozie

Related items

1	Design And Implementation Of Traffic And Logistics Big Data Process System Based On Hadoop
2	Massive Data Processing Based On Hadoop2.0
3	The Design And Implementation Of CPD Data Report Based On Oozie
4	Research On Techniques Of Incremental Processing For Big Data Based On Hadoop Platform
5	The Research On Distributed Task Scheduling Algorithms Based On Hadoop Platform
6	Research On Map-reduce Application Based On Hadoop
7	Research On Map-Reduce Application Based On Hadoop
8	The Design And Implementation Of An Online Video Site Data Statistic System
9	Web Log Analysis System Based On Hadoop Platform
10	The Help Of Book Lessons For Early Education Based On Hadoop