Massive Data Processing In Complex Application Scenarios

Posted on:2016-08-13

Degree:Master

Type:Thesis

Country:China

Candidate:Z Dong

Full Text:PDF

GTID:2308330461487386

Subject:Computer Science and Technology

Abstract/Summary:

With the rapid development of computer and network technology, more and more of the information produced by human activity is digitized. Not only is the volume of data growing dramatically, but the data sources are also increasingly heterogeneous. At the same time, the value of data is increasingly recognized, and people expect to discover useful information and patterns in vast amounts of diverse data. How to process massive data efficiently has therefore become a hot research topic in recent years, attracting much attention from both academia and industry.

In general, there are two modes of massive data processing: offline processing and online processing. In offline processing, the data has already been stored, so it is static and historical, and the emphasis is on throughput. In online processing, the data flows in continuously, so it is dynamic, and the emphasis is on real-time response. In recent years, both offline and online processing have been studied widely, and many excellent theories and products have emerged.

In this thesis, we focus on a general class of application scenarios in which both offline and online processing of massive data are needed. The thesis makes two main contributions:

1) We propose a distributed system architecture that can be adopted in the application scenarios described above. First, the architecture efficiently supports the ingestion of high-speed incoming data from multiple data sources. Second, it provides consistent data to the downstream offline and online processing modules in a scalable way. Third, it supports aggregating the results of offline and online processing at the application layer. We analyze the soundness of the architecture theoretically and demonstrate its effectiveness through experiments and a real application.

2) We propose a decentralized method for task assignment in a distributed environment. The method has no central node; every participating node runs the assignment algorithm independently, which avoids the system-wide failure that occurs in master-slave methods when the central node encounters an exception. We also discuss the effectiveness of the method in theory and give a comparative analysis of our method and the centralized approach.

The architecture and method proposed in this thesis have been used in a real production environment for a long period of time, demonstrating their practical value.
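
The abstract does not say which algorithm the nodes run for the decentralized task assignment in contribution 2. As an illustration only, the following is a minimal Python sketch of one well-known way to get that behavior, rendezvous (highest-random-weight) hashing, and is not the thesis's actual method. It assumes every node sees the same list of live nodes (for example through a shared membership service); the names `score`, `owner`, `my_tasks`, and the node and task identifiers are all hypothetical.

```python
import hashlib


def score(node: str, task: str) -> int:
    """Deterministic pseudo-random weight for a (node, task) pair."""
    digest = hashlib.sha256(f"{node}|{task}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def owner(task: str, live_nodes: list[str]) -> str:
    """Return the node with the highest weight for this task.

    Every node evaluates the same pure function on the same inputs,
    so all nodes agree on the owner without asking a central master.
    The node name breaks ties so list order never matters.
    """
    if not live_nodes:
        raise ValueError("no live nodes")
    return max(live_nodes, key=lambda node: (score(node, task), node))


def my_tasks(me: str, tasks: list[str], live_nodes: list[str]) -> list[str]:
    """Tasks that node `me` should run, computed locally on `me`."""
    return [task for task in tasks if owner(task, live_nodes) == me]


if __name__ == "__main__":
    # Hypothetical cluster and task names, for illustration only.
    nodes = ["node-a", "node-b", "node-c"]
    tasks = [f"partition-{i}" for i in range(12)]

    for node in nodes:
        print(node, my_tasks(node, tasks, nodes))

    # Simulate the failure of node-b: survivors recompute locally.
    survivors = [n for n in nodes if n != "node-b"]
    for node in survivors:
        print(node, my_tasks(node, tasks, survivors))
```

In this sketch, when a node fails, only the tasks it owned are reassigned; tasks held by the surviving nodes stay where they are, and no single node's failure stops the assignment from being computed.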

Keywords/Search Tags:

massive data processing, architecture, decentralized, task assignment

Related items

1	Research On Decentralized Architecture For Massive MIMO System
2	Research On Theoretical Method Of Distributed And Robust MIMO Transmission
3	The Design And Implementation Of Task-based Massive Insurance Data Processing System
4	Decentralized task allocation and task execution using autonomous agents cooperating in dynamic mission
5	Research On Task Distribution Algorithms In Mobile Edge Computing
6	Research On The Method Of Hunting Task Assignment Based On Energy Balance
7	Research And Implementation Of Task Assignment Method Based On Data Features
8	Development of a task assignment tool to customize job descriptions and close person-job fit gaps
9	Analysis And Optimization Of Massive Data Processing On High Performance Computing Architecture
10	Research On Ticket Price Forecast And Task Assignment In Cloud Computing