Distributed Data Aggregation And Big Data Processing In Cyber Physical Space

Posted on: 2017-12-22; Degree: Doctor; Type: Dissertation
Country: China; Candidate: H L Lu
GTID: 1318330536467192; Subject: Computer Science and Technology
Abstract/Summary:
As a combination of cyber space and physical space, cyber physical space integrates the application demands of each object in terms of information sensing, information exchange, and computing ability. The relations among objects carry information in three dimensions, namely the physical domain, the cyber domain, and the social domain, and are large-scale, distributed, and dynamic. How to design and develop applications in cyber physical space is therefore becoming a hot research area. For information sensing, the challenges lie in distributed design and in integrating information about large-scale objects and their relations. Moreover, the relationships among objects include different types of information required by applications, for example the relative position in physical space and the logical relations expressed by the social attributes associated with a user. For information aggregation, the challenge comes from the dynamics of objects and of the relationships between them; these dynamics include changes in state and position and the establishment and invalidation of object relationships. For information exchange, the challenge is to efficiently process large-scale heterogeneous data, which includes processing specific types of data and processing data on different platforms for various applications.

This dissertation studies how to aggregate information effectively and how to process data efficiently. Specifically, we study how to extract the relationships between objects efficiently in a distributed manner; how to overcome the dynamics introduced by movement; how to meet applications' demands for constructing stable relationships (and thus stable sub-networks); how to design and implement a data processing system for a certain type of data; and how to integrate different data processing frameworks and optimize task scheduling for different types of workflows and tasks. The main work comprises the following four aspects:

1. Low-cost relationship discovery mechanism based on a distributed data aggregation model: By establishing a distributed relationship discovery model in cyber-physical space, this dissertation takes multiple types of relations into account and designs MSTRD, a distributed relationship discovery algorithm based on the minimum spanning tree. In this way, the relationship discovery process in cyber-physical space changes from a centralized manner to a distributed manner. At the same time, the efficiency of the discovery process is ensured by reducing the amount of data transmission and the time consumption. Extensive simulations show that MSTRD outperforms centralized strategies. In some scenarios, MSTRD reduces data transmission by about 1/3 and time consumption by about 1/2.

2. Stable sub-network construction mechanism for distributed dynamic aggregation networks: By establishing a model of distributed sub-networks in cyber physical space, this dissertation designs MRMC, a distributed object selection algorithm. The mechanism balances enlarging the number of entities against prolonging the remaining unchanged time (RUT) of object relationships. A large number of simulations validate the algorithm, and the results show that MRMC can enlarge the RUT of relationships between objects by up to two times.

3. Matrix-abstraction-based large-scale data processing framework in cyber-physical space: This dissertation designs a bulk key matrix (BKM) data structure for managing data both in memory and out of memory, together with an optimization strategy and a parallel algorithm for data reorganization and asynchronous computation. Unary and binary operators are also designed to support general matrix computation. Based on these elements, a general framework for matrix computation, called Matrix Map, is developed. Simulation results show that, compared with Spark, Matrix Map handles some common graph algorithms with up to a 10-fold performance improvement.

4. Big data platform integration for diverse data processing in cyber-physical space: Given the diversity and complexity of big data platforms, this dissertation analyzes their advantages and disadvantages and proposes integrating the platforms to provide data processing services. We also design a framework for scheduling different data processing tasks on the integrated platform and develop MMRC, a genetic-algorithm-based optimization scheme for task clustering and scheduling, to improve data processing efficiency and reduce resource cost. Simulation results show that, compared with single-platform data processing, MMRC reduces the average processing time of each task by about 10%, while resource usage is only about 30%.
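
The abstract does not give the details of MSTRD, so the following is only a minimal, centralized sketch of the building block it names: a minimum spanning tree over objects whose edges are weighted by a communication cost. It uses standard Kruskal's algorithm with union-find, not the distributed algorithm from the dissertation. The object names, the costs, and the function `minimum_spanning_tree` are hypothetical and chosen for illustration.

```python
from typing import Dict, List, Tuple

# (cost, object_a, object_b); the cost is an assumed link/communication cost.
Edge = Tuple[float, str, str]


def minimum_spanning_tree(nodes: List[str], edges: List[Edge]) -> List[Edge]:
    """Return the edges of a minimum spanning tree using Kruskal's algorithm."""
    # Union-find forest: every object starts in its own component.
    parent: Dict[str, str] = {n: n for n in nodes}

    def find(x: str) -> str:
        # Follow parent links to the root, halving the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree: List[Edge] = []
    # Consider the cheapest edges first; keep an edge only if it joins
    # two components that are not yet connected.
    for cost, a, b in sorted(edges):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b
            tree.append((cost, a, b))
    return tree


# Hypothetical objects and link costs, for illustration only.
nodes = ["phone", "watch", "car", "beacon"]
edges = [
    (1.0, "phone", "watch"),
    (4.0, "phone", "car"),
    (2.0, "watch", "car"),
    (3.0, "car", "beacon"),
    (5.0, "phone", "beacon"),
]

tree = minimum_spanning_tree(nodes, edges)
total_cost = sum(cost for cost, _, _ in tree)
print(tree, total_cost)
```

A tree of this kind connects all objects with the smallest total link cost, which is one plausible reason a spanning-tree backbone could reduce the data transmitted while relationships are aggregated; how MSTRD builds and uses the tree in a distributed setting is described in the full text, not here.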
Keywords/Search Tags: Cyber Physical Space, Big Data, Matrix Processing, Task Schedule, Entity Relationship, Distributed Processing