
Coded Distributed Computing Schemes Based On Nodes Grouping Method

Posted on: 2022-12-29
Degree: Master
Type: Thesis
Country: China
Candidate: L L Zhou
Full Text: PDF
GTID: 2518306770972069
Subject: Automation Technology
Abstract/Summary:
Distributed computing breaks an application into many small parts that are then distributed to multiple computers (nodes). As the amount of data and the number of nodes increase, data exchange between the nodes becomes more frequent and consumes more time, which degrades the performance of distributed computing. To address this problem, earlier work introduced coding into distributed computing for the first time and proposed Coded Distributed Computing (CDC), which uses carefully designed redundant computation at the nodes to create coded multicast opportunities between them, reducing the amount of data exchanged and therefore the communication load. That work also proves an inverse relationship between the computation load and the communication load in CDC: if the computation load increases by a factor of r, the communication load decreases by a factor of r. It further gives a scheme that achieves the optimal communication load. However, implementing that scheme requires a large number of input files and output functions, which makes it difficult to apply in practice. To reduce the number of input files and output functions required, this thesis considers two scenarios: (1) a homogeneous computing network, in which every node has the same storage, computing and transmission capacity; (2) a heterogeneous computing network, in which the storage, computing and transmission capacities of the nodes are not necessarily the same. The specific research contents are as follows:

(1) In the homogeneous scheme, the nodes are grouped based on their labels to reduce the number of input files. To reduce the number of required output functions, each output function is computed by at least all the nodes in one group, so this scheme suits the case where each output function is computed by multiple nodes. To ensure that every node transmits the same number of signals during the communication phase, a method is proposed in which nodes in the same group transmit to one another. Experimental results show that, under certain parameter constraints, the proposed scheme reduces both the number of input files and the number of output functions, and that the ratio of its communication load to the optimal communication load is always less than 2.993.

(2) In the heterogeneous scheme, the case where each output function is computed by multiple nodes is considered. To this end, several homogeneous systems are grouped together, and the corresponding CDC schemes are constructed for a given number of times that each output function is computed. To reduce the number of required input files, all nodes in the same group store the same number of input files, while nodes in different groups store different numbers of input files. To reduce the number of required output functions, all nodes in the same group are assigned the same output function. During the communication phase, all nodes, across the different groups, can transmit signals. Experimental results show that, under certain parameter constraints, the proposed scheme reduces both the number of input files and the number of output functions, and that the ratio of its communication load to the optimal communication load is always less than 3.
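The inverse relationship and the file-count problem described in the abstract can be illustrated with a small numerical sketch. The abstract itself does not state any formulas, so the closed forms below are assumptions taken from the form commonly quoted for the original CDC scheme with K nodes, computation load r, and each output function computed by a single node: an uncoded load of 1 - r/K, a coded load of (1/r)(1 - r/K), and a file count that must be a multiple of C(K, r). The function names are hypothetical, and this is not the grouping scheme proposed in the thesis.

```python
from fractions import Fraction
from math import comb


def uncoded_load(K: int, r: int) -> Fraction:
    """Assumed normalized communication load without coding: 1 - r/K."""
    return Fraction(K - r, K)


def coded_load(K: int, r: int) -> Fraction:
    """Assumed normalized load of the original CDC scheme: (1/r) * (1 - r/K)."""
    return Fraction(1, r) * Fraction(K - r, K)


def file_granularity(K: int, r: int) -> int:
    """Assumed requirement of the original scheme: the number of input
    files must be a multiple of C(K, r)."""
    return comb(K, r)


if __name__ == "__main__":
    K = 12  # hypothetical number of nodes
    print("r  uncoded  coded  C(K,r)")
    for r in range(1, 7):
        print(r, uncoded_load(K, r), coded_load(K, r), file_granularity(K, r))
```

Under these assumptions, coding divides the communication load by exactly r, while the file-count requirement C(K, r) grows quickly with r. That growth is the practical obstacle the grouping-based schemes aim to reduce, at the cost of a communication load within a constant factor of the optimum.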
Keywords/Search Tags: Map-Reduce framework, grouping design, coded distributed computing, homogeneous system, heterogeneous system