Distributed computing studies how to perform computation on distributed systems: multiple computers are connected through a network to form a distributed system, the data to be processed is divided into several parts that are sent to the computers scattered across the system and computed in parallel, and the partial results are then combined to obtain the final result. One popular distributed computing framework is MapReduce, which proceeds in a Map phase, a Shuffle phase, and a Reduce phase. Research has shown that, in this framework, the time spent in the Shuffle phase accounts for the largest proportion of the total runtime. To reduce the communication load during the Shuffle phase, Li et al. proposed coded distributed computing, which reduces the communication load between servers by increasing the computation load on each server, and they showed that this scheme achieves a fundamental trade-off between computation and communication loads. However, while reducing the communication load, the scheme also requires a very large number of input files and output functions, which makes coded distributed computing difficult to apply in practice.

To address the problems of requiring too many output functions and of the high communication load in the Shuffle phase, this paper proposes two schemes that reduce the number of output functions and the communication load. The details are as follows:

(1) We propose a coded distributed computing scheme based on fewer output functions. This scheme uses a modulo operation to group the nodes that compute the output functions, and each output function is computed by all nodes within its assigned group. Because a single output function thus serves the computation of multiple nodes, the scheme requires fewer output functions overall. We also show that, as the number of servers increases, 1) compared with the coded distributed computing scheme proposed by Li et al., this scheme significantly reduces the number of output functions; and 2) the ratio of the communication load of this scheme to that of Li et al.'s scheme is less than 1.9981.

(2) We propose a coded distributed computing scheme based on master-aided compression. This scheme considers a distributed computing system with multiple edge computing nodes and a master node, where the master node helps the edge nodes compute the output functions. We introduce compression into master-aided coded distributed computing: the edge nodes and the master node each compress the intermediate values of several identical functions into pre-combined values; coding is then applied to the pre-combined values of different functions to create coded multicast messages; finally, the coded messages are multicast to the other edge nodes. We show that, compared with the original scheme, the new scheme significantly reduces the communication load in the Shuffle phase of distributed computing.

In summary, the coded distributed computing scheme based on fewer output functions reduces the number of output functions while maintaining a low communication load, and the coded distributed computing scheme based on master-aided compression mainly reduces the communication load.
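
To make the grouping idea in scheme (1) concrete, the following is a minimal Python sketch of assigning nodes to output functions with a modulo rule, where every node in a group computes the same output function. The parameter names, the number of groups, and the one-function-per-group assumption are ours for illustration and are not the exact construction of the scheme.

```python
# Illustrative sketch: group K nodes by a modulo rule and let every node in a
# group compute the same output function.  The parameters and the
# one-function-per-group assumption are illustrative only.

def group_nodes_by_modulo(num_nodes: int, num_groups: int) -> dict:
    """Return {group_id: [node ids]} with node k placed in group k % num_groups."""
    groups = {g: [] for g in range(num_groups)}
    for k in range(num_nodes):
        groups[k % num_groups].append(k)
    return groups


def output_function_per_node(num_nodes: int, num_groups: int) -> dict:
    """Map each node to the single output function (indexed by group id) it computes."""
    return {k: k % num_groups for k in range(num_nodes)}


if __name__ == "__main__":
    K, G = 6, 3  # 6 nodes, 3 groups -> only 3 output functions are needed
    print(group_nodes_by_modulo(K, G))    # {0: [0, 3], 1: [1, 4], 2: [2, 5]}
    print(output_function_per_node(K, G))
```

In this toy setting, six nodes are served by only three output functions, which is the sense in which grouping reduces the number of output functions required.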
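
The Shuffle-phase idea behind scheme (2) can likewise be sketched in a toy form: a sender first pre-combines (here, simply sums) the intermediate values of the same output function, then XORs pre-combined values intended for different receivers into one coded multicast message, and each receiver cancels the part it can compute locally to recover the part it is missing. The use of integer sums and XOR below is purely illustrative and stands in for the actual compression and coding operations.

```python
# Toy sketch of pre-combining and coded multicasting, assuming integer-valued
# intermediate values; the sum/XOR operations are illustrative placeholders.

from functools import reduce


def precombine(intermediate_values: list) -> int:
    """Pre-combine several intermediate values of the same function into one value."""
    return sum(intermediate_values)


def encode_multicast(precombined_values: list) -> int:
    """XOR pre-combined values of different functions into one coded multicast message."""
    return reduce(lambda a, b: a ^ b, precombined_values)


def decode(coded_message: int, known_precombined: list) -> int:
    """A receiver cancels the pre-combined values it already knows to recover the missing one."""
    return reduce(lambda a, b: a ^ b, known_precombined, coded_message)


if __name__ == "__main__":
    # Pre-combined value needed by node A (for function f1) and by node B (for function f2).
    p_for_A = precombine([3, 5, 9])   # 17
    p_for_B = precombine([2, 4])      # 6
    msg = encode_multicast([p_for_A, p_for_B])   # one multicast serves both nodes
    assert decode(msg, [p_for_B]) == p_for_A     # node A can compute p_for_B locally
    assert decode(msg, [p_for_A]) == p_for_B     # node B can compute p_for_A locally
```

Because a single coded message simultaneously delivers the missing pre-combined value to several receivers, fewer transmissions are needed in the Shuffle phase, which is the source of the communication-load reduction.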