The rise of distributed computing frameworks such as MapReduce has made distributed computing greatly improve the processing efficiency of massive data.However,when we implement a Coded Distributed Computing(CDC)scheme with an optimal communication load,the number of input files and output functions required by the system will explode as the number of nodes in the system increases,which results in the actual processing time of the system being significantly higher than the theoretically predicted value.At the same time,data exchange between nodes will inevitably become more frequent,which will lead to an increase in communication load.In order to solve these two problems at the same time,this paper studies coded distributed computing from the perspective of combinatorial design,and proposes several schemes.This paper first uses a classical structural balanced incomplete block design in combinatorics to formulate a data exchange strategy between nodes,and proves that these nodes can get the required data after data exchange.We then construct two cascaded coding distributed computing schemes using the combined structures of several known symmetric balanced incomplete block designs.In the first type of scheme,we design the allocation scheme of input files and output functions by symmetrically balancing the property that the number of blocks is equal to the size of the set of elements in the incomplete block design.Then we design a data exchange scheme between nodes by utilizing the properties of balanced incomplete block design that some nodes store files and responsible output functions in this scheme.In the second type of scheme,we design the allocation scheme of files and functions by using the complementary set of elements set by blocks,and also use the files stored by some nodes and the responsible output functions to meet the requirement of balanced incomplete block design Nature designs a scheme for data exchange between nodes.We compared these two schemes with the current mainstream coded distributed computing schemes,and found that their communication load and the number of input files and output functions have achieved good results.Specifically,in each new scheme,the number of input files and output functions and the number of computing nodes are less than any of the current mainstream schemes,and their number of input files and output functions has a linear relationship with the number of nodes.Meanwhile,when the number of nodes is very large,the communication load in each new scheme is close to the optimal communication load of the CDC scheme proposed by Li et al.This paper also proves from a mathematical point of view that when the number of nodes increases,the communication load of the two schemes is very close to that of the CDC scheme proposed by Li et al. |