The vigorous development of big data has ushered in a new peak for machine learning. Applications of artificial intelligence are gradually reshaping people's understanding of AI, but they also bring many new problems. First, there is the problem of data. Training highly available and highly accurate models usually requires large-scale data for support, yet before data can be used for training it must pass through multiple stages such as data collection, data preprocessing, and data annotation, which demand considerable labor and cost. Second, there is the problem of computing power. Huge data volumes and complex machine learning models place higher demands on training-side computing power, and small training clusters often cannot meet the computational requirements of large-scale models. In view of this, distributed machine learning offers a way to address the computing-power problem, and federated learning, built on distributed machine learning, further addresses the data problem. Federated learning, as an important field in modern machine learning, plays a key role in many data-privacy-sensitive scenarios. The federated learning field currently faces the following problems: 1. Existing federated learning methods mainly adopt a network topology built around a central server, which is not applicable in some real-world scenarios. For example, owing to the sensitivity of their computing tasks, governments, banks, and other institutions will not rashly hand private computing tasks to third-party computing providers, and the communication cost of maintaining a central server is often unaffordable for small and medium-sized third-party service providers. There may also be no suitable central server capable of connecting all users, which motivates a decentralized topology for federated learning. 2. Federated learning also requires that data sets be divided
equally among participants. However, how to implement interaction between horizontally and vertically partitioned participants, so that federated learning can support a more diverse set of participants, is a challenging problem. How to realize training between heterogeneous clients, and heterogeneous federated learning under a decentralized topology, is likewise an urgent problem. 3. The cryptographic encryption techniques in existing federated learning schemes often incur enormous computing or communication overhead. Balancing computing overhead, communication overhead, and protocol security is therefore key to keeping the model usable. To address these problems, the contributions of the decentralized heterogeneous federated algorithm are as follows: 1. Horizontal and vertical data-partitioning client interaction algorithms are proposed to meet the decentralization requirement of federated learning under a decentralized topology. Through a P2P approach, decentralized horizontal federated learning in the cross-silo scenario and a decentralized vertical federated learning scheme are realized. At the same time, the problems of feature fragmentation in vertical federated learning, transmission of models trained only on local features, and reliance on cryptographic encryption strategies are respectively overcome. 2. The horizontal-and-vertical heterogeneous client interaction algorithm is proposed to support federated training under heterogeneous data distributions. To enable federated learning over heterogeneous data partitions without a central aggregation server, the algorithm realizes federated interaction across horizontally and vertically partitioned data through model segmentation, model alignment, and other
schemes, while introducing no additional dependence on a third-party central aggregator. 3. The local iteration method is proposed to relieve the computational bottleneck that homomorphic encryption imposes on federated learning. In the training process of vertical federated learning, the large-number arithmetic of homomorphic encryption is the main computational bottleneck; the local iteration method reduces encryption time by cutting the number of repeated large-number operations, lowering the computational difficulty of the decentralized heterogeneous federated interaction algorithm. In summary, a decentralized federated learning communication protocol covering both horizontal and vertical heterogeneous clients can be realized by implementing the decentralized heterogeneous federated interaction algorithm. The decentralized architecture is implemented by means of broadcast, and training between heterogeneous clients is completed by combining homomorphic encryption with secret sharing. The local iteration method reduces, at the protocol level, the heavy computing overhead brought by homomorphic encryption. Experiments show that the decentralized heterogeneous federated interaction algorithm accelerates the encryption process by more than 10 times, greatly reducing the difficulty of encrypted computation, with a precision loss of less than 5%.
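The broadcast-based decentralized aggregation described above can be illustrated with a minimal sketch. This is not the paper's implementation; it assumes a full-mesh network in which every client broadcasts its parameters and replaces them with the element-wise mean of all received models, removing the need for a central server. All names are illustrative.

```python
# Toy sketch of one server-free aggregation round (hypothetical names).
# Assumes every client can broadcast to every peer (full mesh).
def broadcast_average(models):
    """Each client broadcasts its model to all peers and adopts the
    element-wise mean of everything received, so all clients converge
    to the same aggregate without a central server."""
    n = len(models)
    dim = len(models[0])
    mean = [sum(m[d] for m in models) / n for d in range(dim)]
    return [mean[:] for _ in range(n)]

# three clients, each holding a 2-parameter local model
clients = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
clients = broadcast_average(clients)
# every client now holds the common average [3.0, 4.0]
```

In a real cross-silo deployment the broadcast would go over the network and only weighted partial sums would be exchanged, but the aggregation logic is the same.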
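The model-segmentation idea behind the vertical interaction algorithm can also be sketched. In this toy example (names and numbers are illustrative, and sample IDs are assumed to be pre-aligned), two parties hold disjoint feature columns of the same samples; a linear model is segmented so that each party scores only its own features, and the aligned partial scores are summed to obtain the full prediction without either party revealing raw features.

```python
# Toy sketch of vertically partitioned scoring via model segmentation.
# Each party owns a disjoint subset of feature columns for the SAME samples.
def partial_score(features, weights):
    """A party computes its contribution using only the columns it owns."""
    return [sum(f * w for f, w in zip(row, weights)) for row in features]

# party A owns features 0-1, party B owns features 2-3 (pre-aligned rows)
xa = [[1.0, 0.0], [0.5, 2.0]]
xb = [[2.0, 1.0], [0.0, 1.0]]
wa, wb = [0.5, 1.0], [0.25, -1.0]   # each party's segment of the model

# aligned partial scores are summed to form the joint prediction
scores = [a + b for a, b in zip(partial_score(xa, wa),
                                partial_score(xb, wb))]
# scores == [0.0, 1.25]
```

In the actual protocol these partial scores would be exchanged under encryption or secret sharing rather than in the clear.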
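The local iteration method's effect on encryption cost can be made concrete with a counting sketch. This is a simplified model, not the paper's protocol: `encrypt` is a counting stub standing in for an expensive Paillier-style ciphertext operation, and the point is only that running E cheap plaintext steps between encrypted exchanges divides the number of large-number encryptions by E.

```python
# Toy sketch of the local-iteration idea: encrypt once per communication
# round instead of once per gradient step. `encrypt` is a counting stub,
# not real homomorphic encryption.
enc_calls = 0

def encrypt(x):
    global enc_calls
    enc_calls += 1          # stand-in for costly large-number arithmetic
    return x

def train(total_steps, local_iters):
    """Run `total_steps` gradient steps, exchanging an encrypted update
    only every `local_iters` steps."""
    global enc_calls
    enc_calls = 0
    w = 0.0
    for step in range(total_steps):
        w -= 0.1 * (w - 1.0)             # cheap local plaintext update
        if (step + 1) % local_iters == 0:
            encrypt(w)                    # expensive encrypted exchange
    return enc_calls

naive = train(100, 1)     # encrypt every step
local = train(100, 10)    # encrypt every 10th step -> 10x fewer calls
```

With `local_iters = 10` the encryption count drops tenfold, which mirrors the order of magnitude of the reported speedup; the accompanying precision loss in the real algorithm comes from the updates that are deferred between exchanges.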