Deep neural networks are an important class of deep learning models. They typically have many layers: an input layer, an output layer, and a series of hidden layers. Simply stacking more hidden layers, however, still yields only a linear combination of operations; it is the activation function that gives a neural network its nonlinear expressive power. With the development of integrated circuits, more and more neural networks are implemented in hardware, and during the implementation of neural network accelerators the activation function is often the most expensive and difficult part to realize.

This paper proposes three hardware implementation methods for the Swish activation function: piecewise quadratic approximation, lookup table approximation, and a neural-network-like scheme. The piecewise quadratic approximation divides the Swish function into three intervals, fits each with a quadratic function, implements the quadratic evaluation in hardware, and uses a state machine to reduce the number of adders and multipliers. The lookup table approximation uses LUT-based cache units to store the input-output mapping of the Swish function, so that evaluating the function reduces to looking up the output corresponding to each input. In addition, based on a perceptron structure similar to a single-layer neural network, this paper proposes a scheme that implements the Swish activation function with a simple neural network: each input-layer neuron holds a 0 or 1, and together the input-layer neurons form the two's-complement representation of the activation function's input, while the output layer produces the activation function's output.
Structuring the input layer in this way eliminates the multiplications in the network computation, so the network can be evaluated with additions alone, reducing the consumption of computing resources. This paper also designs three hardware-multiplexing architectures for the above schemes, using a unified hardware architecture to realize multiple activation functions on the same device and allow seamless switching between activation functions for specific applications. Finally, this paper designs a simple convolutional neural network hardware accelerator, built from several cache units and an arithmetic array on a dedicated hardware architecture, and uses it to verify the feasibility and accuracy of the proposed Swish implementation methods. Experiments show that, while meeting the accuracy requirements of the neural network, the activation function accelerator designed in this paper combines the advantages of the two traditional general-purpose implementation schemes in both hardware resources and computing efficiency: it consumes about 50.4% fewer hardware resources than the lookup table method, and it computes about 57.1% faster than the piecewise quadratic method.
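The first two schemes can be sketched in software before committing them to hardware. The breakpoints, segment count over the fitted range, and table size below are illustrative choices, not the values used in the paper's design; the quadratic coefficients come from a quick least-squares fit rather than the paper's fitting procedure.

```python
import numpy as np

def swish(x):
    """Exact Swish: x * sigmoid(x)."""
    return x / (1.0 + np.exp(-x))

# Piecewise quadratic approximation: fit one quadratic per interval.
# Breakpoints here are illustrative, not the paper's.
def fit_segments(breaks=(-4.0, -1.5, 1.5, 4.0)):
    segs = []
    for lo, hi in zip(breaks[:-1], breaks[1:]):
        xs = np.linspace(lo, hi, 400)
        segs.append((lo, hi, np.polyfit(xs, swish(xs), 2)))
    return segs

def swish_pwq(x, segs):
    for lo, hi, coeffs in segs:
        if lo <= x <= hi:
            return np.polyval(coeffs, x)
    return x if x > 0 else 0.0  # saturate outside the fitted range

# Lookup table approximation: precompute outputs on a uniform grid,
# so evaluation becomes an index into a stored table.
def build_lut(lo=-8.0, hi=8.0, n=1024):
    xs = np.linspace(lo, hi, n)
    return lo, (hi - lo) / (n - 1), swish(xs)

def swish_lut(x, lut):
    lo, step, table = lut
    idx = int(round((x - lo) / step))
    return table[min(max(idx, 0), len(table) - 1)]
```

The hardware trade-off the paper exploits is visible even here: the quadratic scheme needs multipliers and adders but little storage, while the LUT scheme needs no arithmetic at evaluation time but grows linearly in memory with the required resolution.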
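The multiplication-free idea behind the neural-network-like scheme can also be illustrated in a few lines. The fixed-point format (`BITS`, `FRAC`) and the weight vector `DECODE_W` below are assumptions chosen for illustration; they only demonstrate why a weighted sum over 0/1 inputs needs no multiplier, not the trained weights of the paper's design.

```python
# Assumed fixed-point format: 8-bit two's complement with 4 fractional
# bits (Q4.4). The paper's actual width and weights are not reproduced.
BITS, FRAC = 8, 4

def to_bits(x):
    """Encode x as a BITS-bit two's-complement word, MSB first."""
    q = int(round(x * (1 << FRAC))) & ((1 << BITS) - 1)
    return [(q >> i) & 1 for i in range(BITS - 1, -1, -1)]

def neuron_sum(bits, weights, bias=0.0):
    """Weighted sum over 0/1 inputs: every product w*b collapses to
    'add w if the bit is set', so only adders are needed."""
    acc = bias
    for b, w in zip(bits, weights):
        if b:                 # conditional add replaces a multiplier
            acc += w
    return acc

# One such neuron, with weights -2^(BITS-1-FRAC), 2^(BITS-2-FRAC), ...,
# simply decodes the two's-complement word back to its value.
DECODE_W = [-(2.0 ** (BITS - 1 - FRAC))] + \
           [2.0 ** (BITS - 2 - FRAC - i) for i in range(BITS - 1)]
```

In the proposed scheme, trained weights mapping the input bits to the Swish output take the place of `DECODE_W`, so the whole activation evaluation runs on conditional additions alone.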