
Research On Hardware Implementation Of Activation Function In Neural Network On Many-Core Processor

Posted on: 2022-01-07
Degree: Master
Type: Thesis
Country: China
Candidate: J J Liu
Full Text: PDF
GTID: 2518306308499824
Subject: Computer Science and Technology
Abstract/Summary:
The role of the activation function is to introduce nonlinearity into the neural network so that it can better learn complex nonlinear functions; this is the core of a neural network's ability to solve nonlinear problems. The choice of activation function plays a very important role in training the entire network, affecting both training time and training accuracy. On a real processor, selecting an appropriate nonlinear activation function can improve training accuracy, but evaluating a nonlinear activation function often consumes many compute cycles, and the latency of a software implementation degrades the training of the entire network. A hardware-level implementation reduces this latency but tends to be rigid: often only one fixed computation can be supported. There is therefore still room for research on how to implement activation functions efficiently at the hardware level so as to improve timing. Starting from artificial neural networks, this thesis studies the selection of the activation function in the activation layer, and how to fully account for timing delay under the conditions required for training while supporting flexible use of multiple activation functions and precision modes.

Building on existing research, this thesis improves the hardware implementation of the activation function with respect to its two main weaknesses: long latency and inflexibility (support for only a single function). The design is no longer limited to the implementation of one particular activation function; instead, the choice of activation function becomes an extensible option, and the different approximation accuracies that different neural network training tasks require are taken into account. According to these system requirements, a configurable look-up table structure with hardware and software cooperation is realized. To this end, based on the hardware implementation methods of piecewise-linear fitting and table lookup, this thesis designs a parallel look-up table method that supports extensibility. The main work is summarized as follows:

1. A method of activation function approximation based on a parallel look-up table is proposed. Building on the general-purpose compute core architecture of an existing domestic many-core processor, the idea is that memory supports 16 separate value modes and that one look-up table can hold the approximate output values required by multiple activation layers, turning the serial execution of multiple training networks into parallel execution. Experiments verify that the fitted functions meet the accuracy requirements and that substituting the fitted functions for the original functions in neural network training is feasible.

2. An extensible parallel look-up table instruction model is proposed. Based on a RISC instruction architecture, the instruction description method is analyzed and a dedicated instruction is designed to realize a parallel table lookup that is configurable at the software level. Combined with the instruction format, configuration parameters are constructed to make the parallel lookup instruction extensible: both the approximation interval and the approximation accuracy can be set, reducing execution latency and improving execution efficiency.

3. Corresponding to the proposed parallel look-up table instruction, a parallel look-up table module is designed. It mainly processes the inputs of the lookup instruction according to the configuration parameters: the specific address within the look-up table stored in memory is obtained by decoding, and the memory access is completed. With the approximation error kept within 0.0003, the throughput of the table lookup is increased by a factor of 8, and the efficiency of the whole activation function approximation process is increased by a factor of 3 compared with the traditional method for the same number of fitted values.

The experimental results show that the proposed method achieves better results than traditional hardware implementation methods. The research result has been applied in a many-core processor chip, which has been taped out.
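As a minimal sketch of the piecewise-linear fitting plus table-lookup idea underlying the abstract (not the thesis's actual hardware design), the following Python snippet builds a uniformly spaced look-up table for the sigmoid activation and interpolates linearly between entries; the interval [-8, 8] and 256-entry table size are illustrative assumptions chosen to bring the worst-case error on the interval well under the 0.0003 bound mentioned above.

```python
import numpy as np

def build_sigmoid_lut(lo=-8.0, hi=8.0, entries=256):
    """Sample sigmoid at uniformly spaced breakpoints for piecewise-linear fitting."""
    xs = np.linspace(lo, hi, entries)
    ys = 1.0 / (1.0 + np.exp(-xs))
    return xs, ys

def lut_sigmoid(x, xs, ys):
    """Approximate sigmoid(x) by linear interpolation between adjacent LUT entries."""
    x = np.clip(np.asarray(x, dtype=float), xs[0], xs[-1])  # saturate out-of-range inputs
    step = xs[1] - xs[0]
    idx = np.minimum(((x - xs[0]) / step).astype(int), len(xs) - 2)  # segment index
    frac = (x - xs[idx]) / step                                       # position within segment
    return ys[idx] + frac * (ys[idx + 1] - ys[idx])
```

Because the error of linear interpolation scales with the square of the step size, halving the table step (doubling `entries`) cuts the worst-case error roughly fourfold, which is the kind of accuracy knob a configurable table structure can expose.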
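The address decoding step of item 3 can also be sketched in software. The configuration fields below (base address, interval start, power-of-two step, entry count) are hypothetical names for illustration only and do not reflect the thesis's actual instruction encoding; the point is that one vectorized index computation maps a batch of inputs to table addresses in a single "parallel" lookup step.

```python
import numpy as np

def lookup_addresses(x_vec, base, x_min, log2_step, entries):
    """Map a vector of inputs to table addresses per the (illustrative) config word.

    base      - base address of the table in memory
    x_min     - left end of the approximation interval
    log2_step - breakpoint spacing as a power of two (controls accuracy)
    entries   - number of table entries
    """
    step = 2.0 ** log2_step
    idx = np.floor((np.asarray(x_vec, dtype=float) - x_min) / step).astype(int)
    idx = np.clip(idx, 0, entries - 1)   # saturate inputs outside the interval
    return base + idx                    # word-addressed table entries
```

Example: with `x_min=-8.0`, `log2_step=-2` (step 0.25) and 64 entries starting at address `0x1000`, the inputs `[-1.0, 0.0, 2.5]` fall into segments 28, 32 and 42, so the returned addresses are `0x1000 + 28`, `0x1000 + 32` and `0x1000 + 42`. Exposing the interval and step in a configuration word is what lets one instruction serve multiple activation functions and accuracy modes.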
Keywords/Search Tags:activation function, linear approximation, look-up table, RISC instruction, hardware implementation