Hardware Implementation Of Quasi-Newton Neural Network Training Algorithm Based On Approximate Computation

Posted on: 2019-01-15    Degree: Master    Type: Thesis
Country: China    Candidate: J Liu    Full Text: PDF
GTID: 2428330626452343    Subject: Microelectronics and Solid State Electronics

Abstract/Summary:
In recent years, artificial neural networks (ANNs) have developed rapidly and have been widely used in many fields, such as electronics, economics, and medical treatment. Training is an important step in developing an ANN model, and the quasi-Newton method is considered one of the most effective neural network training methods. However, quasi-Newton training often takes a long time, especially when the ANN architecture is large. To accelerate the training process, a quasi-Newton method based on floating-point arithmetic has been implemented on an FPGA. The implementation consists of six modules: a Pseudorandom Number Generator (PNG) module, a Line Search (LS) module, a Gradient Calculation (GC) module, an H Updating (HU) module, an Objective Function Evaluation (OFE) module, and a Computing Schedule Controller (CSC) module. Evaluation of each module shows that there is still considerable room for optimization, in both resource usage and execution time. This thesis aims to further optimize the hardware implementation of the quasi-Newton algorithm through approximate computation.

First, the resource-usage evaluation shows that the HU module is the most computation- and memory-intensive part, so a fixed-point hardware design of the HU module is proposed. A matrix-property checking and search-direction switching scheme is proposed to address the overflow issue, and a precision-scaling scheme is proposed to deal with the loss of precision caused by the reduced word length. Experimental results show that, compared with the single-precision floating-point version, the mixed-precision design with fixed-point B-matrix updating reduces the usage of LUTs, FFs, DSPs, and BRAMs by 10.9%, 20.2%, 2.2%, and 18.1%, respectively.

Second, the execution-time evaluation shows that the LS module is the most time-consuming module. To address this, an inexact line search method is implemented to replace the original exact line search. For the highest training speed, an end-to-end FPGA version of the quasi-Newton method using the inexact line search is implemented. Moreover, the inexact line search makes hardware-software co-design possible: the objective function evaluation unit in the line search module, which consumes the most computational resources, is moved to the CPU as a speed-resource trade-off. Experimental results show that the end-to-end FPGA design achieves up to a 239x speedup over the software implementation. The FPGA+CPU design is up to 153.1x faster than the software implementation and achieves up to 45% LUT, 29% FF, and 64% DSP reductions compared with the pure FPGA design.
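To illustrate the computation the HU module performs, the sketch below shows a standard BFGS update of the inverse-Hessian approximation H, with a simple curvature check that skips the update when positive definiteness would be lost. This is a generic floating-point reference in Python, not the thesis's fixed-point hardware design; the `eps` threshold and the skip-on-failure behavior are illustrative assumptions standing in for the thesis's matrix-property checking and search-direction switch, whose exact rules are not given in the abstract.

```python
import numpy as np

def bfgs_update(H, s, y, eps=1e-8):
    """One BFGS update of the inverse-Hessian approximation H.

    s = x_{k+1} - x_k (step), y = g_{k+1} - g_k (gradient change).
    If the curvature condition y^T s > 0 fails, H is returned
    unchanged (an illustrative stand-in for the thesis's
    matrix-property check and search-direction switch).
    """
    ys = y @ s
    if ys <= eps:  # curvature check keeps H positive definite
        return H
    rho = 1.0 / ys
    I = np.eye(len(s))
    V = I - rho * np.outer(s, y)
    # H_{k+1} = (I - rho s y^T) H (I - rho y s^T) + rho s s^T
    return V @ H @ V.T + rho * np.outer(s, s)
```

By construction the updated matrix satisfies the secant equation `H_new @ y == s`, which is the invariant a fixed-point HU implementation must preserve within its reduced word length.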
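The contrast between exact and inexact line search can be sketched as follows. One common inexact method is Armijo backtracking, which accepts any step giving sufficient decrease instead of minimizing along the search direction; the abstract does not specify which inexact criterion the thesis uses, so the constants `c` and `tau` here are conventional illustrative choices.

```python
import numpy as np

def backtracking_line_search(f, grad, x, d, alpha0=1.0, c=1e-4, tau=0.5, max_iter=50):
    """Armijo backtracking: shrink the step until f decreases sufficiently.

    Accepts alpha once f(x + alpha*d) <= f(x) + c*alpha*grad(x)^T d,
    rather than solving the 1-D minimization an exact line search requires.
    """
    fx = f(x)
    slope = grad(x) @ d  # directional derivative; negative for a descent direction
    alpha = alpha0
    for _ in range(max_iter):
        if f(x + alpha * d) <= fx + c * alpha * slope:
            return alpha
        alpha *= tau  # geometric shrink instead of exact minimization
    return alpha
```

Because each trial step needs only one objective-function evaluation, this loop maps naturally onto the hardware-software split described above, with the OFE calls offloaded to the CPU.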
Keywords/Search Tags:Neural network training, Quasi-Newton method, FPGA, Approximate computation