Hardware Implementation Of Quasi-Newton Neural Network Training Algorithm Based On Approximate Computation

Posted on: 2019-01-15    Degree: Master    Type: Thesis
Country: China    Candidate: J Liu    Full Text: PDF
GTID: 2428330626452343    Subject: Microelectronics and Solid State Electronics

Abstract/Summary:
In recent years, artificial neural networks (ANNs) have developed rapidly and have been widely used in many fields, such as electronics, economics, and medical treatment. Training is an important step in developing an ANN model, and the quasi-Newton method is considered one of the most effective neural network training methods. However, quasi-Newton training often takes a long time, especially when the ANN architecture is large. To accelerate the training process, a quasi-Newton method based on floating-point arithmetic has been implemented on an FPGA. The implementation consists of six modules: a Pseudorandom Number Generator (PNG) module, a Line Search (LS) module, a Gradient Calculation (GC) module, an H Updating (HU) module, an Objective Function Evaluation (OFE) module, and a Computing Schedule Controller (CSC) module. Evaluation of each module shows that there is still considerable room for optimization, in both resource usage and execution time. This thesis aims to further optimize the hardware implementation of the quasi-Newton algorithm through approximate computation.

First, the resource-usage evaluation shows that the HU module is the most computation- and memory-intensive part, so a fixed-point hardware design of the HU module is proposed. A matrix-property checking and search-direction switching scheme is proposed to address the overflow issue, and a precision-scaling scheme is proposed to deal with the loss of precision caused by the reduced word length. Experimental results show that, compared with the single-precision floating-point version, the mixed-precision design with fixed-point B-matrix updating reduces the usage of LUTs, FFs, DSPs, and BRAMs by 10.9%, 20.2%, 2.2%, and 18.1%, respectively.

Second, the execution-time evaluation shows that the LS module is the most time-consuming module. To address this, an inexact line search method is implemented to replace the original exact line search. For the highest training speed, an end-to-end FPGA version of the quasi-Newton method using the inexact line search is implemented. Moreover, the inexact line search makes hardware-software co-design possible: the objective function evaluation unit in the line search module, which consumes the most computational resources, is moved to the CPU as a speed-resource trade-off. Experimental results show that the end-to-end FPGA design achieves up to a 239x speedup over the software implementation. The FPGA+CPU design is up to 153.1x faster than the software implementation and achieves up to 45% LUT, 29% FF, and 64% DSP reductions compared with the pure FPGA design.
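To illustrate the computation the HU module performs, the sketch below shows a standard BFGS update of the inverse-Hessian approximation H, with a simple curvature check that skips the update when positive definiteness would be lost. This is a generic floating-point reference in Python, not the thesis's fixed-point hardware design; the `eps` threshold and the skip-on-failure behavior are illustrative assumptions standing in for the thesis's matrix-property checking and search-direction switch, whose exact rules are not given in the abstract.

```python
import numpy as np

def bfgs_update(H, s, y, eps=1e-8):
    """One BFGS update of the inverse-Hessian approximation H.

    s = x_{k+1} - x_k (step), y = g_{k+1} - g_k (gradient change).
    If the curvature condition y^T s > 0 fails, H is returned
    unchanged (an illustrative stand-in for the thesis's
    matrix-property check and search-direction switch).
    """
    ys = y @ s
    if ys <= eps:  # curvature check keeps H positive definite
        return H
    rho = 1.0 / ys
    I = np.eye(len(s))
    V = I - rho * np.outer(s, y)
    # H_{k+1} = (I - rho s y^T) H (I - rho y s^T) + rho s s^T
    return V @ H @ V.T + rho * np.outer(s, s)
```

By construction the updated matrix satisfies the secant equation `H_new @ y == s`, which is the invariant a fixed-point HU implementation must preserve within its reduced word length.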
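The contrast between exact and inexact line search can be sketched as follows. One common inexact method is Armijo backtracking, which accepts any step giving sufficient decrease instead of minimizing along the search direction; the abstract does not specify which inexact criterion the thesis uses, so the constants `c` and `tau` here are conventional illustrative choices.

```python
import numpy as np

def backtracking_line_search(f, grad, x, d, alpha0=1.0, c=1e-4, tau=0.5, max_iter=50):
    """Armijo backtracking: shrink the step until f decreases sufficiently.

    Accepts alpha once f(x + alpha*d) <= f(x) + c*alpha*grad(x)^T d,
    rather than solving the 1-D minimization an exact line search requires.
    """
    fx = f(x)
    slope = grad(x) @ d  # directional derivative; negative for a descent direction
    alpha = alpha0
    for _ in range(max_iter):
        if f(x + alpha * d) <= fx + c * alpha * slope:
            return alpha
        alpha *= tau  # geometric shrink instead of exact minimization
    return alpha
```

Because each trial step needs only one objective-function evaluation, this loop maps naturally onto the hardware-software split described above, with the OFE calls offloaded to the CPU.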
Keywords/Search Tags:Neural network training, Quasi-Newton method, FPGA, Approximate computation