
Hardware Acceleration Of Quasi-Newton Method And Its Application In Neural Network Training

Posted on: 2018-02-26    Degree: Master    Type: Thesis
Country: China    Candidate: R Y Sang    Full Text: PDF
GTID: 2348330542479455    Subject: Microelectronics and Solid State Electronics
Abstract/Summary:
Artificial neural networks (ANNs), inspired by the synaptic structure of animal brains, are mathematical models that process information in parallel and have been widely applied in biology, medicine, electronics, economics, and other fields. Training, which adjusts the network weights so that the actual outputs approach the expected values, is the most important step in developing a neural network model. Many optimization algorithms are used for training, such as gradient descent, the conjugate gradient method, the quasi-Newton (QN) method, and particle swarm optimization. The quasi-Newton method is popular for its fast convergence and its avoidance of explicit second-derivative computation. However, the iterative optimization process is very time-consuming when implemented in software and cannot meet the needs of embedded applications, so a hardware-accelerated approach to neural network training is needed. With their high parallelism, flexibility (compared to ASICs), low power (compared to GPUs), and increasing density, Field Programmable Gate Arrays (FPGAs) have recently become an attractive platform for accelerating scientific computation. This thesis therefore proposes an FPGA-based customized hardware implementation of neural network training using the quasi-Newton method.

By analyzing the quasi-Newton method, the algorithm is divided into four modules: Gradient Computation (GC), H-matrix Updating (HU), Line Search (LS), and Objective Function Evaluation (OFE). Each module is implemented as a hardware block in Verilog, with its architecture customized to the operations it performs and with pipelining and module-reuse techniques applied. Two hardware architectures are presented: a DFP architecture and a BFGS architecture. The DFP architecture uses the DFP quasi-Newton method and a general approximate-gradient architecture, which is applicable to various objective functions but is time-consuming and suffers from a zero-overflow problem. The BFGS architecture overcomes these shortcomings by using the BFGS quasi-Newton method and a gradient expression derived specifically for neural network training. Both architectures are easily scalable, cope with different network sizes, and support on-line training.

The two designs are synthesized and implemented on the NetFPGA SUME board (xc7vx690tffg1761-3) using Xilinx Vivado 2014.4, and their performance is evaluated in terms of resource utilization, execution time, and power consumption. Experimental results show speedups of 17× for the DFP architecture and 106× for the BFGS architecture over the corresponding software implementations. In addition, the BFGS architecture is tested in a real-world scenario, and the results show that the design offers better acceleration for networks with multiple output neurons.
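For reference, below is a minimal software sketch of BFGS quasi-Newton training for a toy single-hidden-layer network, organized around the four stages the thesis maps to hardware blocks (OFE, GC, LS, HU). The network size, toy data, finite-difference gradient, and backtracking line search are illustrative assumptions, not the thesis's exact configuration.

```python
# Minimal BFGS quasi-Newton training sketch (illustrative assumptions only).
# Stages mirror the thesis's hardware blocks: OFE, GC, LS, HU.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(32, 2))                  # toy inputs
T = (X[:, :1] * X[:, 1:2] > 0).astype(float)  # toy targets (XOR-like)
n_in, n_hid, n_out = 2, 4, 1
n_w = n_in * n_hid + n_hid * n_out            # total number of weights

def unpack(w):
    W1 = w[:n_in * n_hid].reshape(n_in, n_hid)
    W2 = w[n_in * n_hid:].reshape(n_hid, n_out)
    return W1, W2

def objective(w):                             # OFE: mean squared error
    W1, W2 = unpack(w)
    Y = np.tanh(X @ W1) @ W2
    return 0.5 * np.mean((Y - T) ** 2)

def gradient(w, eps=1e-6):                    # GC: central-difference gradient,
    g = np.zeros_like(w)                      # in the spirit of the DFP design's
    for i in range(w.size):                   # general approximate gradient
        e = np.zeros_like(w); e[i] = eps
        g[i] = (objective(w + e) - objective(w - e)) / (2 * eps)
    return g

def line_search(w, d, g, alpha=1.0, rho=0.5, c=1e-4):  # LS: backtracking
    f0 = objective(w)
    while objective(w + alpha * d) > f0 + c * alpha * (g @ d):
        alpha *= rho
    return alpha

# BFGS iteration: H approximates the inverse Hessian (HU stage).
w = rng.normal(scale=0.5, size=n_w)
H = np.eye(n_w)
g = gradient(w)
for k in range(100):
    d = -H @ g                                # search direction
    alpha = line_search(w, d, g)
    s = alpha * d                             # step taken
    w_new = w + s
    g_new = gradient(w_new)
    y = g_new - g
    sy = s @ y
    if sy > 1e-12:                            # curvature condition
        rho_k = 1.0 / sy
        I = np.eye(n_w)
        H = (I - rho_k * np.outer(s, y)) @ H @ (I - rho_k * np.outer(y, s)) \
            + rho_k * np.outer(s, s)          # BFGS inverse-Hessian update
    w, g = w_new, g_new
    if np.linalg.norm(g) < 1e-5:
        break

print(f"final loss: {objective(w):.6f} after {k + 1} iterations")
```

In the hardware designs described above, each of these four stages is realized as a pipelined Verilog block rather than a sequential software loop; the BFGS architecture additionally replaces the finite-difference gradient with an expression derived from the network structure.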
Keywords/Search Tags:Neural network training, Quasi-Newton method, FPGA, Hardware acceleration