
Hardware Acceleration Of Quasi-Newton Method And Its Application In Neural Network Training

Posted on: 2018-02-26    Degree: Master    Type: Thesis
Country: China    Candidate: R Y Sang    Full Text: PDF
GTID: 2348330542479455    Subject: Microelectronics and Solid State Electronics
Abstract/Summary:
Artificial neural networks (ANNs), inspired by the synaptic structure of animal brains, are mathematical models that process information in parallel and have been widely applied in biology, medicine, electronics, economics, and other fields. Training, which adjusts the network weights so that the actual outputs approach the expected values, is the most important step in developing a neural network model. Many optimization algorithms are used for training, such as gradient descent, the conjugate gradient method, the quasi-Newton (QN) method, and particle swarm optimization. The quasi-Newton method is popular for its fast convergence and its avoidance of explicit second-derivative computation. However, the iterative optimization process is very time-consuming when implemented in software and cannot meet the needs of embedded applications, so a hardware-accelerated approach to neural network training is needed. With their high parallelism, flexibility (compared to ASICs), low power (compared to GPUs), and increasing density, Field Programmable Gate Arrays (FPGAs) have recently become an attractive platform for accelerating scientific computation. This thesis therefore proposes an FPGA-based customized hardware implementation of neural network training using the quasi-Newton method.

By analyzing the quasi-Newton method, the algorithm is divided into four modules: Gradient Computation (GC), H-matrix Updating (HU), Line Search (LS), and Objective Function Evaluation (OFE). Each module is implemented as a hardware block in Verilog, with its architecture customized to the operations it performs and with pipelining and module-reuse techniques applied. Two hardware architectures are presented: a DFP architecture and a BFGS architecture. The DFP architecture uses the DFP quasi-Newton method and a general approximate-gradient architecture, which is applicable to various objective functions but is time-consuming and suffers from a zero-overflow problem. The BFGS architecture overcomes these shortcomings by using the BFGS quasi-Newton method and a gradient expression derived specifically for neural network training. Both architectures are easily scalable, cope with different network sizes, and support on-line training.

The two designs are synthesized and implemented on the NetFPGA SUME board (xc7vx690tffg1761-3) using Xilinx Vivado 2014.4, and their performance is evaluated in terms of resource utilization, execution time, and power consumption. Experimental results show speedups of 17× for the DFP architecture and 106× for the BFGS architecture over the corresponding software implementations. In addition, the BFGS architecture is tested in a real-world scenario, and the results show that the design offers better acceleration for networks with multiple output neurons.
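For reference, below is a minimal software sketch of BFGS quasi-Newton training for a toy single-hidden-layer network, organized around the four stages the thesis maps to hardware blocks (OFE, GC, LS, HU). The network size, toy data, finite-difference gradient, and backtracking line search are illustrative assumptions, not the thesis's exact configuration.

```python
# Minimal BFGS quasi-Newton training sketch (illustrative assumptions only).
# Stages mirror the thesis's hardware blocks: OFE, GC, LS, HU.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(32, 2))                  # toy inputs
T = (X[:, :1] * X[:, 1:2] > 0).astype(float)  # toy targets (XOR-like)
n_in, n_hid, n_out = 2, 4, 1
n_w = n_in * n_hid + n_hid * n_out            # total number of weights

def unpack(w):
    W1 = w[:n_in * n_hid].reshape(n_in, n_hid)
    W2 = w[n_in * n_hid:].reshape(n_hid, n_out)
    return W1, W2

def objective(w):                             # OFE: mean squared error
    W1, W2 = unpack(w)
    Y = np.tanh(X @ W1) @ W2
    return 0.5 * np.mean((Y - T) ** 2)

def gradient(w, eps=1e-6):                    # GC: central-difference gradient,
    g = np.zeros_like(w)                      # in the spirit of the DFP design's
    for i in range(w.size):                   # general approximate gradient
        e = np.zeros_like(w); e[i] = eps
        g[i] = (objective(w + e) - objective(w - e)) / (2 * eps)
    return g

def line_search(w, d, g, alpha=1.0, rho=0.5, c=1e-4):  # LS: backtracking
    f0 = objective(w)
    while objective(w + alpha * d) > f0 + c * alpha * (g @ d):
        alpha *= rho
    return alpha

# BFGS iteration: H approximates the inverse Hessian (HU stage).
w = rng.normal(scale=0.5, size=n_w)
H = np.eye(n_w)
g = gradient(w)
for k in range(100):
    d = -H @ g                                # search direction
    alpha = line_search(w, d, g)
    s = alpha * d                             # step taken
    w_new = w + s
    g_new = gradient(w_new)
    y = g_new - g
    sy = s @ y
    if sy > 1e-12:                            # curvature condition
        rho_k = 1.0 / sy
        I = np.eye(n_w)
        H = (I - rho_k * np.outer(s, y)) @ H @ (I - rho_k * np.outer(y, s)) \
            + rho_k * np.outer(s, s)          # BFGS inverse-Hessian update
    w, g = w_new, g_new
    if np.linalg.norm(g) < 1e-5:
        break

print(f"final loss: {objective(w):.6f} after {k + 1} iterations")
```

In the hardware designs described above, each of these four stages is realized as a pipelined Verilog block rather than a sequential software loop; the BFGS architecture additionally replaces the finite-difference gradient with an expression derived from the network structure.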
Keywords/Search Tags:Neural network training, Quasi-Newton method, FPGA, Hardware acceleration