Algorithm Research And FPGA Design Of Lightweight Neural Network For Image-Text Description

Posted on: 2021-04-28
Degree: Master
Type: Thesis
Country: China
Candidate: W Z Liu
GTID: 2518306557486994
Subject: Circuits and Systems
Abstract/Summary:
The image-to-text description task has many application scenarios: it can automatically generate a title for an image, or convert an image into text to help visually impaired people better understand the images they encounter in daily life. A Convolutional Neural Network (CNN) extracts features, while a Long Short-Term Memory (LSTM) network processes time-series data. Combining the two makes it possible to generate a text description of an image automatically. In practice, however, the computational load of a deep neural network is enormous, and the serial execution model of a CPU cannot exploit the parallelism of the network. This places high demands on the hardware and therefore rules out real-time operation on low-power mobile terminals. To address these problems, a lightweight neural network for image-to-text description is studied from two angles: algorithm research and hardware design. At the algorithm level, the CNN encodes the input image into a vector containing feature information, and this vector is then transformed into a readable sentence. Based on an analysis of the diamond-shaped receptive field, an improved diamond convolution and diamond pooling method is presented that is well suited to hardware and reduces both the parameter count and the amount of computation. This method is combined with lightweight group convolution and channel rearrangement of the feature map, and the lightweight image-to-text description algorithm is obtained after quantization to an 8-bit data width. Tested on the Flickr30k image annotation dataset, the algorithm reaches a BLEU-1 score of 45.2. At the hardware level, the design reduces the network's intermediate storage by changing the internal execution order of the CNN. After the parallelism of convolution is fully exploited, a suitable convolution calculation method and storage scheme are given, and the computing unit and the implementation of matrix multiplication are planned. After the computational characteristics of matrix multiplication are analyzed, the design scheme of the LSTM network is formulated, and the activation functions are implemented as piecewise linear functions. On this basis, the hardware accelerator is implemented in Verilog HDL, and a verification system containing the accelerator is built on the PYNQ-Z2 platform. The overall power consumption is 0.993 W, and the computational energy efficiency is 9.2 GOP/s/W. The lightweight neural network accelerator for image-to-text description uses fewer resources and consumes less power, meeting the real-time and power requirements of image-to-text description applications on mobile terminals.
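The abstract states that the LSTM activation functions are realized as piecewise linear functions but does not give the segments. The following Python sketch illustrates the general idea under stated assumptions: the breakpoints and slopes are taken from the widely used PLAN approximation of the sigmoid, not from the thesis, and the function names `pwl_sigmoid`, `pwl_tanh`, and `max_abs_error` are hypothetical. Every slope is a power of two, which is why this kind of scheme is attractive in hardware: each segment can be evaluated with a shift and an add.

```python
import math


def pwl_sigmoid(x: float) -> float:
    """Piecewise linear sigmoid (PLAN-style segments; an assumption,
    not the segmentation used in the thesis)."""
    a = abs(x)
    if a >= 5.0:
        y = 1.0
    elif a >= 2.375:
        y = 0.03125 * a + 0.84375   # slope 2**-5
    elif a >= 1.0:
        y = 0.125 * a + 0.625       # slope 2**-3
    else:
        y = 0.25 * a + 0.5          # slope 2**-2
    # Use the symmetry sigmoid(-x) = 1 - sigmoid(x) for negative inputs.
    return y if x >= 0 else 1.0 - y


def pwl_tanh(x: float) -> float:
    """tanh derived from the sigmoid via tanh(x) = 2*sigmoid(2x) - 1."""
    return 2.0 * pwl_sigmoid(2.0 * x) - 1.0


def sigmoid(x: float) -> float:
    """Reference sigmoid used only to measure the approximation error."""
    return 1.0 / (1.0 + math.exp(-x))


def max_abs_error(approx, exact, lo=-8.0, hi=8.0, steps=3201) -> float:
    """Largest absolute difference between approx and exact on a uniform grid."""
    worst = 0.0
    for i in range(steps):
        x = lo + (hi - lo) * i / (steps - 1)
        worst = max(worst, abs(approx(x) - exact(x)))
    return worst


if __name__ == "__main__":
    print("sigmoid max error:", max_abs_error(pwl_sigmoid, sigmoid))
    print("tanh max error:", max_abs_error(pwl_tanh, math.tanh))
```

Because the tanh approximation reuses the sigmoid segments, a single lookup structure can in principle serve both LSTM gate activations. Whether the thesis shares logic in this way is not stated in the abstract.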
Keywords/Search Tags: Image-to-text Description, CNN, LSTM, Hardware Accelerator, FPGA