Research And Design Of Reconfigurable Array Structures For Convolutional Neural Networks

Posted on:2023-05-16

Degree:Master

Type:Thesis

Country:China

Candidate:Y L Zhu

Full Text:PDF

GTID:2568307127482954

Subject:Control engineering

Abstract/Summary:

The complexity of Convolutional Neural Network (CNN) models typically increases with the complexity of the task, which places growing demands on the computing power of traditional processors. Reconfigurable array structures combine the flexibility of traditional processors with the efficiency of dedicated integrated circuits, providing new ideas for designing Artificial Intelligence (AI) chips. However, implementing CNN algorithms on reconfigurable array structures still faces problems in terms of computational complexity and large storage requirements. To address these problems, the paper conducts in-depth optimisation studies in the following aspects.

Firstly, the paper proposes a fused pruning and quantization approach for compressing network models, to relieve the pressure that network models place on the storage space of reconfigurable array structures. A structured model pruning technique reduces the number of model parameters by pruning neurons to which the computational results are insensitive, and random (stochastic) rounding quantization of the compressed floating-point parameters effectively reduces the consumption of hardware resources. LeNet5 and AlexNet are selected for validation, and the experimental results show that the accuracy loss is about 2% and the parameter reduction is about 56.3%. Compared with pruning-only model compression, the compression rate is increased by up to 19.9% with essentially the same or improved network recognition accuracy.

Secondly, to overcome the limited application scope of current reconfigurable array structures, CNN-oriented MAC, MAX and AVE instructions are added to the reconfigurable processing element (PE), and the corresponding hardware structure is designed in the execution unit of the PE according to the new instructions. The experimental results show that the CNN-oriented reconfigurable PE can complete convolution, pooling and activation function operations, reduces the number of clock cycles by 58.8% compared with the generic instructions, and decreases hardware resource usage by 35.9% compared with similar structures.

Then, a data reuse optimization strategy based on loop tiling and loop unrolling is proposed to address the large amount of data that convolutional operations access repeatedly on reconfigurable arrays. To make full use of the advantages of reconfigurable structures, a loop tiling optimization design for convolutional operations and a loop unrolling design based on convolutional kernels and input feature maps are carried out. Tests with convolutional operations of various sizes show that data accesses can be reduced by up to 83.6%. Compared with sliding-window-based reuse methods, the number of multiply-accumulate operations in convolution is reduced by up to 16.25%.

Finally, to verify the effectiveness of the CNN-oriented reconfigurable structure optimization, a reconfigurable implementation of the AlexNet network is proposed, and hardware testing and performance analysis are completed on the Xilinx ZC706 development board. The results show that, with the reconfiguration scheme of this paper under the CNN-oriented reconfigurable structure, PE utilization within a single cluster can reach 100%, and the speedup of multi-threaded over single-threaded execution reaches up to 2.45 for convolutional operations of all sizes.

In summary, the CNN-oriented reconfigurable structure optimization can effectively improve the efficiency of CNN algorithm operation, and its maximum operating frequency can reach 147 MHz. Compared with [49], the overall processing speed for the complete AlexNet network improves by approximately 60.6%. Compared with [52] and [53], the structure processes a more complex network with similar hardware resource consumption. Compared with [54], the hardware resource consumption for processing the same convolutional neural network is reduced by 45.8%.
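The data reuse idea behind cyclic chunking (usually called loop tiling) can be illustrated with a minimal software sketch. It is not the thesis's hardware design: the function names, the tile size, and the load counter are assumptions made for illustration. The output loops of a convolution are split into small blocks, each block copies its input patch once into a local buffer (standing in for on-array storage), and every output in the block reuses that buffer.

```python
def conv2d_direct(x, k):
    """Reference 2D convolution (stride 1, no padding) on nested lists."""
    K = len(k)
    out_h, out_w = len(x) - K + 1, len(x[0]) - K + 1
    return [[sum(x[i + a][j + b] * k[a][b]
                 for a in range(K) for b in range(K))
             for j in range(out_w)]
            for i in range(out_h)]


def conv2d_tiled(x, k, tile=2):
    """Same convolution with the output loops split into tile x tile blocks.

    Returns (output, loads), where loads counts input values copied into
    the per-block buffer. The counter is a hypothetical proxy for memory
    accesses, not a model of any particular hardware.
    """
    K = len(k)
    out_h, out_w = len(x) - K + 1, len(x[0]) - K + 1
    out = [[0] * out_w for _ in range(out_h)]
    loads = 0
    for ti in range(0, out_h, tile):
        for tj in range(0, out_w, tile):
            th = min(tile, out_h - ti)  # block height (edge blocks may be smaller)
            tw = min(tile, out_w - tj)  # block width
            # Load the input patch needed by this block exactly once.
            patch = [row[tj:tj + tw + K - 1] for row in x[ti:ti + th + K - 1]]
            loads += len(patch) * len(patch[0])
            # Every output in the block reuses the buffered patch.
            for i in range(th):
                for j in range(tw):
                    acc = 0
                    for a in range(K):
                        for b in range(K):
                            acc += patch[i + a][j + b] * k[a][b]  # one MAC
                    out[ti + i][tj + j] = acc
    return out, loads
```

For a 6x6 input and a 3x3 kernel, this sketch loads 64 input values with `tile=2`, whereas reading the input separately for each of the 16 outputs would take 144 reads. These counts describe the toy example only and are unrelated to the reductions reported in the abstract.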

Keywords/Search Tags:

Convolutional Neural Network, Model Compression, Reconfigurable Array, Data Reuse, Computer Architecture

Related items

1	Compression Algorithm And Circuit Design Of Convolutional Neural Networks
2	Research On Technology Of Model Compression For Convolutional Neural Networks
3	Research On Compression And Acceleration Of Deep Convolutional Neural Networks
4	Research On Convolutional Neural Network Compression Method Based On Dynamic Pruning And Weight Resetting
5	ZYNQ-Based Reconfigurable Convolutional Neural Network Accelerator
6	VLSI Architecture Design For Binary Convolutional Neural Network Accelerator
7	Deep Model Compression In Computer Vision
8	Research And Implementation Of FPGA Accelerating Compressed Convolutional Neural Network
9	Structured Deep Neural Network Compression Based On Computer Vision
10	Convolutional Neural Network Artificial Intelligence Chip Research On Weight Compression And Fast Data Buffering